E-assessment – the high stakes strategy

E-assessment is becoming a hot topic, with a growing number of governments around the world taking their first steps in this area. Whilst e-assessment has always been an option in Learning Management Systems, formalised testing at national scale is a relatively new phenomenon. This article explores the opportunities, risks and architectures associated with delivering e-assessment at scale.

For clarity, the term “e-assessment” is used here as the collective term for the electronic delivery of High Stakes and Low Stakes testing, diagnostics and examinations. The term covers both summative and formative testing.

Norway – with over 800,000 school-age students – was the first country to implement national level e-assessment. As part of a national programme for improving education, and after successful trials in 2009 where students took examinations on their laptops, all the national tests for Reading, English and Math are now digital. A large part of Norway’s exams are also conducted digitally.

Log-on screen for students taking national e-assessments in Norway

Students enrol in the exam at least one week before they sit it. On the day of the exam they are given a username and password. The PCs that they take the assessments on are owned by the students but provided by the school, so there is a minimum specification for the hardware and browser (HTML 5). It’s acceptable for students to use materials that they have stored on their hard drive or a USB stick, but not to seek help over the internet. Schools are fully responsible for ensuring compliance with the rules, and e-assessments are monitored by a combination of teachers and software.

Arild Stangeland from The Directorate for Education and Training at the Norway Ministry of Education explains that the Norwegian system breaks down into the following components:

  • Administration of examinations, registration and results/reports
  • Electronic national tests and diagnostic tests
  • E-examinations
  • Collaboration Solution for preparing exams and tests
  • E-processes for preparation of materials for exams for the students

Each of these components has a separate technical architecture supported by a large stack of applications written in .NET, Java, and Flash, and maintained by the Directorate for Education and Training. Several hundred servers are used, and BizTalk Server sits at the centre of the architecture to co-ordinate traffic between the different systems. A locally produced Learning Management System is used to deliver the assessments.

E-assessments in Georgia

Another country that has implemented a national level e-examination system is Georgia, in Eastern Europe. Microsoft’s Shota Murtskhvaladze reports that school graduation exams are now delivered through a “Computer Adaptive Testing” (CAT) system. Last year, 50,000 schoolchildren took the school-leaving exams in 8 subjects in 1,520 public and private schools within an eight-day timeframe. The solution was developed by an agency of the MoE’s National Examinations Center.

There are a number of drivers behind the move towards e-examination:

  1. Cost – the English examination system cost ~ $1bn in 2009. Much of this is tied up in paper-based processes – printing, delivering, collecting and scanning papers.
  2. Flexibility – the potential for going beyond what students can physically write on a paper.
  3. Speed and accuracy – e-assessment compresses the time from sitting the assessment to getting an accurate result in front of those who need to know.

Whilst the benefits of moving entirely to electronic assessment are clear, some countries are using technology to manage individual component parts.

The assessment division of British company RM Education handles a range of tasks for a large number of UK and international examination and assessment boards. They deliver authoring, delivery, marking and results services. For example, the company carries out on-screen marking of scanned paper scripts for the International Baccalaureate.

RM Assessment has a range of service offerings

Since 2009, RM Assessment has been working in partnership with Cambridge Assessment, the University’s international exams group, to enable e-assessment in more than 3,000 test venues across 18 countries.

In 2007, Romanian company SIVECO worked with the Ministry of Education in Lebanon to develop an Examination Management System to manage and automate the examinations process. Whilst the examination system remains paper based, the solution automates the examination administration tasks.

In Romania in 2011, SIVECO built a solution to publish the results of National High School exams. The solution produced 30 reports showing the results for 200,000 candidates and had to deal with high peak usage in a small time-frame – just 2 days.

To handle the peaks, SIVECO used Cloud technologies – Windows Azure in particular. In this project the Romanian Ministry of Education gained ample processing power, eliminated downtime, and avoided spending $100,000 on a comparable on-premises infrastructure. Romania is far from alone in experiencing peaks in data generation and processing – the whole assessment industry experiences significant peaks in demand and load during one or two months of the year, which makes Cloud technologies an ideal candidate for e-assessment solutions.

SIVECO used Azure to handle peak data loads

Cloud technologies are also being used to support e-assessment in Colombia. There, the Instituto Colombiano para la Evaluación de la Educación (ICFES) administers standardised tests to students and has used Cloud technologies to reduce costs and better manage online queries when scores are posted. ICFES moved to a Windows Azure platform in partnership with Asesoftware, cutting costs by 80% and providing students with a faster and more reliable service.

Taking this a step further, the New South Wales Department of Education and Communities (NSW DEC) – the largest school district in the Southern Hemisphere – has moved to a completely cloud-based e-assessment system for Year 9 Science Standards diagnostic testing (ESSA tests). Working in partnership with Australian company Janison, 65,000 students were tested last year in a comprehensive diagnostic assessment.

Part of New South Wales’s ESSA tests – c/o NSW DEC

Testing online revealed much more about how students were thinking, enabling the NSW DEC to provide high quality advice on how to improve teaching and learning. There were other benefits too – saving $200,000 on server infrastructure costs, saving printing and distribution costs, and gaining a week on marking time over previous years.

Risks

So if it’s that easy to do, why aren’t more countries doing it? The main barrier is risk. An assessment system failing during the critical period is headline news, as is inequity and inaccuracy. Many of these risks, however, are inherent in paper based systems too. There are plenty of examples of the wrong papers being delivered to schools, and papers getting lost on return to the examination centres. Like all mission critical IT systems, the key is to architect the system with risk mitigation as a top priority.

Architectures

A basic building-block view of an e-assessment system looks like this:

Key functions include:

A simplified Azure enabled workflow looks like this:

Using Azure as a key component in delivering e-assessment at scale – the kind of approach used by Janison for the NSW DEC ESSA assessments.
  1. Exam/Assessment Board produces and signs-off assessment content collaboratively.
  2. Assessment content is pushed into the Cloud and distributed via a Content Delivery Network
  3. Assessment content is cached at school/exam center level after the first student has viewed a particular resource. As candidates enter the examination centre, they are given a username and password on a card.
  4. Just before the assessment starts, policies are enforced on the candidate’s client computer, and the assessment content is cached either in a dedicated application or on the browser. The candidate’s response data is cached locally and periodically sent to the Cloud via the school level cache.
  5. In the Cloud, the candidate’s data sits in a queue, and is then stored in flat tables.
  6. Encrypted data from the Cloud is sent to a data center for longer-term storage and processing in relational databases. Once all the candidates’ response data has been moved from the Cloud to the data warehouse, the Cloud application is stopped.
  7. Markers grade the work and ensure leveling and normalisation.
  8. Results are collated, reported and analysed.
  9. Results are passed on to relevant agencies for recognition and certificate distribution.
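The caching and queuing behaviour in steps 4 and 5 can be sketched in a few lines of code. The snippet below is a minimal, hypothetical illustration only – a candidate’s responses are cached locally and flushed in batches to a cloud-side queue, which is simulated here with Python’s in-memory `queue.Queue`. A production system would use a durable cloud queue, authentication and encryption rather than this stand-in, and all identifiers are invented.

```python
import json
import queue
import time

class ResponseCache:
    """Caches a candidate's answers locally and periodically flushes
    batches to a cloud-side queue (simulated with queue.Queue)."""

    def __init__(self, candidate_id, cloud_queue, flush_every=3):
        self.candidate_id = candidate_id
        self.cloud_queue = cloud_queue
        self.flush_every = flush_every
        self.pending = []

    def record(self, item_id, answer):
        # Step 4: response data is cached locally first
        self.pending.append({"item": item_id, "answer": answer,
                             "ts": time.time()})
        if len(self.pending) >= self.flush_every:
            self.flush()

    def flush(self):
        # Step 5: batches land on a queue before being written to storage
        if self.pending:
            self.cloud_queue.put(json.dumps(
                {"candidate": self.candidate_id, "responses": self.pending}))
            self.pending = []

cloud_queue = queue.Queue()
cache = ResponseCache("cand-001", cloud_queue)
for i, ans in enumerate(["B", "A", "D", "C"]):
    cache.record(f"q{i+1}", ans)
cache.flush()  # final flush at the end of the assessment

batches = [json.loads(cloud_queue.get()) for _ in range(cloud_queue.qsize())]
total = sum(len(b["responses"]) for b in batches)
print(total)  # 4 responses reached the queue, across 2 batches
```

Batching like this keeps the client responsive during the exam and means a brief network outage loses nothing – the local cache simply flushes when connectivity returns.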

Security and Equity

It’s crucial that all candidates are able to use devices of the same minimum specification, which makes a straight BYOD policy – where any device is acceptable – a difficult proposition.

Enforcing policies on the client computer is a key component. Until recently, attaining ‘lock-down’ would have required each computer to join a domain. Whilst having a domain- and Active Directory-joined client computer has many advantages, there is another approach – a solution developed by FullArmor called GPAnywhere. This allows “portable” policies to be created from Group Policy Objects and applied to any end point, including a Virtual Application. This means that any device running Windows can have an assessment policy applied to it.

FullArmor’s GPAnywhere

What next?

Another approach to delivery being considered by some is VDI. The ability to push a virtual assessment desktop to a device and lock it down is appealing, as it is potentially a simpler approach. However, there are continuity-of-service risks with VDI which have yet to be fully tested.

E-assessment is in its infancy, but many leading examination and assessment authorities are looking carefully into what’s next in this space.

There are three key areas where e-assessment has much greater potential than paper-based assessment:
CAT

Computerised Adaptive Testing (CAT) is a form of computer-based test that adapts to the examinee’s ability level. Medical students at St George’s, University of London, using CAT-based e-assessment tools, are asked to make decisions along a branched narrative in which the information and choices available at a later stage depend on the choices the student made earlier.

ACARA – the Australian Curriculum, Assessment and Reporting Authority – takes this a step further and is talking about how to provide candidates with branched routes through the assessment so they get appropriate recognition for what they have learned. A student who struggles with a question or task can be routed along a less demanding pathway, whilst a more able or better prepared student can be routed along a more demanding pathway – both are able to get the best out of the assessment process. Test-takers also do not waste their time attempting items that are too hard or trivially easy.
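The routing idea can be sketched very simply: present the unused item whose difficulty is closest to the current ability estimate, then nudge the estimate up after a correct answer and down after an incorrect one. Real CAT engines select items using Item Response Theory rather than this toy heuristic, and all item names and numbers below are invented for illustration.

```python
def run_adaptive_test(item_bank, answers, start_ability=0.0, step=0.5):
    """Minimal adaptive-routing sketch: at each turn, present the unused
    item whose difficulty is closest to the ability estimate, then adjust
    the estimate according to whether the answer was correct."""
    ability = start_ability
    remaining = dict(item_bank)  # item_id -> difficulty on an arbitrary scale
    route = []
    for correct in answers:
        item = min(remaining, key=lambda i: abs(remaining[i] - ability))
        route.append(item)
        del remaining[item]
        ability += step if correct else -step
    return route, ability

# A struggling start routes to easier items; a strong start routes harder.
bank = {"easy": -1.0, "medium": 0.0, "hard": 1.0, "harder": 2.0}
route, ability = run_adaptive_test(bank, [True, True, False])
print(route, ability)  # ['medium', 'hard', 'harder'] 0.5
```

Each candidate thus receives a different sequence of items, which is exactly what lets an adaptive test recognise achievement at both ends of the ability range.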

Simulations

The New South Wales DEC were able to exploit interactivity when they ran their science tests online. Being able to use interactivity in an assessment opens up a wide range of testing options – for example, asking candidates to build or construct something, conduct virtual experiments, use haptics to test dexterity, or develop an animated scenario. None of these options are practical in a paper and pencil assessment.

21st Century Skills

Whilst we will see paper-based assessment for a long time yet, the pressure is on to find ways of assessing 21st Century skills such as creativity, problem solving, communication and collaboration. Problem solving is now part of the PISA 2012 framework. Also, ATC21S – the 21st Century Skills assessment project – is doing some very interesting work in the area of collaborative assessment (www.youtube.com/atc21s). One thing is certain – pencil and paper testing won’t help much in diagnosing and assessing whether students have acquired 21st Century Skills, so it’s reasonable to conclude that e-assessment has a big future.

Conclusion

E-assessment has come a long way in a very short time, and assessment remains one of the last main barriers to the wider adoption of ICT in schooling. It’s clear that Cloud technology is changing the game here – not only enabling a lower cost of service, but also opening the possibility of global e-assessment, with assessment and examination boards able to offer their services to anyone on the planet. With the advent of better biometrics and new ways of supervising assessments remotely, perhaps the most exciting prospect is the notion of assessments being available at any point in one’s lifetime, not just at specified times in the calendar.

Practically everyone on the planet takes many examinations and assessments over their lifetime, so the prospects of this age-old process being made more fair, accurate, helpful, available and engaging are very exciting indeed.

Additional Information

New South Wales ESSA (Science diagnostics tests)
Norway
Changing faces of assessment

Azure

http://www.windowsazure.com

Thanks to:

Arild Stangeland, The Directorate for Education and Training, Norway Ministry of Education

Wayne Houlden, Aaron Wittman, Caroline Thompson and Niels Grootscholten, Janison, Australia

Eric Jamieson, Robert Cordaiy, Joanne Sim, Jim Sturgiss, and Penny Gill, from New South Wales DEC, Australia

Peter Adams, ACARA, Australia

Steve Harrington and Dave Patrick, RM Assessment

Alexandru Cosbuc and Florian Ciolacu, Siveco

Bob Chung, FullArmor

Horng Shya Chua and Puay San Ng, Microsoft Singapore; Bjørnar Hovemoen, Microsoft Norway; Shota Murtskhvaladze, Microsoft Georgia; Teo Milev and Ksenia Filippova, Microsoft Central and Eastern Europe; and Brad Tipp, Corporate HQ.

Memorisation or Understanding? – Eric Mazur

Think of something you are really good at – something that you excel in to the point that others would comment on just how good you are at it.

Now think about how you achieved this. What did you do to become excellent at that particular thing? Which of these best describes how you acquired your excellence:

1. Trial and error

2. Lecture

3. Practice

4. Apprenticeship

5. Other

If you picked “Practice” you will have been in the majority. If you picked “Lecture” you will have been in an extreme minority. And yet, lecturing is how most of education is “delivered”.

So starts Eric Mazur’s talk on “Memorisation or understanding – are we teaching the right thing?”

Eric Mazur is a Professor of Physics and Applied Physics at Harvard University and has spent his teaching career applying scientific principles to teaching and learning. Making extensive use of data, Professor Mazur shows that much “instruction” only gives an illusion of learning, as it’s based on memorisation, not understanding.

“I thought I was a good teacher until I discovered my students were just memorising information rather than learning to understand the material”. Professor Mazur explains how he came to the conclusion that it was his teaching that caused students to fail, and how he changed his approach with the result of significant improvements to his students’ performance.

For the full story, watch this YouTube video (fast forward to 3:02 if you want to skip the intros):

Artificial Intelligence in Schooling Systems

Q. “What do you give a hurt lemon?”

A. “Lemon aid”

Like me, you may have thought that the writer of this joke is a student. Actually, the joke writer in this case is Artificial Intelligence software – a “joke generator” called JAPE.

Artificial Intelligence (AI) has growing implications for schooling, and this article aims to set out some of AI’s main concepts, and explore how they can be applied to improving learning.

What is Artificial Intelligence?

Artificial Intelligence is a mature field in Computer Science that has delivered many innovations, for example:

  • Deep Blue, the chess program that beat Kasparov
  • “iRobot Roomba” automated vacuum cleaner, and “PackBot” used in Afghanistan and Iraq wars
  • Spam filters that use Machine Learning
  • Question answering systems that automatically answer factoid questions

AI is best known for aiming to reproduce human intelligence. The field was founded on the claim that intelligence can be simulated by a machine. Essentially, AI is the design of systems that perceive their environment and take actions that maximize their chances of success. It addresses natural language processing, reasoning, knowledge, planning, learning, communication, perception, and the ability to move and manipulate objects – in short, interacting with the real world, reasoning and planning, and learning and adaptation.

Different Approaches

There are several approaches to AI including:

  • Building models of human cognition using psychology and cognitive science
  • The logical thought approach with emphasis on “correct” inference
  • Building rational “agents” –  a computing object that perceives and acts

Key areas of application of AI in education include:

  • Robotics
  • Simulations
  • Games
  • Expert systems
  • Intelligent tutoring systems
  • Search, question and answers

Key AI Concepts

An initial view of AI reveals a field that is deeply divided into seemingly unrelated subfields. Some of these sub-fields even appear contradictory. For example, Neural Network techniques are considered by some to be a better model of human reasoning than rule-based Expert Systems, so let’s take a closer look at these two approaches.

Neural Networks

This approach mimics the human brain through the use of “nodes”, which resemble neurons. Neural Network technology – which uses layers of “input”, “hidden (process)” and “output” nodes – has been applied successfully to speech recognition, image analysis, adaptive control, games and robots. Most neural networks are based on statistical estimation, classification, optimization and control theory. Neural networks can be programmed to model the behavior of natural systems – e.g. responding to stimuli.
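To make the input/hidden/output layering concrete, here is a tiny hand-wired network – not a trained one – whose step-activated nodes compute XOR, a classic function that no single node can represent on its own. The weights are set by hand purely for illustration.

```python
def step(x):
    """Step activation: a node 'fires' (outputs 1) when its input is positive."""
    return 1.0 if x > 0 else 0.0

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """One forward pass: input layer -> hidden layer -> output node."""
    hidden = [step(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return step(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Hand-set weights that make the network compute XOR:
# hidden node 0 fires on (x OR y), hidden node 1 fires on (x AND y),
# and the output fires on (OR and not AND).
hidden_w, hidden_b = [[1, 1], [1, 1]], [-0.5, -1.5]
out_w, out_b = [1, -2], -0.5

for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, y, int(forward([x, y], hidden_w, hidden_b, out_w, out_b)))
# prints the XOR truth table: 0, 1, 1, 0
```

In a real application the weights would of course be learned from data rather than set by hand – that learning step is exactly what the Machine Learning section below describes.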

Expert Systems

Expert Systems emulate the decision-making ability of a human expert by reasoning about knowledge – as opposed to following the procedures set out by a software developer as is the case of conventional programming. An expert system is divided into three parts – a knowledge base; an inference engine; and a dialog interface to communicate with users.
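The first two of those parts can be illustrated in a few lines of Python: a knowledge base of rules, and a forward-chaining inference engine that keeps firing rules until no new conclusions emerge (the dialog interface is omitted). The medical rules below are invented purely for illustration.

```python
def infer(rules, facts):
    """Tiny forward-chaining inference engine: repeatedly fire any rule
    whose conditions are all satisfied, until no new facts are derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and set(conditions) <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Knowledge base: each rule is ([conditions...], conclusion)
rules = [
    (["has_fever", "has_rash"], "suspect_measles"),
    (["suspect_measles"], "refer_to_doctor"),
]
derived = infer(rules, ["has_fever", "has_rash"])
print(sorted(derived))
```

Note that the second rule fires only because the first one added a new fact – reasoning chains like this, rather than a fixed procedure, are what distinguish an expert system from conventional programming.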

Machine Learning

Neural Networks can be applied to the problem of Machine Learning – the design and development of algorithms that allow computers to evolve behaviors based on data from sensors, input devices, or databases. An important task in Machine Learning is pattern recognition, in which machines “learn” to automatically recognize complex patterns, and to make intelligent predictions.

In games which have concrete rules and multiple permutations – e.g. chess – Machine Learning calculates the most likely outcomes of the game given the position on the board by playing simulated games into the future. In addition, pattern recognition enables the game to analyze the relative merits of different moves, based on which ‘shapes’ were created by experts in historical games.

Intelligent Agents

An intelligent agent is a set of independent software tools linked with other applications and databases running within one or several computer environments. Agent based technology systems include a degree of autonomous problem-solving ability. The primary function of an intelligent agent is to help a user better use, manage, and interact with a system or application. Additionally, software agents, like human agents (for example, an administrative assistant), can be authorized to make decisions and perform certain tasks.

Coach Mike is an Intelligent Agent used at the Boston Museum of Science. Coach Mike’s job is to help visitors at Robot Park, an interactive exhibit for computer programming. By tracking visitor interactions and through the use of animation, gestures, and synthesized speech, Coach Mike provides several forms of support that seek to improve the experiences of museum visitors. These include orientation tactics, exploration support, and problem solving guidance. Additional tactics use encouragement and humor to entice visitors to stay more deeply engaged. Preliminary analysis of interaction logs suggests that visitors can follow Coach Mike’s guidance and may be less prone to immediate disengagement.

Enhancing Learning

Herbert A. Simon, an AI pioneer, said – “If we understand the human mind, we begin to understand what we can do with educational technology.”

Human learning and reasoning is founded on multiple knowledge representations with different kinds of structures, such as trees, chains, dominance hierarchies, neighborhood graphs, and directed networks. From MIT Open Courseware (Image by Prof. Joshua Tenenbaum.)
 

With systems that can both “learn” and provide “expertise”, the implications of AI for schooling are profound. Whilst AI has potential for solving problems like optimal resourcing and improving operational performance, the strongest area for the application of AI in schooling is to make learning more effective.

AI in schooling can be traced back to 1967, when Logo was created. Since the introduction of Logo and “floor-bots” such as Turtles, ever more sophisticated robots – along with associated control technologies such as Lego Mindstorms – have been used in schools. Products such as Focus Educational’s “BeeBot” are recent additions to the systems applying some of the principles of AI in a schooling environment.

AI in schooling is evolving in several different ways:

Question and Answer Systems (QA)

By 2020, we’ll be creating enough data for a stack of DVDs containing it to reach the moon and back three times! Regrettably, the quality of answers does not necessarily improve in proportion to the amount of information available. The current generation of search engines is essentially a set of information retrieval systems providing a list of “hits” from which the user has to deduce the closest match. One of the goals of AI, therefore, is to enable more natural questioning resulting in better answers and related information.

The first QA systems were developed in the 1960s as natural-language interfaces to expert systems. Current QA systems first typically classify questions and then apply Natural Language Processing. Natural language ‘annotations’ describe content associated with ‘information segments’. An information segment is retrieved when its annotation matches an input question. A generating module then produces sentences – ‘candidate answers’. Finally, ‘answer extraction’ processes determine if the candidate answer does indeed answer the question.
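The annotation-matching step described above can be sketched as follows: score each information segment by the word overlap between the input question and the segment’s annotation, then return the best-matching segment’s text as a candidate answer. Real QA systems use far richer natural language processing than bag-of-words overlap, and the segments below are invented for illustration.

```python
def retrieve(question, segments):
    """Annotation-matching sketch: score each information segment by word
    overlap between the question and its annotation; return the best text."""
    q_words = set(question.lower().replace("?", "").split())

    def score(seg):
        return len(q_words & set(seg["annotation"].lower().split()))

    best = max(segments, key=score)
    # Only return a candidate answer if there is at least some overlap
    return best["text"] if score(best) > 0 else None

segments = [
    {"annotation": "capital city of norway",
     "text": "Oslo is the capital of Norway."},
    {"annotation": "population of georgia",
     "text": "Georgia has about 3.7 million people."},
]
print(retrieve("What is the capital of Norway?", segments))
# prints: Oslo is the capital of Norway.
```

The ‘answer extraction’ stage of a full QA pipeline would then verify that the retrieved candidate actually answers the question, rather than merely sharing vocabulary with it.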

The implications for QA systems in schooling are enormous and raise significant questions about the role of teachers, learning content and assessment.

Learning With Expert Systems

Imagine students being given the task of recognizing patterns on science laboratory slides and making correct classifications. By combining expert and pedagogic models we are able to exploit AI to “mash” both domain-specific and more general learning principles into a rich learning experience. When classifying the slides, students will not just be presented with a “right or wrong” response; their behavior will be refined through “machine understanding” of why the student is making their decisions. AI differs from more conventional computing approaches by being able to generate and handle both “feed-forward” and “feed-back”.

Intelligent Tutors

Taking this a step further are Intelligent Tutors. These record their interactions with students to better understand how to teach them. Computer tutors are capable of recording both longitudinal data, as well as data at a fine-time scale, such as mouse clicks and response time data. Using these interactions as a source of data to be mined provides a new view into understanding student learning processes.

Games and Simulations

Currently, the area in which AI is applied the most is Computer Games – and by a large margin. The use of scenario-based simulations and serious games for training has been well-accepted in many domains. Simulations require active processing and provide intrinsic feedback in an environment in which it is safe to make mistakes. Artificial ecosystems – like the one shown below – have proved popular and have their uses in schooling.

An interesting learning mechanism used in game based learning that is potentially usable in other contexts is “Transfer Learning” – which can help improve the speed and quality of learning. The idea is to use knowledge from previous experiences to improve the process of solving a new problem.

Two key AI methods underpin this approach –

  • Case-Based Reasoning (CBR) – a set of techniques for solving new problems from related solutions that were previously successful.
  • Reinforcement Learning (RL) – a set of algorithms for solving problems using positive or negative feedback from the environment.

Reinforcement Learning can be delivered through the following mechanism –

  1. A central database with a collection of rules, mapping all possible actions and relative values.
  2. A learning component that takes feedback from the environment, and updates the utility value of each action. This is done using a reinforcement learning policy which estimates if there were any improvements since the last step.
  3. A planner then takes these rules and computes a plan of action, choosing probabilistically based on the utility of the actions.
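The three steps above can be sketched as a utility table, a learning update driven by environmental feedback, and a planner that chooses actions probabilistically by utility. The tutoring actions and the simulated reward signal below are invented for illustration; a real system would derive the reward from student outcomes.

```python
import random

def update_utilities(utilities, action, reward, rate=0.1):
    """Learning component: nudge an action's utility value toward the
    reward signal just received from the environment."""
    utilities[action] += rate * (reward - utilities[action])

def choose_action(utilities, rng):
    """Planner: pick an action at random, weighted by current utility."""
    actions = list(utilities)
    weights = [max(utilities[a], 0.01) for a in actions]  # keep exploring
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(0)
# Step 1: the rule/utility table, mapping possible actions to values
utilities = {"hint": 0.5, "harder_question": 0.5, "easier_question": 0.5}

# Simulated environment: giving a hint tends to earn positive feedback
for _ in range(200):
    action = choose_action(utilities, rng)          # step 3: plan
    reward = 1.0 if action == "hint" else 0.0       # environment feedback
    update_utilities(utilities, action, reward)     # step 2: learn

print(max(utilities, key=utilities.get))  # prints: hint
```

After a couple of hundred interactions the utility table has shifted decisively toward the rewarded action – the positive feedback loop between the learner and the planner is the essence of the mechanism described above.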

To anyone who has explored managed learning, this should sound quite familiar.

Two interesting models for understanding human learning in AI and Games context have come out of Microsoft Research:

This model classifies different types of learning in the context of games environments, but has transferability to broader understandings of the interface between computing and learning:

This model helps visualize the relative ease with which a game player can learn, depending on the granularity of detail presented to them:

  • Too coarse: cannot learn a good policy
  • Too fine: impossible to learn from little experience
  • Just right: learn a good policy from little experience

Personalized learning

Ramona Pierson, Chief Scientific Officer for Promethean, talks about ‘mashable’ digital content with embedded assessments tightly coupled to the curriculum, and learning progressions made ‘dynamic’ by AI. This can adjust learning progressions continually for each student, presenting cross-curriculum content and learning strategies based on a dynamic learning process.

“Imagine how powerful it would be for a student to have a customised textbook, sequencing of lessons, and embedded assessments that dynamically changed to ensure that he/she masters the material in the way that makes sense, and would result in obtaining nationally set benchmarks and learning outcomes”. (Mass Customisation And Personalisation Of Learning, Education Technology Solutions).

Nick Fekos, a former AI programmer in the financial sector and now at Athens College, agrees and is formulating plans for an intelligent object oriented knowledgebase that ‘learns’ from ‘experience’ and adjusts accordingly. The system Nick has in mind will implement dynamic, self-organizing and differentiated learning paths. The more the learning algorithm is used, the better it will get – perhaps something that can be said for the more general application of AI to schooling itself.

So How Do I Build an AI System?

Firstly, there are plenty of opportunities for getting students to develop AI systems.

Besides Logo, it’s worth looking into Kodu – a visual programming language made specifically to enable children to create games.

Also check out Microsoft Robotics Developer Studio, which helps make it easy to develop robot applications. The current version (4), which is in Beta, provides extensive support for the Kinect sensor hardware, allowing developers to create Kinect-enabled robots in both a ‘Visual Simulation Environment’ and real life. Integrating AI into other learning workloads is an altogether more complex task.

For anyone wanting to understand the mechanics of programming an AI system, this excellent article shows how to program a neural network in C#.

For a more comprehensive description, including important architectural principles, check out this paper from the University of Southern California which explains how to build a simulation to teach soft skills such as negotiation and cultural awareness.

For comprehensive coverage of the field of AI in Education, look at the proceedings from Artificial Intelligence in Education, 15th International Conference, AIED 2011, Auckland, New Zealand, June 28 – July 2, 2011.

For comprehensive coverage of the field of Intelligent Tutoring, look at the proceedings from the 10th International Conference on Intelligent Tutoring Systems, ITS 2010, Pittsburgh, PA, USA, June 14-18, 2010.