Issue 7, January 2004


Learning Assessment
>> THE PRICE OF TESTING IN MACAU'S SCHOOLS By Keith Morrison,
Translated by U Ngai


Testing in crisis?

The use of testing seems unstoppable. All over the world the emphasis placed on tests is enormous. Tests, it seems, are the putative guardians of standards, gatekeepers to success, and assurance that students have learned. Their allure seems irresistible to teachers, school systems and parents alike. For teachers, marks signify and record learning; for learners and parents they signify success or failure. The very act of testing seems to bring with it a guarantee of an objective assessment of performance - be it of the teacher or the student. This may or may not be true. Though testing is neither intrinsically undesirable nor worthless, in Macau the emphasis placed on testing, the negative consequences of testing on students and teachers, the nature and contents of testing, and the fallout of testing on curricula and learning are worrying.

A little over a decade ago Lewin and Wang (1990) reported that widespread testing in China led to low-level recall, demotivation, lowered self-esteem and a lack of originality; it discouraged creativity, narrowed the content and framing of curricula, elevated content over skills, and encouraged rote learning. In a published study of Macau (Morrison and Tang, 2002) I reported that, far from the situation having improved since the time of Lewin and Wang, the problem is exacerbated in this small state. Testing, in this case the testing largely of students' ability to repeat book knowledge and facts, if left unchecked, becomes part of a self-defeating dependency culture, a hermetically sealed system in which curricula and testing mutually reinforce each other in producing a low-level, facts-driven curriculum, dangerously didactic pedagogy, rote learning, a distortion of student motivation, a powerful controlling mechanism on teachers and students, a narrow transmission view of teaching, and the destruction of learners qua people. The tests become the benchmarks rather than the minimum competencies for learning.

We know from brain-based research (Sousa, 2001) that cognitive and affective factors - emotions and learning - are not only deeply, structurally interlinked in brain functioning, hard-wired together, but effective learning requires, as a sine qua non, the promotion of positive affective states, for example: motivation, enjoyment, the experience of success, the opportunity for the exercise of choice and autonomy (e.g. the experience of control over one's learning), a positive self-image and self-esteem, and the reduction of undue stress and pressure. Negative affective states inhibit learning powerfully, sometimes for life; students may gain marks in a test but be switched off learning for ever, and, in an era of lifelong learning, enormous attention should be given to preparing and promoting the positive attitudes of young learners to learning for life. Put simply, motivation, autonomy, the experience of success (however small) and self-esteem are central to learning. Yet when one looks at the effects of testing these important factors are often damaged.

The overuse of marks demeans learning, teaching, students, teachers and education. In research on teaching, learning and assessment in Macau (e.g. Morrison and Tang, 2002; Tang, 2002), several studies of which are in the public domain and housed in the Inter-University Institute of Macau, the message is overwhelmingly clear: in many Macau schools, teachers, maybe with the best of intentions, tell students what to think, how to think, when to think it and, through testing, how well they have thought and how they must show their thinking. This is both intellectually and emotionally stifling; it suffocates education, learning and development. More worrying still, such suffocation reduces creativity - exactly at a time when Macau needs creative thinkers - and leads to 'learned helplessness', now recognized as a medical syndrome which, pushed to its limit, metaphorically and literally kills. What terrible testing and examination stress and pressure prompts children in East Asia deliberately to step out of windows of high rise apartments and into oblivion?

I am not saying that testing on its own is responsible; that would be ridiculous. But I am saying that the amount, nature and consequences of testing are powerful components of an anti-educational spiral of decline in which everything is fixed, controlled, decided and closed, and in which failure is built-in. As Sacks (1999) wrote, there is a high price to be paid for a culture of testing: tests standardize minds. Schools should not be factories.

Test scores are treated as though they are correct, reliable and fair proxy measures of learning. This is frequently spurious. Commonplace notions of standard error argue against this. Further, on several occasions when I have given teachers in Macau samples of students' work to mark, experienced teachers not only cannot agree on the marks to be awarded, but their level of disagreement is massive. Yet we continue to place a belief in marks as though they are fair, reliable and accurate, and, more problematic, we make major judgements about, and decisions on, students based on these marks. Marks in a test become measures of people.
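
The point about standard error can be made concrete with a small sketch. It assumes the classical test-theory formula for the standard error of measurement, SEM = SD x sqrt(1 - reliability); the standard deviation, the reliability coefficient and the mark used here are hypothetical illustrations, not figures from the Macau studies.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test-theory SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sem, z=1.96):
    """Approximate 95% band around an observed mark (z = 1.96)."""
    return (observed - z * sem, observed + z * sem)

# Hypothetical values, chosen only for illustration.
sem = standard_error_of_measurement(sd=12.0, reliability=0.75)
low, high = score_band(observed=58.0, sem=sem)
print(f"SEM = {sem:.1f} marks; band = {low:.1f} to {high:.1f}")
```

Under these assumed values a mark of 58% is compatible with a true score on either side of a 50% or a 60% pass mark, which suggests how fragile a single mark is as the basis for a pass or fail decision.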

In a marks-driven system if I do not gain 100% then the mark I receive is often viewed as a measure of failure rather than as a measure of success. Why should failure follow if a student scores 50% or 60%, as is common in Macau? On what criteria is such a judgement based? What exactly are the criteria for failure? Simply the inability to gather enough marks? Why is there a cut-off point for 'passing' and 'failing', and, even if it is decided that such a cut-off point is desirable, why should it be what it is? Why 50%? Why 60%? Where do the cut-off 'standards' come from? How are they derived from educational arguments and not from simplistic distributions of marks or arbitrary decisions of what a failure or passing level is? What do we think of a system that routinely fails (and punishes) so many of its students simply because they do not conform to a lock-step, uniform view of learning, reinforced by testing? Looking at the disproportionate number of school repeaters in Macau should be enough to tell us that it is the system, rather than the individual, that is at fault.

The system is at fault where a rigid curriculum, reinforced by testing, creates repeaters because it fails to address individual differences. The detrimental effects of students 'failing' are massive, yet the problem is often an inflexible curriculum which takes little or no account of individual differences, and then proceeds to grade students as if they were all of the same ability, which, clearly, they are not. Such a view offends common sense, natural justice and human rights. Following this up by punishing students for low marks is like blaming the victim. Of course some students may be lazy and may deserve to be chastised for lack of effort or achievement, but to do this simply and routinely on the basis of only a mark seems ridiculous. A mark does not always reflect learning, ability, personality or emotional engagement with learning; it only reflects a fraction of performance. If we were to give teachers a single mark for their performance, based on a single scale, then the evidence from other countries (e.g. in inspection systems where this occurs) is that they feel extremely degraded, insulted, demeaned, demotivated, powerless and extremely angry; their self-esteem suffers hugely. Why do we do this to students?

A prominent international expert on assessment, Black (1998), suggests that there are problems with teachers conducting their own tests (the predominant form of testing in Macau), not the least of which is that teachers resort to simplistic testing rather than richer and more extended forms of assessment. Indeed he cites four main problems:

  • Classroom evaluation practices generally encourage superficial and rote learning, concentrating on recall of isolated details, usually items of knowledge which students soon forget.
  • Teachers do not generally review the assessment questions that they use and do not discuss them critically with peers, so there is little reflection on what is being assessed.
  • The grading function is over-emphasized and the learning function is under-emphasized.
  • There is a tendency to use a normative rather than a criterion approach, which emphasizes competition between students rather than personal improvement of each. The evidence is that with such practices the effect of feedback is to teach the weaker students that they lack ability, so that they are de-motivated and lose confidence in their own capacity to learn.

The description might apply equally to Macau.

At what price do we produce a society of test-givers and test-takers? There is plentiful evidence that East Asian students outperform many other countries in international tests of achievement, yet the most important question about this often remains unaddressed and unanswered: at what cost? Runaway testing carries with it the serious risk of an impoverished view of teaching and learning, and, more important, an impoverished view of people and their capacity to think, to act, to create and to relate. People are not only containers of bits of knowledge, to be reproduced when the appropriate trigger is activated; we are not Pavlov's dogs.

I am not against memorization. It can be wonderful. As a school and university student, I delighted in learning by heart the poetry of Yeats and Pasternak, and I carry the music of Schubert and Bach in my head to this day. Nobody forced me to, and I was not tested on any of this, but I have them with me in my mind and my heart every day. But let us not equate this with the ritualistic, repetitive learning and reproduction through tests of textbook knowledge whose meaning may be either opaque or irrelevant to many students, and which they often soon forget after the test.

True learning requires the application and construction of ideas. Gardner (1999) suggests that brain-based research indicates the value of the dictum 'use it or lose it' - ideas and concepts must not be inert but must be applied and developed, away from the mere retention and memorization of facts. Does testing really let students apply their learning, create and test ideas? There is evidence from Macau that it does not, but that it is construed as testing the students' temporary absorption of a textbook and material learnt in class, and teachers control large classes by concentrating the curriculum on the delivery of textbook-based information. With class sizes in Macau often being large, teachers frequently report that the only way in which they can cope is in 'survival mode', which is by emphasis on repetition of facts, and instructional styles which are reinforced by testing; the system 'keeps the lid' on large classes.

One has to be cautious in over-criticizing rote learning, memorization, putative low-level cognitive strategies, large classes and putative teacher-centred teaching, because: (a) Asian students achieve highly on international measures of performance; (b) repetition and memorization do not preclude, indeed they can lead to, understanding, deep rather than superficial learning, and high level cognitive strategies; (c) many Chinese teachers handle large classes in cognitively sophisticated, high-level, involved and engaging ways (see Note 1). That said, evidence from Macau, gained from the teachers in Macau, whilst not questioning such published findings, indicates that this may not be true in Macau. Though rote and memorization may ultimately lead to learning (the Chinese saying 'if you read a text a hundred times you can understand the meaning automatically' (Dahlin and Watkins, 2000)), not only does that seem highly inefficient learning but we have to ask what else such a view and practice of learning does to learners and teachers.

Two studies of testing in Macau

Tests in Macau schools - their contents, frequency, scope, use and nature - are very largely controlled by the teachers themselves. In two studies of education in Macau (Morrison and Tang, 2002), teachers claimed that great emphasis was placed on tests and examinations, and there were many advantages of testing, in that it:

  • 'is the driving force to make the students study';
  • ensures that 'student understand the lecture';
  • measures 'how much students have learnt';
  • is an objective and reliable way of measuring performance;
  • indicates 'how much knowledge a student has on a topic';
  • ensures that lazy students learn (a feature mentioned by many respondents);
  • 'forces students to learn their lessons';
  • 'puts pressure on students to learn';
  • 'makes students study, as they are highly marks oriented';
  • provides evidence of 'how effective is the teachers' teaching';
  • 'keeps teachers working hard';
  • is a way of assessing 'large numbers of students';
  • 'prepares students for university entrance'.

It is interesting, perhaps, to note the references to 'lazy students' and pressure; that this might be a symptom of deeper problems (e.g. student motivation) is not mentioned, as if testing were unproblematic. On the other hand, the same teachers in the study also indicated several disadvantages of testing, in that tests:

  • put students and teachers under severe pressure and overload;
  • test only the topics covered in the class: 'the students only learn what the teachers have assigned them to study';
  • lack variety, dominate the kinds and amounts of assessments, and dominate the curriculum, reinforcing its rigidity and narrowness;
  • significantly under-use self-assessment and self-diagnosis by students;
  • are demotivating and do not guarantee long-term learning; students forget things after the tests/exams;
  • largely examine only book knowledge;
  • depress students' self-esteem and motivation;
  • build in failure and create resentment in students;
  • punish the weaker students;
  • are strong partners to didactic, textbook-driven methods, drill, rote learning and memorization, superficial learning, student passivity and spoon-feeding;
  • create a culture of only mark-seeking in students;
  • train students 'to study mechanically' and 'do not make students study in the right way', thereby causing them 'to lose interest in studying';
  • 'cannot show the real situation of learning';
  • are 'not very encouraging on students who have trouble studying';
  • 'require too much memorization' of inert facts, often 'without understanding';
  • suppress creativity and critical thinking (one respondent remarked that 'if the questions ask for purely critical thinking, students don't bother to answer the question');
  • encourage 'students [to] spend too much time on remembering the dead knowledge for the test. It wastes a lot of their time';
  • build in passivity and make students lazy.

Students learn in order to pass the tests and then bleach much of the material from their minds; short-term memorization is followed by forgetting, as one respondent mentioned: 'after testing they forget all'.

The effect of testing on students is to create a mind-set in which passing tests is not only the goal of education but failing tests is to be avoided at all costs; only marks matter, and what is educationally important is that which gains marks. There are model answers, and marks are deducted when students do not repeat the model answer verbatim; as one respondent in the study mentioned: 'they [students] think that if they can dictate them [lines from the textbook] out during the test, they will already score high marks'.

When asked about the amount of testing, the teachers in the study reported the following:

  • The frequency of testing each class was:
    More than once per week: 27.8%
    Once a week: 22.2%
    Once a fortnight: 16.7%
    Between once a fortnight and once a month: 11.1%
    Less than once a month: 22.2%

  • The time spent on assessment and marking each week was:
    Less than five hours: 15.8%
    5-14 hours: 52.6%
    15-24 hours: 15.8%
    Over 24 hours: 15.8%

  • The time spent on testing each week was:
    1-5 hours: 72.2%
    6-10 hours: 16.7%
    11-14 hours: 11.1%

Clearly testing occupies a prime position in these teachers' and students' lives. The modal response, testing each class more than once a week, is staggering (extrapolated to mean that a student is tested twice every school day). In terms of teachers' time, the modal responses of spending between 5 and 14 hours per week on marking, and the equivalent of nearly one working day per week on testing (1-5 hours), demonstrate how deeply testing saturates the minds of teachers. We must break the mantra of testing in Macau, as it dehumanizes education.
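
The modal categories quoted above can be read straight off the reported percentages. The short sketch below uses only the frequency-of-testing and marking-time figures given in the study; the function names are illustrative and not part of the original research.

```python
# Percentages as reported by the teachers in the study.
testing_frequency = {
    "More than once per week": 27.8,
    "Once a week": 22.2,
    "Once a fortnight": 16.7,
    "Between once a fortnight and once a month": 11.1,
    "Less than once a month": 22.2,
}
marking_hours_per_week = {
    "Less than five hours": 15.8,
    "5-14 hours": 52.6,
    "15-24 hours": 15.8,
    "Over 24 hours": 15.8,
}

def modal_category(distribution):
    """Return the category with the largest share of responses."""
    return max(distribution, key=distribution.get)

def at_least_weekly(distribution):
    """Share of teachers testing each class once a week or more often."""
    return distribution["More than once per week"] + distribution["Once a week"]

print(modal_category(testing_frequency))
print(modal_category(marking_hours_per_week))
print(f"{at_least_weekly(testing_frequency):.1f}% test each class at least weekly")
```

On these figures, half of the responding teachers test each class at least once a week, and just over half spend between 5 and 14 hours a week on marking.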

Moving away from testing

Teachers themselves, as well as students, must learn from the test results, and must use the results to modify their teaching. For example, if an average class mark of 70% is scored, then, as Black (1998) suggests, teachers often take this as an indication to continue rather than as an indication of the need to revise and re-teach, despite the fact that, by implication (an average implies scores above and below the average) a significant proportion of the class may not have understood half of what was taught.
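
A small worked example may help to show why a healthy-looking class average can conceal students who have been left behind. The marks below are hypothetical, invented purely for illustration, and are not data from the Macau studies.

```python
from statistics import mean

# Hypothetical marks (%) for a class of ten; not data from the studies.
marks = [95, 92, 90, 88, 85, 70, 55, 45, 42, 38]

class_average = mean(marks)
below_half = [m for m in marks if m < 50]

print(f"Class average: {class_average:.0f}%")
print(f"Students below 50%: {len(below_half)} of {len(marks)}")
```

In this invented class the average is 70%, yet three of the ten students scored below 50%, so the average alone would give the teacher no signal that re-teaching is needed.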

Assessment must replace testing, and assessment must be formative. Black (1998) makes the very telling comment that one cannot have genuine or extended formative assessment unless one is prepared to modify the curriculum. This is a salutary message for those committed to a lock-step curriculum, whose pace, timing, and contents are prescribed for every student. He makes the point that formative assessment cannot just be bolted onto an existing scheme; it changes schemes.

Further, teacher assessment, he suggests, is effective in raising levels of achievement and motivation if it:

  • is criterion-referenced rather than norm-referenced;
  • uses praise rather than blame;
  • is differentiated to meet individual needs;
  • concentrates on, and is referenced to, learning goals;
  • sets attainable targets;
  • is part of a flexible and changeable programme of learning.

Ineffective and unhelpful feedback comprises statements like: 'try harder'; 'your spelling is poor'; 70%; Grade D. It is impossible for the learner to know from this how to improve and what to do to improve. Effective feedback, on the other hand, indicates what needs to be done to improve, what are the targets and how they can be reached, where attention needs to be focused, how errors can be corrected, how the learner can improve, and it is timely, frequent and ongoing. Feedback tells the student what were the results of her/his work; added to that, guidance - feedforward - enables the student to act on the feedback.

Formative assessment plays a major role in student learning; it improves learning and achievement. Improving learning through assessment is dependent on several key factors (Black and Wiliam, 1998):

  • the provision of effective feedback to students;
  • the active involvement of students in their own learning;
  • adjusting teaching to take account of the results of assessment;
  • a recognition of the profound influence assessment has on the motivation and self-esteem of students, both of which are crucial for learning;
  • the need for students to be able to assess themselves and understand how to improve;
  • sharing learning goals with students;
  • involving students in self-assessment;
  • providing feedback which leads to students recognizing their next steps and how to take them;
  • underpinning by confidence that every student can improve.

Conclusion

I am not saying that we should not have tests. That is nonsense. Tests have their place in education, but it is limited and, indeed, limiting. What I am saying is that tests should be massively reduced in Macau, that formative assessment rather than simply testing should be increased significantly, and that tests, if they are to be used, should:

  1. be the consequence rather than the drivers of education;
  2. require application and higher order thinking rather than simple repetition;
  3. not constrain or drive curricula unduly;
  4. be reduced in frequency;
  5. not be couched largely in terms of passing and failing;
  6. not be used as sole indicators of learning;
  7. not be taken to be the only, or principal, aspect of education and learning that is important;
  8. be interpreted in the knowledge that the reliability of marks is suspect;
  9. be used to modify and improve teaching;
  10. promote learning;
  11. increase motivation, self-esteem and enthusiasm for learning.

Testing should be reduced in order to release time for:

  1. deep and higher order learning and thinking;
  2. application and construction of knowledge;
  3. exploration, creativity and discovery;
  4. breadth, flexibility and open-endedness of curricula;
  5. student autonomy;
  6. teaching and learning.

Test less; learn more. Test less; teach more. Test less; achieve more. The idea that 'weighing a pig' (constantly testing) increases the size of the pig (improves the intellectual capacity of learners) offends logic. Pigs need food in order to grow, not measurements.

The price of testing in Macau is frequently success for a few but failure for the majority. What kind of educational community is it that not only considers education in terms of success and failure but believes that success or failure can be shown in marks, however deep-seated in Macau's - and East Asian - culture marks and tests are? A student's failure should be the school's failure.

The spectre of students and teachers being caught up in a humdrum cycle of textbook-driven learning, reinforced by testing, is educationally bankrupt and desperately damaging to students and teachers. Teachers and students are reduced to technicians. Schooling becomes circular; tests drive teaching and teaching is reinforced by testing. Schooling is closed; it is going nowhere. That is the antithesis of education.

(Keith Morrison, Vice Rector of the Inter-University Institute of Macau. U Ngai, staff of the Division of Research and Education Reform, Education and Youth Affairs Bureau of Macau.)

NOTES
1. See Biggs (1996a; 1996b), Marton et al (1996), Dahlin and Watkins (2000), Biggs and Watkins (2001), Watkins and Biggs (2001), Cortazzi and Jin (2001) and Mok et al (2001).

REFERENCES
Biggs, J. B. (1996a) Western misperceptions of the Confucian-heritage learning culture. In D. A. Watkins and J. B. Biggs (Eds) The Chinese Learner: Cultural, Psychological and Contextual Factors. Hong Kong and Australia: Comparative Education Research Centre and the Australian Council for Educational Research Ltd., pp. 45-67.
Biggs, J. B. (1996b) Learning, schooling and socialization: A Chinese solution to a Western problem. In Sing Lau (Ed.) Growing Up the Chinese Way: Chinese Child and Adolescent Development. Hong Kong: The Chinese University Press, pp. 147-67.
Biggs, J. B. and Watkins, D. A. (2001) Insights into teaching the Chinese learner. In D. A. Watkins and J. B. Biggs (Eds) Teaching the Chinese Learner: Psychological and Pedagogical Perspectives. Hong Kong and Australia: Comparative Education Research Centre and the Australian Council for Educational Research Ltd., pp. 277-300.
Black, P. (1998) Testing: Friend or Foe? London: Falmer.
Black, P. and Wiliam, D. (1998) Inside the Black Box: Raising Standards through Classroom Assessment. London: King's College, University of London. http://www.pdkintl.org/kappan/kbla9810.htm.
Cortazzi, M. and Jin, L. (2001) Large classes in China: 'good' teachers and interaction. In D. A. Watkins and J. B. Biggs (Eds) op cit, pp. 115-34.
Dahlin, B. and Watkins, D. (2000) The role of repetition in the processes of memorising and understanding: A comparison of the views of German and Chinese secondary school students in Hong Kong. British Journal of Educational Psychology, 70, pp. 65-84.
Gardner, H. (1999) The Disciplined Mind. New York: Simon and Schuster, pp. 76-82.
Lewin, K. and Wang, L. (1990) University entrance examinations in China: a quiet revolution. In P. Broadfoot, R. Murphy and H. Torrance (Eds) Changing Educational Assessment. London: Routledge, pp. 153-76.
Marton, F., Dall'Alba, G., and Tse, L. K. (1996) Memorizing and understanding: the keys to the paradox? In D. Watkins and J. Biggs (Eds) op cit, pp. 69-83.
Mok, I., Chik, P. M., Ko, P. Y., Kwan, T., Lo, M. L., Marton, F., Ng, D. F. P., Pang, M. F., Runesson, U. and Szeto, L. H. (2001) Solving the paradox of the Chinese learner. In D. A. Watkins and J. B. Biggs (Eds) op cit, pp. 161-79.
Morrison, K. R. B. and Tang, F. H. (2002) Testing to destruction: a problem in a small state. Assessment in Education, 9 (3), pp. 289-317.
Sacks, P. (1999) Standardized Minds. Cambridge, MA: Perseus Books.
Sousa, D. A. (2001) How the Brain Learns (second edition). Thousand Oaks, CA: Corwin Press Inc.
Tang, F. H. (2002) An Investigation into English Teaching, Learning and Achievements in Macau. Unpublished Ed. D. thesis. University of Durham, UK.


Education and Youth Development Bureau