Número 15 / DICIEMBRE, 2021 (80-96)
CHARACTERIZING ENGLISH ASSESSMENT INSTRUMENTS: AN OVERVIEW OF THEIR DESIGN
CARACTERIZACIÓN DE INSTRUMENTOS DE EVALUACIÓN DEL INGLÉS: UNA MIRADA A SU DISEÑO
Research Article
DOI: https://doi.org/10.37135/chk.002.15.05
Received: 15/10/2020
Accepted: 08/01/2021
Claudio Díaz Larenas
Universidad de Concepción, Facultad de Educación, Departamento de Currículum e Instrucción, Concepción, Chile.
claudiodiaz@udec.cl
ORCID: https://orcid.org/0000-0003-2394-2378
Alan Felipe Jara Díaz
Universidad de Concepción, Facultad de Educación, Departamento de Currículum e Instrucción, Concepción, Chile.
alajara@udec.cl
ORCID: https://orcid.org/0000-0002-8667-5222
Yesenia Ester Rosales Orellana
Universidad de Concepción, Facultad de Educación, Departamento de Currículum e Instrucción, Concepción, Chile.
yrosales@udec.cl
ORCID: https://orcid.org/0000-0003-1913-9363
María José Sanhueza Villalón
Universidad de Concepción, Facultad de Educación, Departamento de Currículum e Instrucción, Concepción, Chile.
msanhuezav@udec.cl
ORCID: https://orcid.org/0000-0003-4680-3207
Abstract
Assessment tends to be associated with students and learners; however, the
term assessment encompasses both teachers and students. To understand
the purpose of language assessment instruments, it is key to look at
their designers and their preferences. This research aims to characterize
209 assessment instruments created by English teachers. This is a
non-experimental, descriptive study that analyzes the types of instruments,
the educational level, the language systems and skills, and the type and
number of items. Two of the most important findings are the preference of
Chilean English teachers for traditional assessment and their tendency to
assess vocabulary and grammar, along with the participants’ preference
for tests and fill-in-the-gap items.
Keywords: Assessment, teaching, students, tests
Resumen
La evaluación tiende a estar asociada a estudiantes y aprendices; sin
embargo, el término evaluación abarca tanto a profesores como estudiantes.
Para entender el propósito de los instrumentos de evaluación del idioma
es clave examinar a los diseñadores y sus preferencias. El objetivo de esta
investigación es caracterizar 209 instrumentos de evaluación creados por
profesores de inglés. Se trata de un estudio no experimental y descriptivo,
que analiza los tipos de instrumentos, el nivel educativo, los sistemas y
habilidades de la lengua inglesa, y el tipo y número de ítems. Dos de los
más importantes hallazgos están relacionados con las preferencias que los
profesores de inglés chilenos tienen hacia la evaluación tradicional y la
tendencia a evaluar el vocabulario y la gramática, además de preferir
los tests e ítems de completación de oraciones como los de uso más común.
Palabras clave: Evaluación, enseñanza, estudiantes, pruebas
INTRODUCTION
A good portion of students, if not all, have been
assessed by teachers for a certain work done
in class. This assessment could vary among
teachers, schools, and even countries. There
are plenty of options to assess students’ class
performance. For instance, tests and quizzes
are two of the many language assessment
instruments available for teachers to use.
Teachers must be able to choose among this large
quantity of language assessment instruments to
meet learners’ needs.
However, there is often a misconception about
the term assessment, the assessment process
itself, and its use. The term assessment relates
to students and teachers, given that most of
the time teachers are the ones who design the
different assessment instruments by taking into
consideration their own learners’ needs.
In this study, we will characterize 209 language
assessment instruments created by several
Chilean English teachers. These assessment
instruments come from kindergarten to university
teachers and include tools from public and
private educational establishments. This study
will also describe all the language assessment
items and will show the different types of
assessment instruments, their educational level,
the language system, the language skill presented
in the assessment, and the type and number of
items.
It will also explain the tendency of Chilean teachers
to prefer traditional assessment over alternative
assessment. This paper was developed within
the research grant FONDECYT 1191021,
entitled Estudio correlacional y propuesta de
intervención en evaluación del aprendizaje
del inglés: las dimensiones cognitiva, afectiva
y social del proceso evaluativo del idioma
extranjero.
THEORETICAL FRAMEWORK
LANGUAGE ASSESSMENT
According to Le Grange & Reddy (1998:3),
“assessment occurs when judgments are made
about a learner’s performance, and entails
gathering and organizing information about
learners, to make decisions and judgments
about their learning”. Assessment aims to gather
information and evidence about students from
original sources to make inferences about the
knowledge and competences they have gained. Boud (1990) stated
that assessing students improves the learning
quality and the standards of performance.
Several studies show assessment as a positive
influence on students (Black & William 1998;
Kennedy, Chan, Fok & Yu 2008). It provides
feedback, allowing students to acknowledge
their strengths and weaknesses to improve
their learning process. There is a vast range of
assessment methods and tools to help educators
assess various aspects of student learning.
Assessment methods are the techniques,
strategies, and instruments an educator may
use for gathering data on students’ learning.
Methods will vary depending on the learning
outcomes and the students’ level (Allen, Noel,
Rienzi & McMillin 2002), and they can take
different forms: tests, rubrics, checklists, rating
scales, etc.
TRADITIONAL ASSESSMENT
Traditional assessment, often related to testing
and standardized tests, has been challenged by
alternative assessment. Many authors agree that
traditional assessment is indirect and inauthentic,
and that it only measures what learners can do at a
particular time in a decontextualized setting
(Dikli 2003). Even though it might be hard
to believe that educators still use this type of
assessment as their only testing tool, traditional
assessment continues to be the preferred norm.
Traditional assessment stands out for its
objectivity, reliability, and validity (Law & Eckes
1995), qualities associated with standardized
tests and multiple-choice items. Traditional
assessment also tends to be more practical,
since its item types can be corrected easily
and are sometimes even scored by automated
machines, which provides reliable results.
TESTS AND QUIZZES
Tests are powerful tools with a variety of purposes
for education (Davis 1993). They help to test
and assess whether a student is learning what is
expected. A well-designed test can motivate and
help students to focus their academic efforts.
As Crooks (1988), McKeachie (1986), and Wergin
(1998) claimed, learners study according to
what they think teachers will test. For instance,
if students expect a test based on facts, they
will memorize information. On the other hand,
if students expect that a test will require
problem-solving, they will work on understanding and
applying information.
Tests and quizzes differ in the
extent of content covered and in their weight in
calculating the final grade in a subject (Jacobs &
Chase 1992). A test focuses on particular
aspects of subject-based material and covers a
limited extent of content. Several item types
can be used to measure learning, for instance
multiple choice, true or false questions, reading
comprehension questions, and fill in the blanks.
It is key to highlight that tests can be adapted to
fulfill students’ needs (Ministerio de Educación
de Chile, 2019). A quiz, on the other hand, is a
quick test that does not have a great impact on
the final grade. A quiz is often very limited in the
content it covers, and it is a way to keep track
of students’ gained knowledge.
LANGUAGE TESTING AND TYPE
OF ITEMS
Language testing is often confused with
assessment, as both terms tend to appear together
in discussions of assessment. Language testing is
the practice of measuring the proficiency of an
individual in using English. It is important to
understand this terminology as language tests
are part of our education system and society. The
scores from tests are a tool to make inferences
about individuals’ language ability.
As Bachman (2004:3) stated, “language tests
thus have the potential for helping us collect
useful information that will benefit a wide variety
of individuals.” Testing is as old as language
teaching “since any kind of teaching has been
followed by some sort of testing” (Farhady
2018:1). From university to school, teachers
have used tests to measure students’ abilities and
English knowledge.
Most teachers develop their own tests, as these are a
tool for deciding what to do inside the
classroom (Spaan 2006). The prime consideration
in developing any test is its purpose. Test
developers also need to consider several factors
related to the test takers. These factors may vary
from classroom to classroom, from school to
school, and from region to region within the
same country. Spaan (2006:72) defines test
takers “in terms of age, academic or professional
level, language proficiency level, and possibly
geographical location or cultural background”.
The next step when designing a test is to
develop the test specifications. Teachers must
decide which language skills will be measured
(listening, reading, speaking, and writing), and
whether they will be measured as integrated
or independent skills. The content and level
must also be defined beforehand, along with the
design of the test itself. “How long will the test
be, both in terms of size and number of items
and in terms of time? Will the test be timed
or not? Will it be speeded?” (Spaan 2006:74).
Scoring is also part of the test specifications,
as are practical considerations
such as the number of students or the size of the
classroom.
What follows next is to determine the type of
items to include in the test. Most educators agree
that the best tests contain a variety of items and
response types to achieve their purpose, since no
single item type is sufficient by itself. According
to Spaan (2006:79), the best tests are the ones that
contain different item types, “which is fairer to
test takers in that it acknowledges a variety of
learning styles, balancing objective items with
subjectively scored items”.
Objective items require the individual to select
the correct answer from several alternatives or to
supply a word to answer a question or complete
a statement; while subjective items allow the
individual to organize and present an original
answer (CTL Illinois 2019). Objective
items include multiple-choice, true-false,
matching, chronological sequence, and
completion items, whereas subjective items include
essay-answer, open-ended, problem-solving,
and performance test items.
ALTERNATIVE ASSESSMENT
It is important to support students and make
them actively involved in the assessment process
(Black & William 1998), to build self-awareness
of their learning processes. Alternative
assessment includes self- and peer-assessment,
which aims to develop autonomy, responsibility,
and critical thinking in learners (Sambell &
McDowell 1998).
The use of alternative assessment over traditional
assessment encourages critical
thinking and the use of real-world problems,
which makes it more meaningful to the learner (Mertler
2016). Traditional assessment, by contrast, only
develops the skill of recalling, so that learning
outside the classroom becomes meaningless to
students.
This idea of real-life problems is further reinforced
by Dikli (2003), who explained that several
approaches are under the concept of alternative
assessment. However, two of them stand out as
the most relevant: real-world instructions and the
use of critical thinking to solve contextualized
problems. The author further describes the
activities considered as alternative assessments
such as open-ended questions, portfolios, and
projects, among others.
RUBRICS
Torres & Perera (2010) define the rubric as
an instrument of evaluation based on two
scales: qualitative and quantitative. Rubrics
are composed of pre-established criteria, which
measure the actions taken by a student during a
task. Rubrics are specific models for testing the
knowledge gained in the classroom on the topics assigned
by the teacher.
A rubric is designed as a chart. The chart contains
specific descriptors and criteria for the students’
performance. In addition, a rubric makes the
goals explicit, so it works as a valuable source of feedback
for both students and teachers. Teachers can
adapt rubrics both to assess and to guide
students.
With a rubric, students can identify the purpose of the topic,
the steps to follow, and how they will be assessed
(Brindley & Wigglesworth 1997). There are two
types of rubrics: holistic and analytic.
The holistic rubric provides a global appraisal
of knowledge, while the analytic rubric
focuses on a specific aspect of knowledge.
EMPIRICAL STUDIES
Astawa, Handayani, Mantra & Wardana (2017)
carried out a study on language test items.
The study examined how different test items
showed a high degree of validity and reliability
in an experimental group of teachers, and it
reported a lasting effect on language habit
development. For this experiment, the authors
decided to work only with an experimental
group. The experimental group had to create a
test focused on the writing skill, which was then analyzed
for validity and reliability.
After a week of attending the workshop
organized by the researchers, the teachers learned
how to construct different test items. Likewise,
the teachers could identify the principles of
validity and reliability in their tests. The last
part of the workshop comprised how promptly
and consistently the teachers could apply the