Basic English Language Testing: Review Questions
(I) Stages in the Development of English Language Testing
1 Pre-scientific stage
This stage lasted until the 1950s. At that time, foreign language teaching emphasized the written form rather than the communicative aspect of the language. No special skill or expertise in testing was required: the subjective judgment of the teacher was considered to be of paramount importance. Tests usually consisted of essay writing, translation, grammatical analysis and sometimes oral tests. Spolsky calls the traditional approach pre-scientific because it paid no attention to such matters as reliability and objectivity and did not use statistical methods.
2 Psychometric-structural stage
The dominant form of testing in the 1950s and 1960s was the psychometric-structural approach. The American linguist Robert Lado, one of the first testing scholars, stressed two points with the aim of establishing an objective way of measuring human language proficiency. First, tests should test language usage and not knowledge about language. Second, the structures to be tested should be valid structures in colloquial language use. The structural, sentence-based view of language fitted in quite well with the psychometric quest for samples of individual elements to be tested. This resulted in standardized tests with an emphasis on discrete-point items, which can be further divided into subtests. The tests are characterized by the conviction that testing can be objective, precise, reliable and scientific. The multiple-choice format was regarded as a satisfactory instrument for measuring proficiency in the comprehension of written English.
3 Psycholinguistic-sociolinguistic stage
This stage covers the 1960s and 1970s, when Noam Chomsky’s linguistic theory shook the foundations of structuralism. It involves the testing of language in context and is thus concerned primarily with meaning and the total communicative effect of the discourse. As early as 1961, John B. Carroll emphasized that, in addition to discrete-point tests, integrative tests should be used. These are tests aimed not so much at testing separate elements as at measuring the total communicative effect of an utterance. Integrative tests do not seek to separate the language skills into neat divisions to improve test reliability; instead, they are often designed to assess the learner’s ability to use two or more skills simultaneously. Integrative testing is best characterized by the use of cloze testing and dictation. Oral interviews, translation and essay writing are also included among integrative tests.
4 Communicative-pragmatic stage
The communicative-pragmatic period emerged alongside the global shift, since the 1970s, toward the communicative aspect of language in international educational circles. The rise of sociolinguistics has shifted interest from linguistic competence to communicative competence. The communicative test aims to measure candidates’ communicative competency. Communicative competency is associated not only with linguistic competence but also with communicative competence. The communicative test is characterized by a strong emphasis on the actual use of language in real-life settings, which has to be based on the communicative needs of the learners.
(II) Validity and Reliability of Tests
1 Validity
Validity is the degree to which a test measures what it is supposed to measure, or can be used successfully for the purposes for which it is intended. Every test, whether it be a short, informal classroom test or a public examination, should be as valid as the constructor can make it. A number of types of validation are applied to tests. In this brief survey, we shall limit our consideration to just a few of the most common categories.
A: Content validity
This kind of validity depends on a careful analysis of the language being tested and of the particular course objectives. The test should be so constructed as to contain a representative sample of the course, the relationship between the test items and the course objectives always being apparent. When embarking on the construction of a test, the test writer should first draw up a table of test specifications describing in very clear and precise terms the particular language skills and areas to be included in the test. The test writer should attempt to quantify and balance the test components, assigning a certain value to indicate the importance of each component in relation to the other components in the test. In this way, the test can achieve content validity and reflect the component skills and areas which the test writer wishes to include in the assessment.
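As an illustration of quantifying and balancing test components, the following Python sketch turns a table of test specifications into a number of items per component. The component names, the weights and the total of 50 items are hypothetical and are not taken from any particular course. The rounding scheme (largest remainder) is only one reasonable choice.

```python
from math import floor

def allocate_items(weights, total_items):
    """Split total_items across components in proportion to their weights."""
    total_weight = sum(weights.values())
    # Exact (possibly fractional) share of items for each component.
    exact = {name: total_items * w / total_weight for name, w in weights.items()}
    # Start from the rounded-down share.
    counts = {name: floor(share) for name, share in exact.items()}
    # Give any leftover items to the components with the largest remainders.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts

# Hypothetical specification: relative importance of each component.
spec = {"listening": 30, "reading": 30, "grammar": 20, "writing": 20}
print(allocate_items(spec, 50))
```

Writing the weights down explicitly in this way makes it easy to check that no skill area is over- or under-represented relative to the course objectives.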
B: Construct validity
If a test has construct validity, it is capable of measuring certain specific characteristics in accordance with a theory of language behavior and learning. This type of validity assumes the existence of certain learning theories or constructs underlying the acquisition of abilities and skills. For example, it can be argued that a speed reading test based on short comprehension passages is an inadequate measure of reading ability (and thus has low construct validity) unless it is believed that the speed reading of short passages relates to the ability to read a book quickly and efficiently and is a proven factor in reading ability.
C: Empirical validity
The best way to check on the actual effectiveness of a test is to determine how test scores are related to some independent, outside criterion such as marks given at the end of a course or assessors’ ratings. If the evidence shows that there is a high correlation between test scores and a trustworthy external criterion, then the test can be claimed to have empirical validity. Empirical validity is of two general kinds, predictive and concurrent validity, depending on whether test scores are correlated with subsequent or concurrent criterion measures. Empirical validity relies in large part on the reliability of both the test and the criterion measure.
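The correlation between test scores and an external criterion can be computed directly. The sketch below uses the Pearson product-moment coefficient, which is one common choice. The text does not say which coefficient is intended, and the two score lists are invented for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: test scores and end-of-course marks for six students.
test_scores = [52, 61, 70, 74, 83, 90]
course_marks = [55, 58, 72, 70, 85, 88]
r = pearson_r(test_scores, course_marks)
print(f"r = {r:.2f}")
```

If the criterion marks are collected at about the same time as the test, a high r supports concurrent validity. If they are collected later, it supports predictive validity.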
D: Face validity
Face validity here simply refers to the way the test looks to the examiners, test administrators, educators and the like. Obviously, this is not validity in the technical sense, yet its importance should not be underestimated, for if the content of a test appears irrelevant, silly or inappropriate, test administrators will hesitate to adopt it and examinees will lack the proper motivation. For example, if a test of reading comprehension contains many dialect words which might be unknown to the students, the test may be said to lack face validity.
2 Reliability
Reliability is a measure of the degree to which a test gives consistent results. A test is said to be reliable if it gives the same results when it is given on different occasions or when it is used by different people. A test cannot measure anything well unless it measures consistently. Two different types of consistency or reliability are involved: reliability of the test itself and reliability of the scoring of the test. Test reliability is affected by a number of factors, chief among them being the adequacy of the sampling of tasks. Generally speaking, the more samples of students’ performance we take, the more reliable will be our assessment of their knowledge and ability. In addition, test reliability will be adversely affected if the conditions under which the test is administered tend to fluctuate from administration to administration. Poor student motivation will also lower the reliability of a test. Reliability may also be lowered by factors beyond the examiners’ control (e.g., illness or personal problems affecting a number of examinees). Scorer or rater reliability concerns the stability or consistency with which test performances are evaluated. Scorer reliability is nearly perfect in the case of multiple-choice tests, but tends to be low in the case of free-response tests.
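The remark that more samples of performance give a more reliable assessment is often quantified with the Spearman-Brown prophecy formula, a standard psychometric result that the notes above do not themselves mention. The following is a minimal sketch, assuming the added tasks are comparable in quality to the existing ones. The starting reliability of 0.60 is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    reliability   -- reliability of the current test (between 0 and 1)
    length_factor -- new length divided by current length (2 means doubled)
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical example: a short test whose reliability is 0.60.
for k in (1, 2, 3):
    print(k, round(spearman_brown(0.60, k), 2))
```

Under these assumptions, doubling the number of comparable tasks raises the predicted reliability from 0.60 to 0.75, and each further increase in length adds less than the one before.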
(III) Purposes of Testing
Purposes of tests
(1) Language tests can be used as sources of information for making decisions within the context of educational programs.
A. Selection for instructional programs
Some educational programs require selection or entrance examinations. The purpose of this type of test is to discriminate between those students who are prepared for an academic or training program and those who are not. For example, the American TOEFL test is used to measure the English language proficiency of foreign students who want to study in the U.S.A.
B. Placement of students
In many language programs, students are grouped homogeneously according to such factors as level of language proficiency, language aptitude, language use needs and professional or academic specialization. Probably the most common criterion for grouping students in such programs is the level of language proficiency. Therefore, placement tests are frequently designed to measure the students’ language proficiency.
C. Measuring progress
A common use of language testing is to provide information on students’ progress, so that the teacher will be able to locate the precise areas of difficulty encountered by the class or by individual students and make decisions regarding appropriate modifications in the instructional procedures and learning activities.
D. Evaluation of language teaching
Tests should also enable the teacher to ascertain which parts of the language program have been found difficult by the class and whether the class objectives have been met. In this way, the teacher can evaluate the effectiveness of the syllabus as well as the methods and materials he is using.
(2) Language tests have a potentially important role in virtually all language research, both basic and applied, that is related to the nature of language proficiency, language processing, language acquisition and language teaching.
Much current research into the nature of language proficiency has now come to focus on identifying and empirically verifying its various components. Of particular interest in this regard are models of communicative competence, which have provided the theoretical basis for the development of communicative testing. Such tests in turn provide the basis for verifying the theoretical models.
Language tests can also be used in research into the nature of language processing. Responses to language tests can offer a rich body of data for the identification of processing errors and their explanations, while language testing techniques can serve as elicitation procedures for collecting information on language processing.
A third research use of language tests is in the examination of the nature of language acquisition. Several studies have used tests of different components of communicative language proficiency as criteria for examining the effect of learner variables such as length of residence in the country, age of first exposure to the target language and motivational orientation on language acquisition. Language tests are also sometimes used as indicators of factors related to second language acquisition, such as language aptitude and level of proficiency in the native language.
Language tests play an important role in the investigation of the effects of different instructional settings and techniques on language teaching. They have provided criterion indicators of language proficiency for studies in classroom-centered second language learning, and for research into the relationship between different language teaching strategies and aspects of second language components.
(IV) Different Definitions of Language Competence
1 Chomsky’s definition of competence
Chomsky (1965) first introduced the terms competence and performance. Competence refers to the speaker-hearer’s knowledge of language; Chomsky regards as competent those who know the language perfectly and are “unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors” in their actual performance (Chomsky, 1965). He briefly addresses performance as the actual use of language in concrete situations. He regards competence as the proper domain of his linguistic study and largely ignores the study of performance.
2 Hymes’ model of communicative competence
Hymes (1972) criticized Chomsky’s formulation of competence and proposed his own notion of competence, which encompasses a much larger scope: communicative competence. This notion is intended to include not only grammatical competence (implicit or explicit knowledge of the rules of grammar), but also contextual or sociolinguistic competence (knowledge of the rules of language use).
3 Canale and Swain’s model of communicative competence
Canale and Swain (1980) expanded Hymes’ notion of communicative competence further to include grammatical competence, sociolinguistic competence and strategic competence. The model was subsequently updated by Canale (1983), who proposed a four-dimensional model comprising linguistic, sociolinguistic, discourse and strategic competence. Their model provides a useful theoretical framework for developing communicative tests.
Linguistic competence is understood to reflect knowledge of vocabulary and the rules of word formation, pronunciation/spelling and sentence formation. Such competence focuses directly on the knowledge and skills required to understand and express accurately the literal meaning of utterances. Sociolinguistic competence addresses the extent to which utterances are produced and understood appropriately in different sociolinguistic contexts, depending on contextual factors such as topic, status of participants and the purpose of the interaction. Appropriateness of an utterance refers to both appropriateness of meaning and appropriateness of form.
Discourse