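Graduate Degree Course Examination Paper
School (Department/Institute): College of Foreign Languages
Major: English
Examination subject: English Educational Measurement and Evaluation
Semester: Second semester
Graduate student name: 戎竞雄
Student ID: 132300176
Examination grade:
Supervisor's comments:
Supervisor's signature:
Date (year/month/day):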
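Notes
1. All degree course examination questions and answer papers must be bound together with this cover sheet. The marking supervisor must grade the paper in red ink, enter the score in the designated place on this cover and, after writing the comments, hand it within two weeks (one month for examinations taken as a paper) to the academic affairs officer in the school (department/institute) office. The academic affairs officer records the grades promptly and, before the end of the semester or at the start of the second semester, submits the grade sheets to the Graduate Office for centralized filing. The examination questions and papers are kept by the school (department/institute) office.
2. Apart from computer printing paper and 16-kai small-grid manuscript paper, degree course examinations must use the "degree course examination paper" printed by the Graduate Office.
3. Print this cover double-sided on A4 paper, with these notes on the back of the cover.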
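Shanghai Normal University Standard Examination Paper
Academic year 2013~2014, second semester
Examination date: 2014-08-__
Subject: English Educational Measurement and Evaluation
Major: Subject Teaching + Curriculum and Teaching Theory
Degree: Master's
Year: 13
Name: 戎竞雄
Student ID: _132300176__

Section   Points   Score
I         45
II        55
III
IV
V
VI
VII
VIII
Total     100

I pledge to abide by the Shanghai Normal University Examination Room Rules and to take this examination honestly. Signature: __戎竞雄_____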
I. Answer in detail the following questions: (45%, 15 points for each)
1. Suggest the differences between proficiency tests and achievement tests. Give examples if necessary.
Answer: A proficiency test assesses the general knowledge or skills commonly required for entry into a group of similar institutions. One example is TOEFL. Proficiency tests are norm-referenced tests (NRTs) because NRTs have all the qualities desirable for proficiency decisions. An achievement test, by contrast, must be designed with very specific reference to a particular course. Achievement tests are often based directly on course objectives and are therefore criterion-referenced. Such tests are typically administered at the end of a course to determine how effectively students have mastered the instructional objectives. Achievement tests must be not only very specifically designed to measure the objectives of a given course but also flexible enough to help teachers respond readily to what they learn from the test about the students' abilities, the students' needs, and the students' learning of the course objectives. One example is a test given at the end of a course.
2. The following are two different kinds of score distribution. What information do these two figures convey to us? (Discuss from the score distribution of NRT and that of CRT)
Answer: The first figure shows the score distribution of a norm-referenced test (NRT), which is designed to measure global language abilities (for instance, overall English language proficiency, including listening ability, reading comprehension, and so on). Each student's score on such a test is interpreted relative to the scores of all other students who took the test. Such comparisons are usually made with reference to the concept of the normal distribution (familiarly known as the bell curve). The purpose of an NRT is to spread students out along a continuum of scores so that those with poor language abilities are at one end of the normal distribution and those with strong language abilities are at the other end, with the bulk of the students falling near the middle.
The second figure shows the score distribution of a criterion-referenced test (CRT), which is usually produced to measure well-defined and fairly specific objectives. Often these objectives are specific to a particular course or program. Each student's score is meaningful without reference to the other students' scores: a student's score on a particular objective indicates the percentage of the knowledge or skill in that objective that the student has learned. Moreover, the distribution of scores on a CRT need not be normal. If all the students know 100% of the material on all the objectives, then all the students should receive the same score. The purpose of a CRT is to measure the amount of learning that a student has accomplished on each objective. In most cases, the students would know in advance what types of questions, tasks, and content to expect for each objective.
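The two ways of reading a score described above can be contrasted in a short sketch. It is only an illustration: the scores are invented, and using a percentile rank to stand in for the NRT's relative interpretation is an assumption, not something the answer specifies.

```python
# Hypothetical scores, invented purely for illustration.
group_scores = [52, 61, 68, 70, 74, 79, 85, 93]


def percentile_rank(score, scores):
    """NRT-style reading: percent of test takers who scored below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


def percent_mastered(items_correct, items_total):
    """CRT-style reading: percent of one objective's items answered correctly."""
    return 100.0 * items_correct / items_total


# The same student can be described in two different ways.
nrt_view = percentile_rank(79, group_scores)  # depends on the other students
crt_view = percent_mastered(9, 10)            # depends only on the objective

print(f"Percentile rank in the group: {nrt_view:.1f}")
print(f"Percent of the objective mastered: {crt_view:.1f}")
```

The first number changes whenever the comparison group changes, while the second does not, which is the distinction the two figures are meant to bring out.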
3. What are reliability and validity? What is the relationship between reliability and validity? To assess a candidate's oral language ability in an oral test, the examining body often asks two examiners to score that candidate's performance. Similarly, when an examiner is grading a composition for a certain test, e.g. TEM4, the same composition can be marked by the same examiner on two occasions. Explain in detail why such measures should be taken.
Answer: Test reliability is defined as the extent to which the results can be considered consistent or stable. Test validity is defined here as the degree to which a test measures what it claims, or purports, to be measuring.
If teachers administer a placement test to their students on one occasion, they would like the scores to be very much the same if they were to administer the same test again. The degree to which a test is consistent, or reliable, can be estimated by calculating a reliability coefficient, which can go as high as +1.0 for a perfectly reliable test or as low as 0 when the results on the test are totally unreliable. Once the test has been administered twice and the pairs of scores for each student are lined up, simply calculate a Pearson product-moment correlation coefficient between the two sets of scores. The correlation coefficient will provide a conservative estimate (that is, an underestimate) of the reliability of the test over time. This reliability estimate can be interpreted as the percent of reliable variance on the test.
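The test-retest calculation described above can be sketched as follows. The two score lists are hypothetical, invented only to show the mechanics, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for five students on two administrations of one test.
first_administration = [72, 85, 90, 65, 78]
second_administration = [70, 88, 87, 68, 80]

reliability = pearson_r(first_administration, second_administration)
print(f"Test-retest reliability estimate: {reliability:.2f}")
```

The same function applies to the two situations in the question: correlating the scores given by two examiners, or the scores given by one examiner on two occasions, estimates how consistent the marking is.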
Test validity, as noted, is the degree to which a test measures what it claims, or purports, to be measuring. For example, if a test claims to measure proficiency in German listening comprehension, that is just what it should assess.
II. Discussion (55%, 17 points for 1 and 2, 21 points for 3)
1. Look at the following table and answer the questions that follow:
1) Calculate the total standard scores for the two students.
2) Compare the total standard scores of the two students, determine which student scored higher, and explain briefly why a teacher would do better to use the total standard scores instead of the total raw scores.
Subject                  Mean  SD  Student A  Student B
Psychology               81    6   85         80
Writing                  85    9   80         91
Listening Comprehension  70    5   76         85
Reading                  74    10  93         66
Literature               88    3   90         95
Total                              424        417
Answer:
1)
Subject     Mean  SD  Student A  Standard score (A)  Student B  Standard score (B)
Psychology  81    6   85         0.667               80         -0.17
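The standard scores in the Psychology row are consistent with the usual formula z = (raw score - mean) / SD. Assuming that formula is the intended one, the remaining rows and the totals can be worked out from the question's table. This is a minimal sketch, and the variable names are illustrative.

```python
# (subject, mean, SD, Student A raw score, Student B raw score),
# copied from the table in the question.
rows = [
    ("Psychology", 81, 6, 85, 80),
    ("Writing", 85, 9, 80, 91),
    ("Listening Comprehension", 70, 5, 76, 85),
    ("Reading", 74, 10, 93, 66),
    ("Literature", 88, 3, 90, 95),
]


def standard_score(raw, mean, sd):
    """z-score: distance of a raw score from the mean, in SD units."""
    return (raw - mean) / sd


total_a = 0.0
total_b = 0.0
for subject, mean, sd, raw_a, raw_b in rows:
    z_a = standard_score(raw_a, mean, sd)
    z_b = standard_score(raw_b, mean, sd)
    total_a += z_a
    total_b += z_b
    print(f"{subject:<24} A: {z_a:7.3f}   B: {z_b:7.3f}")

print(f"{'Total standard score':<24} A: {total_a:7.3f}   B: {total_b:7.3f}")
```

With the table's values the totals are about 3.88 for Student A and 5.03 for Student B, so Student B ranks higher on standard scores even though Student A has the higher raw total (424 against 417). Raw totals treat a point in every subject as equal, whereas standard scores account for each subject's mean and spread.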