ERIC - Search Results

Showing all 9 results

Nonparametric Item Response Curve Estimation with Correction for Measurement Error

Peer reviewed

Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011

Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…

Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
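
As a rough illustration of the kernel regression mentioned in the abstract above, the sketch below estimates an item response curve with a Nadaraya-Watson estimator, using the observed total score as the regressor. It is not the authors' procedure and does not include their measurement-error correction; the function name, the Gaussian kernel, the bandwidth, and the data are all assumptions made for illustration.

```python
import math

def kernel_irc(scores, responses, grid, bandwidth=1.0):
    """Nadaraya-Watson estimate of P(correct | score) at each grid point."""
    estimates = []
    for x in grid:
        # Gaussian kernel weight for each examinee's observed score
        weights = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in scores]
        total = sum(weights)
        # Weighted proportion of correct (1) responses near score x
        estimates.append(sum(w * r for w, r in zip(weights, responses)) / total)
    return estimates

# Hypothetical data: observed total scores and 0/1 responses to one item
scores = [2, 3, 5, 6, 8, 9]
responses = [0, 0, 0, 1, 1, 1]
curve = kernel_irc(scores, responses, grid=[2, 5.5, 9], bandwidth=1.5)
```

Because the regressor here is the observed score rather than the true score, an estimate of this kind is subject to the bias the abstract describes.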

Strengthening Educational Quality in Developing Countries: The Role of National Examinations and International Assessment Systems

Peer reviewed

Somerset, Anthony – Compare: A Journal of Comparative and International Education, 2011

Educational practitioners rely predominantly on measures of outcome, rather than of inputs or process, in making judgements as to quality. Outcome measures are available from two main sources: (1) the relatively new international assessment systems; and (2) the traditional national examinations systems. The two types of system differ in their…

Descriptors: Testing Programs, Educational Quality, National Competency Tests, Educational Improvement

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Evaluating the Effects of Differences in Group Abilities on the Tucker and the Levine Observed-Score Methods for Common-Item Nonequivalent Groups Equating. ACT Research Report Series 2010-1

Chen, Hanwei; Cui, Zhongmin; Zhu, Rongchun; Gao, Xiaohong – ACT, Inc., 2010

The most critical feature of a common-item nonequivalent groups equating design is that the average score difference between the new and old groups can be accurately decomposed into a group ability difference and a form difficulty difference. Two widely used observed-score linear equating methods, the Tucker and the Levine observed-score methods,…

Descriptors: Equated Scores, Groups, Ability Grouping, Difficulty Level

Analysis of Standardized Testing Program Results 1973-74: Grades 1, 5, and 8 and ACT Report.

New Mexico State Dept. of Education, Santa Fe. Evaluation, Assessment, and Testing Unit. – 1974

This survey of the standardized testing program summarizes the data accumulated from the most recent administration of selected instruments in October 1973. It compares these findings with information from previous years and points to a few trends and possible conclusions. Assessment of mental abilities--1973-74 is presented for grade 1, and…

Descriptors: Academic Achievement, Comparative Analysis, Ethnic Groups, Grade 1

Ability Estimation and Item Calibration Using the One and Three Parameter Logistic Models: A Comparative Study. Research Report 77-1.

Reckase, Mark D. – 1977

Latent trait model calibration procedures were used on data obtained from a group testing program. The one-parameter model of Wright and Panchapakesan and the three-parameter logistic model of Wingersky, Wood, and Lord were selected for comparison. These models and their corresponding estimation procedures were compared, using actual and simulated…

Descriptors: Achievement Tests, Adaptive Testing, Aptitude Tests, Comparative Analysis
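
For orientation, the two models compared in the abstract above can be sketched in their usual textbook forms. This is a minimal illustration, not the calibration procedures cited in the report; the function names are hypothetical, and the optional scaling constant (often written D = 1.7) is omitted as a simplifying assumption.

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic model: probability of a correct response.

    theta: examinee ability, a: discrimination, b: difficulty,
    c: lower asymptote (pseudo-guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_1pl(theta, b):
    """One-parameter (Rasch-type) model: 3PL with a = 1 and c = 0."""
    return p_3pl(theta, a=1.0, b=b, c=0.0)
```

With ability equal to difficulty, the one-parameter model gives a probability of 0.5, while the three-parameter model gives the midpoint between c and 1.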

An Empirical Comparison of Mantel-Haenszel and Rasch Procedures for Studying Differential Item Functioning on Teacher Certification Tests.

Engelhard, George, Jr.; And Others – 1989

The agreement between Mantel-Haenszel and Rasch procedures for identifying differential item functioning (DIF) on teacher certification tests was studied. Two specific research questions were addressed: (1) whether the Mantel-Haenszel and Rasch procedures identify the same items as functioning differently; and (2) how consistently each method…

Descriptors: Administration, Black Students, Comparative Analysis, Early Childhood Education
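
The Mantel-Haenszel side of the comparison described above rests on a common odds ratio pooled across matched score levels. The sketch below shows that statistic only, in its standard form; it is not taken from the paper, and the function name and the counts in the example are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across score strata.

    Each stratum is a tuple of counts:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical counts for one item at two matched score levels
balanced = mantel_haenszel_odds_ratio([(10, 10, 10, 10), (20, 5, 20, 5)])
favors_reference = mantel_haenszel_odds_ratio([(20, 10, 10, 20)])
```

A value near 1 suggests the item behaves similarly for both groups at matched score levels; a value above 1 indicates the item favors the reference group.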

Technical Report on Edmonton Grade III Achievement. 1956-1977 Comparisons. A Study Conducted for the Alberta Advisory Committee on Educational Studies.

Clarke, S. C. T.; And Others – 1978

The Edmonton Grade III Achievement: 1956-1977 study is a comparison of achievement in reading, arithmetic, and language involving all of the third grade students in a large school system. Six basic skills tests which were administered to all of the Edmonton third grade students in 1956 were reprinted and administered to all of the third grade…

Descriptors: Academic Achievement, Achievement Tests, Aptitude Tests, Basic Skills

A New Decade of Language Testing Research: Selected Papers from the Annual Language Testing Research Colloquium (12th, San Francisco, California, March 1990).

Douglas, Dan, Ed.; Chapelle, Carol, Ed. – 1993

Papers from the conference on language testing include: "Foundations and Directions for a New Decade of Language Testing" (Carol Chapelle, Dan Douglas); "A Comparison of the Abilities Measured by the Cambridge and Educational Testing Service EFL Test Batteries" (Lyle F. Bachman, Fred Davidson, John Foulkes); "Judgments in…

Descriptors: Comparative Analysis, Computer Assisted Testing, Diagnostic Tests, Educational Trends