Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
Somerset, Anthony – Compare: A Journal of Comparative and International Education, 2011
Educational practitioners rely predominantly on measures of outcome, rather than of inputs or process, in making judgements as to quality. Outcome measures are available from two main sources: (1) the relatively new international assessment systems; and (2) the traditional national examinations systems. The two types of system differ in their…
Descriptors: Testing Programs, Educational Quality, National Competency Tests, Educational Improvement
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Chen, Hanwei; Cui, Zhongmin; Zhu, Rongchun; Gao, Xiaohong – ACT, Inc., 2010
The most critical feature of a common-item nonequivalent groups equating design is that the average score difference between the new and old groups can be accurately decomposed into a group ability difference and a form difficulty difference. Two widely used observed-score linear equating methods, the Tucker and the Levine observed-score methods,…
Descriptors: Equated Scores, Groups, Ability Grouping, Difficulty Level
New Mexico State Dept. of Education, Santa Fe. Evaluation, Assessment, and Testing Unit. – 1974
This survey of the standardized testing program summarizes the data accumulated from the most recent administration of selected instruments in October 1973. It compares these findings with information from previous years and points to a few trends and possible conclusions. Assessment of mental abilities--1973-74 is presented for grade 1, and…
Descriptors: Academic Achievement, Comparative Analysis, Ethnic Groups, Grade 1
Reckase, Mark D. – 1977
Latent trait model calibration procedures were used on data obtained from a group testing program. The one-parameter model of Wright and Panchapakesan and the three-parameter logistic model of Wingersky, Wood, and Lord were selected for comparison. These models and their corresponding estimation procedures were compared, using actual and simulated…
Descriptors: Achievement Tests, Adaptive Testing, Aptitude Tests, Comparative Analysis
Engelhard, George, Jr.; And Others – 1989
The agreement between Mantel-Haenszel and Rasch procedures for identifying differential item functioning (DIF) on teacher certification tests was studied. Two specific research questions were addressed: (1) whether the Mantel-Haenszel and Rasch procedures identify the same items as functioning differently; and (2) how consistently each method…
Descriptors: Administration, Black Students, Comparative Analysis, Early Childhood Education
Clarke, S. C. T.; And Others – 1978
The Edmonton Grade III Achievement: 1956-1977 study is a comparison of achievement in reading, arithmetic, and language involving all of the third grade students in a large school system. Six basic skills tests which were administered to all of the Edmonton third grade students in 1956 were reprinted and administered to all of the third grade…
Descriptors: Academic Achievement, Achievement Tests, Aptitude Tests, Basic Skills
Douglas, Dan, Ed.; Chapelle, Carol, Ed. – 1993
Papers from the conference on language testing include: "Foundations and Directions for a New Decade of Language Testing" (Carol Chapelle, Dan Douglas); "A Comparison of the Abilities Measured by the Cambridge and Educational Testing Service EFL Test Batteries" (Lyle F. Bachman, Fred Davidson, John Foulkes); "Judgments in…
Descriptors: Comparative Analysis, Computer Assisted Testing, Diagnostic Tests, Educational Trends

