Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 8 |
| Since 2007 (last 20 years) | 25 |
Descriptor
| Comparative Analysis | 47 |
| Scoring | 47 |
| Test Reliability | 47 |
| Test Validity | 25 |
| Test Construction | 12 |
| Test Items | 11 |
| Foreign Countries | 10 |
| Correlation | 9 |
| English (Second Language) | 8 |
| Item Analysis | 8 |
| Item Response Theory | 8 |
| More ▼ | |
Source
Author
| Alqarni, Abdulelah Mohammed | 1 |
| Attali, Yigal | 1 |
| August, Diane | 1 |
| Bailey, Dallin J. | 1 |
| Balogh, Jennifer | 1 |
| Bauer, Daniel | 1 |
| Beach, Tyson A. C. | 1 |
| Beaujean, A. Alexander | 1 |
| Bernstein, Jared | 1 |
| Blaker, Lisa | 1 |
| Bunker, Lisa | 1 |
| More ▼ | |
Publication Type
| Reports - Research | 30 |
| Journal Articles | 26 |
| Reports - Evaluative | 8 |
| Speeches/Meeting Papers | 4 |
| Books | 2 |
| Guides - Non-Classroom | 2 |
| Reports - Descriptive | 2 |
| Collected Works - General | 1 |
| Guides - General | 1 |
Education Level
| Higher Education | 4 |
| Postsecondary Education | 4 |
| Elementary Education | 3 |
| Elementary Secondary Education | 2 |
| High Schools | 2 |
| Kindergarten | 2 |
| Secondary Education | 2 |
| Grade 1 | 1 |
| Grade 2 | 1 |
| Grade 4 | 1 |
| Intermediate Grades | 1 |
| More ▼ | |
Audience
| Practitioners | 2 |
| Teachers | 1 |
Location
| Japan | 2 |
| Taiwan | 2 |
| Australia | 1 |
| Europe | 1 |
| Florida | 1 |
| Germany | 1 |
| Iran | 1 |
| Maryland | 1 |
| Switzerland (Geneva) | 1 |
| Tennessee | 1 |
| United Kingdom (England) | 1 |
| More ▼ | |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Bailey, Dallin J.; Bunker, Lisa; Mauszycki, Shannon; Wambaugh, Julie L. – International Journal of Language & Communication Disorders, 2019
Background: Acquired apraxia of speech (AOS) involves speech-production deficits on both the segmental and suprasegmental levels. Recent research has identified a non-linear interaction between the metrical structure of bisyllabic words and word-production accuracy in German speakers with AOS, with trochaic words (strong-weak stress) being…
Descriptors: Accuracy, Suprasegmentals, Phonology, German
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021
Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software
Kelleher, Leila K.; Beach, Tyson A. C.; Frost, David M.; Johnson, Andrew M.; Dickey, James P. – Measurement in Physical Education and Exercise Science, 2018
The scoring scheme for the functional movement screen implicitly assumes that the factor structure is consistent, stable, and congruent across different populations. To determine if this is the case, we compared principal components analyses of three samples: a healthy, general population (n = 100), a group of varsity athletes (n = 101), and a…
Descriptors: Factor Structure, Test Reliability, Screening Tests, Motion
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia – Anatomical Sciences Education, 2015
In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…
Descriptors: Anatomy, Multiple Choice Tests, Scoring, Decision Making
Wagemaker, Hans, Ed. – International Association for the Evaluation of Educational Achievement, 2020
Although International Association for the Evaluation of Educational Achievement-pioneered international large-scale assessment (ILSA) of education is now a well-established science, non-practitioners and many users often substantially misunderstand how large-scale assessments are conducted, what questions and challenges they are designed to…
Descriptors: International Assessment, Achievement Tests, Educational Assessment, Comparative Analysis
Mitchell, Alison M.; Truckenmiller, Adrea; Petscher, Yaacov – Communique, 2015
As part of the Race to the Top initiative, the United States Department of Education made nearly 1 billion dollars available in State Educational Technology grants with the goal of ramping up school technology. One result of this effort is that states, districts, and schools across the country are using computerized assessments to measure their…
Descriptors: Computer Assisted Testing, Educational Technology, Testing, Efficiency
Li, Hui-Chuan – International Journal of Mathematical Education in Science and Technology, 2014
This study examines students' procedural and conceptual achievement in fraction addition in England and Taiwan. A total of 1209 participants (561 British students and 648 Taiwanese students) at ages 12 and 13 were recruited from England and Taiwan to take part in the study. A quantitative design by means of a self-designed written test is adopted…
Descriptors: Comparative Analysis, Addition, Mathematics Instruction, Foreign Countries
Hirai, Akiyo; Koizumi, Rie – Language Assessment Quarterly, 2013
In recognition of the rating scale as a crucial tool of performance assessment, this study aims to establish a rating scale suitable for a Story Retelling Speaking Test (SRST), which is a semidirect test of speaking ability in English as a foreign language for classroom use. To identify an appropriate scale, three rating scales, all of which have…
Descriptors: Test Validity, Rating Scales, Story Telling, Speech Tests
Carmichael, Jessica A.; Fraccaro, Rebecca L.; Nordstokke, David W. – Canadian Journal of School Psychology, 2014
Oral language skills are important to consider in school psychology practice, as they are directly tied to many areas of academic functioning. For example, research has demonstrated that oral language skills in early elementary school predict reading comprehension in later grades (Kendeou, van den Broek, White, & Lynch, 2009). With a…
Descriptors: Language Tests, Oral Language, Language Skills, School Psychology
Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie – Measurement and Evaluation in Counseling and Development, 2013
Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
Descriptors: Item Response Theory, Test Theory, Measures (Individuals), Racial Identification
Tourangeau, Karen; Nord, Christine; Lê, Thanh; Wallner-Allen, Kathleen; Vaden-Kiernan, Nancy; Blaker, Lisa; Najarian, Michelle – National Center for Education Statistics, 2018
This manual provides guidance and documentation for users of the longitudinal kindergarten-fourth grade (K-4) data file of the Early Childhood Longitudinal Study, Kindergarten Class of 2010-11 (ECLS-K:2011). It mainly provides information specific to the fourth-grade round of data collection. The first chapter provides an overview of the…
Descriptors: Children, Longitudinal Studies, Surveys, Kindergarten

Peer reviewed
Direct link
