ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	23
Since 2017 (last 10 years)	41
Since 2007 (last 20 years)	71

Descriptor

Comparative Analysis	123
Test Items	123
Test Validity	80
Foreign Countries	43
Item Analysis	39
Test Construction	35
Test Reliability	35
Difficulty Level	31
Item Response Theory	28
Scores	24
Test Format	24
Validity	24
Language Tests	23
Statistical Analysis	21
Construct Validity	20
English (Second Language)	20
Multiple Choice Tests	18
Correlation	17
Higher Education	17
Second Language Learning	16
Mathematics Tests	15
Reading Tests	15
Achievement Tests	13
Psychometrics	13
Computer Assisted Testing	12
More ▼

Publication Type

Reports - Research	91
Journal Articles	75
Speeches/Meeting Papers	20
Reports - Evaluative	19
Tests/Questionnaires	6
Reports - Descriptive	5
Dissertations/Theses -…	4
Collected Works - General	2
Information Analyses	2
Numerical/Quantitative Data	2
Books	1
Collected Works - Serials	1
Non-Print Media	1
Opinion Papers	1
Reference Materials - General	1
Reports - General	1
More ▼

Education Level

Higher Education	25
Postsecondary Education	23
Secondary Education	12
Elementary Education	11
High Schools	5
Early Childhood Education	4
Elementary Secondary Education	3
Intermediate Grades	3
Grade 4	2
Middle Schools	2
Grade 5	1
Grade 6	1
More ▼

Audience

Administrators	1
Parents	1
Policymakers	1
Researchers	1

Location

Canada	4
Germany	4
Indonesia	3
Israel	3
Japan	3
United States	3
Australia	2
China	2
Colorado	2
Georgia	2
Nevada	2
Ohio	2
Oregon	2
Turkey	2
Turkey (Ankara)	2
United Kingdom	2
Africa	1
Belgium	1
China (Beijing)	1
District of Columbia	1
Europe	1
Idaho	1
Illinois	1
Iran	1
Malaysia	1
More ▼

Laws, Policies, & Programs

What Works Clearinghouse Rating

Showing 1 to 15 of 123 results Save | Export

Developing an MLA-Test for Young Learners -- Insights from Measurement Theory and Language Testing

Peer reviewed

Direct link

Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025

This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…

Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed
PDF on ERIC

Download full text

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30xfree-response short answer to a question, SAQ) and end-of-year paper (4xSAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format

Design Framework for the ACT® Enhancements. ACT Research. Research Report. R2519

Download full text

Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025

This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…

Descriptors: College Entrance Examinations, Testing, Change, Test Construction

Generating Social and Emotional Skill Items: Humans vs. ChatGPT. ACT Research. Issue Brief

Download full text

Kate E. Walton; Cristina Anguiano-Carrasco – ACT, Inc., 2024

Large language models (LLMs), such as ChatGPT, are becoming increasingly prominent. Their use is becoming more and more popular to assist with simple tasks, such as summarizing documents, translating languages, rephrasing sentences, or answering questions. Reports like McKinsey's (Chui, & Yee, 2023) estimate that by implementing LLMs,…

Descriptors: Artificial Intelligence, Man Machine Systems, Natural Language Processing, Test Construction

The Enhanced ACT Linking Study Report. ACT Research. Research Paper. R2515

Download full text

Dongmei Li; Shalini Kapoor; Ann Arthur; Chi-Yu Huang; YoungWoo Cho; Chen Qiu; Hongling Wang – ACT Education Corp., 2025

Starting in April 2025, ACT will introduce enhanced forms of the ACT® test for national online testing, with a full rollout to all paper and online test takers in national, state and district, and international test administrations by Spring 2026. ACT introduced major updates by changing the test lengths and testing times, providing more time per…

Descriptors: College Entrance Examinations, Testing, Change, Scoring

The Social Shapes Test as a Self-Administered, Online Measure of Social Intelligence: Two Studies with Typically Developing Adults and Adults with Autism Spectrum Disorder

Peer reviewed

Direct link

Matt I. Brown; Patrick R. Heck; Christopher F. Chabris – Journal of Autism and Developmental Disorders, 2024

The Social Shapes Test (SST) is a measure of social intelligence which does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…

Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability

How Useful Is Comparative Judgement of Item Difficulty for Standard Maintaining?

Download full text

Benton, Tom – Research Matters, 2020

This article reviews the evidence on the extent to which experts' perceptions of item difficulties, captured using comparative judgement, can predict empirical item difficulties. This evidence is drawn from existing published studies on this topic and also from statistical analysis of data held by Cambridge Assessment. Having reviewed the…

Descriptors: Test Items, Difficulty Level, Expertise, Comparative Analysis

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Download full text

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Validity of Multiple-Choice Digital Formative Assessment for Assessing Students' (Mis)Conceptions: Evidence from a Mixed-Methods Study in Algebra

Peer reviewed

Direct link

Katrin Klingbeil; Fabian Rösken; Bärbel Barzel; Florian Schacht; Kaye Stacey; Vicki Steinle; Daniel Thurm – ZDM: Mathematics Education, 2024

Assessing students' (mis)conceptions is a challenging task for teachers as well as for researchers. While individual assessment, for example through interviews, can provide deep insights into students' thinking, this is very time-consuming and therefore not feasible for whole classes or even larger settings. For those settings, automatically…

Descriptors: Multiple Choice Tests, Formative Evaluation, Mathematics Tests, Misconceptions

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Direct link

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

How Administration Stakes and Settings Affect Student Behavior and Performance on a Biology Concept Assessment

Peer reviewed

Direct link

Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023

Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…

Descriptors: Biology, Science Tests, High Stakes Tests, Scores

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Direct link

Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Direct link

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Changes in the Speed-Ability Relation through Different Treatments of Rapid Guessing

Peer reviewed

Direct link

Deribo, Tobias; Goldhammer, Frank; Kroehne, Ulf – Educational and Psychological Measurement, 2023

As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a…

Descriptors: Reaction Time, Guessing (Tests), Behavior Patterns, Bias

The Pattern of Test-Taking Effort across Items in Cognitive Ability Test: A Latent Class Analysis

Peer reviewed
PDF on ERIC

Download full text

Akhtar, Hanif – International Association for Development of the Information Society, 2022

When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…

Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Educational and Psychological…	6
Language Testing	6
Journal of Educational…	4
Language Assessment Quarterly	4
ProQuest LLC	3
Research in Developmental…	3
ACT Education Corp.	2
CBE - Life Sciences Education	2
International Journal of…	2
International Journal of…	2
Language Testing in Asia	2
Online Submission	2
Research Matters	2
ACT, Inc.	1
Applied Measurement in…	1
Applied Psychological…	1
Assessment & Evaluation in…	1
Assessment in Education:…	1
Bilingual Research Journal	1
Biochemistry and Molecular…	1
British Journal of Language…	1
College Board	1
Communique	1
Diaspora, Indigenous, and…	1
Early Education and…	1
More ▼

Allen, Nancy L.	2
Haladyna, Tom	2
Roid, Gale	2
Wainer, Howard	2
Abel, Michael B.	1
Acar, Tülin	1
Aesaert, Koen	1
Afflerbach, Peter	1
Akbari, Alireza	1
Akhtar, Hanif	1
Alderson, J. Charles	1
Aleyna Altan	1
Alqarni, Abdulelah Mohammed	1
Ann Arthur	1
Arth, Thomas O.	1
Awwad, Abeer	1
Baker, Harley E.	1
Bardovi-Harlig, Kathleen	1
Basset, Katherine	1
Benderson, Albert, Ed.	1
Benton, Tom	1
Berman, Ye'Elah	1
Betebenner, Damian W.	1
Betjemann, Rebecca S.	1
More ▼

Test of English as a Foreign…	5
SAT (College Admission Test)	3
Trends in International…	3
ACT Assessment	2
Program for International…	2
Raven Progressive Matrices	2
Armed Services Vocational…	1
California Achievement Tests	1
Comprehensive Tests of Basic…	1
Defining Issues Test	1
Early Childhood Environment…	1
Graduate Record Examinations	1
Gray Oral Reading Test	1
International Association for…	1
Minnesota Multiphasic…	1
National Assessment of…	1
Progress in International…	1
Sequential Tests of…	1
Stanford Achievement Tests	1
Strong Interest Inventory	1
Test of Written English	1
Wechsler Adult Intelligence…	1
More ▼