ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	6
Since 2017 (last 10 years)	11
Since 2007 (last 20 years)	23

Descriptor

Comparative Analysis	60
Item Analysis	60
Test Reliability	60
Test Validity	24
Test Items	22
Test Construction	19
Statistical Analysis	15
Foreign Countries	13
Scores	11
Psychometrics	9
Error of Measurement	8
Higher Education	8
Scoring	8
Correlation	7
Criterion Referenced Tests	7
Difficulty Level	7
Language Tests	7
Standardized Tests	7
Test Format	7
Achievement Tests	6
Factor Analysis	6
Item Banks	6
Research Reports	6
Undergraduate Students	6
English (Second Language)	5

Publication Type

Reports - Research	42
Journal Articles	30
Speeches/Meeting Papers	8
Reports - Evaluative	5
Tests/Questionnaires	4
Information Analyses	2
Books	1
Collected Works - Serials	1
Guides - General	1
Guides - Non-Classroom	1
Numerical/Quantitative Data	1
Opinion Papers	1

Education Level

Higher Education	9
Postsecondary Education	8
Middle Schools	2
Early Childhood Education	1
Elementary Education	1
Grade 6	1
Intermediate Grades	1
Kindergarten	1
Preschool Education	1
Primary Education	1
Secondary Education	1

Audience

Practitioners	1
Researchers	1
Teachers	1

Location

Iran	4
France	2
Belgium	1
India	1
Luxembourg	1
Spain	1
Switzerland	1
Texas	1
Turkey (Ankara)	1
Turkey (Istanbul)	1
United Kingdom (Belfast)	1
United Kingdom (England)	1
United States	1
Vietnam	1

Assessments and Surveys

Armed Services Vocational…	1
California Critical Thinking…	1
Childrens Manifest Anxiety…	1
Graduate Record Examinations	1
Group Assessment of Logical…	1
Group Embedded Figures Test	1
Raven Progressive Matrices	1

Showing 1 to 15 of 60 results

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Liu, Xiaowen; Rogers, H. Jane – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30 x free-response short answer to a question, SAQ) and end-of-year paper (4 x SAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format

Developing the Diagnostic Test of Misconceptions of Fractions

Peer reviewed

Aleyna Altan; Zehra Taspinar Sener – Online Submission, 2023

This research aimed to develop a valid and reliable test to be used to detect sixth grade students' misconceptions and errors regarding the subject of fractions. A misconception diagnostic test has been developed that includes the concept of fractions, different representations of fractions, ordering and comparing fractions, equivalence of…

Descriptors: Diagnostic Tests, Mathematics Tests, Fractions, Misconceptions

Item-Score Reliability in Empirical-Data Sets and Its Relationship with Other Item Indices

Peer reviewed

Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018

Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method λ₆, and method CA. The item-score…

Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis

Same Test, Better Scores: Boosting the Reliability of Short Online Intelligence Recruitment Tests with Nested Logit Item Response Theory Models

Peer reviewed

Storme, Martin; Myszkowski, Nils; Baron, Simon; Bernard, David – Journal of Intelligence, 2019

Assessing job applicants' general mental ability online poses psychometric challenges due to the necessity of having brief but accurate tests. Recent research (Myszkowski & Storme, 2018) suggests that recovering distractor information through Nested Logit Models (NLM; Suh & Bolt, 2010) increases the reliability of ability estimates in…

Descriptors: Intelligence Tests, Item Response Theory, Comparative Analysis, Test Reliability

Multiple True-False Items: A Comparison of Scoring Algorithms

Peer reviewed

Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018

Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…

Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests

Evaluation of Different Scoring Rules for a Noncognitive Test in Development. Research Report. ETS RR-16-03

Peer reviewed

Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016

In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…

Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics

An Exploratory Factor Analysis and Construct Validity of the Resident Choice Assessment Scale with Paid Carers of Adults with Intellectual Disabilities and Challenging Behavior in Community Settings

Peer reviewed

Ratti, Victoria; Vickerstaff, Victoria; Crabtree, Jason; Hassiotis, Angela – Journal of Mental Health Research in Intellectual Disabilities, 2017

Introduction: The Resident Choice Assessment Scale (RCAS) is used to assess choice availability for adults with intellectual disabilities (ID). The aim of the study was to explore the factor structure, construct validity, and internal consistency of the measure in community settings to further validate this tool. Method: 108 paid carers of adults…

Descriptors: Measures (Individuals), Adults, Intellectual Disability, Factor Analysis

Does MTV Really Do a Good Job of Evaluating Professors? An Empirical Test of the Internet Site Ratemyprofessors.com

Peer reviewed

Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016

Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…

Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis

Assessing Assessment: Evaluating Outcomes and Reliabilities of Grammar, Math, and Writing Skill Measures in an Introductory Journalism Course

Peer reviewed

Farwell, Tricia M.; Alligood, Leon; Fitzgerald, Sharon; Blake, Ken – Journalism and Mass Communication Educator, 2016

This article introduces an objective grammar and math assessment and evaluates the assessment's outcome and reliability when fielded among eighty-one students in media writing courses. In addition, the article proposes a rubric for grading straight news leads and compares the rubric's reliability with the reliability of rating straight news leads…

Descriptors: Journalism, Journalism Education, Introductory Courses, Reliability

Examination of Test and Item Statistics from Visual and Verbal Mathematics Questions

Peer reviewed

Alpayar, Cagla; Gulleroglu, H. Deniz – Educational Research and Reviews, 2017

The aim of this research is to determine whether students' test performance and approaches to test questions change based on the type of mathematics questions (visual or verbal) administered to them. This research is based on a mixed-design model. The quantitative data are gathered from 297 seventh grade students, attending seven different middle…

Descriptors: Foreign Countries, Middle School Students, Grade 7, Student Evaluation

Misconceptions about the Naglieri Nonverbal Ability Test: A Commentary of Concerns and Disagreements

Peer reviewed

Naglieri, Jack A.; Ford, Donna Y. – Roeper Review, 2015

Black and Hispanic students are undeniably underidentified as gifted and underrepresented in gifted education. The underrepresentation of the two largest groups of "minority" students is long-standing, dating several decades, and is a serious area of contention. Most debates focus on the efficacy of traditional intelligence tests with…

Descriptors: Misconceptions, Nonverbal Ability, Ability, Ability Identification

Source

Educational and Psychological…	4
International Journal of…	2
Advances in Health Sciences…	1
Developmental Psychology	1
ETS Research Report Series	1
Early Education and…	1
Edinburgh Working Papers in…	1
Educational Research and…	1
Hispanic Journal of…	1
International Journal of…	1
Journal of Chemical Education	1
Journal of Consulting and…	1
Journal of Education for…	1
Journal of Educational…	1
Journal of Experimental…	1
Journal of Intelligence	1
Journal of Mental Health…	1
Journal of Teacher Education	1
Journal of Vocational Behavior	1
Journalism and Mass…	1
Language Assessment Quarterly	1
Language Testing	1
Online Submission	1
Practitioner Research in…	1
Psychometrika	1

Author

Bashaw, W. L.	3
Benson, Jeri	3
Haladyna, Tom	2
Rentz, R. Robert	2
Afflerbach, Peter	1
Aleyna Altan	1
Alligood, Leon	1
Alpayar, Cagla	1
Argulewicz, Ed N.	1
Baghaei, Purya	1
Baron, Simon	1
Bauer, Daniel	1
Bennett, Randy Elliot	1
Bernard, David	1
Bernknopf, Stanley	1
Blake, Ken	1
Bowes, Neal	1
Brennan, Robert L.	1
Broonen, Jean-Paul	1
Carlson, Jerry S.	1
Chase, Clinton I.	1
Claessens, Amy	1
Clark, Teresa P.	1
Crabtree, Jason	1