Publication Date
| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 7 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Models | 13 |
| Test Format | 13 |
| Test Reliability | 13 |
| Test Construction | 8 |
| Test Items | 8 |
| Test Validity | 6 |
| Comparative Analysis | 4 |
| Item Response Theory | 4 |
| Multiple Choice Tests | 4 |
| Computer Assisted Testing | 3 |
| Psychometrics | 3 |
Source
| Source | Records |
| --- | --- |
| ETS Research Report Series | 2 |
| Annual Review of Applied Linguistics | 1 |
| Assessment & Evaluation in Higher Education | 1 |
| College Board | 1 |
| Educational and Psychological Measurement | 1 |
| Journal of Intelligence | 1 |
| Online Submission | 1 |
| Practical Assessment, Research & Evaluation | 1 |
Author
| Author | Records |
| --- | --- |
| Trevisan, Michael S. | 2 |
| Al-Jarf, Reima | 1 |
| Baron, Simon | 1 |
| Becker, Kirk A. | 1 |
| Bergstrom, Betty A. | 1 |
| Bernard, David | 1 |
| Brennan, Robert L. | 1 |
| Bush, Martin | 1 |
| Chang, Lei | 1 |
| Douglas, Dan | 1 |
| Eignor, Daniel R. | 1 |
Publication Type
| Publication Type | Records |
| --- | --- |
| Journal Articles | 8 |
| Reports - Research | 8 |
| Reports - Evaluative | 3 |
| Information Analyses | 2 |
| Speeches/Meeting Papers | 2 |
| Numerical/Quantitative Data | 1 |
Education Level
| Education Level | Records |
| --- | --- |
| Higher Education | 1 |
| Postsecondary Education | 1 |
Audience
| Audience | Records |
| --- | --- |
| Administrators | 1 |
| Practitioners | 1 |
Location
| Location | Records |
| --- | --- |
| France | 1 |
| Georgia | 1 |
| Saudi Arabia (Riyadh) | 1 |
Al-Jarf, Reima – Online Submission, 2023
This article aims to give a comprehensive guide to planning and designing vocabulary tests, which includes identifying the skills to be covered by the test; outlining the course content covered; preparing a table of specifications that shows the skills, content topics, and number of questions allocated to each; and preparing the test instructions. The…
Descriptors: Vocabulary Development, Learning Processes, Test Construction, Course Content
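The table of specifications mentioned in this entry is a test blueprint that maps each skill and content topic to an item count. A minimal sketch of such a blueprint follows; the skills, topics, and counts are hypothetical placeholders, not taken from the article.

```python
# Minimal sketch of a test table of specifications: each row maps a
# skill and a content topic to the number of items allocated to it.
# All skills, topics, and counts below are hypothetical placeholders.
from collections import defaultdict

blueprint = [
    # (skill, content topic, number of items)
    ("recognize word meaning", "unit 1 vocabulary", 10),
    ("use word in context",    "unit 1 vocabulary", 5),
    ("recognize word meaning", "unit 2 vocabulary", 10),
    ("use word in context",    "unit 2 vocabulary", 5),
]

totals_by_skill = defaultdict(int)
for skill, topic, n_items in blueprint:
    totals_by_skill[skill] += n_items

total_items = sum(n for _, _, n in blueprint)
for skill, n in totals_by_skill.items():
    print(f"{skill}: {n} items ({n / total_items:.0%} of the test)")
```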
Storme, Martin; Myszkowski, Nils; Baron, Simon; Bernard, David – Journal of Intelligence, 2019
Assessing job applicants' general mental ability online poses psychometric challenges due to the necessity of having brief but accurate tests. Recent research (Myszkowski & Storme, 2018) suggests that recovering distractor information through Nested Logit Models (NLM; Suh & Bolt, 2010) increases the reliability of ability estimates in…
Descriptors: Intelligence Tests, Item Response Theory, Comparative Analysis, Test Reliability
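For orientation, the baseline that nested logit models extend is a binary IRT model that ignores which distractor an examinee chose. A minimal sketch of the standard two-parameter logistic (2PL) model and its item information, which governs how precise (and hence how reliable) the ability estimate is; this is the generic textbook model, not the paper's NLM.

```python
import math

def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the two-parameter
    logistic (2PL) IRT model: P = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information_2pl(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item: I = a^2 * P * (1 - P).
    More information at an ability level theta means a more precise,
    and hence more reliable, estimate for examinees near that theta."""
    p = p_correct_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# A discriminating item (a = 2.0) located at b = 0.0 is most
# informative for examinees with ability near theta = 0.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(item_information_2pl(theta, a=2.0, b=0.0), 3))
```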
Bush, Martin – Assessment & Evaluation in Higher Education, 2015
The humble multiple-choice test is very widely used within education at all levels, but its susceptibility to guesswork makes it a suboptimal assessment tool. The reliability of a multiple-choice test is partly governed by the number of items it contains; however, longer tests are more time-consuming to take, and for some subject areas, it can be…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Format, Test Reliability
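The length-reliability trade-off this entry turns on is conventionally quantified with the Spearman-Brown prophecy formula. A minimal sketch with illustrative numbers, not figures from the article:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened (or shortened)
    by `length_factor`, assuming the added items are parallel to the
    originals: r' = k*r / (1 + (k - 1)*r)."""
    k, r = length_factor, reliability
    return (k * r) / (1.0 + (k - 1.0) * r)

# Doubling a test whose reliability is 0.70:
print(round(spearman_brown(0.70, 2.0), 3))  # 0.824
# Halving it instead:
print(round(spearman_brown(0.70, 0.5), 3))  # 0.538
```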
Lee, Eunjung; Lee, Won-Chan; Brennan, Robert L. – College Board, 2012
In almost all high-stakes testing programs, test equating is necessary to ensure that test scores across multiple test administrations are equivalent and can be used interchangeably. Test equating becomes even more challenging in mixed-format tests, such as Advanced Placement Program® (AP®) Exams, which contain both multiple-choice and constructed…
Descriptors: Test Construction, Test Interpretation, Test Norms, Test Reliability
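As one concrete instance of equating, linear equating places scores from one form onto another form's scale by matching means and standard deviations. A minimal sketch with synthetic summary statistics, not AP data:

```python
def linear_equate(x: float, mean_x: float, sd_x: float,
                  mean_y: float, sd_y: float) -> float:
    """Linear equating of a raw score x on form X to the scale of
    form Y: y = (sd_y / sd_x) * (x - mean_x) + mean_y, so equated
    scores match form Y's mean and standard deviation."""
    return (sd_y / sd_x) * (x - mean_x) + mean_y

# A raw score of 30 on form X (mean 25, sd 5) maps onto form Y
# (mean 27, sd 6) as:
print(linear_equate(30, mean_x=25, sd_x=5, mean_y=27, sd_y=6))  # 33.0
```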
Becker, Kirk A.; Bergstrom, Betty A. – Practical Assessment, Research & Evaluation, 2013
The need for increased exam security, improved test formats, more flexible scheduling, better measurement, and more efficient administrative processes has caused testing agencies to consider converting the administration of their exams from paper-and-pencil to computer-based testing (CBT). Many decisions must be made in order to provide an optimal…
Descriptors: Testing, Models, Testing Programs, Program Administration
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
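Of the two adaptive designs named here, MST is the simpler to illustrate: examinees are routed between pre-assembled modules based on interim performance, rather than receiving items selected one at a time as in CAT. A toy routing rule; the cut points are hypothetical:

```python
def route_next_module(num_correct: int, num_items: int,
                      cut_low: float = 0.4, cut_high: float = 0.7) -> str:
    """Toy multistage-testing (MST) router: after a routing module,
    send the examinee to an easier, middle, or harder second-stage
    module based on proportion correct. Cut points are hypothetical."""
    p = num_correct / num_items
    if p < cut_low:
        return "easy module"
    if p < cut_high:
        return "medium module"
    return "hard module"

print(route_next_module(9, 10))  # hard module
```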
Douglas, Dan – Annual Review of Applied Linguistics, 1995 (peer reviewed)
Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, and new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…
Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests
Trevisan, Michael S.; And Others – Educational and Psychological Measurement, 1994 (peer reviewed)
The reliabilities of 2-, 3-, 4-, and 5-choice tests were compared through an incremental-option model on a test taken by 154 high school seniors. Creating the test forms incrementally more closely approximates actual test construction. The nonsignificant differences among the option choices support the three-option item. (SLD)
Descriptors: Distractors (Tests), Estimation (Mathematics), High School Students, High Schools
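The reliabilities compared across option counts in studies like this are typically KR-20 coefficients for dichotomously scored items. A minimal sketch on a tiny synthetic data set, not the study's data:

```python
def kr20(item_scores):
    """KR-20 reliability for dichotomously scored items.
    `item_scores` is a list of examinee response vectors
    (1 = correct, 0 = incorrect)."""
    n_items = len(item_scores[0])
    n_people = len(item_scores)
    totals = [sum(person) for person in item_scores]
    mean_t = sum(totals) / n_people
    # Population variance of total scores, as in the classical formula.
    var_t = sum((t - mean_t) ** 2 for t in totals) / n_people
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in item_scores) / n_people
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_t)

# Tiny synthetic example: 4 examinees by 3 items.
data = [[1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 0, 0]]
print(round(kr20(data), 3))  # 0.6
```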
Hambleton, Ronald K.; Eignor, Daniel R. – 1978
In light of the widespread use of competency testing, the authors consider it important to determine ways of developing and using competency testing to ensure that it achieves its full potential. The paper, in three parts, introduces a model for the development and validation of competency tests, reviews several methods for setting…
Descriptors: Competence, Criterion Referenced Tests, Cutting Scores, Elementary Secondary Education
Trevisan, Michael S.; Sax, Gilbert – 1991
The purpose of this study was to compare the reliabilities of two-, three-, four-, and five-choice tests using an incremental option paradigm. Test forms were created incrementally, a method approximating actual test construction procedures. Participants were 154 12th-grade students from the Portland (Oregon) area. A 45-item test with two options…
Descriptors: Comparative Testing, Distractors (Tests), Estimation (Mathematics), Grade 12
Chang, Lei – 1993
Equivalence in reliability and validity across 4-point and 6-point scales was assessed by fitting different measurement models through confirmatory factor analysis of a multitrait-multimethod covariance matrix. Responses to nine Likert-type items designed to measure perceived quantitative ability, self-perceived usefulness of quantitative…
Descriptors: Ability, Comparative Testing, Education Majors, Graduate Students
Southern Association of Colleges and Schools, Atlanta, GA. – 1983
This volume contains the initial draft of a model for assessing students in vocational education programs in Georgia. Addressed in the first section of the draft are some of the components that are believed to be critical in the development of a model for assessing vocational student achievement, including selecting a program for use in developing…
Descriptors: Academic Achievement, Behavioral Objectives, Criterion Referenced Tests, Guidelines
