ERIC - Search Results

Showing all 14 results

Practical Considerations in Item Calibration with Small Samples under Multistage Test Design: A Case Study. Research Report. ETS RR-24-03

Peer reviewed

Hongwen Guo; Matthew S. Johnson; Daniel F. McCaffrey; Lixiong Gu – ETS Research Report Series, 2024

The multistage testing (MST) design has been gaining attention and popularity in educational assessments. For testing programs that have small test-taker samples, it is challenging to calibrate new items to replenish the item pool. In the current research, we used the item pools from an operational MST program to illustrate how research studies…

Descriptors: Test Items, Test Construction, Sample Size, Scaling

Effects of Data-Collection Designs in the Comparison of Computer-Based and Paper-Based Tests

Peer reviewed

Arce-Ferrer, Alvaro J.; Bulut, Okan – Journal of Experimental Education, 2019

This study investigated the performance of four widely used data-collection designs in detecting test-mode effects (i.e., computer-based versus paper-based testing). The experimental conditions included four data-collection designs, two test-administration modes, and the availability of an anchor assessment. The test-level and item-level results…

Descriptors: Data Collection, Test Construction, Test Format, Computer Assisted Testing

TIMSS 2023 Assessment Frameworks

Mullis, Ina V. S., Ed.; Martin, Michael O., Ed.; von Davier, Matthias, Ed. – International Association for the Evaluation of Educational Achievement, 2021

TIMSS (Trends in International Mathematics and Science Study) is a long-standing international assessment of mathematics and science at the fourth and eighth grades that has been collecting trend data every four years since 1995. About 70 countries use TIMSS trend data for monitoring the effectiveness of their education systems in a global…

Descriptors: Achievement Tests, International Assessment, Science Achievement, Mathematics Achievement

The Handling of Missing Binary Data in Language Research

Peer reviewed

Pichette, François; Béland, Sébastien; Jolani, Shahab; Lesniewska, Justyna – Studies in Second Language Learning and Teaching, 2015

Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data…

Descriptors: Language Research, Data Analysis, Simulation, Item Analysis

A Cognitive Analysis of Developmental Mathematics Students' Errors and Misconceptions in Real Number Computations and Evaluating Algebraic Expressions

Titus, Freddie – ProQuest LLC, 2010

Fifty percent of college-bound students graduate from high school underprepared for mathematics at the post-secondary level. As a result, thirty-five percent of college students take developmental mathematics courses. What is even more shocking is the high failure rate (ranging from 35 to 42 percent) of students enrolled in developmental…

Descriptors: Video Technology, Educational Strategies, Test Results, Test Items

Aligning Items and Achievement Levels: A Study Comparing Expert Judgments

Kaliski, Pamela; Huff, Kristen; Barry, Carol – College Board, 2011

For educational achievement tests that employ multiple-choice (MC) items and aim to reliably classify students into performance categories, it is critical to design MC items that are capable of discriminating student performance according to the stated achievement levels. This is accomplished, in part, by clearly understanding how item design…

Descriptors: Alignment (Education), Academic Achievement, Expertise, Evaluative Thinking

IRT True-Score Test Equating: A Guide through Assumptions and Applications

Peer reviewed

von Davier, Alina A.; Wilson, Christine – Educational and Psychological Measurement, 2007

This article discusses the assumptions required by the item response theory (IRT) true-score equating method (with Stocking & Lord, 1983; scaling approach), which is commonly used in the nonequivalent groups with an anchor data-collection design. More precisely, this article investigates the assumptions made at each step by the IRT approach to…

Descriptors: Calculus, Item Response Theory, Scores, Data Collection

A Unified Approach to IRT Scale Linking and Scale Transformations. Research Report. RR-04-09

von Davier, Matthias; von Davier, Alina A. – Educational Testing Service, 2004

This paper examines item response theory (IRT) scale transformations and IRT scale linking methods used in the Non-Equivalent Groups with Anchor Test (NEAT) design to equate two tests, X and Y. It proposes a unifying approach to the commonly used IRT linking methods: mean-mean, mean-var linking, concurrent calibration, Stocking and Lord and…

Descriptors: Measures (Individuals), Item Response Theory, Item Analysis, Models

Use of Three-Parameter Item Response Theory in the Development of CTBS, Form U, and TCS.

Yen, Wendy M. – 1982

The three-parameter logistic model discussed was used by CTB/McGraw-Hill in the development of the Comprehensive Tests of Basic Skills, Form U (CTBS/U) and the Test of Cognitive Skills (TCS), published in the fall of 1981. The development, standardization, and scoring of the tests are described, particularly as these procedures were influenced by…

Descriptors: Achievement Tests, Bayesian Statistics, Cognitive Processes, Data Collection

Handbook for Using the Intensive Time-Series Design.

Mayer, Victor J.; Monk, John S. – 1983

Work on the development of the intensive time-series design was initiated because of the dissatisfaction with existing research designs. This dissatisfaction resulted from the paucity of data obtained from designs such as the pre-post and randomized posttest-only designs. All have the common characteristic of yielding data from only one or two…

Descriptors: Academic Achievement, Computer Oriented Programs, Data Analysis, Data Collection

Armed Services Vocational Aptitude Battery: Development of an Adaptive Item Pool.

Prestwood, J. Stephen; And Others – 1985

In order to take advantage of advances in the field of mental measurement, the Armed Forces and the Department of Defense have supported the development of a computerized adaptive version of the Armed Services Vocational Aptitude Battery (ASVAB) for use in military personnel selection and classification. This report describes the development and…

Descriptors: Aptitude Tests, Armed Forces, Computer Assisted Testing, Data Collection

Progress Report on Reactivity Analyses (October-December Test Data). Beginning Teacher Evaluation Study. Technical Note Series. Technical Note III-5.

Filby, Nikola N. – 1976

The development and refinement of the measures of student achievement in reading and mathematics for the Beginning Teacher Evaluation Study are described. The concept of reactivity to instruction is introduced: the tests used to evaluate instructional processes must be sensitive indicators of classroom learning over time. Data collection activities…

Descriptors: Achievement Gains, Achievement Tests, Data Analysis, Data Collection

Refinement of Reading and Mathematics Test Through an Analysis of Reactivity. Beginning Teacher Evaluation Study. Technical Report Series. Technical Report III-6.

Filby, Nikola N.; Dishaw, Marilyn – 1976

Major analyses of the achievement tests used in the Beginning Teacher Evaluation Study were conducted to determine test reactivity to instruction. Reading and mathematics tests were administered to second and fifth grade children. Classroom teachers' records were examined to determine the amount of opportunity students had to learn the content…

Descriptors: Academic Ability, Academic Achievement, Achievement Gains, Achievement Tests

National Longitudinal Study of the High School Class of 1972. Reliability and Validity of National Longitudinal Study Measures: An Empirical Reliability Analysis of Selected Data and a Review of the Literature on the Validity and Reliability of Survey Research Questionnaires.

Conger, Anthony J.; And Others – 1976

A review of the literature on the validity and reliability of survey data is presented prior to an analysis of the reliability of selected questions in the Second Followup Questionnaire of the National Longitudinal Study of the High School Class of 1972 (NLS). The reliability study includes an evaluation of test-retest reliability as a function of…

Descriptors: Academic Ability, Data Analysis, Data Collection, Demography
