Showing all 14 results
Peer reviewed
Hongwen Guo; Matthew S. Johnson; Daniel F. McCaffrey; Lixong Gu – ETS Research Report Series, 2024
The multistage testing (MST) design has been gaining attention and popularity in educational assessments. For testing programs that have small test-taker samples, it is challenging to calibrate new items to replenish the item pool. In the current research, we used the item pools from an operational MST program to illustrate how research studies…
Descriptors: Test Items, Test Construction, Sample Size, Scaling
Peer reviewed
Arce-Ferrer, Alvaro J.; Bulut, Okan – Journal of Experimental Education, 2019
This study investigated the performance of four widely used data-collection designs in detecting test-mode effects (i.e., computer-based versus paper-based testing). The experimental conditions included four data-collection designs, two test-administration modes, and the availability of an anchor assessment. The test-level and item-level results…
Descriptors: Data Collection, Test Construction, Test Format, Computer Assisted Testing
Mullis, Ina V. S., Ed.; Martin, Michael O., Ed.; von Davier, Matthias, Ed. – International Association for the Evaluation of Educational Achievement, 2021
TIMSS (Trends in International Mathematics and Science Study) is a long-standing international assessment of mathematics and science at the fourth and eighth grades that has been collecting trend data every four years since 1995. About 70 countries use TIMSS trend data for monitoring the effectiveness of their education systems in a global…
Descriptors: Achievement Tests, International Assessment, Science Achievement, Mathematics Achievement
Peer reviewed
Pichette, François; Béland, Sébastien; Jolani, Shahab; Lesniewska, Justyna – Studies in Second Language Learning and Teaching, 2015
Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data…
Descriptors: Language Research, Data Analysis, Simulation, Item Analysis
Titus, Freddie – ProQuest LLC, 2010
Fifty percent of college-bound students graduate from high school underprepared for mathematics at the post-secondary level. As a result, thirty-five percent of college students take developmental mathematics courses. What is even more shocking is the high failure rate (ranging from 35 to 42 percent) of students enrolled in developmental…
Descriptors: Video Technology, Educational Strategies, Test Results, Test Items
Kaliski, Pamela; Huff, Kristen; Barry, Carol – College Board, 2011
For educational achievement tests that employ multiple-choice (MC) items and aim to reliably classify students into performance categories, it is critical to design MC items that are capable of discriminating student performance according to the stated achievement levels. This is accomplished, in part, by clearly understanding how item design…
Descriptors: Alignment (Education), Academic Achievement, Expertise, Evaluative Thinking
Peer reviewed
von Davier, Alina A.; Wilson, Christine – Educational and Psychological Measurement, 2007
This article discusses the assumptions required by the item response theory (IRT) true-score equating method (with the Stocking & Lord, 1983, scaling approach), which is commonly used in the nonequivalent groups with an anchor data-collection design. More precisely, this article investigates the assumptions made at each step by the IRT approach to…
Descriptors: Calculus, Item Response Theory, Scores, Data Collection
von Davier, Matthias; von Davier, Alina A. – Educational Testing Service, 2004
This paper examines item response theory (IRT) scale transformations and IRT scale linking methods used in the Non-Equivalent Groups with Anchor Test (NEAT) design to equate two tests, X and Y. It proposes a unifying approach to the commonly used IRT linking methods: mean-mean, mean-var linking, concurrent calibration, Stocking and Lord and…
Descriptors: Measures (Individuals), Item Response Theory, Item Analysis, Models
Yen, Wendy M. – 1982
The three-parameter logistic model discussed was used by CTB/McGraw-Hill in the development of the Comprehensive Tests of Basic Skills, Form U (CTBS/U) and the Test of Cognitive Skills (TCS), published in the fall of 1981. The development, standardization, and scoring of the tests are described, particularly as these procedures were influenced by…
Descriptors: Achievement Tests, Bayesian Statistics, Cognitive Processes, Data Collection
Mayer, Victor J.; Monk, John S. – 1983
Work on the development of the intensive time-series design was initiated because of the dissatisfaction with existing research designs. This dissatisfaction resulted from the paucity of data obtained from designs such as the pre-post and randomized posttest-only designs. All have the common characteristic of yielding data from only one or two…
Descriptors: Academic Achievement, Computer Oriented Programs, Data Analysis, Data Collection
Prestwood, J. Stephen; And Others – 1985
In order to take advantage of advances in the field of mental measurement, the Armed Forces and the Department of Defense have supported the development of a computerized adaptive version of the Armed Services Vocational Aptitude Battery (ASVAB) for use in military personnel selection and classification. This report describes the development and…
Descriptors: Aptitude Tests, Armed Forces, Computer Assisted Testing, Data Collection
Filby, Nikola N. – 1976
The development and refinement of the measures of student achievement in reading and mathematics for the Beginning Teacher Evaluation Study are described. The concept of reactivity to instruction is introduced: the tests used to evaluate instructional processes must be sensitive indicators of classroom learning over time. Data collection activities…
Descriptors: Achievement Gains, Achievement Tests, Data Analysis, Data Collection
Filby, Nikola N.; Dishaw, Marilyn – 1976
Major analyses of the achievement tests used in the Beginning Teacher Evaluation Study were conducted to determine test reactivity to instruction. Reading and mathematics tests were administered to second and fifth grade children. Classroom teachers' records were examined to determine the amount of opportunity students had to learn the content…
Descriptors: Academic Ability, Academic Achievement, Achievement Gains, Achievement Tests
Conger, Anthony J.; And Others – 1976
A review of the literature on the validity and reliability of survey data is presented prior to an analysis of the reliability of selected questions in the Second Followup Questionnaire of the National Longitudinal Study of the High School Class of 1972 (NLS). The reliability study includes an evaluation of test-retest reliability as a function of…
Descriptors: Academic Ability, Data Analysis, Data Collection, Demography