Peer reviewed | Berk, Ronald A. – Educational and Psychological Measurement, 1978
Three formulae developed to correct item-total correlations for spuriousness were evaluated. Relationships among corrected, uncorrected, and item-remainder correlations were determined by computing sets of mean, minimum, and maximum deviation coefficients and Spearman rank correlations for nine test lengths. (Author/JKS)
Descriptors: Correlation, Intermediate Grades, Item Analysis, Test Construction
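The spuriousness being corrected is the item's own contribution to the total score; the item-rest correlation removes it. A minimal generic sketch follows — it illustrates the problem, and is not a reproduction of the three specific formulae Berk evaluated.

```python
import math

def _corr(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def item_total_and_rest(responses):
    """For each item in an examinees-by-items 0/1 matrix, return the
    uncorrected item-total correlation and the item-rest correlation
    (item vs. total minus the item), which removes the spurious part."""
    totals = [sum(row) for row in responses]
    result = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [t - x for t, x in zip(totals, item)]
        result.append((_corr(item, totals), _corr(item, rest)))
    return result
```

Algebraically, the item-rest value equals (r_it·s_t − s_i) / sqrt(s_t² + s_i² − 2·r_it·s_i·s_t), one common closed-form correction.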
Chen, Shu-Ying; Ankenmann, Robert D.; Spray, Judith A. – 1999
This paper presents a derivation of an average between-test overlap index as a function of the item exposure index, for fixed-length computerized adaptive tests (CAT). This relationship is used to investigate the simultaneous control of item exposure at both the item and test levels. Implications for practice as well as future research are also…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Test Items
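The kind of relationship the paper derives can be sketched as follows; this is a standard form of the exposure-overlap link for fixed-length CAT, and the exact published expression should be checked against the paper itself.

```python
def average_test_overlap(exposure_rates, test_length):
    """Average between-test overlap for a fixed-length CAT as a
    function of the item exposure rates er_i, which sum to the test
    length L over the N pool items:

        overlap = (N / L) * Var(er) + L / N

    Uniform exposure (Var = 0) gives the minimum possible overlap L/N.
    """
    n_pool = len(exposure_rates)
    mean = sum(exposure_rates) / n_pool
    var = sum((e - mean) ** 2 for e in exposure_rates) / n_pool
    return (n_pool / test_length) * var + test_length / n_pool
```

The formula makes the simultaneous-control point concrete: squeezing exposure rates toward uniformity drives overlap down toward its floor of L/N.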
Peer reviewed | Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
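The length-reliability tradeoff behind this question is captured by the classical Spearman-Brown prophecy formula — a standard result, not the authors' specific classification procedure:

```python
def spearman_brown(rho, k):
    """Reliability of a test whose length is changed by factor k,
    given reliability rho of the original test (Spearman-Brown)."""
    return k * rho / (1.0 + (k - 1.0) * rho)
```

Halving a test with reliability .80 (k = 0.5) drops it to about .67, which is why tests of 15 items or fewer carry large measurement error.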
Peer reviewed | Sunathong, Surintorn; Schumacker, Randall E.; Beyerlein, Michael M. – Journal of Applied Measurement, 2000
Studied five factors that can affect the equating of scores from two tests onto a common score scale through the simulation and equating of 4,860 item data sets. Findings indicate three statistically significant two-way interactions for common item length and test length, item difficulty standard deviation and item distribution type, and item…
Descriptors: Difficulty Level, Equated Scores, Interaction, Item Response Theory
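For context, the simplest transformation that places scores from two forms onto a common scale is linear (mean-sigma) equating, sketched here as a generic illustration rather than the authors' simulation design:

```python
def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Mean-sigma linear equating: map a Form X raw score onto the
    Form Y scale so that the two forms' means and SDs coincide."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)
```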
Wiberg, Marie – International Journal of Testing, 2006
A simulation study of a sequential computerized mastery test is carried out with items modeled by the three-parameter logistic item response theory model. The examinees' responses are either identically distributed, not identically distributed, or not identically distributed together with estimation errors in the item characteristics. The…
Descriptors: Test Length, Computer Simulation, Mastery Tests, Item Response Theory
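The three-parameter logistic model referred to here has the standard form below, written without the optional D = 1.7 scaling constant:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response: lower asymptote c
    (guessing), discrimination a, difficulty b."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

At theta = b the probability is (1 + c) / 2, midway between the guessing floor and 1.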
Neustel, Sandra – 2001
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Descriptors: College Applicants, College Entrance Examinations, Higher Education, Medical Education
Ito, Kyoko; Sykes, Robert C. – 2000
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
Descriptors: Constructed Response, Elementary Education, Essay Tests, Test Construction
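The weighting practice under study amounts to a composite of the form below; this is a minimal sketch, and the actual weights and scoring rules belong to the field tests themselves.

```python
def composite_score(selected_response, constructed_response, cr_weight):
    """Mixed-format composite: selected-response item scores count
    once; each constructed-response score is multiplied by cr_weight."""
    return sum(selected_response) + cr_weight * sum(constructed_response)
```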
Peer reviewed | Berk, Ronald A. – Journal of Experimental Education, 1980
A sampling methodology is proposed for determining lengths of tests designed to assess the comprehension of written discourse. It is based on Bormuth's transformational analysis, within a domain-referenced framework. Guidelines are provided for computing sample size and selecting sentences to which the transformational rules can be applied.…
Descriptors: Reading Comprehension, Reading Tests, Sampling, Test Construction
Peer reviewed | De Ayala, R. J. – Applied Psychological Measurement, 1994
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
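Samejima's graded response model extends the dichotomous case by fitting a 2PL curve at each category threshold; a minimal sketch of the category probabilities:

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.
    thresholds must be increasing; P(X >= k) is a 2PL curve at each
    threshold, and category probabilities are the differences of
    adjacent cumulative curves."""
    cum = [1.0]  # P(X >= lowest category) = 1 by definition
    for b in thresholds:
        cum.append(1.0 / (1.0 + math.exp(-a * (theta - b))))
    cum.append(0.0)  # P(X >= one past the top category) = 0
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]
```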
Peer reviewed | Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
Peer reviewed | Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
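One classical reliability estimate for tests built from distinct item-format subtests is stratified coefficient alpha, given here as a related standard estimate; the congeneric estimate Qualls presents is its own derivation.

```python
def stratified_alpha(part_variances, part_alphas, total_variance):
    """Stratified alpha for a test whose parts (e.g. multiple-choice
    and essay sections) have their own score variances and
    reliabilities:  1 - sum_i var_i * (1 - alpha_i) / var_total."""
    error = sum(v * (1.0 - a) for v, a in zip(part_variances, part_alphas))
    return 1.0 - error / total_variance
```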
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
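A stopping rule of the kind described, driven by the asymptotic standard error 1/sqrt(test information), can be sketched as follows (a generic illustration, not Stocking's or Wingersky's exact procedure):

```python
import math

def should_stop(item_informations, se_target):
    """Variable-length CAT stopping rule: stop once the asymptotic
    standard error of the ability estimate, 1 / sqrt(sum of item
    information at the current estimate), reaches the target."""
    total_info = sum(item_informations)
    if total_info <= 0.0:
        return False  # no information yet: keep administering items
    return 1.0 / math.sqrt(total_info) <= se_target
```

Because information accumulates additively over administered items, the number of items needed is predictable from the information a typical item contributes near the examinee's true score.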
Peer reviewed | Owen, Steven V.; Froman, Robin D. – Educational and Psychological Measurement, 1987
To test further for efficacy of three-option achievement items, parallel three- and five-option item tests were distributed randomly to college students. Results showed no differences in mean item difficulty, mean discrimination or total test score, but a substantial reduction in time spent on three-option items. (Author/BS)
Descriptors: Achievement Tests, Higher Education, Multiple Choice Tests, Test Format
Schulz, E. Matthew; Wang, Lin – 2001
In this study, items were drawn from a full-length test of 30 items in order to construct shorter tests for the purpose of making accurate pass/fail classifications with regard to a specific criterion point on the latent ability metric. A three-parameter item response theory (IRT) framework was used. The criterion point on the latent ability…
Descriptors: Ability, Classification, Item Response Theory, Pass Fail Grading
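Building a short classification test of this kind typically means selecting the items that carry the most Fisher information at the cut point; a sketch under the 3PL model (a generic greedy selection, not necessarily the authors' exact procedure):

```python
import math

def info_3pl(theta, a, b, c):
    """3PL item information at ability theta:
    I = a^2 * ((1 - P) / P) * ((P - c) / (1 - c))^2,
    which reduces to a^2 * P * (1 - P) when c = 0."""
    p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def most_informative(pool, theta_cut, n_items):
    """Pick the n_items (a, b, c) triples from the pool with the
    greatest information at the criterion point theta_cut."""
    return sorted(pool, key=lambda abc: info_3pl(theta_cut, *abc),
                  reverse=True)[:n_items]
```

With equal discriminations and no guessing, the rule simply prefers items whose difficulty sits closest to the cut score.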
