| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 4 |
| Descriptor | Records |
| --- | --- |
| Item Analysis | 41 |
| Test Reliability | 41 |
| Testing Problems | 41 |
| Test Validity | 21 |
| Test Items | 17 |
| Test Construction | 15 |
| Multiple Choice Tests | 8 |
| Response Style (Tests) | 8 |
| Test Interpretation | 8 |
| Error of Measurement | 7 |
| Scoring | 7 |
| Education Level | Records |
| --- | --- |
| Elementary Secondary Education | 1 |
| Audience | Records |
| --- | --- |
| Practitioners | 2 |
| Researchers | 2 |
| Teachers | 1 |
| Location | Records |
| --- | --- |
| Indonesia | 1 |
Djiwandono, Patrisius Istiarto; Ginting, Daniel – Language Education & Assessment, 2025
The teaching of English as a foreign language in Indonesia has a long history, and it is always important to ask whether the assessment of students' language skills has been valid and reliable. A screening of many articles in several prominent databases reveals that a number of evaluation studies have been done by Indonesian scholars in the…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2001
Item-discrimination indices are numbers calculated from test data that are used in assessing the effectiveness of individual test questions. This article asserts that the indices are so unreliable as to suggest that countless good questions may have been discarded over the years. It considers how the indices, and hence overall test reliability,…
Descriptors: Guessing (Tests), Item Analysis, Test Reliability, Testing Problems
Peer reviewed
van den Wollenberg, Arnold L. – Psychometrika, 1982
Presently available test statistics for the Rasch model are shown to be insensitive to violations of the assumption of test unidimensionality. Two new statistics are presented. One is similar to available statistics, but with some improvements; the other addresses the problem of insensitivity to unidimensionality. (Author/JKS)
Descriptors: Item Analysis, Latent Trait Theory, Statistics, Test Reliability
Peer reviewed
Kuncel, Ruth Boutin; Fiske, Donald W. – Educational and Psychological Measurement, 1974
Four hypotheses regarding stability of response process and response in personality testing are tested and supported. (RC)
Descriptors: College Students, Item Analysis, Personality Measures, Response Style (Tests)
Peer reviewed
Andrulis, Richard S.; And Others – Educational and Psychological Measurement, 1978
The effects of repeaters (testees included in both administrations of two forms of a test) on the test equating process are examined. It is shown that repeaters do affect test equating and tend to lower the cutoff point for passing the test. (JKS)
Descriptors: Cutting Scores, Equated Scores, Item Analysis, Scoring
Strickland, Guy – 1970
This report summarizes the findings of Jackson and Lahadern who used a revised form of the Student Opinion Poll (SOP) and a questionnaire to study the intercorrelations of attitudes and achievement. The study found that: (1) first graders have attitudes toward school work but these attitudes were not differentiated toward specific school subjects;…
Descriptors: Achievement, Attitudes, Evaluation, Item Analysis
Peer reviewed
Rusch, Reuben; Steiner, Judith – Journal of Experimental Education, 1979
The Selected Marker Tests were examined for scoring problems and internal consistency and were administered orally to sixth and seventh graders. Scoring problems were discovered and changes were suggested. The problem was found to be item reliability rather than interrater reliability. (Author/MH)
Descriptors: Cognitive Tests, Elementary Education, Item Analysis, Problem Solving
Peer reviewed
Barnette, J. Jackson; And Others – Educational Research Quarterly, 1978
The DELPHI procedure requires respondents to reply to several questionnaire iterations with subsequent rounds containing previous round feedback. This study investigated the methodology (response rates, effects of feedback) and offered evidence that large-scale DELPHI surveys are not as advantageous as has previously been indicated. Suggestions…
Descriptors: Feedback, Item Analysis, Measurement Techniques, Predictive Measurement
Peer reviewed
Whitely, Susan E. – Journal of Educational Measurement, 1977
A debate concerning specific issues and the general usefulness of the Rasch latent trait test model is continued. Methods of estimation, necessary sample size, and the applicability of the model are discussed. (JKS)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Measurement
Peer reviewed
Wright, Benjamin D. – Journal of Educational Measurement, 1977
Statements made in a previous article of this journal concerning the Rasch latent trait test model are questioned. Methods of estimation, necessary sample sizes, several formulae, and the general usefulness of the Rasch model are discussed. (JKS)
Descriptors: Computers, Error of Measurement, Item Analysis, Mathematical Models
Miller, Harry G.; Williams, Reed G. – Educational Technology, 1973
Descriptors: Content Analysis, Item Analysis, Measurement Techniques, Multiple Choice Tests
Linn, Robert – 1978
A series of studies on conceptual and design problems in competency-based measurement is described. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product: the process of marshalling evidence to support…
Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias

