| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 10 |
| Since 2017 (last 10 years) | 33 |
| Since 2007 (last 20 years) | 51 |
| Descriptor | Records |
| --- | --- |
| Test Length | 133 |
| Test Reliability | 133 |
| Test Validity | 63 |
| Test Items | 44 |
| Test Construction | 42 |
| Scores | 24 |
| Test Format | 23 |
| Computer Assisted Testing | 21 |
| Error of Measurement | 20 |
| Foreign Countries | 20 |
| Item Response Theory | 19 |
| Education Level | Records |
| --- | --- |
| Higher Education | 12 |
| Postsecondary Education | 11 |
| Elementary Education | 9 |
| Secondary Education | 6 |
| Early Childhood Education | 4 |
| Grade 6 | 4 |
| Intermediate Grades | 4 |
| Middle Schools | 4 |
| Primary Education | 4 |
| Grade 3 | 3 |
| Grade 5 | 3 |
| Audience | Records |
| --- | --- |
| Researchers | 4 |
| Practitioners | 2 |
| Community | 1 |
| Support Staff | 1 |
| Location | Records |
| --- | --- |
| China | 4 |
| Turkey | 3 |
| Australia | 2 |
| Canada | 2 |
| Ireland | 2 |
| Netherlands | 2 |
| Singapore | 2 |
| United Kingdom | 2 |
| Alabama | 1 |
| California | 1 |
| Germany | 1 |
| Laws, Policies, & Programs | Records |
| --- | --- |
| Job Training Partnership Act… | 1 |
Peer reviewed: Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
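The effective-test-length idea above has a closed form that is often quoted for the Livingston-Lewis procedure: the number of equivalent, equally weighted dichotomous items implied by the score's mean, variance, reliability, and possible score range. The sketch below covers only that step (not the four-parameter beta fitting or the classification tables), and the function name and example numbers are illustrative assumptions rather than values from the article.

```python
# Sketch of the effective-test-length step, as commonly stated for the
# Livingston-Lewis procedure; the beta-model fitting and classification
# tables are not shown. Names and example values are illustrative.

def effective_test_length(mean, var, reliability, x_min=0.0, x_max=1.0):
    """Number of equivalent dichotomous items implied by a score's reliability.

    mean, var    -- observed-score mean and variance
    reliability  -- reliability estimate of the observed score
    x_min, x_max -- minimum and maximum possible scores
    """
    numerator = (mean - x_min) * (x_max - mean) - reliability * var
    denominator = var * (1.0 - reliability)
    return numerator / denominator

# Example: proportion-correct scores with mean .70, variance .03, alpha .85
print(round(effective_test_length(0.70, 0.03, 0.85), 1))  # about 41 items
```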
Peer reviewed: Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
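For a test built from distinct item formats, one widely used part-test estimate is stratified coefficient alpha, which discounts each stratum's variance by its own internal consistency. Whether this is the exact estimate illustrated in the article is an assumption; the code below is a generic sketch with hypothetical data.

```python
# Generic sketch of stratified coefficient alpha for a test whose items fall
# into strata (e.g., a multiple-choice section and an essay section). Not
# taken verbatim from the article above.
import numpy as np

def cronbach_alpha(items):
    """items: (n_examinees, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    return k / (k - 1) * (1.0 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

def stratified_alpha(strata):
    """strata: list of (n_examinees, n_items) arrays, one per item format."""
    strata = [np.asarray(s, dtype=float) for s in strata]
    total_var = np.hstack(strata).sum(axis=1).var(ddof=1)
    penalty = sum(s.sum(axis=1).var(ddof=1) * (1.0 - cronbach_alpha(s))
                  for s in strata)
    return 1.0 - penalty / total_var

# Hypothetical example: 200 examinees, a 10-item MC stratum and a 4-item essay stratum
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
mc = (ability + rng.normal(size=(200, 10)) > 0).astype(float)
essay = np.clip(np.round(2 + ability + rng.normal(size=(200, 4))), 0, 4)
print(round(stratified_alpha([mc, essay]), 3))
```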
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Peer reviewed: Donders, Jacques – Psychological Assessment, 1997
Eight subtests were selected from the Wechsler Intelligence Scale for Children--Third Edition (WISC-III) to make a short form for clinical use. Results with the 2,200 children from the WISC-III standardization sample indicated the adequate reliability and validity of the short form for clinical use. (SLD)
Descriptors: Children, Clinical Diagnosis, Intelligence Tests, Test Format
Peer reviewed: Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) (1994) consistently overestimated the values. More accurate values are provided for the WAIS--R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
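As a simpler, standard companion to the binomial-error treatment above, the Spearman-Brown prophecy formula can be solved for the lengthening factor needed to reach a target reliability, which shows how quickly required length grows as the target rises. This is the textbook relationship, not Burton's model, and the numbers below are illustrative.

```python
# Spearman-Brown prophecy formula solved for the lengthening factor k needed
# to raise a test's reliability to a target value (parallel items assumed).
# Illustrative only; this is not the binomial-error analysis discussed above.

def lengthening_factor(current_rel, target_rel):
    return (target_rel * (1.0 - current_rel)) / (current_rel * (1.0 - target_rel))

current_items, current_rel = 20, 0.70
for target in (0.80, 0.85, 0.90):
    k = lengthening_factor(current_rel, target)
    print(f"target {target:.2f}: about {round(current_items * k)} items")
# target 0.80: about 34 items; 0.85: about 49 items; 0.90: about 77 items
```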
Peer reviewed: Rowley, Glenn – Journal of Educational Measurement, 1978
The reliabilities of various observational measures were determined, and the influence of both the number and the length of the observation periods on reliability was examined, both separately and jointly. A single simplifying assumption leads to a variant of the Spearman-Brown formula, which may have wider application. (Author/CTM)
Descriptors: Career Development, Classroom Observation Techniques, Observation, Reliability
Peer reviewed: Nelson, W. M., III; And Others – Journal of Personality Assessment, 1978
This study used 126 young adult black and white male inmates to test the comparability of the Pauker and the Satz-Mogel short forms with the standard Wechsler Adult Intelligence Scale (WAIS). The Pauker form was superior with this population. Findings should not be generalized to other ages, races, or to women. (Author/CP)
Descriptors: Intelligence, Intelligence Differences, Intelligence Tests, Males
Peer reviewed: Budescu, David – Journal of Educational Measurement, 1985
An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)
Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
Peer reviewed: Cliff, Norman; And Others – Applied Psychological Measurement, 1979
Monte Carlo research with TAILOR, a program using implied orders as a basis for tailored testing, is reported. TAILOR typically required about half the available items to estimate, for each simulated examinee, the responses on the remainder. (Author/CTM)
Descriptors: Adaptive Testing, Computer Programs, Item Sampling, Nonparametric Statistics
Peer reviewed: Meijer, Rob R.; And Others – Applied Psychological Measurement, 1994
The power of the nonparametric person-fit statistic, U3, is investigated through simulations as a function of item characteristics, test characteristics, person characteristics, and the group to which examinees belong. Results suggest conditions under which relatively short tests can be used for person-fit analysis. (SLD)
Descriptors: Difficulty Level, Group Membership, Item Response Theory, Nonparametric Statistics
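For readers unfamiliar with U3: it compares an examinee's weighted response pattern with the best and worst Guttman patterns possible at the same number-correct score, so 0 indicates perfect conformity and values near 1 indicate aberrance. The sketch below follows the usual textbook definition (classical proportions correct, simplified tie handling), not the simulation design of the article.

```python
# Rough sketch of the nonparametric U3 person-fit statistic as commonly
# defined: 0 = perfect Guttman pattern, values near 1 = strongly aberrant.
# Tie handling and edge cases are simplified; illustrative only.
import numpy as np

def u3(responses, p_correct):
    """responses: 0/1 vector for one examinee; p_correct: item proportions correct."""
    x = np.asarray(responses, dtype=float)
    p = np.asarray(p_correct, dtype=float)
    logits = np.log(p / (1.0 - p))
    order = np.argsort(-logits)          # easiest items first
    r = int(x.sum())
    if r == 0 or r == len(x):
        return 0.0                       # all-wrong/all-right patterns carry no misfit information
    w = float(x @ logits)
    w_max = logits[order[:r]].sum()      # correct on the r easiest items
    w_min = logits[order[-r:]].sum()     # correct on the r hardest items
    return (w_max - w) / (w_max - w_min)

p = [0.9, 0.8, 0.7, 0.5, 0.3, 0.2]
print(round(u3([0, 0, 1, 0, 1, 1], p), 2))  # misses easy items, hits hard ones -> near 1
print(round(u3([1, 1, 1, 0, 0, 0], p), 2))  # Guttman pattern -> 0.0
```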
Tailor-APL: An Interactive Computer Program for Individual Tailored Testing. Technical Report No. 5.
McCormick, Douglas J. – 1978
Tailored testing increases the efficiency of tests by selecting for each person a set of items from an item pool whose difficulty maximizes the information provided by the score. The tailored testing procedure designed by Cliff orders persons and items on a common ordinal scale and…
Descriptors: Adaptive Testing, Branching, Computer Assisted Testing, Computer Programs
Peer reviewed: Munson, J. Michael; McQuarrie, Edward F. – Educational and Psychological Measurement, 1987
A shortened version of Zaichkowsky's 20-item Personal Involvement Inventory was created by removing four items that noncollege-educated respondents might find difficult to understand. The 16-item modified version had acceptable internal consistency, test-retest reliability, and factorial and predictive validity. (Author/GDC)
Descriptors: Factor Structure, Higher Education, Interest Inventories, Personality Measures
Peer reviewed: Willson, Victor L.; Reynolds, Cecil R. – Educational and Psychological Measurement, 1985
Techniques for constructing short forms of tests are discussed, and an example is given using the Wechsler Adult Intelligence Scale-Revised. Reliability and validity estimation equations are presented. (GDC)
Descriptors: Adults, Individual Testing, Intelligence Tests, Norm Referenced Tests
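A common ingredient of such short-form techniques is the reliability of a composite formed by summing the retained subtests, computed from subtest reliabilities and the covariance matrix. The sketch below shows that generic relationship with hypothetical values; it is not asserted to be the specific equations presented in the article.

```python
# Standard reliability-of-a-composite computation for a short form built by
# summing retained subtests. Generic textbook relationship; values are hypothetical.
import numpy as np

def composite_reliability(cov, reliabilities):
    """cov: subtest covariance matrix; reliabilities: per-subtest reliability estimates."""
    cov = np.asarray(cov, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    total_var = cov.sum()                           # variance of the summed score
    error_var = (np.diag(cov) * (1.0 - rel)).sum()  # error variance from each subtest
    return 1.0 - error_var / total_var

# Example: a two-subtest short form (hypothetical variances, covariance, reliabilities)
cov = [[9.0, 4.5],
       [4.5, 8.0]]
print(round(composite_reliability(cov, [0.85, 0.80]), 3))  # about 0.887
```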
Peer reviewed: Kipps, Debi; Hanson, Dave – School Psychology Review, 1983
The Peabody Picture Vocabulary Test-Revised (Dunn and Dunn) is described as a convenient, quick test, possessing improvements over the original. It measures a subject's receptive (hearing) vocabulary for Standard American English. However, the validity information for the test is less than adequate, since no validity studies are presented for it.…
Descriptors: Auditory Tests, Individual Testing, Scores, Test Length

