Publication Date
| Date range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 0 |
| Since 2007 (last 20 years) | 6 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Statistical Analysis | 32 |
| Test Reliability | 32 |
| Test Results | 32 |
| Test Validity | 15 |
| Test Interpretation | 12 |
| Test Construction | 11 |
| Evaluation Methods | 7 |
| Standardized Tests | 7 |
| Criterion Referenced Tests | 6 |
| Scores | 6 |
| Difficulty Level | 5 |
Author
| Author | Records |
| --- | --- |
| Aaronson, May | 1 |
| Alonzo, Julie | 1 |
| Bayuk, Robert J. | 1 |
| Benson, Jeri | 1 |
| Berk, Ronald A. | 1 |
| Besel, Ronald | 1 |
| Blatchford, Charles H. | 1 |
| Caselli, M. Cristina | 1 |
| Charters, Moire C. | 1 |
| Crocker, Linda | 1 |
| Devescovi, Antonella | 1 |
Publication Type
| Type | Records |
| --- | --- |
| Reports - Research | 18 |
| Journal Articles | 6 |
| Guides - General | 2 |
| Numerical/Quantitative Data | 2 |
| Reports - Evaluative | 2 |
| Speeches/Meeting Papers | 2 |
| Guides - Non-Classroom | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
Education Level
| Level | Records |
| --- | --- |
| Secondary Education | 3 |
| High Schools | 2 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 2 | 1 |
| Grade 9 | 1 |
| Higher Education | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
| Postsecondary Education | 1 |
Audience
| Audience | Records |
| --- | --- |
| Researchers | 1 |
Location
| Location | Records |
| --- | --- |
| Australia | 1 |
| California | 1 |
| Indonesia | 1 |
| Maryland | 1 |
Assessments and Surveys
| Instrument | Records |
| --- | --- |
| ACT Assessment | 1 |
| Adjective Check List | 1 |
| Armed Services Vocational… | 1 |
| Mean Length of Utterance | 1 |
| Miller Analogies Test | 1 |
| National Assessment of… | 1 |
| Program for International… | 1 |
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
Rusilowati, Ani; Kurniawati, Lina; Nugroho, Sunyoto E.; Widiyatmoko, Arif – International Journal of Environmental and Science Education, 2016
The purpose of this study is to develop a scientific literacy evaluation instrument, testing its validity, reliability, and characteristics for measuring students' scientific literacy across four categories: science as a body of knowledge (category A), science as a way of thinking (category B), science as a…
Descriptors: Foreign Countries, Junior High School Students, Grade 9, Test Construction
Lai, Cheng-Fei; Irvin, P. Shawn; Alonzo, Julie; Park, Bitnara Jasmine; Tindal, Gerald – Behavioral Research and Teaching, 2012
In this technical report, we present the results of a reliability study of the second-grade multiple choice reading comprehension measures available on the easyCBM learning system conducted in the spring of 2011. Analyses include split-half reliability, alternate form reliability, person and item reliability as derived from Rasch analysis,…
Descriptors: Reading Comprehension, Testing Programs, Statistical Analysis, Elementary School Students
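The split-half coefficient mentioned in the Lai et al. report can be sketched in a few lines. This is a generic illustration (odd/even item split with the Spearman-Brown correction to full test length), not the easyCBM analysis code:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def split_half_reliability(item_scores):
    """item_scores: one list of per-item scores per examinee.
    Correlate odd-item and even-item half-test totals, then apply
    the Spearman-Brown correction: r_full = 2r / (1 + r)."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Four examinees, four dichotomously scored items
scores = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
print(split_half_reliability(scores))  # → 0.625
```

The odd/even split is only one of many possible halvings; the report's Rasch-based person and item reliability estimates require a full IRT fit and are not shown here.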
O'Toole, J. M.; King, R. A. R. – Language Testing, 2011
The "cloze" test is one possible investigative instrument for predicting text comprehensibility. Conceptual coding of student replacement of deleted words has been considered to be more valid than exact coding, partly because conceptual coding seemed fairer to poorer readers. This paper reports a quantitative study of 447 Australian…
Descriptors: Cloze Procedure, Test Results, Language Tests, Reading Comprehension
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Peer reviewed: Doppelt, Jerome E. – Educational and Psychological Measurement, 1971
Descriptors: Aptitude Tests, Scores, Statistical Analysis, Test Reliability
Stocking, Martha; And Others – 1973
For two tests measuring the same trait, the BIV20 program equates the scores using the two true-score distributions estimated by the univariate Method 20 program (see Wingersky, Lees, Lennon, and Lord, 1969) and, with these equated true scores and their distributions, estimates the bivariate distribution of scores and the relative efficiency of the…
Descriptors: Computer Programs, Equated Scores, Statistical Analysis, Test Reliability
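BIV20's true-score equating requires the Method 20 distribution estimates. As a much simpler illustration of the same goal (putting Form X scores on the Form Y scale), linear equating matches the two forms' means and standard deviations; this sketch is a stand-in, not the report's procedure:

```python
from statistics import mean, pstdev

def linear_equate(x_scores, y_scores):
    """Return a function mapping Form X scores onto the Form Y scale
    by matching means and standard deviations (linear equating):
    y(x) = mu_Y + (sigma_Y / sigma_X) * (x - mu_X)."""
    mx, sx = mean(x_scores), pstdev(x_scores)
    my, sy = mean(y_scores), pstdev(y_scores)
    return lambda x: my + (sy / sx) * (x - mx)

# Toy samples from two forms of the same test
equate = linear_equate([10, 20, 30], [15, 25, 35])
print(equate(20))  # → 25.0 (Form X mean maps to Form Y mean)
```

Linear equating assumes the two score distributions differ only in location and scale; true-score methods like the one BIV20 implements relax that assumption at the cost of estimating full distributions.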
Whaley, Donald L. – 1973
An introductory textbook on psychological tests and measurements is presented in paperback booklet form. The style is informal and humorous, and the book is intended to appeal to the contemporary student. Ten chapters constitute the text: (1) On Measurement and Existence; (2) A Brief, Imprecise History of Psychological Testing; (3) The Creation…
Descriptors: Measurement, Psychological Testing, Sampling, Statistical Analysis
Peer reviewed: Berk, Ronald A. – Journal of Educational Measurement, 1980
A dozen different approaches that yield 13 reliability indices for criterion-referenced tests were identified and grouped into three categories: threshold loss function, squared-error loss function, and domain score estimation. Indices were evaluated within each category. (Author/RL)
Descriptors: Classification, Criterion Referenced Tests, Cutting Scores, Evaluation Methods
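The threshold-loss family Berk surveys can be sketched minimally: raw decision consistency (the proportion of examinees classified the same way on two administrations) and Cohen's kappa, which corrects that agreement for chance. Names and cutoff handling here are illustrative, not Berk's notation:

```python
def decision_consistency(form1, form2, cutoff):
    """Proportion of examinees given the same master/nonmaster
    classification on two test administrations (p0)."""
    same = sum((a >= cutoff) == (b >= cutoff) for a, b in zip(form1, form2))
    return same / len(form1)

def kappa(form1, form2, cutoff):
    """Cohen's kappa: agreement corrected for the agreement
    expected by chance given the marginal mastery rates."""
    n = len(form1)
    p0 = decision_consistency(form1, form2, cutoff)
    p1 = sum(a >= cutoff for a in form1) / n
    p2 = sum(b >= cutoff for b in form2) / n
    pc = p1 * p2 + (1 - p1) * (1 - p2)
    return (p0 - pc) / (1 - pc)

# Four examinees, two forms, mastery cutoff of 6
print(decision_consistency([5, 8, 3, 9], [6, 7, 2, 10], 6))  # → 0.75
print(kappa([5, 8, 3, 9], [6, 7, 2, 10], 6))                 # → 0.5
```

p0 belongs to the threshold-loss category in Berk's grouping; the squared-error loss and domain score estimation indices he also reviews weigh how far scores fall from the cutoff rather than classification agreement alone.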
Devescovi, Antonella; Caselli, M. Cristina – International Journal of Language & Communication Disorders, 2007
The mean length of utterance in the Sentence Repetition Task grew from approximately two to three words, and the number of omissions of articles, prepositions and modifiers significantly decreased. After 3;0 years old, omissions of free function words practically disappeared. The results of Study 2 showed that mean length of utterance, omission of…
Descriptors: Test Reliability, Test Results, Statistical Analysis, Memory
Roudabush, Glenn E.; Green, Donald Ross – 1972
In determining how reliable is reliable enough and how much error can be tolerated in criterion-referenced testing, the following relationships hold: (1) the more specific an objective is, the fewer the items required to reliably measure it; (2) the more specific the objectives are, the more objectives required to cover a given span of the…
Descriptors: Behavioral Objectives, Criterion Referenced Tests, Diagnostic Tests, Statistical Analysis
Valentine, Lonnie D., Jr.; Massey, Iris H. – 1976
Male and female enlistees were compared on the basis of their performance on the Armed Services Vocational Aptitude Battery. Mean Aptitude Index scores were compared for male and female enlistees on the original testing and on retest. Males scored higher on mechanical and electronics, and females scored higher on administrative and general. Both…
Descriptors: Aptitude Tests, Attitude Measures, Enlisted Personnel, Item Analysis
Primoff, Ernest S. – 1971
This report shows how Beta weights for the J-Coefficient may be easily developed without a formal validity study, and indicates how indications of ability other than tests can be used to measure the same abilities that are measured by tests. See also TM 001 163-64,166 for further information on job elements (J-Scale) procedures. (Author/DLG)
Descriptors: Achievement Rating, Correlation, Evaluation Criteria, Occupational Tests
Bayuk, Robert J. – 1973
An investigation was conducted to determine the effects of response-category weighting and item weighting on reliability and predictive validity. Response-category weighting refers to scoring in which, for each category (including omit and "not read"), a weight is assigned that is proportional to the mean criterion score of examinees selecting…
Descriptors: Aptitude Tests, Correlation, Predictive Validity, Research Reports
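The response-category weighting Bayuk describes assigns each category a weight proportional to the mean criterion score of the examinees who selected it. A minimal sketch, assuming per-examinee response categories and criterion scores are available (all names are illustrative):

```python
from collections import defaultdict

def category_weights(responses, criterion):
    """responses: chosen category per examinee for one item
    (e.g. 'A', 'B', 'omit', 'not read'); criterion: matching
    criterion scores. Weight each category by the mean criterion
    score of the examinees who chose it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, score in zip(responses, criterion):
        sums[cat] += score
        counts[cat] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}

# Four examinees answering one item
print(category_weights(['A', 'B', 'A', 'omit'], [4, 2, 6, 1]))
# → {'A': 5.0, 'B': 2.0, 'omit': 1.0}
```

Scoring an examinee then means summing the weights of their chosen categories across items, in place of the usual 0/1 correct-answer scoring whose effects on reliability and predictive validity the study investigates.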
Peer reviewed: Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979
For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)
Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement
