Showing 1 to 15 of 25 results
Peer reviewed
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions; the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policies and quality assurance, their validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Peer reviewed
Kogar, Esin Yilmaz; Kelecioglu, Hülya – Journal of Education and Learning, 2017
The purpose of this research is to first estimate the item and ability parameters, and the standard error values related to those parameters, obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF), and Testlet Response Theory (TRT) models in tests that include testlets, when the number of testlets, number of independent items, and…
Descriptors: Item Response Theory, Models, Mathematics Tests, Test Items
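As a rough, hedged illustration of the model family compared in this study (not the authors' code or data), the sketch below simulates item responses under a unidimensional Rasch-type model and under a testlet variant in which items within the same testlet share a person-specific random effect; all sample sizes and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, b, gamma=0.0):
    """Probability of a correct response under a Rasch-type model,
    optionally shifted by a person-specific testlet effect gamma."""
    return 1.0 / (1.0 + np.exp(-(theta - b + gamma)))

n_persons, n_items, testlet_size = 500, 12, 4
theta = rng.normal(0, 1, n_persons)            # person abilities (hypothetical)
b = rng.normal(0, 1, n_items)                  # item difficulties (hypothetical)
testlet = np.repeat(np.arange(n_items // testlet_size), testlet_size)

# Unidimensional model: no testlet effect.
p_uni = p_correct(theta[:, None], b[None, :])

# Testlet model: each person gets an extra random effect per testlet,
# inducing local dependence among items in the same testlet.
gamma = rng.normal(0, 0.8, (n_persons, n_items // testlet_size))
p_tlt = p_correct(theta[:, None], b[None, :], gamma[:, testlet])

resp_uni = rng.binomial(1, p_uni)
resp_tlt = rng.binomial(1, p_tlt)
print(resp_uni.mean(), resp_tlt.mean())
```

The testlet effect is what produces the within-testlet local dependence that motivates bifactor and TRT models over UIRT.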
Peer reviewed
Methe, Scott A.; Briesch, Amy M.; Hulac, David – Assessment for Effective Intervention, 2015
At present, it is unclear whether math curriculum-based measurement (M-CBM) procedures provide a dependable measure of student progress in math computation, because support for their technical properties is based largely upon a body of correlational research. Recent investigations into the dependability of M-CBM scores have found that evaluating…
Descriptors: Measurement Techniques, Error of Measurement, Mathematics Curriculum, Curriculum Based Assessment
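The dependability question raised in this record is typically framed in generalizability-theory terms. A minimal sketch, assuming a one-facet person-by-probe design and invented variance components (not values from the study), shows how the dependability of an average over n probes grows with n.

```python
# Hedged generalizability-theory sketch with made-up variance components.
var_person = 25.0      # true-score (person) variance, hypothetical
var_probe = 4.0        # probe (form) variance, hypothetical
var_residual = 16.0    # person-by-probe interaction plus error, hypothetical

def dependability(n_probes):
    # Index of dependability (phi) for the mean of n_probes probes
    # in a one-facet person-by-probe design.
    abs_error = (var_probe + var_residual) / n_probes
    return var_person / (var_person + abs_error)

for n in (1, 3, 5, 10):
    print(n, round(dependability(n), 3))
```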
Peer reviewed
McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015
An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…
Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
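Phillips's point about unrecognized design effects can be illustrated with the standard cluster-sampling approximation DEFF = 1 + (m - 1) * rho; this is a textbook formula rather than material from the article, and the numbers below are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Textbook cluster-sampling design effect: DEFF = 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc

# Hypothetical state assessment: 2,500 students sampled as 100 intact
# classrooms of 25, with a modest intraclass correlation of 0.10.
n, m, rho = 2500, 25, 0.10
deff = design_effect(m, rho)
n_effective = n / deff
se_inflation = math.sqrt(deff)

print(f"DEFF = {deff:.1f}")                 # 3.4
print(f"effective n = {n_effective:.0f}")   # roughly 735, not 2,500
print(f"standard errors understated by a factor of {se_inflation:.2f}"
      " if clustering is ignored")
```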
Peer reviewed
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
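ETS DIF screening rests largely on the Mantel-Haenszel statistic. The sketch below pools invented 2x2 tables across score strata, converts the common odds ratio to the ETS delta-difference scale (MH D-DIF = -2.35 ln alpha), and applies a simplified version of the A/B/C flagging convention; the operational rules reviewed in the report also involve significance tests that are omitted here.

```python
import math

# Each stratum (total-score level) is a 2x2 table:
# (reference correct, reference incorrect, focal correct, focal incorrect).
# Counts below are invented for illustration.
strata = [
    (90, 10, 80, 20),
    (70, 30, 55, 45),
    (40, 60, 30, 70),
]

num = den = 0.0
for r_c, r_i, f_c, f_i in strata:
    n_t = r_c + r_i + f_c + f_i
    num += r_c * f_i / n_t
    den += r_i * f_c / n_t

alpha_mh = num / den                   # Mantel-Haenszel common odds ratio
mh_d_dif = -2.35 * math.log(alpha_mh)  # ETS delta-difference scale

# Simplified A/B/C categories (significance tests omitted):
# |MH D-DIF| < 1 -> A, >= 1.5 -> C, otherwise B.
magnitude = abs(mh_d_dif)
category = "A" if magnitude < 1.0 else ("C" if magnitude >= 1.5 else "B")
print(f"alpha_MH = {alpha_mh:.2f}, MH D-DIF = {mh_d_dif:.2f}, category {category}")
```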
Peer reviewed
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
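As a hedged companion to the regression-assumptions discussion (not code from either article), the sketch below fits an ordinary least-squares model to simulated data and runs two common checks: a Shapiro-Wilk test on the residuals for normality and a Breusch-Pagan test for heteroscedasticity.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)

# Simulated data: a linear relation with homoscedastic normal errors.
x = rng.normal(size=(200, 2))
y = 1.0 + 0.5 * x[:, 0] - 0.3 * x[:, 1] + rng.normal(scale=1.0, size=200)

X = sm.add_constant(x)
model = sm.OLS(y, X).fit()

# Normality of residuals (one of several ways to probe this assumption).
shapiro_stat, shapiro_p = stats.shapiro(model.resid)

# Homoscedasticity: Breusch-Pagan test against the design matrix.
bp_stat, bp_p, _, _ = het_breuschpagan(model.resid, X)

print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Breusch-Pagan p = {bp_p:.3f}")
```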
Peer reviewed
Pae, Hye K.; Greenberg, Daphne; Morris, Robin D. – Language Assessment Quarterly, 2012
The aim of this study was to apply the Rasch model to an analysis of the psychometric properties of the Peabody Picture Vocabulary Test--III Form A (PPVT--IIIA) items with struggling adult readers. The PPVT--IIIA was administered to 229 African American adults whose isolated word reading skills were between third and fifth grades. Conformity of…
Descriptors: African Americans, Test Items, Construct Validity, Test Validity
Peer reviewed
Anderson, Trevor R.; Rogan, John M. – Biochemistry and Molecular Biology Education, 2010
Student assessment is central to the educational process and can be used for multiple purposes, including promoting student learning, grading student performance, and evaluating the educational quality of qualifications. It is, therefore, of utmost importance that assessment instruments are of a high quality. In this article, we present various…
Descriptors: Educational Assessment, Educational Quality, Student Evaluation, Educational Research
Peer reviewed
Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011
The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…
Descriptors: Test Items, Item Analysis, Achievement Tests, French
Peer reviewed
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries
Setzer, J. Carl – GED Testing Service, 2009
The GED[R] English as a Second Language (GED ESL) Test was designed to serve as an adjunct to the GED test battery when an examinee takes either the Spanish- or French-language version of the tests. The GED ESL Test is a criterion-referenced, multiple-choice instrument that assesses the functional, English reading skills of adults whose first…
Descriptors: Language Tests, High School Equivalency Programs, Psychometrics, Reading Skills
Peer reviewed
Whitely, Susan E. – Journal of Educational Measurement, 1977
A debate concerning specific issues and the general usefulness of the Rasch latent trait test model is continued. Methods of estimation, necessary sample size, and the applicability of the model are discussed. (JKS)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Measurement
Peer reviewed
Wright, Benjamin D. – Journal of Educational Measurement, 1977
Statements made in a previous article of this journal concerning the Rasch latent trait test model are questioned. Methods of estimation, necessary sample sizes, several formulas, and the general usefulness of the Rasch model are discussed. (JKS)
Descriptors: Computers, Error of Measurement, Item Analysis, Mathematical Models
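For readers new to the model debated in the two preceding records: the Rasch model gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The snippet below simply evaluates that probability; it is a standard formula, not material from either article.

```python
import math

def rasch_probability(theta, b):
    """Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A person one logit above an item's difficulty answers correctly
# about 73% of the time; at equal ability and difficulty, 50%.
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
print(round(rasch_probability(0.5, 0.5), 3))  # 0.5
```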
Peer reviewed
Michael, William B.; And Others – Educational and Psychological Measurement, 1978
For each of two revised forms of the Dimensions of Self-Concept measure (intermediate and secondary forms), statistical information is presented concerning the intercorrelations of each of five factor scales, the reliability and standard error of measurement of each scale, and the results of item analyses. (Author/JKS)
Descriptors: Academic Achievement, Elementary Secondary Education, Error of Measurement, Factor Analysis
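The standard error of measurement reported for each scale follows, in classical test theory, from the scale's standard deviation and reliability (SEM = SD * sqrt(1 - reliability)). A small sketch with made-up values, not figures from the article:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical scale: SD = 10, internal-consistency reliability = .84.
sem = standard_error_of_measurement(10.0, 0.84)
print(f"SEM = {sem:.1f}")   # 4.0, so a 68% band is roughly the observed score +/- 4
```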