Showing all 12 results
Peer reviewed
Direct link
Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2018
The maximal reliability of a congeneric measure is achieved by weighting item scores to form the optimal linear combination as the total score; it is never lower than the composite reliability of the measure when measurement errors are uncorrelated. The statistical method that renders maximal reliability would also lead to maximal criterion…
Descriptors: Test Reliability, Test Validity, Comparative Analysis, Attitude Measures
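The contrast the abstract draws can be illustrated with a small sketch. For a congeneric measure with factor loadings λ_i and uncorrelated error variances θ_i, composite (omega) reliability weights items equally, while maximal reliability weights them proportionally to λ_i/θ_i; the numbers below are illustrative, not taken from the article.

```python
# Sketch of composite vs. maximal reliability for a congeneric measure.
# Loadings (lam) and error variances (theta) are hypothetical values.

def composite_reliability(lam, theta):
    """McDonald's omega: reliability of the unit-weighted total score."""
    s = sum(lam)
    return s * s / (s * s + sum(theta))

def maximal_reliability(lam, theta):
    """Reliability of the optimally weighted total (weights w_i = lam_i / theta_i)."""
    a = sum(l * l / t for l, t in zip(lam, theta))
    return a / (1.0 + a)

lam = [0.8, 0.7, 0.6, 0.5]        # hypothetical factor loadings
theta = [0.36, 0.51, 0.64, 0.75]  # hypothetical error variances

omega = composite_reliability(lam, theta)
rho_max = maximal_reliability(lam, theta)
# With uncorrelated errors, maximal reliability never falls below omega:
assert rho_max >= omega
```

The inequality in the final assertion is the property the abstract states: optimal weighting can only match or exceed the reliability of the unit-weighted composite.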
Peer reviewed
PDF on ERIC Download full text
Chongo, Samri; Osman, Kamisah; Nayan, Nazrul Anuar – EURASIA Journal of Mathematics, Science and Technology Education, 2021
Computational thinking (CT) is a systematic approach to problem solving and is widely accepted as an important skill in the 21st century. This study aimed to identify the effectiveness of the Chemistry Computational Thinking (CT-CHEM) Module on achievement in chemistry. The study employed a quasi-experimental design with the participation…
Descriptors: Chemistry, Science Instruction, Thinking Skills, Achievement Tests
Peer reviewed
Direct link
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang – Journal of Educational Measurement, 2015
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Descriptors: Classification, Reliability, Accuracy, Cognitive Tests
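The pattern-level indices the abstract refers to can be sketched from examinee posterior probabilities over attribute patterns: consistency as the expected agreement of classifications across two parallel administrations, accuracy as the expected probability that the modal classification is correct. This is a simplified illustration in the spirit of the Cui, Gierl, and Chang indices, not their exact formulation; all data are made up.

```python
# Hedged sketch of pattern-level classification consistency and accuracy
# in cognitive diagnostic assessment, from posterior probabilities.

def pattern_consistency(posteriors):
    # Mean over examinees of the sum of squared posterior probabilities:
    # the chance of receiving the same classification twice.
    return sum(sum(p * p for p in post) for post in posteriors) / len(posteriors)

def pattern_accuracy(posteriors):
    # Mean over examinees of the largest posterior probability:
    # the chance the modal classification is the true one.
    return sum(max(post) for post in posteriors) / len(posteriors)

# Each row: one examinee's posterior over four attribute patterns.
posteriors = [
    [0.70, 0.20, 0.05, 0.05],
    [0.10, 0.80, 0.05, 0.05],
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain examinee
]
```

Note how the uniform third row drags both indices down: uncertain posteriors mean inconsistent and inaccurate classifications.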
Peer reviewed
Direct link
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, a persistent problem in peer assessment is that reliability depends on rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
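One common way to incorporate a rater parameter, as the abstract describes, is a Rasch-style model with an additive rater-severity term. The sketch below is a generic illustration of that idea, not the specific models evaluated in the article; parameter names and values are our own.

```python
import math

# Illustrative Rasch-style model with a rater-severity parameter:
# logit P(positive) = theta - item_difficulty - rater_severity.

def p_positive(theta, item_difficulty, rater_severity):
    """Probability that a rater scores the response positively."""
    logit = theta - item_difficulty - rater_severity
    return 1.0 / (1.0 + math.exp(-logit))

# A lenient rater (severity -0.5) vs. a harsh rater (severity +0.5)
# rating the same examinee (theta = 0.0) on the same item:
lenient = p_positive(0.0, 0.0, -0.5)
harsh = p_positive(0.0, 0.0, 0.5)
assert lenient > harsh  # severity shifts the whole scale downward
```

Estimating the severity terms alongside the usual ability and difficulty parameters is what lets such models adjust peer-assessment scores for who did the rating.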
National Centre for Vocational Education Research (NCVER), 2016
This work asks one simple question: "how reliable is the method used by the National Centre for Vocational Education Research (NCVER) to estimate projected rates of VET program completion?" In other words, how well do early projections align with actual completion rates some years later? Completion rates are simple to calculate with a…
Descriptors: Vocational Education, Graduation Rate, Predictive Measurement, Predictive Validity
Peer reviewed
Direct link
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Peer reviewed
PDF on ERIC Download full text
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Peer reviewed
Direct link
Kim, Seonghoon; Feldt, Leonard S. – Asia Pacific Education Review, 2010
The primary purpose of this study is to investigate the mathematical characteristics of the test reliability coefficient rho[subscript XX'] as a function of item response theory (IRT) parameters and present the lower and upper bounds of the coefficient. Another purpose is to examine relative performances of the IRT reliability statistics and two…
Descriptors: Testing, Test Reliability, Statistics, Item Response Theory
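One common way to express test reliability as a function of IRT parameters, in the spirit of what the abstract describes, is a marginal coefficient ρ = Var(θ) / (Var(θ) + E[1/I(θ)]), averaging the error variance 1/I(θ) over the ability distribution. The sketch below assumes a 2PL model and θ ~ N(0, 1); it is one standard formulation, not necessarily the exact coefficient studied by Kim and Feldt, and the item parameters are illustrative.

```python
import math

def info_2pl(theta, a, b):
    """Test information for a 2PL model at ability theta."""
    total = 0.0
    for ai, bi in zip(a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))
        total += ai * ai * p * (1.0 - p)
    return total

def marginal_reliability(a, b, lo=-4.0, hi=4.0, n=801):
    """rho = 1 / (1 + E[1/I(theta)]), with E[] over N(0,1) on a grid."""
    step = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        t = lo + i * step
        w = math.exp(-0.5 * t * t)  # unnormalized N(0,1) density
        num += w / info_2pl(t, a, b)
        den += w
    mean_error_var = num / den
    return 1.0 / (1.0 + mean_error_var)  # Var(theta) = 1

a = [1.2, 0.8, 1.5, 1.0, 0.9]    # hypothetical discriminations
b = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical difficulties
rho = marginal_reliability(a, b)
assert 0.0 < rho < 1.0
```

The coefficient is bounded in (0, 1) and increases with test information, which is the kind of behavior the article's bounds characterize analytically.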
Peer reviewed
Direct link
Lincove, Jane Arnold; Osborne, Cynthia; Dillon, Amanda; Mills, Nicholas – Journal of Teacher Education, 2014
Despite questions about validity and reliability, the use of value-added estimation methods has moved beyond academic research into state accountability systems for teachers, schools, and teacher preparation programs (TPPs). Prior studies of value-added measurement for TPPs test the validity of researcher-designed models and find that measuring…
Descriptors: Teacher Education Programs, Accountability, Politics of Education, School Statistics
Peer reviewed
Direct link
Seethaler, Pamela M.; Fuchs, Lynn S. – Exceptional Children, 2010
This study examined the reliability, validity, and predictive utility of kindergarten screening for risk for math difficulty (MD). Three screening measures, administered in September and May of kindergarten to 196 students, assessed number sense and computational fluency. Conceptual and procedural outcomes were measured at end of first grade, with…
Descriptors: Test Validity, Kindergarten, Grade 1, Screening Tests
Peer reviewed
Direct link
Slade, Peter D.; Townes, Brenda D.; Rosenbaum, Gail; Martins, Isabel P.; Luis, Henrique; Bernardo, Mario; Martin, Michael D.; DeRouen, Timothy A. – Psychological Assessment, 2008
When serial neurocognitive assessments are performed, 2 main factors are of importance: test-retest reliability and practice effects. With children, however, there is a third, developmental factor, which occurs as a result of maturation. Child tests recognize this factor through the provision of age-corrected scaled scores. Thus, a ready-made…
Descriptors: Validity, Diagnostic Tests, Test Reliability, Children
Peer reviewed
Direct link
Nietfeld, John L.; Enders, Craig K.; Schraw, Gregory – Educational and Psychological Measurement, 2006
Researchers studying monitoring accuracy currently use two different indexes to estimate accuracy: relative accuracy and absolute accuracy. The authors compared the distributional properties of two measures of monitoring accuracy using Monte Carlo procedures that fit within these categories. They manipulated the accuracy of judgments (i.e., chance…
Descriptors: Monte Carlo Methods, Test Items, Computation, Metacognition
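The two index families the abstract names can be illustrated on made-up data: relative accuracy is typically a confidence-performance association (a Pearson or gamma correlation), while absolute accuracy measures the deviation between confidence and performance. The sketch below uses Pearson correlation and mean squared deviation as one concrete pair of choices; these are common operationalizations, not necessarily the exact measures the authors compared.

```python
import math

def relative_accuracy(conf, correct):
    """Pearson correlation between confidence judgments and item correctness.
    (Goodman-Kruskal gamma is another common choice for relative accuracy.)"""
    n = len(conf)
    mc = sum(conf) / n
    mp = sum(correct) / n
    cov = sum((c - mc) * (p - mp) for c, p in zip(conf, correct))
    vc = sum((c - mc) ** 2 for c in conf)
    vp = sum((p - mp) ** 2 for p in correct)
    return cov / math.sqrt(vc * vp)

def absolute_accuracy(conf, correct):
    """Mean squared deviation between confidence and performance;
    0 indicates perfect calibration."""
    return sum((c - p) ** 2 for c, p in zip(conf, correct)) / len(conf)

conf = [0.9, 0.8, 0.6, 0.4, 0.2]  # hypothetical confidence judgments
correct = [1, 1, 1, 0, 0]         # item-level performance (0/1)
```

The two indexes can disagree: a judge whose confidence tracks performance well (high relative accuracy) may still be systematically over- or under-confident (poor absolute accuracy), which is why their distributional properties are studied separately.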