ERIC - Search Results

Showing all 15 results

On the Pitfalls of Estimating and Using Standardized Reliability Coefficients

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2021

The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability…

Descriptors: Test Reliability, Computation, Measures (Individuals), Research Problems

Multiple-Component Measurement Instruments in Heterogeneous Populations: Is There a Single Coefficient Alpha?

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Harrison, Michael; Menold, Natalja – Educational and Psychological Measurement, 2019

This note confronts the common use of a single coefficient alpha as an index informing about reliability of a multicomponent measurement instrument in a heterogeneous population. Two or more alpha coefficients could instead be meaningfully associated with a given instrument in finite mixture settings, and this may be increasingly more likely the…

Descriptors: Statistical Analysis, Test Reliability, Measures (Individuals), Computation

Measuring the Reliability of Diagnostic Mastery Classifications at Multiple Levels of Reporting

Peer reviewed

Thompson, W. Jake; Clark, Amy K.; Nash, Brooke – Applied Measurement in Education, 2019

As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper is to summarize one simulation-based method for estimating and reporting reliability for an…

Descriptors: Test Reliability, Diagnostic Tests, Classification, Computation

Test Review: Reynolds, C. R., Voress, J. V., Kamphaus, R. W. (2015), "Mathematics Fluency and Calculation Tests (MFaCTs) review." PRO-ED

Peer reviewed

Marbach, Joshua – Journal of Psychoeducational Assessment, 2017

The Mathematics Fluency and Calculation Tests (MFaCTs) are a series of measures designed to assess arithmetic calculation skills and calculation fluency in children ages 6 through 18. There are five main purposes of the MFaCTs: (1) identifying students who are behind in basic math fact automaticity; (2) evaluating possible delays in arithmetic…

Descriptors: Mathematics Tests, Computation, Mathematics Skills, Arithmetic

All Sizzle and No Steak: Value-Added Model Doesn't Add Value in Houston

Amrein-Beardsley, Audrey; Geiger, Tray – Phi Delta Kappan, 2017

Houston's experience with the Educational Value-Added Assessment System® (EVAAS) raises questions that other districts should consider before buying the software and using it for high-stakes decisions. Researchers found that teachers in Houston, all of whom were under the EVAAS gun, but who taught relatively more racial minority students,…

Descriptors: Value Added Models, School Districts, Computer Software, Educational Technology

What If We Took Our Models Seriously? Estimating Latent Scores in Individuals

Peer reviewed

Schneider, W. Joel – Journal of Psychoeducational Assessment, 2013

Researchers often argue that the structural models of the constructs they study are relevant to clinicians. Unfortunately, few clinicians are able to translate the mathematically precise relationships between latent constructs and observed scores into information that can be usefully applied to individuals. Typically this means that when a new…

Descriptors: Factor Analysis, Psychological Studies, Cognitive Ability, Test Reliability

Problems in Estimating Composite Reliability of "Unitised" Assessments

Peer reviewed

Bramley, Tom; Dhawan, Vikas – Research Papers in Education, 2013

This paper discusses the issues involved in calculating indices of composite reliability for "modular" or "unitised" assessments of the kind used in GCSEs, AS and A level examinations in England. The increasingly widespread use of on-screen marking has meant that the item-level data required for calculating indices of…

Descriptors: Foreign Countries, Exit Examinations, Secondary Education, Test Reliability

Bayesian Variance Component Estimation Using the Inverse-Gamma Class of Priors in a Nested Generalizability Design

Arenson, Ethan A. – Online Submission, 2009

One of the problems inherent in variance component estimation centers around inadmissible estimates. Such estimates occur when there is more variability within groups, relative to between groups. This paper suggests a Bayesian approach to resolve inadmissibility by placing noninformative inverse-gamma priors on the variance components, and…

Descriptors: Computation, Bayesian Statistics, Statistical Analysis, Bias

The Politics and Statistics of Value-Added Modeling for Accountability of Teacher Preparation Programs

Peer reviewed

Lincove, Jane Arnold; Osborne, Cynthia; Dillon, Amanda; Mills, Nicholas – Journal of Teacher Education, 2014

Despite questions about validity and reliability, the use of value-added estimation methods has moved beyond academic research into state accountability systems for teachers, schools, and teacher preparation programs (TPPs). Prior studies of value-added measurement for TPPs test the validity of researcher-designed models and find that measuring…

Descriptors: Teacher Education Programs, Accountability, Politics of Education, School Statistics

The Predictive Utility of Kindergarten Screening for Math Difficulty

Peer reviewed

Seethaler, Pamela M.; Fuchs, Lynn S. – Exceptional Children, 2010

This study examined the reliability, validity, and predictive utility of kindergarten screening for risk for math difficulty (MD). Three screening measures, administered in September and May of kindergarten to 196 students, assessed number sense and computational fluency. Conceptual and procedural outcomes were measured at end of first grade, with…

Descriptors: Test Validity, Kindergarten, Grade 1, Screening Tests

A Note on Using Stratified Alpha to Estimate the Composite Reliability of a Test Composed of Interrelated Nonhomogeneous Items

Peer reviewed

Rae, Gordon – Psychological Methods, 2007

The relationship between stratified alpha (α_s) and the reliability of a test composed of interrelated nonhomogeneous items is examined. It is mathematically demonstrated that when there is congeneric equivalence within the strata or subtests, the difference between the coefficients is a function of the variances of the loadings within…

Descriptors: Test Reliability, Test Items, Computation, Error of Measurement

A Critique of Raju and Oshima's Prophecy Formulas for Assessing the Reliability of Item Response Theory-Based Ability Estimates

Peer reviewed

Wang, Wen-Chung – Applied Psychological Measurement, 2008

Raju and Oshima (2005) proposed two prophecy formulas based on item response theory in order to predict the reliability of ability estimates for a test after change in its length. The first prophecy formula is equivalent to the classical Spearman-Brown prophecy formula. The second prophecy formula is misleading because of an underlying false…

Descriptors: Test Reliability, Item Response Theory, Computation, Evaluation Methods

Weights That Maximize Reliability under a Congeneric Model for Performance Assessment.

Wang, Tianyou – 1996

In this paper, formulas for computing the weights that maximize the reliability of a test with multiple parts are derived using a congeneric model. A direct derivation for the three-part test case and a two-step derivation for the n-part case are presented, and results for these two approaches are shown to be consistent for the three-part…

Descriptors: Computation, Equations (Mathematics), Matrices, Performance Based Assessment

On the Validity of Econometric Techniques with Weak Instruments--Inference on Returns to Education Using Compulsory School Attendance Laws

Peer reviewed

Cruz, Luiz M.; Moreira, Marcelo J. – Journal of Human Resources, 2005

The authors evaluate Angrist and Krueger (1991) and Bound, Jaeger, and Baker (1995) by constructing reliable confidence regions around the 2SLS and LIML estimators for returns-to-schooling regardless of the quality of the instruments. The results indicate that the returns-to-schooling were between 8 and 25 percent in 1970 and between 4 and 14…

Descriptors: School Attendance Legislation, Compulsory Education, Measurement Techniques, Computation

Estimating the Reliability of Dichotomous or Trichotomous Scores

Peer reviewed

Feldt, Leonard S. – Educational and Psychological Measurement, 2005

To meet the requirements of the No Child Left Behind Act, school districts and states must compile summary reports of the levels of student achievement in reading and mathematics. The levels are to be described in broad categories: "basic and below," "proficient," or "advanced." Educational units are given considerable latitude in defining the…

Descriptors: Federal Legislation, Academic Achievement, Test Items, Test Validity
