Showing all 12 results
Peer reviewed
Direct link
Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2018
The maximal reliability of a congeneric measure is achieved by weighting item scores to form the optimal linear combination as the total score; it is never lower than the composite reliability of the measure when measurement errors are uncorrelated. The statistical method that renders maximal reliability would also lead to maximal criterion…
Descriptors: Test Reliability, Test Validity, Comparative Analysis, Attitude Measures
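The contrast the abstract draws can be illustrated with a small sketch. For a congeneric measure with factor loadings λ_i and uncorrelated error variances θ_i, composite (omega) reliability weights items equally, while maximal reliability weights them proportionally to λ_i/θ_i; the numbers below are illustrative, not taken from the article.

```python
# Sketch of composite vs. maximal reliability for a congeneric measure.
# Loadings (lam) and error variances (theta) are hypothetical values.

def composite_reliability(lam, theta):
    """McDonald's omega: reliability of the unit-weighted total score."""
    s = sum(lam)
    return s * s / (s * s + sum(theta))

def maximal_reliability(lam, theta):
    """Reliability of the optimally weighted total (weights w_i = lam_i / theta_i)."""
    a = sum(l * l / t for l, t in zip(lam, theta))
    return a / (1.0 + a)

lam = [0.8, 0.7, 0.6, 0.5]        # hypothetical factor loadings
theta = [0.36, 0.51, 0.64, 0.75]  # hypothetical error variances

omega = composite_reliability(lam, theta)
rho_max = maximal_reliability(lam, theta)
# With uncorrelated errors, maximal reliability never falls below omega:
assert rho_max >= omega
```

The inequality in the final assertion is the property the abstract states: optimal weighting can only match or exceed the reliability of the unit-weighted composite.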
Peer reviewed
PDF on ERIC Download full text
Chongo, Samri; Osman, Kamisah; Nayan, Nazrul Anuar – EURASIA Journal of Mathematics, Science and Technology Education, 2021
Computational thinking (CT) is a systematic approach to problem solving and is widely accepted as an important skill in the 21st century. This study aimed to identify the effectiveness of the Chemistry Computational Thinking (CT-CHEM) Module on achievement in chemistry. The study employed a quasi-experimental design with the participation…
Descriptors: Chemistry, Science Instruction, Thinking Skills, Achievement Tests
Peer reviewed
Direct link
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang – Journal of Educational Measurement, 2015
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Descriptors: Classification, Reliability, Accuracy, Cognitive Tests
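The pattern-level indices the abstract refers to can be sketched from examinee posterior probabilities over attribute patterns: consistency as the expected agreement of classifications across two parallel administrations, accuracy as the expected probability that the modal classification is correct. This is a simplified illustration in the spirit of the Cui, Gierl, and Chang indices, not their exact formulation; all data are made up.

```python
# Hedged sketch of pattern-level classification consistency and accuracy
# in cognitive diagnostic assessment, from posterior probabilities.

def pattern_consistency(posteriors):
    # Mean over examinees of the sum of squared posterior probabilities:
    # the chance of receiving the same classification twice.
    return sum(sum(p * p for p in post) for post in posteriors) / len(posteriors)

def pattern_accuracy(posteriors):
    # Mean over examinees of the largest posterior probability:
    # the chance the modal classification is the true one.
    return sum(max(post) for post in posteriors) / len(posteriors)

# Each row: one examinee's posterior over four attribute patterns.
posteriors = [
    [0.70, 0.20, 0.05, 0.05],
    [0.10, 0.80, 0.05, 0.05],
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain examinee
]
```

Note how the uniform third row drags both indices down: uncertain posteriors mean inconsistent and inaccurate classifications.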
Peer reviewed
Direct link
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, a persistent problem in peer assessment is that reliability depends on rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
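One common way to incorporate a rater parameter, as the abstract describes, is a Rasch-style model with an additive rater-severity term. The sketch below is a generic illustration of that idea, not the specific models evaluated in the article; parameter names and values are our own.

```python
import math

# Illustrative Rasch-style model with a rater-severity parameter:
# logit P(positive) = theta - item_difficulty - rater_severity.

def p_positive(theta, item_difficulty, rater_severity):
    """Probability that a rater scores the response positively."""
    logit = theta - item_difficulty - rater_severity
    return 1.0 / (1.0 + math.exp(-logit))

# A lenient rater (severity -0.5) vs. a harsh rater (severity +0.5)
# rating the same examinee (theta = 0.0) on the same item:
lenient = p_positive(0.0, 0.0, -0.5)
harsh = p_positive(0.0, 0.0, 0.5)
assert lenient > harsh  # severity shifts the whole scale downward
```

Estimating the severity terms alongside the usual ability and difficulty parameters is what lets such models adjust peer-assessment scores for who did the rating.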
National Centre for Vocational Education Research (NCVER), 2016
This work asks one simple question: "how reliable is the method used by the National Centre for Vocational Education Research (NCVER) to estimate projected rates of VET program completion?" In other words, how well do early projections align with actual completion rates some years later? Completion rates are simple to calculate with a…
Descriptors: Vocational Education, Graduation Rate, Predictive Measurement, Predictive Validity
Peer reviewed
Direct link
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Peer reviewed
PDF on ERIC Download full text
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Peer reviewed
Direct link
Kim, Seonghoon; Feldt, Leonard S. – Asia Pacific Education Review, 2010
The primary purpose of this study is to investigate the mathematical characteristics of the test reliability coefficient rho[subscript XX'] as a function of item response theory (IRT) parameters and present the lower and upper bounds of the coefficient. Another purpose is to examine relative performances of the IRT reliability statistics and two…
Descriptors: Testing, Test Reliability, Statistics, Item Response Theory
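One common way to express test reliability as a function of IRT parameters, in the spirit of what the abstract describes, is a marginal coefficient ρ = Var(θ) / (Var(θ) + E[1/I(θ)]), averaging the error variance 1/I(θ) over the ability distribution. The sketch below assumes a 2PL model and θ ~ N(0, 1); it is one standard formulation, not necessarily the exact coefficient studied by Kim and Feldt, and the item parameters are illustrative.

```python
import math

def info_2pl(theta, a, b):
    """Test information for a 2PL model at ability theta."""
    total = 0.0
    for ai, bi in zip(a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))
        total += ai * ai * p * (1.0 - p)
    return total

def marginal_reliability(a, b, lo=-4.0, hi=4.0, n=801):
    """rho = 1 / (1 + E[1/I(theta)]), with E[] over N(0,1) on a grid."""
    step = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        t = lo + i * step
        w = math.exp(-0.5 * t * t)  # unnormalized N(0,1) density
        num += w / info_2pl(t, a, b)
        den += w
    mean_error_var = num / den
    return 1.0 / (1.0 + mean_error_var)  # Var(theta) = 1

a = [1.2, 0.8, 1.5, 1.0, 0.9]    # hypothetical discriminations
b = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical difficulties
rho = marginal_reliability(a, b)
assert 0.0 < rho < 1.0
```

The coefficient is bounded in (0, 1) and increases with test information, which is the kind of behavior the article's bounds characterize analytically.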
Peer reviewed
Direct link
Lincove, Jane Arnold; Osborne, Cynthia; Dillon, Amanda; Mills, Nicholas – Journal of Teacher Education, 2014
Despite questions about validity and reliability, the use of value-added estimation methods has moved beyond academic research into state accountability systems for teachers, schools, and teacher preparation programs (TPPs). Prior studies of value-added measurement for TPPs test the validity of researcher-designed models and find that measuring…
Descriptors: Teacher Education Programs, Accountability, Politics of Education, School Statistics
Peer reviewed
Direct link
Seethaler, Pamela M.; Fuchs, Lynn S. – Exceptional Children, 2010
This study examined the reliability, validity, and predictive utility of kindergarten screening for risk for math difficulty (MD). Three screening measures, administered in September and May of kindergarten to 196 students, assessed number sense and computational fluency. Conceptual and procedural outcomes were measured at end of first grade, with…
Descriptors: Test Validity, Kindergarten, Grade 1, Screening Tests
Peer reviewed
Direct link
Slade, Peter D.; Townes, Brenda D.; Rosenbaum, Gail; Martins, Isabel P.; Luis, Henrique; Bernardo, Mario; Martin, Michael D.; DeRouen, Timothy A. – Psychological Assessment, 2008
When serial neurocognitive assessments are performed, 2 main factors are of importance: test-retest reliability and practice effects. With children, however, there is a third, developmental factor, which occurs as a result of maturation. Child tests recognize this factor through the provision of age-corrected scaled scores. Thus, a ready-made…
Descriptors: Validity, Diagnostic Tests, Test Reliability, Children
Peer reviewed
Direct link
Nietfeld, John L.; Enders, Craig K.; Schraw, Gregory – Educational and Psychological Measurement, 2006
Researchers studying monitoring accuracy currently use two different indexes to estimate accuracy: relative accuracy and absolute accuracy. The authors compared the distributional properties of two measures of monitoring accuracy using Monte Carlo procedures that fit within these categories. They manipulated the accuracy of judgments (i.e., chance…
Descriptors: Monte Carlo Methods, Test Items, Computation, Metacognition
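The two index families the abstract names can be illustrated on made-up data: relative accuracy is typically a confidence-performance association (a Pearson or gamma correlation), while absolute accuracy measures the deviation between confidence and performance. The sketch below uses Pearson correlation and mean squared deviation as one concrete pair of choices; these are common operationalizations, not necessarily the exact measures the authors compared.

```python
import math

def relative_accuracy(conf, correct):
    """Pearson correlation between confidence judgments and item correctness.
    (Goodman-Kruskal gamma is another common choice for relative accuracy.)"""
    n = len(conf)
    mc = sum(conf) / n
    mp = sum(correct) / n
    cov = sum((c - mc) * (p - mp) for c, p in zip(conf, correct))
    vc = sum((c - mc) ** 2 for c in conf)
    vp = sum((p - mp) ** 2 for p in correct)
    return cov / math.sqrt(vc * vp)

def absolute_accuracy(conf, correct):
    """Mean squared deviation between confidence and performance;
    0 indicates perfect calibration."""
    return sum((c - p) ** 2 for c, p in zip(conf, correct)) / len(conf)

conf = [0.9, 0.8, 0.6, 0.4, 0.2]  # hypothetical confidence judgments
correct = [1, 1, 1, 0, 0]         # item-level performance (0/1)
```

The two indexes can disagree: a judge whose confidence tracks performance well (high relative accuracy) may still be systematically over- or under-confident (poor absolute accuracy), which is why their distributional properties are studied separately.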