Showing 1 to 15 of 283 results
Peer reviewed
Tenko Raykov; George Marcoulides; Randall Schumacker – Measurement: Interdisciplinary Research and Perspectives, 2024
An application of Bayesian factor analysis for evaluation of scale reliability is discussed, which is developed within the framework of latent variable modeling. The method permits direct point and interval estimation of the reliability coefficient of multiple-component measuring instruments using Bayesian inference. The approach allows also point…
Descriptors: Reliability, Bayesian Statistics, Measurement Techniques, Computer Software
Peer reviewed
Tenko Raykov; George Marcoulides; James Anthony; Natalja Menold – Measurement: Interdisciplinary Research and Perspectives, 2024
A Bayesian statistics-based approach is discussed that can be used for direct evaluation of the popular Cronbach's coefficient alpha as an internal consistency index for multiple-component measuring instruments, as well as for testing its identity to scale reliability. The method represents an application of confirmatory factor analysis within the…
Descriptors: Reliability, Factor Analysis, Bayesian Statistics, Measurement Techniques
Peer reviewed
Hanshu Zhang; Ran Zhou; Cheng-You Cheng; Sheng-Hsu Huang; Ming-Hui Cheng; Cheng-Ta Yang – Cognitive Research: Principles and Implications, 2025
Although it is commonly believed that automation aids human decision-making, conflicting evidence raises questions about whether individuals would gain greater advantages from automation in difficult tasks. Our study examines the combined influence of task difficulty and automation reliability on aided decision-making. We assessed decision…
Descriptors: Task Analysis, Difficulty Level, Decision Making, Automation
Peer reviewed
Full-text PDF available on ERIC
Melek Gülsah Sahin; Yildiz Yildirim – International Journal of Assessment Tools in Education, 2024
This study aims to generalize the reliability of the GAAIS, which is known to perform valid and reliable measurements, is frequently used in the literature, aims to measure one of today's popular topics, and is one of the first examples developed in the field. Within the meta-analytic reliability generalization study, moderator analyses were also…
Descriptors: Generalization, Meta Analysis, Databases, Research Reports
Peer reviewed
Pauline Frizelle; Ana Buckley; Tricia Biancone; Anna Ceroni; Darren Dahly; Paul Fletcher; Dorothy V. M. Bishop; Cristina McKean – Journal of Child Language, 2024
This study reports on the feasibility of using the Test of Complex Syntax-Electronic (TECS-E), as a self-directed app, to measure sentence comprehension in children aged 4 to 5½ years old; how testing apps might be adapted for effective independent use; and agreement levels between face-to-face supported computerized and independent computerized…
Descriptors: Language Processing, Computer Software, Language Tests, Syntax
Peer reviewed
Fumei Liu – Cogent Education, 2024
This paper details how to effectively share three-dimensional geological models using data conversion between two mainstream mining software packages, Micromine and Surpac. It also discusses the impact of this conversion method on geological integrated exploration decision-making guidance. The current situation primarily manifests in the fact that both…
Descriptors: Computer Software, Geology, Models, Decision Making
Peer reviewed
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Denise Swanson; Gerald Tindal – Behavioral Research and Teaching, 2024
This technical report provides an authoritative bibliographic resource of all the studies conducted on "easyCBM"® and published on the main website for Behavioral Research and Teaching under Publications (https://brtprojects.org). The "easyCBM"® software is a direct descendant of "Curriculum-based Measurement" (CBM)…
Descriptors: Bibliographies, Computer Software, Test Construction, Test Reliability
Terra Blevins – ProQuest LLC, 2024
While large language models (LLMs) continue to grow in scale and gain new zero-shot capabilities, their performance for languages beyond English increasingly lags behind. This gap is due to the "curse of multilinguality," where multilingual language models perform worse on individual languages than a monolingual model trained on that…
Descriptors: Multilingualism, Computational Linguistics, Second Languages, Reliability
Peer reviewed
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Peer reviewed
Gregory R. Hancock; Ji An – Measurement: Interdisciplinary Research and Perspectives, 2020
As an alternative to Cronbach's α for estimating scale reliability, McDonald's ω has attracted increased attention within the methodological community for its less stringent measurement assumptions. Notwithstanding, ω is still seldom used by practitioners, likely due to its unavailability in popular software packages (e.g., SPSS)…
Descriptors: Evaluation, Alternative Assessment, Reliability, Test Reliability
Peer reviewed
Jiangang Hao; Alina A. von Davier; Victoria Yaneva; Susan Lottridge; Matthias von Davier; Deborah J. Harris – Educational Measurement: Issues and Practice, 2024
The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting-edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely,…
Descriptors: Evaluation Methods, Artificial Intelligence, Educational Change, Computer Software
Abdulrahman Alshammari – ProQuest LLC, 2024
A critical component of modern software development practices, particularly continuous integration (CI), is the halt of development activities in response to test failures which requires further investigation and debugging. As software changes, regression testing becomes vital to verify that new code does not affect existing functionality.…
Descriptors: Computer Software, Programming, Coding, Test Reliability
Peer reviewed
Phillip K. Wood – Structural Equation Modeling: A Multidisciplinary Journal, 2024
The logistic and confined exponential curves are frequently used in studies of growth and learning. These models, which are nonlinear in their parameters, can be estimated using structural equation modeling software. This paper proposes a single combined model, a weighted combination of both models. Mplus, Proc Calis, and lavaan code for the model…
Descriptors: Structural Equation Models, Computation, Computer Software, Weighted Scores
Peer reviewed
Steven Kim; Stephanie Lara-Sotelo; Eric Martin – Measurement in Physical Education and Exercise Science, 2024
A number of familiarization trials are needed for reliable measurement, particularly for inexperienced subjects. Researchers have studied and developed familiarization protocols that vary by exercise and study population. The pace of familiarization and fatigue may be an individual-level characteristic, so a population-level protocol may not fit…
Descriptors: Familiarity, Physical Education, Fatigue (Biology), Reliability