ERIC - Search Results

Showing all 15 results

Kappa and Rater Accuracy: Paradigms and Parameters

Peer reviewed

Conger, Anthony J. – Educational and Psychological Measurement, 2017

Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…

Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis
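The marginal-frequency effect mentioned in this abstract can be illustrated with a small sketch. This is not code from the article: the function name `cohens_kappa` and the example ratings are hypothetical, and the calculation is the standard two-rater Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement), where chance agreement is computed from each rater's category marginals.

```python
import math
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Two-rater Cohen's kappa for equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters assign the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of each rater's marginal counts.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: both pairs agree on 8 of 10 items.
balanced_a = ["Y"] * 5 + ["N"] * 5
balanced_b = ["Y"] * 4 + ["N"] * 5 + ["Y"]
skewed_a = ["Y"] * 9 + ["N"]
skewed_b = ["Y"] * 8 + ["N", "Y"]

kappa_balanced = cohens_kappa(balanced_a, balanced_b)
kappa_skewed = cohens_kappa(skewed_a, skewed_b)
```

With these made-up ratings, observed agreement is 0.8 in both cases, yet kappa is about 0.6 for the balanced marginals and slightly negative for the skewed ones, because chance agreement rises from 0.50 to 0.82.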

Appraising the Scoring Performance of Automated Essay Scoring Systems--Some Additional Considerations: Which Essays? Which Human Raters? Which Scores?

Peer reviewed

Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018

The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…

Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators

Interrater Agreement Evaluation: A Latent Variable Modeling Approach

Peer reviewed

Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013

A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…

Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation

Using Student Writing and Lexical Analysis to Reveal Student Thinking about the Role of Stop Codons in the Central Dogma

Peer reviewed

Prevost, Luanna B.; Smith, Michelle K.; Knight, Jennifer K. – CBE - Life Sciences Education, 2016

Previous work has shown that students have persistent difficulties in understanding how central dogma processes can be affected by a stop codon mutation. To explore these difficulties, we modified two multiple-choice questions from the Genetics Concept Assessment into three open-ended questions that asked students to write about how a stop codon…

Descriptors: Science Instruction, Genetics, Scientific Concepts, Scoring

Crossed Random-Effect Modeling: Examining the Effects of Teacher Experience and Rubric Use in Performance Assessments

Peer reviewed

Kan, Adnan; Bulut, Okan – Eurasian Journal of Educational Research, 2014

Problem Statement: Performance assessments have emerged as an alternative method to measure what a student knows and can do. One of the shortcomings of performance assessments is the subjectivity and inconsistency of raters in scoring. A common criticism of performance assessments is the subjective nature of scoring procedures. The effectiveness…

Descriptors: Performance Based Assessment, Scoring Rubrics, Models, Experienced Teachers

Enhancing Cognitive Presence in Online Case Discussions with Questions Based on the Practical Inquiry Model

Peer reviewed

Sadaf, Ayesha; Olesova, Larisa – American Journal of Distance Education, 2017

The researchers in this study examined the influence of questions designed with the Practical Inquiry Model (PIM), compared with the regular (playground) questions, on students' levels of cognitive presence in online discussions. Students' discussion postings were collected and categorized according to the four levels of cognitive presence:…

Descriptors: Graduate Students, Masters Programs, Cognitive Processes, Web Based Instruction

Confirmation of Models for Interpretation and Use of the Social and Academic Behavior Risk Screener (SABRS)

Peer reviewed

Kilgus, Stephen P.; Sims, Wesley A.; von der Embse, Nathaniel P.; Riley-Tillman, T. Chris – School Psychology Quarterly, 2015

The purpose of this investigation was to evaluate the models for interpretation and use that serve as the foundation of an interpretation/use argument for the Social and Academic Behavior Risk Screener (SABRS). The SABRS was completed by 34 teachers with regard to 488 students in a Midwestern high school during the winter portion of the academic…

Descriptors: Screening Tests, Factor Analysis, Models, High School Students

SLA Developmental Stages and Teachers' Assessment of Written French: Exploring Direkt Profil as a Diagnostic Assessment Tool

Peer reviewed

Granfeldt, Jonas; Ågren, Malin – Language Testing, 2014

One core area of research in Second Language Acquisition is the identification and definition of developmental stages in different L2s. For L2 French, Bartning and Schlyter (2004) presented a model of six morphosyntactic stages of development in the shape of grammatical profiles. The model formed the basis for the computer program Direkt Profil…

Descriptors: Second Language Learning, Language Tests, French, Language Teachers

Toward a CoI Population Parameter: The Impact of Unit (Sentence vs. Message) on the Results of Quantitative Content Analysis

Peer reviewed

Gorsky, Paul; Caspi, Avner; Blau, Ina; Vine, Yodfat; Billet, Amit – International Review of Research in Open and Distance Learning, 2012

The goal of this study is to further corroborate a hypothesized population parameter for the frequencies of social presence versus the sum of teaching presence and cognitive presence as defined by the community of inquiry model in higher education asynchronous course forums. This parameter has been found across five variables: academic institution…

Descriptors: Foreign Countries, Open Universities, Inquiry, Communities of Practice

The Use of Response to Intervention Practices for Behavior: An Examination of the Validity of a Screening Instrument

Peer reviewed

Muyskens, Paul; Marston, Doug; Reschly, Amy L. – California School Psychologist, 2007

Behavioral difficulties of school-aged students are typically dealt with in a reactive, rather than preventative manner. This article examines a proactive approach, consistent with the Response-to-Intervention model, using a screening measure designed to identify students at risk for behavior difficulties and targeting these students for early…

Descriptors: Early Intervention, At Risk Students, Teacher Attitudes, Academic Achievement

Children's Comprehension and Local-to-Global Recall of Narrative and Expository Texts

Peer reviewed

Romero, Fernando; Paris, Scott G.; Brem, Sarah K. – Current Issues in Education, 2005

We examined underlying mechanisms for comprehension differences across expository and narrative text while controlling for factors confounded in the extant literature. Fourth grade students (n=32) read both an expository and a narrative text, and completed both a local comprehension assessment, and a global retelling assessment for each text.…

Descriptors: Reading Comprehension, Grade 4, Psycholinguistics, Models

A Latent-Variable Modeling Approach to Assessing Interrater Reliability, Topic Generalizability, and Validity of a Content Assessment Scoring Rubric.

Peer reviewed

Abedi, Jamal; Baker, Eva L. – Educational and Psychological Measurement, 1995

Results from a performance assessment in which 68 high school students wrote essays support the use of latent variable modeling for estimating reliability, concurrent validity, and generalizability of a scoring rubric. The latent variable modeling approach overcomes the limitations of certain conventional statistical techniques in handling…

Descriptors: Criteria, Essays, Estimation (Mathematics), Generalizability Theory

Essay Topic Difficulty in Relation to Scoring Models.

Dovell, Patricia; Buhr, Dianne C. – 1986

This study examined the difficulty level of essay topics used in the large-scale assessment of writing in relation to five different scoring models, and sought to determine what effects the scoring models would have on passing rates. In model one, examinee's score is the direct result of a score assigned by the reader or the sum of scores assigned…

Descriptors: College Students, Difficulty Level, Essay Tests, Essays

Improving Children's Writing: A Model for Parent Participation

Peer reviewed

Guastello, E. Francine; Lenz, Claire – Language and Literacy Spectrum, 2004

This study examined the effects of parental training on students' writing scores. Six classes of fourth grade students from three schools were randomly assigned to three experimental and three control groups. Parents of the students in the experimental group attended training sessions and received instruction in the stages of the writing process…

Descriptors: Writing Improvement, Parent Participation, Experimental Groups, Writing Processes

Forecasting Device Effectiveness: Volume III. Analytic Assessment of Device Effectiveness Forecasting Technique. Final Report.

Rose, Andrew M.; And Others – 1985

This third of three volumes reports on analytic procedures conducted to address various aspects of the scalar properties of the Device Effectiveness Forecasting Technique (DEFT). DEFT, a series of microcomputer programs applied to data gathered from rating scales, is used to evaluate simulator devices used in U.S. Army weapons training. The…

Descriptors: Adults, Computer Oriented Programs, Computer Simulation, Data Interpretation
