Conger, Anthony J. – Educational and Psychological Measurement, 2017
Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…
Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis
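The dependence of Cohen's kappa on category marginal frequencies, mentioned in the abstract above, can be illustrated with a small sketch. This is not taken from the article: the function name `cohens_kappa` and the example ratings are hypothetical, and the code uses only the standard two-rater definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected from the raters' marginals.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every category either rater used.
    marg_a = Counter(rater_a)
    marg_b = Counter(rater_b)
    categories = marg_a.keys() | marg_b.keys()
    expected = sum(marg_a[c] * marg_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: both pairs agree on 8 of 10 items.
balanced_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
balanced_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
skewed_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
skewed_b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]

print(cohens_kappa(balanced_a, balanced_b))  # about 0.6
print(cohens_kappa(skewed_a, skewed_b))      # about -0.11
```

With evenly split marginals the 80% raw agreement yields a kappa near 0.6, while with 9-to-1 marginals the same raw agreement yields a slightly negative kappa, because chance agreement alone is already 0.82.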
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013
A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…
Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation
Prevost, Luanna B.; Smith, Michelle K.; Knight, Jennifer K. – CBE - Life Sciences Education, 2016
Previous work has shown that students have persistent difficulties in understanding how central dogma processes can be affected by a stop codon mutation. To explore these difficulties, we modified two multiple-choice questions from the Genetics Concept Assessment into three open-ended questions that asked students to write about how a stop codon…
Descriptors: Science Instruction, Genetics, Scientific Concepts, Scoring
Kan, Adnan; Bulut, Okan – Eurasian Journal of Educational Research, 2014
Problem Statement: Performance assessments have emerged as an alternative method to measure what a student knows and can do. One of the shortcomings of performance assessments is the subjectivity and inconsistency of raters in scoring. A common criticism of performance assessments is the subjective nature of scoring procedures. The effectiveness…
Descriptors: Performance Based Assessment, Scoring Rubrics, Models, Experienced Teachers
Sadaf, Ayesha; Olesova, Larisa – American Journal of Distance Education, 2017
The researchers in this study examined the influence of questions designed with the Practical Inquiry Model (PIM), compared with the regular (playground) questions, on students' levels of cognitive presence in online discussions. Students' discussion postings were collected and categorized according to the four levels of cognitive presence:…
Descriptors: Graduate Students, Masters Programs, Cognitive Processes, Web Based Instruction
Kilgus, Stephen P.; Sims, Wesley A.; von der Embse, Nathaniel P.; Riley-Tillman, T. Chris – School Psychology Quarterly, 2015
The purpose of this investigation was to evaluate the models for interpretation and use that serve as the foundation of an interpretation/use argument for the Social and Academic Behavior Risk Screener (SABRS). The SABRS was completed by 34 teachers with regard to 488 students in a Midwestern high school during the winter portion of the academic…
Descriptors: Screening Tests, Factor Analysis, Models, High School Students
Granfeldt, Jonas; Ågren, Malin – Language Testing, 2014
One core area of research in Second Language Acquisition is the identification and definition of developmental stages in different L2s. For L2 French, Bartning and Schlyter (2004) presented a model of six morphosyntactic stages of development in the shape of grammatical profiles. The model formed the basis for the computer program Direkt Profil…
Descriptors: Second Language Learning, Language Tests, French, Language Teachers
Gorsky, Paul; Caspi, Avner; Blau, Ina; Vine, Yodfat; Billet, Amit – International Review of Research in Open and Distance Learning, 2012
The goal of this study is to further corroborate a hypothesized population parameter for the frequencies of social presence versus the sum of teaching presence and cognitive presence as defined by the community of inquiry model in higher education asynchronous course forums. This parameter has been found across five variables: academic institution…
Descriptors: Foreign Countries, Open Universities, Inquiry, Communities of Practice
Muyskens, Paul; Marston, Doug; Reschly, Amy L. – California School Psychologist, 2007
Behavioral difficulties of school-aged students are typically dealt with in a reactive, rather than preventative manner. This article examines a proactive approach, consistent with the Response-to-Intervention model, using a screening measure designed to identify students at risk for behavior difficulties and targeting these students for early…
Descriptors: Early Intervention, At Risk Students, Teacher Attitudes, Academic Achievement
Romero, Fernando; Paris, Scott G.; Brem, Sarah K. – Current Issues in Education, 2005
We examined underlying mechanisms for comprehension differences across expository and narrative text while controlling for factors confounded in the extant literature. Fourth-grade students (n=32) read both an expository and a narrative text and completed both a local comprehension assessment and a global retelling assessment for each text.…
Descriptors: Reading Comprehension, Grade 4, Psycholinguistics, Models
Abedi, Jamal; Baker, Eva L. – Educational and Psychological Measurement, 1995 (peer reviewed)
Results from a performance assessment in which 68 high school students wrote essays support the use of latent variable modeling for estimating reliability, concurrent validity, and generalizability of a scoring rubric. The latent variable modeling approach overcomes the limitations of certain conventional statistical techniques in handling…
Descriptors: Criteria, Essays, Estimation (Mathematics), Generalizability Theory
Dovell, Patricia; Buhr, Dianne C. – 1986
This study examined the difficulty level of essay topics used in the large-scale assessment of writing in relation to five different scoring models, and sought to determine what effects the scoring models would have on passing rates. In model one, the examinee's score is the direct result of a score assigned by the reader or the sum of scores assigned…
Descriptors: College Students, Difficulty Level, Essay Tests, Essays
Guastello, E. Francine; Lenz, Claire – Language and Literacy Spectrum, 2004
This study examined the effects of parental training on students' writing scores. Six classes of fourth grade students from three schools were randomly assigned to three experimental and three control groups. Parents of the students in the experimental group attended training sessions and received instruction in the stages of the writing process…
Descriptors: Writing Improvement, Parent Participation, Experimental Groups, Writing Processes
Rose, Andrew M.; And Others – 1985
This third of three volumes reports on analytic procedures conducted to address various aspects of the scalar properties of the Device Effectiveness Forecasting Technique (DEFT). DEFT, a series of microcomputer programs applied to data gathered from rating scales, is used to evaluate simulator devices used in U.S. Army weapons training. The…
Descriptors: Adults, Computer Oriented Programs, Computer Simulation, Data Interpretation

