Publication Date
| Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 0 |
| Since 2007 (last 20 years) | 3 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Interrater Reliability | 31 |
| Mathematical Models | 31 |
| Evaluation Methods | 9 |
| Evaluators | 9 |
| Equations (Mathematics) | 8 |
| Error of Measurement | 8 |
| Correlation | 7 |
| Estimation (Mathematics) | 7 |
| Rating Scales | 7 |
| Test Reliability | 7 |
| Comparative Analysis | 5 |
Author
| Author | Count |
| --- | --- |
| Cason, Carolyn L. | 2 |
| Cason, Gerald J. | 2 |
| Beasley, T. Mark | 1 |
| Chae, Sunhee | 1 |
| Chen, Hsueh-Chih | 1 |
| Chen, Po-Hsi | 1 |
| Deutsch, Stuart Jay | 1 |
| Eiting, Mindert H. | 1 |
| Goffin, Richard D. | 1 |
| Grove, Will | 1 |
| Houston, Walter M. | 1 |
Education Level
| Level | Count |
| --- | --- |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Elementary Secondary Education | 1 |
| Secondary Education | 1 |
Audience
| Audience | Count |
| --- | --- |
| Researchers | 4 |
| Practitioners | 1 |
Location
| Location | Count |
| --- | --- |
| Singapore | 2 |
| South Korea | 2 |
| Asia | 1 |
| Australia | 1 |
| Brazil | 1 |
| Connecticut | 1 |
| Denmark | 1 |
| Egypt | 1 |
| Estonia | 1 |
| Florida | 1 |
| Germany | 1 |
Assessments and Surveys
| Assessment | Count |
| --- | --- |
| NEO Personality Inventory | 1 |
| National Assessment of… | 1 |
Simin, Cai; Lam, Toh Tin – Journal of Science and Mathematics Education in Southeast Asia, 2016
This paper presents the design and development of a rubric for assessing mathematical modelling tasks at the secondary level. The rubric was crafted from the mathematical modelling competencies that the researchers synthesised from four sources, and was fine-tuned following an interview with three…
Descriptors: Scoring Rubrics, Test Construction, Mathematical Models, Secondary School Mathematics
Hung, Su-Pin; Chen, Po-Hsi; Chen, Hsueh-Chih – Creativity Research Journal, 2012
Product assessment is widely applied in creativity studies, typically as an important dependent measure. Within this context, this study had two purposes. First, it focused on methods for investigating possible rater effects, an issue that has received little attention in past creativity studies. Second, the…
Descriptors: Item Response Theory, Creativity, Interrater Reliability, Undergraduate Students
Peer reviewed
Serlin, Ronald C.; Marascuilo, Leonard A. – Journal of Educational Statistics, 1983
Two alternatives to the problems of conducting planned and post hoc comparisons in tests of concordance and discordance for G groups of judges are examined. The two models are illustrated using existing data. (Author/JKS)
Descriptors: Attitude Measures, Comparative Analysis, Interrater Reliability, Mathematical Models
Peer reviewed
Umesh, U. N.; And Others – Educational and Psychological Measurement, 1989
An approach is provided for calculating maximum values of the Kappa statistic of J. Cohen (1960) as a function of observed agreement proportions between evaluators. Separate calculations are required for different matrix sizes and observed agreement levels. (SLD)
Descriptors: Equations (Mathematics), Evaluators, Heuristics, Interrater Reliability
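To make the entry above concrete, the following is a minimal sketch of Cohen's kappa together with one standard construction of its maximum attainable value when the observed marginal proportions are held fixed. This is an illustration of the general idea, not necessarily the exact formulation in Umesh et al. (1989); the function name and layout are the sketch's own.

```python
def kappa_and_max(matrix):
    """matrix[i][j]: count of items rater A placed in category i
    and rater B placed in category j."""
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    p = [[c / n for c in row] for row in matrix]
    row_m = [sum(p[i]) for i in range(k)]                        # rater A marginals
    col_m = [sum(p[i][j] for i in range(k)) for j in range(k)]   # rater B marginals
    p_o = sum(p[i][i] for i in range(k))                         # observed agreement
    p_e = sum(row_m[i] * col_m[i] for i in range(k))             # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)
    # With the marginals fixed, agreement on category i cannot exceed
    # the smaller of the two marginal proportions for i.
    p_max = sum(min(row_m[i], col_m[i]) for i in range(k))
    kappa_max = (p_max - p_e) / (1 - p_e)
    return kappa, kappa_max
```

For the 2x2 table `[[20, 5], [10, 15]]`, observed agreement is 0.7 against a chance level of 0.5, giving kappa = 0.4 while the marginals would permit a kappa of at most 0.8 — which is why kappa is sometimes reported relative to its maximum.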
Peer reviewed
Kvalseth, Tarald O. – Educational and Psychological Measurement, 1991
An asymmetric version of J. Cohen's kappa statistic is presented as an appropriate measure for the agreement between two observers classifying items into nominal categories, when one observer represents the "standard." A numerical example with three categories is provided. (SLD)
Descriptors: Classification, Equations (Mathematics), Interrater Reliability, Mathematical Models
Peer reviewed
Ross, Donald C. – Educational and Psychological Measurement, 1992
Large sample chi-square tests of the significance of the difference between two correlated kappas, weighted or unweighted, are derived. Cases are presented with one judge in common between the two kappas and no judge in common. An illustrative calculation is included. (Author/SLD)
Descriptors: Chi Square, Correlation, Equations (Mathematics), Evaluators
Peer reviewed
Towstopiat, Olga – Contemporary Educational Psychology, 1984
The present article reviews the procedures that have been developed for measuring the reliability of human observers' judgments when making direct observations of behavior. These include the percentage of agreement, Cohen's Kappa, phi, and univariate and multivariate agreement measures that are based on quasi-equiprobability and quasi-independence…
Descriptors: Interrater Reliability, Mathematical Models, Multivariate Analysis, Observation
Peer reviewed
Yeaton, William H.; Wortman, Paul M. – Evaluation Review, 1993
The current practice of reporting a single mean intercoder agreement in meta-analysis leads to systematic bias and overestimated reliability. An alternative is recommended in which average intercoder agreement statistics are calculated within clusters of coded variables. Two studies of intercoder agreement illustrate the model. (SLD)
Descriptors: Coding, Decision Making, Estimation (Mathematics), Interrater Reliability
Zwick, Rebecca – 1986
Most currently used measures of inter-rater agreement for the nominal case incorporate a correction for "chance agreement." The definition of chance agreement is not the same for all coefficients, however. Three chance-corrected coefficients are Cohen's Kappa; Scott's Pi; and the S index of Bennett, Goldstein, and Alpert, which has…
Descriptors: Error of Measurement, Interrater Reliability, Mathematical Models, Measurement Techniques
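The Zwick entry notes that the three coefficients it names share a chance correction but define "chance agreement" differently. As a hedged sketch of that contrast (the function and matrix here are illustrative, not drawn from the paper), all three take the form (p_o - p_e) / (1 - p_e) and differ only in p_e:

```python
def agreement_coefficients(matrix):
    """Return (Cohen's kappa, Scott's pi, Bennett et al.'s S) for a
    square cross-classification of two raters' category assignments."""
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    p = [[c / n for c in row] for row in matrix]
    row_m = [sum(p[i]) for i in range(k)]
    col_m = [sum(p[i][j] for i in range(k)) for j in range(k)]
    p_o = sum(p[i][i] for i in range(k))

    def chance_corrected(p_e):
        return (p_o - p_e) / (1 - p_e)

    # Cohen's kappa: each rater keeps their own marginal distribution.
    kappa = chance_corrected(sum(r * c for r, c in zip(row_m, col_m)))
    # Scott's pi: both raters are assigned the averaged marginal distribution.
    pi = chance_corrected(sum(((r + c) / 2) ** 2 for r, c in zip(row_m, col_m)))
    # Bennett, Goldstein, and Alpert's S: chance is uniform over k categories.
    s = chance_corrected(1 / k)
    return kappa, pi, s
```

On the same data the three can diverge noticeably whenever the raters' marginals are unequal, which is the crux of the comparison the entry describes.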
Wang, Wen-chung – 1997
Traditional approaches to the investigation of the objectivity of ratings for constructed-response items are based on classical test theory, which is item-dependent and sample-dependent. Item response theory overcomes this drawback by decomposing item difficulties into genuine difficulties and rater severity. In so doing, objectivity of ability…
Descriptors: College Entrance Examinations, Constructed Response, Foreign Countries, Interrater Reliability
Uebersax, John; Grove, Will – 1989
Methods of probability modeling to analyze rater agreement are described, emphasizing their basic similarities and viewing them as variants of a common methodology. Statistical techniques for analyzing agreement data are described to address questions such as how many opinions are required to make a medical diagnosis with necessary accuracy. Kappa…
Descriptors: Clinical Diagnosis, Correlation, Estimation (Mathematics), Evaluation Methods
Beasley, T. Mark; Leitner, Dennis W. – 1993
The L statistic of E. B. Page (1963) tests the agreement of a single group of judges with an a priori ordering of alternative treatments. This paper extends the two-group test of D. W. Leitner and C. M. Dayton (1976), itself an extension of the L test, to analyze the difference in consensus between two unequally sized groups of judges. Exact critical values…
Descriptors: Comparative Analysis, Equations (Mathematics), Estimation (Mathematics), Evaluators
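For context on the entry above, Page's L statistic itself is simple to compute: each judge ranks the k treatments, and L weights each treatment's rank sum by its position in the hypothesized ordering. The sketch below shows only this base statistic under that standard definition; the two-group comparison and exact critical values discussed in the paper are beyond it.

```python
def page_L(rankings):
    """rankings: one list per judge giving that judge's ranks 1..k of the
    k treatments, with treatments listed in the hypothesized order.
    Larger L supports the a priori ordering."""
    k = len(rankings[0])
    # Rank sum for each treatment across all judges.
    rank_sums = [sum(judge[j] for judge in rankings) for j in range(k)]
    # Weight each rank sum by the treatment's hypothesized position.
    return sum((j + 1) * rank_sums[j] for j in range(k))
```

When every judge's ranking matches the hypothesized order exactly, L reaches its maximum; under the null of no trend its expectation is n·k·(k+1)²/4 for n judges.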
Weare, Jane; And Others – 1987
This annotated bibliography was developed upon noting a deficiency of information in the literature regarding the training of raters for establishing agreement. The ERIC descriptor, "Interrater Reliability", was used to locate journal articles. Some of the 33 resulting articles focus on mathematical concepts and present formulas for computing…
Descriptors: Annotated Bibliographies, Cloze Procedure, Correlation, Essay Tests
Peer reviewed
van den Bergh, Huub; Eiting, Mindert H. – Journal of Educational Measurement, 1989
A method of assessing rater reliability via a design of overlapping rater teams is presented. Covariances or correlations of ratings can be analyzed with LISREL models. Models in which the rater reliabilities are congeneric, tau-equivalent, or parallel can be tested. Two examples based on essay ratings are presented. (TJH)
Descriptors: Analysis of Covariance, Computer Simulation, Correlation, Elementary Secondary Education
Peer reviewed
Zegers, Frits E. – Applied Psychological Measurement, 1991
The degree of agreement between two raters rating several objects for a single characteristic can be expressed through an association coefficient, such as the Pearson product-moment correlation. How to select an appropriate association coefficient, and the desirable properties and uses of a class of such coefficients--the Euclidean…
Descriptors: Classification, Correlation, Data Interpretation, Equations (Mathematics)