Showing 1 to 15 of 17 results
Peer reviewed
Direct link
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two sentence embedding models were evaluated within the AES system: multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
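Not part of the ERIC record above: the sketch below illustrates, in general terms, the kind of pipeline the Firoozi, Mohammadi, and Gierl abstract describes, i.e., multilingual sentence embeddings (LaBSE via the sentence-transformers library) feeding a simple regression scorer. The essays, scores, and Ridge model are hypothetical placeholders, not the study's data or architecture.

```python
# Illustrative sketch only; not the study's system.
# Assumes the sentence-transformers and scikit-learn packages are installed.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge

# LaBSE maps text in many languages into one shared vector space,
# which is what makes a single multilingual scorer plausible.
encoder = SentenceTransformer("sentence-transformers/LaBSE")

# Hypothetical training essays (German, Italian, Czech) with human scores.
train_essays = [
    "Ein kurzer Aufsatz über das Thema ...",
    "Un breve saggio sull'argomento ...",
    "Krátká esej na dané téma ...",
]
train_scores = [3.0, 4.0, 2.0]

X_train = encoder.encode(train_essays)              # shape: (n_essays, 768)
scorer = Ridge(alpha=1.0).fit(X_train, train_scores)

# Score a new essay written in any of the covered languages.
print(scorer.predict(encoder.encode(["Ein neuer Aufsatz ..."])))
```

The mBERT comparison mentioned in the abstract would swap in a different encoder checkpoint; the exact model name depends on the implementation.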
Peer reviewed
PDF on ERIC: Download full text
Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025
Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…
Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading
Peer reviewed
PDF on ERIC: Download full text
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI) that used automatically marked free-response questions. Data were collected over three academic years from 611 participants taking physics classes at the high school and university levels. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online-administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online-administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Peer reviewed
Direct link
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology that offers advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students in total used, respectively, paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Guskey, Thomas R.; Jung, Lee Ann – Educational Leadership, 2016
Many educators consider grades calculated from statistical algorithms more accurate, objective, and reliable than grades they calculate themselves. But in this research, the authors first asked teachers to use their professional judgment to choose a summary grade for hypothetical students. When the researchers compared the teachers' grade with the…
Descriptors: Grading, Computer Assisted Testing, Interrater Reliability, Grades (Scholastic)
Peer reviewed
Direct link
Razi, Salim – SAGE Open, 2015
Similarity reports from plagiarism detectors should be approached with caution, as they may not be sufficient to support allegations of plagiarism. This study developed a 50-item rubric to simplify and standardize the evaluation of academic papers. In the spring semester of the 2011-2012 academic year, 161 freshmen's papers at the English Language Teaching…
Descriptors: Foreign Countries, Scoring Rubrics, Writing Evaluation, Writing (Composition)
Smarter Balanced Assessment Consortium, 2016
The goal of this study was to gather comprehensive evidence about the alignment of the Smarter Balanced summative assessments to the Common Core State Standards (CCSS). Alignment of the Smarter Balanced summative assessments to the CCSS is a critical piece of evidence regarding the validity of the inferences that students, teachers, and policy makers can…
Descriptors: Alignment (Education), Summative Evaluation, Common Core State Standards, Test Content
Peer reviewed
Direct link
Boström, Petra; Johnels, Jakob Åsberg; Thorson, Maria; Broberg, Malin – Journal of Mental Health Research in Intellectual Disabilities, 2016
Few studies have explored the subjective mental health of adolescents with intellectual disabilities, while proxy ratings indicate an overrepresentation of mental health problems. The present study reports on the design and an initial empirical evaluation of the Well-being in Special Education Questionnaire (WellSEQ). Questions, response scales,…
Descriptors: Mental Health, Peer Relationship, Family Environment, Educational Environment
Haberman, Shelby J. – Educational Testing Service, 2011
Alternative approaches are discussed for using e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternate criteria. In the first approach, the predicted variable is the expected rater score of the examinee's two essays. In the second approach, the predicted variable is the expected rater score of two essay responses by the…
Descriptors: Writing Tests, Scoring, Essays, Language Tests
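Not part of the Haberman record: a toy sketch of the first criterion the abstract names, i.e., regressing an examinee's expected rater score (approximated here as the mean human rating over the examinee's two essays) on automated essay features. The features and ratings below are synthetic placeholders, not e-rater features or TOEFL data.

```python
# Illustrative sketch only; synthetic data throughout.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_examinees = 200

# Hypothetical automated features per examinee (e.g., length, error rates),
# averaged over the examinee's two essays.
X = rng.normal(size=(n_examinees, 3))

# Two essays per examinee, each rated by two humans on a 1-5 scale.
ratings = rng.integers(1, 6, size=(n_examinees, 2, 2)).astype(float)

# Criterion 1 in the abstract: the expected rater score of the examinee's
# two essays, approximated here by the mean over essays and raters.
y = ratings.mean(axis=(1, 2))

model = LinearRegression().fit(X, y)
print(model.predict(X[:1]))  # predicted expected rater score, first examinee
```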
Aydin, Selami – Online Submission, 2006
This research aimed to investigate the effect of computers on the test reliability and inter-rater reliability of writing test scores of ESL learners. Writing samples from 20 pen-and-paper and 20 computer group students were scored by two scorers using an analytic scoring method, and the scores were then analyzed with Cronbach's alpha. The results showed that the…
Descriptors: Writing Tests, Interrater Reliability, Test Reliability, English (Second Language)
Peer reviewed
PDF on ERIC: Download full text
Aydin, Selami – Turkish Online Journal of Educational Technology - TOJET, 2006
This research aimed to investigate the effect of computers on the test reliability and inter-rater reliability of writing test scores of ESL learners. Writing samples from 20 pen-and-paper and 20 computer group students were scored by two scorers using an analytic scoring method, and the scores were then analyzed with Cronbach's alpha. The results showed that the…
Descriptors: Foreign Countries, College Students, Computer Assisted Testing, English (Second Language)
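Not part of either Aydin record: a minimal sketch of the reliability statistic both abstracts name, Cronbach's alpha, computed here from two raters' analytic scores. The rater data are invented for illustration.

```python
# Illustrative sketch only; invented scores, not the study's data.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                           # number of raters (or items)
    item_var = scores.var(axis=0, ddof=1)         # per-rater variance
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of summed scores
    return (k / (k - 1)) * (1 - item_var.sum() / total_var)

# 20 essays, each scored 1-5 by two raters (hypothetical numbers).
rng = np.random.default_rng(0)
rater_a = rng.integers(1, 6, size=20)
rater_b = np.clip(rater_a + rng.integers(-1, 2, size=20), 1, 5)

print(round(cronbach_alpha(np.column_stack([rater_a, rater_b])), 3))
```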
Bobek, Becky L.; Gore, Paul A. – American College Testing (ACT), Inc., 2004
This research report describes changes made to the Inventory of Work-Relevant Values when it was revised for online use as a part of the Internet version of DISCOVER. Users will see the following differences between the online and CD-ROM versions of the inventory: 22 items rather than 61, simplified presentation, and the contribution of all items…
Descriptors: Interrater Reliability, Field Tests, Internet, Test Construction
Cason, Gerald J.; And Others – 1987
The Objective Test Scoring and Performance Rating (OTS-PR) system is a fully integrated set of 70 modular FORTRAN programs run on a VAX-8530 computer. Even with no knowledge of computers, the user can implement OTS-PR to score multiple-choice tests, include scores from external sources such as hand-scored essays or scores from nationally…
Descriptors: Clinical Experience, Computer Assisted Testing, Educational Assessment, Essay Tests