Ahmed Abdel-Al Ibrahim, Khaled; Karimi, Ali Reza; Abdelrasheed, Nasser Said Gomaa; Shatalebi, Vida – Language Testing in Asia, 2023
Dynamic assessment is heavily based on Vygotskian socio-cultural theory, and in recent years researchers have shown interest in the theory as a way to facilitate learning. This study attempted to examine the comparative effect of group dynamic assessment (GDA) and computerized dynamic assessment (CDA) on listening development, L2 learners'…
Descriptors: Evaluation Methods, Computer Assisted Testing, Second Language Learning, Listening Skills
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
Li, Dongmei; Yi, Qing; Harris, Deborah – ACT, Inc., 2017
In preparation for online administration of the ACT® test, ACT conducted studies to examine the comparability of scores between online and paper administrations, including a timing study in fall 2013, a mode comparability study in spring 2014, and a second mode comparability study in spring 2015. This report presents major findings from these…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Comparative Analysis, Test Format
Thompson, Gregory L.; Cox, Troy L.; Knapp, Nieves – Foreign Language Annals, 2016
While studies have been done to rate the validity and reliability of the Oral Proficiency Interview (OPI) and Oral Proficiency Interview-Computer (OPIc) independently, a limited amount of research has analyzed the interexam reliability of these tests, and studies have yet to be conducted comparing the results of Spanish language learners who take…
Descriptors: Comparative Analysis, Oral Language, Language Proficiency, Spanish
Liu, Junhui; Brown, Terran; Chen, Jianshen; Ali, Usama; Hou, Likun; Costanzo, Kate – Partnership for Assessment of Readiness for College and Careers, 2016
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a state-led consortium working to develop next-generation assessments that more accurately, compared to previous assessments, measure student progress toward college and career readiness. The PARCC assessments include both English Language Arts/Literacy (ELA/L) and…
Descriptors: Testing, Achievement Tests, Test Items, Test Bias
Flowers, Claudia; Kim, Do-Hong; Lewis, Preston; Davis, Violeta Carmen – Journal of Special Education Technology, 2011
This study examined the academic performance and preference of students with disabilities for two types of test administration conditions, computer-based testing (CBT) and pencil-and-paper testing (PPT). Data from a large-scale assessment program were used to examine differences between CBT and PPT academic performance for third to eleventh grade…
Descriptors: Testing, Test Items, Effect Size, Computer Assisted Testing
Kingston, Neal M. – Applied Measurement in Education, 2009
There have been many studies of the comparability of computer-administered and paper-administered tests. Not surprisingly (given the variety of measurement and statistical sampling issues that can affect any one study) the results of such studies have not always been consistent. Moreover, the quality of computer-based test administration systems…
Descriptors: Multiple Choice Tests, Computer Assisted Testing, Printed Materials, Effect Size
Wang, Shudong; Jiao, Hong; Young, Michael J.; Brooks, Thomas; Olson, John – Educational and Psychological Measurement, 2008
In recent years, computer-based testing (CBT) has grown in popularity, is increasingly being implemented across the United States, and will likely become the primary mode for delivering tests in the future. Although CBT offers many advantages over traditional paper-and-pencil testing, assessment experts, researchers, practitioners, and users have…
Descriptors: Elementary Secondary Education, Reading Achievement, Computer Assisted Testing, Comparative Analysis
Lee, Yong-Won; Breland, Hunter; Muraki, Eiji – International Journal of Testing, 2005
This study has investigated the comparability of computer-based testing writing prompts in the Test of English as a Foreign Language™ (TOEFL®) for examinees of different native language backgrounds. A total of 81 writing prompts introduced from July 1998 through August 2000 were examined using a 3-step logistic regression procedure for ordinal…
Descriptors: Language Aptitude, Effect Size, Test Bias, English (Second Language)
Bergstrom, Betty A. – 1992
This paper reports on existing studies and uses meta analysis to compare and synthesize the results of 20 studies from 8 research reports comparing the ability measure equivalence of computer adaptive tests (CAT) and conventional paper and pencil tests. Using the research synthesis techniques developed by Hedges and Olkin (1985), it is possible to…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing
Thompson, Bruce; Melancon, Janet G. – 1990
Effect sizes have been increasingly emphasized in research as more researchers have recognized that: (1) all parametric analyses (t-tests, analyses of variance, etc.) are correlational; (2) effect sizes have played an important role in meta-analytic work; and (3) statistical significance testing is limited in its capacity to inform scientific…
Descriptors: Comparative Analysis, Computer Assisted Testing, Correlation, Effect Size
Nietfeld, John L.; Enders, Craig K.; Schraw, Gregory – Educational and Psychological Measurement, 2006
Researchers studying monitoring accuracy currently use two different indexes to estimate accuracy: relative accuracy and absolute accuracy. The authors compared the distributional properties of two measures of monitoring accuracy using Monte Carlo procedures that fit within these categories. They manipulated the accuracy of judgments (i.e., chance…
Descriptors: Monte Carlo Methods, Test Items, Computation, Metacognition
Puhan, Gautam; Boughton, Keith A.; Kim, Sooyeon – ETS Research Report Series, 2005
The study evaluated the comparability of two versions of a teacher certification test: a paper-and-pencil test (PPT) and computer-based test (CBT). Standardized mean difference (SMD) and differential item functioning (DIF) analyses were used as measures of comparability at the test and item levels, respectively. Results indicated that effect sizes…
Descriptors: Comparative Analysis, Test Items, Statistical Analysis, Teacher Certification
Lee, Yong-Won; Breland, Hunter; Muraki, Eiji – ETS Research Report Series, 2004
This study has investigated the comparability of computer-based testing (CBT) writing prompts in the Test of English as a Foreign Language™ (TOEFL®) for examinees of different native language backgrounds. A total of 81 writing prompts introduced from July 1998 through August 2000 were examined using a three-step logistic regression procedure for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing
