ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	10
Since 2017 (last 10 years)	13
Since 2007 (last 20 years)	22

Descriptor

Classification	33
Test Format	33
Test Items	33
Test Construction	11
Difficulty Level	10
Multiple Choice Tests	8
Foreign Countries	7
Item Analysis	7
Item Response Theory	7
Accuracy	6
Computer Assisted Testing	6
Language Tests	6
Scoring	6
English (Second Language)	5
Comparative Analysis	4
Correlation	4
Second Language Learning	4
Taxonomy	4
Test Content	4
Undergraduate Students	4
Cues	3
Diagnostic Tests	3
Introductory Courses	3
Mathematics Tests	3
Measurement	3

Publication Type

Journal Articles	22
Reports - Research	17
Reports - Descriptive	6
Dissertations/Theses -…	5
Reports - Evaluative	5
Speeches/Meeting Papers	3
Information Analyses	2
Opinion Papers	2
Non-Print Media	1
Reference Materials - General	1

Education Level

Higher Education	10
Postsecondary Education	9
Secondary Education	5
Elementary Education	1
Elementary Secondary Education	1
Grade 7	1
Grade 9	1
High Schools	1
Preschool Education	1

Location

Greece	1
Indonesia	1
Japan	1
Japan (Tokyo)	1
Massachusetts	1
Minnesota	1
New Jersey	1
Ohio	1
Russia	1
Turkey	1
Turkey (Ankara)	1
Venezuela	1

Assessments and Surveys

ACT Assessment	1
Program for International…	1
SAT (College Admission Test)	1
Test of English as a Foreign…	1

Showing 1 to 15 of 33 results

The Impact of Scoring Later on Mixed Format Adaptive Testing

Jing Ma – ProQuest LLC, 2024

This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, and numbers and locations of polytomous items. Results showed that while…

Descriptors: Scoring, Adaptive Testing, Test Items, Classification

Examining the Comparative Measurement Value of Technology-Enhanced Items

Sebastian Moncaleano – ProQuest LLC, 2021

The growth of computer-based testing over the last two decades has motivated the creation of innovative item formats. It is often argued that technology-enhanced items (TEIs) provide better measurement of test-takers' knowledge, skills, and abilities by increasing the authenticity of tasks presented to test-takers (Sireci & Zenisky, 2006).…

Descriptors: Computer Assisted Testing, Test Format, Test Items, Classification

Polytomous Testlet Response Models for Technology-Enhanced Innovative Items: Implications on Model Fit and Trait Inference

Peer reviewed

Kang, Hyeon-Ah; Han, Suhwa; Kim, Doyoung; Kao, Shu-Chuan – Educational and Psychological Measurement, 2022

The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c)…

Descriptors: Goodness of Fit, Item Response Theory, Test Items, Scoring

Analyzing the Cognitive Complexity of the Questions Contained on Assessments of College and Career Readiness for Grades 6-12

Cronin, Sean D. – ProQuest LLC, 2023

This convergent, parallel, mixed-methods study with qualitative and quantitative content analysis methods was conducted to identify what type of thinking is required by the College and Career Readiness Assessment (CCRA+) by (a) determining the frequency and percentage of questions categorized as higher-level thinking within each cell of Hess'…

Descriptors: Cues, College Readiness, Career Readiness, Test Items

Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests

Peer reviewed

Huang, Hung-Yu – Educational and Psychological Measurement, 2023

The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…

Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making

Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis

Peer reviewed

Wang, Yu; Chiu, Chia-Yi; Köhn, Hans Friedrich – Journal of Educational and Behavioral Statistics, 2023

The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…

Descriptors: Multiple Choice Tests, Nonparametric Statistics, Test Format, Educational Assessment

Exploring Confidence Accuracy and Item Difficulty in Changing Multiple-Choice Answers of Scientific Reasoning Test

Peer reviewed
PDF on ERIC

Fadillah, Sarah Meilani; Ha, Minsu; Nuraeni, Eni; Indriyanti, Nurma Yunita – Malaysian Journal of Learning and Instruction, 2023

Purpose: Researchers discovered that when students were given the opportunity to change their answers, a majority changed their responses from incorrect to correct, and this change often increased the overall test results. What prompts students to modify their answers? This study aims to examine the modification of scientific reasoning test, with…

Descriptors: Science Tests, Multiple Choice Tests, Test Items, Decision Making

Examination of the Questions in the Primary School Turkish Worksheets in Terms of Various Classification Systems

Peer reviewed
PDF on ERIC

Delican, Burak – International Journal of Curriculum and Instruction, 2022

In this research, the questions in the Turkish Course (2,3,4) Worksheets were examined in terms of various classification systems. In this direction, the questions in the worksheets were evaluated with the document-material analysis technique in accordance with the structure of the qualitative research. During the research process, Turkish Course…

Descriptors: Worksheets, Elementary School Students, Turkish, Classification

An Exploratory Criterion Validation of Three Meaning-Recall Vocabulary Test Item Formats

Peer reviewed
PDF on ERIC

Tim Stoeckel; Tomoko Ishii – Vocabulary Learning and Instruction, 2024

In an upcoming coverage-comprehension study, we plan to assess learners' meaning-recall knowledge of words as they occur in the study's reading passage. As several meaning-recall test formats exist, the purpose of this small-scale study (N = 10) was to determine which of three formats was most similar to a criterion interview regarding mean score…

Descriptors: Vocabulary Development, Language Tests, Second Language Learning, Classification

Evaluating the Effectiveness of the Expectation-Maximization (EM) Algorithm for Bayesian Network Calibration

Tingir, Seyfullah – ProQuest LLC, 2019

Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…

Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability

The Influence of Passage Cohesion on Cloze Test Item Difficulty

Peer reviewed
PDF on ERIC

Jonathan Trace – Language Teaching Research Quarterly, 2023

The role of context in cloze tests has long been seen as both a benefit as well as a complication in their usefulness as a measure of second language comprehension (Brown, 2013). Passage cohesion, in particular, would seem to have a relevant and important effect on the degree to which cloze items function and the interpretability of performances…

Descriptors: Language Tests, Cloze Procedure, Connected Discourse, Test Items

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification

Effect of Adjusting Pseudo-Guessing Parameter Estimates on Test Scaling When Item Parameter Drift Is Present

Peer reviewed
PDF on ERIC

Han, Kyung T.; Wells, Craig S.; Hambleton, Ronald K. – Practical Assessment, Research & Evaluation, 2015

In item response theory test scaling/equating with the three-parameter model, the scaling coefficients A and B have no impact on the c-parameter estimates of the test items since the c-parameter estimates are not adjusted in the scaling/equating procedure. The main research question in this study concerned how serious the consequences would be if…

Descriptors: Item Response Theory, Monte Carlo Methods, Scaling, Test Items

Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

Peer reviewed

Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013

The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items

Content-Rich versus Content Deficient Video-Based Visuals in L2 Academic Listening Tests: Pilot Study

Peer reviewed

Lesnov, Roman Olegovich – International Journal of Computer-Assisted Language Learning and Teaching, 2018

This article compares second language test-takers' performance on an academic listening test in an audio-only mode versus an audio-video mode. A new method of classifying video-based visuals was developed and piloted, which used L2 expert opinions to place the video on a continuum from being content-deficient (not helpful for answering…

Descriptors: Second Language Learning, Second Language Instruction, Video Technology, Classification


ProQuest LLC	5
Applied Measurement in…	2
Educational and Psychological…	2
Performance and Instruction	2
Astronomy Education Review	1
CBE - Life Sciences Education	1
Cognitive Development	1
College Board	1
E-Journal of Instructional…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Educational and…	1
Journal of Technology,…	1
Language Teaching Research…	1
Language Testing	1
Malaysian Journal of Learning…	1
Practical Assessment,…	1
Statistics Education Research…	1
Vocabulary Learning and…	1

Downing, Steven M.	2
Haladyna, Thomas M.	2
Abshire, Elizabeth	1
Anagnostopoulou, Kyriaki	1
Blankenbiller, Margaret	1
Boyd, Joseph L.	1
Brownell, Sara E.	1
Chiu, Chia-Yi	1
Christidou, Vasilia	1
Cronin, Sean D.	1
Deak, Gedeon O.	1
Delican, Burak	1
Dimopoulos, Kostas	1
Draaijer, S.	1
Eddy, Sarah L.	1
Fadillah, Sarah Meilani	1
Gifford, Bernard	1
Ha, Minsu	1
Hambleton, Ronald K.	1
Han, Kyung T.	1
Han, Suhwa	1
Hartog, R. J. M.	1
Hatzinikita, Vassilia	1
Huang, Hung-Yu	1