Showing 166 to 180 of 636 results
Peer reviewed
Watson, Nicole; Wilkins, Roger – Field Methods, 2015
Computer-assisted personal interviewing (CAPI) offers many attractive benefits over paper-and-pencil interviewing. There is, however, mixed evidence on the impact of CAPI on interview "length," an important survey outcome in the context of length limits imposed by survey budgets and concerns over respondent burden. In this article,…
Descriptors: Interviews, Test Length, Computer Assisted Testing, National Surveys
Peer reviewed
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations
Peer reviewed
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Peer reviewed
Runco, Mark A.; Walczyk, Jeffrey John; Acar, Selcuk; Cowger, Ernest L.; Simundson, Melissa; Tripp, Sunny – Journal of Creative Behavior, 2014
This article describes an empirical refinement of the "Runco Ideational Behavior Scale" (RIBS). The RIBS seems to be associated with divergent thinking, and the potential for creative thinking, but it was possible that its validity could be improved. With this in mind, three new scales were developed and the unique benefit (or…
Descriptors: Behavior Rating Scales, Creative Thinking, Test Validity, Psychometrics
Peer reviewed
Makransky, Guido; Dale, Philip S.; Havmose, Philip; Bleses, Dorthe – Journal of Speech, Language, and Hearing Research, 2016
Purpose: This study investigated the feasibility and potential validity of an item response theory (IRT)-based computerized adaptive testing (CAT) version of the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI:WS; Fenson et al., 2007) vocabulary checklist, with the objective of reducing length while maintaining…
Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Language Tests
Peer reviewed
Bae, Minryoung; Lee, Byungmin – English Teaching, 2018
This study examines the effects of text length and question type on Korean EFL readers' reading comprehension of the fill-in-the-blank items in the Korean CSAT. A total of 100 Korean EFL college students participated in the study. After being divided into three proficiency groups, the participants took a reading comprehension test that consisted…
Descriptors: Test Items, Language Tests, Second Language Learning, Second Language Instruction
Peer reviewed
Veldkamp, Bernard P. – Journal of Educational Measurement, 2016
Many standardized tests are now administered via computer rather than paper-and-pencil format. The computer-based delivery mode brings with it certain advantages. One advantage is the ability to adapt the difficulty level of the test to the ability level of the test taker in what has been termed computerized adaptive testing (CAT). A second…
Descriptors: Computer Assisted Testing, Reaction Time, Standardized Tests, Difficulty Level
Peer reviewed
Sengul Avsar, Asiye; Tavsancil, Ezel – Educational Sciences: Theory and Practice, 2017
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three sample sizes (100, 250 and 500)--were generated by conducting 20…
Descriptors: Test Items, Psychometrics, Nonparametric Statistics, Item Response Theory
Peer reviewed
Lu, Ying – ETS Research Report Series, 2017
For standard- or criterion-based assessments, the use of cut scores to indicate mastery, nonmastery, or different levels of skill mastery is very common. As part of performance summary, it is of interest to examine the percentage of examinees at or above the cut scores (PAC) and how PAC evolves across administrations. This paper shows that…
Descriptors: Cutting Scores, Evaluation Methods, Mastery Learning, Performance Based Assessment
NWEA, 2018
Thousands of U.S. school districts and many international schools use MAP® Growth™ to monitor the academic growth of their students and to inform instruction. The MAP Growth assessment is untimed, meaning that limits are not placed on how much time a student has to respond to the items. However, to help schools understand the amount of time MAP…
Descriptors: Achievement Tests, Test Length, Achievement Gains, Mathematics Tests
Peer reviewed
Kabasakal, Kübra Atalay; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2015
This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…
Descriptors: Test Bias, Equated Scores, Item Response Theory, Simulation
Peer reviewed
Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014
Likert-type rating scales are still the most widely used method when measuring psychoeducational constructs. The present study investigates a long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…
Descriptors: Likert Scales, Responses, Item Response Theory, Classification
Peer reviewed
Gelfand, Jessica T.; Christie, Robert E.; Gelfand, Stanley A. – Journal of Speech, Language, and Hearing Research, 2014
Purpose: Speech recognition may be analyzed in terms of recognition probabilities for perceptual wholes (e.g., words) and parts (e.g., phonemes), where j, the j-factor, reveals the number of independent perceptual units required for recognition of the whole (Boothroyd, 1968b; Boothroyd & Nittrouer, 1988; Nittrouer & Boothroyd, 1990). For…
Descriptors: Phonemes, Word Recognition, Vowels, Syllables
Peer reviewed
Anthony, Christopher James; DiPerna, James Clyde – School Psychology Quarterly, 2017
The Academic Competence Evaluation Scales-Teacher Form (ACES-TF; DiPerna & Elliott, 2000) was developed to measure student academic skills and enablers (interpersonal skills, engagement, motivation, and study skills). Although ACES-TF scores have demonstrated psychometric adequacy, the length of the measure may be prohibitive for certain…
Descriptors: Test Items, Efficiency, Item Response Theory, Test Length
Peer reviewed
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items