Showing 76 to 90 of 636 results
Benton, Tom – Research Matters, 2021
Computer adaptive testing is intended to make assessment more reliable by tailoring the difficulty of the questions a student has to answer to their level of ability. Most commonly, this benefit is used to justify the length of tests being shortened whilst retaining the reliability of a longer, non-adaptive test. Improvements due to adaptive…
Descriptors: Risk, Item Response Theory, Computer Assisted Testing, Difficulty Level
Peer reviewed
PDF on ERIC
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latent-variable approaches to assessing measurement invariance at the item level have been…
Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory
Peer reviewed
Hula, William D.; Fergadiotis, Gerasimos; Swiderski, Alexander M.; Silkes, JoAnn P.; Kellough, Stacey – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The purpose of this study was to verify the equivalence of 2 alternate test forms with nonoverlapping content generated by an item response theory (IRT)-based computer-adaptive test (CAT). The Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) was utilized as an item bank in a prospective, independent…
Descriptors: Adaptive Testing, Computer Assisted Testing, Severity (of Disability), Aphasia
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
Arikan, Serkan; Aybek, Eren Can – Educational Measurement: Issues and Practice, 2022
Many scholars compared various item discrimination indices in real or simulated data. Item discrimination indices, such as item-total correlation, item-rest correlation, and IRT item discrimination parameter, provide information about individual differences among all participants. However, there are tests that aim to select a very limited number…
Descriptors: Monte Carlo Methods, Item Analysis, Correlation, Individual Differences
Ziying Li; A. Corinne Huggins-Manley; Walter L. Leite; M. David Miller; Eric A. Wright – Educational and Psychological Measurement, 2022
The unstructured multiple-attempt (MA) item response data in virtual learning environments (VLEs) are often from student-selected assessment data sets, which include missing data, single-attempt responses, multiple-attempt responses, and unknown growth ability across attempts, leading to a complex and complicated scenario for using this kind of…
Descriptors: Sequential Approach, Item Response Theory, Data, Simulation
Peer reviewed
PDF on ERIC
Xu, Peng; Desmarais, Michel C. – International Educational Data Mining Society, 2018
In most contexts of student skills assessment, whether the test material is administered by the teacher or within a learning environment, there is a strong incentive to minimize the number of questions or exercises administered in order to get an accurate assessment. This minimization objective can be framed as a Q-matrix design problem: given a…
Descriptors: Test Items, Accuracy, Test Construction, Skills
Peer reviewed
PDF on ERIC
Tulek, Onder Kamil; Kose, Ibrahim Alper – Eurasian Journal of Educational Research, 2019
Purpose: This research investigates tests that include DIF items and tests that have been purified of DIF items. While doing this, the ability estimations and purified DIF items are compared to understand whether there is a correlation between the estimations. Method: The researcher used R 3.4.1 to compare the items and after this situation;…
Descriptors: Test Items, Item Analysis, Item Response Theory, Test Length
Peer reviewed
PDF on ERIC
Köse, Alper; Dogan, C. Deha – International Journal of Evaluation and Research in Education, 2019
The aim of this study was to examine the precision of item parameter estimation in different sample sizes and test lengths under the three-parameter logistic (3PL) item response theory (IRT) model, where the trait measured by a test was not normally distributed or had a skewed distribution. In the study, number of categories (1-0), and item…
Descriptors: Statistical Bias, Item Response Theory, Simulation, Accuracy
Peer reviewed
PDF on ERIC
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Peer reviewed
Hill, Andrew P.; Donachie, Tracy – Journal of Psychoeducational Assessment, 2020
The measurement of perfectionistic cognitions has recently caused disagreement among researchers. Flett, Hewitt, Blankstein, and Gray proposed that perfectionistic cognitions are unidimensional. However, after re-examining the factor structure of the instrument used to measure perfectionistic automatic thoughts (Perfectionism Cognitions Inventory…
Descriptors: Factor Structure, Test Length, Cognitive Processes, Personality Traits
Peer reviewed
Luo, Xiao; Wang, Xinrui – International Journal of Testing, 2019
This study introduced dynamic multistage testing (dy-MST) as an improvement to existing adaptive testing methods. dy-MST combines the advantages of computerized adaptive testing (CAT) and computerized adaptive multistage testing (ca-MST) to create a highly efficient and regulated adaptive testing method. In the test construction phase, multistage…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Psychometrics
Peer reviewed
PDF on ERIC
Öztürk, Nagihan Boztunç – Universal Journal of Educational Research, 2019
This study examines how the length and characteristics of the routing module in different panel designs affect measurement precision. Six routing module lengths, nine routing module characteristics, and two panel designs are considered. At the end of the study, the effects of conditions on…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Length, Test Format
Peer reviewed
Pham, Theresa; Bardell, Taylor E.; Vollebregt, Meghan; Kuiack, Alyssa K.; Archibald, Lisa M. D. – Journal of Speech, Language, and Hearing Research, 2022
Purpose: Working memory and linguistic knowledge are highly intertwined in language tasks. Verbal working memory in particular has been studied as a potential constraint on language performance. This, in turn, highlights the need for a clinical assessment tool that will assist clinicians in understanding individual children's performance in…
Descriptors: Short Term Memory, Language Tests, Preschool Children, Verbal Ability
Peer reviewed
Zhou, Sherry; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2020
The semi-generalized partial credit model (Semi-GPCM) has been proposed as a unidimensional modeling method for handling not applicable scale responses and neutral scale responses, and it has been suggested that the model may be of use in handling missing data in scale items. The purpose of this study is to evaluate the ability of the…
Descriptors: Models, Statistical Analysis, Response Style (Tests), Test Items