Publication Date

| Date range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 9 |
| Since 2022 (last 5 years) | 112 |
| Since 2017 (last 10 years) | 216 |
| Since 2007 (last 20 years) | 377 |
Descriptor

| Descriptor | Count |
| --- | --- |
| Comparative Analysis | 598 |
| Item Analysis | 598 |
| Test Items | 230 |
| Foreign Countries | 182 |
| Scores | 103 |
| Item Response Theory | 98 |
| Statistical Analysis | 97 |
| Correlation | 93 |
| Test Construction | 86 |
| Factor Analysis | 83 |
| Difficulty Level | 80 |
Author

| Author | Count |
| --- | --- |
| Hambleton, Ronald K. | 5 |
| Weiss, David J. | 4 |
| Bashaw, W. L. | 3 |
| Benson, Jeri | 3 |
| Blanton, Maria | 3 |
| Facon, Bruno | 3 |
| Gongjun Xu | 3 |
| Haladyna, Tom | 3 |
| Knuth, Eric | 3 |
| Lord, Frederic M. | 3 |
| Reckase, Mark D. | 3 |
Audience

| Audience | Count |
| --- | --- |
| Researchers | 15 |
| Practitioners | 4 |
| Teachers | 4 |
| Students | 2 |
| Policymakers | 1 |
Location

| Location | Count |
| --- | --- |
| Australia | 13 |
| China | 13 |
| Germany | 13 |
| Turkey | 13 |
| Canada | 8 |
| United Kingdom | 8 |
| United Kingdom (England) | 8 |
| United States | 8 |
| Indonesia | 7 |
| Iran | 7 |
| Japan | 7 |
Laws, Policies, & Programs

| Law/Program | Count |
| --- | --- |
| No Child Left Behind Act 2001 | 3 |
| Individuals with Disabilities… | 1 |
Seyda Aydin-Karaca; Mustafa Serdar Köksal; Bilkay Bi – Journal of Psychoeducational Assessment, 2024
This study aimed to develop a parent rating scale (PRSG) for screening children for giftedness as part of a further identification process. The participants were 255 parents of gifted and non-gifted students. The PRSG, consisting of 30 items, was created by consulting parents and reviewing existing instruments in the literature. As…
Descriptors: Rating Scales, Parent Attitudes, Scores, Comparative Analysis
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
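For orientation, here is a minimal sketch of classical (unweighted) characteristic-curve linking for the 2PL model; the information-weighted methods the abstract refers to modify the loss function below. All parameter values and the `haebara_loss` helper are illustrative assumptions, not the authors' code.

```python
# Sketch of classical Haebara characteristic-curve linking for the 2PL model.
import numpy as np
from scipy.optimize import minimize

def p2pl(theta, a, b):
    """2PL item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def haebara_loss(AB, a_new, b_new, a_old, b_old, theta_grid):
    A, B = AB
    # Place the new-form parameter estimates on the old-form scale:
    # theta_old = A * theta_new + B implies a* = a/A and b* = A*b + B.
    a_t, b_t = a_new / A, A * b_new + B
    diff = p2pl(theta_grid[:, None], a_t, b_t) - p2pl(theta_grid[:, None], a_old, b_old)
    return np.sum(diff ** 2)

# Toy parameter estimates for common items on two forms (hypothetical).
a_old = np.array([1.0, 1.2, 0.8]); b_old = np.array([-0.5, 0.0, 0.7])
a_new = np.array([0.9, 1.1, 0.85]); b_new = np.array([-0.3, 0.2, 0.9])
grid = np.linspace(-4, 4, 81)

res = minimize(haebara_loss, x0=[1.0, 0.0], args=(a_new, b_new, a_old, b_old, grid))
A_hat, B_hat = res.x
print(f"Slope A = {A_hat:.3f}, intercept B = {B_hat:.3f}")
```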
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
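As a concrete illustration of DIF detection by conditioning on an observed proxy for the latent trait, below is a hedged Mantel-Haenszel sketch; this is a standard approach, not necessarily the method Finch evaluates, and the data and the `mantel_haenszel_dif` helper are invented.

```python
# Mantel-Haenszel DIF sketch: stratify by total score, compare item
# correctness across a reference and a focal group within each stratum.
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """item: 0/1 responses; total: matching scores; group: 0=reference, 1=focal."""
    num, den = 0.0, 0.0
    for s in np.unique(total):
        m = total == s
        ref, foc = item[m & (group == 0)], item[m & (group == 1)]
        n = len(ref) + len(foc)
        if len(ref) == 0 or len(foc) == 0:
            continue
        A, B = ref.sum(), len(ref) - ref.sum()   # reference: right / wrong
        C, D = foc.sum(), len(foc) - foc.sum()   # focal: right / wrong
        num += A * D / n
        den += B * C / n
    return num / den  # common odds ratio; 1.0 indicates no DIF

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)
theta = rng.normal(0, 1, n)
items = (rng.random((n, 10)) < 1 / (1 + np.exp(-theta[:, None]))).astype(int)
total = items.sum(axis=1)
print("MH odds ratio for item 0:", mantel_haenszel_dif(items[:, 0], total, group))
```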
Carmen Batanero; Luis A. Hernandez-Solis; Maria M. Gea – Statistics Education Research Journal, 2023
We present an exploratory study of Costa Rican and Spanish students' (11-16 years old) competence in comparing probabilities in urn problems and comparing ratios in mixture problems. A sample of 704 students in Grades 6 through 10, 292 from Costa Rica and 412 from Spain, was given one of two forms of a questionnaire with three probability comparison…
Descriptors: Statistics Education, Comparative Analysis, Foreign Countries, Probability
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article examines the performance of item response theory (IRT) models when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
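A minimal sketch of the GPCM category probabilities underlying the study, assuming the usual parameterization with step difficulties; the numeric values are made up.

```python
# Generalized partial credit model (GPCM) category probabilities.
import numpy as np

def gpcm_probs(theta, a, b_steps):
    """P(X = k | theta) for k = 0..K, with step difficulties b_steps (length K)."""
    # Cumulative sums of a*(theta - b_k); category 0 has an empty sum (0).
    z = np.concatenate([[0.0], np.cumsum(a * (theta - np.asarray(b_steps)))])
    ez = np.exp(z - z.max())          # stabilized softmax
    return ez / ez.sum()

print(gpcm_probs(theta=0.5, a=1.2, b_steps=[-1.0, 0.0, 1.0]))
```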
Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone for interpreting PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Schaper, Marie Luisa; Kuhlmann, Beatrice G.; Bayen, Ute J. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2023
Item memory and source memory are different aspects of episodic remembering. To investigate metamemory differences between them, the authors assessed systematic differences between predictions of item memory via Judgments of Learning (JOLs) and predictions of source memory via Judgments of Source (JOSs). Schema-based expectations affect JOLs and JOSs…
Descriptors: Memory, Metacognition, Schemata (Cognition), Prediction
Schröder, Jette; Schmiedeberg, Claudia – Sociological Methods & Research, 2023
Although third parties are present during a substantial share of face-to-face interviews, bystander influence on respondents' response behavior is not yet fully understood. We use nine waves of the German Family Panel "pairfam" and apply fixed-effects panel regression models to analyze effects of third-party presence on…
Descriptors: Housework, Item Analysis, Interpersonal Relationship, Responses
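For readers unfamiliar with the method, a rough sketch of a fixed-effects (within) panel regression follows; the variable names and synthetic data are assumptions for illustration, not the pairfam variables.

```python
# Fixed-effects estimation via the within transformation: demean outcome and
# regressor by person, then run OLS on the demeaned data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_persons, n_waves = 200, 9
df = pd.DataFrame({
    "person": np.repeat(np.arange(n_persons), n_waves),
    "third_party_present": rng.integers(0, 2, n_persons * n_waves),
})
df["response"] = 0.3 * df["third_party_present"] + rng.normal(0, 1, len(df))

demeaned = df.groupby("person")[["response", "third_party_present"]].transform(
    lambda s: s - s.mean()
)
x = demeaned["third_party_present"].to_numpy()
y = demeaned["response"].to_numpy()
beta = (x @ y) / (x @ x)  # single-regressor OLS slope
print(f"FE estimate of third-party effect: {beta:.3f}")
```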
Wu, Tong – ProQuest LLC, 2023
This three-article dissertation addresses three methodological challenges to ensuring comparability in educational research: scale linking, test equating, and propensity score (PS) weighting. The first study intends to improve test scale comparability by evaluating the effect of six missing data handling approaches, including…
Descriptors: Educational Research, Comparative Analysis, Equated Scores, Weighted Scores
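One of the challenges named here, propensity score weighting, can be sketched with inverse-propensity weighting on synthetic data; everything below (covariates, effect size) is hypothetical.

```python
# Inverse-propensity weighting (IPW): estimate P(treatment | covariates),
# weight each unit by the inverse of its probability of its observed group.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=(n, 3))                     # covariates
p_true = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))
treat = rng.random(n) < p_true
y = 1.0 * treat + x[:, 0] + rng.normal(size=n)  # outcome with true effect 1.0

ps = LogisticRegression().fit(x, treat).predict_proba(x)[:, 1]
w = np.where(treat, 1 / ps, 1 / (1 - ps))       # IPW weights
ate = np.average(y[treat], weights=w[treat]) - np.average(y[~treat], weights=w[~treat])
print(f"IPW estimate of treatment effect: {ate:.3f} (true = 1.0)")
```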
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. Their objective is to gain information about the latent semantic space of a set of related textual data, which contains the relationships between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
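A quick, hedged illustration of fitting a topic model, using scikit-learn's latent Dirichlet allocation on a toy corpus; the corpus and settings are placeholders, not the authors' setup.

```python
# Fit a two-topic LDA model to a tiny corpus and print the top words per topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "students answered the constructed response item",
    "raters scored the essay responses for the assessment",
    "the model estimates latent topics from word counts",
    "topic models summarize the semantic space of documents",
]
counts = CountVectorizer(stop_words="english").fit(docs)
X = counts.transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = counts.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-4:][::-1]]
    print(f"Topic {k}: {', '.join(top)}")
```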
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper employs the Many-Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). The participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
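The core of the Many-Facet Rasch Model for dichotomous scores can be sketched as an additive logit of ability, item difficulty, and rater severity; the parameter values below are invented for illustration.

```python
# Many-Facet Rasch Model (dichotomous case):
# logit P(correct) = ability - item difficulty - rater severity.
import numpy as np

def mfrm_prob(theta, item_difficulty, rater_severity):
    """Probability of an accepted response under a simple MFRM."""
    logit = theta - item_difficulty - rater_severity
    return 1 / (1 + np.exp(-logit))

# Same examinee and item, judged by a lenient vs. a severe rater.
print(mfrm_prob(0.8, 0.2, -0.5))  # lenient rater -> higher probability
print(mfrm_prob(0.8, 0.2, 0.7))   # severe rater -> lower probability
```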
Celen, Umit; Aybek, Eren Can – International Journal of Assessment Tools in Education, 2022
Item analysis is performed by developers as an integral part of the scale development process: items are excluded from the scale on the basis of item analysis before factor analysis is conducted. Existing item discrimination indices are calculated from correlations, yet items with different response patterns are likely to have a similar item…
Descriptors: Likert Scales, Factor Analysis, Item Analysis, Correlation
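A small sketch of the correlation-based discrimination index the abstract alludes to, the corrected item-total correlation, computed on simulated dichotomous responses; the data are synthetic.

```python
# Corrected item-total correlation: correlate each item with the total score
# computed from the remaining items, so the item does not inflate its own index.
import numpy as np

def corrected_item_total(items):
    """Correlation of each item with the total score excluding that item."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

rng = np.random.default_rng(3)
theta = rng.normal(size=500)
items = (rng.random((500, 8)) < 1 / (1 + np.exp(-theta[:, None]))).astype(int)
print(np.round(corrected_item_total(items), 2))
```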
Erdem-Kara, Basak; Dogan, Nuri – International Journal of Assessment Tools in Education, 2022
Recently, adaptive test approaches have become a viable alternative to traditional fixed-item tests. The main advantage of adaptive tests is that they reach the desired measurement precision with fewer items. However, fewer items mean that each item has a greater effect on ability estimation, and therefore such tests are open to more…
Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Test Construction
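To make the trade-off concrete, here is a bare-bones sketch of maximum-information item selection, the standard mechanism behind adaptive tests; the item bank and ability estimate are invented.

```python
# Maximum-information item selection for a 2PL item bank: administer the item
# with the highest Fisher information at the current ability estimate.
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information of 2PL items at ability theta."""
    p = 1 / (1 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

a = np.array([0.8, 1.5, 1.1, 2.0])   # discriminations
b = np.array([-1.0, 0.0, 0.4, 1.2])  # difficulties
theta_hat = 0.3                       # current ability estimate

next_item = int(np.argmax(info_2pl(theta_hat, a, b)))
print(f"Administer item {next_item} (highest information at theta = {theta_hat})")
```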
Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022
Item parameter drift (IPD) is the systematic change of item parameter values over time due to various causes. If it occurs in computerized adaptive tests (CAT), it introduces errors into the estimation of item and ability parameters. Identifying the conditions under which IPD arises in CAT is important for estimating item and…
Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement
Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025
Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality, and the quality of AI-generated MCIs is comparable to that of items written by human experts. However, whether the quality of AI-generated MCIs is equally good across various domain-…
Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks
