Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 6 |
Descriptor
| Reliability | 12 |
| Test Format | 12 |
| Test Items | 12 |
| Psychometrics | 4 |
| Scores | 4 |
| Comparative Analysis | 3 |
| Item Response Theory | 3 |
| Test Construction | 3 |
| Adaptive Testing | 2 |
| Classification | 2 |
| Computer Assisted Testing | 2 |
Author
| Barnette, J. Jackson | 1 |
| Chang, Lei | 1 |
| Downing, Steven M. | 1 |
| Earley, Mark A. | 1 |
| Harley, Dwight | 1 |
| Hoffman, Lesa | 1 |
| Kim, Sooyeon | 1 |
| Mertler, Craig A. | 1 |
| Moses, Tim | 1 |
| Papanastasiou, Elena C. | 1 |
| Pommerich, Mary | 1 |
Publication Type
| Reports - Research | 11 |
| Journal Articles | 9 |
| Speeches/Meeting Papers | 2 |
| Dissertations/Theses -… | 1 |
| Tests/Questionnaires | 1 |
Education Level
| Grade 11 | 1 |
| Grade 12 | 1 |
| High Schools | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
| Secondary Education | 1 |
Location
| Turkey | 1 |
Assessments and Surveys
| Peabody Individual… | 1 |
| Peabody Picture Vocabulary… | 1 |
Sayin, Ayfer; Sata, Mehmet – International Journal of Assessment Tools in Education, 2022
The aim of the present study was to examine Turkish teacher candidates' competency levels in writing different types of test items by utilizing Rasch analysis. In addition, the effect of the expertise of the raters scoring the items written by the teacher candidates was examined within the scope of the study. 84 Turkish teacher candidates…
Descriptors: Foreign Countries, Item Response Theory, Evaluators, Expertise
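Not from the cited study — a minimal sketch of the Rasch model that the abstract's analysis is built on, where the probability of a correct (or endorsed) response depends only on the gap between person ability and item difficulty on a logit scale:

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct/endorsed response under the Rasch model:
    theta is person ability, b is item difficulty, both in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the probability is exactly 0.5.
p_equal = rasch_probability(0.0, 0.0)
# Ability above difficulty raises the probability toward 1.
p_high = rasch_probability(2.0, 0.0)
```

Rasch software estimates `theta` and `b` jointly from response data; this function is only the response model those estimates plug into.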
Tingir, Seyfullah – ProQuest LLC, 2019
Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…
Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability
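The dissertation's actual CPT-adjustment method is not given in the abstract; as background, a conditional probability table can be fit to observations by the standard maximum-likelihood counting estimator, sketched here with hypothetical parent/child states:

```python
from collections import Counter, defaultdict

def fit_cpt(observations):
    """Maximum-likelihood estimate of a conditional probability table
    P(child | parent) from (parent, child) observation pairs."""
    counts = defaultdict(Counter)
    for parent, child in observations:
        counts[parent][child] += 1
    cpt = {}
    for parent, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent] = {c: n / total for c, n in child_counts.items()}
    return cpt

# Hypothetical skill-level → item-outcome observations.
data = [("high", "correct"), ("high", "correct"), ("high", "wrong"),
        ("low", "wrong"), ("low", "wrong"), ("low", "correct")]
cpt = fit_cpt(data)  # cpt["high"]["correct"] is 2/3
```

With latent (unobserved) parents, this counting step becomes the M-step of an EM loop over expected counts rather than observed ones.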
Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013
The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…
Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items
Hoffman, Lesa; Templin, Jonathan; Rice, Mabel L. – Journal of Speech, Language, and Hearing Research, 2012
Purpose: The present work describes how vocabulary ability as assessed by 3 different forms of the Peabody Picture Vocabulary Test (PPVT; Dunn & Dunn, 1997) can be placed on a common latent metric through item response theory (IRT) modeling, by which valid comparisons of ability between samples or over time can then be made. Method: Responses…
Descriptors: Item Response Theory, Test Format, Vocabulary, Comparative Analysis
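Placing forms on a common latent metric, as the abstract describes, is typically done with a linear transformation of the ability scale. A minimal sketch of one standard approach (mean/sigma linking) with made-up ability estimates:

```python
import statistics

def mean_sigma_link(thetas_new, thetas_base):
    """Mean/sigma linking constants (A, B) such that A*theta + B maps
    abilities from the new form onto the base form's metric."""
    a = statistics.pstdev(thetas_base) / statistics.pstdev(thetas_new)
    b = statistics.mean(thetas_base) - a * statistics.mean(thetas_new)
    return a, b

# Hypothetical ability estimates from two forms of the same measure.
new_form = [-1.0, 0.0, 1.0]
base_form = [0.0, 1.0, 2.0]
a, b = mean_sigma_link(new_form, base_form)
linked = [a * t + b for t in new_form]  # now on the base form's scale
```

IRT linking in practice uses common items or concurrent calibration; this only illustrates the linear rescaling at its core.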
Papanastasiou, Elena C.; Reckase, Mark D. – International Journal of Testing, 2007
Because of the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT from an examinee's point of view is that in…
Descriptors: Simulation, Adaptive Testing, Computer Assisted Testing, Test Items
Pommerich, Mary – Journal of Technology, Learning, and Assessment, 2007
Computer administered tests are becoming increasingly prevalent as computer technology becomes more readily available on a large scale. For testing programs that utilize both computer and paper administrations, mode effects are problematic in that they can result in examinee scores that are artificially inflated or deflated. As such, researchers…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Scores
Barnette, J. Jackson – Research in the Schools, 2001
Studied the primacy effect (tendency to select items closer to the left side of the response scale) in Likert scales worded from "Strongly Disagree" to "Strongly Agree" and in the opposite direction. Findings for 386 high school and college students show no primacy effect, although negatively worded stems had an effect on Cronbach's alpha. (SLD)
Descriptors: College Students, High School Students, High Schools, Higher Education
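The Cronbach's alpha statistic the abstract refers to can be computed directly from item-level scores; a self-contained sketch (not the study's code):

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha for item_scores[i][j] = respondent j's score
    on item i: (k/(k-1)) * (1 - sum of item variances / total variance)."""
    k = len(item_scores)
    item_variances = [statistics.pvariance(item) for item in item_scores]
    totals = [sum(resp) for resp in zip(*item_scores)]  # per-respondent sums
    total_variance = statistics.pvariance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Three hypothetical Likert items answered by four respondents.
alpha = cronbach_alpha([[2, 4, 5, 3], [3, 4, 4, 2], [2, 5, 5, 3]])
```

Reverse-scoring negatively worded stems before computing alpha matters here, since mixed item direction deflates inter-item covariances — the effect the study reports.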
Downing, Steven M. – Advances in Health Sciences Education, 2005
The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either…
Descriptors: Medical Education, Medical Students, Test Items, Test Format
Mertler, Craig A.; Earley, Mark A. – 2003
A study was conducted to compare the psychometric qualities of two forms of an identical survey: one administered in a paper-and-pencil format and the other administered in Web format. The survey addressed the topic of college course anxiety and was used to survey a sample of 236 undergraduate students. The psychometric qualities investigated included…
Descriptors: Anxiety, Comparative Analysis, Higher Education, Psychometrics
Chang, Lei – Applied Psychological Measurement, 1994 (peer reviewed)
Reliability and validity of 4-point and 6-point scales were assessed using a new model-based approach to fit empirical data from 165 graduate students completing an attitude measure. Results suggest that the issue of four- versus six-point scales may depend on the empirical setting. (SLD)
Descriptors: Attitude Measures, Goodness of Fit, Graduate Students, Graduate Study
Rogers, W. Todd; Harley, Dwight – Educational and Psychological Measurement, 1999 (peer reviewed)
Examined item-level and test-level characteristics for items in a high-stakes school-leaving mathematics examination. Results from 158 students show that the influence of testwiseness is lessened when three-option items are used. Tests of three-option items are at least equivalent to four-option item tests in terms of internal-consistency score…
Descriptors: Comparative Analysis, High School Students, High Schools, High Stakes Tests
Sykes, Robert C.; Truskosky, Denise; White, Hillory – 2001
The purpose of this research was to study the effect of three different ways of increasing the number of points contributed by constructed response (CR) items on the reliability of test scores from mixed-item-format tests. The assumption of unidimensionality that underlies the accuracy of item response theory model-based standard error…
Descriptors: Constructed Response, Elementary Education, Elementary School Students, Error of Measurement
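The abstract's concern with score reliability and standard errors rests on the classical-test-theory relation between the two; a minimal sketch with illustrative numbers (not taken from the study):

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical SEM: observed-score standard deviation scaled by
    sqrt(1 - reliability). Lower reliability means a larger SEM."""
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical test: SD = 10 score points, reliability = 0.75.
sem = standard_error_of_measurement(10.0, 0.75)  # → 5.0
```

Adding CR score points changes reliability, and through this relation also the precision band around each examinee's observed score.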

