Showing all 10 results
Peer reviewed
Menold, Natalja; Toepoel, Vera – Sociological Methods & Research, 2024
Research on mixed devices in web surveys is in its infancy. Using a randomized experiment, we investigated device effects (desktop PC, tablet and mobile phone) for six response formats and four different numbers of scale points. N = 5,077 members of an online access panel participated in the experiment. An exact test of measurement invariance and…
Descriptors: Online Surveys, Handheld Devices, Telecommunications, Test Reliability
Peer reviewed
Karakolidis, Anastasios; O'Leary, Michael; Scully, Darina – International Journal of Testing, 2021
The linguistic complexity of many text-based tests can be a source of construct-irrelevant variance, as test-takers' performance may be affected by factors that are beyond the focus of the assessment itself, such as reading comprehension skills. This experimental study examined the extent to which the use of animated videos, as opposed to written…
Descriptors: Animation, Vignettes, Video Technology, Test Format
Peer reviewed
PDF on ERIC
Sheybani, Elias; Zeraatpishe, Mitra – International Journal of Language Testing, 2018
Test method is deemed to affect test scores along with examinee ability (Bachman, 1996). This research studies the role of method facet in reading comprehension tests. Bachman divided method facet into five categories, one of which is the nature of the input and of the expected response. This study examined the role of method effect in…
Descriptors: Reading Comprehension, Reading Tests, Test Items, Test Format
Peer reviewed
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
Schoen, Robert C.; Yang, Xiaotong; Liu, Sicong; Paek, Insu – Grantee Submission, 2017
The Early Fractions Test v2.2 is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test v2.2 is to serve as a measure of student outcomes in a randomized trial designed to estimate the effect of an educational…
Descriptors: Psychometrics, Mathematics Tests, Mathematics Achievement, Fractions
Peer reviewed
McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015
An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…
Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning
DeStefano, Lizanne; Johnson, Jeremiah – American Institutes for Research, 2013
This paper describes one of the first efforts by the National Assessment of Educational Progress (NAEP) to improve measurement at the lower end of the distribution, including measurement for students with disabilities (SD) and English language learners (ELLs). One way to improve measurement at the lower end is to introduce one or more…
Descriptors: National Competency Tests, Measures (Individuals), Disabilities, English Language Learners
Peer reviewed
Girard, Todd A.; Christensen, Bruce K. – Psychological Assessment, 2008
The correlation between a short-form (SF) test and its full-scale (FS) counterpart is a mainstay in the evaluation of SF validity. However, in correcting for overlapping error variance in this measure, investigators have overattenuated the validity coefficient through an intuitive misapplication of P. Levy's (1967) formula. The authors of the…
Descriptors: Error of Measurement, Computation, Psychiatric Services, Correlation
Haladyna, Tom; Roid, Gale – 1981
Two approaches to criterion-referenced test construction are compared. Classical test theory is based on the practice of random sampling from a well-defined domain of test items; latent trait theory suggests that the difficulty of the items should be matched to the achievement level of the student. In addition to these two methods of test…
Descriptors: Criterion Referenced Tests, Error of Measurement, Latent Trait Theory, Test Construction
Gaffney, Patrick V. – 1997
A reliability analysis was conducted of an abbreviated, 10-item version of the Pupil Control Ideology Form (PCI), using Cronbach's alpha (L. J. Cronbach, 1951) and the computation of the standard error of measurement. The PCI measures a teacher's orientation toward pupil control. Subjects were 168 preservice teachers from one private…
Descriptors: Classroom Techniques, Discipline, Error of Measurement, Higher Education