Showing 586 to 600 of 3,089 results
Peer reviewed
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016
This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…
Descriptors: Test Reliability, Test Format, Test Items, Differences
Peer reviewed
Keller, Lisa A.; Keller, Robert; Cook, Robert J.; Colvin, Kimberly F. – Applied Measurement in Education, 2016
The equating of tests is an essential process in high-stakes, large-scale testing conducted over multiple forms or administrations. By adjusting for differences in difficulty and placing scores from different administrations of a test on a common scale, equating allows scores from these different forms and administrations to be directly compared…
Descriptors: Item Response Theory, Equated Scores, Test Format, Testing Programs
Peer reviewed
Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2016
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Descriptors: Cutting Scores, Psychometrics, Test Construction, Classification
Peer reviewed
Lewis, Kendra M.; Ewers, Timothy; Miller, JoLynn C.; Bird, Marianne; Borba, John; Hill, Russell D.; Rea-Keywood, Jeannette; Shelstad, Nancy; Trzesniewski, Kali – Journal of Extension, 2018
Research on retention in the 4-H youth development program has consistently shown that one of the primary indicators for youths' dropping out of 4-H is being a first-year member. Extension 4-H professionals from California, Idaho, Wyoming, and New Jersey formed a team to study this issue. Our team surveyed first-year members and their…
Descriptors: Youth Programs, Academic Persistence, School Holding Power, Dropout Research
Peer reviewed
Martin-Raugh, Michelle P.; Anguiano-Carrasco, Cristina; Jackson, Teresa; Brenneman, Meghan W.; Carney, Lauren; Barnwell, Patrick; Kochert, Jonathan – International Journal of Testing, 2018
Single-response situational judgment tests (SRSJTs) differ from multiple-response SJTs (MRSJTs) in that they present test takers with edited critical incidents and simply ask test takers to read over the action described and evaluate it according to its effectiveness. Research comparing the reliability and validity of SRSJTs and MRSJTs is thus far…
Descriptors: Test Format, Test Reliability, Test Validity, Predictive Validity
Peer reviewed
Full text available on ERIC
Liu, Yuming; Robin, Frédéric; Yoo, Hanwook; Manna, Venessa – ETS Research Report Series, 2018
The "GRE"® Psychology test is an achievement test that measures core knowledge in 12 content domains that represent the courses commonly offered at the undergraduate level. Currently, a total score and 2 subscores, experimental and social, are reported to test takers as well as graduate institutions. However, the American Psychological…
Descriptors: College Entrance Examinations, Graduate Study, Psychological Testing, Scores
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Peer reviewed
Wang, Ling – Journal of Educational Multimedia and Hypermedia, 2021
Running records are an important reading assessment for diagnosing early readers' needs in diverse instructional settings across grade levels. This study develops an innovative app to help teachers administer running records assessments and investigates teachers' perceptions of its functionality and usability in real classroom settings. The app offers…
Descriptors: Miscue Analysis, Reading Comprehension, Reading Tests, Computer Software
Peer reviewed
Prodi, Nicola; Visentin, Chiara – Journal of Speech, Language, and Hearing Research, 2019
Purpose: This study examines the effects of reverberation and noise fluctuation on the response time (RT) to the auditory stimuli in a speech reception task. Method: The speech reception task was presented to 76 young adults with normal hearing in 3 simulated listening conditions (1 anechoic, 2 reverberant). Speechlike stationary and fluctuating…
Descriptors: Acoustics, Reaction Time, Auditory Stimuli, Speech Communication
Peer reviewed
Steedle, Jeffrey T.; Morrison, Kristin M. – Educational Assessment, 2019
Assessment items are commonly field tested prior to operational use to observe statistical item properties such as difficulty. Item parameter estimates from field testing may be used to assign scores via pre-equating or computer adaptive designs. This study examined differences between item difficulty estimates based on field test and operational…
Descriptors: Field Tests, Test Items, Statistics, Difficulty Level
Peer reviewed
Full text available on ERIC
Neiro, Jakke; Johansson, Niko – LUMAT: International Journal on Math, Science and Technology Education, 2020
The history and evolution of science assessment remain poorly known, especially with respect to the content of exam questions. Here we analyze the Finnish matriculation examination in biology from the 1920s to the 1960s to understand how the exam has evolved in both its knowledge content and educational form. Each question was classified according to…
Descriptors: Foreign Countries, Biology, Test Content, Test Format
Peer reviewed
Shar, Kelli; Russ, Rosemary S.; Laverty, James T. – Physical Review Physics Education Research, 2020
Assessments are usually thought of as ways for instructors to get information from students. In this work, we flip this perspective and explore how assessments communicate information to students. Specifically, we consider how assessments may provide information about what faculty and/or researchers think it means to know and do physics, i.e.,…
Descriptors: Epistemology, Science Instruction, Physics, Science Tests
Peer reviewed
Nakata, Tatsuya – Studies in Second Language Acquisition, 2017
Although research shows that repetition increases second language vocabulary learning, only a few studies have examined the long-term effects of increasing retrieval frequency within a single learning session. With this in mind, the present study examined the effects of within-session repeated retrieval on vocabulary learning. The study is original in…
Descriptors: Repetition, Second Language Learning, Vocabulary Development, English
Peer reviewed
Höhne, Jan Karem; Schlosser, Stephan; Krebs, Dagmar – Field Methods, 2017
Measuring attitudes and opinions with agree/disagree (A/D) questions is a common method in social research because it appears to make it possible to measure different constructs with identical response scales. However, theoretical considerations suggest that A/D questions require considerable cognitive processing. Item-specific (IS) questions,…
Descriptors: Online Surveys, Test Format, Test Items, Difficulty Level
Peer reviewed
Liu, Chen-Wei; Wang, Wen-Chung – Journal of Educational Measurement, 2017
The examinee-selected-item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set of items (e.g., choose one item to respond from a pair of items), always yields incomplete data (i.e., only the selected items are answered and the others have missing data) that are likely nonignorable. Therefore, using…
Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Data Analysis