Showing 7,456 to 7,470 of 136,470 results
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
Peer reviewed
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring
Peer reviewed
Becker, Anthony; Nekrasova-Beker, Tatiana – Educational Assessment, 2018
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b)…
Descriptors: Reading Comprehension, Test Items, Reading Tests, Test Format
Peer reviewed
Javidanmehr, Zahra; Anani Sarab, Mohammad Reza – Language Assessment Quarterly, 2019
With the advancement of Cognitive Diagnostic Assessment (CDA) and the pertinent statistical models, different domains of large-scale testing and assessment have been examined for the sake of reporting more diagnostic information. Applying the generalized deterministic input, noisy, "and" gate (G-DINA) model, the current study analyzed a…
Descriptors: Reading Comprehension, Reading Tests, High Stakes Tests, College Entrance Examinations
Peer reviewed
PDF on ERIC
Syahfitri, Jayanti; Firman, Harry; Redjeki, Sri; Sriyati, Siti – International Journal of Instruction, 2019
The purpose of this study was to develop the Critical Thinking Disposition Test in Biology as an alternative instrument for assessing the extent of one's disposition toward critical thinking, especially in university-level biology. Critical Thinking Disposition Tests in Biology are multiple-choice tests based on biological cases. This…
Descriptors: Biology, Critical Thinking, Science Instruction, Science Tests
Peer reviewed
Açikgül Firat, Esra; Köksal, Mustafa S. – Biochemistry and Molecular Biology Education, 2019
In this study, a 'biotechnology literacy test' was developed to determine the biotechnology literacy of prospective science teachers, and its validity and reliability were determined. For this purpose, 42 items were prepared by considering Bybee's scientific literacy classifications (nominal, functional, procedural, and multidimensional). The…
Descriptors: Test Construction, Multiple Choice Tests, Science Teachers, Preservice Teachers
Peer reviewed
PDF on ERIC
Vaheoja, Monika; Verhelst, N. D.; Eggen, T.J.H.M. – European Journal of Science and Mathematics Education, 2019
In this article, the authors applied profile analysis to Maths exam data to demonstrate how different exam forms, differing in difficulty and length, can be reported and easily interpreted. The results were presented for different groups of participants and for different institutions in different Maths domains by evaluating the balance. Some…
Descriptors: Feedback (Response), Foreign Countries, Statistical Analysis, Scores
Peer reviewed
PDF on ERIC
Karakolidis, Anastasios; Scully, Darina; O'Leary, Michael – Practical Assessment, Research & Evaluation, 2021
As part of the growing interest in the measurement of complex constructs in recent years, a body of research examining the extent to which videos are a useful alternative to written text in tests and assessments has emerged. Early attempts to replace written text with videos featured actors, but lately, animated videos have become more popular.…
Descriptors: Animation, Video Technology, Visual Measures, Test Construction
Peer reviewed
Foster, Robert C. – Educational and Psychological Measurement, 2021
This article presents some equivalent forms of the common Kuder-Richardson Formula 21 and 20 estimators for nondichotomous data belonging to certain other exponential families, such as Poisson count data, exponential data, or geometric counts of trials until failure. Using the generalized framework of Foster (2020), an equation for the reliability…
Descriptors: Test Reliability, Data, Computation, Mathematical Formulas
Peer reviewed
DeCarlo, Lawrence T.; Zhou, Xiaoliang – Journal of Educational Measurement, 2021
In signal detection rater models for constructed response (CR) scoring, it is assumed that raters discriminate equally well between different latent classes defined by the scoring rubric. An extended model that relaxes this assumption is introduced; the model recognizes that a rater may not discriminate equally well between some of the scoring…
Descriptors: Scoring, Models, Bias, Perception
Peer reviewed
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2021
The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability…
Descriptors: Test Reliability, Computation, Measures (Individuals), Research Problems
Peer reviewed
Strachan, Tyler; Cho, Uk Hyun; Kim, Kyung Yong; Willse, John T.; Chen, Shyh-Huei; Ip, Edward H.; Ackerman, Terry A.; Weeks, Jonathan P. – Journal of Educational Measurement, 2021
In vertical scaling, results of tests from several different grade levels are placed on a common scale. Most vertical scaling methodologies rely heavily on the assumption that the construct being measured is unidimensional. In many testing situations, however, such an assumption could be problematic. For instance, the construct measured at one…
Descriptors: Item Response Theory, Scaling, Tests, Construct Validity
Peer reviewed
Deribo, Tobias; Kroehne, Ulf; Goldhammer, Frank – Journal of Educational Measurement, 2021
The increased availability of time-related information as a result of computer-based assessment has enabled new ways to measure test-taking engagement. One of these ways is to distinguish between solution and rapid guessing behavior. Prior research has recommended response-level filtering to deal with rapid guessing. Response-level filtering can…
Descriptors: Guessing (Tests), Models, Reaction Time, Statistical Analysis
Peer reviewed
Norris, John; Drackert, Anastasia – Language Testing, 2018
The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…
Descriptors: German, Second Language Learning, Language Tests, Language Proficiency
Peer reviewed
Stirk, Steven; Field, Bryony; Black, Jessica – Journal of Applied Research in Intellectual Disabilities, 2018
Background: The Learning Disability Screening Questionnaire (LDSQ) has been shown to have high sensitivity and specificity to identify those who are likely to meet intellectual disability diagnostic criteria (McKenzie et al., 2015). However, there is no independent research to date to support these findings. Materials and Methods:…
Descriptors: Learning Disabilities, Questionnaires, Screening Tests, Diagnostic Tests