Publication Date
| In 2026 | 0 |
| Since 2025 | 3 |
| Since 2022 (last 5 years) | 10 |
| Since 2017 (last 10 years) | 46 |
| Since 2007 (last 20 years) | 86 |
Descriptor
| Scoring | 214 |
| Test Construction | 214 |
| Test Reliability | 214 |
| Test Validity | 155 |
| Testing | 61 |
| Test Items | 58 |
| Item Analysis | 40 |
| Test Interpretation | 38 |
| Item Response Theory | 31 |
| Language Tests | 31 |
| Psychometrics | 30 |
| More ▼ | |
Source
Author
Publication Type
Education Level
| Secondary Education | 28 |
| Elementary Education | 23 |
| Middle Schools | 19 |
| Junior High Schools | 17 |
| Early Childhood Education | 16 |
| Primary Education | 15 |
| Intermediate Grades | 14 |
| Grade 5 | 13 |
| Grade 7 | 13 |
| Higher Education | 13 |
| Grade 4 | 12 |
| More ▼ | |
Location
| Nebraska | 7 |
| New York | 6 |
| Canada | 5 |
| Florida | 5 |
| Turkey | 4 |
| California | 3 |
| New Mexico | 2 |
| United States | 2 |
| Algeria | 1 |
| Australia | 1 |
| Brazil | 1 |
| More ▼ | |
Laws, Policies, & Programs
| Individuals with Disabilities… | 3 |
| Education Consolidation… | 1 |
| Elementary and Secondary… | 1 |
| No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Bolton, Tiffany; Stevenson, Brittney; Janes, William – Journal of Occupational Therapy, Schools & Early Intervention, 2023
Researchers utilized a cross-sectional secondary analysis of data within an ongoing non-randomized controlled trial study design to establish the reliability and internal consistency of a novel handwriting assessment for preschoolers, the Just Write! (JW), written by the authors. Seventy-eight children from an area preschool participated in the…
Descriptors: Handwriting, Writing Skills, Writing Evaluation, Preschool Children
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Güntay Tasçi – Science Insights Education Frontiers, 2024
The present study has aimed to develop and validate a protein concept inventory (PCI) consisting of 25 multiple-choice (MC) questions to assess students' understanding of protein, which is a fundamental concept across different biology disciplines. The development process of the PCI involved a literature review to identify protein-related content,…
Descriptors: Science Instruction, Science Tests, Multiple Choice Tests, Biology
Johnson, Jennifer M.; Meisinger, Elizabeth B.; Robinson, Melissa F. – Journal of Psychoeducational Assessment, 2019
This review focuses on the Feifer Assessment of Reading (FAR) produced by S. G. Feifer and R. G. Nader in 2015. The FAR is a comprehensive reading test that is individually administered to children and adults aged 4 to 21 years. The structure of the FAR is based on a gradient model of brain functioning (Goldberg, 1990; Luria, 1980) and reflects a…
Descriptors: Reading Tests, Scoring, Test Construction, Test Norms
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Er, Zübeyde; Dinç Artut, Perihan; Bal, Ayten Pinar – Pegem Journal of Education and Instruction, 2023
The aim of this research is to develop a reliable and valid scale to determine the mathematical thinking skills of gifted students. In addition, with the developed scale, thinking skills of gifted students was examined in terms of various variables. In this context, the research was carried out on two different study groups. The first stage of…
Descriptors: Measures (Individuals), Rating Scales, Test Construction, Construct Validity
Attali, Yigal – Educational Measurement: Issues and Practice, 2019
Rater training is an important part of developing and conducting large-scale constructed-response assessments. As part of this process, candidate raters have to pass a certification test to confirm that they are able to score consistently and accurately before they begin scoring operationally. Moreover, many assessment programs require raters to…
Descriptors: Evaluators, Certification, High Stakes Tests, Scoring
Benton, Tom – Cambridge Assessment, 2018
One of the questions with the longest history in educational assessment is whether it is possible to increase the reliability of a test simply by altering the way in which scores on individual test items are combined to make the overall test score. Most usually, the score available on each item is communicated to the candidate within a question…
Descriptors: Test Items, Scoring, Predictive Validity, Test Reliability
Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019
Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly, all standardized achievement tests make use of MC test items and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…
Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)
Safak, Pinar; Cakmak, Salih; Karakoc, Tamer; Aydin O'Dwyer, Pinar – European Journal of Educational Research, 2021
This study aimed to develop a valid and reliable instrument that measures the functional vision of students with low vision. Thus, an assessment tool and performance activities were developed for three vision skill groups (near vision skills, distance vision skills, and visual field) that include functional vision skills. The universe was 1485…
Descriptors: Foreign Countries, Vision Tests, Diagnostic Tests, Vision
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring

Peer reviewed
Direct link
