Zaporozhets, Olga; Fox, Christine M.; Beltyukova, Svetlana A.; Laux, John M.; Piazza, Nick J.; Salyers, Kathleen – Measurement and Evaluation in Counseling and Development, 2015
This study aimed to develop a linear measure of change using University of Rhode Island Change Assessment items that represented Prochaska and DiClemente's theory. The resulting Toledo Measure of Change is short, is easy to use, and provides reliable scores for identification of individuals' stage of change and progression within that stage.
Descriptors: Item Response Theory, Change, Measures (Individuals), Test Construction
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
James, Syretta R.; Liu, Shihching Jessica; Maina, Nyambura; Wade, Julie; Wang, Helen; Wilson, Heather; Wolanin, Natalie – Montgomery County Public Schools, 2021
The impact of the COVID-19 pandemic continues to overwhelm the functioning and outcomes of educational systems throughout the nation. The public education system is under particular scrutiny given that students, families, and educators are under considerable stress to maintain academic progress. Since the beginning of the crisis, school systems…
Descriptors: Achievement Tests, COVID-19, Pandemics, Public Schools
Sengul Avsar, Asiye; Tavsancil, Ezel – Educational Sciences: Theory and Practice, 2017
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three sample sizes (100, 250 and 500)--were generated by conducting 20…
Descriptors: Test Items, Psychometrics, Nonparametric Statistics, Item Response Theory
Anthony, Christopher James; DiPerna, James Clyde – School Psychology Quarterly, 2017
The Academic Competence Evaluation Scales-Teacher Form (ACES-TF; DiPerna & Elliott, 2000) was developed to measure student academic skills and enablers (interpersonal skills, engagement, motivation, and study skills). Although ACES-TF scores have demonstrated psychometric adequacy, the length of the measure may be prohibitive for certain…
Descriptors: Test Items, Efficiency, Item Response Theory, Test Length
Doskey, Elena M.; Lagunas, Brenda; SooHoo, Michelle; Lomax, Amanda; Bullick, Stephanie – Journal of Psychoeducational Assessment, 2013
The Speed DIAL-4 was developed from the Developmental Indicators for the Assessment of Learning, Fourth Edition (DIAL-4), a screening designed to identify children between the ages of 2 years, 6 months through 5 years, 11 months "who are in need of intervention or diagnostic assessment in the following areas: motor, concepts, language,…
Descriptors: Screening Tests, Young Children, Test Length, Scoring
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2013
To efficiently assess multiple psychological constructs and to minimize the burden on respondents, psychologists increasingly use shortened versions of existing tests. However, compared to the longer test, a shorter test version may have a substantial impact on the reliability and the validity of the test scores in psychological research and…
Descriptors: Test Length, Psychological Testing, Test Use, Test Validity
Sabatini, J.; O'Reilly, T.; Halderman, L.; Bruce, K. – Grantee Submission, 2014
Existing reading comprehension assessments have been criticized by researchers, educators, and policy makers, especially regarding their coverage, utility, and authenticity. The purpose of the current study was to evaluate a new assessment of reading comprehension that was designed to broaden the construct of reading. In light of these issues, we…
Descriptors: Reading Comprehension, Vignettes, Reading Tests, Elementary School Students
Kinyua, Kiragu; Okunya, Luke Odiemo – African Educational Research Journal, 2014
This study was carried out to establish the factors influencing the validity and reliability of teacher-made tests in Kenya. It was conducted in Nyahururu District of Laikipia County in Kenya. The study involved 42 teachers and 15 key informants selected from teachers holding various positions of academic responsibility in their schools in…
Descriptors: Tests, Test Validity, Test Reliability, Physics
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Dikici, Ayhan; Soh, Kaycheng – Online Submission, 2015
Many measurement tools on creativity are available in the literature. One of these scales is the Creativity Fostering Teacher Behaviour Index (CFTIndex), originally developed for Singaporean teachers. It was then translated into Turkish and trialled on teachers in Nigde province with acceptable reliability and factorial validity. The main purpose of…
Descriptors: Creativity, Teacher Behavior, Comparative Analysis, Turkish
Yang, Sophie Xin; Jowett, Sophia – Measurement in Physical Education and Exercise Science, 2013
The Coach-Athlete Relationship Questionnaire was developed to effectively measure affective, cognitive, and behavioral aspects, represented by the interpersonal constructs of closeness, commitment, and complementarity, of the quality of the relationship within the context of sport coaching. The current study sought to determine the internal…
Descriptors: Foreign Countries, Athletes, Athletic Coaches, Interpersonal Relationship
Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models
Hacker, Jason; Carr, Andrea; Abrams, Matthew; Brown, Steven D. – Journal of Career Assessment, 2013
Prior research using a 167-item measure of career indecision (Career Indecision Profile-167 [CIP-167]) has suggested that career choice difficulties may be associated with four major sources of career indecision: neuroticism/negative affectivity, choice/commitment anxiety, lack of readiness, and interpersonal conflicts. The purpose of this study…
Descriptors: Career Choice, Decision Making, Measures (Individuals), Factor Structure
Yao, Lihua – Psychometrika, 2012
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…
Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing