Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021
Background/Context: Pass/fail standardized exams frequently selectively rescore failing exams and retest failing examinees. This practice distorts the test score distribution and can confuse those who do analysis on these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regent test score…
Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
Talan, Teri N.; Bloom, Paula Jorde – Teachers College Press, 2018
The "Business Administration Scale for Family Child Care" (BAS) is the first valid and reliable tool for measuring and improving the overall quality of business and professional practices in family child care settings. It is applicable for multiple uses, including program self-improvement, technical assistance and monitoring, training,…
Descriptors: Business Administration, Child Care, Rating Scales, Qualifications
Fan, Xitao; Sun, Shaojing – Journal of Early Adolescence, 2014
In adolescence research, the treatment of measurement reliability is often fragmented, and it is not always clear how different reliability coefficients are related. We show that generalizability theory (G-theory) is a comprehensive framework of measurement reliability, encompassing all other reliability methods (e.g., Pearson "r,"…
Descriptors: Generalizability Theory, Measurement, Reliability, Correlation
Crawford, John R.; Garthwaite, Paul H.; Morrice, Nicola; Duff, Kevin – Psychological Assessment, 2012
Supplementary methods for the analysis of the Repeatable Battery for the Assessment of Neuropsychological Status are made available, including (a) quantifying the number of abnormally low Index scores and abnormally large differences exhibited by a case and accompanying this with estimates of the percentages of the normative population expected to…
Descriptors: Neurological Impairments, Cognitive Tests, Psychological Testing, Adults
Geisinger, Kurt F. – International Journal of Testing, 2012
This article sets the stage for the description of a variety of approaches to test reviewing worldwide. It describes the importance of test reviewing as a protection of the public and of society and also the benefits of this activity for test users, who must choose measures to use in particular situations with particular clients at a particular…
Descriptors: Test Reviews, Evaluation Methods, Evaluation Criteria, Global Approach
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Tanner, John R. – School Administrator, 2011
State test scores administered for accountability purposes are regularly used to adjust instruction in nuanced ways. This is no accident--No Child Left Behind demanded that students' scores be returned quickly to teachers in order that this might be the case, and the idea of data-driven decision making continues as one way the promise of education…
Descriptors: Federal Legislation, Standardized Tests, Educational Change, Decision Making
Bradbury, Alice – Journal of Education Policy, 2011
Despite decades of research and debate, the issue of unequal outcomes continues to be a concern in educational systems worldwide. In England, published data relating to pupils' attainment across ethnic groups and by class indicators has been used to demonstrate continued inequalities in schools. This article attempts to deconstruct the…
Descriptors: Ethnic Groups, Urban Areas, Foreign Countries, Educational Policy
Peer reviewed: McKee, Lynne M.; Levinson, Edward M. – Career Development Quarterly, 1990
Discusses general issues and concerns relative to the adaptation of paper-pencil assessment instruments to computerized formats. Describes and evaluates Self-Directed Search computerized version (SDS-CV). Presents strengths and weaknesses of the SDS-CV and makes recommendations for its use. (Author/ABL)
Descriptors: Career Counseling, Computer Oriented Programs, Evaluation Methods, Reliability
Peer reviewed: Brown, Jonathan R. – Language, Speech, and Hearing Services in Schools, 1989
The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
Descriptors: Elementary Secondary Education, Error of Measurement, Scores, Standardized Tests
Erwin, T. Dary – 1988
Rating scales are a typical method for evaluating a student's performance in outcomes assessment. The analysis of the quality of information from rating scales poses special measurement problems when researchers work with faculty in their development. Generalizability measurement theory offers a set of techniques for estimating errors or…
Descriptors: Educational Assessment, Generalizability Theory, Higher Education, Institutional Research
Peer reviewed: Imrie, Bradford W. – Assessment and Evaluation in Higher Education, 1982
Evaluation of the final examination should be part of the course evaluation and should include student perceptions of the exam's nature and the questions' quality. The final examination experiences of two groups of undergraduate and graduate students are considered. (MSE)
Descriptors: Course Evaluation, Higher Education, Student Attitudes, Student Evaluation
Angoff, William H. – College Board Review, 1982
Some little-understood facts about standardized test scores and how they are reported and interpreted are explained. In particular, the practice of score equating ensures that different test forms have comparable scoring. This and other practices are designed to enhance the equity of testing. (MSE)
Descriptors: Equal Education, Higher Education, Mathematical Formulas, Rating Scales

