Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 15 |
Author
| White, Edward M. | 6 |
| Ferrara, Steven F. | 2 |
| Herman, Joan L. | 2 |
| Tindal, Gerald | 2 |
| Yen, Shu Jing | 2 |
| Allen, Nancy L. | 1 |
| Almond, Patricia | 1 |
| Alonzo, Julie | 1 |
| Alper Gülay | 1 |
| Alston, Herbert L. | 1 |
| Anderson, Daniel | 1 |
Publication Type
| Reports - Research | 93 |
| Journal Articles | 26 |
| Speeches/Meeting Papers | 23 |
| Numerical/Quantitative Data | 9 |
| Reports - Descriptive | 9 |
| Reports - Evaluative | 3 |
| Tests/Questionnaires | 3 |
Education Level
| Elementary Secondary Education | 5 |
| Elementary Education | 4 |
| Higher Education | 4 |
| Grade 4 | 3 |
| Postsecondary Education | 3 |
| Secondary Education | 3 |
| Grade 5 | 2 |
| Grade 6 | 2 |
| Middle Schools | 2 |
| Adult Education | 1 |
| Grade 12 | 1 |
Audience
| Researchers | 2 |
| Policymakers | 1 |
| Teachers | 1 |
Location
| California | 8 |
| Canada | 4 |
| Georgia | 4 |
| New York | 3 |
| Florida | 2 |
| Louisiana | 2 |
| Maine | 2 |
| North Carolina | 2 |
| Pennsylvania | 2 |
| South Carolina | 2 |
| United States | 2 |
Laws, Policies, & Programs
| Education Consolidation… | 1 |
| Elementary and Secondary… | 1 |
| Elementary and Secondary… | 1 |
| No Child Left Behind Act 2001 | 1 |
Alper Gülay; Emre Cumali; Damla Cumali – International Journal of Contemporary Educational Research, 2024
This qualitative phenomenological study explores the experiences of parents of children with special needs in Turkey, specifically their encounters with Guidance and Research Centers (GRCs) during the process of obtaining educational assessment reports. Through semi-structured interviews with 25 parents, the study reveals complex emotions and…
Descriptors: Foreign Countries, Special Needs Students, Parent Attitudes, Parent Participation
Atteberry, Allison; Mangan, Daniel – Educational Researcher, 2020
Papay (2011) noticed that teacher value-added measures (VAMs) from a statistical model using the most common pre/post testing timeframe--current-year spring relative to previous spring (SS)--are essentially unrelated to those same teachers' VAMs when instead using next-fall relative to current-fall (FF). This is concerning since this choice--made…
Descriptors: Correlation, Value Added Models, Pretests Posttests, Decision Making
Rutkowski, David; Rutkowski, Leslie; Plucker, Jonathan A. – Phi Delta Kappan, 2015
The OECD and its U.S. administrator, McGraw-Hill Education CTB, have recently concluded the first cycle of the OECD-Test for Schools in the U.S. This test is being marketed to local schools and is designed to compare 15-year-olds from individual participating schools against peers nationally and internationally using the OECD's PISA test as its…
Descriptors: Participation, International Education, Comparative Testing, Comparative Education
McBee, Matthew T.; Peters, Scott J.; Waterman, Craig – Gifted Child Quarterly, 2014
Best practice in gifted and talented identification procedures involves making decisions on the basis of multiple measures. However, very little research has investigated the impact of different methods of combining multiple measures. This article examines the consequences of the conjunctive ("and"), disjunctive/complementary…
Descriptors: Best Practices, Ability Identification, Academically Gifted, Correlation
Guo, Hongwen; Liu, Jinghua; Dorans, Neil; Feigenbaum, Miriam – ETS Research Report Series, 2011
Maintaining score stability is crucial for an ongoing testing program that administers several tests per year over many years. One way to stall the drift of the score scale is to use an equating design with multiple links. In this study, we use the operational and experimental SAT® data collected from 44 administrations to investigate the effect…
Descriptors: Equated Scores, College Entrance Examinations, Reliability, Testing Programs
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Chen, Xinnian; Graesser, Donnasue; Sah, Megha – Advances in Physiology Education, 2015
Laboratory courses serve as important gateways to science, technology, engineering, and mathematics education. One of the challenges in assessing laboratory learning is to conduct meaningful and standardized practical exams, especially for large multisection laboratory courses. Laboratory practical exams in life sciences courses are frequently…
Descriptors: Laboratory Experiments, Standardized Tests, Testing Programs, Testing Problems
Qi, Sen; Mitchell, Ross E. – Journal of Deaf Studies and Deaf Education, 2012
The first large-scale, nationwide academic achievement testing program using Stanford Achievement Test (Stanford) for deaf and hard-of-hearing children in the United States started in 1969. Over the past three decades, the Stanford has served as a benchmark in the field of deaf education for assessing student academic achievement. However, the…
Descriptors: Testing Programs, Educational Testing, Deafness, Academic Achievement
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Huang, Jinyan – TESOL Journal, 2011
Using generalizability theory, this study examined both the rating variability and reliability of English as a second language (ESL) students' writing in two provincial examinations in Canada. This article discusses expected and unexpected similarities and differences related to rating variability and reliability between the two testing programs.…
Descriptors: Foreign Countries, Generalizability Theory, Test Reliability, Testing Programs
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Unger, Darian – American Journal of Business Education, 2010
Although there is significant research on improving college-level teaching practices, most literature in the field assumes an incentive for improvement. The research presented in this paper addresses the issue of poor incentives for improving university-level teaching. Specifically, it proposes instructor-designed common examinations as an…
Descriptors: Educational Innovation, Educational Improvement, Instructional Improvement, Business Administration Education
Jones, Terry; Cason, Carolyn L.; Mancini, Mary E. – Journal of Professional Nursing, 2002 (peer reviewed)
Registered nurses (n=368) participated in a skills recredentialing program in which competencies were assessed by a knowledge test and performance test under simulated conditions and evaluator ratings in actual patient-care situations. No significant differences in results between the simulated and actual conditions support the validity of the…
Descriptors: Competence, Credentials, Interrater Reliability, Nurses
Hollenbeck, Keith; Tindal, Gerald; Almond, Patricia – Educational Assessment, 1999 (peer reviewed)
Studied the amount of measurement error in a state's performance-based writing task as it relates to high-stakes decision reproducibility. Using 175 eighth-grade writing samples, the study finds moderate correlations between the two raters' scores, with significant differences for the rates for the handwritten, but not the typed, essays. (SLD)
Descriptors: Decision Making, Error of Measurement, Essay Tests, Grade 8
Walter, Richard A.; Kapes, Jerome T. – Journal of Industrial Teacher Education, 2003 (peer reviewed)
To identify a procedure for establishing cut scores for National Occupational Competency Testing Institute examinations in Pennsylvania, an expert panel assessed written and performance test items for minimally competent workers. Recommendations about the number, type, and training of judges used were made. (Contains 18 references.) (SK)
Descriptors: Cutting Scores, Interrater Reliability, Occupational Tests, Teacher Competency Testing

