Jafarpur, A. – System, 1999 (peer reviewed)
Examines whether a defect of the C-test can be avoided by constructing a C-test with five texts and 126 items. The test was tried with 146 Iranian English majors. On the basis of item analysis, a tailored C-test with 100 items was developed and tried with 60 other subjects. Results show no gains were made with the classical item analysis.…
Descriptors: College Students, English (Second Language), Higher Education, Item Analysis
Holahan, John M.; Saunders, T. Clark – Bulletin of the Council for Research in Music Education, 1997 (peer reviewed)
Investigates two problems: (1) do learning effects accrue in accuracy or response time when computerized tests are administered in two sessions? and (2) what are the effects of tonal pattern order and contour types on average item difficulty and length of response time for children with different levels of achievement? (DSK)
Descriptors: Auditory Perception, Children, Cognitive Processes, Computer Assisted Testing
Impara, James C.; Plake, Barbara S. – Journal of Educational Measurement, 1998 (peer reviewed)
Sixth-grade teachers (n=26) estimated item performance for their students (724 total students) on a 50-item district-wide science test. Teachers were more accurate in estimating performance of the total group than of the borderline group, but in neither case was their accuracy high. Estimating proportion-correct values using the Angoff standard…
Descriptors: Difficulty Level, Elementary School Teachers, Grade 6, Intermediate Grades
Schwarz, Richard D. – Applied Measurement in Education, 1998 (peer reviewed)
Referral, placement, and retention decisions were analyzed using item response theory (IRT) to study whether classification decisions could be placed on the latent continuum of ability normally associated with test items and to study the existence of classification differential item functioning. Results with 352 kindergarten children demonstrate…
Descriptors: Ability, Classification, Decision Making, Grade Repetition
Bonbright, Jane M.; McGreevy-Nichols, Susan – Arts Education Policy Review, 1999 (peer reviewed)
Reports on the data gleaned from the survey on dance education administered simultaneously with the 1997 National Assessment of Educational Progress (NAEP) arts assessments. Presents the process and problems of developing and implementing assessments in dance. Considers the value of the assessments to dance and offers recommendations for the…
Descriptors: Advocacy, Art Education, Dance Education, Educational Testing
Fan, Xitao; And Others – Educational and Psychological Measurement, 1996 (peer reviewed)
Applying 2 different models of test construction to test results for a pool of more than 190,000 high school students found no systematic bias against groups with smaller or no representation in the test standardization sample. These results support the integrity of widely used sampling and item selection procedures. (SLD)
Descriptors: Culture Fair Tests, Ethnic Groups, High School Students, High Schools
Snetzler, Suzi; Qualls, Audrey L. – Educational and Psychological Measurement, 2000 (peer reviewed)
Examined the incidence of differential item functioning (DIF) on 3 subtests of the Iowa Tests of Basic Skills using test scores for 2,867 Alaskan students, characterized as "Native" or White at fourth and sixth grades or sixth and eighth grades. Effect size differences favoring whites were larger when students of equal English…
Descriptors: Achievement Tests, Alaska Natives, Item Bias, Limited English Speaking
Garden, Robert A. – Studies in Educational Evaluation, 1999 (peer reviewed)
Describes the development of the performance assessment tasks of the Third International Mathematics and Science Study. The challenge was to produce tasks that would measure the achievement of curricular objectives while being sufficiently reliable to allow comparisons between countries and of groups within countries. (SLD)
Descriptors: Comparative Analysis, Elementary Secondary Education, Foreign Countries, International Education
Bennett, Randy Elliot; Morley, Mary; Quardt, Dennis; Rock, Donald A.; Singley, Mark K.; Katz, Irvin R.; Nhouyvanisvong, Adisack – Journal of Educational Measurement, 1999 (peer reviewed)
Evaluated a computer-delivered response type for measuring quantitative skill, the "Generating Examples" (GE) response type, which presents under-determined problems that can have many right answers. Results from 257 graduate students and applicants indicate that GE scores are reasonably reliable, but only moderately related to Graduate…
Descriptors: College Applicants, Computer Assisted Testing, Graduate Students, Graduate Study
Niemi, Richard G.; Sanders, Mitchell S.; Whittington, Dale – Theory and Research in Social Education, 2005
We make five over-time comparisons of student knowledge of civics and government: a) knowledge among 4th, 8th, and 11/12th graders between 1975/6 and 1998 using two separate NAEP trend assessments; b) knowledge over the same period by comparing responses to individual items asked in the 1975/6 and 1981/2 assessments to responses on identical…
Descriptors: Secondary School Students, Civics, Knowledge Level, Social Studies
Chen, Shu-Ying; Ankenman, Robert D. – Journal of Educational Measurement, 2004
The purpose of this study was to compare the effects of four item selection rules--(1) Fisher information (F), (2) Fisher information with a posterior distribution (FP), (3) Kullback-Leibler information with a posterior distribution (KP), and (4) completely randomized item selection (RN)--with respect to the precision of trait estimation and the…
Descriptors: Test Length, Adaptive Testing, Computer Assisted Testing, Test Selection
Tomkowicz, Joanna; Rogers, W. Todd – Alberta Journal of Educational Research, 2005
Ability estimates yielded by the one- (1PL), two- (2PL), and three-parameter (3PL) models and the nominal response model (NRM) were compared with the number-right (NR) scoring model using items not susceptible to test-wiseness (NTW) and items susceptible to the ID1 test-wiseness strategy. These items were contained in grade 12 diploma examinations…
Descriptors: Scoring, Social Studies, Grade 12, Chemistry
Vigneau, Francois; Bors, Douglas A. – Educational and Psychological Measurement, 2005
The problem of dimensionality with respect to Raven's Advanced Progressive Matrices (APM) specifically and, more generally, "g" or fluid intelligence, has been a long-standing issue. The present article reports two studies examining the dimensionality of both the original Set II of the APM (n = 506) and a short form (n = 644), using principal…
Descriptors: Context Effect, Item Response Theory, Intelligence Tests, Test Items
Ercikan, Kadriye; Gierl, Mark J.; McCreith, Tanya; Puhan, Gautam; Koh, Kim – Applied Measurement in Education, 2004
This research examined the degree of comparability and sources of incomparability of English and French versions of reading, mathematics, and science tests that were administered as part of a survey of achievement in Canada. The results point to substantial psychometric differences between the 2 language versions. Approximately 18% to 36% of the…
Descriptors: Foreign Countries, Psychometrics, Science Tests, French
van der Linden, Wim J.; Veldkamp, Bernard P.; Carlson, James E. – Applied Psychological Measurement, 2004
A popular design in large-scale educational assessments as well as any other type of survey is the balanced incomplete block design. The design is based on an item pool split into a set of blocks of items that are assigned to sets of "assessment booklets." This article shows how the problem of calculating an optimal balanced incomplete block…
Descriptors: Grade 8, National Competency Tests, Item Banks, Research Design

