John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016
Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…
Descriptors: Evaluation Methods, Test Construction, Design, Scaling
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…
Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis
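The weighted-average idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `synthetic_link`, the weight symbol `w`, the convention that `w` applies to the traditional equating function (with `1 - w` on the identity), and the linear equating coefficients are all assumptions introduced here.

```python
from typing import Callable


def synthetic_link(x: float, equate: Callable[[float], float], w: float) -> float:
    """Blend a traditional equating function with the identity function.

    Assumed convention: w weights the equating function, (1 - w) weights
    the identity, so w = 0 returns the raw score unchanged.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * equate(x) + (1.0 - w) * x


def linear_equate(x: float) -> float:
    """Hypothetical linear equating function; coefficients are made up."""
    return 1.05 * x + 0.8


# Link a few raw scores with equal weight on both components.
raw_scores = [10.0, 20.0, 30.0]
linked_scores = [synthetic_link(x, linear_equate, w=0.5) for x in raw_scores]
```

With `w` near 0 the result stays close to the raw score, which is the behavior one would want when forms are believed to be nearly parallel and the sample is too small to trust the estimated equating function.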
Bridgeman, Brent; Burton, Nancy; Cline, Frederick – Applied Measurement in Education, 2009
Descriptions of validity results based solely on correlation coefficients or percent of the variance accounted for are not merely difficult to interpret, they are likely to be misinterpreted. Predictors that apparently account for a small percent of the variance may actually be highly important from a practical perspective. This study combined two…
Descriptors: Predictive Validity, College Entrance Examinations, Graduate Study, Grade Point Average
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
Kane, Michael – Applied Measurement in Education, 1997 (peer reviewed)
Licensure and certification decisions are usually based on a chain of inference from results of a practice analysis to test specifications, the test, examinee performance, and a pass-fail decision. This article focuses on the design of practice analyses and translation of practice analyses results into test specifications. (SLD)
Descriptors: Certification, Data Collection, Experience, Inferences
Harris, Deborah J.; Crouse, Jill D. – Applied Measurement in Education, 1993 (peer reviewed)
Criteria used in the equating process proposed in the literature are reviewed. The discussion begins by examining how equating is defined. The controversy over the best criterion, the utility of some, and whether a criterion is needed at all means that much work needs to be done in this area. (SLD)
Descriptors: Data Collection, Definitions, Equated Scores, Evaluation Criteria
Crocker, Linda – Applied Measurement in Education, 1997 (peer reviewed)
The experience of the National Board for Professional Teaching Standards illustrates how issues of assessing the content representativeness of performance assessment can be addressed to ensure validity for certification procedures. Explores the challenges of collecting validation evidence when expert judgments of content are used. (SLD)
Descriptors: Content Validity, Credentials, Data Collection, Evaluation Methods
Bolt, Sara E.; Ysseldyke, James E. – Applied Measurement in Education, 2006
Although testing accommodations are commonly provided to students with disabilities within large-scale testing programs, research findings on how well accommodations allow for comparable measurement of student knowledge and skill remain inconclusive. The purpose of this study was to examine the extent to which 1 commonly held belief about testing…
Descriptors: Oral Reading, Testing Accommodations, Disabilities, Special Needs Students

