Showing 5,791 to 5,805 of 9,530 results
Peer reviewed
Clauser, Brian; And Others – Journal of Educational Measurement, 1994
The effect of reducing the number of score groups in the matching criterion of the Mantel-Haenszel procedure when screening for differential item functioning was investigated with a simulated data set. Results suggest that more than modest reductions cannot be recommended when ability distributions of reference and focal groups differ. (SLD)
Descriptors: Ability, Experimental Groups, Item Bias, Reference Groups
Peer reviewed
Huynh, Huynh – Psychometrika, 1994
Given a Masters partial credit item with n known step difficulties, conditions are stated for the existence of a set of (locally) independent Rasch binary items such that their raw score and the partial credit raw score have identical probability density functions. (Author/SLD)
Descriptors: Equations (Mathematics), Item Response Theory, Performance Based Assessment, Probability
Peer reviewed
Stocking, Martha L.; And Others – Applied Psychological Measurement, 1993
A method of automatically selecting items for inclusion in a test with constraints on item content and statistical properties was applied to real data. Tests constructed manually from the same data and constraints were compared to tests constructed automatically. Results show areas in which automated assembly can improve test construction. (SLD)
Descriptors: Algorithms, Automation, Comparative Testing, Computer Assisted Testing
Peer reviewed
Hampton, David R.; And Others – Journal of Education for Business, 1993
Four management and four marketing professors classified multiple-choice questions in four widely adopted introductory textbooks according to the two levels of Bloom's taxonomy of educational objectives: knowledge and intellectual ability and skill. Inaccuracies may cause instructors to select questions that require less thinking than they intend.…
Descriptors: Administrator Education, Case Studies, Higher Education, Marketing
Peer reviewed
Brown, James Dean – Language Testing, 1999
Explored the relative contributions to Test of English as a Foreign Language (TOEFL) score dependability of various numbers of persons, items, subtests, languages, and their various interactions. Sampled 15,000 test takers, 1,000 each from 15 different language backgrounds. (Author/VWL)
Descriptors: English (Second Language), Language Tests, Second Language Learning, Student Characteristics
Peer reviewed
Goodwin, Laura D. – Applied Measurement in Education, 1999
The relations between Angoff ratings (minimum passing levels) and the actual "p" values for borderline examinees were studied with 115 examinees taking the Certified Financial Planner examination. Findings do not suggest that the Angoff judges' task is nearly impossible, but they do suggest the need to improve standard-setting…
Descriptors: Cutting Scores, Difficulty Level, Judges, Licensing Examinations (Professions)
Peer reviewed
Alagumalai, Sivakumar; Keeves, John P. – Journal of Outcome Measurement, 1999
How distractors in a test item function differentially is discussed. Also discussed are methods to identify distractor bias, including the Pearson chi-square, likelihood-ratio chi-square, and Neyman weighted least-squares chi-square tests. Problems from a physics test illustrate possible causes of distractor bias. (SLD)
Descriptors: Chi Square, Distractors (Tests), Identification, Item Bias
Peer reviewed
Taylor, Catherine S. – Educational Assessment, 1998
Investigated item-by-item, holistic, and "trait" scoring with three mathematics performance-based tasks completed by 53 to 79 junior high and senior high school students per task. Results suggest that holistic scoring and item-by-item scoring methods provide similar information, but that trait scoring tapped into different aspects of…
Descriptors: High Schools, Holistic Approach, Junior High Schools, Mathematics Tests
Peer reviewed
Roberts, James S.; Laughlin, James E.; Wedell, Douglas H. – Educational and Psychological Measurement, 1999
Highlights the theoretical differences between the approaches of R. Likert (1932) and L. Thurstone (1928) to attitude measurement. Uses real and simulated data on attitudes toward abortion to illustrate that attitude researchers should pay more attention to the empirical-response characteristics of items on a Likert attitude questionnaire. (SLD)
Descriptors: Abortions, Attitude Measures, Attitudes, Likert Scales
Peer reviewed
Stone, Clement A. – Journal of Educational Measurement, 2000
Describes a goodness-of-fit statistic that considers the imprecision with which ability is estimated and involves constructing item fit tables based on each examinee's posterior distribution of ability, given the likelihood of the response pattern and an assumed marginal ability distribution. Also describes a Monte Carlo resampling procedure to…
Descriptors: Goodness of Fit, Item Response Theory, Mathematical Models, Monte Carlo Methods
Peer reviewed
Cizek, Gregory J.; Robinson, K. Lynne; O'Day, Denis M. – Educational and Psychological Measurement, 1998
The effect of removing nonfunctioning options from multiple-choice tests was studied by examining change in difficulty, discrimination, and dimensionality. Results provide additional support for the benefits of eliminating nonfunctioning options, such as enhanced score reliability, reduced testing time, potential for broader domain sampling, and…
Descriptors: Difficulty Level, Multiple Choice Tests, Sampling, Scores
Peer reviewed
Norcini, John; Grosso, Lou – Applied Measurement in Education, 1998
Ratings of test item relevance were collected from 57 practitioners from a pretest of a medical certifying examination. Ratings were correlated with item difficulty, but the relationship between ratings and item discrimination was less clear. Application of generalizability theory shows that reasonable estimates of item, stem, and total test…
Descriptors: Certification, Difficulty Level, Estimation (Mathematics), Generalizability Theory
Peer reviewed
Murphy, Colette; Beggs, Jim; Hickey, Ivor; O'Meara, Jim; Sweeney, John – Educational Research, 2001
British students who had compulsory science in the National Curriculum from ages 11-16 (n=115) had significantly higher scores on a science test than those for whom secondary science had been optional (n=30). Almost all had very low scores on questions related to the circulatory system and sound and light, regardless of their science background.…
Descriptors: British National Curriculum, Foreign Countries, Required Courses, Science Education
Peer reviewed
Dockrell, Julie E.; Messer, David; George, Rachel – Language and Cognitive Processes, 2001
Studied children with word finding difficulties who were identified through a wider survey of educational provision for those with language and communication difficulties. (Author/VWL)
Descriptors: Cognitive Processes, Comparative Analysis, Error Analysis (Language), Foreign Countries
Peer reviewed
Wainer, Howard; Thissen, David – Review of Educational Research, 1994
This article summarizes results from tests that have allowed examinee choice of test items. It paints a bleak psychometric picture for the use of examinee choice within fair tests. Choice is anathema to standardized testing unless the aspects that characterize the test are irrelevant to what is being tested. (SLD)
Descriptors: Adaptive Testing, Educational Assessment, Elementary Secondary Education, Equal Education