Showing 2,506 to 2,520 of 2,831 results
Ankenmann, Robert D.; Stone, Clement A. – 1992
Effects of test length, sample size, and assumed ability distribution were investigated in a multiple replication Monte Carlo study under the 1-parameter (1P) and 2-parameter (2P) logistic graded model with five score levels. Accuracy and variability of item parameter and ability estimates were examined. Monte Carlo methods were used to evaluate…
Descriptors: Computer Simulation, Estimation (Mathematics), Item Bias, Mathematical Models
Johnson, Colleen Cook – 1993
The purpose of this study is to help define the precise nature and limits of the tolerable range in which a researcher may be relatively confident about the statistical validity of his or her research findings, focusing specifically on the statistical validity of results when violating the assumptions associated with the one-way, fixed-effects…
Descriptors: Analysis of Covariance, Analysis of Variance, Comparative Analysis, Computer Simulation
Schumacker, Randall E.; And Others – 1994
Rasch between and total weighted and unweighted fit statistics were compared using varying test lengths and sample sizes. Two test lengths (20 and 50 items) and three sample sizes (150, 500, and 1,000) were crossed. Each of the six combinations was replicated 100 times. In addition, power comparisons were made. Results indicated that there were no…
Descriptors: Comparative Analysis, Goodness of Fit, Item Response Theory, Power (Statistics)
Boldt, Robert F. – 1986
This study of the validity of the Graduate Record Examinations (GRE) General Test used data from predictive validity studies that were conducted by the GRE Validity Study Service (VSS) in 79 graduate departments. The performance criterion was first-year grades in graduate school. Observed validities were computed, and for each graduate department…
Descriptors: College Entrance Examinations, Departments, Grade Point Average, Graduate Study
Blair, R. Clifford; Higgins, James J. – 1985
Monte Carlo methods were employed to assess the relative power of the paired samples t test and Wilcoxon's signed-ranks test under ten population shapes. Results of the study indicated that: (1) each of the two statistics was more powerful than the other in given situations; (2) the power advantages of the t test under normal theory were small;…
Descriptors: Estimation (Mathematics), Literature Reviews, Measurement Techniques, Monte Carlo Methods
Cason, Gerald J.; Cason, Carolyn L. – 1989
The use of three remedies for errors in the measurement of ability that arise from differences in rater stringency is discussed. Models contrasted are: (1) Conventional; (2) Handicap; and (3) deterministic Rater Response Theory (RRT). General model requirements, power, bias of measures, computing cost, and complexity are contrasted. Contrasts are…
Descriptors: Ability, Achievement Rating, Error of Measurement, Evaluation Methods
Bunch, Michael B.; Littlefair, Wendy – 1988
A total of 2,000 essays written by 1,000 students was submitted to generalizability analyses for domain-referenced tests. Each student had written one essay on each of two prompts representing two models of discourse. Each essay was read by six readers and judged on a scale from 1 to 4. No reader read essays from both prompts. Reader agreement…
Descriptors: Cutting Scores, Essay Tests, Generalizability Theory, Interrater Reliability
Reckase, Mark D. – 1978
Five comparisons were made relative to the quality of estimates of ability parameters and item calibrations obtained from the one-parameter and three-parameter logistic models. The results indicate: (1) The three-parameter model fit the test data better in all cases than did the one-parameter model. For simulation data sets, multi-factor data were…
Descriptors: Comparative Analysis, Goodness of Fit, Item Analysis, Mathematical Models
PDF pending restoration
Estes, Carole; Estes, Gary D. – 1980
Multiple matrix sampling is a sampling design in which both test items and examinees are randomly sampled from their respective populations. This study was designed to develop and assess a method for computing an estimate of a correlation coefficient when a multiple matrix sampling design is used. The examinee populations included 212 third-grade…
Descriptors: Correlation, Elementary Secondary Education, Evaluation Methods, Grade 3
Saunders, Joseph C.; Huynh, Huynh – 1980
In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests
Koffler, Stephen L. – 1976
The power of the classical Linear Discriminant Function (LDF) is compared, using Monte Carlo techniques, with five other procedures for classifying observations from certain non-normal distributions. The alternative procedures considered are the Quadratic Discriminant Function, a Nearest Neighbor Procedure with Probability Blocks, and three density…
Descriptors: Behavioral Science Research, Classification, Comparative Analysis, Discriminant Analysis
Peer reviewed
Schweinhart, Lawrence J.; And Others – Early Childhood Research Quarterly, 1986
Responds to commentaries by Bereiter and Gersten concerning findings of the High/Scope Educational Research Foundation's 15-year Preschool Curriculum Comparison Study. Addresses issues of design of study, experimenter bias, sample size, self-report validity, interpretation of findings, and related topics. (NH)
Descriptors: Adolescents, Delinquency, Preschool Children, Preschool Curriculum
Peer reviewed
McBean, Edward A.; Lennox, William C. – Higher Education, 1985
The influences of class size and the number of students completing surveys on faculty and course ratings were studied. For classes of 30 or more, a 50 percent response rate gives an acceptable indication of rating, while for a class of fewer than 30, about an 80 percent return is needed. (Author/SW)
Descriptors: Class Size, College Students, Course Evaluation, Faculty Evaluation
Boldt, R. F. – 1994
The comparison of item response theory models for the Test of English as a Foreign Language (TOEFL) was extended to an equating context as simulation trials were used to "equate the test to itself." Equating sample data were generated from administration of identical item sets. Equatings that used procedures based on each model (simple…
Descriptors: Comparative Analysis, Cutting Scores, English (Second Language), Equated Scores
Peer reviewed
Cattell, Raymond B.; Cattell, Heather E. P. – Educational and Psychological Measurement, 1995
The development of the new fifth edition of the Sixteen Personality Factor Questionnaire (16PF) is described. The factor structure of the new 16PF was explored with 4 samples ranging from 646 to 3,498 subjects. Results support the validity of the 16PF factor structure and its continuity with earlier versions. (SLD)
Descriptors: Adults, Factor Analysis, Factor Structure, Personality Assessment