Showing 46 to 60 of 133 results
Peer reviewed
Direct link
Bechger, Timo M.; Maris, Gunter; Hsiao, Ya Ping – Applied Psychological Measurement, 2010
The main purpose of this article is to demonstrate how halo effects may be detected and quantified using two independent ratings of the same person. A practical illustration is given to show how halo effects can be avoided. (Contains 2 tables, 7 figures, and 2 notes.)
Descriptors: Performance Based Assessment, Test Reliability, Test Length, Language Tests
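The two-ratings idea can be sketched informally: if halo inflates a rater's intertrait correlations, comparing a within-rater correlation with the corresponding cross-rater correlation gives a rough halo index. The sketch below is a hedged Python illustration with invented scores; the `pearson` helper, the sample data, and the simple difference index are assumptions, not the model of Bechger, Maris, and Hsiao.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# One rater's scores on two traits for five ratees (invented data):
trait1_r1 = [5, 4, 3, 2, 1]
trait2_r1 = [5, 5, 3, 2, 2]
# An independent second rater's scores on trait 2 for the same ratees:
trait2_r2 = [4, 2, 5, 1, 3]

within = pearson(trait1_r1, trait2_r1)  # same rater: potentially halo-inflated
cross = pearson(trait1_r1, trait2_r2)   # different raters: halo-free baseline
halo_index = within - cross             # rough quantification of halo
```

The logic follows the familiar multitrait-multimethod idea: trait correlations computed within one rater carry the rater's halo, while correlations across independent raters do not, so their difference indexes the halo contribution.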
Peer reviewed
Direct link
Arens, A. Katrin; Yeung, Alexander Seeshing; Craven, Rhonda G.; Hasselhorn, Marcus – International Journal of Research & Method in Education, 2013
This study aims to develop a short German version of the Self Description Questionnaire (SDQ I-GS) in order to present a robust economical instrument for measuring German preadolescents' multidimensional self-concept. A full German version of the SDQ I (SDQ I-G) that maintained the original structure and thus length of the English original SDQ I…
Descriptors: Foreign Countries, Questionnaires, Test Construction, Test Length
Peer reviewed
PDF on ERIC
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2010
This report examines the consequences of differential item functioning (DIF) using simulated data. Its impact on total score, item response theory (IRT) ability estimate, and test reliability was evaluated in various testing scenarios created by manipulating the following four factors: test length, percentage of DIF items per form, sample sizes of…
Descriptors: Test Bias, Item Response Theory, Test Items, Scores
Peer reviewed
Direct link
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
PDF on ERIC
Bulut, Okan; Kan, Adnan – Eurasian Journal of Educational Research, 2012
Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…
Descriptors: Adaptive Testing, Computer Assisted Testing, College Entrance Examinations, Graduate Students
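The adjustment the abstract describes, matching item difficulty to the examinee's current ability estimate, is commonly implemented as maximum-information item selection. Below is a minimal Python sketch under a Rasch (1PL) model; the item bank, function names, and selection rule are illustrative assumptions, not the procedure used in this particular study.

```python
import math

def rasch_prob(theta: float, b: float) -> float:
    """Probability of a correct response given ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta: float, b: float) -> float:
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def next_item(theta: float, difficulties, administered) -> int:
    """Index of the most informative item not yet administered."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))

# A tiny illustrative item bank of Rasch difficulties:
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(next_item(0.0, bank, set()))  # picks the item with difficulty 0.0 (index 2)
```

Under the Rasch model, information peaks when item difficulty equals the ability estimate, so the selector naturally administers items near the examinee's current level, which is the efficiency gain CAT is known for.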
Peer reviewed
Streiner, David L.; Miller, Harold R. – Journal of Clinical Psychology, 1986
Numerous short forms of the Minnesota Multiphasic Personality Inventory have been proposed in the last 15 years. In each case, initial enthusiasm has been replaced by questions about the clinical utility of the abbreviated version. Argues that the statistical properties of the test and the reduced reliability due to shortening the scales…
Descriptors: Test Construction, Test Format, Test Length, Test Reliability
Peer reviewed
Ray, John J. – Journal of Personality Assessment, 1974
The reliability of measures of need for achievement can be improved by increasing the number of items and by using different scoring systems and stimulus materials. (MLP)
Descriptors: Achievement Need, Personality Measures, Projective Measures, Scoring
Peer reviewed
Huynh, Huynh – Psychometrika, 1978
The use of Cohen's kappa index as a measure of the reliability of multiple classifications is developed. Special cases of the index as well as the effects of test length on the index are also explored. (JKS)
Descriptors: Career Development, Classification, Mastery Tests, Test Length
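Cohen's kappa itself is straightforward to compute from two classifications of the same examinees: observed agreement corrected for the agreement expected by chance. A minimal Python sketch follows; the mastery labels are invented for illustration and do not come from Huynh's article.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa: chance-corrected agreement between two classifications."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    pa, pb = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum over categories of the product of marginal proportions.
    expected = sum((pa[c] / n) * (pb[c] / n) for c in set(pa) | set(pb))
    return (observed - expected) / (1.0 - expected)

a = ["master", "master", "nonmaster", "master", "nonmaster", "nonmaster"]
b = ["master", "nonmaster", "nonmaster", "master", "nonmaster", "master"]
print(round(cohens_kappa(a, b), 3))  # 0.333
```

Here observed agreement is 4/6 and chance agreement is 0.5 (both raters split 50/50), giving kappa = (0.667 - 0.5) / (1 - 0.5) = 1/3; lengthening a mastery test tends to sharpen classifications and so raise kappa, which is the test-length effect the article explores.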
Peer reviewed
Cureton, Edward E.; And Others – Educational and Psychological Measurement, 1973
A study based on F. M. Lord's 1957 and 1959 arguments that tests of the same length have the same standard error of measurement. (CB)
Descriptors: Error of Measurement, Statistical Analysis, Test Interpretation, Test Length
Peer reviewed
Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Peer reviewed
Allison, Paul A. – Psychometrika, 1976
A direct proof is given for the generalized Spearman-Brown formula for any real multiple of test length. (Author)
Descriptors: Correlation, Error of Measurement, Raw Scores, Test Length
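The generalized Spearman-Brown formula predicts the reliability of a test whose length is multiplied by any real factor k (not just an integer, matching the "any real multiple" result in the abstract): rho_k = k * rho / (1 + (k - 1) * rho). A one-function Python sketch:

```python
def spearman_brown(reliability: float, k: float) -> float:
    """Predicted reliability when test length is multiplied by a real factor k."""
    return k * reliability / (1.0 + (k - 1.0) * reliability)

# Doubling a test with reliability .70:
print(round(spearman_brown(0.70, 2.0), 3))  # 0.824
# Halving a test with reliability .80 (k may be fractional):
print(round(spearman_brown(0.80, 0.5), 3))  # 0.667
```

The formula also runs in reverse: solving for k tells you how much longer a test must be to reach a target reliability, which is its usual use in test construction.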
Wilcox, Rand R. – 1980
Wilcox (1977) examines two methods of estimating the probability of a false-positive or false-negative decision with a mastery test. Both procedures make assumptions about the form of the true score distribution, which might not give good results in all situations. In this paper, upper and lower bounds on the two possible error types are described…
Descriptors: Cutting Scores, Mastery Tests, Mathematical Models, Student Placement
Peer reviewed
Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978
When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)
Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests
Peer reviewed
Taylor, James B. – Educational and Psychological Measurement, 1977
The reliability and item homogeneity of personality scales are in part dependent on the content domain being sampled, and this characteristic reliability cannot be explained by item ambiguity or scale length. It is suggested that clarity of self concept is also a determinant. (Author/JKS)
Descriptors: Item Analysis, Personality Assessment, Personality Measures, Personality Theories
Peer reviewed
Strommen, Erik F.; Smith, Jeffrey K. – Educational and Psychological Measurement, 1987
The internal consistency of the Goodenough-Harris Draw-A-Person Test was examined using 150 children, aged 5-8. The 72-item full scales showed good internal consistency at all ages, with no sex differences. Administration of a 42-item short form resulted in sex effects and differential internal consistency. (Author/GDC)
Descriptors: Freehand Drawing, Primary Education, Sex Differences, Test Bias