Showing 2,506 to 2,520 of 5,131 results
Hunt, Richard A. – Educ Psychol Meas, 1970
Descriptors: Computer Programs, Item Analysis, Psychological Evaluation, Rating Scales
Koppel, Mark A.; Sechrest, Lee – Educ Psychol Meas, 1970
Descriptors: Correlation, Experimental Groups, Humor, Intelligence
Peer reviewed
Frisbie, David A. – Educational and Psychological Measurement, 1981
The Relative Difficulty Ratio (RDR) was developed as an index of test or item difficulty for use when raw score means or item p-values are not directly comparable because of chance score differences. Computational procedures for the RDR are described. Applications of the RDR at both the test and item level are illustrated. (Author/BW)
Descriptors: Difficulty Level, Item Analysis, Mathematical Formulas, Test Items
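Frisbie's abstract does not give the RDR formula. As a hedged illustration of the underlying idea (adjusting raw difficulty for chance success), the classical guessing correction for an item p-value can be sketched as follows; the function name and the uniform-guessing assumption are illustrative, not Frisbie's actual computation:

```python
def chance_corrected_p(p_raw, num_options):
    """Correct a raw item p-value for guessing, assuming examinees who do
    not know the answer guess uniformly among the response options.
    This is the classical correction, not Frisbie's RDR itself."""
    chance = 1.0 / num_options
    return (p_raw - chance) / (1.0 - chance)
```

For example, a 4-option item with a raw p-value of 0.70 has a chance-corrected difficulty of 0.60, and an item answered correctly at exactly the chance rate corrects to 0.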
Peer reviewed
Jackson, Paul H. – Psychometrika, 1979
Use of the same term "split-half" for division of an n-item test into two subtests containing equal (Cronbach), and possibly unequal (Guttman), numbers of items sometimes leads to a misunderstanding about the relation between Guttman's maximum split-half bound and Cronbach's coefficient alpha. This distinction is clarified. (Author/JKS)
Descriptors: Item Analysis, Mathematical Formulas, Technical Reports, Test Reliability
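The distinction Jackson's note clarifies can be made concrete with the standard formulas. A minimal sketch in Python (the example data and function names are mine): coefficient alpha summarizes item variances, while a split-half coefficient depends on the particular split chosen, and Guttman's bound takes the best such split, which need not divide the items equally.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for an examinees-by-items score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

def guttman_split_half(scores, half_a, half_b):
    """Guttman split-half coefficient for one given split; the two halves
    may contain unequal numbers of items (Guttman's case in the abstract)."""
    a = scores[:, half_a].sum(axis=1)
    b = scores[:, half_b].sum(axis=1)
    total_var = (a + b).var(ddof=1)
    return 2.0 * (1.0 - (a.var(ddof=1) + b.var(ddof=1)) / total_var)
```

On a small 5-examinee, 4-item example this gives alpha = 0.80 and a 2/2 split-half of 0.72; Guttman's maximum split-half bound is the maximum of such coefficients over all admissible splits, so it is never below alpha.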
Peer reviewed
Hills, John R. – Educational Measurement: Issues and Practice, 1989
Test bias detection methods based on item response theory (IRT) are reviewed. Five such methods are commonly used: (1) equality of item parameters; (2) area between item characteristic curves; (3) sums of squares; (4) pseudo-IRT; and (5) one-parameter-IRT. A table compares these and six newer or less tested methods. (SLD)
Descriptors: Item Analysis, Test Bias, Test Items, Testing Programs
Peer reviewed
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2001
Item-discrimination indices are numbers calculated from test data that are used in assessing the effectiveness of individual test questions. This article asserts that the indices are so unreliable as to suggest that countless good questions may have been discarded over the years. It considers how the indices, and hence overall test reliability,…
Descriptors: Guessing (Tests), Item Analysis, Test Reliability, Testing Problems
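The item-discrimination indices Burton critiques are typically computed from upper and lower scoring groups. A hedged sketch of one common such index (the 27% upper-lower D index; the function and data here are illustrative, not from the article):

```python
import numpy as np

def discrimination_index(item, total, frac=0.27):
    """Upper-lower discrimination index D: proportion correct on this item
    in the top-scoring group minus that in the bottom-scoring group.
    frac=0.27 is the conventional upper/lower group fraction."""
    n = len(total)
    k = max(1, int(round(frac * n)))
    order = np.argsort(total, kind="stable")
    lower, upper = order[:k], order[-k:]
    return item[upper].mean() - item[lower].mean()
```

With only a handful of examinees per group, D is computed from very few responses, which is one way to see Burton's point that such indices carry large sampling error on typical class sizes.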
Peer reviewed
van der Linden, Wim J. – Journal of Educational Measurement, 2005
In test assembly, a fundamental difference exists between algorithms that select a test sequentially and those that select it simultaneously. Sequential assembly allows us to optimize an objective function at the examinee's ability estimate, such as the test information function in computerized adaptive testing. But it leads to the non-trivial problem of how to realize…
Descriptors: Law Schools, Item Analysis, Admission (School), Adaptive Testing
National Foundation for Educational Research, 2007
Statistical neighbour models provide one method for benchmarking progress. For each local authority (LA), these models designate a number of other LAs deemed to have similar characteristics. These designated LAs are known as statistical neighbours. Any LA may compare its performance (as measured by various indicators) against its statistical…
Descriptors: Benchmarking, Evaluation Methods, Evaluation Research, Research Tools
Peer reviewed
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
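The question Emons, Sijtsma, and Meijer pose can be sketched with the classical standard error of measurement: the shorter (less reliable) the test, the larger the SEM, and the more likely an observed score lands on the wrong side of a clinical cutoff. The function and numbers below are illustrative assumptions, not the authors' method:

```python
import math
from statistics import NormalDist

def misclassification_prob(true_score, cutoff, score_sd, reliability):
    """Probability that normally distributed measurement error pushes an
    examinee's observed score to the wrong side of the cutoff.
    Uses the classical-test-theory SEM = sd * sqrt(1 - reliability)."""
    sem = score_sd * math.sqrt(1.0 - reliability)
    p_below = NormalDist(mu=true_score, sigma=sem).cdf(cutoff)
    return p_below if true_score >= cutoff else 1.0 - p_below
```

For instance, with score SD 10 and reliability 0.84 (SEM = 4), an examinee whose true score sits exactly at the cutoff is misclassified half the time, while one 10 points above the cutoff is misclassified well under 1% of the time.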
Peer reviewed
Beevers, Christopher G.; Strong, David R.; Meyer, Bjorn; Pilkonis, Paul A.; Miller, Ivan R. – Psychological Assessment, 2007
Despite a central role for dysfunctional attitudes in cognitive theories of depression and the widespread use of the Dysfunctional Attitude Scale, form A (DAS-A; A. Weissman, 1979), the psychometric development of the DAS-A has been relatively limited. The authors used nonparametric item response theory methods to examine the DAS-A items and…
Descriptors: Measures (Individuals), Psychometrics, Depression (Psychology), Item Response Theory
Peer reviewed
Mullan, Mary; Lewis, Christopher Alan – Journal of Beliefs & Values, 2007
There are few self-report measures of morality. The Religious Status Inventory--"Being Ethical" subscale represents one approach. However, at present there is limited information on the psychometric properties of either the original 20-item version (RSInv-20) or the shortened embedded 10-item version (RSInv-S10). The aim of the present…
Descriptors: Item Analysis, Psychometrics, Ethics, Measures (Individuals)
Peer reviewed
Horst, S. Jeanne; Finney, Sara J.; Barron, Kenneth E. – Contemporary Educational Psychology, 2007
The current research explored the theory of social goal orientation. More specifically, we conducted three studies utilizing six independent university student samples to evaluate the construct validity of the Social Achievement Goal Orientation Scale (SAGOS; Ryan & Hopkins, 2003), a measure representing the construct of social goal orientation.…
Descriptors: Measures (Individuals), Validity, Factor Structure, Construct Validity
Peer reviewed
Webb, Norman L. – Applied Measurement in Education, 2007
A process for judging the alignment between curriculum standards and assessments developed by the author is presented. This process produces information on the relationship of standards and assessments on four alignment criteria: Categorical Concurrence, Depth of Knowledge Consistency, Range of Knowledge Correspondence, and Balance of…
Descriptors: Educational Assessment, Academic Standards, Item Analysis, Interrater Reliability
Peer reviewed
Elosua, Paula; Lopez-Jauregui, Alicia – International Journal of Testing, 2007
This report shows a classification of differential item functioning (DIF) sources that have an effect on the adaptation of tests. This classification is based on linguistic and cultural criteria. Four general DIF sources are distinguished: cultural relevance, translation problems, morphosyntactic differences, and semantic differences. The…
Descriptors: Semantics, Cultural Relevance, Classification, Test Bias
Peer reviewed
Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008
Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…
Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level