Herbert, Ian P.; Joyce, John; Hassall, Trevor – Accounting Education, 2014
The design, delivery and assessment of a complete educational scheme, such as a degree programme or a professional qualification course, is a complex matter. Maintaining alignment between the stated aims of the curriculum and the scoring of student achievement is an overarching concern. The potential for drift across individual aspects of an…
Descriptors: Higher Education, Student Evaluation, Communities of Practice, Interrater Reliability
Gafoor, K. Abdul; Naseer, A. R. – Online Submission, 2015
With a view to support instruction, formative and summative assessment and to provide model handwriting performance for students to compare their own performance, a Malayalam handwriting scale is developed. Data from 2640 school students belonging to Malappuram, Palakkad and Kozhikode districts, sampled by taking 240 students per each grade…
Descriptors: Formative Evaluation, Summative Evaluation, Handwriting, Performance Based Assessment
Beltrán, Jorge – Working Papers in TESOL & Applied Linguistics, 2016
In the assessment of aural skills of second language learners, the study of the inclusion of visual stimuli has almost exclusively been conducted in the context of listening assessment. While the inclusion of contextual information in test input has been advocated for by numerous researchers (Ockey, 2010), little has been said regarding the…
Descriptors: Achievement Tests, Speech Skills, Speech Tests, Second Language Learning
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures
Runco, Mark A.; Acar, Selcuk – Creativity Research Journal, 2012
Divergent thinking (DT) tests are very often used in creativity studies. Certainly DT does not guarantee actual creative achievement, but tests of DT are reliable and reasonably valid predictors of certain performance criteria. The validity of DT is described as reasonable because validity is not an all-or-nothing attribute, but is, instead, a…
Descriptors: Creativity, Creative Activities, Creative Thinking, Test Validity
Ahmed, Ayesha; Pollitt, Alastair – Assessment in Education: Principles, Policy & Practice, 2011
At the heart of most assessments lies a set of questions, and those who write them must achieve "two" things. Not only must they ensure that each question elicits the kind of performance that shows how "good" pupils are at the subject, but they must also ensure that each mark scheme gives more marks to those who are…
Descriptors: Academic Achievement, Classification, Educational Quality, Quality Assurance
Stewart, Jeffrey; White, David A. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2011
Multiple-choice tests such as the Vocabulary Levels Test (VLT) are often viewed as a preferable estimator of vocabulary knowledge when compared to yes/no checklists, because self-reporting tests introduce the possibility of students overreporting or underreporting scores. However, multiple-choice tests have their own unique disadvantages. It has…
Descriptors: Guessing (Tests), Scoring Formulas, Multiple Choice Tests, Test Reliability
Peer reviewed: Gorsuch, Richard L. – Educational and Psychological Measurement, 1980
Kaiser and Michael reported a formula for factor scores giving an internal consistency reliability and its square root, the domain validity. Using this formula is inappropriate if variables are included that have trivial rather than salient weights for the factor for which the score is being computed. (Author/RL)
Descriptors: Factor Analysis, Factor Structure, Scoring Formulas, Test Reliability
Peer reviewed: Morsbach, Gisela; And Others – Journal of Clinical Psychology, 1975
This study investigated (a) interscorer reliability of the Bender-Gestalt Test by using more than one person to score the same test protocols; and (b) rate-rerate reliability of the Bender-Gestalt Test after a half-year interval. (Author)
Descriptors: Psychological Studies, Research Methodology, Scoring Formulas, Tables (Data)
Peer reviewed: Holmes, Roy A.; And Others – Educational and Psychological Measurement, 1974
Descriptors: Chemistry, Multiple Choice Tests, Scoring Formulas, Test Reliability
Swineford, Frances – 1973
Results obtained by the Kuder-Richardson formula (20) adapted for use with R-KW scoring are compared with three other reliability formulas. Based on parallel tests administered at the same sitting, the KR (20) estimates are compared with alternate-form correlations and with odd-even correlations adjusted by the Spearman-Brown prophecy formula.…
Descriptors: Aptitude Tests, Scoring Formulas, Test Interpretation, Test Reliability
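For readers unfamiliar with the two formulas named in the abstract above, the following is a minimal sketch of the textbook Kuder-Richardson formula 20 for right/wrong (0/1) items and of the Spearman-Brown prophecy formula. It does not reproduce the R-KW adaptation studied in the report; the function names `kr20` and `spearman_brown` and the sample data are illustrative assumptions, not taken from the source.

```python
from statistics import pvariance


def kr20(responses):
    """Textbook KR-20 for dichotomous items (not the R-KW adaptation).

    responses: list of examinee rows, each a list of 0/1 item scores.
    Uses the population variance of total scores, which matches the
    p * (1 - p) item variances in the numerator.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    # Sum of item variances p * q, where p is the proportion correct.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def spearman_brown(r, factor=2.0):
    """Projected reliability when test length is multiplied by `factor`.

    With factor=2 this steps an odd-even (half-test) correlation up to
    full-test length.
    """
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical data: 4 examinees, 3 items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(sample))
print(spearman_brown(0.5))
```

The odd-even comparison mentioned in the abstract corresponds to calling `spearman_brown` with the default `factor=2.0` on the correlation between the two half-test scores.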
Peer reviewed: Zimmerman, Donald W. – Educational and Psychological Measurement, 1972
Although a great deal of attention has been devoted over a period of years to the estimation of reliability from item statistics, there are still gaps in the mathematical derivation of the Kuder-Richardson results. The main purpose of this paper is to fill some of these gaps, using language consistent with modern probability theory. (Author)
Descriptors: Mathematical Applications, Probability, Scoring Formulas, Statistical Analysis
Peer reviewed: Claudy, John G. – Applied Psychological Measurement, 1978
Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)
Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items
Peer reviewed: Bejar, Issac I.; Weiss, David J. – Educational and Psychological Measurement, 1977
The reliabilities yielded by several differential option weighting scoring procedures were compared among themselves as well as against conventional testing. It was found that increases in reliability due to differential option weighting were a function of inter-item correlations. Suggestions for the implementation of differential option weighting…
Descriptors: Correlation, Forced Choice Technique, Item Analysis, Scoring Formulas
Livingston, Samuel A.; Kastrinos, William – 1982
Leo Nedelsky developed a method for determining absolute grading standards for multiple choice tests. His method required a group of judges to examine each test question and eliminate those responses which the lowest D- student should be able to reject as incorrect. The correct answer probabilities remaining were used in computing an expected test…
Descriptors: Cutting Scores, Judges, Multiple Choice Tests, Real Estate
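The judging procedure described in the abstract above can be sketched as follows. This is the commonly described form of a Nedelsky-style computation, not necessarily the variant examined in the report: for each item, the chance of a correct answer is taken as one over the number of options a judge leaves standing, these chances are summed over items, and the sums are averaged across judges. The function name `nedelsky_expected_score` and the sample judgments are hypothetical.

```python
def nedelsky_expected_score(remaining_options):
    """Average expected score of a borderline student across judges.

    remaining_options[j][i] is the number of options judge j leaves for
    item i after eliminating those the borderline student should reject
    (the correct answer is always counted, so each value is >= 1).
    """
    per_judge = []
    for judge in remaining_options:
        if any(r < 1 for r in judge):
            raise ValueError("each item needs at least the correct option")
        # Guessing at random among the remaining options gives 1/r per item.
        per_judge.append(sum(1.0 / r for r in judge))
    return sum(per_judge) / len(per_judge)


# Hypothetical judgments: 2 judges, 3 items.
judgments = [[2, 4, 1], [2, 2, 1]]
print(nedelsky_expected_score(judgments))
```

Averaging across judges is one simple way to pool the ratings; other pooling rules would change only the last line of the function.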

