ERIC - Search Results

Descriptor filter: Test Reliability

Showing 16 to 30 of 146 results

Assessment in Higher Education: The Potential for a Community of Practice to Improve Inter-Marker Reliability

Peer reviewed

Herbert, Ian P.; Joyce, John; Hassall, Trevor – Accounting Education, 2014

The design, delivery and assessment of a complete educational scheme, such as a degree programme or a professional qualification course, is a complex matter. Maintaining alignment between the stated aims of the curriculum and the scoring of student achievement is an overarching concern. The potential for drift across individual aspects of an…

Descriptors: Higher Education, Student Evaluation, Communities of Practice, Interrater Reliability

Development of Malayalam Handwriting Scale for School Students in Kerala

Gafoor, K. Abdul; Naseer, A. R. – Online Submission, 2015

With a view to support instruction, formative and summative assessment and to provide model handwriting performance for students to compare their own performance, a Malayalam handwriting scale is developed. Data from 2640 school students belonging to Malappuram, Palakkad and Kozhikode districts, sampled by taking 240 students per each grade…

Descriptors: Formative Evaluation, Summative Evaluation, Handwriting, Performance Based Assessment

The Effects of Visual Input on Scoring a Speaking Achievement Test

Peer reviewed

Beltrán, Jorge – Working Papers in TESOL & Applied Linguistics, 2016

In the assessment of aural skills of second language learners, the study of the inclusion of visual stimuli has almost exclusively been conducted in the context of listening assessment. While the inclusion of contextual information in test input has been advocated for by numerous researchers (Ockey, 2010), little has been said regarding the…

Descriptors: Achievement Tests, Speech Skills, Speech Tests, Second Language Learning

A Competency Model for Process Dynamics and Control and Its Use for Test Construction at University Level

Peer reviewed

Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015

This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…

Descriptors: Models, Engineering Education, Test Items, Outcome Measures

Divergent Thinking as an Indicator of Creative Potential

Peer reviewed

Runco, Mark A.; Acar, Selcuk – Creativity Research Journal, 2012

Divergent thinking (DT) tests are very often used in creativity studies. Certainly DT does not guarantee actual creative achievement, but tests of DT are reliable and reasonably valid predictors of certain performance criteria. The validity of DT is described as reasonable because validity is not an all-or-nothing attribute, but is, instead, a…

Descriptors: Creativity, Creative Activities, Creative Thinking, Test Validity

Improving Marking Quality through a Taxonomy of Mark Schemes

Peer reviewed

Ahmed, Ayesha; Pollitt, Alastair – Assessment in Education: Principles, Policy & Practice, 2011

At the heart of most assessments lies a set of questions, and those who write them must achieve "two" things. Not only must they ensure that each question elicits the kind of performance that shows how "good" pupils are at the subject, but they must also ensure that each mark scheme gives more marks to those who are…

Descriptors: Academic Achievement, Classification, Educational Quality, Quality Assurance

Estimating Guessing Effects on the Vocabulary Levels Test for Differing Degrees of Word Knowledge

Peer reviewed

Stewart, Jeffrey; White, David A. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2011

Multiple-choice tests such as the Vocabulary Levels Test (VLT) are often viewed as a preferable estimator of vocabulary knowledge when compared to yes/no checklists, because self-reporting tests introduce the possibility of students overreporting or underreporting scores. However, multiple-choice tests have their own unique disadvantages. It has…

Descriptors: Guessing (Tests), Scoring Formulas, Multiple Choice Tests, Test Reliability

Factor Score Reliabilities and Domain Validities.

Peer reviewed

Gorsuch, Richard L. – Educational and Psychological Measurement, 1980

Kaiser and Michael reported a formula for factor scores giving an internal consistency reliability and its square root, the domain validity. Using this formula is inappropriate if variables are included which have trivial weights rather than salient weights for the factor for which the score is being computed. (Author/RL)

Descriptors: Factor Analysis, Factor Structure, Scoring Formulas, Test Reliability

Two Aspects of Scorer Reliability in the Bender Gestalt Test

Peer reviewed

Morsbach, Gisela; And Others – Journal of Clinical Psychology, 1975

This study investigated (a) interscorer reliability of the Bender-Gestalt Test by using more than one person to score the same test protocols; and (b) rate-rerate reliability of the Bender-Gestalt Test after a half-year interval. (Author)

Descriptors: Psychological Studies, Research Methodology, Scoring Formulas, Tables (Data)

The Comparative Validities of Three Scoring Systems Applied to an Objective Achievement Examination in Chemistry

Peer reviewed

Holmes, Roy A.; And Others – Educational and Psychological Measurement, 1974

Descriptors: Chemistry, Multiple Choice Tests, Scoring Formulas, Test Reliability

An Assessment of the Kuder-Richardson Formula (20) Reliability Estimate for Moderately Speeded Tests.

Swineford, Frances – 1973

Results obtained by the Kuder-Richardson formula (20) adapted for use with R-KW scoring are compared with three other reliability formulas. Based on parallel tests administered at the same sitting, the KR (20) estimates are compared with alternate-form correlations and with odd-even correlations adjusted by the Spearman-Brown prophecy formula.…

Descriptors: Aptitude Tests, Scoring Formulas, Test Interpretation, Test Reliability
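
For readers unfamiliar with the two formulas named in the abstract above, the following is a minimal sketch of the standard textbook versions, assuming dichotomous (0/1) item scores. It does not reproduce the R-KW adaptation studied in the report, and the function names and the sample data are illustrative assumptions only.

```python
from statistics import pvariance


def kr20(responses):
    """Standard KR-20 for 0/1 item scores (not the R-KW adaptation).

    responses: list of examinee rows, each a list of 0/1 item scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    # Population variance of the examinees' total scores.
    var_total = pvariance([sum(row) for row in responses])
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


def spearman_brown(r, factor=2):
    """Project reliability r to a test `factor` times as long.

    With factor=2 this steps an odd-even (split-half) correlation
    up to an estimate for the full-length test.
    """
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical data: four examinees, three items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(sample))
print(spearman_brown(0.5))
```

The sketch uses the population variance of total scores; some texts use the sample variance instead, which changes the estimate slightly for small groups.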

Test Reliability and the Kuder-Richardson Formulas: Derivation from Probability Theory

Peer reviewed

Zimmerman, Donald W. – Educational and Psychological Measurement, 1972

Although a great deal of attention has been devoted over a period of years to the estimation of reliability from item statistics, there are still gaps in the mathematical derivation of the Kuder-Richardson results. The main purpose of this paper is to fill some of these gaps, using language consistent with modern probability theory. (Author)

Descriptors: Mathematical Applications, Probability, Scoring Formulas, Statistical Analysis

Biserial Weights: A New Approach to Test Item Option Weighting

Peer reviewed

Claudy, John G. – Applied Psychological Measurement, 1978

Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)

Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items

A Comparison of Empirical Differential Option Weighting Scoring Procedures as a Function of Inter-Item Correlation

Peer reviewed

Bejar, Issac I.; Weiss, David J. – Educational and Psychological Measurement, 1977

The reliabilities yielded by several differential option weighting scoring procedures were compared among themselves as well as against conventional testing. It was found that increases in reliability due to differential option weighting were a function of inter-item correlations. Suggestions for the implementation of differential option weighting…

Descriptors: Correlation, Forced Choice Technique, Item Analysis, Scoring Formulas

A Study of the Reliability of Nedelsky's Method for Choosing a Passing Score.

Livingston, Samuel A.; Kastrinos, William – 1982

Leo Nedelsky developed a method for determining absolute grading standards for multiple choice tests. His method required a group of judges to examine each test question and eliminate those responses which the lowest D- student should be able to reject as incorrect. The correct answer probabilities remaining were used in computing an expected test…

Descriptors: Cutting Scores, Judges, Multiple Choice Tests, Real Estate
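
The procedure outlined in the abstract above can be sketched as follows. This is a minimal illustration of the Nedelsky method as it is commonly described, assuming each item's correct-answer probability is the reciprocal of the number of options a judge leaves standing and that judges' expected scores are averaged; the function name and the sample judgments are hypothetical, and the report's own computations may differ.

```python
def nedelsky_passing_score(remaining_options_by_judge):
    """Average expected score of a borderline examinee across judges.

    remaining_options_by_judge: one list per judge, giving for each item
    the number of options left after the judge eliminates those a
    borderline student should reject (the correct answer always remains,
    so every count is at least 1).
    """
    judge_scores = []
    for remaining in remaining_options_by_judge:
        # Chance of a correct answer when guessing among remaining options.
        judge_scores.append(sum(1.0 / k for k in remaining))
    return sum(judge_scores) / len(judge_scores)


# Hypothetical judgments: two judges, four items.
judgments = [[2, 4, 1, 2], [1, 4, 2, 2]]
print(nedelsky_passing_score(judgments))
```

Comparing the per-judge expected scores before averaging gives a rough view of how much the standard depends on who is judging, which is the reliability question the study title raises.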
