ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	29
Since 2017 (last 10 years)	65
Since 2007 (last 20 years)	97

Descriptor

Test Format	239
Test Reliability	239
Test Validity	108
Test Items	81
Test Construction	68
Higher Education	64
Multiple Choice Tests	54
Foreign Countries	48
Comparative Analysis	46
Scores	38
Psychometrics	35
Computer Assisted Testing	33
College Students	32
Item Analysis	28
Difficulty Level	27
Comparative Testing	26
Scoring	25
Item Response Theory	22
Language Tests	22
Statistical Analysis	22
Correlation	21
Test Interpretation	19
Undergraduate Students	18
Achievement Tests	17
Likert Scales	17

Publication Type

Reports - Research	239
Journal Articles	164
Speeches/Meeting Papers	42
Tests/Questionnaires	15
Reports - Descriptive	9
Information Analyses	4
Guides - Non-Classroom	3
Numerical/Quantitative Data	3
Opinion Papers	2
Reports - Evaluative	1

Education Level

Higher Education	40
Postsecondary Education	35
Elementary Education	11
Secondary Education	10
Early Childhood Education	5
Middle Schools	5
High Schools	4
Junior High Schools	4
Primary Education	4
Elementary Secondary Education	3
Grade 3	3
Grade 7	3
Grade 8	3
Intermediate Grades	3
Grade 4	2
Kindergarten	2
Grade 1	1
Grade 5	1
Grade 6	1
Grade 9	1
Preschool Education	1

Audience

Researchers	9
Practitioners	7
Teachers	6
Administrators	4

Location

Turkey	8
California	6
Japan	4
Germany	3
Israel	3
Canada	2
Indonesia	2
South Africa	2
United Kingdom	2
Bangladesh	1
China	1
Czech Republic	1
Estonia	1
Finland	1
France	1
Georgia	1
Indiana	1
Iran	1
Ireland	1
Louisiana	1
Maryland	1
Missouri	1
Netherlands	1
New Jersey	1
New York	1

Laws, Policies, & Programs

Pell Grant Program

What Works Clearinghouse Rating

Showing 1 to 15 of 239 results

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

The Effects of Reverse Items on Psychometric Properties and Respondents' Scale Scores According to Different Item Reversal Strategies

Peer reviewed
PDF on ERIC

Mustafa Ilhan; Nese Güler; Gülsen Tasdelen Teker; Ömer Ergenekon – International Journal of Assessment Tools in Education, 2024

This study aimed to examine the effects of reverse items created with different strategies on psychometric properties and respondents' scale scores. To this end, three versions of a 10-item scale in the research were developed: 10 positive items were integrated in the first form (Form-P) and five positive and five reverse items in the other two…

Descriptors: Test Items, Psychometrics, Scores, Measures (Individuals)

Do Different Devices Perform Equally Well with Different Numbers of Scale Points and Response Formats? A Test of Measurement Invariance and Reliability

Peer reviewed

Natalja Menold; Vera Toepoel – Sociological Methods & Research, 2024

Research on mixed devices in web surveys is in its infancy. Using a randomized experiment, we investigated device effects (desktop PC, tablet and mobile phone) for six response formats and four different numbers of scale points. N = 5,077 members of an online access panel participated in the experiment. An exact test of measurement invariance and…

Descriptors: Online Surveys, Handheld Devices, Telecommunications, Test Reliability

Do Scoring Techniques and Number of Choices Affect the Reliability of Multiple-Choice Tests in Elementary Schools?

Peer reviewed
PDF on ERIC

Herwin, Herwin; Pristiwaluyo, Triyanto; Ruslan, Ruslan; Dahalan, Shakila Che – Cypriot Journal of Educational Sciences, 2022

The application of multiple-choice tests often does not consider the scoring technique and the number of choices. The study aims at describing the effect of the scoring technique and numerous options towards the reliability of multiple-choice objective tests on social subjects in elementary school. The study is quantitative research with…

Descriptors: Scoring, Multiple Choice Tests, Test Reliability, Elementary School Students

Improving the Efficiency of the Digits-in-Noise Hearing Screening Test: A Comparison between Four Different Test Procedures

Peer reviewed

Dambha, Tasneem; Swanepoel, De Wet; Mahomed-Asmail, Faheema; De Sousa, Karina C.; Graham, Marien A.; Smits, Cas – Journal of Speech, Language, and Hearing Research, 2022

Purpose: This study compared the test characteristics, test-retest reliability, and test efficiency of three novel digits-in-noise (DIN) test procedures to a conventional antiphasic 23-trial adaptive DIN (D23). Method: One hundred twenty participants with an average age of 42 years (SD = 19) were included. Participants were tested and retested…

Descriptors: Auditory Tests, Screening Tests, Efficiency, Test Format

Evaluating the Evaluators: A Comparative Study of AI and Teacher Assessments in Higher Education

Peer reviewed
PDF on ERIC

Tugra Karademir Coskun; Ayfer Alper – Digital Education Review, 2024

This study aims to examine the potential differences between teacher evaluations and artificial intelligence (AI) tool-based assessment systems in university examinations. The research has evaluated a wide spectrum of exams including numerical and verbal course exams, exams with different assessment styles (project, test exam, traditional exam),…

Descriptors: Artificial Intelligence, Visual Aids, Video Technology, Tests

Practical Considerations in Choosing an Anchor Test Form for Equating under the Random Groups Design

Peer reviewed

Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023

Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…

Descriptors: Test Format, Equated Scores, Best Practices, Test Construction

Meta²: A Meta-Analysis and Psychometric Evaluation of the Metacognitive Awareness Inventory (MAI) in the Context of Health Professions Education

Peer reviewed

Andrew S. Cale; Elizabeth R. Agosto; Brenda Kucha Anak Ganeng; Megan E. Kruskie; Margaret A. McNulty; Kyle A. Robertson; Cecelia J. Vetter; Sabrina C. Woods; Md. Nazmul Karim; Adam B. Wilson – Anatomical Sciences Education, 2025

To keep pace with medicine's unpredictable changes, medical trainees must learn to accurately monitor and evaluate themselves via metacognition (i.e., thinking about thinking). The Metacognitive Awareness Inventory (MAI) can assess and guide the metacognitive development of trainees. This study summarizes existing psychometric evidence and…

Descriptors: Meta Analysis, Psychometrics, Metacognition, Measures (Individuals)

A Reliability Generalization Meta-Analysis of Runco Ideational Behavior Scale

Peer reviewed

Sen, Sedat – Creativity Research Journal, 2022

The purpose of this study was to estimate the overall reliability values for the scores produced by Runco Ideational Behavior Scale (RIBS) and explore the variability of RIBS score reliability across studies. To achieve this, a reliability generalization meta-analysis was carried out using the 86 Cronbach's alpha estimates obtained from 77 studies…

Descriptors: Generalization, Creativity, Meta Analysis, Higher Education

Reliability of Computer-Based CBMs versus Paper/Pencil Administration for Fact and Complex Operations in Mathematics

Peer reviewed

VanDerHeyden, Amanda M.; Codding, Robin; Solomon, Benjamin G. – Remedial and Special Education, 2023

Computer-based curriculum-based measurement (CBM) is a relatively common practice, but surprisingly few studies have examined the reliability of computer-based CBM. This study sought to examine the reliability of CBM administered via paper/pencil versus the computer. Twenty-one of 25 students in two third-grade classes (N = 21) participated in two…

Descriptors: Curriculum Based Assessment, Computer Assisted Testing, Test Format, Grade 3

The DAATS Battery Short Form as a Measure of Teacher Dispositions

Peer reviewed
PDF on ERIC

Judy R. Wilkerson; W. Steve Lang; LaSonya Moore – Journal of Research in Education, 2025

The DAATS (Dispositions Assessments Aligned with Teacher Standards) battery is a series of five instruments of different item types that measure teachers' consistency with the critical dispositions embedded in the InTASC Standards. The purpose of this study was to continue a 20-year research project on the development and implementation of…

Descriptors: Educational Assessment, National Standards, Teacher Evaluation, Teacher Competencies

A Two-Level Adaptive Test Battery

Peer reviewed

Wim J. van der Linden; Luping Niu; Seung W. Choi – Journal of Educational and Behavioral Statistics, 2024

A test battery with two different levels of adaptation is presented: a within-subtest level for the selection of the items in the subtests and a between-subtest level to move from one subtest to the next. The battery runs on a two-level model consisting of a regular response model for each of the subtests extended with a second level for the joint…

Descriptors: Adaptive Testing, Test Construction, Test Format, Test Reliability

Are the Verbal TTCT Forms Actually Interchangeable?

Peer reviewed

Grajzel, Katalin; Dumas, Denis; Acar, Selcuk – Journal of Creative Behavior, 2022

One of the best-known and most frequently used measures of creative idea generation is the Torrance Test of Creative Thinking (TTCT). The TTCT Verbal, assessing verbal ideation, contains two forms created to be used interchangeably by researchers and practitioners. However, the parallel forms reliability of the two versions of the TTCT Verbal has…

Descriptors: Test Reliability, Creative Thinking, Creativity Tests, Verbal Ability

Not Liking the Likert? A Rasch Analysis of Forced-Choice Format and Usefulness in Survey Design

Peer reviewed

Celeste Combrinck – SAGE Open, 2024

We have less time and focus than ever before, while the demand for attention is increasing. Therefore, it is no surprise that when answering questionnaires, we often choose to strongly agree or be neutral, producing problematic and unusable data. The current study investigated forced-choice (ipsative) format compared to the same questions on a…

Descriptors: Likert Scales, Test Format, Surveys, Design

Can High-Dimensional Questionnaires Resolve the Ipsativity Issue of Forced-Choice Response Formats?

Peer reviewed

Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021

Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…

Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring

Source

Educational and Psychological…	17
Journal of Educational…	7
Psychological Assessment	6
Applied Psychological…	4
Assessment	4
ETS Research Report Series	4
Assessment & Evaluation in…	3
Grantee Submission	3
International Journal of…	3
Journal of Experimental…	3
Language Testing	3
Perceptual and Motor Skills	3
Applied Measurement in…	2
Assessment for Effective…	2
College Board	2
Education and Information…	2
Evaluation and the Health…	2
Higher Education	2
Hispanic Journal of…	2
International Journal of…	2
Journal of Creative Behavior	2
Journal of Educational and…	2
Journal of Psychoeducational…	2
Language Assessment Quarterly	2
Measurement and Evaluation in…	2

Author

White, Edward M.	6
Melancon, Janet G.	4
Thompson, Bruce	4
Federico, Pat-Anthony	3
Bush, Martin	2
Conoyer, Sarah J.	2
Frisbie, David A.	2
Green, Kathy	2
Hambleton, Ronald K.	2
Henk, William A.	2
Henning, Grant	2
Kapes, Jerome T.	2
Menold, Natalja	2
Sax, Gilbert	2
Schriesheim, Chester A.	2
Trevisan, Michael S.	2
Vansickle, Timothy R.	2
Acar, Selcuk	1
Adam B. Wilson	1
Agbo, George Chibuike	1
Agbo, Philomina Akudo	1
Ahmed, Md. Kawser	1
Ahnberg, Jamie L.	1
Aiken, Lewis R.	1

Assessments and Surveys

Embedded Figures Test	3
Test of English as a Foreign…	3
Wechsler Intelligence Scale…	3
Beck Depression Inventory	2
Graduate Record Examinations	2
Wechsler Adult Intelligence…	2
ACT Assessment	1
Attribution Style…	1
Bem Sex Role Inventory	1
Bruininks Oseretsky Test of…	1
California Critical Thinking…	1
Computer Attitude Scale	1
Cornell Critical Thinking Test	1
Defining Issues Test	1
Dimensions of Self Concept	1
Graduate Management Admission…	1
Illinois Test of…	1
Iowa Tests of Basic Skills	1
Marlowe Crowne Social…	1
Minnesota Multiphasic…	1
Peabody Picture Vocabulary…	1
Praxis Series	1
Program for International…	1
Raven Progressive Matrices	1
Rosenberg Self Esteem Scale	1