
Sax, Gilbert – Educational and Psychological Measurement, 1996
Using various Latin square and incomplete Latin square formats, the Fields test formats provide a novel way of presenting tests to students using machine-scorable answer sheets that can be item-analyzed. Items can be constructed to help students acquire knowledge or to measure the attainment of course objectives. (SLD)
Descriptors: Answer Sheets, Item Analysis, Measures (Individuals), Scoring

Prien, Borge – Studies in Educational Evaluation, 1989
Under certain conditions it may be possible to determine the difficulty of previously untested test items. Although no recipe can be provided, reflections on this topic are presented, drawing on concepts of item banking. A functional constructive method is suggested as having the most potential. (SLD)
Descriptors: Difficulty Level, Educational Assessment, Foreign Countries, Item Analysis

Green, Bert F.; And Others – Journal of Educational Measurement, 1989
A method of analyzing test item responses is advocated to examine differential item functioning through distractor choices of those answering an item incorrectly. The analysis uses log-linear models of a three-way contingency table, and is illustrated in an analysis of the verbal portion of the Scholastic Aptitude Test. (TJH)
Descriptors: College Entrance Examinations, Distractors (Tests), Evaluation Methods, High School Students

Henning, Grant – Language Testing, 1988
Violations of item unidimensionality on language tests produced distorted estimates of person ability, and violations of person unidimensionality produced distorted estimates of item difficulty. The Bejar Method was sensitive to such distortions. (Author)
Descriptors: Construct Validity, Content Validity, Difficulty Level, Item Analysis

Bauer, Hannspeter – System, 1991
Discusses the "sore-finger" form of multiple choice test items. "Sore finger" items employ authentic texts making use of the rules of text grammar, achieve high discrimination levels, and show a high validity for several subtests, especially style tests. The use of sore finger items in German nationwide English and German tests…
Descriptors: English (Second Language), Foreign Countries, German, Grammar

Vaughn, Sharon; And Others – Elementary School Journal, 1993
Reports two studies of students' perceptions of hypothetical teachers' adaptations to individual needs. Found that the Students' Perception of Teachers (SPT) Scale was appropriate for use with elementary students, but procedures for administration should be altered for elementary students. Also found that high-achieving students preferred teachers…
Descriptors: Elementary Education, Elementary School Students, Item Analysis, Student Attitudes
Zhou, Zheng; Boehm, Ann E. – Psychology in the Schools, 2004
Two hundred first- and second-grade Chinese children's knowledge of basic relational concepts in following directions was assessed on the "Applications Booklet" of the "Boehm Test of Basic Concepts-Revised" (BTBC-R, 1986). Chinese children's performance was then compared with that of the standardization sample of the BTBC-R.…
Descriptors: Young Children, Grade 1, Grade 2, Knowledge Level
Schulz, E. Matthew; Betebenner, Damian; Ahn, Meeyeon – Journal of Educational Measurement, 2004
Whether hierarchical logistic regression can reduce the sample size requirement for estimating optimal cutoff scores in a course placement service where predictive validity is measured by a threshold utility function is explored. Data from courses with varying class size were randomly partitioned into two halves per course. Nonhierarchical and…
Descriptors: Class Size, Sample Size, Cutting Scores, Predictive Validity
Meyer, J. Patrick; Huynh, Huynh; Seaman, Michael A. – Journal of Educational Measurement, 2004
Exact nonparametric procedures have been used to identify the level of differential item functioning (DIF) in binary items. This study explored the use of exact DIF procedures with items scored on a Likert scale. The results from an attitude survey suggest that the large-sample Cochran-Mantel-Haenszel (CMH) procedure identifies more items as…
Descriptors: Test Bias, Attitude Measures, Surveys, Predictive Validity
Fletcher, Richard; Hattie, John – Educational and Psychological Measurement, 2005
Typically, group differences are analyzed at the subdomain or test level using composite scores. This can mask the effect of individual items across groups. For example, two items from the Physical Self-Description Questionnaire (PSDQ) are worded in terms of internal ("I am good looking") and external ("Nobody thinks I'm good looking") frames of…
Descriptors: Secondary School Students, Foreign Countries, Gender Differences, Test Bias
Schlosser, Lewis Z.; Gelso, Charles J. – Journal of Counseling Psychology, 2005
The development of the Advisory Working Alliance Inventory-Advisor Version (AWAI-A) is presented. In the first study, data from 236 faculty members from APA-accredited counseling psychology programs were subjected to a principal components analysis, yielding 3 subscales: Rapport (15 items), Apprenticeship (8 items), and Task Focus (8 items). The…
Descriptors: Validity, Self Efficacy, Science Interests, Faculty
Saiki, Jun; Koike, Takahiko; Takahashi, Kohske; Inoue, Tomoko – Journal of Experimental Psychology: Human Perception and Performance, 2005
The underlying mechanism of search asymmetry is still unknown. Many computational models postulate top-down selection of target-defining features as a crucial factor. This feature selection account implies, and other theories implicitly assume, that predefined target identity is necessary for search asymmetry. The authors tested the validity of…
Descriptors: Visual Perception, Computation, Predictive Validity, Task Analysis
Anderson, Robin D.; Thelk, Amy D. – Assessment Update, 2005
When a college or university needs an assessment instrument, it has two choices: select an existing instrument or develop a new one. Since the cost of instrument development in both time and human resources is often prohibitive, many assessment programs rely on existing instruments. The challenge then becomes choosing one that provides the best…
Descriptors: Community Colleges, Content Validity, Item Analysis, College Faculty
Sakko, Gina; Martin, Toby L.; Vause, Tricia; Martin, Garry L.; Yu, C. T. – American Journal on Mental Retardation, 2004
The Assessment of Basic Learning Abilities test (ABLA) is a useful tool for choosing appropriate training tasks for persons with developmental disabilities. This test assesses the ease or difficulty with which persons are able to learn six hierarchically positioned discrimination tasks. A visual-visual nonidentity matching prototype task was…
Descriptors: Developmental Disabilities, Learning Problems, Task Analysis, Predictive Validity
Lai, Ah-Fur; Chen, Deng-Jyi; Chen, Shu-Ling – Journal of Educational Multimedia and Hypermedia, 2008
Item Response Theory (IRT) has been studied and applied in computer-based testing for decades. However, almost all existing studies focus exclusively on test questions presented in text-based (or static text/graphic) form. In this paper, we present our study on test questions using both…
Descriptors: Elementary School Students, Semantics, Difficulty Level, Item Response Theory