Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Gialluca, Kathleen A.; And Others – 1984
In this study, simulated and actual Air Force test data were used to compare the different procedures for equating mental tests: conventional (equipercentile and linear), Item Response Theory (IRT), and strong true-score theory (STST); data collection designs used were single-group, equivalent-groups, and anchor test. Equating transformations were…
Descriptors: Adults, Cognitive Ability, Cognitive Tests, Comparative Analysis
Robertson, David W.; And Others – 1977
A comparative study of item analysis was conducted on the basis of race to determine whether alternative test construction or processing might increase the proportion of black enlisted personnel among those passing various military technical knowledge examinations. The study used data from six specialists at four grade levels and investigated item…
Descriptors: Difficulty Level, Enlisted Personnel, Item Analysis, Occupational Tests
Hambleton, Ronald K.; And Others – 1987
The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…
Descriptors: Comparative Analysis, Content Validity, Cutting Scores, Difficulty Level
Samejima, Fumiko – 1986
Item analysis data fitting the normal ogive model were simulated in order to investigate the problems encountered when applying the three-parameter logistic model. Binary item tests containing 10 and 35 items were created, and Monte Carlo methods simulated the responses of 2,000 and 500 examinees. Item parameters were obtained using Logist 5.…
Descriptors: Computer Simulation, Difficulty Level, Guessing (Tests), Item Analysis
Test-Retest Analyses of the Test of English as a Foreign Language. TOEFL Research Reports Report 45.
Henning, Grant – 1993
This study provides information about the total and component scores of the Test of English as a Foreign Language (TOEFL). First, the study provides comparative global and component estimates of test-retest, alternate-form, and internal-consistency reliability, controlling for sources of measurement error inherent in the examinees and the testing…
Descriptors: Difficulty Level, English (Second Language), Error of Measurement, Estimation (Mathematics)
Livingston, Samuel A. – 1987
The effect of increased writing or planning time on a test of basic college level writing ability was studied. The essay portion of the New Jersey College Basic Skills Placement Test was given to students in nine New Jersey public colleges and three New Jersey public high schools. Each student wrote two essays on two different topics. The first…
Descriptors: Academic Ability, Difficulty Level, Essay Tests, High Schools
Peer reviewed: Plake, Barbara S.; Melican, Gerald J. – Educational and Psychological Measurement, 1989
The impact of overall test length and difficulty on expert judgments of item performance made with the Nedelsky method was studied. Five university-level instructors predicting the performance of minimally competent candidates on a mathematics examination were fairly consistent in their assessments regardless of length or difficulty of the test.…
Descriptors: Difficulty Level, Estimation (Mathematics), Evaluators, Higher Education
Peer reviewed: Bergstrom, Betty A.; And Others – Applied Measurement in Education, 1992
Effects of altering test difficulty on examinee ability measures and test length in a computer adaptive test were studied for 225 medical technology students in 3 test difficulty conditions. Results suggest that, with an item pool of sufficient depth and breadth, acceptable targeting to test difficulty is possible. (SLD)
Descriptors: Ability, Adaptive Testing, Change, College Students
Hisama, Kay K.; And Others – 1977
The optimal test length, using predictive validity as a criterion, depends on two major conditions: the appropriate item-difficulty rather than the total number of items, and the method used in scoring the test. These conclusions were reached when responses to a 100-item multi-level test of reading comprehension from 136 non-native speakers of…
Descriptors: College Students, Difficulty Level, English (Second Language), Foreign Students
Manpower Administration (DOL), Washington, DC. – 1972
The Basic Occupational Literacy Test (BOLT) was developed as an achievement test of basic skills in reading and arithmetic, for educationally disadvantaged adults. The objective was to develop a test appropriate for this population with regard to content, format, instructions, timing, norms, and difficulty level. A major issue, the use of grade…
Descriptors: Achievement Tests, Adult Basic Education, Adults, Basic Skills
Cliff, Norman; And Others – 1977
TAILOR is a computer program that uses the implied orders concept as the basis for computerized adaptive testing. The basic characteristics of TAILOR, which does not involve pretesting, are reviewed here and two studies of it are reported. One is a Monte Carlo simulation based on the four-parameter Birnbaum model and the other uses a matrix of…
Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Difficulty Level


