Peer reviewed: Wadden, Paul; Hilke, Robert; Hamp-Lyons, Liz – TESOL Quarterly, 1999
Provides a form of argumentative dialectic to Liz Hamp-Lyons's forum commentary published in an earlier issue of this journal, "Ethical Test Preparation Practice: The Case of TOEFL." Hamp-Lyons responds to the comments. (Author/VWL)
Descriptors: English (Second Language), Ethics, Second Language Instruction, Test Construction
Peer reviewed: Sanders, Piet F.; Verschoor, Alfred J. – Applied Psychological Measurement, 1998
Presents minimization and maximization models for parallel test construction under constraints. The minimization model constructs weakly and strongly parallel tests of minimum length, while the maximization model constructs weakly and strongly parallel tests with maximum test reliability. (Author/SLD)
Descriptors: Algorithms, Models, Reliability, Test Construction
Peer reviewed: Timminga, Ellen – Applied Psychological Measurement, 1998
Discusses problems of diagnosing and repairing infeasible linear-programming models in computerized test assembly. Demonstrates that it is possible to localize the causes of infeasibility, although this is not always easy. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Linear Programming, Test Construction
Peer reviewed: van der Linden, Wim J.; Adema, Jos J. – Journal of Educational Measurement, 1998
Proposes an algorithm for the assembly of multiple test forms in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. Illustrates how the method can be implemented using 0-1 linear programming and gives two examples. (SLD)
Descriptors: Algorithms, Linear Programming, Test Construction, Test Format
Peer reviewed: Meier, Scott T. – Measurement and Evaluation in Counseling and Development, 1998
Traditional Item Selection Rules (TISRs) were compared to Intervention Item Selection Rules (IISRs) on the same set of alcohol attitude items from archival data, producing scales with differing psychometric properties. A cross-validation study was run. Guidelines for selecting change-sensitive items, problems and advantages of IISRs are…
Descriptors: Item Analysis, Measurement Techniques, Psychometrics, Statistical Analysis
Peer reviewed: Wilson, Mark; Sloane, Kathryn – Applied Measurement in Education, 2000
Describes the principles that guided the creation and implementation of a system of embedded assessments, the Berkeley Evaluation and Assessment Research System (BEAR). The assessment system builds on methodological advances in alternative assessment. Discusses how the application of the principles generates the component parts of the system. (SLD)
Descriptors: Educational Practices, Evaluation Methods, Research, Student Evaluation
Peer reviewed: Carlstedt, Berit; Gustafsson, Jan-Eric; Ullstadius, Eva – Intelligence, 2000
Studied whether a change of test item sequencing, intended to increase test complexity, would cause increased involvement of general intelligence using a sample of Swedish military recruits who received heterogeneous (n=1,778) or homogeneous (n=363) tests. Items presented homogeneously showed higher general intelligence ("G") loadings.…
Descriptors: Foreign Countries, Intelligence, Military Personnel, Test Construction
Peer reviewed: Wright, Benjamin D.; Stenner, A. Jackson – Popular Measurement, 1999
Discusses the use of "Lexile" units in test construction. (SLD)
Descriptors: Measurement Techniques, Reading Achievement, Scaling, Student Evaluation
Peer reviewed: van der Linden, Wim J.; Veldkamp, Bernard P.; Reese, Lynda M. – Applied Psychological Measurement, 2000
Presents an integer programming approach to item bank design that can be used to calculate an optimal blueprint for an item bank in order to support an existing testing program. Demonstrates the approach empirically using an item bank designed for the Law School Admission Test. (SLD)
Descriptors: Item Banks, Item Response Theory, Test Construction, Testing Programs
Peer reviewed: Armstrong, Ronald D.; Jones, Douglas H.; Wang, Zhaobo – Journal of Educational and Behavioral Statistics, 1998
Generating a test from an item bank using a criterion based on classical test theory parameters poses considerable problems. A mathematical model is formulated that maximizes the reliability coefficient alpha, subject to logical constraints on the choice of items. Theorems ensuring appropriate application of the Lagrangian relation techniques are…
Descriptors: Item Banks, Mathematical Models, Reliability, Test Construction
Peer reviewed: Sprenger, Marilee – Educational Leadership, 1998
Our memories are not necessarily "bad," but stored in different areas. By understanding the five memory lanes (semantic, episodic, procedural, automatic, and emotional), a high school English teacher discovered why her students could not do fractions (to calculate grades) in English class. Paper-and-pencil tests can be redesigned to assess memory…
Descriptors: Brain, Elementary Secondary Education, Memory, Student Evaluation
Peer reviewed: Gierl, Mark J.; Leighton, Jacqueline P.; Hunka, Stephen M. – Educational Measurement: Issues and Practice, 2000
Discusses the logic of the rule-space model (K. Tatsuoka, 1983) as it applies to test development and analysis. The rule-space model is a statistical method for classifying examinees' test item responses into a set of attribute-mastery patterns associated with different cognitive skills. Directs readers to a tutorial that may be downloaded. (SLD)
Descriptors: Item Analysis, Item Response Theory, Test Construction, Test Items
Peer reviewed: Smisko, Ann; Twing, Jon S.; Denny, Patricia – Applied Measurement in Education, 2000
Describes the Texas test development process in detail, showing how each test development step is linked to the "Standards for Educational and Psychological Testing." The routine use of this process provides evidence of the content and curricular validity of the Texas Assessment of Academic Skills. (SLD)
Descriptors: Achievement Tests, Curriculum, Models, Test Construction
Peer reviewed: Miller, Michael J.; Woehr, David J.; Hudspeth, Natasha – Journal of Vocational Behavior, 2002
Reviews six studies that examined properties of the Multidimensional Work Ethic Profile: (1) replication demonstrating the multidimensionality of the work ethic construct; (2) initial psychometric evaluation; (3) relationship among subscales and with measures of cognitive ability and personality; (4) generalizability between student and nonstudent…
Descriptors: Measures (Individuals), Test Construction, Test Validity, Values
Peer reviewed: Sampson, James P., Jr.; Lumsden, Jill A. – Journal of Career Assessment, 2000
Addresses ethical issues regarding Internet career assessment: reliability, validity, user readiness, administration, lack of practitioner awareness, equitable access, confidentiality, and privacy. Makes recommendations in the areas of research and development, training, standards, and stable funding of assessment development. (SK)
Descriptors: Career Counseling, Ethics, Internet, Privacy


