Peer reviewed: Muchinsky, Paul M. – Personnel Psychology, 1976
Extends the theoretical and practical implications of organizational climate research based upon the data from two studies, Sims and LaFollette (1975) and Litwin and Stringer (1968). (Author/RK)
Descriptors: Measurement Instruments, Organizational Climate, Psychological Studies, Questionnaires
Peer reviewed: Klinger, Don A.; Rogers, W. Todd – Alberta Journal of Educational Research, 2003
The estimation accuracy of procedures based on classical test score theory and item response theory (generalized partial credit model) was compared for examinations consisting of multiple-choice and extended-response items. Analysis of British Columbia Scholarship Examination results found an error rate of about 10 percent for both methods, with…
Descriptors: Academic Achievement, Educational Testing, Foreign Countries, High Stakes Tests
Peer reviewed: Goodwin, Laura D.; And Others – Journal of Special Education, 1991
Using data from an individually administered interview schedule (the Consumer Satisfaction Inventory), reliability among nine interviewers was estimated with several statistical methods, including simple percentages of agreement, kappa and weighted kappa, Pearson correlations, t tests on interviewers' means, and generalizability theory techniques.…
Descriptors: Disabilities, Educational Research, Elementary Secondary Education, Estimation (Mathematics)
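Two of the agreement statistics named in the abstract above, simple percentage of agreement and kappa, can be illustrated with a minimal sketch. This is not the authors' analysis: the function names and the sample ratings are hypothetical, and the code covers only the two-rater, unweighted (Cohen's) case.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Proportion of items on which two raters assign the same category."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings from two interviewers on four items.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
print(percent_agreement(rater_1, rater_2))
print(cohens_kappa(rater_1, rater_2))
```

With these made-up ratings, the raters agree on three of four items, but half of that agreement is expected by chance, so kappa is lower than the raw percentage.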
Peer reviewed: Blackman, Nicole J-M.; Koval, John J. – Applied Psychological Measurement, 1993
Four indexes of agreement between ratings of a person that correct for chance and are interpretable as intraclass correlation coefficients for different analysis of variance models are investigated. Relationships among the estimators are established for finite samples, and the equivalence of these estimators in large samples is demonstrated. (SLD)
Descriptors: Analysis of Variance, Equations (Mathematics), Estimation (Mathematics), Interrater Reliability
Peer reviewed: Hoyt, William T.; Melby, Janet N. – Counseling Psychologist, 1999
Addresses generalizability theory (GT), which offers a flexible framework for assessing dependability of measurement. GT considers multiple sources of error, so that investigators can assess the overall impact of measurement error. Illustrative analyses demonstrate the special advantages of GT for planning studies in which…
Descriptors: Counseling Psychology, Generalizability Theory, Measurement, Research Design
Peer reviewed: Burton, Richard F.; Miller, David J. – Assessment & Evaluation in Higher Education, 1999
Discusses statistical procedures for reducing test unreliability due to guessing in multiple choice and true/false tests. Proposes two new measures of test unreliability: one concerned with resolution of defined levels of knowledge and the other with the probability of examinees being incorrectly ranked. Both models are based on the binomial…
Descriptors: Guessing (Tests), Higher Education, Multiple Choice Tests, Objective Tests
Peer reviewed: Ball, Andrew M. – Infants and Young Children, 1998
Discusses how meta-analysis allows clinicians to determine objectively both the presence and the size of an effect or correlation within the existing literature by pooling the results of various studies and performing statistical analyses. Describes the risks and benefits of applying information obtained from meta-analysis to clinical practice.…
Descriptors: Developmental Disabilities, Effect Size, Meta Analysis, Reliability
Simon, Patricia – Educational and Psychological Measurement, 2006
The application range of Cohen's Kappa is extended to the field of sequential observation data, where omission mistakes of an observer may often occur. It is shown how the omission mistakes can be incorporated into the calculation of the Kappa coefficient without violating the statistic it is based on. The enhanced coefficient is termed Kappa…
Descriptors: Computation, Statistical Bias, Statistical Analysis, Logical Thinking
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Spaan, Mary – Language Assessment Quarterly, 2007
This article follows the development of test items (see "Language Assessment Quarterly", Volume 3 Issue 1, pp. 71-79 for the article "Test and Item Specifications Development"), beginning with a review of test and item specifications, then proceeding to writing and editing of items, pretesting and analysis, and finally selection of an item for a…
Descriptors: Test Items, Test Construction, Responses, Test Content
Lee, Saekyun H.; Han, Hyunjoo – Applied Language Learning, 2007
This study investigated some issues regarding the validity of the Scholastic Achievement Test (SAT) Subject Test: Korean with Listening. The SAT Korean has been administered just once a year since its inception in 1997. As of March 2006, it had been administered nine times. However, SAT foreign language tests are not as rigorously researched as…
Descriptors: Test Results, Second Language Learning, Language Tests, Academic Achievement
Fer, Seval – Teachers College Record, 2007
Background: The Thinking Styles Inventory--developed by Sternberg and Wagner based on Sternberg's (1988, 1997) earlier theory of mental self-government--was selected for the research in order to assess thinking styles of student teachers. Another reason is that the theoretical constructs, as well as the inventory generated from the theory, have…
Descriptors: Student Teachers, Research Design, Cognitive Style, Construct Validity
Allalouf, Avi – Educational Measurement: Issues and Practice, 2007
There is significant potential for error in long production processes that consist of sequential stages, each of which is heavily dependent on the previous stage, such as the SER (Scoring, Equating, and Reporting) process. Quality control procedures are required in order to monitor this process and to reduce the number of mistakes to a minimum. In…
Descriptors: Scoring, Quality Control, Sequential Approach, Error Correction
Scholfield, Phil – 1995
This book is a guide to categorizing, measuring, testing, and assessing aspects of language, and is intended for language teachers, speech therapists and other language-related practitioners, and researchers, in conjunction with other resources on research methods and statistics. The first part is a discussion of basic terminology and the varied…
Descriptors: Data Collection, Language Proficiency, Language Skills, Language Tests
Witta, E. Lea; Daniel, Larry G. – 1998
In 1994, the journal "Educational and Psychological Measurement" (EPM) instituted an editorial policy requiring authors to use technically appropriate language and methodological practices in their discussions of validity and reliability. To determine if this policy has had any effect on current publications, 150 validity and reliability…
Descriptors: Editing, Editorials, Educational Research, Reliability

