Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 13 |
| Since 2017 (last 10 years) | 40 |
| Since 2007 (last 20 years) | 93 |
Descriptor
| Correlation | 116 |
| Test Validity | 116 |
| Test Reliability | 64 |
| Item Response Theory | 49 |
| Foreign Countries | 40 |
| Test Construction | 37 |
| Emotional Response | 30 |
| Factor Analysis | 29 |
| Psychometrics | 27 |
| Scores | 27 |
| Test Items | 24 |
Author
| Liu, Ou Lydia | 3 |
| Benton, Stephen L. | 2 |
| Chambless, Dianne L. | 2 |
| Cheng, Ying-Yao | 2 |
| Harrell, Thomas H. | 2 |
| Hill, Heather C. | 2 |
| Li, Dan | 2 |
| Olatunji, Bunmi O. | 2 |
| Wang, Wen-Chung | 2 |
| AL-Sinani, Yousra | 1 |
| Abd-El-Fattah, Sabry M. | 1 |
Location
| Canada | 5 |
| Turkey | 5 |
| California | 4 |
| Netherlands | 4 |
| China | 3 |
| Germany | 3 |
| Taiwan | 3 |
| Hong Kong | 2 |
| United Kingdom | 2 |
| United States | 2 |
| Belgium | 1 |
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Enrico Gandolfi; Richard E. Ferdig – Educational Technology Research and Development, 2025
Augmented Reality (AR) is increasingly being adopted in education to foster engagement and interest in a variety of subjects and content areas. However, there is a scarcity of instruments to measure the instructional impact of this innovation. This article addresses this gap in two unique ways. First, it presents validation results of the…
Descriptors: Simulated Environment, Measures (Individuals), Rating Scales, Item Response Theory
Mehmet Ali Yildiz – Psychology in the Schools, 2024
The aim of this study was to adapt the Positive and Negative Affect Scale for Children (PANAS-C) developed by Laurent et al. into Turkish and to examine its validity and reliability on high school adolescents. The data of the study were analyzed with four different study groups. The first study group consisted of 414 high school adolescents, 262…
Descriptors: Foreign Countries, Affective Measures, Test Validity, Test Reliability
Abdullah Alamer; Ahmed Al Khateeb; Abdulrahman Alshabeb – Language Assessment Quarterly, 2025
This study introduces the first Arabic Vocabulary Levels Test (Arabic-VLT), created for foreign learners of Arabic. We present compelling evidence to substantiate its validity and reliability. The Arabic-VLT was developed according to five levels, beginning with the most frequently used words (Level 1) to the least frequently used ones (Level 5),…
Descriptors: Arabic, Vocabulary Development, Test Construction, Second Language Learning
Amber Dudley; Emma Marsden; Giulia Bovolenta – Language Testing, 2024
Vocabulary knowledge strongly predicts second language reading, listening, writing, and speaking. Yet, few tests have been developed to assess vocabulary knowledge in French. The primary aim of this pilot study was to design and initially validate the Context-Aligned Two Thousand Test (CA-TTT), following open research practices. The CA-TTT is a…
Descriptors: French, Vocabulary Development, Secondary School Students, Language Tests
Zaher M. Kmail; Gordon Brobbey – Journal of the American Academy of Special Education Professionals, 2024
Teacher evaluation has been closely tied to professional development. In special education, professional development experiences are meant to promote special educator learning and implementation of high leverage practices. Yet, the connection between teacher evaluation outcomes and professional development decisions of special educators is largely…
Descriptors: Teacher Evaluation, Special Education Teachers, Teacher Attitudes, Faculty Development
Yoo Jeong Jang – ProQuest LLC, 2022
Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…
Descriptors: Classification, Accuracy, Item Response Theory, Correlation
Liu, Tour; Sun, Yicong; Li, Zhen; Xin, Tao – Measurement: Interdisciplinary Research and Perspectives, 2019
Aberrant response has an important impact on item parameter estimation, individuals' evaluation, and other statistical analysis. There are various types of aberrant response behaviors in educational and psychological tests, like sleeping, guessing, and plodding. Random response is the most common one. The purpose of this research was to clarify…
Descriptors: Test Reliability, Test Validity, Item Response Theory, Differences
Luo, Jiahui; Chan, Cecilia K. Y.; Zhao, Yue – Assessment & Evaluation in Higher Education, 2023
Intensive research attention has focused on developing students' evaluative judgement within the higher education curriculum, but little has addressed how it can be measured. The contextual nature of evaluative judgement makes it difficult to generate an encompassing instrument, highlighting the need for situated measurement tools. Against this…
Descriptors: Engineering Education, Higher Education, Evaluative Thinking, Intercultural Communication
Stoevenbelt, Andrea H.; Wicherts, Jelte M.; Flore, Paulette C.; Phillips, Lorraine A. T.; Pietschnig, Jakob; Verschuere, Bruno; Voracek, Martin; Schwabe, Inga – Educational and Psychological Measurement, 2023
When cognitive and educational tests are administered under time limits, tests may become speeded and this may affect the reliability and validity of the resulting test scores. Prior research has shown that time limits may create or enlarge gender gaps in cognitive and academic testing. On average, women complete fewer items than men when a test…
Descriptors: Timed Tests, Gender Differences, Item Response Theory, Correlation
Hartono, Wahyu; Hadi, Samsul; Rosnawati, Raden; Retnawati, Heri – Pegem Journal of Education and Instruction, 2023
Researchers design diagnostic assessments to measure students' knowledge structures and processing skills to provide information about their cognitive attribute. The purpose of this study is to determine the instrument's validity and score reliability, as well as to investigate the use of classical test theory to identify item characteristics. The…
Descriptors: Diagnostic Tests, Test Validity, Item Response Theory, Content Validity
Fierro-Suero, Sebastián; Almagro, Bartolomé J.; Becker, Eva S.; Sáenz-López, Pedro – International Journal of Educational Psychology, 2022
The objective of this study was to examine possible antecedents and consequences of teachers' emotions in the classroom. Based on a cognitive-social perspective and self-determination theory, we examined the relationship between basic psychological needs (BPNs), teachers' class-related emotions and teachers' life satisfaction. A sample of 595…
Descriptors: Teacher Attitudes, Emotional Response, Correlation, Life Satisfaction
Park, Mihwa; Flores, Raymond – International Journal of Science Education, 2021
The purpose of this study was to develop a questionnaire to measure elementary preservice teachers' emotions about teaching science and mathematics. To achieve this goal, a questionnaire, Teacher Emotion for teaching Science and Mathematics (TESaM), was designed and pilot and field tested with a sample of preservice elementary teachers in the…
Descriptors: Questionnaires, Affective Measures, Test Construction, Test Validity
Lúcio, Patrícia Silva; Vandekerckhove, Joachim; Polanczyk, Guilherme V.; Cogo-Moreira, Hugo – Journal of Psychoeducational Assessment, 2021
The present study compares the fit of two- and three-parameter logistic (2PL and 3PL) models of item response theory in the performance of preschool children on the Raven's Colored Progressive Matrices. The test of Raven is widely used for evaluating nonverbal intelligence of factor g. Studies comparing models with real data are scarce on the…
Descriptors: Guessing (Tests), Item Response Theory, Test Validity, Preschool Children
Sensoy, Gözde; Siyez, Digdem M. – International Journal for Educational and Vocational Guidance, 2019
The purpose of this study is to adapt the career distress scale for Turkish university students. Participants are 493 undergraduate students. Results indicated that the two-factor structure better fit the data. For discriminant and concurrent validity, correlation coefficients with Positive and Negative Affect Schedule and Career Decision…
Descriptors: Foreign Countries, Undergraduate Students, Career Planning, Correlation

