Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 4 |
Descriptor
| Data Analysis | 4 |
| Probability | 4 |
| Models | 3 |
| Artificial Intelligence | 2 |
| Classification | 2 |
| Computer Software | 2 |
| Information Retrieval | 2 |
| Network Analysis | 2 |
| Visual Aids | 2 |
| Algebra | 1 |
| Authors | 1 |
Source
| Grantee Submission | 4 |
Author
| Cai, Zhiqiang | 2 |
| Eagan, Brendan | 1 |
| Graesser, Art | 1 |
| Hu, Xiangen | 1 |
| Li, Hiyiang | 1 |
| Lijin Zhang | 1 |
| Shaffer, David Williamson | 1 |
| Siebert-Evenstone, Amanda | 1 |
| Xueyang Li | 1 |
| Zhang, Danyang | 1 |
| Zhang, Zhiyong | 1 |
Publication Type
| Reports - Research | 3 |
| Journal Articles | 2 |
| Speeches/Meeting Papers | 2 |
| Reports - Evaluative | 1 |
Education Level
| Higher Education | 1 |
| Postsecondary Education | 1 |
Cai, Zhiqiang; Siebert-Evenstone, Amanda; Eagan, Brendan; Shaffer, David Williamson – Grantee Submission, 2021
When text datasets are very large, manually coding line by line becomes impractical. As a result, researchers sometimes try to use machine learning algorithms to automatically code text data. One of the most popular algorithms is topic modeling. For a given text dataset, a topic model provides probability distributions of words for a set of…
Descriptors: Coding, Artificial Intelligence, Models, Probability
Lijin Zhang; Xueyang Li; Zhiyong Zhang – Grantee Submission, 2023
The thriving developer community has a significant impact on the widespread use of R software. To better understand this community, we conducted a study analyzing all R packages available on CRAN. We identified the most popular topics of R packages by text mining the package descriptions. Additionally, using network centrality measures, we…
Descriptors: Computer Software, Programming Languages, Data Analysis, Visual Aids
Cai, Zhiqiang; Li, Hiyiang; Hu, Xiangen; Graesser, Art – Grantee Submission, 2016
This paper provides an alternative way of document representation by treating topic probabilities as a vector representation for words and representing a document as a combination of the word vectors. A comparison on summary data shows that this representation is more effective in document classification. [This paper was published in:…
Descriptors: Probability, Natural Language Processing, Models, Automation
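The abstract above is truncated and does not say how the word vectors are derived or combined, so the following is only a minimal sketch under stated assumptions: each word's vector is taken to be its topic probabilities P(topic | word), obtained by normalizing the topic-word probabilities under a uniform topic prior, and a document is represented as the plain average of its word vectors. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def word_topic_vectors(topic_word):
    """Turn topic-word probabilities into one topic-probability vector per word.

    topic_word[k][w] is P(word w | topic k). Assuming a uniform prior over
    topics (an assumption, not stated in the abstract), P(topic k | word w)
    is proportional to topic_word[k][w].
    """
    num_topics = len(topic_word)
    vocab_size = len(topic_word[0])
    vectors = []
    for w in range(vocab_size):
        column = [topic_word[k][w] for k in range(num_topics)]
        total = sum(column)
        if total > 0:
            vectors.append([p / total for p in column])
        else:
            # A word with zero probability in every topic carries no signal.
            vectors.append([1.0 / num_topics] * num_topics)
    return vectors


def document_vector(token_ids, word_vectors):
    """Represent a document as the average of its word vectors (assumed combination rule)."""
    if not token_ids:
        raise ValueError("document has no tokens")
    num_topics = len(word_vectors[0])
    sums = [0.0] * num_topics
    for t in token_ids:
        for k in range(num_topics):
            sums[k] += word_vectors[t][k]
    return [s / len(token_ids) for s in sums]


# Toy example: 2 topics, 3-word vocabulary (made-up probabilities).
topic_word = [
    [0.7, 0.2, 0.1],  # P(word | topic 0)
    [0.1, 0.3, 0.6],  # P(word | topic 1)
]
vectors = word_topic_vectors(topic_word)
doc = document_vector([0, 0, 2], vectors)  # document made of word ids 0, 0, 2
```

The resulting `doc` vector could then be passed to any standard classifier as the document's feature vector; other combination rules, such as weighted sums, would fit the same description equally well.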
Zhang, Zhiyong; Zhang, Danyang – Grantee Submission, 2021
Data science has maintained its popularity for about 20 years. This study adopts a bottom-up approach to understand what data science is by analyzing the descriptions of courses offered by the data science programs in the United States. Through topic modeling, 14 topics are identified from the current curricula of 56 data science programs. These…
Descriptors: Statistics Education, Definitions, Course Descriptions, Computer Science Education

