ERIC Number: ED667616
Record Type: Non-Journal
Publication Date: 2021
Pages: 75
Abstractor: As Provided
ISBN: 979-8-5346-7164-3
ISSN: N/A
EISSN: N/A
Available Date: 0000-00-00
Unsupervised Learning Methods in Digital Phenotyping
Gang Liu
ProQuest LLC, Ph.D. Dissertation, Harvard University
Digital phenotyping is defined as the "moment-by-moment quantification of the individual level human phenotype in situ using data from personal digital devices". The passive data collected by smartphones, including GPS, accelerometer and communication logs, can provide insights into users' behaviors that could be related to various diseases. Research findings have demonstrated robust associations between behavioral risk factors derived from the sensor data and health outcomes, including obesity, diabetes, various cardiovascular diseases, mental health and mortality. To improve patients' symptoms, communication and clinical outcomes, this thesis focuses on the challenges of integrating smartphone-derived health informatics into clinical practice in free-living settings. The first chapter addresses the missing data problem in GPS data, which is caused by a sampling strategy constrained by limited battery capacity. We developed an algorithm that simulates an individual's trajectory based on previously observed GPS location traces, without reliance on external data. The method uses a sparse online Gaussian process, spherical geometry and bidirectional imputation to reduce the computational cost and improve on the accuracy of existing methods. The second chapter describes how to quantify the uncertainty of accelerometer-based estimates, such as step counts, that arises from different sources of missingness. We propose an online Bayesian learning method which models the step count estimates as random variables from a zero-inflated negative binomial distribution. The method updates the posterior distribution of each parameter on the fly, and provides a credible interval for each time window as well as each day based on the posterior predictive distribution. The third chapter is about detecting aberrant human behaviors in real time from all kinds of passive data collected by smartphones.
We propose an online anomaly detection method using Hotelling's T-squared test, where the test statistic is a weighted average, with more weight on the between-individual component when few data are available for the individual and more weight on the within-individual component when the data are adequate. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml.]
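The weighted test statistic described for the third chapter can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not give the weighting rule, so the weight w = n / (n + k), the tuning constant k, the ridge term, and the names hotelling_t2 and weighted_t2 are all assumptions made for illustration. The sketch only shows the general idea of leaning on population-level (between-individual) statistics when an individual has little history and on that individual's own (within-individual) statistics as data accumulate.

```python
import numpy as np


def hotelling_t2(x, mean, cov):
    """Squared Mahalanobis distance of x from mean under cov (one-observation T-squared)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(cov, diff))


def weighted_t2(x, own_history, pop_mean, pop_cov, k=10.0, ridge=1e-6):
    """Blend within- and between-individual T-squared statistics.

    ASSUMPTION: the weight w = n / (n + k) is a hypothetical choice; the
    abstract states only that the weight shifts toward the within-individual
    component as the individual's data become adequate.
    """
    x = np.asarray(x, dtype=float)
    p = x.shape[0]
    own_history = np.asarray(own_history, dtype=float).reshape(-1, p)
    n = own_history.shape[0]

    # Between-individual component: distance from the population profile.
    t2_between = hotelling_t2(x, pop_mean, pop_cov)
    if n < 2:
        # Too little personal data to estimate a covariance matrix.
        return t2_between, 0.0

    # Within-individual component: distance from the person's own history.
    own_mean = own_history.mean(axis=0)
    own_cov = np.atleast_2d(np.cov(own_history, rowvar=False)) + ridge * np.eye(p)
    t2_within = hotelling_t2(x, own_mean, own_cov)

    w = n / (n + k)  # grows toward 1 as personal data accumulate
    return w * t2_within + (1.0 - w) * t2_between, w


# Hypothetical usage with simulated two-feature daily summaries.
rng = np.random.default_rng(0)
history = rng.normal(size=(10, 2))
statistic, weight = weighted_t2([3.0, 0.0], history, np.zeros(2), np.eye(2))
```

In an online setting the individual's mean and covariance would be updated incrementally rather than recomputed from the full history, and the blended statistic would be compared against a threshold to flag an anomaly; both steps are omitted here because the abstract does not specify them.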
Descriptors: Learning Processes, Health Behavior, Handheld Devices, Telecommunications, Data Use, Behavior, Data Collection, Phenomenology
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com.bibliotheek.ehb.be/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A