AI Places Confidential Health Information at Risk

By HospiMedica International staff writers
Posted on 21 Jan 2019
Image: AI can reconstruct anonymous data to identify individuals (Photo courtesy of Getty Images).
Advances in artificial intelligence (AI) technologies, such as those incorporated into activity trackers, smartphones, and smartwatches, can threaten the privacy of personal health data.

Researchers at the Massachusetts Institute of Technology (MIT, Cambridge, MA, USA), the University of California Berkeley (UCB; USA), and other institutions conducted a cross-sectional study of U.S. National Health and Nutrition Examination Survey (NHANES) data sets. The aim was to evaluate whether accelerometer-measured physical activity data, from which geographic and protected health information had been removed, could be reidentified using two machine learning methods: support vector machines (SVMs) and random forests.

The accelerometer-measured data were collected for seven continuous days. The primary outcome was the ability of the random forest and linear SVM algorithms to match demographic and aggregated physical activity data to individual-specific record numbers, measured as the percentage of correct matches made by each algorithm. The results showed that the random forest algorithm successfully reidentified the demographic and aggregated physical activity data of an average of 94% of the adults and 86% of the children. The linear SVM algorithm successfully reidentified the demographic and physical activity data of 85% of the adults and 68% of the children. The study was published on December 21, 2018, in JAMA Network Open.
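
To make the matching task concrete, the following is a minimal Python sketch of how a percentage of correct matches can be computed. It is not the study's method: it uses synthetic data rather than NHANES records, and a simple nearest-neighbor match stands in for the random forest and linear SVM algorithms. All names in it (make_records, perturb, reidentify, match_rate) are hypothetical.

```python
import random

def make_records(n, seed=0):
    """Synthetic records: (record_id, age, seven daily activity totals in minutes)."""
    rng = random.Random(seed)
    return [(i, rng.randint(18, 80), [rng.uniform(0, 120) for _ in range(7)])
            for i in range(n)]

def perturb(records, noise, seed=1):
    """A second view of the same people with measurement noise added.
    The record_id is carried along only so the guesses can be scored."""
    rng = random.Random(seed)
    return [(rid, age, [m + rng.gauss(0, noise) for m in mins])
            for rid, age, mins in records]

def distance(a, b):
    """Squared distance over age and the seven activity totals."""
    return (a[1] - b[1]) ** 2 + sum((x - y) ** 2 for x, y in zip(a[2], b[2]))

def reidentify(known, anonymous):
    """Guess, for each anonymous record, the id of the closest known record."""
    return [min(known, key=lambda k: distance(k, anon))[0] for anon in anonymous]

def match_rate(known, anonymous):
    """Fraction of anonymous records whose guessed id is the true id."""
    guesses = reidentify(known, anonymous)
    correct = sum(g == anon[0] for g, anon in zip(guesses, anonymous))
    return correct / len(anonymous)
```

Calling match_rate(known, perturb(known, noise)) on a set built with make_records returns the fraction of records matched to the right record number. With no noise every record matches itself, and the rate can only be expected to fall as the noise grows.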

“The results point out a major problem; if you strip all the identifying information, it doesn't protect you as much as you'd think. Someone else can come back and put it all back together if they have the right kind of information,” said senior author Anil Aswani, PhD, of UCB. “You could imagine Facebook gathering step data from the app on your smartphone, then buying health care data from another company and matching the two. They could either start selling advertising based on that or they could sell the data to others.”

“Employers, mortgage lenders, credit card companies and others could potentially use AI to discriminate based on pregnancy or disability status, for instance. What I'd like to see from this are new regulations or rules that protect health data; but there is actually a big push to even weaken the regulations right now,” concluded Dr. Aswani. “The risk is that if people are not aware of what's happening, the rules we have will be weakened. And the fact is the risks of us losing control of our privacy when it comes to health care are actually increasing and not decreasing.”

Random forests are an ensemble learning method that combines the predictions of a large number of decision trees. Although random forest models are difficult to interpret, the approach is one of the most successful machine learning techniques because it often achieves the highest accuracy. Linear SVM is a popular classification algorithm that computes quickly, is easily interpretable, and has good accuracy.
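
The ensemble idea can be illustrated with a short Python sketch. It is a simplification and not a full random forest: each "tree" here is a single-split decision stump trained on a bootstrap sample, and the ensemble predicts by majority vote. The function names (train_stump, train_forest, predict_forest) are hypothetical and do not come from any library.

```python
import random
from collections import Counter

def train_stump(samples, rng):
    """Fit a one-split 'tree': pick a random feature, split at its mean,
    and label each side with the most common class found there."""
    feature = rng.randrange(len(samples[0][0]))
    threshold = sum(x[feature] for x, _ in samples) / len(samples)
    left = Counter(y for x, y in samples if x[feature] <= threshold)
    right = Counter(y for x, y in samples if x[feature] > threshold)
    overall = Counter(y for _, y in samples).most_common(1)[0][0]
    left_label = left.most_common(1)[0][0] if left else overall
    right_label = right.most_common(1)[0][0] if right else overall
    return feature, threshold, left_label, right_label

def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label

def train_forest(samples, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(train_stump(bootstrap, rng))
    return forest

def predict_forest(forest, x):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]
```

Here samples is a list of (features, label) pairs, for example ((0.1, 0.2), "a"). Any single stump can be wrong, for instance when its bootstrap sample happens to contain only one class, but the vote across many stumps tends to be more reliable, which is the intuition behind the accuracy of ensemble methods.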

Related Links:
Massachusetts Institute of Technology
University of California Berkeley



Copyright © 2000-2019 Globetech Media. All rights reserved.