Twin UCLA studies find correlation between Google, Twitter searches and syphilis

Data collection via social media has been making headlines in recent weeks. New research shows how such information may be vital to public health. Two studies by UCLA researchers have found that online search terms and tweets could be used to predict syphilis trends.

Led by researchers from the UCLA-based University of California Institute for Prediction Technology, in collaboration with the Centers for Disease Control and Prevention (CDC), the two studies aimed to assess how well risk-related terms in Google searches and Twitter posts can predict syphilis trends.

In the first study, published in Epidemiology, researchers examined state-level search queries on Google in association with primary and secondary syphilis cases. Data for 25 keywords and phrases were collected from Google from 2012 to 2014. Weekly county-level syphilis data were also collected from the CDC for all 50 states.

A machine learning algorithm then analyzed Google search trends alongside actual rates of syphilis. It was able to predict 144 weeks of syphilis counts for each state with 90 percent accuracy.
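The article does not say which algorithm the researchers used. Purely as an illustration of the general idea, the sketch below fits a one-variable least-squares line relating a weekly search-volume index to weekly case counts, then checks its predictions on weeks held out of the fit. All numbers are invented for the example, and the function names (`fit_line`, `mean_abs_pct_error`) are hypothetical rather than taken from the study.

```python
# Illustrative sketch only: toy numbers, not data or methods from the study.

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_pct_error(actual, predicted):
    """Average of |actual - predicted| / actual across all weeks."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly search-volume index and syphilis case counts for one state.
search_volume = [10, 12, 15, 11, 18, 20, 17, 22]
case_counts = [21, 26, 30, 23, 38, 41, 35, 44]

# Fit on the first six weeks, then predict the last two.
train_x, test_x = search_volume[:6], search_volume[6:]
train_y, test_y = case_counts[:6], case_counts[6:]

slope, intercept = fit_line(train_x, train_y)
predicted = [slope * x + intercept for x in test_x]
error = mean_abs_pct_error(test_y, predicted)
print(f"held-out mean absolute percentage error: {error:.1%}")
```

A real forecasting model would use many keywords, many more weeks and a more careful validation scheme. The point here is only the shape of the workflow: fit on past weeks, then score predictions against weeks the model has not seen.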

The second study, published in Preventive Medicine, focused on county-level Twitter data containing words associated with sexually risky behaviors. A total of 8,538 geo-located tweets from May 26 to Dec. 9, 2012, were included in the study, along with weekly county-level cases of primary and secondary syphilis reported to the CDC from 2012 to 2013. The results showed that counties with more risk-related tweets in 2012 saw a 2.7 percent increase in primary and a 3.6 percent increase in secondary rates of early syphilis cases.
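The article does not describe the statistical model behind these figures. As a loose illustration of what it means for tweet volume and case counts to move together across counties, the sketch below computes a Pearson correlation coefficient by hand. The county values and the helper name `pearson_r` are made up for the example and are not drawn from the study.

```python
import math

# Illustrative sketch only: invented county values, not data from the study.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts for five counties.
risk_tweets_2012 = [5, 12, 30, 8, 22]
syphilis_cases_2013 = [2, 4, 11, 3, 7]

r = pearson_r(risk_tweets_2012, syphilis_cases_2013)
print(f"Pearson r = {r:.2f}")
```

A value of r near 1 means that counties with more risk-related tweets also tend to report more cases. It does not show that one causes the other.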

"Many of the most significant public health problems in our society today—HIV and sexually transmitted infections, opioid abuse and cancer—could be prevented if we had better data on when and where these issues were occurring," said Sean Young, founder and director of the UCLA Center for Digital Behavior and the UC Institute for Prediction Technology. "These two studies suggest that social media and internet search data might help to fix this problem by predicting when and where future syphilis cases may occur. This could be a tool that government agencies such as the CDC might use," added Young, who is also an associate professor of family medicine at the David Geffen School of Medicine at UCLA.