Improving Public Health Surveillance During COVID-19 with Data Analytics and AI
August 28, 2020 in Engineering Blog
As the leader of the State and Local Government business at Databricks, I get to see what governments all over the U.S. are doing to address the Novel Coronavirus and COVID-19 crisis. I am continually inspired by the work of public servants as they go about their business to save lives and address this crisis.
In the midst of all of the bad news, there are good news reports of the important work done by public health officials on COVID-19. The good work performed by public health departments beyond the Centers for Disease Control and Prevention (CDC) does not usually make dramatic headlines, but it is making an amazing impact.
Like many of us, local and state governments are figuring things out as they go along, one step at a time. By observing successful COVID-19 response programs in countries where infections happened early, public health agencies first recognized the need for contact tracing as an important data source, and have scrambled to implement contact tracing programs. Once contact tracing is in place, vast amounts of data become available.
Countries like South Korea have shown that COVID-19 case data from contact tracing can inform the management of outbreaks in powerful ways. How does all that data get used to inform government policy makers, to guide public health practices, and to define public policy, sometimes in spite of a less-than-enthusiastic public? The epidemiological study of this data informs research not just on individuals, but on populations, geographies, and risk factors that contribute to outbreaks, hospitalizations, and fatalities.
What is the right shelter-in-place or reopening policy for Los Angeles County vs. Humboldt County, California? What are the right group size limitations? What are the right policies for high-risk environments like skilled nursing facilities? Data can inform all of these policy recommendations. It must.
Unfortunately, it’s not that easy. Local departments of health and other public health agencies at the forefront of this pandemic are struggling with fundamental data challenges that are impeding their ability to drive meaningful insights. Challenges like:
- How do we bring together clinical and case investigation datasets that reside in siloed, legacy data warehouses, EHR and operational systems managed by thousands of healthcare providers and agencies?
- How do we provide the necessary compute power to process these population-scale datasets?
- How do we blend structured data (e.g. medical records) with unstructured data (e.g. patient chatbot logs, medical images) to power novel insights and predictive models?
- How do we reliably ingest streaming data for real-time insights on the spread of COVID-19, hospital usage trends, and more?
For many health organizations, building this analytics muscle has been a slow burn. The good news: powerful cloud-based software solutions, like the Databricks Unified Data Analytics Platform, are accelerating this transformation with the tooling and scale needed to analyze large volumes of health data in minutes. With these fundamental data problems solved, health organizations can refocus their efforts on building analytics and ML products instead of wrangling their data. One example is the COVID-19 surveillance solution developed on top of Databricks, which is being deployed in a number of state and local government health departments, as well as by a number of hospitals and care facilities across the U.S.
Included above is a brief demo of our public health surveillance solution. In the demo, we show how to take a data-driven approach to adaptive response, or in other words, apply predictive analytics to COVID-19 datasets to help drive more effective shelter-in-place policies.
With this solution on Databricks, we’re able to yield important insights in a short amount of time and, as a cloud-native offering, it can be deployed quickly and cost-effectively at scale. This solution includes COVID-19 data sets we have previously published, as well as workbooks used by public health departments to deliver data-driven insight to guide COVID-19 public policy. This is one of many solutions that can be built on Databricks using this dataset. Other use cases for COVID-19 data include Hotspot Analysis, Epidemiological Modeling, and Supply Chain Optimization. You can learn more on our COVID-19 hub.
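To make the hotspot analysis use case a little more concrete, here is a minimal sketch in plain Python. It is not the code from our surveillance solution: the function names, the county figures, and the threshold of 25 daily new cases per 100,000 residents are all made up for illustration and are not public health guidance. The idea is simply to smooth daily case counts with a trailing 7-day average, normalize by population, and flag counties above a chosen rate.

```python
def rolling_mean(values, window=7):
    """Trailing mean over up to `window` most recent values at each position."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def flag_hotspots(daily_new_cases, population, threshold_per_100k=25.0, window=7):
    """Return {county: rate} for counties whose latest trailing-average
    daily new cases per 100,000 residents exceeds the threshold.

    The default threshold is an illustrative assumption only.
    """
    hotspots = {}
    for county, series in daily_new_cases.items():
        if not series:
            continue  # no data reported for this county
        latest_avg = rolling_mean(series, window)[-1]
        rate = latest_avg / population[county] * 100_000
        if rate > threshold_per_100k:
            hotspots[county] = round(rate, 1)
    return hotspots


# Hypothetical data: one week of daily new cases per county.
cases = {
    "County A": [100, 120, 130, 150, 160, 170, 180],
    "County B": [2, 3, 1, 2, 4, 3, 2],
}
population = {"County A": 500_000, "County B": 130_000}

print(flag_hotspots(cases, population))
```

In practice the same logic would run over much larger, continuously updated case tables, and the threshold would come from epidemiologists rather than a default argument.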
Databricks is committed to fighting the COVID-19 epidemic and other infectious diseases by implementing powerful analytical tools for government agencies across the country. We invite you to inquire about how we might be able to help your agency.
Next Steps
- Learn how IU Health built real-time COVID-19 dashboards with Databricks
- Sign up for a free trial or request a live demo of our health data surveillance solution