Predicting disease spread based on climate change

Meghan Bongartz

The conversation about disease in the United States tends to revolve around diseases that pose a current threat. This means that, on average, we spend far more time talking about measles than Ebola, but it also makes it far more terrifying when Ebola does come up, because that means it is suddenly posing a threat and we do not have the infrastructure to deal with an outbreak. Some diseases that we don’t currently consider threats in the United States spread in ways that make it difficult to predict when they could become problems. However, other diseases may spread or move with climate change, and we should be able to plan for these.

My goal was to investigate the risk of tropical diseases spreading in the United States as the climate changes over time. A plethora of diseases could be affected by climate change for various reasons, but I narrowed my area of interest to vector-borne diseases and, more specifically, mosquito-borne diseases, because mosquitos show a stronger climate preference than some other vectors such as ticks. I looked at two different vectors: Tiger Mosquitos and Southern House Mosquitos.

Before addressing the vectors, though, I needed data on the rate at which climate is changing in the United States. This was available from the National Oceanic and Atmospheric Administration here: http://www.ncdc.noaa.gov/cag/time-series/us. NOAA has information about temperature and precipitation since 1895 that can be downloaded in a nicely formatted CSV file; however, the type of information, time scale, and state or region must be selected manually. Initially, I downloaded the annual mean temperature and precipitation for the months of January and July and for the full year on a regional basis because this was manageable to do manually. Upon exploring the data sets, though, I discovered that some regions had very strong correlations between time and temperature change or precipitation change, and others had virtually no correlation. While this was not unexpected, it did lead me to the decision that I should look at the data for individual states for more accuracy. In the future, I would even be interested in looking at smaller areas within the states.
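The strength of a time trend like the one described above can be checked with a Pearson correlation between year and the measured value. The following is a minimal sketch, not the code used for this project; the function name pearson_r and the sample numbers are made up for illustration and are not NOAA data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series has no defined correlation; treat it as "no trend".
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Illustrative (made-up) series: one with a steady warming trend, one that
# only alternates between two values with no trend over time.
years = list(range(1895, 2015))
steady = [50.0 + 0.02 * (y - 1895) for y in years]
alternating = [10.0 if i % 2 == 0 else 11.0 for i in range(len(years))]

print(round(pearson_r(years, steady), 3))
print(round(pearson_r(years, alternating), 3))
```

A value near 1 or -1 suggests that a straight-line trend describes the series well, while a value near 0 suggests that time explains very little of the variation.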

To get around the manual download form and the authentication that went with it, I wrote a scraper to pull the CSVs I was interested in. The permalinks for the CSVs took the form “http://www.ncdc.noaa.gov/cag/time-series/us/” + fips_code + “/00/” + parameter + “/” + time_scale + “/” + month + “/1895-2015.csv?base_prd=true&firstbaseyear=1901&lastbaseyear=2000”, so my task was to establish where each of the form selections fit into the URL. I was initially under the impression that the site used FIPS codes as state identifiers, but this is not the case, and the mistake resulted in a collection of wrongly labeled files. The states are actually just numbered in alphabetical order (excluding Alaska and Hawaii).
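The URL pattern above can be sketched as a small helper. This is an illustration rather than the original scraper: the function names are hypothetical, the numbering is assumed to start at 1 with no zero padding, and the parameter, time scale, and month values in the usage line are placeholders, since the codes the site accepts are not listed here.

```python
# The 48 contiguous states in alphabetical order (Alaska and Hawaii excluded).
STATES = [
    "Alabama", "Arizona", "Arkansas", "California", "Colorado", "Connecticut",
    "Delaware", "Florida", "Georgia", "Idaho", "Illinois", "Indiana", "Iowa",
    "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts",
    "Michigan", "Minnesota", "Mississippi", "Missouri", "Montana", "Nebraska",
    "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

BASE = "http://www.ncdc.noaa.gov/cag/time-series/us/"
SUFFIX = "/1895-2015.csv?base_prd=true&firstbaseyear=1901&lastbaseyear=2000"

def state_number(state):
    """Position of the state in the alphabetical list (assumed to start at 1)."""
    return STATES.index(state) + 1

def build_url(state, parameter, time_scale, month):
    """Assemble a CSV permalink from the form selections.

    Assumption: the state number is written without zero padding.
    """
    parts = [str(state_number(state)), "00", parameter, time_scale, month]
    return BASE + "/".join(parts) + SUFFIX

# Placeholder selections ("tavg", "1", "7" are illustrative, not confirmed codes).
print(build_url("Texas", "tavg", "1", "7"))
```

Keeping the state lookup separate from the URL assembly makes the labeling mistake described above easier to catch, because each file can be named from the same list that produced its number.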

Once I had my climate data, I needed to find information about mosquitos. I found conflicting information on a number of websites, and one question that I needed to address was whether to use the ideal climate for my disease vectors or a tolerable climate. I settled on a combination of climate factors that would give the widest allowable climate window and therefore err on the side of predicting that more states fall into the range of risk for disease-spreading mosquitos. Because the purpose of this project is to plan for potential outbreaks, it would be better to predict a mosquito-supporting climate in a place that does not wind up having that climate than the reverse. The following websites were used for information about mosquitos: http://www.cabi.org/isc/datasheet/86848, http://www.cabi.org/isc/datasheet/94897, http://www.who.int/mediacentre/factsheets/fs387/en/, http://www.climatecentral.org/gallery/graphics/mosquito-season-getting-longer, http://invasivespeciesireland.com/news/predicting-the-spread-of-the-tiger-mosquito-in-europe/, http://digital.csic.es/handle/10261/60982. Lists of current mosquito locations were taken from CABI for comparison.

I used prediction models based on linear regression to calculate when each state would fall into the tolerated climate range for both Tiger Mosquitos and Southern House Mosquitos based on three predictors. For Tiger Mosquitos, these were warm month temperature range, cold month minimum, and minimum annual rain. For Southern House Mosquitos, these were temperature range over the whole year (important for larval development), warm month temperature, and minimum precipitation. I then created functions to calculate the year in which each state would fall into the range for each predictor.
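The idea of a threshold year can be sketched with an ordinary least-squares line. This is a simplified illustration, not the model used for the results below: it handles only a single lower bound (not a full range), the function names are hypothetical, the data are made up, and combining predictors by taking the latest year is an assumption.

```python
import math

def fit_line(years, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def threshold_year(years, values, minimum, current_year=2015):
    """First year in which the fitted trend reaches `minimum`.

    Returns current_year if the trend is already at or above the minimum,
    and None if the trend is flat or falling and never reaches it.
    """
    slope, intercept = fit_line(years, values)
    if slope * current_year + intercept >= minimum:
        return current_year
    if slope <= 0:
        return None
    crossing = (minimum - intercept) / slope
    # Small tolerance so floating-point fuzz does not push the result up a year.
    return math.ceil(crossing - 1e-9)

def state_year(predictor_years):
    """Assumption: a state is in range once every predictor is in range."""
    if any(y is None for y in predictor_years):
        return None
    return max(predictor_years)

# Made-up series rising half a unit per year from 0 in 1895.
years = list(range(1895, 2016))
values = [0.5 * (y - 1895) for y in years]
print(threshold_year(years, values, minimum=70))  # trend reaches 70 in the future
print(threshold_year(years, values, minimum=50))  # already above 50 today
```

Returning the current year for a state that is already in range mirrors how such states appear with the year 2015 in a list of results.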

The results of my analysis were not exactly what I expected. My model produced a long list of states that should already be in the climate range for each type of mosquito this year, and only a few states with climate threshold dates in the future. In a way, this is unexciting because there is very little being predicted; however, it’s also a reminder that we may be closer to a climate that is hospitable to tropical diseases than we think. The states at climate risk produced by the model are as follows:

Tiger mosquitos:

Mississippi 2015

Oklahoma 2015

Delaware 2015

Arkansas 2015

Louisiana 2015

Texas 2015

California 2015

Georgia 2015

Maryland 2042

Virginia 2015

Oregon 2088

South Carolina 2015

Florida 2015

Alabama 2015

North Carolina 2015

Tennessee 2015

House mosquitos:

Mississippi 2015

Oklahoma 2015

Delaware 2015

Illinois 2015

Arkansas 2015

Indiana 2015

Louisiana 2015

Texas 2015

Kansas 2015

Connecticut 2027

California 2015

West Virginia 2015

Georgia 2015

Pennsylvania 2096

Missouri 2015

New Jersey 2015

Maryland 2015

Virginia 2015

Massachusetts 2077

South Carolina 2015

Florida 2015

Kentucky 2015

Rhode Island 2015

Nebraska 2066

Ohio 2015

Alabama 2015

North Carolina 2015

Tennessee 2015

These states are at higher risk for diseases including dengue fever, yellow fever, chikungunya, St. Louis encephalitis, West Nile virus, lymphatic filariasis, and Japanese encephalitis.

This analysis is certainly far from perfect. It would be worth looking at areas smaller than states, as some states cover a very large area and may have differing climates within their borders (Texas comes to mind, but states like California, Illinois, and Indiana are also quite long). I would also like to do further research into the preferred climates of the two mosquito types in order to make the models more accurate. Some options might be to enter states that already host mosquitos into the model in order to train it or to use the climates of countries where diseases spread by these mosquitos are a major problem currently in order to model the preferred climates. Regardless, it should give us pause that these diseases are still considered “tropical” or “exotic” when parts of the United States could be at very real risk for them.

If you’re curious about the intersection of data, coding and visualization, check out the Lede Program, an intensive certification program at Columbia’s School of Journalism, in conjunction with the Department of Computer Science. Find out more on our main page. Applications are open soon!
