Maternal health data analysis

For this project, my goal was to analyze maternal mortality rates across the United States with a focus on understanding the metrics that impact these rates. I investigated how availability of resources, quality of healthcare, and access to healthcare all relate to the maternal mortality rate.

The questions that I set out to explore through this project are:

How do maternal mortality rates compare across US states?
How do the maternal mortality rates compare to the availability of maternal healthcare access in each state?
Given current maternal mortality data, what trends might we expect in states that have since enacted reproductive healthcare restrictions? (If time and project scope allow)

Data profiles

America's Health Rankings 2024 Maternal and Infant Health Disparities Data Brief

This data source is provided as a downloadable .csv file. There are no restrictions or limitations on this dataset, and it is available for public download and use. From this data source, I plan to extract all data related to maternal mortality. These dimensions include maternal mortality rate, access to maternity care, access to prenatal care, state, and year. This dataset is quite extensive and also breaks down other factors, such as ethnicity and health insurance coverage. I do not plan to factor in these considerations, so I will not pull them from this set. I also will not need details such as report edition and report type.
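As a rough sketch of that extraction step, the snippet below keeps only the wanted measures and columns from a .csv export. The column names ("Measure Name", "State Name", and so on), the measure labels, and the sample rows are all placeholders made up for illustration; the real file's headers and values may differ. It uses only the Python standard library so that it runs outside the notebook.

```python
import csv
import io

# Placeholder export: column names and rows are hypothetical, not real figures.
SAMPLE_CSV = """Edition,Report Type,Measure Name,State Name,Year,Value
2024,Data Brief,Maternal Mortality,State A,2022,10.0
2024,Data Brief,Uninsured Women,State A,2022,5.0
2024,Data Brief,Maternal Mortality,State B,2022,20.0
2024,Data Brief,Prenatal Care,State B,2022,70.0
"""

# Measures and columns to keep (assumed labels).
KEEP_MEASURES = {"Maternal Mortality", "Prenatal Care"}
KEEP_COLUMNS = ["State Name", "Year", "Measure Name", "Value"]


def extract_rows(csv_text, measures=KEEP_MEASURES, columns=KEEP_COLUMNS):
    """Return rows for the wanted measures, trimmed to the wanted columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {col: row[col] for col in columns}
        for row in reader
        if row["Measure Name"] in measures
    ]


rows = extract_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

Dropping the edition and report-type columns at load time keeps every later step working with just state, year, measure, and value.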

The Commonwealth Fund 2024 State Scorecard on Women’s Health and Reproductive Care

This dataset originates from an article that contains several visualizations and downloadable datasets. The information provided in the article is all relevant and interesting, but my focus will be on the state scorecard information, so my plan is to download only this single dataset. The data is provided as a downloadable .csv and there are no limitations or restrictions on using it. From this dataset, I will pull the data for state, overall ranking, health outcomes, health care quality and prevention, and coverage, access, and affordability. I will not need the feature for downloading each state’s unique profile.

Analysis

To start this project, I first created independent visualizations from each of my datasets.

I began by using my first dataset to visualize how the ranking of each state compared to one another across the 4 different health scorecard dimensions.

Analysis: Dataset 1: Summary of State Rankings

The following dataset comes from The Commonwealth Fund 2024 State Scorecard on Women’s Health and Reproductive Care (Link here)

This dataset was created in order to analyze the women's health care in each U.S. state. The data provides a "score" for each state determined based on the following metrics: health outcomes; health care quality and prevention; and coverage, access, and affordability.

I started by creating a simple bar graph that showed how the states' overall scorecards ranked against each other.

Note that the shorter the bar, the better "score" that the state received across several metrics. This bar graph shows Massachusetts as the top ranked state while Mississippi was ranked the lowest state for women's health.

Next, I wanted to evaluate how the different dimensions related to the overall scorecard ranking of the states. To do this, I created two different scatterplots. The first one looks at coverage, access, and affordability compared to each state's overall ranking. The second scatterplot shows health care quality and prevention compared to each state's health outcomes ranking.

This scatterplot shows a clear correlation between a state's ranking for coverage, access, and affordability and that state's overall ranking.

This scatterplot was interesting because there was less correlation than I expected. A state's ranking for health care quality and prevention does not appear to be strongly related to its health outcomes ranking.
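One way to put a number on how strong or weak these relationships look is a rank correlation. The sketch below is not part of the original analysis: it assumes both inputs are already rankings (1 = best) and computes the Pearson correlation of those ranks, which is the Spearman rank correlation. The example lists are placeholders, not values from the scorecard.

```python
from math import sqrt


def rank_correlation(ranks_x, ranks_y):
    """Pearson correlation of two rank lists (equivalent to Spearman's rho)."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    mean_x = sum(ranks_x) / n
    mean_y = sum(ranks_y) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(ranks_x, ranks_y))
    var_x = sum((x - mean_x) ** 2 for x in ranks_x)
    var_y = sum((y - mean_y) ** 2 for y in ranks_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all ranks are equal")
    return cov / sqrt(var_x * var_y)


# Placeholder rankings for four hypothetical states.
access_rank = [1, 2, 3, 4]
overall_rank = [1, 2, 3, 4]
print(rank_correlation(access_rank, overall_rank))
```

A value near 1 would support a "clear correlation" reading, while a value near 0 would match a plot with little visible pattern.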

Finally, for analyzing this first dataset, I wanted to create a scatterplot that included all 4 dimensions that went into calculating each state's final scorecard. The size of each datapoint represents that state's ranking for health care quality and prevention.

This scatterplot shows some correlation between each of the metrics and the state's overall ranking (although this was expected given how the ranking was calculated in the first place). I found some of the outliers to be really interesting. For example, District of Columbia is rated very high on the overall state ranking, as well as for coverage, access, and affordability, yet it has a relatively lower health outcome score than states near the same point on the plot. Additionally, states like West Virginia, Louisiana, and Kentucky had high ratings for coverage, access, and affordability, yet ranked much lower overall.
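Outliers like these can also be flagged programmatically rather than by eye. The following is a minimal sketch, assuming each dimension is available as a mapping from state to rank (1 = best). The function name, the threshold, and the sample ranks are hypothetical.

```python
def rank_gaps(overall, access, threshold):
    """List (state, gap) pairs where overall and access ranks differ widely.

    gap = overall rank - access rank, so a positive gap means the state
    ranks worse overall than it does on coverage, access, and affordability.
    """
    gaps = {s: overall[s] - access[s] for s in overall if s in access}
    flagged = [(s, g) for s, g in gaps.items() if abs(g) >= threshold]
    # Largest disagreement first.
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


# Placeholder ranks for three hypothetical states.
overall = {"State A": 1, "State B": 30, "State C": 12}
access = {"State A": 2, "State B": 10, "State C": 11}
print(rank_gaps(overall, access, threshold=10))
```

The threshold is a judgment call; a smaller value flags more states and a larger one keeps only the most striking disagreements.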

Being able to break down the four different metrics that make up the total scorecard ranking was really helpful in providing context for my next dataset. Going into the maternal mortality rates across the states, I have an initial idea of what trends to expect, which is a useful reference point.

Analysis: Dataset 2: Maternal Mortality and Access to Maternity Care

The next dataset is America's Health Rankings 2024 Maternal and Infant Health Disparities Data Brief (Link here). It contains an expansive breakdown of maternal and infant health statistics across the U.S. For this project, I am specifically interested in data related to maternal mortality and access to maternity care.

To begin working with this dataset, I first created visualizations to depict which states had the highest level of maternal mortality as well as the access to maternity care in each state.

I used my second dataset to create some visualizations of the maternal mortality rate across each state. The choropleth map was really helpful in seeing how the states compared to one another, while the bar graph was very helpful in showing which states were at either end of the ranking without having to hover over a state or interpret a shade of color. This helped me to answer my first research question, which was "How do maternal mortality rates compare across US states?" I found a few regional trends, but did not see as significant a regional difference as I expected. For example, the maternal mortality rates appear to be much higher in the southeast, in states like Alabama, Arkansas, and Louisiana. Generally, states in the north appeared to have lower maternal mortality rates. States like California, Wisconsin, and Minnesota all stood out as having exceptionally low maternal mortality rates.

Analysis: Combining Datasets 1 and 2

Finally, I wanted to combine my two datasets to investigate my 2nd research question about how access to healthcare and quality of healthcare impacted maternal mortality rates, if at all. In order to visualize this, I created a double bar graph based on state. Each state has two measures, one for maternal mortality rate and one for scorecard ranking from the first dataset (inverted to better compare against the maternal mortality rate). By inverting the scorecard ranking, it is easier to compare the two measures as a higher bar is now worse in both cases.
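A common way to flip a ranking is to subtract each rank from the number of ranked entries plus one, so that first place becomes last and vice versa. The sketch below shows that arithmetic only; it is not necessarily how the notebook performed the inversion. Whether a flip is needed depends on which direction the exported column actually runs, which should be checked against the data. The function name and the count of 51 ranked entries are assumptions for illustration.

```python
def invert_rank(rank, n_ranked):
    """Flip a 1-based rank so that 1 becomes n_ranked and n_ranked becomes 1."""
    if not 1 <= rank <= n_ranked:
        raise ValueError("rank must be between 1 and n_ranked")
    return n_ranked + 1 - rank


# Assumed: 51 ranked entries (50 states plus District of Columbia).
N_RANKED = 51
for rank in (1, 26, 51):
    print(rank, "->", invert_rank(rank, N_RANKED))
```

Applying the function twice returns the original rank, which is a quick sanity check that no state was shifted by the flip.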

Completing this visualization helped me to answer my 2nd research question of "how do the maternal mortality rates compare to the availability of maternal healthcare access in each state?" For simplicity, I am using the overall scorecard ranking as a measure of the availability of maternal healthcare access, since that metric is a major factor in determining the state's scorecard ranking. My expectation prior to creating this graph was that the bars should be very close in value. After creating this visualization, I noted both some expected and unexpected trends. Some states, such as Colorado and Wisconsin, were relatively close in state scorecard value and maternal mortality rate. Other states, like Nevada, had significant differences between the two values. In general, there appears to be a slight correlation between maternal mortality rate and maternal healthcare access in each state.

However, it is worth noting that the scale for the state ranking is much larger than the scale for maternal mortality rate. As a result, it is not completely effective to use the difference in the bar values as an ideal comparison, but it provides a solid starting point.
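One possible way to address the scale mismatch is min-max scaling, which maps each measure onto the range 0 to 1 before plotting. This is a sketch of that general technique rather than something done in the original analysis, and the input values are placeholders.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Placeholder inputs: ranks on a 1-51 scale and rates on a much smaller scale.
ranks = [1, 26, 51]
rates = [10.0, 20.0, 30.0]
print(min_max_scale(ranks))
print(min_max_scale(rates))
```

After scaling, a bar height of 0.5 means "halfway between the lowest and highest state" for both measures, so differences between paired bars become comparable.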

Conclusions and Directions for future work

For the most part, my findings were relatively expected and unsurprising. There were a few outliers within some of my data visualizations, as well as a few factors that proved not to be very correlated. The healthcare scorecard dataset showed trends between coverage, access, and affordability, health outcomes, and overall ranking. The data from the maternal mortality dataset depicted some regional trends in maternal mortality rate across the US. Together, the data from these sources showed some positive correlations, as seen in the final double bar chart.

Overall, being able to visualize these datasets in context with one another was interesting. I really enjoyed this project and being able to work my way through different visualizations. One thing that surprised me about working with this data was having to factor in the most effective ways to show the data visualizations between the two datasets. There were several instances where I ran into the issue of having the visualization outputs mean two different things. For example, the data from my first dataset ranked the states based on their scorecard value, which meant that a lower score/value was actually better. On the other hand, a higher value for the maternal mortality rates from my second dataset was less desired. As a result, I had to play around with inverting my data to make sure that it was effective when shown side by side.

Additionally, I feel that I was successful in answering my first two research questions with my data visualizations. I think that I scoped both of these questions appropriately for my project. With more time, my next step would be to rework my final visualization (the double bar graph) so that maternal mortality rate and scorecard ranking are scaled in the same way. While I originally had a third research question ("Given current maternal mortality data, what trends might we expect in states that have since enacted reproductive healthcare restrictions?"), I think that this question was out of scope for this project. It likely would have required me to bring in an additional, third dataset and would have required some complex visualizations. For future work, I think that this would still be a very interesting and important question to answer, and the work that I completed for this project would be a solid starting point. I would likely plan to create another choropleth map with a year slider to view maternal mortality rates over time, with another added dimension to compare when reproductive healthcare restrictions or laws were enacted.

A link to my full Kaggle notebook can be found here.