Accidental Shootings Analysis
Jerris George, Hannah Haley, Rachael Humphreys, and Ryan Pavlich
Introduction
Shootings are reported almost daily across the nation, and mass shootings often get the most attention because more victims are reported injured or killed. In this report, we wanted to explore how many shootings are classified as accidental, whether accidental shootings are more common among young adults or children, and which gun type is the most frequent in accidental shootings.
Data Sources
Both data sources we used are public records. Before we could join the datasets, we had to split some of the values into new columns so that we could join on the index values embedded in the original participant columns.
Our gun violence dataset was made available by a user on GitHub. The primary data behind the GitHub user's dataset can be found on the Gun Violence Archive. The dataset covers shootings in the US between January 2013 and March 2018, inclusive. Our secondary dataset was United States to Region, which we used to help identify where the shootings were occurring.
Problem
We sought to identify the characteristics of accidental shootings and who is most likely to be killed in one, based on our data from 2013 to 2018. The factors we decided to focus on were age, gender, region, gun type, and number of guns involved.
Data Cleaning & Validation
Splitting and Stacking Columns
All of our original participant-level data, along with some other relevant columns such as gun type, was separated by a pipe delimiter. Within the pipe-delimited columns, each participant's value was preceded by an index number followed by “::”. Because of how the columns in our dataset were formatted, we could not import the data directly into SAS as a delimited file. As a result, we first had to divide our delimited columns into separate files and clean the data to remove the pipe delimiter. Once the columns were separated and readable, we imported the data into SAS, where we stacked the values and then split each participant index from its associated value. This created the following separate columns: ParticipantIndex, ParticipantAge, ParticipantAgeGroup, ParticipantGender, GunIndex, GunType, and ParticipantType. This was performed using computed columns within the query builder.
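The splitting and stacking step was done in SAS with the query builder; the Python sketch below is only an illustration of the same idea, not the procedure we ran. It assumes a raw cell looks like "0::25||1::9" (an index, then "::", then the value, with entries separated by one or two pipe characters). The function names and the sample strings are hypothetical.

```python
def parse_indexed_field(raw):
    """Turn '0::25||1::9' into {0: '25', 1: '9'}.

    Assumption: entries are separated by '|' or '||' and each entry
    is '<index>::<value>'.
    """
    values = {}
    if not raw:
        return values
    for piece in raw.split("|"):
        if "::" not in piece:
            continue  # skips the empty strings produced by a doubled pipe
        index, value = piece.split("::", 1)
        index = index.strip()
        if not index.isdigit():
            continue  # ignore malformed entries rather than guessing
        values[int(index)] = value.strip()
    return values


def stack_participants(incident_id, fields):
    """Build one row per participant index from several indexed columns."""
    parsed = {name: parse_indexed_field(raw) for name, raw in fields.items()}
    indexes = sorted({i for column in parsed.values() for i in column})
    return [
        {
            "IncidentId": incident_id,
            "ParticipantIndex": i,
            # A participant missing from one column gets None for it.
            **{name: column.get(i) for name, column in parsed.items()},
        }
        for i in indexes
    ]


# Illustrative input only, not taken from the dataset.
rows = stack_participants(
    1, {"ParticipantAge": "0::25||1::9", "ParticipantGender": "0::Male"}
)
```

Each resulting row carries the incident id and participant index, which are the two keys needed for the later join.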
Missing Data and Joining Tables
There were some missing values across the participant columns. We elected not to remove these rows or columns during our initial analysis, but they were filtered out during the modeling phase. To join the participant columns back to the original non-pipe-delimited columns, we used left joins so that missing participant data would not remove a row from our resulting table. We performed this join using the query builder, joining on the incident id and participant index.
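To illustrate why a left join keeps rows with missing participant data, here is a minimal Python sketch of a left join on two keys. It is not the SAS query we ran; the function name, column names, and sample rows are assumptions made for the example.

```python
def left_join(left_rows, right_rows, keys):
    """Left join two lists of dicts on the given key columns.

    Every left row is kept; when no right row matches, the right-hand
    columns are filled with None instead of dropping the row.
    """
    lookup = {}
    for row in right_rows:
        lookup.setdefault(tuple(row[k] for k in keys), []).append(row)
    right_columns = sorted(
        {column for row in right_rows for column in row if column not in keys}
    )
    joined = []
    for row in left_rows:
        matches = lookup.get(tuple(row[k] for k in keys))
        if matches:
            joined.extend({**row, **match} for match in matches)
        else:
            joined.append({**row, **{column: None for column in right_columns}})
    return joined


# Illustrative rows only: participant 1 has an age but no recorded gender.
ages = [
    {"IncidentId": 1, "ParticipantIndex": 0, "ParticipantAge": 25},
    {"IncidentId": 1, "ParticipantIndex": 1, "ParticipantAge": 9},
]
genders = [{"IncidentId": 1, "ParticipantIndex": 0, "ParticipantGender": "Male"}]
result = left_join(ages, genders, ("IncidentId", "ParticipantIndex"))
```

An inner join on the same keys would have dropped participant 1 entirely, which is the loss the left join avoids.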
Recoding and Conversion of Columns
Initially, we changed Participant Age from character to numeric, fixed minor typos in Gender and Participant Type, and recoded the Gun Type column. The Gun Type column was recoded into four categories: Handguns, Rifle, Shotgun, and Unknown. Prior to modeling, we recoded the number-killed variable to produce the following values: 0 = non-fatality, 1 = fatality.
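The recoding rules can be sketched in Python as follows. The four Gun Type categories and the 0/1 fatality coding come from the description above; the keyword matching used to assign a raw gun description to a category is a hypothetical rule for illustration, since the exact mapping we applied in SAS is not listed here.

```python
def to_numeric_age(raw):
    """Convert a character age such as ' 17 ' to an int, or None if unusable."""
    try:
        return int(str(raw).strip())
    except ValueError:
        return None


def recode_gun_type(raw):
    """Collapse a raw gun description into one of the four report categories.

    The keywords below are assumptions, not the report's actual mapping.
    """
    text = (raw or "").strip().lower()
    if "shotgun" in text or "gauge" in text:
        return "Shotgun"
    if "rifle" in text:
        return "Rifle"
    if "handgun" in text or "pistol" in text or "revolver" in text:
        return "Handguns"
    return "Unknown"


def fatality_flag(number_killed):
    """Return 1 for a fatality (at least one person killed), otherwise 0."""
    return 1 if number_killed > 0 else 0
```

Anything that does not match a keyword falls into Unknown, so unrecognized descriptions are never silently assigned to a real gun category.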
Analysis
Analysis of Fatality at a State Level


We wanted to identify, at the state level, where accidental shootings were most likely to occur and, of those shootings, which were most often fatal. In our visualizations below, you can see that the three states with the most fatal accidental shootings are Texas, Florida, and California, with their number of incidents ranging from 411 to 946 shootings.
Gun Type by Age Group Analysis


After digging into the ages of the participants, we wanted to determine which gun type has the biggest presence in accidental shootings. To do so, we analyzed the number of incidents reported by age group. Through summary statistics, we noticed that handguns were the most common gun type used. Excluding incidents where the gun type was Unknown, participants in the Adult 18+ group had a total of 2058 incidents, Children aged 0-11 had a total of 433 incidents, and Teens aged 12-17 had a total of 507 incidents. A stacked bar chart also confirmed visually that handguns are the most frequently used gun in accidental shootings.
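A tally of this kind can be sketched in Python as shown here. This is an illustration rather than the SAS summary we produced; the function name is hypothetical and the sample rows are invented for the example, not drawn from the dataset.

```python
from collections import Counter


def count_by_age_group_and_gun(rows):
    """Count rows per (age group, gun type), skipping Unknown or missing values."""
    counts = Counter()
    for row in rows:
        gun = row.get("GunType")
        group = row.get("ParticipantAgeGroup")
        if gun in (None, "Unknown") or group is None:
            continue
        counts[(group, gun)] += 1
    return counts


# Invented rows for illustration only.
sample = [
    {"ParticipantAgeGroup": "Adult 18+", "GunType": "Handguns"},
    {"ParticipantAgeGroup": "Adult 18+", "GunType": "Handguns"},
    {"ParticipantAgeGroup": "Teen 12-17", "GunType": "Rifle"},
    {"ParticipantAgeGroup": "Child 0-11", "GunType": "Unknown"},
]
counts = count_by_age_group_and_gun(sample)

# Per-age-group totals across the known gun types.
totals = Counter()
for (group, _gun), n in counts.items():
    totals[group] += n
```

The per-pair counts are what a stacked bar chart would plot, with one bar per age group and one segment per gun type.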
Analysis of Age and Gender

To further analyze who was affected by accidental shootings, we needed to examine the ages and genders of the participants involved. Based on the stacked bar chart above, the majority of participants involved in accidental shootings were males in the Adult 18+ age group, who made up 60.48% of the dataset.
Fatality Analysis

In our fatality analysis, the probability of an incident being fatal goes up when it involves individuals aged 17 years and younger, occurs in the southern region, and involves a handgun or shotgun as the main gun. Likewise, the probability of a fatality goes down if the incident involves adults aged 18 years and older, occurs in the western, northern, or midwestern regions, and does not involve a handgun or shotgun. We also noted that gender did not influence the fatality of an accidental shooting.
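One simple way to sanity-check the direction of effects like these is to compare observed fatality rates across groups. The Python sketch below does only that; it is not the model used in this analysis, and the function name, the Fatal and Region column names, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def fatality_rate_by(rows, column):
    """Share of incidents coded Fatal = 1 within each value of `column`.

    Rows with a missing group value or a missing Fatal flag are skipped,
    mirroring the filtering of missing values before modeling.
    """
    totals = defaultdict(int)
    fatal = defaultdict(int)
    for row in rows:
        key = row.get(column)
        if key is None or row.get("Fatal") is None:
            continue
        totals[key] += 1
        fatal[key] += row["Fatal"]
    return {key: fatal[key] / totals[key] for key in totals}


# Invented rows for illustration only.
sample = [
    {"Region": "South", "Fatal": 1},
    {"Region": "South", "Fatal": 0},
    {"Region": "West", "Fatal": 0},
    {"Region": None, "Fatal": 1},
]
rates = fatality_rate_by(sample, "Region")
```

Raw rates like these do not adjust for the other factors, so they can only suggest a direction; they do not replace the modeled probabilities.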
Conclusions
From 2013 to 2018, the five states that reported 500+ accidental shootings were Texas, Florida, California, Ohio, and Georgia, where more than 100 individuals were injured or killed. Among these incidents, adults aged 18 and older are the most likely to be involved in accidental shootings. In our initial analysis we also noted that males are more likely to be involved in an accidental shooting, with adult males making up 60.48% of incidents reported. Among known gun types, handguns are the most frequently used in accidental shootings, at 2998 reported incidents. In addition, the likelihood that an accidental shooting is fatal goes up if the incident involves participants aged 17 years and younger, if it occurs in the southern region, and if the gun type used is a handgun and/or shotgun. In conclusion, accidental shootings are more frequent than commonly thought, and to help avoid future accidental shootings, gun owners should know and practice their gun safety protocols.
Data Sources/Reference
Gun Violence – https://www.gunviolencearchive.org/reports
Accidental Deaths – https://github.com/jamesqo/gun-violence-data
United States to Region – https://www.kaggle.com/datasets/omer2040/usa-states-to-region