Which Schools are Most Likely to Adopt Metal Detectors? Answers from Machine Learning
Updated: May 8, 2019
Metal detectors are used routinely in approximately 4% of public secondary schools in the USA. Which schools are most likely to use them? The answers, from a Random Forest machine learning algorithm, might surprise you.
Metal Detectors & Students' Health
The use of metal detectors on school grounds is highly controversial, not least because of the lack of data demonstrating their efficacy in maintaining a safe school environment. To date, researchers have been unable to "prove" that metal detectors reduce school crime (Hankin et al., 2011). Many scholars have found that using metal detectors in schools can have unintended consequences for students, such as increasing anxiety and fear (Gastic, 2011), decreasing attendance, reducing instructional time (Mukerjee, 2007), and altering how teachers and staff view student behavior (Hirschfield & Celinska, 2011). For these reasons, it is critical to understand which schools are using harsh security methods. While crime in school may be one factor that leads to more intense security practices, other indicators such as poverty levels, the ethnic composition of the school, and parent involvement may all play a role.
This study employed Random Forest, a machine learning approach, to build a model identifying which factors best predict schools' metal detector use.
The Study: Using Random Forest & SMOTE To Predict Factors Associated with School Metal Detector Use
This investigation used data from the 2015–16 School Survey on Crime and Safety (SSOCS; see Jackson et al., 2018). First, we prepared the data and extracted the features studied thus far by other researchers interested in the impacts of metal detectors. Those features included crime levels in the school and neighborhood, school size, urbanicity, parent involvement (Matthews, 2019), teacher training in security matters, parent policy input, ethnic composition (Gastic & Johnson, 2015), and funding. The dependent variable was daily metal detector use.
After selecting the features, we split the data into two sets: a training set to build the model and a testing set to validate it. This step is important because machine learning approaches can "over learn" (overfit) their training data and report inflated predictive statistics.
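The analysis itself was done in R; purely as an illustration of the idea, a stratified split that preserves the class ratio of a rare outcome might be sketched in Python like this (the function name and the 40% test fraction are our own choices, roughly mirroring the 946/630 split below):

```python
import numpy as np

def stratified_split(X, y, test_frac=0.4, seed=0):
    """Split X, y into train/test sets while preserving the class ratio of y.

    A minimal sketch: shuffle the indices of each class separately, then
    send the same fraction of each class to the test set.
    """
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)
        rng.shuffle(idx)
        n_test = int(round(test_frac * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    train_idx, test_idx = np.array(train_idx), np.array(test_idx)
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Stratifying matters here because with only ~4% positive cases, a naive random split could leave the test set with almost no schools that use metal detectors.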
The data set included 1,576 total observations: 946 in the training set and 630 in the test set. Daily/routine metal detector use occurred about 4% of the time (56 cases). Because of this severe imbalance in the dependent variable, we applied the synthetic minority oversampling technique (SMOTE) to interpolate additional synthetic samples into the training set. This helps the model "learn" more easily, since more minority-class data are available during the training phase. It also helps ensure that each bootstrapped sample in the random forest has some observations indicating metal detector use. These synthetic samples were not included in the validation set; the model was only tested on original data.
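SMOTE's core idea is simple: draw a line between a minority-class observation and one of its nearest minority-class neighbours, and place a synthetic point somewhere along that line. A minimal sketch in Python (numeric features only; illustrative, not the R implementation we actually used):

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority-class samples by interpolation.

    For each synthetic point: pick a random minority sample, pick one of
    its k nearest minority neighbours, and interpolate a point on the
    segment between the two.
    """
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)
    # pairwise distances among minority samples
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # k nearest neighbours per sample
    synth = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        a = rng.integers(n)                # random minority sample
        b = nn[a, rng.integers(k)]         # one of its nearest neighbours
        gap = rng.random()                 # interpolation factor in [0, 1)
        synth[i] = X_min[a] + gap * (X_min[b] - X_min[a])
    return synth
```

Because every synthetic point lies between two real minority observations, the oversampled training set stays inside the region the minority class already occupies.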
All analyses were done in R.
The balanced accuracy of our model when tested on the validation set was approximately 81% with a sensitivity of 86%. To determine which factors were most robust in the model prediction, we obtained a variable importance plot. This plot shows the most important variables in the model as determined by the mean decrease in Gini score. Here's what we found:
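For readers unfamiliar with the metric: balanced accuracy averages sensitivity and specificity, which keeps a model from looking good simply by predicting "no metal detector" for every school. A small illustrative Python helper (not our R code) shows the computation:

```python
def balanced_accuracy(y_true, y_pred):
    """Return (balanced accuracy, sensitivity) for binary labels 0/1.

    Balanced accuracy = mean of sensitivity (true positive rate) and
    specificity (true negative rate) -- appropriate when the positive
    class is rare, as with ~4% daily metal detector use.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2, sensitivity
```

With a balanced accuracy of 81% and sensitivity of 86%, the implied specificity is around 76%, which is why the false positive rate noted below is a concern.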
As you can see, violence on campus ranks eighth in the model, with a Gini value of a mere 9.22! Compare this with the top three variables: 1) crime near students' homes (32.73), 2) crime near the school (25.99), and 3) an administrator's assessment of the percentage of students who believe academics are important (22.48). Urbanicity (19.63), ethnic composition (16.8), parent involvement (13.5), and school size (10.5) are ALL ranked more highly than school-level violence in this model. Using the Gini statistics, we can see that crime near students' homes is roughly 3.5 times as important in the model as violence in schools. An important note is that we may wish to consider the first two variables (c0560; c0562) as proxy indicators for poverty. The high rankings for indicators of poverty align with previous research suggesting that harsh security measures are disproportionately applied in schools serving low-income students (see Kupchik & Ward, 2011).
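For reference, the "mean decrease in Gini" behind these rankings is built from per-split impurity decreases: every time a tree splits on a variable, the forest records how much that split reduced node impurity, and a variable's importance is the average of those reductions. A sketch of the underlying computation (illustrative Python, not our analysis code):

```python
def gini(labels):
    """Gini impurity of a node: 1 - sum over classes of p_c squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(parent, left, right):
    """Impurity decrease for one split: parent impurity minus the
    size-weighted impurities of the two child nodes. A variable's
    mean decrease in Gini averages this over all splits on it."""
    n = len(parent)
    return (gini(parent)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))
```

A split that perfectly separates the classes (e.g. parent half-and-half, each child pure) achieves the maximum possible decrease for that node, which is why variables that cleanly separate metal-detector schools accumulate high Gini scores.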
Certainly, more research needs to be done. While the sensitivity of our model was quite good, the false positive rate was high. These measures of model accuracy can be improved by collecting more robust data on schools and improving the discriminative capacity of the data. Educators, planners, and administrators should reflect on the reasons for specific types of security and be mindful of the negative impacts of harsh security on students' development.
To read the full report or obtain the R code for this analysis, please contact us.
References
Gastic, B. (2011). Metal detectors and feeling safe at school. Education and Urban Society, 43(4), 486-498. doi:10.1177/0013124510380717
Gastic, B., & Johnson, D. (2015). Disproportionality in daily metal detector student searches in U.S. public schools. Journal of School Violence, 14(3), 299-315. doi:10.1080/15388220.2014.924074
Hankin, A., Hertz, M., & Simon, T. (2011). Impacts of metal detector use in schools: Insights from 15 years of research. Journal of School Health, 81(2), 100-106.
Hirschfield, P. J., & Celinska, K. (2011). Beyond fear: Sociological perspectives on the criminalization of school discipline. Sociology Compass, 5(1), 1-12.
Jackson, M., Diliberti, M., Kemp, J., Hummel, S., Cox, C., Gbondo-Tugbawa, K., . . . Hansen, R. (2018). 2015–16 school survey on crime and safety (SSOCS): Public-use data file user’s manual (NCES 2018-107). Washington, DC: U.S. Department of Education, National Center for Education Statistics.