Validating a Bayesian Model for Linking Serial Crimes Through Simulation
Crime linkage analysis seeks to determine which crimes were committed by the same offender. This is an important police investigative function, because research has shown that a significant proportion of most crime types is committed by a small number of prolific offenders. Case clearance is related to the amount of information available to investigators, and a series of linked crimes provides more information than individual cases examined alone. However, only a few methods for crime linkage are available. These methods commonly use information provided by physical evidence, offender description, and crime scene behavior (i.e., proximity in time and space, modus operandi, and signature). Recognizing that very few factors definitively link crimes, researchers have demonstrated the utility of linking crimes probabilistically using less-than-definitive information. Bayesian methods offer a promising way to analyze these links. Although some research has demonstrated the efficacy of these methods, the initial work validating the models has relied on limited samples, so the generalizability of that research is unknown. This study assesses the validity of a Bayesian crime linkage method using computational methods.
Using empirical observations for both serial murder and commercial robbery as the basis of offender behavior, simulated observations were generated for distance, time difference, and 12 modus operandi (M.O.) factors for serial and non-serial offenses for each crime type. In total, 3,500,000 linkage analyses were generated for each crime type using the Bayesian crime linkage method. Receiver operating characteristic (ROC) analysis was used to assess the predictive capacity of the method on the simulated data. The mean area under the curve (AUC) for the entire set of linkage analyses was 0.81 for serial murder and 0.80 for commercial robbery, indicating that the model represents a “good” predictor of serial linkage.
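The simulate-then-score workflow described above can be illustrated with a minimal sketch. It is not the study's model: it assumes a naive independence between factors, exponential densities for distance and time difference, and Bernoulli match probabilities for the 12 M.O. factors. Every parameter value and function name below is a hypothetical placeholder chosen only for illustration.

```python
import math
import random

# Hypothetical parameters (NOT taken from the study): mean distance (km) and
# mean time gap (days) for linked vs. unlinked crime pairs, and the chance
# that any one M.O. factor matches within a linked vs. an unlinked pair.
MEAN_DIST = {"linked": 5.0, "unlinked": 15.0}
MEAN_DAYS = {"linked": 30.0, "unlinked": 120.0}
P_MATCH = {"linked": 0.6, "unlinked": 0.4}
N_MO_FACTORS = 12


def exp_log_lr(x, mean_linked, mean_unlinked):
    """Log likelihood ratio of x under two exponential densities."""
    return math.log(mean_unlinked / mean_linked) - x / mean_linked + x / mean_unlinked


def pair_log_lr(distance, days, mo_matches):
    """Sum per-factor log likelihood ratios (assumes independent factors)."""
    total = exp_log_lr(distance, MEAN_DIST["linked"], MEAN_DIST["unlinked"])
    total += exp_log_lr(days, MEAN_DAYS["linked"], MEAN_DAYS["unlinked"])
    for matched in mo_matches:
        if matched:
            total += math.log(P_MATCH["linked"] / P_MATCH["unlinked"])
        else:
            total += math.log((1 - P_MATCH["linked"]) / (1 - P_MATCH["unlinked"]))
    return total


def simulate_pair(kind, rng):
    """Draw one simulated crime pair of the given kind and score it."""
    distance = rng.expovariate(1.0 / MEAN_DIST[kind])
    days = rng.expovariate(1.0 / MEAN_DAYS[kind])
    mo_matches = [rng.random() < P_MATCH[kind] for _ in range(N_MO_FACTORS)]
    return pair_log_lr(distance, days, mo_matches)


def auc(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


rng = random.Random(0)
linked_scores = [simulate_pair("linked", rng) for _ in range(500)]
unlinked_scores = [simulate_pair("unlinked", rng) for _ in range(500)]
simulated_auc = auc(linked_scores, unlinked_scores)
print(f"AUC on simulated pairs: {simulated_auc:.2f}")
```

The pairwise AUC used here equals the area under the ROC curve and keeps the sketch free of dependencies. Because the parameters are invented, the printed value says nothing about the 0.81 and 0.80 reported for the study.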
The Bayesian hypothesis test was applied to the likelihood ratio. For the commercial robbery data, using distance and time difference in conjunction with all 12 M.O. factors, the extreme level of evidence used in the test was a good indicator of linkage (a median hit rate of 90% and a mean percent of series identified of 43.22%). For the murder data using the same set of factors, the extreme level of evidence was a less effective predictor (a median hit rate of 54.55% and a mean percent of series identified of 56.38%).
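The trade-off between the two reported measures can be made concrete with a short sketch. It reflects one plausible reading of the metrics, not the study's definitions: the hit rate is taken as the share of flagged pairs that are truly linked, and the percent of series identified as the share of truly linked pairs that are flagged. The cut-off of 100 for the extreme level of evidence, the function name, and the toy data are all assumptions made for illustration.

```python
def threshold_metrics(likelihood_ratios, is_linked, threshold):
    """Return (hit_rate, share_identified) for pairs with LR >= threshold.

    hit_rate: flagged pairs that are truly linked / all flagged pairs.
    share_identified: flagged truly linked pairs / all truly linked pairs.
    Both definitions are assumptions, not taken from the study.
    """
    flagged = [linked for lr, linked in zip(likelihood_ratios, is_linked) if lr >= threshold]
    true_positives = sum(flagged)
    total_linked = sum(is_linked)
    hit_rate = true_positives / len(flagged) if flagged else float("nan")
    share_identified = true_positives / total_linked if total_linked else float("nan")
    return hit_rate, share_identified


# Toy data (hypothetical): likelihood ratios for seven candidate pairs and
# whether each pair truly belongs to the same series.
lrs = [250.0, 120.0, 40.0, 8.0, 150.0, 2.0, 0.5]
linked = [True, True, True, True, False, False, False]

# 100.0 is an assumed cut-off for the "extreme" level of evidence.
hit_rate, share_identified = threshold_metrics(lrs, linked, threshold=100.0)
print(f"hit rate: {hit_rate:.2%}, share of series identified: {share_identified:.2%}")
```

Raising the threshold generally flags fewer pairs, which tends to raise the hit rate while lowering the share of a series that is recovered. The study summarizes these quantities across many simulated series (a median and a mean), which this single-set sketch does not attempt.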
Including additional information increased the predictive capacity of both models, using AUC as the measure of predictive validity. Using the levels of evidence from the Bayesian hypothesis test as decision thresholds, including additional information increased both the percent of true positives and the percent of a series identified at all levels of evidence for the commercial robbery data. For the murder data, additional information had little effect on the percent of true positives at the highest and lowest levels of evidence and a negative effect at the two middle levels. In contrast, additional information had no effect on the hit rate using the murder data at the three lower levels of evidence but increased the percent of a series identified at the highest level of evidence. The difference in performance between the commercial robbery and murder data was ascribed to the lower base rate of serial murder and the higher predictive capacity of distance for serial murder data, which resulted in overall higher likelihood ratios than those observed for the commercial robbery data.
Greater performance capacity was found to be associated with longer serial distances and time differences, shorter non-serial distances and time differences, and greater offender consistency and uniqueness for M.O. factors in both the murder and commercial robbery data. Distance and time measures were more important for serial murder linkage, though they were still strong factors for commercial robbery linkage. Consistency and uniqueness were found to have equal value in serial murder linkage, but uniqueness had twice the impact on commercial robbery linkage performance. The greater impact of uniqueness in serial commercial robbery linkage than in serial murder linkage was attributed to higher average levels of consistency in the commercial robbery M.O. data and the reduced ability of distance and time to predict commercial robbery linkage.