• Open access
  • Published: 29 April 2021

Crime forecasting: a machine learning and computer vision approach to crime prediction and prevention

  • Neil Shah 1,
  • Nandish Bhagat 1 &
  • Manan Shah (ORCID: 0000-0002-8665-5010) 2

Visual Computing for Industry, Biomedicine, and Art, volume 4, Article number: 9 (2021)


A crime is a deliberate act that can cause physical or psychological harm, as well as property damage or loss, and can lead to punishment by a state or other authority according to the severity of the crime. The number and forms of criminal activities are increasing at an alarming rate, forcing agencies to develop efficient methods of taking preventive measures. In the current scenario of rapidly increasing crime, traditional crime-solving techniques, being slow paced and less efficient, are unable to deliver results. Thus, if we can find ways to predict crime in detail before it occurs, or build a “machine” that can assist police officers, it would lighten the burden on the police and help prevent crimes. To achieve this, we suggest applying machine learning (ML) and computer vision algorithms and techniques. In this paper, we describe the results of certain cases where such approaches were used and which motivated us to pursue further research in this field. The main evidence for the change in crime detection and prevention lies in the before-and-after statistical observations of the authorities using such techniques. The sole purpose of this study is to determine how a combination of ML and computer vision can be used by law agencies or authorities to detect, prevent, and solve crimes at a much more accurate and faster rate. In summary, ML and computer vision techniques can bring about an evolution in law agencies.

Introduction

Computer vision is a branch of artificial intelligence that trains the computer to understand and comprehend the visual world, and by doing so, creates a sense of understanding of a machine’s surroundings [ 1 , 2 ]. It mainly analyzes camera data of the surroundings, and thus its applications are significant. It can be used for face recognition, number plate recognition, augmented and mixed realities, location determination, and identifying objects [ 3 ]. Research is currently being conducted on mathematical techniques that recover 3D structure and make it possible for computers to comprehend 3D images. Obtaining 3D visuals of an object supports a wide range of tasks: object and pedestrian detection; face recognition, including Eigenfaces, active appearance, and 3D shape models; personal photo collections; instance recognition, geometric alignment, and location recognition over large databases; category recognition with bag-of-words and part-based models; recognition with segmentation; intelligent photo editing; context and scene understanding; and learning from large image collections, with their associated image searches, recognition databases, and test sets. These are only basic applications, and each category mentioned above can be further explored. In ref. [ 4 ], VLFeat is introduced, which is a library of computer vision algorithms that can be used for fast prototyping in computer vision research, thus providing a tool to obtain computer vision results much faster than anticipated. Beyond face detection and human recognition [ 5 ], human posture can also be recognized. Thus, computer vision is extremely attractive for visualizing the world around us.

Machine learning (ML) is an application that provides a system with the ability to learn and improve automatically from past experiences without being explicitly programmed [ 6 , 7 , 8 ]. After viewing the data, an exact pattern or information cannot always be determined [ 9 , 10 , 11 ]. In such cases, ML is applied to interpret the exact pattern and information [ 12 , 13 ]. ML pushes forward the idea that, by providing a machine with access to the right data, the machine can learn and solve both complex mathematical problems and some specific problems [ 14 , 15 , 16 , 17 ]. In general, ML is categorized into two parts: (1) supervised ML and (2) unsupervised ML [ 18 , 19 ]. In supervised learning, the machine is trained on the basis of a predefined set of training examples, which facilitates its capability to obtain precise and accurate conclusions when new data are given [ 20 , 21 ]. In unsupervised learning, the machine is given a set of data and must find common patterns and relationships within the data on its own [ 22 , 23 ]. Neural networks, which are important tools used in supervised learning, have been studied since the 1980s [ 24 , 25 ]. In ref. [ 26 ], the author suggested that different aspects are needed to obtain an exit from nondeterministic polynomial (NP)-completeness, and architectural constraints are insufficient. However, in ref. [ 27 ], it was proved that NP-completeness problems can be extended to neural networks using sigmoid functions. Although such research has attempted to demonstrate the various aspects of new ML approaches, how accurate are the results [ 28 , 29 , 30 ]?
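The supervised/unsupervised split described above can be sketched in a few lines. The example below is illustrative only: a one-nearest-neighbour rule stands in for supervised learning (it needs labels), and a centroid-assignment step stands in for unsupervised learning (it does not); the coordinates and labels are invented.

```python
def nn_classify(train, labels, point):
    """Supervised: 1-nearest-neighbour -- learn from labelled examples."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, point)) for x in train]
    return labels[dists.index(min(dists))]

def kmeans_assign(points, centroids):
    """Unsupervised: assign each point to its closest centroid -- no labels."""
    def closest(p):
        d = [sum((a - b) ** 2 for a, b in zip(c, p)) for c in centroids]
        return d.index(min(d))
    return [closest(p) for p in points]

train = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.5, 8.2)]
labels = ["low-risk", "low-risk", "high-risk", "high-risk"]
print(nn_classify(train, labels, (7.9, 7.7)))            # -> high-risk
print(kmeans_assign(train, [(1.0, 1.0), (8.0, 8.0)]))    # -> [0, 0, 1, 1]
```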

Although various crimes and their underlying nature seem unpredictable, how unforeseeable are they? In ref. [ 31 ], the authors pointed out that as society and the economy evolve and give rise to new types of crime, the need for a prediction system has grown. In ref. [ 32 ], a crime trend prediction technique based on the Mahalanobis distance and dynamic time warping is given, offering the possibility of predicting crime and apprehending the actual culprit. As described in ref. [ 33 ], in 1998, the United States National Institute of Justice awarded five grants for crime forecasting as an extension of crime mapping. Applications of crime forecasting are currently being used by law enforcement in the United States, the United Kingdom, the Netherlands, Germany, and Switzerland [ 34 ]. Nowadays, criminal methods, aided by advances in technology, are improving with each passing year. Consequently, it has become necessary to provide the police department and the government with a new and powerful machine (a set of programs) that can help them in their process of solving crimes. The main aim of crime forecasting is to predict crimes before they occur, and thus the importance of using crime forecasting methods is extremely clear. Furthermore, the prediction of crimes can sometimes be crucial because it may potentially save the life of a victim, prevent lifelong trauma, and avoid damage to private property. It may even be used to predict possible terrorist crimes and activities. Finally, if we implement predictive policing with a considerable level of accuracy, governments can apply other primary resources, such as police manpower, detectives, and funds, to other fields of crime solving, thereby curbing the problem of crime with double the power.

In this paper, we aim to make an impact by using both ML algorithms and computer vision methods to predict both the nature of a crime and possibly pinpoint a culprit. Beforehand, we questioned whether the nature of a crime is predictable. Although it might seem impossible from the outside, categorizing every aspect of a crime is quite possible. We have all heard that every criminal has a motive. That is, if we use motive as a judgment for the nature of a crime, we may be able to arrive at a list of ways in which crimes can be categorized. Herein, we discuss a theory in which ML algorithms act as a database for all recorded crimes by category, combined with visual knowledge of the surroundings provided through computer vision techniques; using such data, we may predict a crime before it occurs.

Present technologies used in crime detection and prediction

Crime forecasting refers to the basic process of predicting crimes before they occur. Tools are needed to predict a crime before it occurs. Currently, there are tools used by police to assist in specific tasks such as listening in on a suspect’s phone call or using a body cam to record some unusual illegal activity. Below we list some such tools to better understand where they might stand with additional technological assistance.

One good way of tracking phones is through the use of a stingray [ 35 ], which is a new frontier in police surveillance and can be used to pinpoint a cellphone location by mimicking cellphone towers and broadcasting signals to trick cellphones within the vicinity into transmitting their location and other information. An argument against the usage of stingrays in the United States is that it violates the fourth amendment. This technology is used in 23 states and in the District of Columbia. In ref. [ 36 ], the authors provide insight on how this is more than just a surveillance system, raising concerns about privacy violations. In addition, the Federal Communications Commission became involved and ultimately urged the manufacturer to meet two conditions in exchange for a grant: (1) “The marketing and sale of these devices shall be limited to federal, state, local public safety and law enforcement officials only” and (2) “State and local law enforcement agencies must advance coordinate with the FBI the acquisition and use of the equipment authorized under this authorization.” Although its use is worthwhile, its implementation remains extremely controversial.

A very popular method that has been in practice since the inception of surveillance is “the stakeout”. A stakeout is the most frequently practiced surveillance technique among police officers and is used to gather information on all types of suspects. In ref. [ 37 ], the authors discuss the importance of a stakeout by stating that police officers witness an extensive range of events about which they are required to write a report. Such criminal acts are observed during stakeouts or patrols; observations of weapons, drugs, and other evidence during house searches; and descriptions of their own behavior and that of the suspect during arrest. Stakeouts are extremely useful, and are considered 100% reliable, with the police themselves observing the notable proceedings. However, are they actually 100% accurate? All officers are humans, and all humans are subject to fatigue. The major objective of a stakeout is to observe wrongful activities. Is there a tool that can substitute its use? We will discuss this point herein.

Another way to conduct surveillance is by using drones, which help in various fields such as mapping cities, chasing suspects, investigating crime scenes and accidents, traffic management and flow, and search and rescue after a disaster. In ref. [ 38 ], legal issues regarding the use of drones and airspace distribution problems are described. Legal issues include the privacy concerns raised by the public, with the police gaining increasing power and authority. Airspace distribution raises concerns about how high a drone is allowed to go.

Other surveillance methods include face recognition, license plate recognition, and body cams. In ref. [ 39 ], the authors indicated that facial recognition can be used to obtain the profile of suspects and analyze it against different databases to obtain more information. Similarly, a license plate reader can be used to access data about a car possibly involved in a crime. Police may even use body cams to see more than what the human eye can see, meaning that the camera observes everything a police officer sees and records it. Normally, when we see an object, we cannot recollect the complete image of it. In ref. [ 40 ], the impact of body cams was studied in terms of officer misconduct and domestic violence when the police are making an arrest. Body cams are thus being worn by patrol officers. In ref. [ 41 ], the authors also mentioned how protection against wrongful police practices is provided. However, the use of body cams does not stop here, as another primary reason for having a body camera on at all times is to record the happenings in front of the wearer, in hopes of capturing useful events during daily activities or important operations.

Although each of these methods is effective, one point they share in common is that they all work individually, and while the police can use any of these approaches individually or concurrently, having a machine that is able to incorporate the positive aspects of all of these technologies would be highly beneficial.

ML techniques used in crime prediction

In ref. [ 42 ], a comparative study was carried out between violent crime patterns from the Communities and Crime Unnormalized Dataset versus actual crime statistical data using the open source data mining software Waikato Environment for Knowledge Analysis (WEKA). Three algorithms, namely, linear regression, additive regression, and decision stump, were implemented using the same finite set of features on communities and actual crime datasets. Test samples were randomly selected. The linear regression algorithm could handle randomness to a certain extent in the test samples and thus proved to be the best among all three selected algorithms. The scope of the project was to prove the efficiency and accuracy of ML algorithms in predicting violent crime patterns and other applications, such as determining criminal hotspots, creating criminal profiles, and learning criminal trends.
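As a rough illustration of the comparison in ref. [ 42 ], the sketch below fits a least-squares line and a one-split decision stump to the same toy "crime rate" series and compares their errors; the data are invented, and WEKA itself is not used here.

```python
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.1]   # roughly linear toy trend

def fit_linear(xs, ys):
    """Closed-form least-squares fit for one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_stump(xs, ys):
    """Decision stump: one threshold, one mean prediction on each side."""
    best = None
    for t in xs:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t] or left
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - (lm if x <= t else rm)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def mse(model):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# On a linear trend the linear model should beat the stump.
print(mse(fit_linear(xs, ys)) < mse(fit_stump(xs, ys)))  # -> True
```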

When considering WEKA [ 43 ], the integration of a new graphical interface called Knowledge Flow is possible, which can be used as a substitute for WEKA's Explorer interface. It provides a more process-oriented view of data mining, in which individual learning components (represented by Java beans) are connected graphically to show the flow of information. The authors then describe another graphical interface called the Experimenter, which, as the name suggests, is designed to compare the performance of multiple learning schemes on multiple datasets.

In ref. [ 34 ], the potential of applying a predictive analysis of crime forecasting in an urban context is studied. Three types of crime, namely, home burglary, street robbery, and battery, were aggregated into grids of 200 m × 250 m and retrospectively analyzed. Based on the crime data of the previous 3 years, an ensemble model was applied to synthesize the results of logistic regression and neural network models in order to obtain fortnightly and monthly predictions for the year 2014. The predictions were evaluated based on the direct hit rate, precision, and prediction index. The results of the fortnightly predictions indicate that by applying a predictive analysis methodology to the data, it is possible to obtain accurate predictions. They concluded that the results can be improved remarkably by comparing the fortnightly predictions with the monthly predictions with a separation between day and night.
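The direct hit rate used to score those predictions can be sketched as the share of actual crimes falling inside grid cells flagged as hot. This is our reading of the metric, not the paper's exact definition, and the cells below are invented.

```python
def hit_rate(flagged_cells, crime_cells):
    """Fraction of actual crime locations that fall in flagged grid cells."""
    hits = sum(1 for c in crime_cells if c in flagged_cells)
    return hits / len(crime_cells)

flagged = {(0, 1), (2, 3)}                    # cells predicted as hotspots
crimes = [(0, 1), (2, 3), (4, 4), (0, 1)]     # cells where crimes occurred
print(hit_rate(flagged, crimes))              # -> 0.75
```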

In ref. [ 44 ], crime predictions were investigated based on ML. Crime data of the last 15 years in Vancouver (Canada) were analyzed for prediction. This machine-learning-based crime analysis involves the collection of data, data classification, identification of patterns, prediction, and visualization. K-nearest neighbor (KNN) and boosted decision tree algorithms were also implemented to analyze the crime dataset. In their study, a total of 560,000 crime records from between 2003 and 2018 were analyzed, and crime prediction with an accuracy of between 39% and 44% was obtained by predicting the crime using ML algorithms. The accuracy was low as a prediction model, but the authors concluded that the accuracy can be increased or improved by tuning both the algorithms and crime data for specific applications.
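The KNN side of such an analysis can be sketched as a majority vote among the k nearest labelled incidents; the coordinates and labels below are invented, and this is not the authors' implementation.

```python
def knn_predict(train, labels, point, k=3):
    """Majority vote among the k training points closest to `point`."""
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(train[i], point)))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Toy incidents: (x, y) locations labelled with a crime type.
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["theft", "theft", "theft", "assault", "assault", "assault"]
print(knn_predict(train, labels, (5.2, 5.1)))  # -> assault
```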

In ref. [ 45 ], a ML approach is presented for the prediction of crime-related statistics in Philadelphia, United States. The problem was divided into three parts: determining whether a crime will occur, how many crimes will occur, and the most likely type of crime. Algorithms such as logistic regression, KNN, ordinal regression, and tree methods were used to train the datasets to obtain detailed quantitative crime predictions with greater significance. They also presented a map for crime prediction with different crime categories in different areas of Philadelphia for a particular time period, with a different color indicating each type of crime. Different types of crimes, ranging from assaults to cyber fraud, were included to match the general pattern of crime in Philadelphia for a particular interval of time. Their algorithm was able to predict whether a crime will occur with an astonishing 69% accuracy, as well as the number of crimes, ranging from 1 to 32, with 47% accuracy.

In ref. [ 46 ], the authors analyzed a dataset consisting of several crimes and predicted the type of crime that may occur in the near future depending on various conditions. ML and data science techniques were used for crime prediction on a crime dataset from Chicago, United States. The crime dataset consists of information such as the crime location description, type of crime, date, time, and precise location coordinates. Different combinations of models, such as KNN classification, logistic regression, decision trees, random forest, a support vector machine (SVM), and Bayesian methods were tested, and the most accurate model was used for training. The KNN classification proved to be the best, with an accuracy of approximately 0.787. They also used different graphs that helped in understanding the various characteristics of the crime dataset of Chicago. The main purpose of that paper is to provide an idea of how ML can be used by law enforcement agencies to predict, detect, and solve crime at a much better rate, resulting in a reduction in crime.

In ref. [ 47 ], a graphical user interface-based prediction of crime rates using a ML approach is presented. The main focus of this study was to investigate machine-learning-based techniques with the best accuracy in predicting crime rates and to explore their applicability, with particular attention to the dataset. Supervised ML techniques were used to analyze the dataset, carrying out data validation, data cleaning, and data visualization. The results of the different supervised ML algorithms were then compared. The proposed system consists of data collection, data preprocessing, construction of a predictive model, dataset training, dataset testing, and a comparison of algorithms, as shown in Fig. 1. The aim of this study is to prove the effectiveness and accuracy of a ML algorithm for predicting violent crimes.

Fig. 1. Dataflow diagram

In ref. [ 48 ], a feature-level data fusion method based on a deep neural network (DNN) is proposed to accurately predict crime occurrence by efficiently fusing multi-modal data from several domains with environmental context information. The dataset consists of data from an online database of crime statistics from Chicago, demographic and meteorological data, and images. Crime prediction methods utilize several ML techniques, including a regression analysis, kernel density estimation (KDE), and SVM. Their approach mainly consisted of three phases: collection of data, analysis of the relationship between crime incidents and collected data using a statistical approach, and lastly, accurate prediction of crime occurrences. The DNN model consists of spatial features, temporal features, and environmental context. The SVM and KDE models had accuracies of 67.01% and 66.33%, respectively, whereas the proposed DNN model had an astonishing accuracy of 84.25%. The experimental results showed that the proposed DNN model was more accurate in predicting crime occurrences than the other prediction models.
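Feature-level fusion of this kind can be sketched, without any deep-learning library, as the concatenation of per-domain feature vectors into a single model input; all numbers below are invented, and a single linear unit stands in for the DNN.

```python
def fuse(*domains):
    """Concatenate per-domain feature vectors into one fused vector."""
    return [x for d in domains for x in d]

def linear_unit(features, weights, bias=0.0):
    """Stand-in for the first DNN layer: a weighted sum of fused features."""
    return sum(f * w for f, w in zip(features, weights)) + bias

spatial = [0.3, 0.7]      # e.g. grid-cell crime density (invented)
temporal = [0.1, 0.9]     # e.g. hour-of-day encoding (invented)
environment = [0.5]       # e.g. normalized temperature (invented)

fused = fuse(spatial, temporal, environment)
print(len(fused))                                     # -> 5
print(round(linear_unit(fused, [1, 1, 1, 1, 1]), 2))  # -> 2.5
```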

In ref. [ 49 ], the authors mainly focused on the analysis and design of ML algorithms to reduce crime rates in India. ML techniques were applied to a large set of data to determine the pattern relations between them. The research was mainly based on providing a prediction of crime that might occur based on the occurrence of previous crime locations, as shown in Fig.  2 . Techniques such as Bayesian neural networks, the Levenberg Marquardt algorithm, and a scaled algorithm were used to analyze and interpret the data, among which the scaled algorithm gave the best result in comparison with the other two techniques. A statistical analysis based on the correlation, analysis of variance, and graphs proved that with the help of the scaled algorithm, the crime rate can be reduced by 78%, implying an accuracy of 0.78.

Fig. 2. Functionality of proposed approach

In ref. [ 50 ], a system is proposed that predicts crime by analyzing a dataset containing records of previously committed crimes and their patterns. The proposed system works mainly on two ML algorithms: a decision tree and KNN. Techniques such as the random forest algorithm and Adaptive Boosting were used to increase the accuracy of the prediction model. To obtain better results for the model, the crimes were divided into frequent and rare classes. The frequent class consisted of the most frequent crimes, whereas the rare class consisted of the least frequent crimes. The proposed system was fed with criminal activity data for a 12-year period in San Francisco, United States. Using undersampling and oversampling methods along with the random forest algorithm, the accuracy was surprisingly increased to 99.16%.
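The oversampling step can be sketched as duplicating rare-class samples until the classes are balanced. This minimal version is not the authors' pipeline (which also used undersampling and random forests), and the records below are invented.

```python
import random

def oversample(samples, labels, rare):
    """Duplicate random rare-class samples until classes are balanced."""
    rare_idx = [i for i, l in enumerate(labels) if l == rare]
    majority = len(labels) - len(rare_idx)
    out_s, out_l = list(samples), list(labels)
    rng = random.Random(0)  # seeded for reproducibility
    while out_l.count(rare) < majority:
        i = rng.choice(rare_idx)
        out_s.append(samples[i])
        out_l.append(rare)
    return out_s, out_l

samples = [[x] for x in range(10)]
labels = ["frequent"] * 8 + ["rare"] * 2
_, balanced = oversample(samples, labels, "rare")
print(balanced.count("rare"), balanced.count("frequent"))  # -> 8 8
```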

In ref. [ 51 ], a detailed study on crime classification and prediction using ML and deep learning architectures is presented. Certain ML methodologies, such as random forest, naïve Bayes, and SVM, have been used in the literature to predict the number of crimes and for hotspot prediction. Deep learning is a ML approach that can overcome the limitations of some machine-learning methodologies by extracting features from the raw data. Their paper presents three fundamental deep learning configurations for crime prediction: (1) spatial and temporal patterns, (2) temporal and spatial patterns, and (3) spatial and temporal patterns in parallel. Moreover, the proposed model was compared with 10 state-of-the-art algorithms on 5 different crime prediction datasets with more than 10 years of crime data.

In ref. [ 52 ], a big data and ML technique for behavior analysis and crime prediction is presented. This paper discusses the tracking of information using big data, different data collection approaches, and the last phase of crime prediction using ML techniques based on data collection and analysis. A predictive analysis was conducted through ML using RapidMiner by processing historical crime patterns. The research was mainly conducted in four phases: data collection, data preparation, data analysis, and data visualization. It was concluded that big data is a suitable framework for analyzing crime data because it can provide a high throughput and fault tolerance, analyze extremely large datasets, and generate reliable results, whereas the ML based naïve Bayes algorithm can achieve better predictions using the available datasets.

In ref. [ 53 ], various data mining and ML technologies used in criminal investigations are demonstrated. The contribution of this study is highlighting the methodologies used in crime data analytics. Various ML methods, such as a KNN, SVM, naïve Bayes, and clustering, were used for the classification, understanding, and analysis of datasets based on predefined conditions. By understanding and analyzing the data available in the crime record, the type of crime and the hotspot of future criminal activities can be determined. The proposed model was designed to perform various operations such as feature selection, clustering, analysis, prediction, and evaluation of the given datasets. This research proves the necessity of ML techniques for predicting and analyzing criminal activities.

In ref. [ 54 ], the authors incorporated the concept of a grid-based crime prediction model and established a range of spatial-temporal features based on 84 types of geographic locations for a city in Taiwan. The concept uses ML algorithms to learn the patterns and predict crime for the following month for each grid. Among the many ML methods applied, the best model was found to be a DNN. The main contribution of this study is the use of the most recent ML techniques, including the concept of feature learning. In addition, the testing of crime displacement also showed that the proposed model design outperformed the baseline.

In ref. [ 55 ], the authors considered the development of a crime prediction model using the decision tree (J48) algorithm. When applied in the context of law enforcement and intelligence analysis, J48 holds the promise of reducing crime rates and is considered the most efficient ML algorithm for the prediction of crime data in the related literature. The J48 classifier was developed using the WEKA tool kit and later trained on a preprocessed crime dataset. The experimental results of the J48 algorithm predicted the unknown category of crime data with an accuracy of 94.25287%. With such high accuracy, it is fair to count on the system for future crime predictions.
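J48 is WEKA's implementation of the C4.5 decision tree, which chooses splits by information gain. A minimal sketch of that criterion, on invented binary-labelled crime records:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(labels, groups):
    """Entropy reduction achieved by splitting `labels` into `groups`."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder

labels = ["violent", "non-violent"] * 4
perfect = [["violent"] * 4, ["non-violent"] * 4]   # split separates classes
useless = [labels[:4], labels[4:]]                 # each half still mixed

print(information_gain(labels, perfect))  # -> 1.0
print(information_gain(labels, useless))  # -> 0.0
```

C4.5 evaluates candidate attribute splits this way at every node and recurses on the winner.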

Comparative study of different forecasting methods

First, in refs. [ 56 , 57 ], the authors predicted crime using the KNN algorithm in the years 2014 and 2013, respectively. Sun et al. [ 56 ] proved that a higher crime prediction accuracy can be obtained by combining the grey correlation analysis-based new weighted KNN (GBWKNN) filling algorithm with the KNN classification algorithm. Using the proposed algorithm, they were able to obtain an accuracy of approximately 67%. By contrast, Shojaee et al. [ 57 ] divided crime data into two parts, namely, critical and non-critical, and applied a simple KNN algorithm. They achieved an astonishing accuracy of approximately 87%.

Second, in refs. [ 58 , 59 ], crime is predicted using a decision tree algorithm for the years 2015 and 2013, respectively. In their study, Obuandike et al. [ 58 ] used the ZeroR algorithm along with a decision tree but failed to achieve an accuracy of above 60%. In addition, Iqbal et al. [ 59 ] achieved a stunning accuracy of 84% using a decision tree algorithm. In both cases, however, a small change in the data could lead to a large change in the structure.

Third, in refs. [ 60 , 61 ], a novel crime detection technique called naïve Bayes was implemented for crime prediction and analysis. Jangra and Kalsi [ 60 ] achieved an astounding crime prediction accuracy of 87%, but could not apply their approach to datasets with a large number of features. By contrast, Wibowo and Oesman [ 61 ] achieved an accuracy of only 66% in predicting crimes and failed to consider the computational speed, robustness, and scalability.
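The naïve Bayes technique compared above can be sketched as a categorical classifier with Laplace smoothing; the records below are invented, and the sketch is not either paper's implementation.

```python
def train_nb(rows, labels):
    """Estimate class priors and smoothed per-feature likelihoods."""
    classes = set(labels)
    priors = {c: labels.count(c) / len(labels) for c in classes}
    likelihoods = {}
    for j in range(len(rows[0])):
        values = {r[j] for r in rows}
        for c in classes:
            sub = [r[j] for r, l in zip(rows, labels) if l == c]
            for v in values:  # +1 Laplace smoothing avoids zero probabilities
                likelihoods[(j, v, c)] = (sub.count(v) + 1) / (len(sub) + len(values))
    return priors, likelihoods

def predict(model, row):
    """Pick the class maximizing prior * product of feature likelihoods."""
    priors, likelihoods = model
    def score(c):
        p = priors[c]
        for j, v in enumerate(row):
            p *= likelihoods.get((j, v, c), 1e-9)
        return p
    return max(priors, key=score)

rows = [("night", "downtown"), ("night", "downtown"),
        ("day", "suburb"), ("day", "suburb")]
labels = ["robbery", "robbery", "fraud", "fraud"]
model = train_nb(rows, labels)
print(predict(model, ("night", "downtown")))  # -> robbery
```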

Below, we summarize the above comparison and add other models to further illustrate this comparative study and the accuracy of some frequently used models (Table  1 ).

Computer vision models combined with machine and deep learning techniques

In ref. [ 66 ], the study focused on three main questions. First, the authors asked whether computer vision algorithms actually work. They stated that the prediction accuracy is 90% on less complex datasets but drops to 60% on complex datasets; a further concern is reducing storage and computational costs. Second, they asked whether it is effective for policing. They determined that distinct activity detection is difficult, and pinpointed a key component, the Public Safety Visual Analytics Workstation, which includes many capabilities ranging from detection and localization of objects in camera feeds to labeling actions and events associated with training data, and allowing query-based searches for specific events in videos. By doing so, they aim to view every event as a computer-vision trained, recognized, and labeled event. The third and final question is whether computer vision impacts the criminal justice system. Their answer is quite optimistic, to say the least, although they wish to implement computer vision alone, which we suspect is unsatisfactory.

In ref. [ 67 ], a framework for multi-camera video surveillance is presented. The framework is designed to perform all three major activities of a typical police “stakeout”: detection, representation, and recognition. The detection part fuses video streams from multiple cameras to efficiently and reliably extract motion trajectories. The representation part condenses the raw trajectory data into hierarchical, invariant, and content-rich descriptions of motion events. Finally, the recognition part deals with event classification (such as robbery, and possibly murder and molestation, among others) and identification of the data descriptors. For effective recognition, they developed a sequence-alignment kernel function to perform sequence data learning and identify suspicious or possible crime events.

In ref. [ 68 ], a method is suggested for identifying people for surveillance with the help of a new feature called soft biometry, which includes a person’s height, build, skin tone, shirt and trouser color, motion pattern, and trajectory history to identify and track passengers, which further helps in predicting criminal activities. The authors went further and discussed some absurd human-error incidents that have resulted in perpetrators getting away. They also conducted experiments, the results of which were quite astounding. In one case, the camera caught people giving piggyback rides in more than one frame of a single-shot video. The second scenario showed the camera’s ability to distinguish between airport guards and passengers.

In ref. [ 69 ], the authors discussed automated visual surveillance in a realistic scenario and used Knight, which is a multiple camera surveillance and monitoring system. Their major targets were to analyze the detection, tracking, and classification performances. The detection, tracking, and classification accuracies were 97.4%, 96.7%, and 88%, respectively. The authors also pointed to the major difficulties of illumination changes, camouflage, uninteresting moving objects, and shadows. This research again proves the reliability of computer vision models.

It is well known that an ideal scenario in which a camera achieves perfect resolution is not possible. In ref. [ 70 ], it was noted that security surveillance systems often produce poor-quality video, which can be a hurdle in gathering forensic evidence. The authors examined the ability of subjects to identify targeted individuals captured by a commercially available video security device. In the first experiment, subjects personally familiar with the targets performed extremely well at identifying them, whereas subjects unfamiliar with the targets performed quite poorly. Notably, police officers with experience in forensic identification performed as poorly as other subjects unfamiliar with the targets. In the second experiment, the authors asked how familiar subjects could perform so well, and used clips from the same video device edited to obscure the head, body, or gait of the targets. Hiding the body or gait produced a small decrease in recognition performance, whereas hiding the targets’ heads had a dramatic effect on the subjects’ ability to recognize them. This indicates that even when video quality is low, recognition succeeds as long as the target’s head can be seen.

In ref. [ 71 ], an automatic number plate recognition (ANPR) model is proposed. The authors described it as an “image processing innovation”. The ANPR system consists of the following steps: (1) vehicle image capture, (2) preprocessing, (3) number plate extraction, (4) character segmentation, and (5) character recognition. Before the main image processing, a preprocessing of the captured image is conducted, which includes converting the red, green, and blue image into a gray image, noise removal, and border enhancement for brightness. The plate is then separated by judging its size. In character segmentation, the letters and numbers are separated and viewed individually. In character recognition, optical character recognition is applied against a given database.
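The character segmentation step (4) can be sketched as a column-projection scan over a binarized plate image, where blank columns separate characters. The tiny bitmap below is synthetic; real ANPR systems are considerably more involved.

```python
def segment_columns(image):
    """Return (start, end) column spans that contain any foreground pixel."""
    cols = [any(row[c] for row in image) for c in range(len(image[0]))]
    spans, start = [], None
    for c, on in enumerate(cols):
        if on and start is None:
            start = c                      # a character span begins
        elif not on and start is not None:
            spans.append((start, c))       # a blank column ends it
            start = None
    if start is not None:
        spans.append((start, len(cols)))
    return spans

# Two 2-column "characters" separated by one blank column (1 = dark pixel).
plate = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
print(segment_columns(plate))  # -> [(0, 2), (3, 5)]
```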

Although real-time crime forecasting is vital, it is extremely difficult to achieve in practice. No known physical models provide a reasonable approximation with dependable results for such a complex system. In ref. [ 72 ], the authors adapted a spatial-temporal residual network to well-represented data to predict the distribution of crime in Los Angeles at an hourly scale in neighborhood-sized parcels. The experiments compared this deep learning approach with several existing prediction methods, namely ARIMA, KNN, and the historical average, demonstrating the superiority of the proposed model in terms of accuracy. In addition, they presented a ternarization technique to address the concerns of resource consumption for deployment in the real world.
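As a point of reference, the historical-average baseline that the deep model was compared against can be sketched in a few lines: predict the count for an (hour, cell) slot as the mean of that slot's past observations. The slot structure and counts below are invented for illustration.

```python
from statistics import mean

def historical_average(history):
    """history: dict mapping (hour, cell) -> list of past crime counts.

    Returns the per-slot mean, used as the forecast for that slot."""
    return {slot: mean(counts) for slot, counts in history.items()}

# Toy history: an evening hotspot cell and the same cell at 3 a.m.
history = {(18, "cell_A"): [4, 6, 5],
           (3, "cell_A"): [0, 1, 0]}
forecast = historical_average(history)
```

Despite its simplicity, this baseline is a standard yardstick for spatiotemporal crime models.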

In ref. [ 73 ], the authors conducted a significant study on crime prediction and showed the importance of non-crime data. The major objective of this research was to take advantage of DNNs to achieve crime prediction over a fine-grained city partition. They made predictions using Chicago and Portland crime data, which were further augmented with additional datasets covering weather, census data, and public transportation. They split each city into grid cells (beats for Chicago and a square grid for Portland), broke the crime counts into 10 bins, and had their model predict the most likely bin for each spatial region at a daily level. They trained on these data using increasingly complex neural network structures, including variations suited to the spatial and temporal aspects of the crime prediction problem. Using their model, they were able to predict the correct bin for the overall number of crimes with an accuracy of 75.6% for Chicago and 65.3% for Portland, and they showed that the additional non-crime data were an important factor. They found that days with higher amounts of precipitation and snow slightly decreased the accuracy of the model. Regarding the impact of transportation, bus and train routes were mapped to their beats; the accuracy for a beat containing a train station was on average 1.2% higher than for its neighboring beats, and a beat with one or more train lines passing through it was 0.5% more accurate than its neighbors.
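The discretization idea described above (predicting one of 10 bins rather than an exact count) can be illustrated with a short sketch. The bin edges here are invented, since the paper's actual edges are not given.

```python
from bisect import bisect_right

# Hypothetical upper edges: 9 edges partition counts into 10 bins (0..9).
EDGES = [1, 2, 4, 6, 9, 13, 18, 25, 35]

def count_to_bin(count):
    """Map a daily crime count for a grid cell to its bin index."""
    return bisect_right(EDGES, count)
```

A classifier then predicts `count_to_bin(...)` for each cell-day instead of regressing the raw count, which turns the task into a 10-class problem.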

In ref. [ 74 ], the authors taught a system how to monitor traffic and identify vehicles at night. They used the bright spots of the headlights and taillights to identify an object as a vehicle: the bright lights are first extracted with a segmentation process and then handled by a spatial clustering and tracking procedure that locates and analyzes the spatial and temporal features of the vehicle lights. In one experiment spanning 20 min, the detection rates for cars and bikes were 98.79% and 96.84%, respectively; in another test under the same conditions lasting 50 min, the rates were 97.58% and 98.48%, respectively. Such performance is promising for a system at this early stage, and the technology can also be used to conduct surveillance at night.
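A toy version of the bright-spot segmentation and spatial clustering idea might look as follows: threshold a night frame to keep only bright pixels, then flood-fill them into connected blobs (candidate head/tail lights). The threshold and the 3 × 5 "frame" are invented; this is not the cited system's implementation.

```python
def bright_blobs(frame, threshold=200):
    """Group bright pixels of a 2D intensity grid into 4-connected blobs."""
    rows, cols = len(frame), len(frame[0])
    seen, blobs = set(), []
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] >= threshold and (r, c) not in seen:
                stack, blob = [(r, c)], []
                seen.add((r, c))
                while stack:  # flood-fill one blob
                    y, x = stack.pop()
                    blob.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and frame[ny][nx] >= threshold
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                blobs.append(blob)
    return blobs

# Toy night frame: one 3-pixel light top-left, one 2-pixel light at right.
frame = [[0, 255, 255, 0, 0],
         [0, 255,   0, 0, 230],
         [0,   0,   0, 0, 230]]
blobs = bright_blobs(frame)
```

The tracking stage would then follow each blob's position over successive frames.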

In ref. [ 75 ], an important approach for human motion analysis is discussed. The author notes that human motion analysis is difficult because appearances are extremely variable, and thus stresses that marker-less vision-based human motion analysis has the potential to provide a non-obtrusive solution for the evaluation of body poses. The author claims that this technology can have vast applications, such as surveillance, human-computer interaction, and automatic annotation, and will thus benefit from a robust solution. The paper discusses the characteristics of human motion analysis, dividing it into a modeling phase and an estimation phase. The modeling phase covers the construction of the likelihood function [including the camera model, image descriptors, human body model and matching function, and (physical) constraints], whereas the estimation phase is concerned with finding the most likely pose given the likelihood surface. Model-free approaches are discussed separately.

In ref. [ 76 ], the authors provided insight into how crime mapping can be achieved using satellites. Manual data collection for mapping is costly and time consuming; by contrast, satellite imagery is becoming a great alternative. In this paper, they investigated the use of deep learning to predict crime rates directly from raw satellite imagery. They trained a deep convolutional neural network (CNN) on satellite images obtained from over 1 million crime-incident reports (15 years of data) collected by the Chicago Police Department. The best performing model predicted crime rates from raw satellite imagery with an astounding accuracy of 79%. To make their research more thorough, they conducted a test of reusability, applying the models learned on Chicago to prediction in the cities of Denver and San Francisco. Compared to maps made from years of data collected by the corresponding police departments, their maps achieved accuracies of 72% and 70%, respectively. They concluded the following: (1) visual features contained in satellite imagery can be successfully used as a proxy indicator of crime rates; (2) ConvNets are capable of learning models for crime rate prediction from satellite imagery; and (3) once deep models are learned, they can be reused across different cities.

In ref. [ 77 ], the authors suggested an extremely intriguing research approach in which they claim that it is possible to look beyond what is visible and infer meaning from an image. They even conducted an interesting study on determining where a McDonald's could be located simply from photographs, and raised the possibility of predicting crime. On this task, they compared human accuracy, which was 59.6%, and the accuracy of using gradient-based features, which was 72.5%, against a chance performance (what one would obtain by answering at random) of only 50%. This indicates the presence of visual cues that are not easily spotted by an average human but can be spotted by a machine, thus enabling it to judge whether an area is safe. The authors noted that numerous such factors feed the intuition we use to avoid certain areas because they seem “shady” or “unsafe”.

In ref. [ 78 ], the authors describe in two parts how close we are to achieving a fully automated surveillance system. The first part reviews the feasibility of surveillance in a real-world scenario, where the installation and maintenance of systems are in question. The second part considers the implementation of computer vision models and algorithms for behavior modeling and event detection. They concluded that the complete scenario is still under discussion, and many groups are conducting research and obtaining results. However, on closer inspection, reliable results are possible only in certain aspects, while other areas, such as obtaining information on cars and their owners or accurately understanding the behavior of a possible suspect, are still under development.

Many times during criminal activities, convicts use hand gestures to signal messages to each other. In ref. [ 79 ], research on hand gesture recognition was conducted using computer vision models. Their application architecture is of extremely high quality and easy to understand. They begin by capturing images and then detect a hand against the background. They apply one of two procedures: in one, they first convert the picture into grayscale, set the image region of interest (ROI), and then find and extract the biggest contour. They then determine the convex hull of the contour to find an orientation around the bounding rectangle, and finally interpret the gesture and convert it into a meaningful command.

Crime hotspots, or areas with high crime intensity, are places where a crime is likely in the future and where a criminal may be spotted. In ref. [ 80 ], the authors conducted research on forecasting crime hotspots. They used Google TensorFlow to implement their models and evaluated three recurrent neural network (RNN) architectures, comparing them on accuracy, precision, and recall, where larger values indicate better performance. The gated recurrent unit (GRU) and long short-term memory (LSTM) versions obtained similar performance levels, with an accuracy of 81.5%, precision of 86%–87%, recall of 75%, and F1-score of 0.8; both performed much better than the traditional RNN version. Based on the area under the ROC curve (AUC), the GRU version was 2% better than the RNN version, and the LSTM version achieved the best AUC score, a 3% improvement over the GRU version.
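As a sanity check on the reported numbers, the F1-score is the harmonic mean of precision and recall, so precision of about 0.86 and recall of 0.75 do land near the stated F1 of 0.8:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.86, 0.75)  # approximately 0.80, matching the paper
```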

In ref. [ 81 ], a spatiotemporal crime network (STCN) is proposed that applies a CNN to predict crime before it occurs. The authors evaluated the STCN using 311 felony datasets from New York covering 2010 to 2015. The results were extremely impressive, with the STCN achieving an F1-score of 88% and an AUC of 92%, exceeding the performance of the four baselines; it remained better than the other baselines even when the time window reached 100. This study provides evidence that the system can function well even in a metropolitan area.

Proposed idea

After finding and understanding the various distinct methods used by the police for surveillance purposes, we determined the importance of each. Each surveillance method can perform well on its own and produce satisfactory results, but only for one specific characteristic; for example, a Sting Ray can help us only when the suspect is using a phone that is switched on, and a stakeout is useful only when the information regarding its location is correct. Against this background, we can see how ever-evolving technology has yet again produced a smart way to conduct surveillance. The introduction of deep learning, ML, and computer vision techniques has provided a new perspective on ways to conduct surveillance. This is an intelligent approach because it tries to mimic a human approach, but it does so 24 h a day, 365 days a year, and once it has been taught how to do things, it does them in the same manner repeatedly.

Although we have discussed the aspects that ML and computer vision can address, what are these aspects essentially? This brings us to the main point of our paper, i.e., our proposed idea, which is to combine the strong points of Sting Rays, body cams, facial recognition, number plate recognition, and stakeouts. The new features include core analytics, neural networks, heuristic engines, recursion processors, Bayesian networks, data acquisition, cryptographic algorithms, document processors, computational linguistics, voiceprint identification, natural language processing, gait analysis, biometric recognition, pattern mining, intel interpretation, threat detection, and threat classification. These features are completely computer dependent and hence require human interaction only for development; once developed, the system functions without human interaction and frees humans for other tasks. Let us examine the use of each function.

Core analytics: This involves knowledge of a variety of statistical techniques, used to predict future outcomes, which in our case can be anything from behavioral instincts to the looting of a store in the near future.

Neural networks: This concept comprises a large number of algorithms that help find relations within data by acting similar to a human brain, mimicking biological nerve cells; the network thereby attempts to reason on its own, understanding or even predicting a crime scene.
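A minimal sketch of a single artificial "nerve cell" clarifies the analogy: it forms a weighted sum of its inputs plus a bias and squashes the result through a sigmoid. The weights below are arbitrary placeholders.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum + bias, then sigmoid activation."""
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# With zero net input the sigmoid sits exactly at its midpoint, 0.5.
midpoint = neuron([0.0, 0.0], [1.0, 1.0], 0.0)
```

A full network stacks many such units in layers and learns the weights from data.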

Heuristic engines: These are engines equipped with data regarding antiviruses, and thus knowledge about viruses, increasing the safety of our system by identifying the type of threat and eliminating it using known antivirus measures.

Cryptographic algorithms: Such algorithms are used in two parts. First, they privately encode the known confidential criminal data. Second, they are used to keep the newly discovered potential crime data encrypted.

Recursion processors: These apply the functions of our machine repeatedly to ensure that they work continuously and that the machine's surveillance is never broken.

Bayesian networks: These are probabilistic acyclic graphical models that can be used for a variety of purposes such as prediction, anomaly detection, diagnostics, automated insight, reasoning, time series prediction, and decision making under uncertainty.
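A minimal two-node example shows the kind of reasoning under uncertainty such a network encodes: given a prior on "incident in this area tonight" and a sensor's alert behavior, Bayes' rule inverts the evidence. All probabilities here are invented for illustration.

```python
def posterior(prior, p_alert_given_incident, p_alert_given_none):
    """P(incident | alert) via Bayes' rule for a two-node network."""
    evidence = (p_alert_given_incident * prior
                + p_alert_given_none * (1 - prior))
    return p_alert_given_incident * prior / evidence

# Rare incidents (1%), a sensitive sensor (90%), a 5% false-alert rate:
p = posterior(prior=0.01, p_alert_given_incident=0.9, p_alert_given_none=0.05)
```

Even with a sensitive sensor, the low prior keeps the posterior modest (about 15%), which is exactly the kind of calibrated judgment a Bayesian network contributes.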

Data acquisition: This might be the most important part because our system has to possess the knowledge of previous crimes and learn from them to predict future possible criminal events.

Document processors: These are used after the data collection, primarily for going through, organizing, analyzing, and learning from the data.

Computational linguistics: Using algorithms and learning models, this field attempts to give a computer the ability to understand human spoken language, which would be groundbreaking, allowing a machine not only to identify a human but also to understand what the human is saying.

Natural language processing: This is also used by computers to better understand human linguistics.

Voiceprint identification: This is an interesting application that tries to distinguish one person's voice from another, making each voice recognizable and identifiable. It identifies a target with the help of certain characteristics, such as the configuration of the speaker's mouth and throat, which can be expressed as a mathematical formula.
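The matching step can be sketched by reducing two recordings to feature vectors and comparing them with cosine similarity. The 3-dimensional vectors below are placeholders, not real voiceprint features.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm

# Two nearly identical (toy) voiceprints should score close to 1.
same_speaker = cosine_similarity([0.9, 0.1, 0.4], [0.85, 0.15, 0.38])
```

A real system would extract the vectors from spectral features and threshold the similarity to declare a match.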

Gait analysis: This will be used to study human motion and understand posture while walking. It will help establish the normal pace of a person and thus judge an abnormal pace.
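The pace-judgment idea can be sketched as a simple outlier test against a person's historical walking speeds. The speeds (in m/s) and the 3-sigma cutoff below are invented for illustration.

```python
from statistics import mean, stdev

def is_abnormal_pace(history, speed, z_cutoff=3.0):
    """Flag a walking speed further than z_cutoff standard deviations
    from the person's historical mean."""
    mu, sigma = mean(history), stdev(history)
    return abs(speed - mu) > z_cutoff * sigma

# Toy history of a person's usual walking speeds.
history = [1.30, 1.35, 1.28, 1.32, 1.31]
```

A sudden sprint (say 2.5 m/s) would be flagged, while normal variation would not.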

Biometric identification: This is used to identify individuals by their face or, if possible, by their thumbprint stored in a few different databases.

Pattern mining: This is a subset of data mining and helps in observing patterns among routine activities. It will help us identify whether a person is seen an unusual number of times behind, say, a pharmacy window at a particular time, allowing the machine to alert the authorities.
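A toy sketch of this sighting-frequency idea: count how often each (person, location, hour) combination appears in a sighting log and flag combinations past a threshold. The log entries and the threshold are invented.

```python
from collections import Counter

def flag_unusual(sightings, threshold=3):
    """Return the (person, location, hour) keys seen at least `threshold` times."""
    counts = Counter(sightings)
    return {key for key, n in counts.items() if n >= threshold}

# Toy log: person p1 repeatedly seen at the pharmacy at 11 p.m.
log = [("p1", "pharmacy", 23), ("p1", "pharmacy", 23),
       ("p1", "pharmacy", 23), ("p2", "pharmacy", 14)]
flags = flag_unusual(log)
```

Real pattern mining would, of course, work over richer features and learned (rather than fixed) thresholds.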

Intel interpretation: This is used to make sense of the information gathered; it draws on almost all the features mentioned above, combining their results into a final meaningful prediction.

Threat detection: A threat is detected if, during intel processing, a certain number of check boxes predefined when building the system are ticked.

Threat classification: As soon as a threat is detected, it is classified; threats can be categorized into criminal case levels such as burglary, murder, or a possible terrorist attack, and based on the timeline, near- or distant-future threats might be predictable.
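The check-box logic of threat detection and classification might be sketched together as follows; the indicator names, category rules, and minimum box count are all invented for illustration.

```python
# Hypothetical rules: each category requires a set of ticked indicators.
CATEGORY_RULES = {
    "burglary": {"forced_entry", "night_time"},
    "violent":  {"weapon_visible", "aggressive_gait"},
}

def classify(ticked, min_boxes=2):
    """Declare no threat below `min_boxes` ticks; otherwise pick the first
    category whose required indicators are all ticked."""
    if len(ticked) < min_boxes:
        return None  # too few boxes ticked: no threat declared
    for category, required in CATEGORY_RULES.items():
        if required <= ticked:  # subset test
            return category
    return "unclassified"

result = classify({"forced_entry", "night_time", "loitering"})
```

A deployed system would replace these hand-written rules with learned classifiers, but the detect-then-classify structure is the same.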

Combining all of these features, we aim to produce software with the capability of becoming a universal police officer, having eyes and ears everywhere. Naturally, we intend to use the CCTVs in urban areas during a preliminary round to observe the functioning of such software in a real-world scenario. The idea is to train the software on all previously recorded crimes whose footage is available (at least 5000 cases for optimum results), through supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, to help it understand what a crime actually is. It will thereby achieve a better understanding of criminality and be able to answer how crimes happen, as well as why and where. We do not propose simply building a world-class model to predict crimes; we also suggest making it understand previous crimes to better judge and therefore better predict them.

We aim to use this type of technology on two fronts: first and most importantly, to predict crimes before they happen; and second, to thoroughly analyze a crime scene, allowing the system to identify aspects that even a human eye might miss.

The most interesting cutting-edge and evolutionary idea that we believe should be incorporated is the use of scenario simulations. After analyzing the scene using the 17 main characteristics mentioned above, the software should run at least 50 simulations of the scenario presented to it, assisted by previously learned crime recordings. The simulations will help the software assess the threat level and accordingly recommend a course of action or alert police officials.
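This 50-simulation idea can be sketched as a seeded Monte Carlo run in which each replay of the scene escalates with some assumed probability, and the fraction of escalating replays serves as the threat level. The escalation probability is invented; a real system would derive it from the scene analysis.

```python
import random

def simulated_threat_level(p_escalate, runs=50, seed=0):
    """Replay the scenario `runs` times and return the fraction of
    replays in which it escalates (a crude threat level in [0, 1])."""
    rng = random.Random(seed)  # seeded for reproducibility
    escalations = sum(rng.random() < p_escalate for _ in range(runs))
    return escalations / runs

level = simulated_threat_level(0.7)
```

Thresholding `level` would then drive the recommendation: log, monitor, or alert officials.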

To visualize a possible scenario where we are able to invent such software, we prepared a flow chart (Fig.  3 ) to better understand the complete process.

Fig. 3

Flowchart of our proposed model. The data are absorbed from the surroundings with the help of cameras and microphones. If the system flags an activity as suspicious, it gathers more intel, allowing the facial algorithms to match against a large database such as a Social Security Number or Aadhaar card database. When it detects a threat, it also classifies it into categories such as the nature of the crime and the time span within which it is likely to take place. With all the gathered intel and the necessary details of the possible crime, it alerts the respective authority with a 60-word synopsis to give them a brief idea, allowing law enforcement agencies to take action accordingly

Although the research behind this paper was conducted with accuracy and in detail, certain challenges could pose problems in the future. First, the whole system must be correctly and completely built before its implementation can proceed properly. Furthermore, the implementation itself is a significant concern, as such technologies cannot be deployed directly in the open world. The system must first be tested in a small part of a metropolitan area, and only then, with constant improvements (revisions of the first model), can its usage be scaled up. Hence, these challenges help in perfecting the model, gradually yielding a version that can be applied to the real world. Moreover, there are hurdles in the technological aspects of the model: the learning data will be enormous, and processing it may take days or even weeks. Although these challenges need to be addressed, a collective team of experts can overcome them with due diligence, and the end product will be worth the hard work and persistence.

Future scope

This paper presented the techniques and methods that can be used to predict crime and help law agencies. Using different methods for crime prediction and prevention can change the scenario of law enforcement agencies, and a combination of ML and computer vision can substantially impact their overall functionality. In the near future, by combining ML and computer vision with security equipment such as surveillance cameras and spotting scopes, a machine can learn the patterns of previous crimes, understand what crime actually is, and predict future crimes accurately without human intervention. A possible automation would be a system that predicts and anticipates the zones of crime hotspots in a city; law enforcement agencies could then be warned and prevent crime from occurring by implementing more surveillance within the predicted zone. Such complete automation can overcome the drawbacks of the current system, and law enforcement agencies can come to depend more on these techniques. Designing a machine to anticipate and identify patterns of such crimes will be the starting point of our future study. Although the current systems have a large impact on crime prevention, this could be the next big approach and bring about a revolutionary change in crime rates, prediction, detection, and prevention, i.e., a “universal police officer”.

Conclusions

Predicting crimes before they happen is simple to understand, but making it a reality takes much more than understanding the concept. This paper was written to assist researchers aiming to make crime prediction a reality and implement such advanced technology in real life. Although police do adopt new technologies such as Sting Rays and facial recognition every few years, the implementation of such software can fundamentally change the way police work, for the better. This paper outlined a framework envisaging how aspects of machine and deep learning, along with computer vision, can help create a system that is much more helpful to the police. Our proposed system comprises a collection of technologies that perform everything from monitoring crime hotspots to recognizing people by their voices. The first difficulty will be to actually build this system, followed by problems such as its implementation and use, among others. However, all of these problems are solvable, and we would in return benefit from a security system that monitors the entire city around the clock. In other words, in a world where such a system is incorporated into a police force, tips and leads that are much more reliable can be obtained, and perhaps crime can be eradicated at a much faster rate.

Availability of data and materials

All relevant data and material are presented in the main paper.

Abbreviations

ML: Machine learning

NP: Nondeterministic polynomial

WEKA: Waikato Environment for Knowledge Analysis

KNN: K-nearest neighbor

ANPR: Automatic number plate recognition

DNN: Deep neural network

KDE: Kernel density estimation

SVM: Support vector machine

GBWKNN: Grey correlation analysis based on new weighted KNN

ARIMA: Autoregressive integrated moving average

STCN: Spatiotemporal crime network

CNN: Convolutional neural network

AUC: Area under the ROC curve

RNN: Recurrent neural network

GRU: Gated recurrent unit

LSTM: Long short-term memory

APE: Absolute percent error

Shah D, Dixit R, Shah A, Shah P, Shah M (2020) A comprehensive analysis regarding several breakthroughs based on computer intelligence targeting various syndromes. Augment Hum Res 5(1):14.  https://doi.org/10.1007/s41133-020-00033-z

Patel H, Prajapati D, Mahida D, Shah M (2020) Transforming petroleum downstream sector through big data: a holistic review. J Pet Explor Prod Technol 10(6):2601–2611.  https://doi.org/10.1007/s13202-020-00889-2

Szeliski R (2010) Computer vision: algorithms and applications. Springer-Verlag, Berlin, pp 1–979


Vedaldi A, Fulkerson B (2010) Vlfeat: an open and portable library of computer vision algorithms. Paper presented at the 18th ACM international conference on multimedia. ACM, Firenze. https://doi.org/10.1145/1873951.1874249

Le TL, Nguyen MQ, Nguyen TTM (2013) Human posture recognition using human skeleton provided by Kinect. In: Paper presented at the 2013 international conference on computing, management and telecommunications. IEEE, Ho Chi Minh City. https://doi.org/10.1109/ComManTel.2013.6482417

Ahir K, Govani K, Gajera R, Shah M (2020) Application on virtual reality for enhanced education learning, military training and sports. Augment Hum Res 5(1):7. ( https://doi.org/10.1007/s41133-019-0025-2 )

Talaviya T, Shah D, Patel N, Yagnik H, Shah M (2020) Implementation of artificial intelligence in agriculture for optimisation of irrigation and application of pesticides and herbicides. Artif Intell Agric 4:58–73.  https://doi.org/10.1016/j.aiia.2020.04.002

Jha K, Doshi A, Patel P, Shah M (2019) A comprehensive review on automation in agriculture using artificial intelligence. Artif Intell Agric 2:1–12.  https://doi.org/10.1016/j.aiia.2019.05.004

Kakkad V, Patel M, Shah M (2019) Biometric authentication and image encryption for image security in cloud framework. Multiscale Multidiscip Model Exp Des 2(4):233–248.  https://doi.org/10.1007/s41939-019-00049-y

Pathan M, Patel N, Yagnik H, Shah M (2020) Artificial cognition for applications in smart agriculture: a comprehensive review. Artif Intell Agric 4:81–95.  https://doi.org/10.1016/j.aiia.2020.06.001

Pandya R, Nadiadwala S, Shah R, Shah M (2020) Buildout of methodology for meticulous diagnosis of K-complex in EEG for aiding the detection of Alzheimer's by artificial intelligence. Augment Hum Res 5(1):3.  https://doi.org/10.1007/s41133-019-0021-6

Dey A (2016) Machine learning algorithms: a review. Int J Comput Sci Inf Technol 7(3):1174–1179


Sukhadia A, Upadhyay K, Gundeti M, Shah S, Shah M (2020) Optimization of smart traffic governance system using artificial intelligence. Augment Hum Res 5(1):13.  https://doi.org/10.1007/s41133-020-00035-x

Musumeci F, Rottondi C, Nag A, Macaluso I, Zibar D, Ruffini M et al (2019) An overview on application of machine learning techniques in optical networks. IEEE Commun Surv Tutorials 21(2):1381–1408.  https://doi.org/10.1109/COMST.2018.2880039

Patel D, Shah Y, Thakkar N, Shah K, Shah M (2020) Implementation of artificial intelligence techniques for cancer detection. Augment Hum Res 5(1):6. https://doi.org/10.1007/s41133-019-0024-3

Kundalia K, Patel Y, Shah M (2020) Multi-label movie genre detection from a movie poster using knowledge transfer learning. Augment Hum Res 5(1):11. https://doi.org/10.1007/s41133-019-0029-y


Marsland S (2015) Machine learning: an algorithmic perspective. CRC Press, Boca Raton, pp 1–452. https://doi.org/10.1201/b17476-1

Jani K, Chaudhuri M, Patel H, Shah M (2020) Machine learning in films: an approach towards automation in film censoring. J Data Inf Manag 2(1):55–64. https://doi.org/10.1007/s42488-019-00016-9

Parekh V, Shah D, Shah M (2020) Fatigue detection using artificial intelligence framework. Augment Hum Res 5(1):5 https://doi.org/10.1007/s41133-019-0023-4

Gandhi M, Kamdar J, Shah M (2020) Preprocessing of non-symmetrical images for edge detection. Augment Hum Res 5(1):10 https://doi.org/10.1007/s41133-019-0030-5

Panchiwala S, Shah M (2020) A comprehensive study on critical security issues and challenges of the IoT world. J Data Inf Manag 2(7):257–278. https://doi.org/10.1007/s42488-020-00030-2

Simon A, Deo MS, Venkatesan S, Babu DR (2016) An overview of machine learning and its applications. Int J Electr Sci Eng 1(1):22–24.

Parekh P, Patel S, Patel N, Shah M (2020) Systematic review and meta-analysis of augmented reality in medicine, retail, and games. Vis Comput Ind Biomed Art 3(1):21. https://doi.org/10.1186/s42492-020-00057-7

Shah K, Patel H, Sanghvi D, Shah M (2020) A comparative analysis of logistic regression, random forest and KNN models for the text classification. Augment Hum Res 5(1):12. https://doi.org/10.1007/s41133-020-00032-0

Patel D, Shah D, Shah M (2020) The intertwine of brain and body: a quantitative analysis on how big data influences the system of sports. Ann Data Sci 7(1):1–16. https://doi.org/10.1007/s40745-019-00239-y

Judd S (1988) On the complexity of loading shallow neural networks. J Complex 4(3):177–192. https://doi.org/10.1016/0885-064X(88)90019-2


Blum AL, Rivest RL (1992) Training a 3-node neural network is NP-complete. Neural Netw 5(1):117–127. https://doi.org/10.1016/S0893-6080(05)80010-3

Gupta A, Dengre V, Kheruwala HA, Shah M (2020) Comprehensive review of text-mining applications in finance. Financ Innov 6(1):1–25. https://doi.org/10.1186/s40854-020-00205-1

Shah N, Engineer S, Bhagat N, Chauhan H, Shah M (2020) Research trends on the usage of machine learning and artificial intelligence in advertising. Augment Hum Res 5(1):19. https://doi.org/10.1007/s41133-020-00038-8

Naik B, Mehta A, Shah M (2020) Denouements of machine learning and multimodal diagnostic classification of Alzheimer's disease. Vis Comput Ind Biomed Art 3(1):26. https://doi.org/10.1186/s42492-020-00062-w

Chen P, Yuan HY, Shu XM (2008) Forecasting crime using the ARIMA model. In: Paper presented at the 5th international conference on fuzzy systems and knowledge discovery. IEEE, Ji'nan 18-20 October 2008. https://doi.org/10.1109/FSKD.2008.222

Rani A, Rajasree S (2014) Crime trend analysis and prediction using mahanolobis distance and dynamic time warping technique. Int J Comput Sci Inf Technol 5(3):4131–4135

Gorr W, Harries R (2003) Introduction to crime forecasting. Int J Forecast 19(4):551–555. https://doi.org/10.1016/S0169-2070(03)00089-X

Rummens A, Hardyns W, Pauwels L (2017) The use of predictive analysis in spatiotemporal crime forecasting: building and testing a model in an urban context. Appl Geogr 86:255–261. https://doi.org/10.1016/j.apgeog.2017.06.011

Bates A (2017) Stingray: a new frontier in police surveillance. Cato Institute Policy Analysis, No. 809

Joh EE (2017) The undue influence of surveillance technology companies on policing. N Y Univ Law Rev 92:101–130. https://doi.org/10.2139/ssrn.2924620

Vredeveldt A, Kesteloo L, Van Koppen PJ (2018) Writing alone or together: police officers' collaborative reports of an incident. Crim Justice Behav 45(7):1071–1092. https://doi.org/10.1177/0093854818771721

McNeal GS (2014) Drones and aerial surveillance: considerations for legislators. In: Brookings Institution: The Robots Are Coming: The Project On Civilian Robotics, November 2014, Pepperdine University Legal Studies Research Paper No. 2015/3

Fatih T, Bekir C (2015) Police use of technology to fight against crime. Eur Sci J 11(10):286–296

Katz CM, Choate DE, Ready JR, Nuňo L (2014) Evaluating the impact of officer worn body cameras in the Phoenix Police Department. Center for Violence Prevention & Community Safety, Arizona State University, Phoenix, pp 1–43

Stanley J (2015) Police body-mounted cameras: with right policies in place, a win for all. https://www.aclu.org/police-body-mounted-cameras-right-policies-place-win-all . Accessed 15 Aug 2015

McClendon L, Meghanathan N (2015) Using machine learning algorithms to analyze crime data. Mach Lear Appl Int J 2(1):1–12. https://doi.org/10.5121/mlaij.2015.2101

Frank E, Hall M, Trigg L, Holmes G, Witten IH (2004) Data mining in bioinformatics using Weka. Bioinformatics 20(15):2479–2481. https://doi.org/10.1093/bioinformatics/bth261

Kim S, Joshi P, Kalsi PS, Taheri P (2018) Crime analysis through machine learning. In: Paper presented at the IEEE 9th annual information technology, electronics and mobile communication conference. IEEE, Vancouver 1-3 November 2018. https://doi.org/10.1109/IEMCON.2018.8614828

Tabedzki C, Thirumalaiswamy A, van Vliet P (2018) Yo home to Bel-Air: predicting crime on the streets of Philadelphia. In: University of Pennsylvania, CIS 520: machine learning

Bharati A, Sarvanaguru RAK (2018) Crime prediction and analysis using machine learning. Int Res J Eng Technol 5(9):1037–1042

Prithi S, Aravindan S, Anusuya E, Kumar AM (2020) GUI based prediction of crime rate using machine learning approach. Int J Comput Sci Mob Comput 9(3):221–229

Kang HW, Kang HB (2017) Prediction of crime occurrence from multi-modal data using deep learning. PLoS One 12(4):e0176244. https://doi.org/10.1371/journal.pone.0176244

Bandekar SR, Vijayalakshmi C (2020) Design and analysis of machine learning algorithms for the reduction of crime rates in India. Procedia Comput Sci 172:122–127. https://doi.org/10.1016/j.procs.2020.05.018

Hossain S, Abtahee A, Kashem I, Hoque M, Sarker IH (2020) Crime prediction using spatio-temporal data. arXiv preprint arXiv:2003.09322. https://doi.org/10.1007/978-981-15-6648-6_22

Stalidis P, Semertzidis T, Daras P (2018) Examining deep learning architectures for crime classification and prediction. arXiv preprint arXiv:1812.00602. p. 1–13

Jha P, Jha R, Sharma A (2019) Behavior analysis and crime prediction using big data and machine learning. Int J Recent Technol Eng 8(1):461–468

Tyagi D, Sharma S (2018) An approach to crime data analysis: a systematic review. Int J Eng Technol Manag Res 5(2):67–74. https://doi.org/10.29121/ijetmr.v5.i2.2018.615

Lin YL, Yen MF, Yu LC (2018) Grid-based crime prediction using geographical features. ISPRS Int J Geo-Inf 7(8):298. https://doi.org/10.3390/ijgi7080298

Ahishakiye E, Taremwa D, Omulo EO, Niyonzima I (2017) Crime prediction using decision tree (J48) classification algorithm. Int J Comput Inf Technol 6(3):188–195

Sun CC, Yao CL, Li X, Lee K (2014) Detecting crime types using classification algorithms. J Digit Inf Manag 12(8):321–327. https://doi.org/10.14400/JDC.2014.12.8.321

Shojaee S, Mustapha A, Sidi F, Jabar MA (2013) A study on classification learning algorithms to predict crime status. Int J Digital Content Technol Appl 7(9):361–369

Obuandike GN, Isah A, Alhasan J (2015) Analytical study of some selected classification algorithms in WEKA using real crime data. Int J Adv Res Artif Intell 4(12):44–48. https://doi.org/10.14569/IJARAI.2015.041207

Iqbal R, Murad MAA, Mustapha A, Panahy PHS, Khanahmadliravi N (2013) An experimental study of classification algorithms for crime prediction. Indian J Sci Technol 6(3):4219–4225. https://doi.org/10.17485/ijst/2013/v6i3.6

Jangra M, Kalsi S (2019) Crime analysis for multistate network using naive Bayes classifier. Int J Comput Sci Mob Comput 8(6):134–143

Wibowo AH, Oesman TI (2020) The comparative analysis on the accuracy of k-NN, naive Bayes, and decision tree algorithms in predicting crimes and criminal actions in Sleman regency. J Phys Conf Ser 1450:012076. https://doi.org/10.1088/1742-6596/1450/1/012076

Vanhoenshoven F, Nápoles G, Bielen S, Vanhoof K (2017) Fuzzy cognitive maps employing ARIMA components for time series forecasting. In: Czarnowski I, Howlett RJ, Jain LC (eds) Proceedings of the 9th KES international conference on intelligent decision technologies 2017, vol 72. Springer, Heidelberg, pp 255–264. https://doi.org/10.1007/978-3-319-59421-7_24


Gorr W, Olligschlaeger AM, Thompson Y (2000) Assessment of crime forecasting accuracy for deployment of police. Int J Forecast 2000:743–754

Yu CH, Ward MW, Morabito M, Ding W (2011) Crime forecasting using data mining techniques. In: Paper presented at the 2011 IEEE 11th international conference on data mining workshops. IEEE, Vancouver 11-11 December 2011. https://doi.org/10.1109/ICDMW.2011.56

Alves LGA, Ribeiro HV, Rodrigues FA (2018) Crime prediction through urban metrics and statistical learning. Phys A Stat Mech Appl 505:435–443. https://doi.org/10.1016/j.physa.2018.03.084

Idrees H, Shah M, Surette R (2018) Enhancing camera surveillance using computer vision: a research note. Polic Int J 41(2):292–307. https://doi.org/10.1108/PIJPSM-11-2016-0158

Wu G, Wu Y, Jiao L, Wang YF, Chang EY (2003) Multi-camera spatio-temporal fusion and biased sequence-data learning for security surveillance. In: Paper presented at the 11th ACM international conference on multimedia. ACM, Berkeley 2-8 November 2003. https://doi.org/10.1145/957013.957126

Wang YF, Chang EY, Cheng KP (2005) A video analysis framework for soft biometry security surveillance. In: Paper presented at the 3rd ACM international workshop on video surveillance & sensor networks. ACM, Hilton 11 November 2005. https://doi.org/10.1145/1099396.1099412

Shah M, Javed O, Shafique K (2007) Automated visual surveillance in realistic scenarios. IEEE MultiMed 14(1):30–39. https://doi.org/10.1109/MMUL.2007.3

Burton AM, Wilson S, Cowan M, Bruce V (1999) Face recognition in poor-quality video: evidence from security surveillance. Psychol Sci 10(3):243–248. https://doi.org/10.1111/1467-9280.00144

Goyal A, Bhatia R (2016) Automated car number plate detection system to detect far number plates. IOSR J Comput Eng 18(4):34–40. https://doi.org/10.9790/0661-1804033440

Wang B, Yin PH, Bertozzi AL, Brantingham PJ, Osher SJ, Xin J (2019) Deep learning for real-time crime forecasting and its ternarization. Chin Ann Math Ser B 40(6):949–966. https://doi.org/10.1007/s11401-019-0168-y

Stec A, Klabjan D (2018) Forecasting crime with deep learning. arXiv preprint arXiv:1806.01486. p. 1–20

Chen YL, Wu BF, Huang HY, Fan CJ (2011) A real-time vision system for nighttime vehicle detection and traffic surveillance. IEEE Trans Ind Electron 58(5):2030–2044. https://doi.org/10.1109/TIE.2010.2055771

Poppe R (2007) Vision-based human motion analysis: an overview. Comput Vision Image Underst 108(1–2):4–18. https://doi.org/10.1016/j.cviu.2006.10.016

Najjar A, Kaneko S, Miyanaga Y (2018) Crime mapping from satellite imagery via deep learning. arXiv preprint arXiv:1812.06764. p. 1–8

Khosla A, An B, Lim JJ, Torralba A (2014) Looking beyond the visible scene. In: Paper presented at the IEEE conference on computer vision and pattern recognition. IEEE, Columbus 23-28 June 2014. https://doi.org/10.1109/CVPR.2014.474

Dee HM, Velastin SA (2008) How close are we to solving the problem of automated visual surveillance? Mach Vis Appl 19(5–6):329–343. https://doi.org/10.1007/s00138-007-0077-z

Rautaray SS (2012) Real time hand gesture recognition system for dynamic applications. Int J Ubi Comp 3(1):21–31. https://doi.org/10.5121/iju.2012.3103

Zhuang Y, Almeida M, Morabito M, Ding W (2017) Crime hot spot forecasting: a recurrent model with spatial and temporal information. In: Paper presented at the IEEE international conference on big knowledge. IEEE, Hefei 9-10 August 2017. https://doi.org/10.1109/ICBK.2017.3

Duan L, Hu T, Cheng E, Zhu JF, Gao C (2017) Deep convolutional neural networks for spatiotemporal crime prediction. In: Paper presented at the 16th international conference information and knowledge engineering. CSREA Press, Las Vegas 17-20 July 2017


Acknowledgements

The authors are grateful to the Department of Computer Engineering, SAL Institute of Technology and Engineering Research, and the Department of Chemical Engineering, School of Technology, Pandit Deendayal Energy University, for permission to publish this research.

Funding

Not applicable.

Author information

Authors and affiliations

Department of Computer Engineering, Sal Institute of Technology and Engineering Research, Ahmedabad, Gujarat, 380060, India

Neil Shah & Nandish Bhagat

Department of Chemical Engineering, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat, 382426, India

Manan Shah

Contributions

All the authors made substantial contributions to this manuscript. NS, NB, and MS drafted and wrote the main manuscript; all the authors discussed the results and their implications at all stages. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Manan Shah .

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Shah, N., Bhagat, N. & Shah, M. Crime forecasting: a machine learning and computer vision approach to crime prediction and prevention. Vis. Comput. Ind. Biomed. Art 4 , 9 (2021). https://doi.org/10.1186/s42492-021-00075-z


Received : 18 July 2020

Accepted : 05 April 2021

Published : 29 April 2021

DOI : https://doi.org/10.1186/s42492-021-00075-z


Keywords

  • Computer vision
  • Crime forecasting


Associated Data

All relevant data and material are presented in the main paper.


Machine learning (ML) is an application that provides a system with the ability to learn and improve automatically from past experience without being explicitly programmed [ 6 – 8 ]. An exact pattern or relationship cannot always be determined simply by viewing the data [ 9 – 11 ]; in such cases, ML is applied to extract the underlying pattern and information [ 12 , 13 ]. ML pushes forward the idea that, given access to the right data, a machine can learn to solve both complex mathematical problems and certain specific problems [ 14 – 17 ]. In general, ML is categorized into two parts: (1) supervised ML and (2) unsupervised ML [ 18 , 19 ]. In supervised learning, the machine is trained on a predefined set of labeled examples, which enables it to reach precise and accurate conclusions when new data are given [ 20 , 21 ]. In unsupervised learning, the machine is given a set of data and must find common patterns and relationships within the data on its own [ 22 , 23 ]. Neural networks, which are important tools used in supervised learning, have been studied since the 1980s [ 24 , 25 ]. In ref. [ 26 ], the author suggested that architectural constraints alone are insufficient to escape nondeterministic polynomial (NP)-completeness and that different aspects are needed; however, in ref. [ 27 ], it was proved that NP-completeness results extend to neural networks with sigmoid activation functions. Although such research has attempted to demonstrate the various aspects of new ML approaches, how accurate are the results [ 28 – 30 ]?
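
To make the two categories concrete, the following minimal sketch uses hypothetical toy data: a one-nearest-neighbour classifier stands in for supervised learning, where labels are given, and a single k-means assignment step stands in for unsupervised learning, where groups must be found without labels.

```python
# Supervised vs. unsupervised learning on hypothetical 2D points.

def nearest_neighbor_predict(train, query):
    """Supervised: labels are known; return the label of the closest training point."""
    dist = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return min(train, key=lambda t: dist(t[0], query))[1]

def kmeans_assign(points, centroids):
    """Unsupervised: no labels; assign each point to its nearest centroid."""
    dist = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points]

labeled = [((0, 0), "low-crime area"), ((10, 10), "high-crime area")]
print(nearest_neighbor_predict(labeled, (1, 2)))  # → low-crime area

unlabeled = [(0, 1), (1, 0), (9, 10), (10, 9)]
print(kmeans_assign(unlabeled, [(0, 0), (10, 10)]))  # → [0, 0, 1, 1]
```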

Although various crimes and their underlying nature seem unpredictable, how unforeseeable are they really? In ref. [ 31 ], the authors pointed out that as changes in society and the economy give rise to new types of crime, the need for a prediction system has grown. In ref. [ 32 ], crime trend prediction based on the Mahalanobis distance and a dynamic time warping technique is presented, offering the possibility of predicting crimes and apprehending the actual culprit. As described in ref. [ 33 ], in 1998, the United States National Institute of Justice awarded five grants for crime forecasting as an extension of crime mapping. Crime forecasting applications are currently used by law enforcement in the United States, the United Kingdom, the Netherlands, Germany, and Switzerland [ 34 ]. Aided by advances in technology, criminals are becoming more sophisticated with each passing year. Consequently, it has become necessary to provide police departments and governments with a new and powerful machine (a set of programs) that can help them solve crimes. The main aim of crime forecasting is to predict crimes before they occur; hence, the importance of crime forecasting methods is clear. Predicting a crime can be crucial because it may save the life of a victim, prevent lifelong trauma, and avoid damage to private property; it may even be used to anticipate possible terrorist crimes and activities. Finally, if predictive policing is implemented with a considerable level of accuracy, governments can redirect primary resources such as police manpower, detectives, and funds to other areas of crime solving, thereby curbing the problem of crime with double the power.

In this paper, we aim to make an impact by using both ML algorithms and computer vision methods to predict the nature of a crime and possibly pinpoint a culprit. Beforehand, we questioned whether the nature of a crime is predictable. Although it might seem impossible from the outside, categorizing every aspect of a crime is quite possible. Every criminal has a motive; if we use motive as a basis for judging the nature of a crime, we can arrive at a set of categories into which crimes fall. Herein, we discuss a theory in which ML algorithms act as a database of all recorded crimes, organized by category, while computer vision techniques provide visual knowledge of the surroundings; using such data together, we may predict a crime before it occurs.

Present technologies used in crime detection and prediction

Crime forecasting refers to the basic process of predicting crimes before they occur, and tools are needed to do so. Currently, police use tools that assist in specific tasks, such as listening in on a suspect’s phone call or using a body cam to record unusual illegal activity. Below, we list some such tools to better understand where they might stand with additional technological assistance.

One good way of tracking phones is through the use of a stingray [ 35 ], a new frontier in police surveillance that can pinpoint a cellphone’s location by mimicking cellphone towers and broadcasting signals that trick cellphones within the vicinity into transmitting their location and other information. An argument against the use of stingrays in the United States is that it violates the Fourth Amendment. This technology is used in 23 states and in the District of Columbia. In ref. [ 36 ], the authors provide insight into how this is more than just a surveillance system, raising concerns about privacy violations. In addition, the Federal Communications Commission became involved and ultimately urged the manufacturer to meet two conditions in exchange for a grant: (1) “The marketing and sale of these devices shall be limited to federal, state, local public safety and law enforcement officials only” and (2) “State and local law enforcement agencies must advance coordinate with the FBI the acquisition and use of the equipment authorized under this authorization.” Although its use is worthwhile, its implementation remains extremely controversial.

A very popular method that has been in practice since the inception of surveillance is “the stakeout”. A stakeout is the most frequently practiced surveillance technique among police officers and is used to gather information on all types of suspects. In ref. [ 37 ], the authors discuss the importance of the stakeout by noting that police officers witness an extensive range of events about which they are required to write reports: criminal acts observed during stakeouts or patrols; weapons, drugs, and other evidence seen during house searches; and descriptions of their own behavior and that of the suspect during an arrest. Stakeouts are extremely useful and are considered 100% reliable, with the police themselves observing the notable proceedings. However, are they actually 100% accurate? All officers are humans, and all humans are subject to fatigue. The major objective of a stakeout is to observe wrongful activities; is there a tool that can substitute for it? We discuss this point herein.

Another way to conduct surveillance is by using drones, which help in various fields such as mapping cities, chasing suspects, investigating crime scenes and accidents, traffic management and flow, and search and rescue after a disaster. In ref. [ 38 ], legal issues regarding the use of drones and airspace distribution problems are described. Legal issues include the privacy concerns raised by the public, with the police gaining increasing power and authority. Airspace distribution raises concerns about how high a drone is allowed to go.

Other surveillance methods include face recognition, license plate recognition, and body cams. In ref. [ 39 ], the authors indicated that facial recognition can be used to obtain a suspect’s profile and check it against different databases to obtain more information. Similarly, a license plate reader can be used to access data about a car possibly involved in a crime. Police may even use body cams to see more than the human eye can, in the sense that the camera observes everything a police officer sees and records it; normally, when we see an object, we cannot recollect a complete image of it. In ref. [ 40 ], the impact of body cams was studied in terms of officer misconduct and domestic violence when the police are making an arrest; body cams are thus being worn by patrol officers. In ref. [ 41 ], the authors also mentioned how protection against wrongful police practices is provided. However, the use of body cams does not stop here, as another primary reason for having a body camera on at all times is to record what happens in front of the wearer, in the hope of capturing useful events during daily activities or important operations.

Although each of these methods is effective, one point they share in common is that they all work individually, and while the police can use any of these approaches individually or concurrently, having a machine that is able to incorporate the positive aspects of all of these technologies would be highly beneficial.

ML techniques used in crime prediction

In ref. [ 42 ], a comparative study was carried out between violent crime patterns from the Communities and Crime Unnormalized Dataset versus actual crime statistical data using the open source data mining software Waikato Environment for Knowledge Analysis (WEKA). Three algorithms, namely, linear regression, additive regression, and decision stump, were implemented using the same finite set of features on communities and actual crime datasets. Test samples were randomly selected. The linear regression algorithm could handle randomness to a certain extent in the test samples and thus proved to be the best among all three selected algorithms. The scope of the project was to prove the efficiency and accuracy of ML algorithms in predicting violent crime patterns and other applications, such as determining criminal hotspots, creating criminal profiles, and learning criminal trends.
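
As an illustration of the kind of model the linear regression algorithm fits, the following sketch computes an ordinary least squares line in pure Python; the feature and crime-rate values are invented for clarity and are not drawn from the Communities and Crime dataset.

```python
# Ordinary least squares fit for a single feature, the model family a
# linear-regression learner estimates (hypothetical data: a socio-economic
# feature vs. a violent-crime rate; values chosen to lie exactly on y = 2x + 1).

def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [2.0, 4.0, 6.0, 8.0]    # feature values
ys = [5.0, 9.0, 13.0, 17.0]  # crime rates, exactly 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)      # → 2.0 1.0
predict = lambda x: slope * x + intercept
```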

When considering WEKA [ 43 ], the integration of a new graphical interface called Knowledge Flow is possible, which can be used as a substitute for the Explorer interface. It provides a more process-oriented view of data mining, in which individual learning components (represented as Java beans) are connected graphically to show the flow of information. The authors then describe another graphical interface called the Experimenter, which, as the name suggests, is designed to compare the performance of multiple learning schemes on multiple datasets.

In ref. [ 34 ], the potential of applying a predictive analysis of crime forecasting in an urban context is studied. Three types of crime, namely, home burglary, street robbery, and battery, were aggregated into grids of 200 m × 250 m and retrospectively analyzed. Based on the crime data of the previous 3 years, an ensemble model was applied to synthesize the results of logistic regression and neural network models to obtain fortnightly and monthly predictions for the year 2014. The predictions were evaluated based on the direct hit rate, precision, and prediction index. The results of the fortnightly predictions indicate that accurate predictions can be obtained by applying a predictive analysis methodology to the data. They concluded that the results could be improved remarkably by comparing fortnightly with monthly predictions and by separating day from night.
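
The ensemble step described above can be sketched as follows: average the per-cell probabilities of two models, flag the top-k cells, and score the direct hit rate. All numbers here are hypothetical, and the simple averaging rule is an illustrative stand-in for the authors' ensemble model.

```python
# Ensemble two models by averaging their predicted probabilities per grid
# cell, flag the k highest-scoring cells, and compute the direct hit rate
# (share of cells with actual crime that were flagged). Hypothetical values.

def ensemble_hit_rate(probs_a, probs_b, actual_crime_cells, k):
    avg = [(i, (a + b) / 2) for i, (a, b) in enumerate(zip(probs_a, probs_b))]
    flagged = {i for i, _ in sorted(avg, key=lambda t: -t[1])[:k]}
    hits = len(flagged & actual_crime_cells)
    return hits / len(actual_crime_cells)

logit = [0.9, 0.2, 0.6, 0.1, 0.4]  # logistic regression output per cell
net   = [0.8, 0.1, 0.7, 0.2, 0.3]  # neural network output per cell
print(ensemble_hit_rate(logit, net, actual_crime_cells={0, 2}, k=2))  # → 1.0
```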

In ref. [ 44 ], ML-based crime prediction was investigated. Crime data from the last 15 years in Vancouver, Canada, were analyzed for prediction. This machine-learning-based crime analysis involves data collection, data classification, identification of patterns, prediction, and visualization. K-nearest neighbor (KNN) and boosted decision tree algorithms were implemented to analyze the crime dataset. In their study, a total of 560,000 crime records from between 2003 and 2018 were analyzed, and prediction accuracies of between 39% and 44% were obtained using ML algorithms. Although this accuracy is low for a prediction model, the authors concluded that it can be improved by tuning both the algorithms and the crime data for specific applications.
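
A minimal from-scratch version of the KNN classifier used in such studies is sketched below; the (x, y, hour) features and crime labels are invented, not drawn from the Vancouver dataset.

```python
# k-nearest-neighbour classification of a crime type from (x, y, hour)
# features: find the k closest past records and take a majority vote.
from collections import Counter

def knn_predict(records, query, k=3):
    """records: list of (features, crime_type); query: a features tuple."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = sorted(records, key=lambda r: dist(r[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

records = [((0.10, 0.20, 23), "theft"), ((0.20, 0.10, 22), "theft"),
           ((0.15, 0.25, 21), "theft"), ((0.90, 0.80, 14), "assault"),
           ((0.80, 0.90, 15), "assault")]
print(knn_predict(records, (0.12, 0.18, 22)))  # → theft
```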

In ref. [ 45 ], an ML approach is presented for the prediction of crime-related statistics in Philadelphia, United States. The problem was divided into three parts: determining whether a crime will occur, how many crimes will occur, and which crime is most likely. Algorithms such as logistic regression, KNN, ordinal regression, and tree methods were trained on the datasets to obtain detailed quantitative crime predictions of greater significance. The authors also presented a crime prediction map of different areas of Philadelphia for a particular time period, with different colors indicating each type of crime. Crimes ranging from assault to cyber fraud were included to match the general pattern of crime in Philadelphia over a particular interval of time. Their algorithm was able to predict whether a crime would occur with an astonishing 69% accuracy, and the number of crimes, ranging from 1 to 32, with 47% accuracy.

In ref. [ 46 ], the authors analyzed a dataset consisting of several crimes and predicted the type of crime that may occur in the near future depending on various conditions. ML and data science techniques were used for crime prediction in a crime dataset from Chicago, United States. The crime dataset consists of information such as the crime location description, type of crime, date, time, and precise location coordinates. Different combinations of models, such as KNN classification, logistic regression, decision trees, random forest, a support vector machine (SVM), and Bayesian methods were tested, and the most accurate model was used for training. The KNN classification proved to be the best with an accuracy of approximately 0.787. They also used different graphs that helped in understanding the various characteristics of the crime dataset of Chicago. The main purpose of this paper is to provide an idea of how ML can be used by law enforcement agencies to predict, detect, and solve crime at a much better rate, which results in a reduction in crime.
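
The model-comparison procedure used in such studies, i.e., train several candidates and keep the one most accurate on held-out data, can be sketched as follows. The two candidate classifiers (a majority-class baseline and a one-nearest-neighbour rule) are illustrative stand-ins for the algorithms the authors compared, and the data are hypothetical.

```python
# Model selection by holdout accuracy: fit each candidate on the training
# split, score it on the test split, and keep the best performer.
from collections import Counter

def majority_fit(train):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def one_nn_fit(train):
    """Predict the label of the nearest training point."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return lambda x: min(train, key=lambda r: dist(r[0], x))[1]

def accuracy(model, test):
    return sum(model(x) == y for x, y in test) / len(test)

train = [((0, 0), "theft"), ((1, 0), "theft"), ((9, 9), "assault")]
test  = [((0, 1), "theft"), ((8, 9), "assault")]

candidates = {"majority": majority_fit, "1-nn": one_nn_fit}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)  # 1-nn reaches 1.0, the majority baseline only 0.5
```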

In ref. [ 47 ], a graphical user interface-based prediction of crime rates using an ML approach is presented. The main focus of this study was to identify the machine-learning-based technique with the best accuracy for predicting crime rates and to explore its applicability, with particular emphasis on the dataset. Supervised ML techniques were used to analyze the dataset through data validation, data cleaning, and data visualization. The results of the different supervised ML algorithms were compared to obtain the predictions. The proposed system consists of data collection, data preprocessing, construction of a predictive model, dataset training, dataset testing, and a comparison of algorithms, as shown in Fig.  1 . The aim of this study is to prove the effectiveness and accuracy of ML algorithms for predicting violent crimes.

Fig. 1 Dataflow diagram

In ref. [ 48 ], a feature-level data fusion method based on a deep neural network (DNN) is proposed to accurately predict crime occurrence by efficiently fusing multi-modal data from several domains with environmental context information. The dataset consists of data from an online database of crime statistics from Chicago, demographic and meteorological data, and images. Crime prediction methods utilize several ML techniques, including regression analysis, kernel density estimation (KDE), and an SVM. Their approach mainly consisted of three phases: collection of data, statistical analysis of the relationship between crime incidents and the collected data, and, lastly, accurate prediction of crime occurrences. The DNN model fuses spatial features, temporal features, and environmental context. The SVM and KDE models had accuracies of 67.01% and 66.33%, respectively, whereas the proposed DNN model had an astonishing accuracy of 84.25%. The experimental results showed that the proposed DNN model was more accurate in predicting crime occurrences than the other prediction models.
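
Feature-level fusion itself is simple to illustrate: vectors from each domain are concatenated into a single input before any model sees them. The feature names and values below are hypothetical, and the sketch shows only the fusion step, not the DNN that consumes it.

```python
# Feature-level fusion: concatenate per-domain feature vectors for one
# (grid cell, time) instance into a single flat input vector.

def fuse(spatial, temporal, environment):
    """Return one flat feature vector for the downstream model."""
    return list(spatial) + list(temporal) + list(environment)

spatial     = [0.31, 0.70]              # e.g., demographic densities of the cell
temporal    = [1, 0, 0, 0, 0, 0, 0]    # e.g., one-hot day of week
environment = [12.5, 0.0]              # e.g., temperature, rainfall
fused = fuse(spatial, temporal, environment)
print(len(fused))  # → 11, the input width of the downstream network
```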

In ref. [ 49 ], the authors mainly focused on the analysis and design of ML algorithms to reduce crime rates in India. ML techniques were applied to a large dataset to determine the pattern relationships within it. The research mainly focused on predicting crimes that might occur based on the locations of previous crimes, as shown in Fig.  2 . Techniques such as Bayesian neural networks, the Levenberg-Marquardt algorithm, and a scaled algorithm were used to analyze and interpret the data, among which the scaled algorithm gave the best results in comparison with the other two techniques. A statistical analysis based on correlation, analysis of variance, and graphs showed that, with the help of the scaled algorithm, the crime rate can be reduced by 78%, implying an accuracy of 0.78.

Fig. 2 Functionality of proposed approach

In ref. [ 50 ], a system is proposed that predicts crime by analyzing a dataset containing records of previously committed crimes and their patterns. The proposed system works mainly on two ML algorithms: a decision tree and KNN. Techniques such as the random forest algorithm and Adaptive Boosting were used to increase the accuracy of the prediction model. To obtain better results for the model, the crimes were divided into frequent and rare classes. The frequent class consisted of the most frequent crimes, whereas the rare class consisted of the least frequent crimes. The proposed system was fed with criminal activity data for a 12-year period in San Francisco, United States. Using undersampling and oversampling methods along with the random forest algorithm, the accuracy was surprisingly increased to 99.16%.
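
The class-balancing step can be sketched as random oversampling: minority-class records are duplicated at random until the classes are equally represented. The records below are hypothetical, and this simple scheme is one illustrative balancing method, not necessarily the exact procedure of the cited study.

```python
# Random oversampling: duplicate minority-class rows until every class
# matches the size of the largest class (seeded for reproducibility).
import random

def oversample(records):
    """records: list of (features, label); returns a class-balanced list."""
    by_label = {}
    for features, label in records:
        by_label.setdefault(label, []).append((features, label))
    target = max(len(rows) for rows in by_label.values())
    rng = random.Random(0)
    balanced = []
    for rows in by_label.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    return balanced

data = [((1,), "frequent")] * 4 + [((9,), "rare")]
balanced = oversample(data)
print(sum(1 for _, y in balanced if y == "rare"))  # → 4, matching 'frequent'
```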

In ref. [ 51 ], a detailed study on crime classification and prediction using ML and deep learning architectures is presented. Certain ML methodologies, such as random forest, naïve Bayes, and SVMs, have been used in the literature for crime count and hotspot prediction. Deep learning is an ML approach that can overcome the limitations of some ML methodologies by extracting features from the raw data. The paper presents three fundamental deep learning configurations for crime prediction: (1) spatial and temporal patterns, (2) temporal and spatial patterns, and (3) spatial and temporal patterns in parallel. Moreover, the proposed model was compared with 10 state-of-the-art algorithms on 5 different crime prediction datasets, each with more than 10 years of crime data.

In ref. [ 52 ], a big data and ML technique for behavior analysis and crime prediction is presented. This paper discusses the tracking of information using big data, different data collection approaches, and the last phase of crime prediction using ML techniques based on data collection and analysis. A predictive analysis was conducted through ML using RapidMiner by processing historical crime patterns. The research was mainly conducted in four phases: data collection, data preparation, data analysis, and data visualization. It was concluded that big data is a suitable framework for analyzing crime data because it can provide a high throughput and fault tolerance, analyze extremely large datasets, and generate reliable results, whereas the ML based naïve Bayes algorithm can achieve better predictions using the available datasets.
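To illustrate the naïve Bayes step, the following is a minimal categorical naïve Bayes classifier with Laplace smoothing. The features and labels are hypothetical toy values; the cited work used RapidMiner rather than hand-written code:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count label frequencies and per-feature value frequencies."""
    label_counts = Counter(labels)
    # feat_counts[feature_index][label][value] -> count
    feat_counts = defaultdict(lambda: defaultdict(Counter))
    for row, lab in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[i][lab][v] += 1
    return label_counts, feat_counts

def predict_nb(model, row):
    """Pick the label maximizing P(label) * prod P(value | label)."""
    label_counts, feat_counts = model
    total = sum(label_counts.values())
    best, best_p = None, -1.0
    for lab, lc in label_counts.items():
        p = lc / total
        for i, v in enumerate(row):
            seen = feat_counts[i][lab]
            vocab = len(set().union(
                *(set(feat_counts[i][l]) for l in label_counts)))
            p *= (seen[v] + 1) / (lc + vocab)  # Laplace smoothing
        if p > best_p:
            best, best_p = lab, p
    return best

# Hypothetical records: (area_type, time_of_day) -> crime category.
X = [("commercial", "night"), ("commercial", "night"),
     ("residential", "day"), ("residential", "night")]
y = ["burglary", "burglary", "vandalism", "vandalism"]
model = train_nb(X, y)
print(predict_nb(model, ("commercial", "night")))  # burglary
```

The conditional-independence assumption is what makes naïve Bayes fast on large crime datasets, at the cost of ignoring feature interactions.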

In ref. [ 53 ], various data mining and ML technologies used in criminal investigations are demonstrated. The contribution of this study is highlighting the methodologies used in crime data analytics. Various ML methods, such as a KNN, SVM, naïve Bayes, and clustering, were used for the classification, understanding, and analysis of datasets based on predefined conditions. By understanding and analyzing the data available in the crime record, the type of crime and the hotspot of future criminal activities can be determined. The proposed model was designed to perform various operations such as feature selection, clustering, analysis, prediction, and evaluation of the given datasets. This research proves the necessity of ML techniques for predicting and analyzing criminal activities.

In ref. [ 54 ], the authors incorporated the concept of a grid-based crime prediction model and established a range of spatial-temporal features based on 84 types of geographic locations for a city in Taiwan. The concept uses ML algorithms to learn the patterns and predict crime for the following month for each grid. Among the many ML methods applied, the best model was found to be a DNN. The main contribution of this study is the use of the most recent ML techniques, including the concept of feature learning. In addition, the testing of crime displacement also showed that the proposed model design outperformed the baseline.
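A grid-based scheme like the one above starts by assigning each incident to a spatial cell and counting per cell. A minimal sketch follows, with an illustrative cell size rather than the one used in the cited Taiwan study:

```python
import math
from collections import Counter

def to_grid_cell(lat, lon, cell_deg=0.25):
    """Map a coordinate to a (row, col) grid cell by flooring.

    The 0.25-degree cell size is an illustrative assumption, not the
    resolution used in the cited study.
    """
    return math.floor(lat / cell_deg), math.floor(lon / cell_deg)

# Hypothetical incident coordinates: the first two fall in the same cell.
incidents = [(24.97, 121.51), (24.98, 121.52), (25.30, 121.90)]
counts = Counter(to_grid_cell(lat, lon) for lat, lon in incidents)
hotspot, n = counts.most_common(1)[0]
print(hotspot, n)  # (99, 486) 2
```

Per-cell counts like these become the prediction targets, and spatial-temporal features (nearby geographic locations, previous months' counts) become the inputs to the learner.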

In ref. [ 55 ], the authors considered the development of a crime prediction model using the decision tree (J48) algorithm. When applied in the context of law enforcement and intelligence analysis, J48 holds the promise of reducing crime rates and is considered the most efficient ML algorithm for the prediction of crime data in the related literature. The J48 classifier was developed using the WEKA tool kit and later trained on a preprocessed crime dataset. In the experiments, the J48 algorithm predicted the unknown category of crime data with an accuracy of 94.25287%. With such high accuracy, it is fair to count on the system for future crime predictions.

Comparative study of different forecasting methods

First, in refs. [ 56 , 57 ], the authors predicted crime using the KNN algorithm, in 2014 and 2013, respectively. Sun et al. [ 56 ] showed that higher crime prediction accuracy can be obtained by combining the grey correlation analysis based new weighted KNN (GBWKNN) filling algorithm with the KNN classification algorithm, obtaining an accuracy of approximately 67%. By contrast, Shojaee et al. [ 57 ] divided crime data into two parts, namely, critical and non-critical, and applied a simple KNN algorithm, achieving an astonishing accuracy of approximately 87%.
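The KNN classification step used in both studies can be sketched in a few lines. The feature vectors and class names below are toy values, not the cited datasets:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training
    points (Euclidean distance). A minimal sketch of the KNN step used
    in the studies above."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy feature vectors (e.g., scaled location coordinates) with labels.
train = [((0.0, 0.0), "non-critical"), ((0.1, 0.2), "non-critical"),
         ((0.9, 0.8), "critical"), ((1.0, 1.0), "critical"),
         ((0.8, 0.9), "critical")]
print(knn_predict(train, (0.85, 0.85)))  # critical
```

KNN has no training phase beyond storing the data, which explains both its simplicity and its sensitivity to how missing values are filled in, the problem the GBWKNN filling algorithm addresses.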

Second, in refs. [ 58 , 59 ], crime is predicted using a decision tree algorithm for the years 2015 and 2013, respectively. In their study, Obuandike et al. [ 58 ] used the ZeroR algorithm along with a decision tree but failed to achieve an accuracy of above 60%. In addition, Iqbal et al. [ 59 ] achieved a stunning accuracy of 84% using a decision tree algorithm. In both cases, however, a small change in the data could lead to a large change in the structure.

Third, in refs. [ 60 , 61 ], a novel crime detection technique called naïve Bayes was implemented for crime prediction and analysis. Jangra and Kalsi [ 60 ] achieved an astounding crime prediction accuracy of 87%, but could not apply their approach to datasets with a large number of features. By contrast, Wibowo and Oesman [ 61 ] achieved an accuracy of only 66% in predicting crimes and failed to consider the computational speed, robustness, and scalability.

Below, we summarize the above comparison and add other models to further illustrate this comparative study and the accuracy of some frequently used models (Table  1 ).

Performance analysis of forecasting methods

Computer vision models combined with machine and deep learning techniques

In ref. [ 66 ], the study focused on three main questions. First, the authors question whether computer vision algorithms actually work. They stated that the prediction accuracy is 90% on less complex datasets but drops to 60% on complex datasets; another concern is reducing the storage and computational costs. Second, they question whether computer vision is effective for policing. They determined that distinct activity detection is difficult, and pinpointed a key component, the Public Safety Visual Analytics Workstation, which includes many capabilities ranging from the detection and localization of objects in camera feeds to labeling actions and events associated with training data, and allowing query-based searches for specific events in videos. By doing so, they aim to view every event as a computer-vision trained, recognized, and labeled event. The third and final question is whether computer vision impacts the criminal justice system. Their answer is quite optimistic to say the least, although they wish to implement computer vision alone, which we suspect is insufficient.

In ref. [ 67 ], a framework for multi-camera video surveillance is presented. The framework is designed so efficiently that it performs all three major activities of a typical police “stake-out”, i.e., detection, representation, and recognition. The detection part mixes video streams from multiple cameras to efficiently and reliably extract motion trajectories from videos. The representation helps in concluding the raw trajectory data to construct hierarchical, invariant, and content-rich descriptions of motion events. Finally, the recognition part deals with event classification (such as robbery and possibly murder and molestation, among others) and identification of the data descriptors. For an effective recognition, they developed a sequence-alignment kernel function to perform sequence data learning to identify suspicious/possible crime events.

In ref. [ 68 ], a method is suggested for identifying people for surveillance with the help of a new feature called soft biometry, which includes a person's height, build, skin tone, shirt and trouser color, motion pattern, and trajectory history to identify and track passengers, which further helps in predicting criminal activities. The authors went further and discussed some absurd human-error incidents that resulted in perpetrators getting away. They also conducted experiments, the results of which were quite astounding. In one case, the camera caught people giving piggyback rides in more than one frame of a single-shot video. The second scenario showed the camera's ability to distinguish between airport guards and passengers.

In ref. [ 69 ], the authors discussed automated visual surveillance in a realistic scenario and used Knight, which is a multiple camera surveillance and monitoring system. Their major targets were to analyze the detection, tracking, and classification performances. The detection, tracking, and classification accuracies were 97.4%, 96.7%, and 88%, respectively. The authors also pointed to the major difficulties of illumination changes, camouflage, uninteresting moving objects, and shadows. This research again proves the reliability of computer vision models.

It is well known that perfect camera resolution cannot be achieved in practice. In ref. [ 70 ], the authors noted that security surveillance systems often produce poor-quality video, which can hinder the gathering of forensic evidence. They examined the ability of subjects to identify targeted individuals captured by a commercially available video security device. In the first experiment, subjects personally familiar with the targets performed extremely well at identifying them, whereas subjects unfamiliar with the targets performed quite poorly. Notably, police officers with experience in forensic identification performed as poorly as other subjects unfamiliar with the targets. In the second experiment, the authors asked how familiar subjects could perform so well, and used edited clips from the same video device to obscure the head, body, or gait of the targets. Hiding the body or gait produced a small decrease in recognition performance, whereas hiding the targets' heads had a dramatic effect on the subjects' ability to recognize them. This indicates that even when the quality of the video is low, the head of the target can still be seen and recognized.

In ref. [ 71 ], an automatic number plate recognition (ANPR) model is proposed, which the authors described as an "image processing innovation". The ANPR system consists of the following steps: (1) vehicle image capture, (2) preprocessing, (3) number plate extraction, (4) character segmentation, and (5) character recognition. Before the main image processing, the captured image is preprocessed, which includes converting the red, green, and blue (RGB) image into a grayscale image, removing noise, and enhancing edges and brightness. The plate is then separated by judging its size. In character segmentation, the letters and numbers are separated and viewed individually. In character recognition, optical character recognition is applied against a given database.
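The grayscale-conversion step of the ANPR preprocessing can be illustrated with the standard ITU-R BT.601 luminance weights. A production system would use an image library such as OpenCV; this is only a sketch over nested lists:

```python
def rgb_to_gray(image):
    """Convert an RGB image (nested lists of (r, g, b) tuples) to grayscale
    using ITU-R BT.601 luminance weights -- the 'gray image' step in the
    ANPR preprocessing pipeline."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in image]

# A tiny 2x2 test image: pure red, green, blue, and white pixels.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
print(rgb_to_gray(image))  # [[76, 150], [29, 255]]
```

Weighting green most heavily matches human brightness perception, which tends to preserve plate-character contrast better than a plain channel average.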

Although real-time crime forecasting is vital, it is extremely difficult to achieve in practice. No known physical models provide a reasonable approximation with dependable results for such a complex system. In ref. [ 72 ], the authors adapted a spatial temporal residual network to well-represented data to predict the distribution of crime in Los Angeles at an hourly scale in neighborhood-sized parcels. These experiments were compared with several existing approaches for prediction, demonstrating the superiority of the proposed model in terms of accuracy. They compared their deep learning approach to ARIMA, KNN, and the historical average. In addition, they presented a ternarization technique to address the concerns of resource consumption for deployment in the real world.

In ref. [ 73 ], the authors conducted a significant study on crime prediction and showed the importance of non-crime data. The major objective of this research was to take advantage of DNNs to achieve crime prediction in a fine-grained city partition. They made predictions using Chicago and Portland crime data, which were further augmented with additional datasets covering the weather, census data, and public transportation. In the paper, each city is split into grid cells (beats for Chicago and a square grid for Portland). The crime counts are broken into 10 bins, and the model predicts the most likely bin for each spatial region at a daily level. These data were trained using increasingly complex neural network structures, including variations suited to the spatial and temporal aspects of the crime prediction problem. Using their model, they were able to predict the correct bin for the overall number of crimes with an accuracy of 75.6% for Chicago and 65.3% for Portland, and they showed that the additional non-crime data were an important factor. They found that days with higher amounts of precipitation and snow slightly decreased the accuracy of the model. Regarding the impact of transportation, bus and train routes were mapped onto their beats, and the accuracy for a beat containing a train station was on average 1.2% higher than that of its neighboring beats. The accuracy for a beat with one or more train lines passing through it was 0.5% higher than that of its neighboring beats.
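The 10-bin target encoding described above can be sketched as follows. The equal-width bin edges are an assumption for illustration, as the paper does not specify its binning scheme:

```python
def make_bins(max_count, n_bins=10):
    """Build equal-width bin edges for daily crime counts, mirroring the
    10-bin target described above (the edge scheme is an assumption)."""
    width = max_count / n_bins
    return [width * i for i in range(1, n_bins + 1)]

def to_bin(count, edges):
    """Return the index of the first bin whose upper edge covers `count`."""
    for i, edge in enumerate(edges):
        if count <= edge:
            return i
    return len(edges) - 1  # clamp counts above the highest edge

edges = make_bins(100)  # edges at 10, 20, ..., 100
print(to_bin(7, edges), to_bin(55, edges), to_bin(100, edges))  # 0 5 9
```

Turning raw counts into bins converts the regression problem into a 10-class classification problem, which is what lets the authors report a single "correct bin" accuracy per city.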

In ref. [ 74 ], the authors taught a system how to monitor traffic and identify vehicles at night. They used the bright spots of the headlights and tail lights to identify an object as a vehicle: the bright light is extracted with a segmentation process and then processed by a spatial clustering and tracking procedure that locates and analyzes the spatial and temporal features of the vehicle lights. They also conducted an experiment in which, over a span of 20 min, the detection scores for cars and bikes were 98.79% and 96.84%, respectively. In another part of the test, conducted under the same conditions for 50 min, the detection scores for cars and bikes were 97.58% and 98.48%, respectively. Such performance at this early stage of development is promising, and this technology can also be used to conduct surveillance at night.

In ref. [ 75 ], an important approach for human motion analysis is discussed. The author notes that human motion analysis is difficult because appearances are extremely variable, and thus stresses that marker-less vision-based human motion analysis has the potential to provide a non-obtrusive solution for the evaluation of body poses. The author claims that this technology can have vast applications, such as surveillance, human-computer interaction, and automatic annotation, and will thus benefit from a robust solution. The paper discusses the characteristics of human motion analysis, dividing it into a modeling phase and an estimation phase. The modeling phase covers the construction of the likelihood function [including the camera model, image descriptors, human body model and matching function, and (physical) constraints], whereas the estimation phase is concerned with finding the most likely pose given the likelihood function. Model-free approaches are discussed separately.

In ref. [ 76 ], the authors provided insight into how we can achieve crime mapping using satellites. The need for manual data collection for mapping is costly and time consuming. By contrast, satellite imagery is becoming a great alternative. In this paper, they attempted to investigate the use of deep learning to predict crime rates directly from raw satellite imagery. They trained a deep convolutional neural network (CNN) on satellite images obtained from over 1 million crime-incident reports (15 years of data) collected by the Chicago Police Department. The best performing model predicted crime rates from raw satellite imagery with an astounding accuracy of 79%. To make their research more thorough, they conducted a test for reusability, and used the tested and learned Chicago models for prediction in the cities of Denver and San Francisco. Compared to maps made from years of data collected by the corresponding police departments, their maps have an accuracy of 72% and 70%, respectively. They concluded the following: (1) Visual features contained in satellite imagery can be successfully used as a proxy indicator of crime rates; (2) ConvNets are capable of learning models for crime rate prediction from satellite imagery; (3) Once deep models are used and learned, they can be reused across different cities.

In ref. [ 77 ], the authors suggested an extremely intriguing research approach, claiming that it is possible to look beyond what is visible in an image and infer meaning from it. They even conducted an interesting study on determining where a McDonald's could be located simply from photographs, and raised the possibility of predicting crime. They compared the human accuracy on this task, which was 59.6%, and the accuracy of using gradient-based features, which was 72.5%, with a chance performance (what one would obtain by answering at random) of only 50%. This indicates the presence of visual cues that are not easily spotted by an average human but can be spotted by a machine, thus enabling it to judge whether an area is safe. The authors indicated that numerous factors are often associated with our intuition, which we use to avoid certain areas because they may seem "shady" or "unsafe".

In ref. [ 78 ], the authors describe in two parts how close we are to achieving a fully automated surveillance system. The first part views the possibility of surveillance in a real-world scenario where the installation of systems and maintenance of systems are in question. The second part considers the implementation of computer vision models and algorithms for behavior modeling and event detection. They concluded that the complete scenario is under discussion, and therefore many people are conducting research and obtaining results. However, as we look closely, we can see that reliable results are possible only in certain aspects, while other areas are still in the development process, such as obtaining information on cars and their owners as well as accurately understanding the behavior of a possible suspect.

Many times during criminal activities, convicts use hand gestures to signal messages to each other. In ref. [ 79 ], research on hand gesture recognition was conducted using computer vision models. Their application architecture is of extremely high quality and easy to understand. They begin by capturing images and then try to detect a hand against the background. They apply either the CAMShift algorithm or a different procedure in which they first convert the picture into grayscale, set the image ROI (region of interest), and then find and extract the biggest contour. They then determine the convex hull of the contour to find an orientation around the bounding rectangle, and finally interpret the gesture and convert it into a meaningful command.

Crime hotspots, or areas with high crime intensity, are places where a crime is likely to occur in the future and where a criminal may be spotted. In ref. [ 80 ], the authors conducted research on forecasting crime hotspots. They used Google TensorFlow to implement their model and evaluated three recurrent neural network (RNN) architecture options in terms of accuracy, precision, and recall, where larger values indicate better performance. The gated recurrent unit (GRU) and long short-term memory (LSTM) versions obtained similar performance levels, with an accuracy of 81.5%, precision of 86%–87%, recall of 75%, and F1-score of 0.8; both performed much better than the traditional RNN version. Based on the area under the ROC curve (AUC), the GRU version was 2% better than the RNN version, and the LSTM version achieved the best AUC score, a 3% improvement over the GRU version.
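The precision, recall, and F1 metrics used in this comparison are computed from confusion counts. The counts below are hypothetical values chosen only to roughly match the reported figures:

```python
def prf(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts -- the
    metrics used to compare the RNN, GRU, and LSTM hotspot models."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical hotspot predictions: 75 true positives, 11 false positives,
# 25 false negatives (roughly matching ~87% precision / 75% recall above).
p, r, f1 = prf(75, 11, 25)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.87 0.75 0.81
```

For imbalanced hotspot data, accuracy alone is misleading (predicting "no hotspot" everywhere scores well), which is why the study reports precision, recall, F1, and AUC together.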

In ref. [ 81 ], a spatiotemporal crime network (STCN) is proposed that applies a CNN for predicting crime before it occurs. The authors evaluated the STCN using 311 felony datasets from New York from 2010 to 2015. The results were extremely impressive, with the STCN achieving an F1-score of 88% and an AUC of 92%, which confirmed that it exceeded the performance of the four baselines. Their proposed model achieved the best performance in terms of both F1 and AUC, which remained better than those of the other baselines even when the time window reached 100. This study provides evidence that the system can function well even in a metropolitan area.

Proposed idea

After finding and understanding various distinct methods used by the police for surveillance purposes, we determined the importance of each method. Each surveillance method can perform well on its own and produce satisfactory results, although for only one specific characteristic, that is, if we use a Sting Ray, it can help us only when the suspect is using a phone, which should be switched on. Thus, it is only useful when the information regarding the stake out location is correct. Based on this information, we can see how the ever-evolving technology has yet again produced a smart way to conduct surveillance. The introduction of deep learning, ML, and computer vision techniques has provided us with a new perspective on ways to conduct surveillance. This is an intelligent approach to surveillance because it tries to mimic a human approach, but it does so 24 h a day, 365 days a year, and once it has been taught how to do things it does them in the same manner repeatedly.

Although we have discussed the aspects that ML and computer vision can address, what are these aspects essentially? This brings us to the main point of our paper, i.e., our proposed idea, which is to combine the strengths of Sting Rays, body cams, facial recognition, number plate recognition, and stakeouts. The new features include core analytics, neural networks, heuristic engines, recursion processors, Bayesian networks, data acquisition, cryptographic algorithms, document processors, computational linguistics, voiceprint identification, natural language processing, gait analysis, biometric recognition, pattern mining, intel interpretation, threat detection, and threat classification. These features are completely computer dependent and hence require human interaction only for development; once developed, the system functions without human interaction and frees humans for other tasks. Let us understand the use of each function.

  • Core analytics: This includes knowledge of a variety of statistical techniques, which is used to predict future outcomes, in our case anything from behavioral cues to the looting of a store in the near future.
  • Neural networks: This is a concept consisting of a large number of algorithms that help in finding relations between data by acting similarly to a human brain, mimicking biological nerve cells and hence trying to think on its own, thus understanding or even predicting a crime scene.
  • Heuristic engines: These are engines with data regarding antiviruses, and thus knowledge about viruses, increasing the safety of our system as it identifies the type of threat and eliminates it using known antiviruses.
  • Cryptographic algorithms: Such algorithms are used in two parts. First, they privately encode the known confidential criminal data. Second, they are used to keep the newly discovered potential crime data encrypted.
  • Recursion processors: These are used to apply the functions of our machine repeatedly to make sure they continuously work and never break the surveillance of the machine.
  • Bayesian networks: These are probabilistic acyclic graphical models that can be used for a variety of purposes such as prediction, anomaly detection, diagnostics, automated insight, reasoning, time series prediction, and decision making under uncertainty.
  • Data acquisition: This might be the most important part because our system has to possess the knowledge of previous crimes and learn from them to predict future possible criminal events.
  • Document processors: These are used after the data collection, primarily for going through, organizing, analyzing, and learning from the data.
  • Computational linguistics: Using algorithms and learning models, this method attempts to give a computer the ability to understand spoken human language, which would be groundbreaking, allowing a machine not only to identify a human but also to understand what the human is saying.
  • Natural language processing: This is also used by computers to better understand human linguistics.
  • Voice print identification: This is an interesting application, which tries to distinguish one person’s voice from another, making it even more recognizable and identifiable. It identifies a target with the help of certain characteristics, such as the configuration of the speaker’s mouth and throat, which can be expressed as a mathematical formula.
  • Gait analysis: This will be used to study human motion, understanding posture while walking. It will be used to better understand the normal pace of a person and thus judge an abnormal pace.
  • Biometric identification: This is used to identify individuals by their face or, where possible, by their thumbprint stored in a few different databases.
  • Pattern mining: This is a subset of data mining and helps in observing patterns among routine activities. This technology will help us identify, for example, whether a person is seen an unusual number of times behind a pharmacy window at a particular time, allowing the machine to alert the authorities.
  • Intel interpretation: This is also used to make sense of the information gathered, and will include almost all features mentioned above, combining the results of each and making a final meaningful prediction.
  • Threat detection: A threat is detected if, during intel processing, a certain number of criteria predefined when building the system are met.
  • Threat classification: As soon as a threat is detected, it is classified, and the threat can then be categorized into criminal case levels, including burglary, murder, or a possible terrorist attack; thus, based on the time line, near or distant future threats might be predictable.
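To make the pattern-mining feature above concrete, a toy version of the "unusual number of sightings" check might look like this; the record format and threshold are illustrative assumptions:

```python
from collections import Counter

def flag_unusual(sightings, threshold=3):
    """Flag identities seen more often than `threshold` times at one
    location/time slot -- a toy version of the pattern-mining feature
    described above. The threshold and record format are illustrative."""
    counts = Counter((who, where, slot) for who, where, slot in sightings)
    return [key for key, n in counts.items() if n > threshold]

# Hypothetical sighting log: person p1 lingers at the pharmacy at 22:00.
sightings = ([("p1", "pharmacy", "22:00")] * 5
             + [("p2", "park", "14:00")] * 2)
print(flag_unusual(sightings))  # [('p1', 'pharmacy', '22:00')]
```

In a deployed system, the flagged tuples would feed the intel-interpretation and threat-detection stages rather than trigger alerts directly.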

Combining all of these features, we aim to produce software that has the capability of becoming a universal police officer, with eyes and ears everywhere. Naturally, we intend to use the CCTVs in urban areas during a preliminary round to observe the functioning of such software in a real-world scenario. The idea is to train the software on all previously recorded crimes whose footage is available (at least 5000 cases for optimum results), through supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, to help it understand what a crime actually is. Thus, it will achieve a better understanding of criminality and can answer how crimes happen, as well as why and where. We do not propose simply making a world-class model to predict crimes; we also suggest making it understand previous crimes to better judge and therefore better predict them.

We aim to use this type of technology on two fronts: first and most importantly, for predicting crimes before they happen, followed by a thorough analysis of a crime scene allowing the system to possibly identify aspects that even a human eye can miss.

The most interesting cutting-edge and evolutionary idea that we believe should be incorporated is the use of scenario simulations. After analyzing the scene and using the 17 main characteristics mentioned above, the software should run at least 50 simulations of the present scenario presented in front of it, which will be assisted by previously learned crime recordings. The simulation will help the software in asserting the threat level and then accordingly recommend a course of action or alert police officials.
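The scenario-simulation step can be caricatured as a Monte Carlo estimate: perturb the scene's risk score repeatedly and report the fraction of runs that cross a threat threshold. This is a deliberately simplified stand-in; a real system would replay learned crime dynamics rather than add random noise, and the threshold and noise level here are assumptions:

```python
import random

def simulate_threat(base_risk, n_sims=50, noise=0.2, seed=42):
    """Run repeated perturbed simulations of a scene's risk score and
    return the fraction that crosses a threat threshold (0.5 here,
    an illustrative assumption)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_sims)
               if base_risk + rng.uniform(-noise, noise) > 0.5)
    return hits / n_sims

likelihood = simulate_threat(0.55)
print(likelihood)  # fraction of the 50 simulations flagged as threats
```

The returned fraction could then drive the recommended course of action: alert officials above a high cutoff, keep gathering intel below it.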

To visualize a possible scenario where we are able to invent such software, we prepared a flow chart (Fig.  3 ) to better understand the complete process.

Fig. 3 Flowchart of our proposed model. The data are absorbed from the surroundings with the help of cameras and microphones. If the system flags an activity as suspicious, it gathers more intel, allowing the facial recognition algorithms to match against a large database such as a Social Security Number or Aadhaar card database. When it detects a threat, it also classifies it into categories such as the nature of the crime and the time span within which it is likely to take place. With all the gathered intel and the necessary details of the possible crime, it alerts the respective authority with a 60-word synopsis to give them a brief idea, allowing law enforcement agencies to take action accordingly

Although the research behind this paper has been detailed and careful, certain challenges could pose problems in the future. First, the whole system must be built correctly and completely before its implementation can proceed properly. Furthermore, the implementation itself is a significant concern, as such technologies cannot be deployed directly in the open world. The system must first be tested in a small part of a metropolitan area, and only then, with constant improvements (revisions of the first model), can its usage be scaled up. Hence, these challenges help in perfecting the model and thus gradually producing a model that can be applied to the real world. Moreover, there are a few hurdles in the technological aspects of the model, as the size of the learning data will be enormous, and processing it could take days or even weeks. Although these challenges need to be addressed, they can be overcome by a collective team of experts after due diligence, and if so, the end product will be worth the hard work and persistence.

Future scope

This paper presented the techniques and methods that can be used to predict crime and help law agencies. The scope of using different methods for crime prediction and prevention can change the scenario of law enforcement agencies. Using a combination of ML and computer vision can substantially impact the overall functionality of law enforcement agencies. In the near future, by combining ML and computer vision, along with security equipment such as surveillance cameras and spotting scopes, a machine can learn the pattern of previous crimes, understand what crime actually is, and predict future crimes accurately without human intervention. A possible automation would be to create a system that can predict and anticipate the zones of crime hotspots in a city. Law enforcement agencies can be warned and prevent crime from occurring by implementing more surveillance within the prediction zone. This complete automation can overcome the drawbacks of the current system, and law enforcement agencies can depend more on these techniques in the near future. Designing a machine to anticipate and identify patterns of such crimes will be the starting point of our future study. Although the current systems have a large impact on crime prevention, this could be the next big approach and bring about a revolutionary change in the crime rate, prediction, detection, and prevention, i.e., a “universal police officer”.

Conclusions

Predicting crimes before they happen is simple to understand, but it takes a lot more than understanding the concept to make it a reality. This paper was written to assist researchers aiming to make crime prediction a reality and to implement such advanced technology in real life. Although police do adopt new technologies such as Sting Rays and facial recognition every few years, the implementation of such software could fundamentally change the way police work, for the better. This paper outlined a framework envisaging how machine and deep learning, along with computer vision, can help create a system that is much more helpful to the police. Our proposed system comprises a collection of technologies that will perform everything from monitoring crime hotspots to recognizing people from their voice notes. The first difficulty will be actually building this system, followed by problems such as its implementation and use, among others. However, all of these problems are solvable, and we can then benefit from a security system that monitors the entire city around the clock. In other words, in a world where such a system is incorporated into a police force, far more reliable tips or leads can be obtained, and perhaps crime can be eradicated at a much faster rate.

Acknowledgements

The authors are grateful to the Department of Computer Engineering, SAL Institute of Technology and Engineering Research, and the Department of Chemical Engineering, School of Technology, Pandit Deendayal Energy University for permission to publish this research.

Abbreviations

ML: Machine learning

Authors’ contributions

All the authors made substantial contributions to this manuscript. NS, NB, and MS participated in drafting the manuscript and wrote the main manuscript; all the authors discussed the results and their implications at all stages. All authors read and approved the final manuscript.

Availability of data and materials

Not applicable.

Declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Crime Prediction Using Machine Learning and Deep Learning: A Systematic Review and Future Directions

28 Mar 2023  ·  Varun Mandalapu, Lavanya Elluri, Piyush Vyas, Nirmalya Roy

Predicting crime using machine learning and deep learning techniques has gained considerable attention from researchers in recent years, focusing on identifying patterns and trends in crime occurrences. This review paper examines over 150 articles to explore the various machine learning and deep learning algorithms applied to predict crime. The study provides access to the datasets used for crime prediction by researchers and analyzes prominent approaches applied in machine learning and deep learning algorithms to predict crime, offering insights into different trends and factors related to criminal activities. Additionally, the paper highlights potential gaps and future directions that can enhance the accuracy of crime prediction. Finally, the comprehensive overview of research discussed in this paper on crime prediction using machine learning and deep learning approaches serves as a valuable reference for researchers in this field. By gaining a deeper understanding of crime prediction techniques, law enforcement agencies can develop strategies to prevent and respond to criminal activities more effectively.



Open Access

Peer-reviewed

Research Article

Leveraging transfer learning with deep learning for crime prediction

  • Umair Muneer Butt, 
  • Sukumar Letchmunan, 
  • Fadratul Hafinaz Hassan, 
  • Tieng Wei Koh

Contributed equally to this work with: Umair Muneer Butt, Sukumar Letchmunan

* E-mail: [email protected] (UMB); [email protected] (SL)

Affiliations: School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia; Department of Computer Science, The University of Chenab, Gujrat, Pakistan; Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, Perak

Author roles (CRediT): Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Project administration, Supervision, Validation, Visualization, Writing – review & editing

  • Published: April 17, 2024
  • https://doi.org/10.1371/journal.pone.0296486


Crime remains a crucial concern regarding ensuring a safe and secure environment for the public. Numerous efforts have been made to predict crime, emphasizing the importance of employing deep learning approaches for precise predictions. However, sufficient crime data and resources for training state-of-the-art deep learning-based crime prediction systems pose a challenge. To address this issue, this study adopts the transfer learning paradigm. Moreover, this study fine-tunes state-of-the-art statistical and deep learning methods, including Simple Moving Averages (SMA), Weighted Moving Averages (WMA), Exponential Moving Averages (EMA), Long Short Term Memory (LSTM), Bi-directional Long Short Term Memory (BiLSTMs), and Convolutional Neural Networks and Long Short Term Memory (CNN-LSTM) for crime prediction. Primarily, this study proposed a BiLSTM based transfer learning architecture due to its high accuracy in predicting weekly and monthly crime trends. The transfer learning paradigm leverages the fine-tuned BiLSTM model to transfer crime knowledge from one neighbourhood to another. The proposed method is evaluated on Chicago, New York, and Lahore crime datasets. Experimental results demonstrate the superiority of transfer learning with BiLSTM, achieving low error values and reduced execution time. These prediction results can significantly enhance the efficiency of law enforcement agencies in controlling and preventing crime.

Citation: Butt UM, Letchmunan S, Hassan FH, Koh TW (2024) Leveraging transfer learning with deep learning for crime prediction. PLoS ONE 19(4): e0296486. https://doi.org/10.1371/journal.pone.0296486

Editor: Sathishkumar Veerappampalayam Easwaramoorthy, Sunway University, MALAYSIA

Received: April 5, 2023; Accepted: December 7, 2023; Published: April 17, 2024

Copyright: © 2024 Butt et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data is publicly available in the respective portals as cited in the manuscript https://www.kaggle.com/datasets/umairbutt/crime-prediction-chicago-newyork-lahore/data https://github.com/muneerumair/transfer-learning https://data.cityofnewyork.us/Public-Safety/NYC-crime/qb7u-rbmr https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2 https://www.pbs.gov.pk/publication/social-indicators-pakistan-2021 .

Funding: This work was partially supported by the Ministry of Higher Education Malaysia for Fundamental Research Grant Scheme (FRGS) with Project Code:FRGS/1/2020/TK03/USM/02/1, School of Computer Sciences and University Sains Malaysia (USM). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

Crime is one of the most intensifying and critical concerns in ensuring the safety and security of the public, and it has been one of the social issues shaping the quality of life and economic progress of communities in recent years [ 1 ]. The accessibility of modern technology has made it possible to gather detailed data on crime [ 2 , 3 ]. With today’s increasing crime rates, crime analysis is required, including strategies and procedures to reduce the chance of crime [ 4 ]. Security is a fundamental component of the sustainable development of a country, and it is the obligatory duty of a country’s security forces to regulate criminal occurrences and threats to society’s well-being. Governments spend a considerable share of their Gross Domestic Product (GDP) on law enforcement agencies [ 5 , 6 ].

The priority of law enforcement agencies has been to study crime trends and patterns in order to formulate effective policy based on historical data and create a tranquil community [ 7 , 8 ]. The vast amount of spatiotemporal data has grabbed the attention of scientists conducting further analyses of criminal investigation and crime. Crime prediction based on past data has been a topic of interest that has gained much attention, resulting in numerous proposed methods covering multiple aspects of crime [ 9 – 11 ]. Crime is frequently seen as a location-specific feature, as some areas pose a more critical threat of criminal activity than others [ 12 ]. Fig 1 shows the crime spike variance in Chicago city. It is well known that crime is not distributed evenly, uniformly, or even randomly within a given area, regardless of its size [ 13 ]. Spatiotemporal information in crime datasets, combined with Geographic Information Systems (GIS), has transformed crime prediction systems [ 14 , 15 ].

Fig 1. Crime spike variance in Chicago city.

https://doi.org/10.1371/journal.pone.0296486.g001

Recently, time series analysis strategies such as Autoregressive Integrated Moving Averages (ARIMA) and Seasonal Autoregressive Integrated Moving Averages (SARIMA) have produced promising results for crime prediction [ 16 – 19 ] compared to traditional machine learning techniques. In addition, machine and deep learning methods have been used to predict crime using spatiotemporal data [ 20 , 21 ]. Moreover, deep learning strategies like CNN and LSTM have also been investigated and shown to be beneficial compared to cutting-edge approaches [ 22 – 24 ]. A hybrid of LSTM and Exponential Smoothing (ES) gives promising results in predicting financial time series data [ 25 ]. Recent literature shows the challenges of forecasting and predicting vicious acts, primarily in denser regions of excessive crime, through various deep learning and time series analysis models [ 26 , 27 ].

However, adequate data is necessary to strengthen the crime prediction system [ 28 ]. Researchers worldwide study alternative approaches like transfer learning [ 29 , 30 ] to overcome this issue. In most deep learning models, transfer learning is employed to solve the problem of inadequate data [ 21 ]. Ye et al. [ 31 ] suggested a unique framework for time series prediction using transfer learning. The primary purpose of this research was to transmit information, or functionality, from the source to the target dataset. However, in some instances, if the targeted dataset is insufficient, the model may be required to learn features or patterns from several source data sets.

Transfer learning has recently been employed in a variety of research domains, such as forecasting traffic [ 32 ], predicting financial time series data [ 33 ], and forecasting the air quality index [ 34 ]. For places with similar demographic features, even in a different state, it is possible to use transfer learning to predict crime. Transfer learning creates a generic model that incorporates previously acquired knowledge and performs admirably in the new environment.

This study is divided into three steps. First, it examines several statistical modelling techniques used in finance, economics, and business for time series prediction, namely SMA, WMA, and EMA. Second, it investigates deep learning-based algorithms for time series prediction, namely the LSTM, BiLSTM, and CNN-LSTM algorithms. Finally, a BiLSTM-based architecture is proposed that adopts the transfer learning paradigm to overcome deep learning models’ excessive data and training requirements. This approach transfers knowledge from one neighbourhood to another, utilizing fewer resources and less time.

The rest of the study is organized as follows: Section 2 discusses state-of-the-art literature on crime prediction and forecasting and its use of transfer learning. Section 3 describes the proposed methodology. Section 4 highlights the significance of the proposed model through experimental evaluation, presenting the performance measures used alongside the results. Finally, Section 5 concludes the paper, focusing on empirical findings and future directions.

2 Related work

This section discusses the two state-of-the-art aspects involved in this research. First, it compares various statistical and deep learning techniques for crime prediction. Second, this study highlights the significance of the transfer learning paradigm in solving massive data availability and model training issues for deep learning and improving the prediction accuracy of various time series problems.

2.1 Deep learning and statistical techniques

Several attempts reported in the literature highlight the significance of statistical and deep learning approaches for prediction [ 17 , 27 , 35 ], particularly time series analysis techniques such as ARIMA, SARIMA [ 21 ], Exponential Smoothing (ES), and moving-average models [ 36 ]. Moreover, deep learning techniques such as LSTM [ 37 ], ST-ResNet [ 38 ], and deep neural networks [ 39 ] have been reported to enhance crime prediction accuracy.

Zhe Li et al. [ 18 ] studied the inherent traits of crime in Chinese cities by analyzing crime data from original case files. First, a quantitative method for case facts, based on the Chinese-language case descriptions, is devised, which can be utilized to transform the unstructured information within the case record into structured input for the model. Second, the core traits of each case are assessed based on the type of case and the time and location of occurrence. Finally, an ARIMA-based forecasting model is introduced to predict the state of crime over time. Hossain et al. [ 40 ] discovered spatiotemporal crime hotspots by examining two distinct real-world crime datasets for Los Angeles (LA) and Denver. The paper demonstrates how Naive Bayesian and Decision Tree classifiers forecast potential crime types.

Manjunatha and Annappa [ 41 ] studied higher crime rates in cities using a predictive method based on spatial analysis and autoregressive models, highlighting the hazardous crime location. For the trial of this approach, two real-world datasets were collected in the cities of New York and Chicago, and the results demonstrate good precision in spatial and temporal crime prediction in each region. Gu and Dai [ 42 ] employed time series analysis on meteorological data, health-related data, and economic and stock market indexes.

Mahajan and Mansotra [ 43 ] proposed a deep learning-based system to detect cyberbullying on different social network sites. They used transfer learning with deep learning to train a cyberbullying detection model across Twitter, Wikipedia, and FormSpring, evaluating it on these three state-of-the-art real-world datasets and obtaining F1 scores of 0.94 for Wikipedia and Twitter and 0.95 for the FormSpring dataset. Ying et al. [ 44 ] developed a CNN-based image retrieval system for crime scene investigation. The suggested technique is based on feature fusion, exploiting transfer learning to extract useful information from crime scenes; a pre-trained VGG model and PCA are utilized for fine-tuned feature extraction. The proposed algorithm was evaluated on crime scene investigation images provided by Xian University and performed comparably to state-of-the-art techniques, with 93.37% precision.

2.2 Transfer learning for crime prediction

Recently, transfer learning has been exploited with different classification approaches to address data availability challenges in the real world. Transfer learning aims to get knowledge from one source task and apply it to a target but related task [ 45 ]. The study of transfer learning is inspired by the idea that humans can logically use previously acquired knowledge to solve new problems quickly and accurately. Bappee et al. [ 46 ] explored transfer learning to predict crime in neighbouring city boroughs. Crime data from New York City from 2012 to 2013 was collected to evaluate the theoretical framework presented in this paper. They identified several research topics that need the serious attention of researchers.

Karl et al. [ 47 ] define transfer learning for deep neural networks as the process of first training a base network on a source dataset and task and then transferring the learned features (the network’s weights) to a second network to be trained on a target dataset and task. Transfer learning has been widely used in Computer Vision (CV) and Natural Language Processing (NLP). Fuzhen et al. [ 48 ] presented an inclusive survey on the significance of transfer learning and its possible usage with existing machine learning algorithms. They discussed the performance of twenty different transfer learning algorithms by evaluating three real-world datasets: Amazon reviews, Office-31, and Reuters-21578. The experiments’ outcomes showed that transfer learning models should be carefully selected for solving different real-world problems.

Huaxia et al. [ 49 ] addressed the data availability problem for certain regions using meta-learning with spatiotemporal prediction. The term “transfer learning” refers to transferring the knowledge of the model learned on sufficient data from a city to other cities where data availability is limited. They investigate the effectiveness of the meta-learning approach with the fusion of transfer learning for spatiotemporal prediction of traffic and water quality in Chicago and Boston cities. Lianbing et al. [ 50 ] suggested a hybrid intrusion detection method based on fuzzy C-mean PCA and clustering to overcome different Internet of Things-based security and privacy issues. They exploit transfer learning with the proposed intrusion detection approach for various security factors. The algorithm was assessed on the dataset named KDD-CUP99. Simulations showed a low false-positive value by improving detection accuracy.

Hu et al. [ 51 ] suggested a method for forecasting wind velocity for the new farm that involved transferring the information of numerous historical farms. The authors pre-train a two-layer deep neural network model using time-series data from multiple ancient farms. The trained model’s parameters are standard across all wind farms. Therefore, the model may be thought of as a recurring feature transformation. With a model created using multi-source datasets, the overall performance based on a single dataset is not analyzed.

3 Proposed methodology

This section discusses the proposed crime prediction methodology. Moreover, various state-of-the-art prediction methods are fine-tuned, and the best approach is used under the transfer learning paradigm. The methodology comprised several steps required to perform crime prediction using transfer learning, as shown in Fig 2 .

Fig 2. Steps of the proposed crime prediction methodology using transfer learning.

https://doi.org/10.1371/journal.pone.0296486.g002

3.1 Data collection and preprocessing

The dataset utilized in this research comprises criminal data from Chicago, New York, and Lahore. This study obtained publicly accessible datasets from their respective official crime portals, including Chicago [ 52 ], New York [ 53 ], and Lahore [ 54 ]. Common attributes are chosen in each dataset: id, date, time, crime category, crime description, spatial (longitude and latitude), and year. Table 1 shows the data specifications for each city.

Table 1. Data specifications for each city.

https://doi.org/10.1371/journal.pone.0296486.t001

The Chicago city dataset reports crimes from January 2001 to December 2020. It originally had 7255968 crime records, of which 682341 were eliminated due to incorrect formatting (duplication, missing values, etc.). The criminal records reported from January 2006 to 2019 are included in the New York crime dataset; the city had a population of 8.4 million in 2019. The NYC dataset initially contained 2158804 records, of which 45884 were eliminated during data cleaning, leaving 2112920 records for the New York experiments. The Punjab Police Department released the Lahore city crime dataset, covering 2015 to 2016. It originally had 151638 crime records, of which 18 were eliminated during data preprocessing, leaving 151611 records for the Lahore experiments.
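As a concrete illustration of this cleaning step, the sketch below (plain Python, with hypothetical field names and date format; the real portals' schemas differ) drops records with missing values, malformed dates, or duplicate ids:

```python
from datetime import datetime

REQUIRED = ("id", "date", "crime_category", "longitude", "latitude")

def clean(records):
    """Drop records that are duplicated or incorrectly formatted,
    mirroring the preprocessing described above."""
    seen, kept = set(), []
    for rec in records:
        if any(rec.get(k) in (None, "") for k in REQUIRED):
            continue  # missing value
        try:
            datetime.strptime(rec["date"], "%Y-%m-%d")
        except ValueError:
            continue  # incorrect date formatting
        if rec["id"] in seen:
            continue  # duplication
        seen.add(rec["id"])
        kept.append(rec)
    return kept

raw = [
    {"id": 1, "date": "2019-07-04", "crime_category": "THEFT", "longitude": -87.6, "latitude": 41.8},
    {"id": 1, "date": "2019-07-04", "crime_category": "THEFT", "longitude": -87.6, "latitude": 41.8},  # duplicate
    {"id": 2, "date": "04/07/2019", "crime_category": "ASSAULT", "longitude": -87.7, "latitude": 41.9},  # bad date
    {"id": 3, "date": "2019-08-01", "crime_category": "", "longitude": -87.7, "latitude": 41.9},  # missing value
]
print(len(clean(raw)))  # → 1
```

The same filter applied to the full portals would reproduce the record counts reported above only if their actual schemas matched these assumed fields.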

3.2 Exploratory data analysis (EDA)

This section discusses the comprehensive periodic insights of the Chicago, New York, and Lahore datasets. Fig 3 shows crime distribution over the years. Moreover, it shows a decreasing trend of crimes in Chicago and an increasing trend in New York and Lahore. Crimes are reported at the district, borough, and town levels.

Fig 3. Crime distribution over the years in Chicago, New York, and Lahore.

https://doi.org/10.1371/journal.pone.0296486.g003

The crime datasets reveal that environmental variables, such as harsh weather in the winter season, may reduce crime, favouring residents. It is evident from Table 2 that crime rates were lower in February than in other months in both Chicago and New York, whereas in Lahore, June has the lowest crime rate. The highest crime rates were recorded in July in Chicago and in January in New York and Lahore. Most crimes were committed on Friday in Chicago and New York, and on Thursday in Lahore.

Table 2. EDA-based comparison of the Chicago, New York, and Lahore crime data, including the top 5 crimes in each region.

https://doi.org/10.1371/journal.pone.0296486.t002

Table 2 draws a comparison, based on EDA, among the Chicago, New York, and Lahore crime data, and outlines the top 5 crimes in all regions.

3.3 Crime prediction

This section discusses the six state-of-the-art prediction algorithms used for crime prediction. The six most promising statistical and deep learning methods (SMA, WMA, EMA, LSTM, LSTM-CNN, and BiLSTM) are fine-tuned to attain precise predictions on Chicago, New York, and Lahore crime datasets. The following sections explain the methods chosen for experimental investigation.

3.3.1 Statistical methods.

In this section, this study discusses the characteristics of each statistical prediction method and highlights its significance for crime prediction.

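The moving-average formulas, which the page renders as images, follow the standard textbook definitions; with $y_t$ the crime count at time $t$ and a window of $n$ periods:

```latex
\mathrm{SMA}_t = \frac{1}{n}\sum_{i=0}^{n-1} y_{t-i},
\qquad
\mathrm{WMA}_t = \frac{\sum_{i=0}^{n-1} (n-i)\, y_{t-i}}{\sum_{i=0}^{n-1} (n-i)},
\qquad
\mathrm{EMA}_t = \alpha\, y_t + (1-\alpha)\,\mathrm{EMA}_{t-1}
```

SMA weights all $n$ recent observations equally, WMA weights recent observations more heavily (linearly decreasing weights are shown here), and EMA discounts older observations exponentially via the smoothing factor $\alpha \in (0, 1]$.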

3.3.2 Deep learning methods.

This section discusses state-of-the-art deep learning methods and compares their performance for crime prediction. In addition, the study utilizes this method later on for transfer learning. The following sections discuss state-of-the-art techniques.

3.3.2.1 Long Short Term Memory (LSTM). LSTM-based techniques are an RNN extension that can effectively deal with the vanishing gradient problem. This memory extension can remember information over an extended period, allowing the network to read, write, and erase information from its memory. The memory of an LSTM is known as a “gated” cell, where the term gate refers to the ability to retain or ignore memory information [ 56 ].

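The LSTM gate equations, rendered as images on the page, take the standard form; here $\sigma$ is the logistic sigmoid, $\odot$ element-wise multiplication, $x_t$ the input, $h_t$ the hidden state, and $c_t$ the cell state:

```latex
f_t = \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) \quad \text{(forget gate)} \\
i_t = \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) \quad \text{(input gate)} \\
\tilde{c}_t = \tanh\!\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t = \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) \quad \text{(output gate)} \\
h_t = o_t \odot \tanh(c_t)
```

The forget, input, and output gates implement exactly the retain/ignore behaviour of the “gated” cell described above.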

3.3.2.2 Bi-directional Long Short Term Memory (BiLSTM). Predictions are sometimes more accurate when both past and subsequent information is used. BiLSTMs address this with a two-way recurrent neural network; Fig 4 shows its construction. The network contains six distinct weights, w1–w6, reused at each time stamp: w1 and w3 carry the input to the forward and backward hidden layers; w2 and w5 carry information from the forward and backward hidden layers to themselves (the recurrent connections); and w4 and w6 carry information from the forward and backward hidden layers to the output layer. The unrolled graph is acyclic because there is no information flow between the forward and backward hidden layers [ 56 ].

Fig 4. Structure of the bi-directional recurrent neural network.

https://doi.org/10.1371/journal.pone.0296486.g004

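The BiLSTM output, also rendered as an image on the page, combines the hidden states of the two directions; in the standard formulation:

```latex
\overrightarrow{h}_t = \mathrm{LSTM}\!\left(x_t, \overrightarrow{h}_{t-1}\right),
\qquad
\overleftarrow{h}_t = \mathrm{LSTM}\!\left(x_t, \overleftarrow{h}_{t+1}\right),
\qquad
y_t = g\!\left(W_y\,[\overrightarrow{h}_t ; \overleftarrow{h}_t] + b_y\right)
```

where $[\,\cdot\,;\,\cdot\,]$ denotes concatenation: the forward pass reads the series left to right, the backward pass right to left, and the output layer sees both.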

Algorithm 1 outlines the training and fine-tuning process for BiLSTM. Input to the model are three crime datasets and returned MSE, MAD, and MAE values.

Algorithm 1 Crime Prediction Using BiLSTM

Require: Crime Datasets of Chicago, Newyork, and Lahore

Ensure: MAE, MAD, And MSE of predicted data

 ▷ Data Splitting (70% Training and 30% Testing)

1: size ⇐ Length ( data ) * 0.70

2: train ⇐ data [0… size ]

3: test ⇐ data [ size …Length(size)]

4: set random.seed (8)    ▷Set random seed to 8 to achieve optimal results

 ▷Fit a BiLSTM model to training data

5: X ⇐ train

6: Y ⇐ train − X

7: model = Sequential ()

8: model . add ( Bidirectional ( LSTM ( neurons , stateful = True )))

9: model.compile ( loss =’ MSE ’, optimizer =’ adam ’)

10: for i in range ( epoch ) do

11:   model . fit ( X , y , epochs = 1, shuffle = False )

12:   model . reset _ states ()

13: end for         ▷Make Predictions

14: Y predicted ⇐ model . predict ( Y )

15: output ⇐ Return ( MSE , MAD , MAE )

3.3.2.3 Hybrid of Convolution Neural Network and Long Short Term Memory (CNN-LSTM). A Convolutional Neural Network (CNN) is an artificial neural network designed for 2D image input, but it can also automatically extract and learn features from 1D sequence data, such as univariate time series. A CNN is therefore frequently employed as the front end of a hybrid model with an LSTM backend for predictions [ 57 ]: the CNN interprets subsequences of the input, which are supplied together as a sequence for the LSTM model to comprehend. This hybrid model is referred to as CNN-LSTM. The first step is to divide the input sequences into subsequences that the CNN can handle. For example, the univariate time series data may be divided into input/output samples with four time steps as input and one as output. Each sample can then be split into two subsequences of two time steps each; the CNN interprets each subsequence and offers a time series of interpretations for the LSTM to process as input [ 58 ].
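The subsequence preparation described above can be sketched in plain Python (the helper names and the toy series are illustrative, not from the paper):

```python
def split_sequence(series, n_in=4, n_out=1):
    """Split a univariate series into (input, output) samples:
    n_in past steps predict the next n_out step(s)."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return X, y

def to_subsequences(X, n_sub=2):
    """Reshape each sample into n_sub subsequences: the chunks the CNN
    front end convolves over before the LSTM reads the sequence of
    per-chunk interpretations."""
    step = len(X[0]) // n_sub
    return [[x[j * step:(j + 1) * step] for j in range(n_sub)] for x in X]

series = [10, 20, 30, 40, 50, 60, 70]
X, y = split_sequence(series)
print(X[0], y[0])             # → [10, 20, 30, 40] [50]
print(to_subsequences(X)[0])  # → [[10, 20], [30, 40]]
```

In a real CNN-LSTM these nested lists would become an array of shape [samples, subsequences, steps, features] before being fed to the network.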

4 Experimental evaluation

This section evaluates six state-of-the-art statistical and deep learning approaches for crime prediction. Primarily, three state-of-the-art evaluation measures, Mean Absolute Error (MAE), Median Absolute Deviation (MAD), and Mean Squared Error (MSE), are used [ 27 , 59 ]. Furthermore, three spatiotemporal crime datasets from Chicago [ 52 ], New York [ 53 ], and Lahore [ 54 ] are used for monthly and weekly crime predictions. The following sections discuss the predictions in different experimental settings.
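For reference, the three evaluation measures can be computed as follows. This sketch reads MAD as the paper expands it, Median Absolute Deviation of the prediction errors (some forecasting texts use Mean Absolute Deviation instead); the sample values are illustrative only:

```python
import statistics

def mae(actual, pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def mse(actual, pred):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

def mad(actual, pred):
    """Median Absolute Deviation of the errors."""
    errors = [a - p for a, p in zip(actual, pred)]
    med = statistics.median(errors)
    return statistics.median([abs(e - med) for e in errors])

actual = [120, 135, 128, 150]   # hypothetical weekly crime counts
pred   = [118, 140, 130, 144]
print(mae(actual, pred))  # → 3.75
print(mse(actual, pred))  # → 17.25
print(mad(actual, pred))  # → 3.5
```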

4.1 Chicago district-wise prediction for a month and week using statistical and deep learning method

This section illustrates the experimental analysis results to compare the prediction performance of statistical and deep learning methods on Chicago crime data. The crime data for Chicago City is divided into 22 districts. This study split the data into training (70%) and testing (30%) sets to perform monthly and weekly crime predictions for each district of Chicago.

Fig 5 shows the district-wise prediction for a month and a week, with time (in months) on the X-axis and crime counts on the Y-axis. The blue curve is the actual measurement, and the red curve is the prediction of the statistical and deep learning methods. It is evident from Fig 5 and Table 3 that the BiLSTM model performs efficiently with a low error rate compared to other methods; in particular, it achieves a lower error rate in weekly predictions than in monthly predictions.

Fig 5. District-wise monthly and weekly crime prediction for Chicago.

https://doi.org/10.1371/journal.pone.0296486.g005

Table 3. MAE, MAD, and MSE of the statistical and deep learning methods on the Chicago crime data.

https://doi.org/10.1371/journal.pone.0296486.t003

4.2 New York borough-wise crime prediction for a month and week using statistical and deep learning methods

This section discusses the evaluation results on crime data for New York City (NYC), which is divided into five boroughs (Bronx, Brooklyn, Manhattan, Queens, and Staten Island). This study split the data randomly into training (70%) and testing (30%) sets to perform monthly and weekly crime predictions for each borough. Fig 6 shows each borough’s monthly and weekly prediction graphs, with time (in months) on the X-axis and crime counts on the Y-axis. The blue curve is the actual measurement, and the red curve is the prediction of the statistical and deep learning methods. Table 4 shows the comparative analysis of deep learning and statistical techniques based on MAE, MAD, and MSE values.

Fig 6. Borough-wise monthly and weekly crime prediction for New York.

https://doi.org/10.1371/journal.pone.0296486.g006

Table 4. Comparative analysis of deep learning and statistical techniques for New York based on MAE, MAD, and MSE values.

https://doi.org/10.1371/journal.pone.0296486.t004

The fine-tuned BiLSTM outperformed the other deep learning and statistical approaches for monthly and weekly predictions. Moreover, weekly predictions are more accurate, with a lower error rate, than monthly predictions.

4.3 Lahore town-wise crime prediction for a month and week using statistical and deep learning methods

This section presents the novel spatiotemporal crime dataset of Lahore City, Pakistan. The Lahore dataset is divided into 10 towns: Iqbal Town, Samanabad Town, Gulberg Town, Data Ganj Bakhsh Town, Nishtar Town, Ravi Town, Shalamar Town, Cantonment, Wahga Town, and Aziz Bhatti Town. This study split the data into training (70%) and testing (30%) sets to perform monthly crime predictions. Fig 7 shows the monthly prediction graphs for each town, with time (in months) on the X-axis and crime counts on the Y-axis. The blue curve is the actual measurement, and the red curve is the prediction of the statistical and deep learning methods.

Fig 7. Town-wise monthly crime prediction for Lahore.

https://doi.org/10.1371/journal.pone.0296486.g007

Fig 7 and Table 5 show the comparative analysis based on MAE, MSE, and MAD for monthly and weekly crime prediction. Again, BiLSTM achieved a low error rate in all towns compared to the other statistical and deep learning models. In addition, the fine-tuned BiLSTM achieved a lower error rate in weekly predictions than in monthly predictions. Therefore, BiLSTM was adopted with the transfer learning paradigm for knowledge transfer.

Table 5. Comparative MAE, MSE, and MAD for monthly and weekly crime prediction in Lahore.

https://doi.org/10.1371/journal.pone.0296486.t005

4.4 Transfer learning using BiLSTM

This section utilizes the BiLSTM under the transfer learning paradigm due to its superior performance in weekly and monthly crime predictions. This study used three spatiotemporal crime datasets, from Chicago, New York, and Lahore, for experimental evaluation. The proposed BiLSTM-based transfer learning methodology is shown in Fig 8 and comprises several steps. First, data is acquired from the source. Second, the dataset is divided into training and testing subsets. Third, the BiLSTM model is utilized and fine-tuned for crime prediction. Fourth, evaluation is performed using the three state-of-the-art crime datasets and evaluation measures. Lastly, transfer learning is achieved between boroughs, districts, and towns.
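The knowledge-transfer idea behind these steps can be illustrated with a deliberately tiny model: pre-train on a data-rich source series, reuse the learned parameter as the starting point for the data-poor target, and fine-tune briefly. The one-parameter linear model and the synthetic data below are illustrative only, not the paper's BiLSTM:

```python
def fit(xs, ys, w=0.0, epochs=100, lr=0.01):
    """Fit y = w * x by gradient descent on mean squared error."""
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

def target_error(w):
    """Squared error on the target neighbourhood's data."""
    return sum((w * x - y) ** 2 for x, y in zip(tgt_x, tgt_y))

# Source neighbourhood: ample history; target: scarce history with a
# similar underlying trend (slope near 3). Both series are synthetic.
src_x = list(range(1, 11))
src_y = [3 * x for x in src_x]
tgt_x, tgt_y = [1, 2], [3.2, 6.1]

w_src = fit(src_x, src_y, epochs=200)              # pre-train on source
w_transfer = fit(tgt_x, tgt_y, w=w_src, epochs=5)  # transfer + brief fine-tune
w_scratch = fit(tgt_x, tgt_y, epochs=5)            # same small budget, no transfer

print(target_error(w_transfer) < target_error(w_scratch))  # → True
```

With the same five-epoch budget, starting from the source-trained parameter lands far closer to the target trend than training from scratch, which is the resource and time saving the transfer step exploits.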

Fig 8. Proposed BiLSTM-based transfer learning methodology.

https://doi.org/10.1371/journal.pone.0296486.g008

Furthermore, the architecture of the BiLSTM as a pre-trained model in transfer learning is described in detail in Fig 9. When timestamp data is fed into the model, embedding layers extract contextual information from the crime data. The BiLSTM layer extracts the sequential patterns and semantic (past and future) information from the source data. Dropout layers are included to avoid model overfitting. Finally, dense layers are applied to extract key features, and a linear activation function at the output reduces the error between the actual and predicted values. The following sections discuss the knowledge transfer process in different experimental setups.

Fig 9. Architecture of the BiLSTM pre-trained model used for transfer learning.

https://doi.org/10.1371/journal.pone.0296486.g009

4.4.1 Optimization of the proposed approach.

To fine-tune the design parameters, this study used three cutting-edge optimizers from the Keras library, namely SGD [ 60 ], Rprop [ 61 ], and Nadam [ 62 ]. Once features have been extracted, these optimizers are used to build the model with two dense layers for the final prediction. Rprop achieves the best test accuracy of the three, as demonstrated in Fig 10. The Rprop optimizer is particularly useful for recurrent neural networks [ 63 , 64 ]: based on the gradient’s sign, it modifies the learning rate for each parameter, achieving significant performance compared to the others.
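The sign-based update rule of Rprop can be sketched as follows for a single parameter (a minimal Rprop− variant without weight backtracking; the increase/decrease factors 1.2 and 0.5 are the commonly used defaults, and the quadratic objective is a synthetic illustration):

```python
def rprop_step(w, grad, prev_grad, step, inc=1.2, dec=0.5,
               step_min=1e-6, step_max=1.0):
    """One Rprop update: the step size adapts to the *sign* of the
    gradient, not its magnitude."""
    if grad * prev_grad > 0:      # same sign as before: accelerate
        step = min(step * inc, step_max)
    elif grad * prev_grad < 0:    # sign flipped: overshot, back off
        step = max(step * dec, step_min)
    if grad > 0:
        w -= step
    elif grad < 0:
        w += step
    return w, step

# Minimise f(w) = (w - 4)^2, whose gradient is 2 * (w - 4).
w, step, prev = 0.0, 0.1, 0.0
for _ in range(60):
    g = 2 * (w - 4)
    w, step = rprop_step(w, g, prev, step)
    prev = g
print(round(w, 1))  # → 4.0
```

Because only the gradient's sign is used, the update is insensitive to the vanishing or exploding gradient magnitudes that recurrent networks can produce, which is one reason Rprop-style rules suit them.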


https://doi.org/10.1371/journal.pone.0296486.g010

4.4.2 Knowledge transfer from a district, borough, and town to another.

This section focuses on knowledge transfer from one district of Chicago to another, one borough of New York to another, and one town of Lahore to another: specifically, from District 1 to District 2 of Chicago, from Brooklyn to Manhattan, and from Iqbal Town to Nishtar Town. The data were split into 70% training and 30% testing to perform monthly crime prediction. The pre-trained model retains its parameters and is run for a small number of epochs on the target neighbourhood before testing. Fig 11 shows the monthly prediction graphs for a district, a borough, and a town using transfer learning; the X-axis shows the number of months, and the Y-axis the crime counts.
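The transfer step can be illustrated with a deliberately simple model: estimate a linear trend on the source neighbourhood, keep its slope (the transferred knowledge), and re-fit only the intercept on a few target months. All numbers are invented for illustration and do not come from the paper's datasets:

```python
# Source district: two years of monthly counts following y = 2x + 10.
src = [2 * m + 10 for m in range(24)]
# "Train" on the source: estimate the trend (slope) from its data.
slope = (src[-1] - src[0]) / (len(src) - 1)

# Target district: only 3 months observed; same trend, higher base level.
tgt = [25, 27, 29]
# Transfer: keep the source slope, fine-tune only the intercept
# on the few target points.
intercept = sum(y - slope * m for m, y in enumerate(tgt)) / len(tgt)

print(slope, intercept)          # 2.0 25.0
print(slope * 10 + intercept)    # forecast for target month 10 -> 45.0
```

This mirrors, in miniature, why transfer learning needs less target data and less time: most parameters come pre-fitted from the source, and only a small residual is learned on the target.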


https://doi.org/10.1371/journal.pone.0296486.g011

The blue curve shows the actual values and the red curve the predictions. Table 6 reports MAE, MAD, MSE, and execution time with and without transfer learning using BiLSTM. Monthly crime prediction with the pre-trained model has a nearly constant execution time when tested on the target dataset. The error values and execution times for monthly crime prediction with transfer learning are significantly lower than those without it.
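The error measures in Table 6 can be computed as follows. The section does not spell out its definition of MAD, so reading it as the median absolute error is an assumption of this sketch; the toy actual/predicted values are likewise illustrative:

```python
from statistics import median

def mae(actual, pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def mad(actual, pred):
    """Median absolute error (one common reading of "MAD")."""
    return median(abs(a - p) for a, p in zip(actual, pred))

def mse(actual, pred):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

actual = [100, 120, 90, 110]
pred   = [ 98, 125, 95, 110]
print(mae(actual, pred), mad(actual, pred), mse(actual, pred))  # 3.0 3.5 13.5
```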


https://doi.org/10.1371/journal.pone.0296486.t006

Fig 12 compares the execution time with and without transfer learning for monthly crime prediction in a district, borough, and town, as a bar chart with the dataset name on the X-axis and execution time on the Y-axis.


https://doi.org/10.1371/journal.pone.0296486.g012

5 Conclusion

Crime poses a severe threat to human civilization, security, and long-term growth, and must be managed. Law enforcement agencies therefore frequently demand computational forecasting and prediction-based systems that improve crime analytics, strengthen city safety and security, and help prevent criminal activity. Furthermore, the availability of spatiotemporal crime data is vital to predicting crime. Several studies have highlighted the significance of deep learning methods in enhancing crime prediction accuracy; however, sufficient data and resources are required to train a deep learning system. This study therefore combined the transfer learning paradigm with deep learning techniques to predict crime, employing and fine-tuning diverse statistical, machine learning, and deep learning algorithms on crime datasets from Chicago, New York, and Lahore. Moreover, exploratory data analysis (EDA) highlighted hourly, daily, weekly, monthly, and yearly crime trends.

Among the various algorithms, BiLSTM achieves the best performance, with low MAE, MAD, and MSE. Therefore, this study exploited BiLSTM under transfer learning to predict monthly crime trends. The proposed approach achieves comparable performance, a low error rate, and reduced execution time when transferring knowledge from District 1 to District 2, from Brooklyn to Manhattan, and from Iqbal Town to Nishtar Town. The approach is significant for law enforcement agencies seeking to predict crime with fewer resources and less time. In the future, the authors aim to fine-tune the knowledge-transfer mechanism at the parameter level to avoid negative transfer. Moreover, cross-region knowledge transfer (e.g., Lahore to Chicago, or New York to Lahore) will be studied, as the EDA shows similar crime-trend characteristics across regions.

  • 1. Bogomolov A, Lepri B, Staiano J, Oliver N, Pianesi F, Pentland A. Once upon a crime: towards crime prediction from demographics and mobile data. In: Proceedings of the 16th international conference on multimodal interaction; 2014. p. 427–434.
  • 2. Thongtae P, Srisuk S. An analysis of data mining applications in crime domain. In: 2008 IEEE 8th International Conference on Computer and Information Technology Workshops. IEEE; 2008. p. 122–126.
  • 3. Sathyadevan S, Devan M, Gangadharan SS. Crime analysis and prediction using data mining. In: 2014 First international conference on networks & soft computing (ICNSC2014). IEEE; 2014. p. 406–412.
  • 4. Grover V, Adderley R, Bramer M. Review of current crime prediction techniques. In: International Conference on Innovative Techniques and Applications of Artificial Intelligence. Springer; 2006. p. 233–237.
  • 9. Yi F, Yu Z, Zhuang F, Zhang X, Xiong H. An Integrated Model for Crime Prediction Using Temporal and Spatial Factors. In: 2018 IEEE International Conference on Data Mining (ICDM). IEEE; 2018. p. 1386–1391.
  • 10. Buczak AL, Gifford CM. Fuzzy association rule mining for community crime pattern discovery. In: ACM SIGKDD Workshop on Intelligence and Security Informatics; 2010. p. 1121–1131.
  • 11. Tayebi MA, Ester M, Glässer U, Brantingham PL. Crimetracer: Activity space based crime location prediction. In: Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE Press; 2014. p. 472–480.
  • 12. Wang S, Cao J, Yu PS. Deep learning for spatio-temporal data mining: A survey. arXiv preprint arXiv:190604928. 2019;.
  • 13. Wang S, Yuan K. Spatiotemporal Analysis and Prediction of Crime Events in Atlanta Using Deep Learning. In: 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC). IEEE; 2019. p. 346–350.
  • 16. Catlett C, Cesario E, Talia D, Vinci A. A data-driven approach for spatio-temporal crime predictions in smart cities. In: 2018 IEEE International Conference on Smart Computing (SMARTCOMP). IEEE; 2018. p. 17–24.
  • 18. Li Z, Zhang T, Yuan Z, Wu Z, Du Z. Spatio-Temporal Pattern Analysis and Prediction for Urban Crime. In: 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD). IEEE; 2018. p. 177–182.
  • 19. Shamsuddin NHM, Ali NA, Alwee R. An overview on crime prediction methods. In: 2017 6th ICT International Student Project Conference (ICT-ISPC). IEEE; 2017. p. 1–5.
  • 23. Zhao X, Tang J. Exploring transfer learning for crime prediction. In: 2017 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE; 2017. p. 1158–1159.
  • 28. Wu DD, Olson DL. Financial risk forecast using machine learning and sentiment analysis. In: Enterprise Risk Management in Finance. Springer; 2015. p. 32–48.
  • 33. He QQ, Pang PCI, Si YW. Transfer learning for financial time series forecasting. In: Pacific Rim International Conference on Artificial Intelligence. Springer; 2019. p. 24–36.
  • 35. Baqir A, ul Rehman S, Malik S, ul Mustafa F, Ahmad U. Evaluating the Performance of Hierarchical Clustering algorithms to Detect Spatio-Temporal Crime Hot-Spots. In: 2020 3rd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET). IEEE; 2020. p. 1–5.
  • 39. Nair SN, Gopi E. Deep Learning Techniques for Crime Hotspot Detection. In: Optimization in Machine Learning and Applications. Springer; 2020. p. 13–29.
  • 40. Hossain S, Abtahee A, Kashem I, Hoque MM, Sarker IH. Crime prediction using spatio-temporal data. In: International Conference on Computing Science, Communication and Security. Springer; 2020. p. 277–289.
  • 44. Liu Y, Peng Y, Li D, Fan J, Li Y. Crime scene investigation image retrieval with fusion CNN features based on transfer learning. In: Proceedings of the 3rd International Conference on Multimedia and Image Processing; 2018. p. 68–72.
  • 45. Fawaz HI, Forestier G, Weber J, Idoumghar L, Muller PA. Transfer learning for time series classification. In: 2018 IEEE international conference on big data (Big Data). IEEE; 2018. p. 1367–1376.
  • 49. Yao H, Liu Y, Wei Y, Tang X, Li Z. Learning from multiple cities: A meta-learning approach for spatial-temporal prediction. In: The World Wide Web Conference; 2019. p. 2181–2191.
  • 52. Chicago Police Department. CLEAR (Citizen Law Enforcement Analysis and Reporting) System; 2020. https://opendata.com.pk/ .
  • 53. NYCOpenData. NYPD Complaint Data Historic | NYC Open Data; 2019. https://data.cityofnewyork.us/Public-Safety/NYPD-Complaint-Data-Historic/qgea-i56i/data .
  • 54. PakCrimeData. Pakistan Bureau of Statistics; 2020.
  • 56. Siami-Namini S, Tavakoli N, Namin AS. The performance of LSTM and BiLSTM in forecasting time series. In: 2019 IEEE International Conference on Big Data (Big Data). IEEE; 2019. p. 3285–3292.
  • 60. Sutskever I, Martens J, Dahl G, Hinton G. On the importance of initialization and momentum in deep learning. In: International conference on machine learning. PMLR; 2013. p. 1139–1147.
  • 61. Braun H, Riedmiller M. RPROP: a fast adaptive learning algorithm. In: Proceedings of the International Symposium on Computer and Information Science VII; 1992. p. 342–346.
  • 62. Dozat T. Incorporating nesterov momentum into adam. Advances in Neural Information Processing Systems. 2016;.
  • 63. Joseph FJJ. Iot based aquarium water quality monitoring and predictive analytics using parameter optimized stack lstm. In: 2022 6th International Conference on Information Technology (InCIT). IEEE; 2022. p. 342–346.

Computer Science > Machine Learning (arXiv:2303.16310)

Crime prediction using machine learning and deep learning: a systematic review and future directions

Abstract: Predicting crime using machine learning and deep learning techniques has gained considerable attention from researchers in recent years, focusing on identifying patterns and trends in crime occurrences. This review paper examines over 150 articles to explore the various machine learning and deep learning algorithms applied to predict crime. The study provides access to the datasets used for crime prediction by researchers and analyzes prominent approaches applied in machine learning and deep learning algorithms to predict crime, offering insights into different trends and factors related to criminal activities. Additionally, the paper highlights potential gaps and future directions that can enhance the accuracy of crime prediction. Finally, the comprehensive overview of research discussed in this paper on crime prediction using machine learning and deep learning approaches serves as a valuable reference for researchers in this field. By gaining a deeper understanding of crime prediction techniques, law enforcement agencies can develop strategies to prevent and respond to criminal activities more effectively.



A study on predicting crime rates through machine learning and data mining using text

Crime is a threat to any nation’s security administration and jurisdiction. Crime analysis therefore becomes increasingly important, because it assigns time and place based on collected spatial and temporal data. However, older techniques, such as paperwork, investigative judges, and statistical analysis, are not efficient enough to predict the accurate time and location of a crime. When machine learning and data mining methods were deployed, crime analysis and prediction accuracy increased dramatically. In this study, various types of criminal analysis and prediction using several machine learning and data mining techniques are surveyed, based on the accuracy reported in previous work, with the aim of producing a concise review of the use of these algorithms in crime prediction. This review is expected to help present such techniques to crime researchers, and to support future research that develops these techniques for crime analysis, by presenting crime definitions, prediction-system challenges, and classifications, together with a comparative study. The literature shows that supervised learning approaches were used in more crime-prediction studies than other approaches, and that Logistic Regression is the most powerful method for predicting crime.

1 Introduction

Violations of the law pose a danger to the administration of justice and should be curtailed. Computational crime prediction and forecasting can help improve the safety of metropolitan areas. The inability of humans to process large amounts of complicated data makes it difficult to make early and accurate predictions about criminal activity. Accurately predicting crime rates, types, and hot locations from historical patterns raises both computational problems and opportunities. Despite extensive research efforts, there is still a need for stronger prediction algorithms that direct police patrols toward criminal events [ 1 ].

Crime analysis is a methodological approach used to identify crime spots, and it is not an easy task. Before machine learning, Geographical Information Systems (GIS) were the main tool used for temporal and spatial data; as of 2020, GIS-based crime-spot techniques, which mainly depend on crime type, were used to help reduce crime rates [ 2 ].

Crime rate prediction can be defined as building a system that finds future crime patterns and helps law enforcement solve crimes, thereby reducing crime rates in the real world. Crime forecasting, meanwhile, refers to predicting crimes further into the future, up to years ahead, to increase crime prevention; this can be achieved by using time series approaches to find future crime trends in time series data.
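A minimal time series forecaster of the kind this definition implies can be sketched as a simple moving average over the last k observations. This is purely illustrative, not a method from the surveyed papers, and the monthly theft counts are invented:

```python
def moving_average_forecast(series, k=3, horizon=2):
    """Forecast `horizon` future values, each as the mean of the
    last `k` known (or previously forecast) values."""
    history = list(series)
    out = []
    for _ in range(horizon):
        nxt = sum(history[-k:]) / k
        history.append(nxt)
        out.append(nxt)
    return out

monthly_thefts = [30, 34, 32, 36, 38, 40]
forecasts = moving_average_forecast(monthly_thefts)
print(forecasts)
```

Real forecasting methods (ARIMA, LSTM, and the like) replace the simple average with a learned model, but the structure is the same: roll the model forward, feeding its own predictions back in.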

In general, crime in data mining can be predicted using different methods: statistical methods [ 3 , 4 , 5 ], visualization methods [ 6 , 7 , 8 ], and unsupervised and supervised learning techniques [ 9 , 10 , 11 ]. Visualization methods give a visual explanation of the connection between the geographical view and other crime data, and include geographic profiling [ 12 ], GIS-based crime mapping [ 13 ], crime prediction [ 14 , 15 , 16 ], and asymmetric mapping [ 17 ]. Among unsupervised learning techniques, clustering methods were very popular for connecting statistical methods with crime data: they were used for criminal behavior analysis [ 18 , 19 ], crime pattern recognition, criminal association analysis, and incident pattern recognition, to extract groups or patterns with the same features from crime data [ 20 ].
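Clustering for hot-spot extraction can be sketched with a tiny k-means (Lloyd's algorithm) on made-up incident coordinates; the fixed initial centroids are an assumption that keeps the run deterministic:

```python
def kmeans(points, centroids, iters=10):
    """Plain Lloyd's algorithm: assign each point to the nearest
    centroid, then move each centroid to the mean of its points."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            groups[d.index(min(d))].append(p)
        centroids = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Two obvious "hot spots" of incident coordinates.
incidents = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
hotspots = kmeans(incidents, centroids=[(0, 0), (5, 5)])
print(hotspots)
```

Each returned centroid marks the centre of one cluster of incidents, which is exactly what a hot-spot map plots.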

The development of machine learning algorithms then helped crime-analysis researchers to investigate crime: using preprocessing and clustering techniques to extract crime locations from raw data [ 21 ]; applying supervised and unsupervised models that analyze these data and discover patterns based on the time and location of crimes to produce precise predictions [ 22 ]; and investigating why crimes occur in certain areas by applying machine learning algorithms to historical data collected over past years in the same area [ 23 ].

Nowadays, the development of classification algorithms, especially machine learning algorithms, helps to enhance crime prediction [ 24 ]. Researchers have therefore tried to connect crime with time, depending on various factors, to help resolve crimes and prevent their recurrence. In 2018, a Fourier series was proposed as an analytic technique to build a flexible mathematical model of time-periodic effects. This technique demonstrated the accuracy and usefulness of analytical approaches in connecting the time factor with crime prediction: it effectively captured the relation between crime and time, but not for all types of crime [ 25 ].

We can say that machine learning algorithms are widely used in the crime prediction discipline, as are data mining techniques; each has its own performance characteristics and can deliver excellent results.

Our work has been set up so that interested parties become familiar with the previous studies and the accuracies they achieved, presented in tabular format. The main contribution of this study is presenting machine learning and data mining applications in predicting crime: classifying the studies according to the type of technique, providing a brief overview of each methodology that has been used to mine crime data, and enlisting some challenges faced by the developers of such systems.

The limitations of the state-of-the-art works are: poor coverage of large geographic areas; a lack of generality, because using the same system on two different crime datasets leads to accuracy percentages with a large difference; the scarcity of works that predict criminal actions; and, last but not least, the difficulty researchers face with online crime datasets, in which information may be missing or duplicated.

The rest of this review is organized as follows: Section 2 explains the research methodology of the survey; Section 3 discusses crime definitions and descriptions in detail; Section 4 discusses the challenges of prediction systems; Section 5 describes the public datasets; Section 6 covers the related work; Section 7 introduces the classification of prediction systems; Sections 8.1 and 8.2 present a comparison of previous works; and Section 9 concludes with a discussion.

2 Research methodology

The methodology of this review involves two stages: first, collecting and analyzing the relevant research works on crime prediction with machine learning and data mining; and second, building the classification tables in Section 8 and presenting a study of the performance of the various algorithms, the accuracies achieved, and a comparison between them.

In choosing relevant research works, Master’s and Doctoral dissertations and unpublished papers were ignored. The search keywords were “crime prediction with machine learning and data mining” and “violent crime prediction”, the publication window was 2001 to 2022, and the abstract of every article was read to determine whether it was relevant.

3 Crime definition and description

Generally, crimes are classified into three groups based on their severity, punishment, and seriousness: infractions, misdemeanors, and felonies. Infractions are minimal crimes such as tailgating, parking overtime, and speeding; felonies are the most severe crimes, followed by misdemeanors, which are less severe [ 26 ]. In addition, crimes can be classified by the time at which they occur, such as the day, week, month, or season, in order to find connections between these types of crime and then predict them in the future using machine learning and data mining algorithms. This can be done by using a dataset of earlier crimes collected in a certain area to forecast future ones.

Depending on the severity of the crime, crimes are thus classified into three types: felonies, misdemeanors, and infractions (or wobblers) [ 22 ], as listed and defined in Table 1 .

Crime description

In addition, a crime could fall into other categories, such as victim, victimless, and violent crimes, and there are further categorizations; however, throughout this study, only the classification in Table 1 is considered.

4 Prediction systems challenges

Researchers and governmental security agents face several problems when predicting the location and time of crimes, and in choosing an effective method to do so. There are also problems faced by computer science researchers who use machine learning, data mining, and spatial–temporal data. In 2012 and 2016, the near-repeat-victimization and repeat-victimization methods were implemented to predict crimes in houses, streets, and regions; these methods state that if a crime happens in a block, the probability of other crimes in the same area increases significantly [ 27 , 28 ].

The main challenges are:

  • The huge amount of data requires a large amount of storage.

  • Crime-related data are usually in different formats, such as text, images, graphs, audio, relational data, unstructured data, and semi-structured data [ 29 ], so transforming these data into an understandable format is also a challenge.

  • In machine learning, giving the correct label (e.g., prediction or output) to an instance (e.g., context or input) is a challenge.

  • Choosing an appropriate data mining algorithm that gives better results than the algorithms already in use.

  • Environmental and surrounding factors, such as weak law enforcement and the weather, affect the likelihood of crime and can cause crime prediction algorithms to make grave errors. Any crime forecast must take surrounding and environmental changes into consideration to avoid such errors and achieve high prediction accuracy.

5 Crime datasets

Crime-related data are gathered from a variety of sources, including police reports, social media, news, and criminal records, and gathering data in this amount is difficult [ 30 ]. Datasets are available online in many countries around the world or can be gathered from police departments. During our survey, we noticed that the Chicago crime dataset is the most frequently used in crime prediction systems, which can be attributed to the large population and high crime rate of that area [ 31 ].
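Working with such datasets typically starts with parsing incident records and aggregating them over time. A stdlib-only sketch on an inline sample follows; the column names mimic, but are not copied from, the Chicago open data schema:

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the spirit of an open crime dataset.
sample = """date,primary_type
2020-01-04,THEFT
2020-01-19,BATTERY
2020-02-02,THEFT
2020-02-15,THEFT
"""

monthly = Counter()
for row in csv.DictReader(io.StringIO(sample)):
    month = row["date"][:7]              # keep the YYYY-MM prefix
    monthly[month] += 1

print(dict(monthly))  # {'2020-01': 2, '2020-02': 2}
```

The same pattern scales to a real download: replace the `StringIO` with an open file and the counts become the monthly crime series that the prediction models consume.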

6 Related work

With today’s huge data sizes, the evolution of machine learning and data mining techniques allows us to process this raw data and extract results in better ways. Techniques for detecting criminal activity, and machine learning and data mining more generally, have recently been applied in policing to achieve crime reduction.

Correct choices of the parameters for these techniques can help law enforcers analyze and find the relationships between crimes, as well as patterns and trends in criminal activities, leading to more efficient handling of those activities [ 5 ].

In this section, the related works are discussed and analyzed. These works vary widely: some address crime analysis for prediction, while others apply artificial intelligence, specifically machine learning or data mining (subfields of artificial intelligence), to crime data in order to predict and forecast violent crimes, in some cases based on spatial and temporal data.

During our survey, we noticed five surveys or overviews related to crime prediction and machine learning or data mining.

The earliest, in 2011, surveyed methods used to extract patterns from spatial information, collectively termed spatial data mining (SDM) algorithms, such as co-location mining, spatial clustering, spatial hot spots, spatial outliers, spatial auto-regression, conditional auto-regression, and geographically weighted regression. It concluded that these SDM algorithms are effective and usable in the real world, but found a need for more methods to validate the hypotheses they produce [ 32 ].

In 2015, research in the field of crime prediction with data mining and machine learning was reviewed. This work examined a variety of crime-related variables and found that information thought to influence the crime rate, such as age, alcohol, hot spots, media, and certain policies, had no effect on crime rate prediction [ 33 ]. The discussion was thorough, but the conclusion fell short.

In 2016, another survey reviewed over 100 applications of data mining in crime. The authors produced a concise review in the form of a brief table containing each technique, the specific software used, the relevant study area, and the expected use and function. They suggested enhancing the benefits, improvements, and usability of data mining techniques in crime analysis by introducing more training and education in these techniques [ 34 ].

In 2019, a systematic review of crime prediction and data mining studies published between 2004 and 2018 classified the research works by the data mining technique used and the challenges addressed. Covering 40 papers, it identified a common gap: as datasets grow, overall system performance noticeably decreases [ 35 ].

Finally, in 2020, another systematic review analyzed 32 papers on spatial crime forecasting published from 2000 to 2018. In addition to a survey table containing each paper’s spatial and temporal scope, crime data, and forecasting details, it summarized the top four proposed methods and the best proposed and baseline methods applied across the 32 selected papers. The study discussed the strengths, weaknesses, threats, and opportunities of the selected papers, concluding that the contiguity of algorithms should not be ignored in the future [ 2 ].

7 Classification of prediction systems

Prediction systems can be classified:

  • according to approach: machine learning and data mining;

  • according to prediction type: spatial and temporal;

  • according to dataset: image prediction and data prediction.

8 Comparison study: Crime prediction vs classification approaches

In this section, Tables 2 and 3 list the literature survey of machine learning and data mining algorithms applied to different datasets for cities around the world, and a comparison is made between machine learning and data mining methods within a broader crime prediction setting. These tables enlist each selected paper with the information that will help other researchers determine which categories of crime prediction techniques are most powerful, thereby achieving the purpose of this survey. They contain the references, the machine learning or data mining algorithm, the dataset source, and the accuracy of each algorithm on the particular dataset used for a given city. The following subsections discuss the crime prediction research works that followed the machine learning and the data mining approaches, separately.

Literature survey on crime prediction research works with machine learning

Literature survey on crime prediction research works with data mining

8.1 Machine learning and crime prediction

Crime prediction has been studied widely owing to its relevance to society, and these studies employ machine learning algorithms to address crime prediction and forecasting. Machine learning algorithms have been used successfully to predict spatial crime information. In 2006, a Support Vector Machine (SVM) was applied to predict the locations of crimes in Columbus, Ohio, US; it used both random and clustering approaches to train and test the dataset, predict hot-spot areas, and improve its effectiveness [ 37 ]. Such algorithms are also used to study the correlation between crime occurrence and crime motives. In 2013, a Logistic Regression (LR) algorithm was implemented to model the relationship between burglary and several other factors: time of day, day of the week, barriers, connectors, and repeat victimization; however, this model failed for large geographic areas [ 38 ].
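A from-scratch logistic regression of the kind cited above can be trained by gradient descent on toy burglary features; the two features (a late-night flag and a repeat-victimization flag) and their labels are invented for illustration, not drawn from ref. [ 38 ]:

```python
import math

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Per-sample gradient descent on the logistic loss; returns weights and bias."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for i, x in enumerate(X):
            z = sum(wj * xj for wj, xj in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid probability
            err = p - y[i]                   # gradient of the loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]
            b -= lr * err
    return w, b

# Features: (late-night flag, repeat-victimization flag) -> burglary?
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 1, 1]                      # separable toy data: late night drives risk
w, b = train_logreg(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0 for x in X]
print(preds)  # [0, 0, 1, 1]
```

On separable data like this, the decision boundary recovers the labels exactly; real burglary data is far noisier, which is where the reported accuracy differences between datasets come from.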

In 2015, crime in the southern US states was predicted using the Random Forest (RF) method, after applying the SmoteR algorithm to detect the more dangerous crimes. The work was optimized using R software, with density and population used as real-valued features [ 39 ].

Subsequently, an auto-regressive approach was implemented to forecast the number of simultaneous crimes and predict them in urban areas [ 40 ]. In 2017, a Naive Bayes (NB) algorithm was proposed to predict crime incidents based on historical data showing the same crime happening in the same place. The NB model was compared with a Decision Tree (DT) algorithm to test the performance of the proposed method, and NB was found to outperform DT, even given the computational complexity of DT [ 41 ].
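A minimal categorical Naive Bayes with Laplace smoothing illustrates the idea of predicting an incident type from discrete context features; the feature values (day part and location type) and incident labels are toy assumptions, and the vocabulary-size estimate in the smoothing term is a simplification:

```python
from collections import Counter, defaultdict

def nb_train(X, y):
    """Count class priors and per-feature value counts per class."""
    prior = Counter(y)
    like = defaultdict(Counter)           # (feature_idx, class) -> value counts
    for x, c in zip(X, y):
        for i, v in enumerate(x):
            like[(i, c)][v] += 1
    return prior, like

def nb_predict(x, prior, like, alpha=1.0):
    """Pick the class maximizing prior * smoothed likelihoods."""
    best, best_score = None, float("-inf")
    total = sum(prior.values())
    for c, n in prior.items():
        score = n / total
        for i, v in enumerate(x):
            counts = like[(i, c)]
            # Laplace smoothing so unseen values never zero out the class.
            score *= (counts[v] + alpha) / (n + alpha * (len(counts) + 1))
        if score > best_score:
            best, best_score = c, score
    return best

X = [("night", "alley"), ("night", "alley"), ("day", "shop"), ("day", "shop")]
y = ["assault", "assault", "theft", "theft"]
prior, like = nb_train(X, y)
print(nb_predict(("night", "alley"), prior, like))  # assault
```

The smoothing term is what lets the model score contexts it has never seen, which matters for sparse crime records.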

In 2020, many research works were presented. One fused three methods, Long Short-Term Memory (LSTM), a residual neural network, and a graph convolutional network, into a mechanism able to extract spatial–temporal features for predicting crime in Chicago, using root mean square error and mean absolute error as performance criteria [ 42 ]. Another study proposed a crime network for spatiotemporal data using a Convolutional Neural Network (CNN) to automatically predict the time and place of crimes [ 43 ]. A further study [ 44 ] integrated a Recurrent Neural Network (RNN) with LSTM to design a time series crime prediction system for Addis Ababa. Finally, one more study [ 45 ] predicted the severity level of crime in Boston using machine learning algorithms such as SVM, NB, LR, and DT.

According to ref. [ 31 ], the Deep Neural Network (DNN) outperformed the SVM, whereas according to ref. [ 46 ] the opposite occurred, with the SVM outperforming the DNN. This can be explained by one key difference: the first study worked on an image dataset and the second on a text dataset. It is therefore recommended to use a DNN for image crime datasets.

According to refs [ 1 , 47 ], using the same system on two different crime datasets leads to very different accuracy percentages, which shows that the dataset used severely affects the results. This challenges these algorithms to prove their efficiency, and hence their accuracy, in predicting crime.

After surveying the machine learning approaches, the highest crime prediction accuracies obtained are shown in Table 2 .

According to ref. [ 48 ], the LR algorithm achieves the highest accuracy among the different machine learning algorithms.

Observing the crime prediction results of the studies adopting the RF method, the highest accuracy achieved was 59.8%, which is poor compared with the other methods.

Comparing the standard deviations of the crime prediction accuracies for each algorithm, the SVM outperforms the LR, achieving 71.9% accuracy; indeed, its standard deviation result surpasses those of all the other machine learning algorithms.
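The mean-versus-spread comparison used above can be reproduced with the standard library. The accuracy lists below are hypothetical (chosen so the SVM mean echoes the 71.9% figure quoted in the text), not the surveyed values.

```python
import statistics

# Hypothetical per-study accuracies (%) for two algorithms.
accuracies = {
    "SVM": [68.0, 72.0, 75.7],
    "LR":  [51.0, 90.0, 95.0],
}
for name, accs in accuracies.items():
    mean = statistics.mean(accs)
    spread = statistics.stdev(accs)  # sample standard deviation
    print(f"{name}: mean={mean:.1f}% stdev={spread:.1f}")
```

A lower standard deviation means the algorithm's accuracy is more consistent across datasets, even when another algorithm posts a higher single best result.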

According to the previous studies, the highest crime prediction accuracy was obtained with the machine learning logistic regression method: 95% for Baltimore city in ref. [ 48 ]. Furthermore, XGBoost and logistic regression achieved high accuracies of 94% and 90%, respectively [ 1 ]. However, the same algorithm can perform differently on two different datasets, which confirms that the dataset has a large influence on crime prediction results.

8.2 Data mining and crime prediction

In 2011, spatial data mining techniques were proposed to extract patterns from spatial and temporal data, mining the data geospatially using domain knowledge. Also in 2011, crimes were predicted in Portland: data mining methods were applied to a spatial and temporal dataset collected in Portland to forecast whether residential burglary would occur. The NB, SVM, DT, and K-Nearest Neighbor algorithms were applied, and the comparison of their results showed the power of neural networks in complex systems [ 57 ]. However, the usefulness of pattern extraction was limited by the complexity of the relationships among spatial data [ 32 ]. In 2016, high accuracy was achieved using various DT algorithms to extract knowledge from a dataset of 1994 instances with 128 attributes, and the algorithms were then compared. The data were trained and tested, and scatter plots were used to illustrate the crime areas and the severity of each area based on previous data [ 58 ]. Also in 2016, data mining algorithms were developed to classify crimes by type. Crime was characterized by time, based on factors such as vacations coinciding with the academic year for colleges and schools, and the classifier was used to predict the severity risk of crime areas in Denver city between 2010 and 2015 [ 59 ]. In 2020, the Autoregressive Integrated Moving Average (ARIMA) technique was implemented to predict time series data, which were then visualized on a data mining platform; this demonstrated that an autoregressive model can work on historical newsfeed data to predict future crimes [ 60 ].
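As a toy illustration of the autoregressive idea behind ARIMA, the sketch below fits an AR(1) slope by least squares to a made-up monthly crime-count series and produces a one-step forecast. Real ARIMA modelling also handles differencing and moving-average terms, which are omitted here.

```python
def fit_ar1(series):
    """Least-squares slope phi for the model x[t] ~ phi * x[t-1]."""
    pairs = list(zip(series[:-1], series[1:]))
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den

def forecast(series, phi):
    """One-step-ahead forecast from the last observation."""
    return phi * series[-1]

counts = [40, 44, 42, 46, 45, 48]  # hypothetical monthly crime counts
phi = fit_ar1(counts)
print(round(forecast(counts, phi), 1))  # -> 49.7
```

With an upward-trending series, the fitted slope exceeds 1 and the forecast continues the trend; the full ARIMA machinery exists precisely to handle such non-stationarity more rigorously.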

Table 3 compares many algorithms applied to the crime prediction challenge, such as DT, NB, and RF, used either individually or in combination on a given type of dataset and city. This again challenges each algorithm to confirm its effectiveness, and hence its accuracy, in predicting crime.

The highest crime prediction accuracies obtained, based on the survey of the data mining methods, are shown in Table 3.

According to ref. [ 61 ], the K-means algorithm achieves the highest accuracy among the different data mining algorithms.

Comparing the standard deviations of the crime prediction accuracies for each algorithm, the DT algorithm outperforms the NB algorithm, achieving 18.9%.

Among the machine learning algorithms in the previous studies, DT and neural network methods recorded 94% accuracy on different datasets in refs [ 48 , 51 ]. The K-means data mining algorithm achieved 93.62% (cluster one) and 93.99% (cluster two) for crimes in India [ 61 ].

9 Conclusion

Crime prediction has become a hot research area because of its benefits to the security of any society or nation. Compared with other approaches, many studies have adopted supervised learning for crime prediction.

It can be concluded that data mining methods achieved the highest individual crime prediction accuracies, surpassing machine learning methods. Nevertheless, on average, machine learning outperforms data mining in crime prediction, and when the standard deviations of the prediction accuracies of the two families are compared, the machine learning algorithms again perform better than the data mining algorithms.

Finally, the comparison of machine learning and data mining algorithms for crime prediction systems gives certain indications: the selection of an algorithm may depend on the dataset type (image, text, video, or voice), and some algorithms perform well on average but can fail on other datasets. Crime prediction methods adopting deep learning algorithms were not covered in this survey owing to time limitations.

Acknowledgements

The authors want to thank all the researchers in the field of crime prediction whose works have been cited in this survey.

Funding information : This project is funded by the authors only.

Conflict of interest : The authors declare that there is no conflict of interest regarding the publication of this article.

[1] Safat W, Asghar S, Gillani SA. Empirical analysis for crime prediction and forecasting using machine learning and deep learning techniques. IEEE Access. 2021;9:70080–94. 10.1109/ACCESS.2021.3078117

[2] Kounadi O, Ristea A, Araujo A, Leitner M. A systematic review on spatial crime forecasting. Crime Sci. 2020;9(1):1–22. 10.1186/s40163-020-00116-7

[3] Tollenaar N, van der Heijden PGM. Which method predicts recidivism best?: a comparison of statistical, machine learning and data mining predictive models. J R Stat Soc Ser A. 2013;176(2):565–84. 10.1111/j.1467-985X.2012.01056.x

[4] Enzmann D, Podana Z. Official crime statistics and survey data: Comparing trends of youth violence between 2000 and 2006 in cities of the Czech Republic, Germany, Poland, Russia, and Slovenia. Eur J Crim Policy Res. 2010;16(3):191–205. 10.1007/s10610-010-9121-z

[5] Holst A, Bjurling B. A Bayesian parametric statistical anomaly detection method for finding trends and patterns in criminal behavior. In 2013 European Intelligence and Security Informatics Conference. IEEE; 2013. 10.1109/EISIC.2013.19

[6] Brunsdon C, Corcoran J, Higgs G. Visualising space and time in crime patterns: A comparison of methods. Comput Environ Urban Syst. 2007;31(1):52–75. 10.1016/j.compenvurbsys.2005.07.009

[7] Vural MS, Gök M, Yetgin Z. Generating incident-level artificial data using GIS based crime simulation. In 2013 International Conference on Electronics, Computer and Computation (ICECCO). IEEE; 2013. 10.1109/ICECCO.2013.6718273

[8] Xiang Y, Chau M, Atabakhsh H, Chen H. Visualizing criminal relationships: Comparison of a hyperbolic tree and a hierarchical list. Decis Support Syst. 2005;41(1):69–83. 10.1016/j.dss.2004.02.006

[9] Jain LC, Seera M, Lim CP, Balasubramaniam P. A review of online learning in supervised neural networks. Neural Comput Appl. 2014;25(3):491–509. 10.1007/s00521-013-1534-4

[10] EL Aissaoui O, EL Madani EY, Oughdir L, EL Allioui Y. Combining supervised and unsupervised machine learning algorithms to predict the learners’ learning styles. Procedia Comput Sci. 2019;148:87–96. 10.1016/j.procs.2019.01.012

[11] Mackenzie DM. CDUL: class directed unsupervised learning. Neural Comput Appl. 1995;3(1):2–16. 10.1007/BF01414172

[12] Rossmo DK, Laverty I, Moore B. Geographic profiling for serial crime investigation, in Geographic Information Systems and Crime Analysis. IGI Glob. 2005;6:102–17. 10.4018/978-1-59140-453-8.ch006

[13] Ristea A, Leitner M. Urban crime mapping and analysis using GIS. ISPRS Int J Geo-Information. 2020;9(9):511. 10.3390/ijgi9090511

[14] Corcoran JJ, Wilson ID, Ware JA. Predicting the geo-temporal variations of crime and disorder. Int J Forecast. 2003;19(4):623–34. 10.1016/S0169-2070(03)00095-5

[15] Sangani A, Sampat C, Pinjarkar V. Crime prediction and analysis. In 2nd International Conference on Advances in Science & Technology (ICAST); 2019. 10.2139/ssrn.3367712

[16] Wang Y, Peng X, Bian J. Computer crime forensics based on improved decision tree algorithm. J Netw. 2014;9(4):1005. 10.4304/jnw.9.4.1005-1011

[17] Khan M, Ali A, Alharbi Y. Predicting and preventing crime: A crime prediction model using San Francisco crime data by classification techniques. New Jersey: Wiley/Hindawi. Vol. 2022, No. 4830411, 2022. p. 13. 10.1155/2022/4830411

[18] Ewart BW, Oatley GC. Applying the concept of revictimization: using burglars’ behaviour to predict houses at risk of future victimization. Int J Police Sci Manag. 2003;5(2):69–84. 10.1350/ijps.5.2.69.14324

[19] Box GEP, Jenkins GM, Reinsel GC, Ljung GM. Time series analysis: forecasting and control. John Wiley & Sons; 2015.

[20] Jangra M, Kalsi S. Naïve Bayes approach for the crime prediction in data mining. Int J Comput Appl. 2019;178(4):33–7. 10.5120/ijca2019918907

[21] Khairuddin A, Alwee R, Haron H. A comparative analysis of artificial intelligence techniques in forecasting violent crime rate. In IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020.

[22] Sardana D, Marwaha S, Bhatnagar R. Supervised and unsupervised machine learning methodologies for crime pattern analysis. Int J Artif Intell Appl. 2021;12(1):43–58. 10.5121/ijaia.2021.12106

[23] Sivanagaleela B, Rajesh S. Crime analysis and prediction using fuzzy c-means algorithm. In 3rd International Conference on Trends in Electronics and Informatics (ICOEI). IEEE; 2019. 10.1109/ICOEI.2019.8862691

[24] Liu X, Sun H, Han S, Han S, Niu S, Qin W, et al. A data mining research on office building energy pattern based on time-series energy consumption data. Energy Build. 2022;259:111888. 10.1016/j.enbuild.2022.111888

[25] Borowik G, Wawrzyniak ZM, Cichosz P. Time series analysis for crime forecasting. In 2018 26th International Conference on Systems Engineering (ICSEng). IEEE; 2018. 10.1109/ICSENG.2018.8638179

[26] Goel A, Singh B. White collar crimes: A study in the context of classification, causation and preventive measures. Contemp Soc Sci. 2018;27:84–92. 10.29070/27/58311

[27] Grove L, Farrell G. Once bitten, twice shy: Repeat victimization and its prevention. Oxf Handb Crime Prev. 2012;404–19. 10.1093/oxfordhb/9780195398823.013.0020

[28] Chainey SP, da Silva BFA. Examining the extent of repeat and near repeat victimisation of domestic burglaries in Belo Horizonte, Brazil. Crime Sci. 2016;5(1):1–10. 10.1186/s40163-016-0049-6

[29] Peter P, Ickjai L. Crime analysis through spatial areal aggregated density patterns. Geoinformatica. 2011;15(1):49–74. 10.1007/s10707-010-0116-1

[30] Jonas P, Paul E, Stijn V, Marc MVH, Guido D. Gaining insight in domestic violence with emergent self organizing maps. Expert Syst Appl. 2009;36(9):11864–74. 10.1016/j.eswa.2009.04.027

[31] Kang HW, Kang HB. Prediction of crime occurrence from multi-modal data using deep learning. PLoS One. 2017;12(4):e0176244. 10.1371/journal.pone.0176244

[32] Shekhar S, Evans MR, Kang JM, Mohan P. Identifying patterns in spatial information: A survey of methods. Wiley Interdiscip Reviews Data Min Knowl Discovery. 2011;1(3):193–214. 10.1002/widm.25

[33] Mookiah L, Eberle W, Siraj A. Survey of crime analysis and prediction. In The Twenty-Eighth International Flairs Conference; 2015.

[34] Hassani H, Huang X, Silva ES, Ghodsi M. A review of data mining applications in crime. Stat Anal Data Mining: ASA Data Sci J. 2016;9(3):139–54. 10.1002/sam.11312

[35] Falade A, Azeta A, Oni A, Odun-ayo I. Systematic literature review of crime prediction and data mining. Rev Comput Eng Stud. 2019;6(3):56–63. 10.18280/rces.060302

[36] Okeke OC. An overview of crime analysis, prevention and prediction using data mining based on real time and location data. Int J Recent Technol Eng. 2022;5(10):99–103. 10.33564/IJEAST.2021.v05i10.015

[37] Kianmehr K, Alhajj R. Crime hot-spots prediction using support vector machine. In IEEE International Conference on Computer Systems and Applications. IEEE Computer Society; 2006. 10.1109/AICCSA.2006.205203

[38] Antolos D, Liu D, Ludu A, Vincenzi D. Burglary crime analysis using logistic regression. In International Conference on Human Interface and the Management of Information. Berlin: Springer; 2013. 10.1007/978-3-642-39226-9_60

[39] Cavadas B, Branco P, Pereira S. Crime prediction using regression and resources optimization. In Portuguese Conference on Artificial Intelligence. Springer; 2015. 10.1007/978-3-319-23485-4_51

[40] Cesario E, Catlett C, Talia D. Forecasting crimes using autoregressive models. In 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech). IEEE; 2016. 10.1109/DASC-PICom-DataCom-CyberSciTec.2016.138

[41] Vural MS, Gök M. Criminal prediction using Naive Bayes theory. Neural Comput Appl. 2017;28(9):2581–92. 10.1007/s00521-016-2205-z

[42] Hou M, Hu X, Cai J, Han X, Yuan S. An integrated graph model for spatial–temporal urban crime prediction based on attention mechanism. ISPRS Int J Geo-Information. 2022;11(5):294. 10.3390/ijgi11050294

[43] Ilhan F, Tekin SF, Aksoy B. Spatio-temporal crime prediction with temporally hierarchical convolutional neural networks. In 2020 28th Signal Processing and Communications Applications Conference (SIU). IEEE; 2020. 10.1109/SIU49456.2020.9302169

[44] Meskela TE, Afework YK, Ayele NA, Teferi MW, Mengist TB. Designing time series crime prediction model using long short-term memory recurrent neural network. Int J Recent Technol Eng. 2020;9:402–5. 10.35940/ijrte.D5025.119420

[45] Hussain FS, Aljuboori AF. A crime data analysis of prediction based on classification approaches. Baghdad Sci J. 2022;4:1073–7.

[46] Lin YL, Yen MF, Yu LC. Grid-based crime prediction using geographical features. ISPRS Int J Geo-Information. 2018;7(8):298. 10.3390/ijgi7080298

[47] Stec A, Klabjan D. Forecasting crime with deep learning. arXiv preprint arXiv; 2018. p. 01486.

[48] Kim KS, Jeong YH. A study on crime prediction to reduce crime rate based on artificial intelligence. Korea J Artif Intell. 2021;9(1):15–20.

[49] Bogomolov A, Lepri B, Staiano J, Oliver N, Pianesi F, Pentland A. Once upon a crime: towards crime prediction from demographics and mobile data. In Proceedings of the 16th International Conference on Multimodal Interaction; 2014. 10.1145/2663204.2663254

[50] Zhuang Y, Almeida M, Morabito M, Ding W. Crime hot spot forecasting: A recurrent model with spatial and temporal information. In 2017 IEEE International Conference on Big Knowledge (ICBK). IEEE; 2017. 10.1109/ICBK.2017.3

[51] Ivan N, Ahishakiye E, Omulo EO, Taremwa D. Crime prediction using decision tree (J48) classification algorithm. Int J Comput Inf Technol. 2017;6:188–95.

[52] El Bour HA, Ounacer S, Elghomari Y, Jihal H, Azzouazi M. A crime prediction model based on spatial and temporal data. Periodicals Eng Nat Sci. 2018;6(2):360–4. 10.21533/pen.v6i2.524

[53] Kim S, Joshi P, Kalsi PS, Taheri P. Crime analysis through machine learning. In IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). IEEE; 2018. 10.1109/IEMCON.2018.8614828

[54] Bharati A, RA KS. Crime prediction and analysis using machine learning. Int Res J Eng Technol (IRJET). 2018;5:1037–42.

[55] Mahmud S, Nuha M, Sattar A. Crime rate prediction using machine learning and data mining, in Soft Computing Techniques and Applications. Singapore: Springer; 2021. p. 59–69. 10.1007/978-981-15-7394-1_5

[56] Almuhanna AA, Alrehili MM, Alsubhi SH, Syed L. Prediction of crime in neighbourhoods of New York City using spatial data analysis. In 2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA). IEEE; 2021. 10.1109/CAIDA51941.2021.9425120

[57] Yu CH, Ward MW, Morabito M, Ding W. Crime forecasting using data mining techniques. In 2011 IEEE 11th International Conference on Data Mining Workshops. IEEE; 2011. 10.1109/ICDMW.2011.56

[58] Sharma H, Kumar S. A survey on decision tree algorithms of classification in data mining. Int J Sci Res. 2016;5(4):2094–7. 10.21275/v5i4.NOV162954

[59] Gupta A, Mohammad A, Syed A, Halgamuge MN. A comparative study of classification algorithms using data mining: crime and accidents in Denver City the USA. Education. 2016;7(7):374–81. 10.14569/IJACSA.2016.070753

[60] Boppuru PR, Ramesha K. Spatio-temporal crime analysis using KDE and ARIMA models in the Indian context. Int J Digital Crime Forensics. 2020;12(4):1–19. 10.4018/IJDCF.2020100101

[61] Tayal D, Jain A, Arora S, Agarwal S, Gupta T, Tyagi N. Crime detection and criminal identification in India using data mining techniques. AI Soc. 2015;30(1):117–27. 10.1007/s00146-014-0539-6

[62] Iqbal R, Murad MAA, Mustapha A, Panahy PHS, Khanahmadliravi N. An experimental study of classification algorithms for crime prediction. Indian J Sci Technol. 2013;6(3):4219–25. 10.17485/ijst/2013/v6i3.6

[63] Almanie T, Mirza R, Lor E. Crime prediction based on crime types and using spatial and temporal criminal hotspots. arXiv preprint arXiv; 2015. p. 02050. 10.5121/ijdkp.2015.5401

[64] Yerpude P, Gudur V. Predictive modelling of crime dataset using data mining. Int J Data Min Knowl Manag Process. 2020;7:83–99.

[65] Prathap BR, Krishna A, Balachandran K. Crime analysis and forecasting on spatio temporal news feed data: An Indian context, in Artificial Intelligence and Blockchain for Future Cybersecurity Applications. Switzerland: Springer; 2021. p. 307–27. 10.1007/978-3-030-74575-2_16

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.


Journal of Intelligent Systems



Published on 17.4.2024 in Vol 12 (2024)

This is a member publication of University of Washington

A Roadmap for Using Causal Inference and Machine Learning to Personalize Asthma Medication Selection

Authors of this article:


  • Flory L Nkoy 1 *, MS, MPH, MD;
  • Bryan L Stone 1 , MS, MD;
  • Yue Zhang 2, 3 , PhD;
  • Gang Luo 4 *, PhD

1 Department of Pediatrics, University of Utah, Salt Lake City, UT, United States

2 Division of Epidemiology, Department of Internal Medicine, University of Utah, Salt Lake City, UT, United States

3 Division of Biostatistics, Department of Population Health Sciences, University of Utah, Salt Lake City, UT, United States

4 Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States

*these authors contributed equally

Corresponding Author:

Gang Luo, PhD

Department of Biomedical Informatics and Medical Education

University of Washington

850 Republican Street, Building C

Seattle, WA, 98195

United States

Phone: 1 2062214596

Fax: 1 2062212671

Email: [email protected]

Inhaled corticosteroid (ICS) is a mainstay treatment for controlling asthma and preventing exacerbations in patients with persistent asthma. Many types of ICS drugs are used, either alone or in combination with other controller medications. Despite the widespread use of ICSs, asthma control remains suboptimal in many people with asthma. Suboptimal control leads to recurrent exacerbations, causes frequent ER visits and inpatient stays, and is due to multiple factors. One such factor is the inappropriate ICS choice for the patient. While many interventions targeting other factors exist, less attention is given to inappropriate ICS choice. Asthma is a heterogeneous disease with variable underlying inflammations and biomarkers. Up to 50% of people with asthma exhibit some degree of resistance or insensitivity to certain ICSs due to genetic variations in ICS metabolizing enzymes, leading to variable responses to ICSs. Yet, ICS choice, especially in the primary care setting, is often not tailored to the patient’s characteristics. Instead, ICS choice is largely by trial and error and often dictated by insurance reimbursement, organizational prescribing policies, or cost, leading to a one-size-fits-all approach with many patients not achieving optimal control. There is a pressing need for a decision support tool that can predict an effective ICS at the point of care and guide providers to select the ICS that will most likely and quickly ease patient symptoms and improve asthma control. To date, no such tool exists. Predicting which patient will respond well to which ICS is the first step toward developing such a tool. However, no study has predicted ICS response, forming a gap. While the biologic heterogeneity of asthma is vast, few, if any, biomarkers and genotypes can be used to systematically profile all patients with asthma and predict ICS response. 
As endotyping or genotyping all patients is infeasible, readily available electronic health record data collected during clinical care offer a low-cost, reliable, and more holistic way to profile all patients. In this paper, we point out the need for developing a decision support tool to guide ICS selection and the gap in fulfilling the need. Then we outline an approach to close this gap by creating a machine learning model and applying causal inference to predict a patient’s ICS response in the next year based on the patient’s characteristics. The model uses electronic health record data to characterize all patients and extract patterns that could mirror endotype or genotype. This paper supplies a roadmap for future research, with the eventual goal of shifting asthma care from one-size-fits-all to personalized care, improving outcomes, and saving health care resources.

Introduction

Asthma is a chronic disease characterized by inflammation, narrowing, and hyperactivity of the airways causing shortness of breath, chest tightness, coughing, and wheezing [ 1 ]. Asthma affects about 25 million people in the United States [ 2 ]. In 2021, there were 9.8 million exacerbations of asthma symptoms (or asthma attacks) leading to over 980,000 emergency room (ER) visits and over 94,500 hospitalizations [ 2 ]. Asthma costs the US economy over US $80 billion in health care expenses each year, work and school absenteeism, and deaths [ 3 ].

Inhaled corticosteroid (ICS) is a mainstay treatment for controlling asthma and preventing exacerbations in patients with persistent asthma [ 4 ] accounting for over 60% of people with asthma [ 5 , 6 ]. Many types of ICS drugs are used, either alone like fluticasone (Flovent, Arnuity, and Aller-flo), budesonide (Pulmicort, Entocort, and Rhinocort), mometasone (Asmanex), beclomethasone (Beclovent, Qvar, Vancenase, Beconase, Vanceril, and Qnasl), ciclesonide (Alvesco), and so forth, or in combination with a long-acting beta2 agonist like fluticasone/salmeterol (Advair), budesonide/formoterol (Symbicort), mometasone/formoterol (Dulera), and fluticasone/vilanterol (Breo), and so forth [ 4 ]. Regular use of appropriate ICSs improves asthma control and reduces airway inflammation, symptoms, exacerbations, ER visits, and inpatient stays [ 7 - 9 ].

Despite the widespread use of ICSs, asthma control remains suboptimal in many people with asthma [ 10 - 13 ] including 44% of children and 60% of adults based on asthma exacerbations in the past year [ 14 , 15 ], 72% of patients based on asthma control test [ 10 ], 53% of children and 44% of adults based on asthma attacks in the past year [ 16 ], and 59% of children based on the 2007-2013 Medical Expenditure Panel Survey [ 17 ]. Suboptimal control leads to recurrent exacerbations, causes frequent ER visits and inpatient stays, and is projected to have an economic burden of US $963.5 billion over the next 20 years [ 18 ]. Suboptimal control is due to multiple factors [ 19 - 23 ] including (1) failure to recognize and act on early signs of declining control [ 24 , 25 ], (2) lack of self-management skills, (3) nonadherence to therapy [ 26 ], and (4) inappropriate ICS choice for the patient [ 27 - 32 ]. While interventions targeting other factors exist, less attention has been given to inappropriate ICS choice.

Asthma is heterogeneous with variable profiles in terms of clinical presentations (phenotypes) and underlying mechanisms (endotypes) [ 33 , 34 ]. Molecular techniques have revealed a few phenotype and endotype relationships, allowing the categorization of asthma into two main groups (1) T-helper type 2 (Th2)-high (eg, atopic and late onset) and (2) Th2-low (eg, nonatopic, smoking-related, and obesity-related) [ 33 , 34 ]. It is known that within the 2 groups, there are many subgroups [ 33 , 35 ] with different biomarker expressions (eg, immunoglobulin E [IgE], fractional exhaled nitric oxide [FeNO], interleukin [IL]-4, IL-5, and IL-13) [ 36 ]. So far, only a few biomarkers have been characterized for use in clinical practice. Despite a few successes using biomarkers for targeted therapy, ICS choice, especially in the primary care setting, is largely by trial and error and many patients remain uncontrolled [ 37 - 42 ].

Besides patient nonadherence and environmental factors, response to ICS treatment is affected by genetic variations in ICS metabolizing enzymes [ 43 , 44 ], regardless of whether the ICS is used alone or is combined with another asthma medication like a long-acting beta2 agonist. Single nucleotide polymorphisms in cap methyltransferase 1 (CMTR1), tripartite motif containing 24 (TRIM24), and membrane associated guanylate kinase, WW and PDZ domain containing 2 (MAGI2) genes were found to be associated with variability in asthma exacerbations [ 43 ]. Additional evidence supports that these genes also cause variability in ICS response [ 44 ]. Due to genetic variations in cytochrome P (CYP) 450 enzymes that metabolize over 80% of drugs including ICS, up to 50% of people with asthma have altered metabolism to certain ICSs [ 45 - 51 ] impacting asthma control [ 52 , 53 ]. CYP3A5*3/*3 and CYP3A4*22 genotypes were found to be linked to ICS response [ 54 , 55 ]. These studies provide evidence that genetic variations greatly affect ICS responsiveness, although the exact relationships between genetic variations and ICS response remain largely unknown [ 36 , 56 , 57 ]. Currently, many candidate genes are being studied, and pharmacogenetics has not yet reached routine clinical practice in asthma care.

ICS choice for patients is often dictated by insurance reimbursement, organizational policies, or cost, leading to a one-size-fits-all approach [ 37 - 42 ]. Some insurers require patients to first fail on a cheaper ICS before authorizing a more expensive ICS [ 39 ]. Nonmedical switch due to preferred drug formulary change is common and leads to bad outcomes, with 70% of patients reporting more exacerbations after the switch [ 39 ]. Patients also often report that they tried a few different ICSs before ending up with the drug that gave them the most relief, with 60% reporting it was hard for their providers to find the effective drug [ 37 - 39 ]. Cycling through various ICSs delays the start of an effective ICS and is neither efficient nor cost-effective [ 39 ]. New strategies are needed to allow a faster and more efficient way to tailor ICS selection to each patient’s characteristics [ 36 ].

While the biologic heterogeneity of asthma is vast, few, if any, biomarkers or genotypes can currently be used to systematically profile all patients with asthma and predict ICS response [ 36 , 58 , 59 ]. Readily available electronic health record (EHR) data collected during clinical care offer a low-cost, reliable, and more holistic way to profile all patients [ 36 , 60 ]. With a high accuracy of 87%-95% [ 36 ], machine learning models using EHR data have been used to profile patients in various areas, for example, to develop a phenotype for patients with Turner syndrome [ 61 ], identify low medication adherence profiles [ 62 ], find variable COVID-19 treatment response profiles [ 63 ], and predict hypertension treatment response [ 64 ]. Yet, while machine learning has helped find various asthma profiles [ 65 - 72 ], no prior study has predicted ICS response. Also, prior studies are mostly from single centers with small sample sizes and have not moved the needle of precision treatment for asthma [ 58 , 60 ].

A decision support tool is greatly needed, especially in the primary care setting, to guide providers to select at the point of care the ICS that will most likely and quickly ease patient symptoms and improve asthma control. Forecasting which patient will respond well to which ICS is the first step toward creating this tool, but no prior study has predicted ICS response, forming a gap.

To shift asthma care from one-size-fits-all to personalized care, improve outcomes, and save health care resources, we make three contributions in this paper, supplying a roadmap for future research: (1) we point out the above-mentioned need for creating a decision support tool to guide ICS selection; (2) we point out the above-mentioned gap in fulfilling this need; and (3) to close this gap, we outline an approach to create a machine learning model and apply causal inference to predict a patient’s ICS response in the next year based on the patient’s characteristics. We present the central ideas of this approach in the following sections.

Creating a Machine Learning Model and Applying Causal Inference to Predict ICS Response

Overview of our approach.

We use EHR data from a large health care system to develop a machine learning model and apply causal inference to predict a patient’s ICS response based on the patient’s characteristics. As endotyping or genotyping all patients is infeasible, our model uses EHR data to characterize all patients and extract patterns that could mirror endotype or genotype. Our model is trained on historical data and can then be applied to new patients to guide ICS selection during an initial or early encounter for asthma care. The optimal ICS choice identified by our approach can be either an ICS (generic name and dosage) alone or an ICS combined with another asthma medication like a long-acting beta2 agonist.

Patients with asthma, both pediatric and adult, are treated by primary care providers (PCPs), who are mostly generalists, and by asthma specialists, including allergists, immunologists, and pulmonologists. Large differences exist between PCPs and specialists in knowledge, care patterns, and asthma outcomes, with asthma specialists adhering more often to guideline recommendations [ 73 - 76 ]. The difference is especially large in controller medication use [ 76 ]. Compared to PCPs, asthma specialists tend to achieve better outcomes [ 77 ], including higher physical functioning [ 78 ], better patient-reported care [ 78 ], and fewer ER visits and inpatient stays [ 78 - 84 ]. As over 60% of people with asthma are cared for by PCPs [ 85 ], our machine learning model primarily targets PCPs, although asthma specialists could also benefit from this model.

The asthma medication ratio (AMR) is the total number of units of asthma controller medications dispensed divided by the total number of units of asthma medications (controllers + relievers) dispensed [ 86 , 87 ]. Higher AMR (≥0.5) is associated with less oral corticosteroid use (a surrogate measure for asthma exacerbations), fewer ER visits and inpatient stays, and lower costs [ 87 - 89 ]. Lower AMR (<0.5) is associated with more exacerbations, ER visits, and inpatient stays [ 90 , 91 ]. Approved by Healthcare Effectiveness Data and Information Set (HEDIS) as a quality measure, AMR is widely used by health care systems [ 89 ]. AMR is a reliable reflection of asthma control and gives an accurate assessment of asthma exacerbation risk [ 92 ]. We use change in AMR as the prediction target of our model for predicting ICS response, as AMR can be calculated on all patients. In comparison, neither asthma control nor acute outcomes (eg, ER visits, inpatient stays, or oral corticosteroid use) is used as the prediction target, as the former is often missing in EHRs and the latter does not occur in all patients. An effective ICS will lead to less reliever use and increased AMR. An ineffective ICS will lead to more reliever use and reduced AMR. We formerly used EHR data to build accurate models to predict hospital use (ER visit or inpatient stay) for asthma [ 93 - 95 ]. We expect EHR data to have great predictive power for AMR, which is associated with hospital use for asthma [ 87 - 91 ]. Using the AMR can facilitate the dissemination of our approach across health care systems.
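The AMR definition above reduces to a simple ratio; as a minimal sketch (the function names are ours, while the formula and the 0.5 threshold follow the cited HEDIS measure):

```python
from typing import Optional

def asthma_medication_ratio(controller_units: int, reliever_units: int) -> Optional[float]:
    """AMR = controller units dispensed / all asthma medication units dispensed
    (controllers + relievers), computed over a 1-year period.
    Returns None when no asthma medication was dispensed at all."""
    total = controller_units + reliever_units
    if total == 0:
        return None
    return controller_units / total

def amr_adequate(amr: Optional[float]) -> bool:
    # AMR >= 0.5 is the threshold associated with better outcomes above.
    return amr is not None and amr >= 0.5
```

For example, 6 controller units and 2 reliever units over a year give an AMR of 0.75, above the 0.5 threshold; an effective ICS should push this ratio up by reducing reliever use.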

We outline the individual steps of our approach in the following sections.

Step 1: Building a Machine Learning Model to Predict a Patient’s ICS Response Defined by Changes in AMR

We focus on patients with persistent asthma, for whom ICSs are mainly used. We use the HEDIS case definition of persistent asthma [ 96 , 97 ], which is already validated [ 98 ] and is the most commonly used administrative data marker of persistent asthma [ 97 ]. A patient is deemed to have persistent asthma if, in each of 2 consecutive years, the patient meets at least one of the following criteria: (1) at least 1 ER visit or inpatient stay with a principal diagnosis code of asthma ( ICD-9 [ International Classification of Diseases, Ninth Revision ] 493.0x, 493.1x, 493.8x, 493.9x; ICD-10 [ International Classification of Diseases, Tenth Revision ] J45.x), (2) at least 2 asthma medication dispensing events and at least 4 outpatient visits, each with a diagnosis code of asthma, and (3) at least 4 asthma medication dispensing events. In the rest of this paper, we always use patients with asthma to refer to patients with persistent asthma. The prediction target or outcome is the amount of change in a patient's AMR after 1 year. The AMR is computed over a 1-year period [ 86 , 87 ].
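The case definition above can be expressed as a simple predicate over per-year counts; a sketch, in which the count field names are hypothetical stand-ins for values derived from EHR and claims data:

```python
def meets_persistent_asthma_criteria(year: dict) -> bool:
    """Check the HEDIS persistent-asthma criteria for one year's counts.

    `year` holds hypothetical count fields:
      er_or_inpatient_asthma: ER visits or inpatient stays with a principal
                              asthma diagnosis code
      asthma_dispensings:     asthma medication dispensing events
      asthma_outpatient:      outpatient visits with an asthma diagnosis code
    """
    return (
        year["er_or_inpatient_asthma"] >= 1
        or (year["asthma_dispensings"] >= 2 and year["asthma_outpatient"] >= 4)
        or year["asthma_dispensings"] >= 4
    )

def has_persistent_asthma(year1: dict, year2: dict) -> bool:
    # The criteria must be met in each of 2 consecutive years.
    return meets_persistent_asthma_criteria(year1) and meets_persistent_asthma_criteria(year2)
```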

We combine patient, air quality, and weather features computed on the raw variables to build the model to predict ICS response. Existing predictive models for asthma outcomes [ 93 - 95 , 99 - 110 ] rarely use air quality and weather variables, but these variables impact asthma outcomes [ 111 - 117 ] (eg, short-term exposure to air pollution, even if measured at the regional level, is associated with asthma exacerbations [ 113 - 117 ]). For each such variable, we examine multiple features (eg, mean, maximum, SD, and slope). We examine over 200 patient features listed in our papers’ [ 93 - 95 ] appendices and formerly used to predict hospital use for asthma, which is associated with AMR [ 87 - 91 ]. Several examples of these features are comorbidities, allergies, the number of the patient’s asthma-related ER visits in the prior 12 months, the total number of units of systemic corticosteroids ordered for the patient in the prior 12 months, and the number of primary or principal asthma diagnoses of the patient in the prior 12 months. We also use as features the patient’s current AMR computed over the prior 12 months [ 86 , 87 ], the generic name and the dosage of the ICS that the patient currently uses, and those of the long-acting beta2 agonist, leukotriene receptor antagonist, biologic or another asthma medication, if any, that is combined with the ICS.
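For each raw time-series variable (eg, daily air quality readings over the prior year), the summary features named above can be computed as follows. A minimal sketch; we take the slope as an ordinary least-squares fit against the time index, which is our assumption, as the text does not specify the fit:

```python
import statistics

def summary_features(values: list, prefix: str) -> dict:
    """Compute mean, max, SD, and slope for one raw time-series variable.
    `prefix` names the variable, eg, "pm25" for daily PM2.5 readings."""
    n = len(values)
    mean = sum(values) / n
    # Slope of an ordinary least-squares line fit against time index 0..n-1.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = (
        sum((t - t_mean) * (v - mean) for t, v in enumerate(values)) / denom
        if denom
        else 0.0
    )
    return {
        f"{prefix}_mean": mean,
        f"{prefix}_max": max(values),
        f"{prefix}_sd": statistics.stdev(values) if n > 1 else 0.0,
        f"{prefix}_slope": slope,
    }
```

Each raw variable then contributes several columns to the feature matrix, eg, `pm25_mean`, `pm25_max`, `pm25_sd`, and `pm25_slope`.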

Step 2: Conducting Causal Machine Learning to Identify Optimal ICS Choice

Our goal is to integrate machine learning and G-computation to develop a method to estimate the causal effects of various ICS choices on AMR for patients with specific characteristics. This causal machine learning method [ 118 ] processes large data sets by capturing complex nonlinear relationships between features, thereby revealing the cause-and-effect relationships between ICS choice and change in AMR. We use the machine learning model built in step 1. Using G-computation [ 119 , 120 ], an imputation-based causal inference method, we estimate the potential effects of hypothetical ICS choices with specific dosages on changes in AMR after 1 year. G-computation builds on the machine learning model of the outcome as a function of ICS indicators, ICS dosages, and other features to predict AMR outcomes under different counterfactual ICS choice scenarios. CIs are estimated through 10,000 bootstrap resamples with replacement [ 121 ].

We apply causal machine learning to estimate the impact of ICS choices on patients with specific characteristics by averaging predicted AMR after 1 year for a given ICS and these characteristics across all participants. This estimation is contrasted with the averaged predicted outcome in the absence of any ICS choice. The ICS choice with the highest and statistically significant contrast estimation is identified as the optimal choice for patients with these characteristics. All hypotheses can be tested at a significance level of .05.
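The contrast described above can be sketched as follows. This is an illustrative sketch, not our full method: `StubModel` is a toy stand-in for the step-1 model, the feature names are hypothetical, and `n_boot` would be set to 10,000 in practice as stated above:

```python
import random

class StubModel:
    """Toy stand-in for the step-1 model, for illustration only:
    predicted 1-year AMR change rises with the ICS indicator."""
    def predict(self, rows):
        return [0.1 * r["ics"] + 0.01 * r["age"] for r in rows]

def g_computation_contrast(model, patients, ics_column, ics_value, n_boot=1000, seed=0):
    """G-computation estimate of the effect of a counterfactual ICS choice on
    mean predicted AMR change, contrasted against the no-ICS scenario
    (ics_column = 0), with a percentile bootstrap CI."""
    def mean_outcome(rows, value):
        # Set every patient's ICS feature to the counterfactual value, keep
        # all other observed characteristics, and average the predictions.
        counterfactual = [{**r, ics_column: value} for r in rows]
        preds = model.predict(counterfactual)
        return sum(preds) / len(preds)

    estimate = mean_outcome(patients, ics_value) - mean_outcome(patients, 0)

    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        # Resample patients with replacement; reuse the same sample for
        # both counterfactual scenarios so the contrast is paired.
        sample = [patients[rng.randrange(len(patients))] for _ in patients]
        boots.append(mean_outcome(sample, ics_value) - mean_outcome(sample, 0))
    boots.sort()
    ci = (boots[int(0.025 * (n_boot - 1))], boots[int(0.975 * (n_boot - 1))])
    return estimate, ci
```

Running this contrast for every candidate ICS choice and picking the one with the highest statistically significant estimate mirrors the selection rule described above.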

Step 3: Assessing the Impact of Adding External Patient-Reported Asthma Control and ICS Use Adherence Data on the Model’s Predictions

EHRs often lack patient-reported data with extra predictive power, such as asthma control and ICS use adherence. For asthma, asthma control and ICS use adherence are critical variables, as (1) a patient's asthma control fluctuates over time and drives the provider's decision to prescribe or adjust ICSs and (2) ICS use adherence impacts the patient's asthma control and helps assess whether the patient is actually responding to an ICS. However, despite their high predictive power for patient outcomes, these variables are not routinely collected or included in EHRs in clinical practice. At Intermountain Healthcare, the largest health care system in Utah, we pioneered the electronic AsthmaTracker, a mobile health (mHealth) app used weekly to assess, collect, and monitor patients' asthma control and actual ICS use adherence [ 122 ]. Like most patient-reported data, these patient-reported variables have been collected on only a small proportion of patients with asthma. To date, 1380 patients with asthma have used the app and produced about 45,000 records of weekly asthma control scores and ICS use adherence data (eg, the ICS's name and the number of days an ICS is actually used by the patient in that week). If we train a predictive model using EHR and patient-reported data limited to this small proportion of patients, the model will be inaccurate due to insufficient training data. Yet, for these patients, combining their patient-reported data with the outputs of a model built on all patients' EHR data can help raise the prediction accuracy for them. To realize this, we propose the first method to combine external patient-reported data available on a small proportion of patients with the outputs of a model built on all patients' EHR data, raising prediction accuracy for the small proportion of patients while maintaining prediction accuracy for the other patients.

To illustrate how our method works, we consider the case that the model created in step 1 is built using Intermountain Healthcare EHR data. The weekly asthma control scores and ICS use adherence data collected from the 1380 patients with asthma are unused in step 1. Now we add features (eg, mean, SD, and slope) computed on patient-reported asthma control and ICS use adherence data to raise prediction accuracy for these patients. Among all patients with asthma, only 1% have asthma control and ICS use adherence data. We use the method shown in Figure 1 to combine the asthma control and ICS use adherence data from this small proportion of patients with the outputs of a model trained on EHR, air quality, and weather data of all patients with asthma. We start from the original model built in step 1. This model is reasonably accurate, as it is trained using EHR, air quality, and weather data of all patients with asthma and all features excluding those computed on asthma control and ICS use adherence data. For each patient with asthma control and ICS use adherence data, we apply the model to the patient, obtain a prediction result, and use this result as a feature. We then combine this new feature with the features computed on asthma control and ICS use adherence data to train a second model for these patients using their data. The second model is built upon and thus tends to be more accurate than the original model for these patients. The original model is used for the other patients. Our method is general, works for all kinds of features, and is not limited to any specific disease, prediction target, cohort, or health care system. Whenever a small proportion of patients have extra predictive variables, we could use this method to raise prediction accuracy for these patients while maintaining prediction accuracy for the other patients.
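The routing logic of this two-model scheme can be sketched as follows. The model classes are toy stand-ins for illustration, and the field names (eg, `"reported"`, `"adherence"`) are hypothetical:

```python
class BaseModel:
    """Toy stand-in for the original step-1 model trained on all patients'
    EHR, air quality, and weather features."""
    def predict(self, rows):
        return [0.5 for _ in rows]

class SecondModel:
    """Toy stand-in for the second model, trained only on patients with
    patient-reported data; it takes the base prediction as a feature."""
    def predict(self, rows):
        return [r["base_pred"] + 0.1 * r["adherence"] for r in rows]

def predict_ics_response(base_model, second_model, patient):
    """Route one patient through the two-model scheme.

    `patient` is a feature dict; patients who used the mHealth app carry a
    "reported" key holding features computed on their asthma control and
    ICS use adherence data."""
    base_pred = base_model.predict([patient])[0]
    if "reported" not in patient:
        # No patient-reported data: the original model's prediction is used.
        return base_pred
    # Patient-reported data available: feed the base prediction in as a
    # feature alongside the patient-reported features.
    row = {"base_pred": base_pred, **patient["reported"]}
    return second_model.predict([row])[0]
```

Because the second model builds on, and can only adjust, the base prediction for the small subset with patient-reported data, accuracy for all other patients is untouched by construction.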

For the patients with asthma control and ICS use adherence data, we compare the mean squared and the mean absolute prediction errors attained by the model built in step 1 and by the second model built here. We expect adding asthma control and ICS use adherence data to the model to lower both prediction errors. The error drop rates help reveal the value of routinely collecting asthma control and ICS use adherence data in clinical care to lower prediction errors. Currently, such data are rarely collected.


Principal Findings

Besides the variables mentioned in the "Step 1: Building a machine learning model to predict a patient's ICS response defined by changes in AMR" section, environmental variables beyond air quality and weather and many other factors can impact patient outcomes. Moreover, there are almost infinitely many possible features. For any initial study pursued along the direction pointed out in this paper, a realistic goal is to show that our methods can build decent models and improve asthma care, rather than to exhaust all possible useful variables and features and obtain the theoretically highest possible model performance. Not accounting for all possible factors limits the generalizability of these models to medication selection for other diseases.

We use the G-computation method to conduct causal inference. This method relies heavily on correctly specifying the predictive model for ICS response, including accurately identifying all relevant confounders and interactions and incorporating them into the model. Misspecification of the model can lead to biased estimated effects of various ICS choices on AMR. To address this issue, we can adopt several preventive strategies during model development. We engage with subject matter experts to ensure that the model includes all relevant variables and reflects the underlying process. To guide model development and help identify potential sources of bias, we construct a directed acyclic graph that lays out the relationships among the independent and dependent variables. We use machine learning techniques that provide flexible modeling approaches to capture complex relationships among variables. When reporting our findings, we remain transparent about the final model specification and the rationale behind our model building process. We believe using these strategies will mitigate the risk of model misspecification and strengthen the reliability of our estimated effects of various ICS choices on AMR.

AMR is reported to be a reliable reflection of asthma control and of asthma exacerbation risk [ 92 ]. In a future study along the direction pointed out in this paper, we plan to use Intermountain Healthcare data to validate this relationship. Specifically, we will use multivariable linear regression to assess the relationship between the AMR computed on EHR data and the patient's asthma control level obtained from the external patient-reported data, while controlling for other factors. We expect to see a strong and positive association between the AMR and the patient's asthma control level.
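The planned validation amounts to an ordinary least-squares fit with AMR as one predictor among several covariates. A minimal dependency-free sketch via the normal equations; the choice of covariates is illustrative:

```python
def ols(rows, y):
    """Multivariable linear regression via the normal equations (X'X)b = X'y.
    `rows` holds predictor values per patient (eg, [AMR, covariate, ...]);
    an intercept column of 1s is prepended.
    Returns [intercept, coef_1, ..., coef_k]."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    # Solve the p x p system by Gauss-Jordan elimination (X'X is positive
    # definite when the predictors are not collinear, so pivots stay nonzero).
    for i in range(p):
        pivot = xtx[i][i]
        xtx[i] = [v / pivot for v in xtx[i]]
        xty[i] /= pivot
        for k in range(p):
            if k != i:
                f = xtx[k][i]
                xtx[k] = [a - f * b for a, b in zip(xtx[k], xtx[i])]
                xty[k] -= f * xty[i]
    return xty
```

A positive coefficient on the AMR column, after controlling for the other factors, would support the expected association with asthma control.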

When creating the model in step 1, we can include medication persistence measures computed on insurance claim data [ 123 ], such as the proportion of days covered for ICS, as features. However, this does not obviate the need to examine patient-reported ICS use adherence data in step 3. ICS persistence measures give information on the possession of ICS, but not on actual use of ICS. Each ICS persistence measure is computed at a coarse time granularity as an average value over a long period. In comparison, our patient-reported ICS use adherence data offer information on the actual use of ICS. The data are at a fine time granularity, with 1 set of values per week for a patient. This enables us to compute features on various patterns and trends that can be useful for making predictions.
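The coarse granularity of a persistence measure is easy to see in code: the proportion of days covered collapses a year of fills into one number. A sketch of the standard covered-day-union computation, with the claims fields as hypothetical inputs:

```python
from datetime import date, timedelta

def proportion_of_days_covered(fills, window_start: date, window_end: date) -> float:
    """Proportion of days covered (PDC) for an ICS over a measurement window.

    `fills` is a list of (fill_date, days_supply) tuples from claims data.
    Overlapping fills are counted once (the covered-day union), the usual
    PDC convention; days outside the window are ignored."""
    covered = set()
    for fill_date, days_supply in fills:
        for d in range(days_supply):
            day = fill_date + timedelta(days=d)
            if window_start <= day <= window_end:
                covered.add(day)
    window_days = (window_end - window_start).days + 1
    return len(covered) / window_days
```

Note that PDC reflects possession only: a patient who fills prescriptions but never inhales the ICS still scores a high PDC, which is why the weekly patient-reported adherence data in step 3 remain necessary.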

Conclusions

In asthma care, ICS choice is largely by trial and error and often made with a one-size-fits-all approach, leaving many patients without optimal outcomes. In this paper, we point out the need for creating a decision support tool to guide ICS selection and a gap in fulfilling this need. Then we outline an approach to close this gap by creating a machine learning model and applying causal inference to predict a patient's ICS response in the next year based on the patient's characteristics. This supplies a roadmap for future research.

Authors' Contributions

FLN and GL are co-senior authors mainly responsible for the paper. They conceptualized the presentation approach, performed literature review, and wrote the paper. BLS provided feedback on various medical issues, contributed to conceptualizing the presentation, and revised the paper. YZ wrote the causal inference section. All authors read and approved the final paper.

Conflicts of Interest

GL is an editorial board member of JMIR AI . The other authors declare no conflicts of interest.

  • Hashmi MF, Tariq M, Cataletto ME. Asthma. In: StatPearls. Treasure Island (FL). StatPearls Publishing; 2023.
  • Most recent national asthma data. Centers for Disease Control and Prevention. 2023. URL: https://www.cdc.gov/asthma/most_recent_national_asthma_data.htm [accessed 2024-01-22]
  • Nurmagambetov T, Kuwahara R, Garbe P. The economic burden of asthma in the United States, 2008-2013. Ann Am Thorac Soc. 2018;15(3):348-356. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Inhaled corticosteroids. American Academy of Allergy, Asthma & Immunology. 2023. URL: https://www.aaaai.org/tools-for-the-public/drug-guide/inhaled-corticosteroids [accessed 2024-01-22]
  • Asthma severity among children with current asthma. Centers for Disease Control and Prevention. 2023. URL: https://archive.cdc.gov/#/details?url=https://www.cdc.gov/asthma/asthma_stats/severity_child.htm [accessed 2024-01-22]
  • Asthma severity among adults with current asthma. Centers for Disease Control and Prevention. 2023. URL: https://archive.cdc.gov/#/details?url=https://www.cdc.gov/asthma/asthma_stats/severity_adult.htm [accessed 2024-01-22]
  • Averell CM, Laliberté F, Germain G, Duh MS, Rousculp MD, MacKnight SD, et al. Impact of adherence to treatment with inhaled corticosteroids/long-acting β-agonists on asthma outcomes in the United States. Ther Adv Respir Dis. 2022;16:17534666221116997. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cardet JC, Papi A, Reddel HK. "As-needed" inhaled corticosteroids for patients with asthma. J Allergy Clin Immunol Pract. 2023;11(3):726-734. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sadatsafavi M, Lynd LD, De Vera MA, Zafari Z, FitzGerald JM. One-year outcomes of inhaled controller therapies added to systemic corticosteroids after asthma-related hospital discharge. Respir Med. 2015;109(3):320-328. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • George M, Balantac Z, Gillette C, Farooqui N, Tervonen T, Thomas C, et al. Suboptimal control of asthma among diverse patients: a US mixed methods focus group study. J Asthma Allergy. 2022;15:1511-1526. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sullivan PW, Ghushchyan V, Kavati A, Navaratnam P, Friedman HS, Ortiz B. Trends in asthma control, treatment, health care utilization, and expenditures among children in the United States by place of residence: 2003-2014. J Allergy Clin Immunol Pract. 2019;7(6):1835-1842.e2. [ CrossRef ] [ Medline ]
  • Zhang S, White J, Hunter AG, Hinds D, Fowler A, Gardiner F, et al. Suboptimally controlled asthma in patients treated with inhaled ICS/LABA: prevalence, risk factors, and outcomes. NPJ Prim Care Respir Med. 2023;33(1):19. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nurmagambetov TA, Krishnan JA. What will uncontrolled asthma cost in the United States? Am J Respir Crit Care Med. 2019;200(9):1077-1078. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Uncontrolled asthma among children with current asthma, 2018-2020. Centers for Disease Control and Prevention. 2021. URL: https://tinyurl.com/ycdz2mp2 [accessed 2024-01-22]
  • Uncontrolled asthma among adults, 2019. Centers for Disease Control and Prevention. 2020. URL: https://archive.cdc.gov/#/details?url=https://www.cdc.gov/asthma/asthma_stats/uncontrolled-asthma-adults-2019.htm [accessed 2024-01-22]
  • Pate CA, Zahran HS, Qin X, Johnson C, Hummelman E, Malilay J. Asthma surveillance—United States, 2006-2018. MMWR Surveill Summ. 2021;70(5):1-32. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sullivan PW, Ghushchyan V, Navaratnam P, Friedman HS, Kavati A, Ortiz B, et al. National prevalence of poor asthma control and associated outcomes among school-aged children in the United States. J Allergy Clin Immunol Pract. 2018;6(2):536-544.e1. [ CrossRef ] [ Medline ]
  • Yaghoubi M, Adibi A, Safari A, FitzGerald JM, Sadatsafavi M. The projected economic and health burden of uncontrolled asthma in the United States. Am J Respir Crit Care Med. 2019;200(9):1102-1112. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Centers for Disease Control and Prevention (CDC). Asthma hospitalizations and readmissions among children and young adults--Wisconsin, 1991-1995. MMWR Morb Mortal Wkly Rep. 1997;46(31):726-729. [ FREE Full text ] [ Medline ]
  • Li D, German D, Lulla S, Thomas RG, Wilson SR. Prospective study of hospitalization for asthma. A preliminary risk factor model. Am J Respir Crit Care Med. 1995;151(3 Pt 1):647-655. [ CrossRef ] [ Medline ]
  • Crane J, Pearce N, Burgess C, Woodman K, Robson B, Beasley R. Markers of risk of asthma death or readmission in the 12 months following a hospital admission for asthma. Int J Epidemiol. 1992;21(4):737-744. [ CrossRef ] [ Medline ]
  • Mitchell EA, Bland JM, Thompson JM. Risk factors for readmission to hospital for asthma in childhood. Thorax. 1994;49(1):33-36. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Vargas PA, Perry TT, Robles E, Jo CH, Simpson PM, Magee JM, et al. Relationship of body mass index with asthma indicators in head start children. Ann Allergy Asthma Immunol. 2007;99(1):22-28. [ CrossRef ] [ Medline ]
  • Barnes PJ. Achieving asthma control. Curr Med Res Opin. 2005;21(Suppl 4):S5-S9. [ CrossRef ] [ Medline ]
  • Bloomberg GR, Banister C, Sterkel R, Epstein J, Bruns J, Swerczek L, et al. Socioeconomic, family, and pediatric practice factors that affect level of asthma control. Pediatrics. 2009;123(3):829-835. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Bateman ED, Frith LF, Braunstein GL. Achieving guideline-based asthma control: does the patient benefit? Eur Respir J. 2002;20(3):588-595. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chapman KR, Boulet LP, Rea RM, Franssen E. Suboptimal asthma control: prevalence, detection and consequences in general practice. Eur Respir J. 2008;31(2):320-325. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Rabe KF, Adachi M, Lai CK, Soriano JB, Vermeire PA, Weiss KB, et al. Worldwide severity and control of asthma in children and adults: the global asthma insights and reality surveys. J Allergy Clin Immunol. 2004;114(1):40-47. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • National Asthma Education and Prevention Program. Expert Panel Report 3 (EPR-3): guidelines for the diagnosis and management of asthma-summary report 2007. J Allergy Clin Immunol. 2007;120(Suppl 5):S94-S138. [ CrossRef ] [ Medline ]
  • Stempel DA, McLaughin TP, Stanford RH, Fuhlbrigge AL. Patterns of asthma control: a 3-year analysis of patient claims. J Allergy Clin Immunol. 2005;115(5):935-939. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cukovic L, Sutherland E, Sein S, Fuentes D, Fatima H, Oshana A, et al. An evaluation of outpatient pediatric asthma prescribing patterns in the United States. Int J Sci Res Arch. 2023;9(1):344-349. [ FREE Full text ] [ CrossRef ]
  • Belhassen M, Nibber A, Van Ganse E, Ryan D, Langlois C, Appiagyei F, et al. Inappropriate asthma therapy-a tale of two countries: a parallel population-based cohort study. NPJ Prim Care Respir Med. 2016;26:16076. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McIntyre AP, Viswanathan RK. Phenotypes and endotypes in asthma. Adv Exp Med Biol. 2023;1426:119-142. [ CrossRef ] [ Medline ]
  • Kuruvilla ME, Lee FE, Lee GB. Understanding asthma phenotypes, endotypes, and mechanisms of disease. Clin Rev Allergy Immunol. 2019;56(2):219-233. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Salter B, Lacy P, Mukherjee M. Biologics in asthma: a molecular perspective to precision medicine. Front Pharmacol. 2021;12:793409. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • van der Burg N, Tufvesson E. Is asthma's heterogeneity too vast to use traditional phenotyping for modern biologic therapies? Respir Med. 2023;212:107211. [ CrossRef ] [ Medline ]
  • A study of the qualitative impact of non-medical switching. Alliance for Patient Access. 2019. URL: https://tinyurl.com/2vxwks83 [accessed 2024-01-22]
  • Cost-motivated treatment changes & non-medical switching: commercial health plans analysis. Alliance for Patient Access. 2017. URL: https://tinyurl.com/424dy3xz [accessed 2024-01-22]
  • Collins S. Asthma meds, insurers, and the practice of non-medical drug switching. HealthCentral. 2023. URL: https://www.healthcentral.com/condition/asthma/what-you-need-to-know-about-asthma-meds [accessed 2024-01-22]
  • Landhuis E. OTC budesonide-formoterol for asthma could save lives, money. Medscape Medical News. 2023. URL: https://www.medscape.com/viewarticle/989099 [accessed 2024-01-22]
  • Modglin L. How much do inhalers cost? SingleCare. 2022. URL: https://www.singlecare.com/blog/asthma-inhalers-price-list [accessed 2024-01-22]
  • Gibson PG, McDonald VM, Thomas D. Treatable traits, combination inhaler therapy and the future of asthma management. Respirology. 2023;28(9):828-840. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Dahlin A, Denny J, Roden DM, Brilliant MH, Ingram C, Kitchner TE, et al. CMTR1 is associated with increased asthma exacerbations in patients taking inhaled corticosteroids. Immun Inflamm Dis. 2015;3(4):350-359. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Keskin O, Farzan N, Birben E, Akel H, Karaaslan C, Maitland-van der Zee AH, et al. Genetic associations of the response to inhaled corticosteroids in asthma: a systematic review. Clin Transl Allergy. 2019;9:2. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Delgado-Dolset MI, Obeso D, Rodríguez-Coira J, Tarin C, Tan G, Cumplido JA, et al. Understanding uncontrolled severe allergic asthma by integration of omic and clinical data. Allergy. 2022;77(6):1772-1785. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Liu Q, Hua L, Bao C, Kong L, Hu J, Liu C, et al. Inhibition of spleen tyrosine kinase restores glucocorticoid sensitivity to improve steroid-resistant asthma. Front Pharmacol. 2022;13:885053. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cardoso-Vigueros C, von Blumenthal T, Rückert B, Rinaldi AO, Tan G, Dreher A, et al. Leukocyte redistribution as immunological biomarker of corticosteroid resistance in severe asthma. Clin Exp Allergy. 2022;52(10):1183-1194. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Liang H, Zhang X, Ma Z, Sun Y, Shu C, Zhu Y, et al. Association of CYP3A5 gene polymorphisms and amlodipine-induced peripheral edema in Chinese Han patients with essential hypertension. Pharmgenomics Pers Med. 2021;14:189-197. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wang SB, Huang T. The early detection of asthma based on blood gene expression. Mol Biol Rep. 2019;46(1):217-223. [ CrossRef ] [ Medline ]
  • Roberts JK, Moore CD, Romero EG, Ward RM, Yost GS, Reilly CA. Regulation of CYP3A genes by glucocorticoids in human lung cells. F1000Res. 2013;2:173. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Moore CD, Roberts JK, Orton CR, Murai T, Fidler TP, Reilly CA, et al. Metabolic pathways of inhaled glucocorticoids by the CYP3A enzymes. Drug Metab Dispos. 2013;41(2):379-389. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Roche N, Garcia G, de Larrard A, Cancalon C, Bénard S, Perez V, et al. Real-life impact of uncontrolled severe asthma on mortality and healthcare use in adolescents and adults: findings from the retrospective, observational RESONANCE study in France. BMJ Open. 2022;12(8):e060160. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Munoz-Cano R, Torrego A, Bartra J, Sanchez-Lopez J, Palomino R, Picado C, et al. Follow-up of patients with uncontrolled asthma: clinical features of asthma patients according to the level of control achieved (the COAS study). Eur Respir J. 2017;49(3):1501885. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Stockmann C, Reilly CA, Fassl B, Gaedigk R, Nkoy F, Stone B, et al. Effect of CYP3A5*3 on asthma control among children treated with inhaled beclomethasone. J Allergy Clin Immunol. 2015;136(2):505-507. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Stockmann C, Fassl B, Gaedigk R, Nkoy F, Uchida DA, Monson S, et al. Fluticasone propionate pharmacogenetics: CYP3A4*22 polymorphism and pediatric asthma control. J Pediatr. 2013;162(6):1222-1227, 1227.e1-2. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Smolnikova MV, Kasparov EW, Malinchik MA, Kopylova KV. Genetic markers of children asthma: predisposition to disease course variants. Vavilovskii Zhurnal Genet Selektsii. 2023;27(4):393-400. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kim HK, Kang JO, Lim JE, Ha TW, Jung HU, Lee WJ, et al. Genetic differences according to onset age and lung function in asthma: a cluster analysis. Clin Transl Allergy. 2023;13(7):e12282. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mohan A, Lugogo NL. Phenotyping, precision medicine, and asthma. Semin Respir Crit Care Med. 2022;43(5):739-751. [ CrossRef ] [ Medline ]
  • Casanova S, Ahmed E, Bourdin A. Definition, phenotyping of severe asthma, including cluster analysis. Adv Exp Med Biol. 2023;1426:239-252. [ CrossRef ] [ Medline ]
  • Singhal P, Tan ALM, Drivas TG, Johnson KB, Ritchie MD, Beaulieu-Jones BK. Opportunities and challenges for biomarker discovery using electronic health record data. Trends Mol Med. 2023;29(9):765-776. [ CrossRef ] [ Medline ]
  • Huang SD, Bamba V, Bothwell S, Fechner PY, Furniss A, Ikomi C, et al. Development and validation of a computable phenotype for turner syndrome utilizing electronic health records from a national pediatric network. Am J Med Genet A. 2024;194(4):e63495. [ CrossRef ] [ Medline ]
  • Blecker S, Schoenthaler A, Martinez TR, Belli HM, Zhao Y, Wong C, et al. Leveraging electronic health record technology and team care to address medication adherence: protocol for a cluster randomized controlled trial. JMIR Res Protoc. 2023;12:e47930. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Verhoef PA, Spicer AB, Lopez-Espina C, Bhargava A, Schmalz L, Sims MD, et al. Analysis of protein biomarkers from hospitalized COVID-19 patients reveals severity-specific signatures and two distinct latent profiles with differential responses to corticosteroids. Crit Care Med. 2023;51(12):1697-1705. [ CrossRef ] [ Medline ]
  • Hu Y, Huerta J, Cordella N, Mishuris RG, Paschalidis IC. Personalized hypertension treatment recommendations by a data-driven model. BMC Med Inform Decis Mak. 2023;23(1):44. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cottrill KA, Rad MG, Ripple MJ, Stephenson ST, Mohammad AF, Tidwell M, et al. Cluster analysis of plasma cytokines identifies two unique endotypes of children with asthma in the pediatric intensive care unit. Sci Rep. 2023;13(1):3521. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Horne EMF, McLean S, Alsallakh MA, Davies GA, Price DB, Sheikh A, et al. Defining clinical subtypes of adult asthma using electronic health records: analysis of a large UK primary care database with external validation. Int J Med Inform. 2023;170:104942. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ilmarinen P, Julkunen-Iivari A, Lundberg M, Luukkainen A, Nuutinen M, Karjalainen J, et al. Cluster analysis of Finnish population-based adult-onset asthma patients. J Allergy Clin Immunol Pract. 2023;11(10):3086-3096. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Imoto S, Suzukawa M, Fukutomi Y, Kobayashi N, Taniguchi M, Nagase T, et al. Phenotype characterization and biomarker evaluation in moderate to severe type 2-high asthma. Asian Pac J Allergy Immunol. 2023:1-14. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kim MA, Shin SW, Park JS, Uh ST, Chang HS, Bae DJ, et al. Clinical characteristics of exacerbation-prone adult asthmatics identified by cluster analysis. Allergy Asthma Immunol Res. 2017;9(6):483-490. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Matabuena M, Salgado FJ, Nieto-Fontarigo JJ, Álvarez-Puebla MJ, Arismendi E, Barranco P, et al. Identification of asthma phenotypes in the Spanish MEGA cohort study using cluster analysis. Arch Bronconeumol. 2023;59(4):223-231. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ngo SY, Venter C, Anderson WC3, Picket K, Zhang H, Arshad SH, et al. Clinical features and later prognosis of replicable early-life wheeze clusters from two birth cohorts 12 years apart. Pediatr Allergy Immunol. 2023;34(7):e13999. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhan W, Wu F, Zhang Y, Lin L, Li W, Luo W, et al. Identification of cough-variant asthma phenotypes based on clinical and pathophysiologic data. J Allergy Clin Immunol. 2023;152(3):622-632. [ CrossRef ] [ Medline ]
  • Cloutier MM, Akinbami LJ, Salo PM, Schatz M, Simoneau T, Wilkerson JC, et al. Use of national asthma guidelines by allergists and pulmonologists: a national survey. J Allergy Clin Immunol Pract. 2020;8(9):3011-3020.e2. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Vollmer WM, O'Hollaren M, Ettinger KM, Stibolt T, Wilkins J, Buist AS, et al. Specialty differences in the management of asthma. A cross-sectional assessment of allergists' patients and generalists' patients in a large HMO. Arch Intern Med. 1997;157(11):1201-1208. [ Medline ]
  • Cloutier MM, Salo PM, Akinbami LJ, Cohn RD, Wilkerson JC, Diette GB, et al. Clinician agreement, self-efficacy, and adherence with the guidelines for the diagnosis and management of asthma. J Allergy Clin Immunol Pract. 2018;6(3):886-894.e4. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Diette GB, Skinner EA, Nguyen TT, Markson L, Clark BD, Wu AW. Comparison of quality of care by specialist and generalist physicians as usual source of asthma care for children. Pediatrics. 2001;108(2):432-437. [ CrossRef ] [ Medline ]
  • Rosman Y, Hornik-Lurie T, Meir-Shafrir K, Lachover-Roth I, Cohen-Engler A, Confino-Cohen R. The effect of asthma specialist intervention on asthma control among adults. World Allergy Organ J. 2022;15(11):100712. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wu AW, Young Y, Skinner EA, Diette GB, Huber M, Peres A, et al. Quality of care and outcomes of adults with asthma treated by specialists and generalists in managed care. Arch Intern Med. 2001;161(21):2554-2560. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Erickson S, Tolstykh I, Selby JV, Mendoza G, Iribarren C, Eisner MD. The impact of allergy and pulmonary specialist care on emergency asthma utilization in a large managed care organization. Health Serv Res. 2005;40(5 Pt 1):1443-1465. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zeiger RS, Heller S, Mellon MH, Wald J, Falkoff R, Schatz M. Facilitated referral to asthma specialist reduces relapses in asthma emergency room visits. J Allergy Clin Immunol. 1991;87(6):1160-1168. [ CrossRef ] [ Medline ]
  • Mahr TA, Evans R3. Allergist influence on asthma care. Ann Allergy. 1993;71(2):115-120. [ Medline ]
  • Schatz M, Zeiger RS, Mosen D, Apter AJ, Vollmer WM, Stibolt TB, et al. Improved asthma outcomes from allergy specialist care: a population-based cross-sectional analysis. J Allergy Clin Immunol. 2005;116(6):1307-1313. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wechsler ME. Managing asthma in primary care: putting new guideline recommendations into context. Mayo Clin Proc. 2009;84(8):707-717. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cooper S, Rahme E, Tse SM, Grad R, Dorais M, Li P. Are primary care and continuity of care associated with asthma-related acute outcomes amongst children? A retrospective population-based study. BMC Prim Care. 2022;23(1):5. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Akinbami LJ, Salo PM, Cloutier MM, Wilkerson JC, Elward KS, Mazurek JM, et al. Primary care clinician adherence with asthma guidelines: the National Asthma Survey of Physicians. J Asthma. 2020;57(5):543-555. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • HEDIS measures and technical resources: asthma medication ratio (AMR). NCQA. 2023. URL: https:/​/www.​ncqa.org/​hedis/​measures/​medication-management-for-people-with-asthma-and-asthma-medication-ratio [accessed 2024-01-22]
  • Schatz M, Zeiger RS, Vollmer WM, Mosen D, Mendoza G, Apter AJ, et al. The controller-to-total asthma medication ratio is associated with patient-centered as well as utilization outcomes. Chest. 2006;130(1):43-50. [ CrossRef ] [ Medline ]
  • Kim Y, Parrish KM, Pirritano M, Moonie S. A higher asthma medication ratio (AMR) predicts a decrease in ED visits among African American and Hispanic children. J Asthma. 2023;60(7):1428-1437. [ CrossRef ] [ Medline ]
  • Luskin AT, Antonova EN, Broder MS, Chang E, Raimundo K, Solari PG. Patient outcomes, health care resource use, and costs associated with high versus low HEDIS asthma medication ratio. J Manag Care Spec Pharm. 2017;23(11):1117-1124. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Andrews AL, Simpson AN, Basco WTJ, Teufel RJ2. Asthma medication ratio predicts emergency department visits and hospitalizations in children with asthma. Medicare Medicaid Res Rev. 2013;3(4):mmrr.003.04.a05. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Andrews AL, Brinton DL, Simpson KN, Simpson AN. A longitudinal examination of the asthma medication ratio in children with Medicaid. J Asthma. 2020;57(10):1083-1091. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Andrews AL, Brinton D, Simpson KN, Simpson AN. A longitudinal examination of the asthma medication ratio in children. Am J Manag Care. 2018;24(6):294-300. [ FREE Full text ] [ Medline ]
  • Tong Y, Messinger AI, Wilcox AB, Mooney SD, Davidson GH, Suri P, et al. Forecasting future asthma hospital encounters of patients with asthma in an academic health care system: predictive model development and secondary analysis study. J Med Internet Res. 2021;23(4):e22796. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Luo G, He S, Stone BL, Nkoy FL, Johnson MD. Developing a model to predict hospital encounters for asthma in asthmatic patients: secondary analysis. JMIR Med Inform. 2020;8(1):e16080. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Luo G, Nau CL, Crawford WW, Schatz M, Zeiger RS, Rozema E, et al. Developing a predictive model for asthma-related hospital encounters in patients with asthma in a large, integrated health care system: secondary analysis. JMIR Med Inform. 2020;8(11):e22689. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mosen DM, Macy E, Schatz M, Mendoza G, Stibolt TB, McGaw J, et al. How well do the HEDIS asthma inclusion criteria identify persistent asthma? Am J Manag Care. 2005;11(10):650-654. [ FREE Full text ] [ Medline ]
  • Schatz M, Zeiger RS. Improving asthma outcomes in large populations. J Allergy Clin Immunol. 2011;128(2):273-277. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Schatz M, Zeiger RS, Yang SJ, Chen W, Crawford WW, Sajjan SG, et al. Persistent asthma defined using HEDIS versus survey criteria. Am J Manag Care. 2010;16(11):e281-e288. [ FREE Full text ] [ Medline ]
  • Schatz M, Nakahiro R, Jones CH, Roth RM, Joshua A, Petitti D. Asthma population management: development and validation of a practical 3-level risk stratification scheme. Am J Manag Care. 2004;10(1):25-32. [ FREE Full text ] [ Medline ]
  • Schatz M, Cook EF, Joshua A, Petitti D. Risk factors for asthma hospitalizations in a managed care organization: development of a clinical prediction rule. Am J Manag Care. 2003;9(8):538-547. [ FREE Full text ] [ Medline ]
  • Lieu TA, Quesenberry CP, Sorel ME, Mendoza GR, Leong AB. Computer-based models to identify high-risk children with asthma. Am J Respir Crit Care Med. 1998;157(4 Pt 1):1173-1180. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lieu TA, Capra AM, Quesenberry CP, Mendoza GR, Mazar M. Computer-based models to identify high-risk adults with asthma: is the glass half empty of half full? J Asthma. 1999;36(4):359-370. [ CrossRef ] [ Medline ]
  • Forno E, Fuhlbrigge A, Soto-Quirós ME, Avila L, Raby BA, Brehm J, et al. Risk factors and predictive clinical scores for asthma exacerbations in childhood. Chest. 2010;138(5):1156-1165. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Loymans RJB, Debray TPA, Honkoop PJ, Termeer EH, Snoeck-Stroband JB, Schermer TRJ, et al. Exacerbations in adults with asthma: a systematic review and external validation of prediction models. J Allergy Clin Immunol Pract. 2018;6(6):1942-1952.e15. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Eisner MD, Yegin A, Trzaskoma B. Severity of asthma score predicts clinical outcomes in patients with moderate to severe persistent asthma. Chest. 2012;141(1):58-65. [ CrossRef ] [ Medline ]
  • Sato R, Tomita K, Sano H, Ichihashi H, Yamagata S, Sano A, et al. The strategy for predicting future exacerbation of asthma using a combination of the asthma control test and lung function test. J Asthma. 2009;46(7):677-682. [ CrossRef ] [ Medline ]
  • Yurk RA, Diette GB, Skinner EA, Dominici F, Clark RD, Steinwachs DM, et al. Predicting patient-reported asthma outcomes for adults in managed care. Am J Manag Care. 2004;10(5):321-328. [ FREE Full text ] [ Medline ]
  • Xiang Y, Ji H, Zhou Y, Li F, Du J, Rasmy L, et al. Asthma exacerbation prediction and risk factor analysis based on a time-sensitive, attentive neural network: retrospective cohort study. J Med Internet Res. 2020;22(7):e16981. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Miller MK, Lee JH, Blanc PD, Pasta DJ, Gujrathi S, Barron H, et al. TENOR risk score predicts healthcare in adults with severe or difficult-to-treat asthma. Eur Respir J. 2006;28(6):1145-1155. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Loymans RJ, Honkoop PJ, Termeer EH, Snoeck-Stroband JB, Assendelft WJ, Schermer TR, et al. Identifying patients at risk for severe exacerbations of asthma: development and external validation of a multivariable prediction model. Thorax. 2016;71(9):838-846. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Schatz M. Predictors of asthma control: what can we modify? Curr Opin Allergy Clin Immunol. 2012;12(3):263-268. [ CrossRef ] [ Medline ]
  • Dick S, Doust E, Cowie H, Ayres JG, Turner S. Associations between environmental exposures and asthma control and exacerbations in young children: a systematic review. BMJ Open. 2014;4(2):e003827. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Schwartz J, Slater D, Larson TV, Pierson WE, Koenig JQ. Particulate air pollution and hospital emergency room visits for asthma in Seattle. Am Rev Respir Dis. 1993;147(4):826-831. [ CrossRef ] [ Medline ]
  • Romieu I, Meneses F, Sienra-Monge JJ, Huerta J, Ruiz Velasco S, White MC, et al. Effects of urban air pollutants on emergency visits for childhood asthma in Mexico City. Am J Epidemiol. 1995;141(6):546-553. [ CrossRef ] [ Medline ]
  • Lu P, Zhang Y, Lin J, Xia G, Zhang W, Knibbs LD, et al. Multi-city study on air pollution and hospital outpatient visits for asthma in China. Environ Pollut. 2020;257:113638. [ CrossRef ] [ Medline ]
  • Liu Y, Pan J, Zhang H, Shi C, Li G, Peng Z, et al. Short-term exposure to ambient air pollution and asthma mortality. Am J Respir Crit Care Med. 2019;200(1):24-32. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Vagaggini B, Taccola M, Cianchetti S, Carnevali S, Bartoli ML, Bacci E, et al. Ozone exposure increases eosinophilic airway response induced by previous allergen challenge. Am J Respir Crit Care Med. 2002;166(8):1073-1077. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sanchez P, Voisey JP, Xia T, Watson HI, O'Neil AQ, Tsaftaris SA. Causal machine learning for healthcare and precision medicine. R Soc Open Sci. 2022;9(8):220638. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Math Model. 1986;7(9-12):1393-1512. [ FREE Full text ] [ CrossRef ]
  • Snowden JM, Rose S, Mortimer KM. Implementation of G-computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol. 2011;173(7):731-738. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Efron B. Bootstrap methods: another look at the Jackknife. Ann Stat. 1979;7(1):1-26. [ FREE Full text ] [ CrossRef ]
  • Nkoy FL, Stone BL, Fassl BA, Uchida DA, Koopmeiners K, Halbern S, et al. Longitudinal validation of a tool for asthma self-monitoring. Pediatrics. 2013;132(6):e1554-e1561. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Anghel LA, Farcas AM, Oprean RN. An overview of the common methods used to measure treatment adherence. Med Pharm Rep. 2019;92(2):117-122. [ FREE Full text ] [ CrossRef ] [ Medline ]

Edited by A Benis; submitted 24.01.24; peer-reviewed by H Tibble, A Kaplan; comments to author 01.03.24; revised version received 12.03.24; accepted 25.03.24; published 17.04.24.

©Flory L Nkoy, Bryan L Stone, Yue Zhang, Gang Luo. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 17.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.

Published on 16.4.2024 in Vol 8 (2024)

Integrating Explainable Machine Learning in Clinical Decision Support Systems: Study Involving a Modified Design Thinking Approach

Authors of this article:

Original Paper

  • Michael Shulha 1, 2, PhD;
  • Jordan Hovdebo 3, PhD;
  • Vinita D’Souza 1, 2, MSc;
  • Francis Thibault 4, PhD;
  • Rola Harmouche 4, PhD

1 Lady Davis Institute for Medical Research, Jewish General Hospital, Centre intégré universitaire de santé et de services sociaux (CIUSSS) du Centre-Ouest-de-l'Île-de-Montréal, Montreal, QC, Canada

2 Department of Family Medicine, McGill University, Montreal, QC, Canada

3 National Research Council of Canada, Winnipeg, MB, Canada

4 National Research Council of Canada, Boucherville, QC, Canada

Corresponding Author:

Michael Shulha, PhD

Lady Davis Institute for Medical Research

Jewish General Hospital

Centre intégré universitaire de santé et de services sociaux (CIUSSS) du Centre-Ouest-de-l'Île-de-Montréal

Pavilion B-274

3755 Chem. de la Côte-Sainte-Catherine

Montreal, QC, H3T 1E2

Phone: 1 514 340 8222

Email: [email protected]

Background: Though there has been considerable effort to implement machine learning (ML) methods for health care, clinical implementation has lagged. Incorporating explainable machine learning (XML) methods through the development of a decision support tool using a design thinking approach is expected to lead to greater uptake of such tools.

Objective: This work aimed to explore how sustained engagement of clinician end users can address the poor adoption of ML tools in clinical contexts, which stems in part from their lack of transparency, and to address the challenges of presenting explainability in a decision support interface.

Methods: We used a design thinking approach augmented with additional theoretical frameworks to provide more robust approaches to different phases of design. In particular, in the problem definition phase, we incorporated the nonadoption, abandonment, scale-up, spread, and sustainability of technology in health care (NASSS) framework to assess these aspects in a health care network. This process focused development on a prognostic tool that predicted the likelihood of admission to an intensive care ward based on disease severity in chest x-ray images. In the ideate, prototype, and test phases, we incorporated a metric framework for assessing physician trust in artificial intelligence (AI) tools. This allowed us to compare physicians’ assessments of the domain-appropriate representation, actionability, and consistency of the tool.

Results: Physicians found the design of the prototype elegant and the data displayed in the tool to be domain appropriate. They appreciated the simplified explainability overlay, which displayed only the most predictive patches that cumulatively explained 90% of the final admission risk score. Finally, in terms of consistency, physicians unanimously appreciated the capacity to compare multiple x-ray images in the same view. They also appreciated the ability to toggle the explainability overlay; both options made it easier for them to assess how consistently the tool identified elements of the x-ray image they felt would contribute to overall disease severity.

Conclusions: The adopted approach is situated in an evolving space concerned with incorporating XML or AI technologies into health care software. We addressed the alignment of AI as it relates to clinician trust, describing an approach to wireframing and prototyping that incorporates a theoretical framework for trust into the design process itself. Moreover, we propose that the alignment of AI depends on the integration of end users throughout the larger design process. Our work shows the importance and value of engaging end users prior to tool development. We believe that the described approach is a unique and valuable contribution that outlines a direction for ML experts, user experience designers, and clinician end users on how to collaborate in the creation of trustworthy and usable XML-based clinical decision support tools.

Introduction

Though much research has been published on the applications of machine learning (ML) in clinical contexts, few studies have proceeded to deployment for patient care [ 1 ]. Barriers to adoption in health care include data quality; data bias; and lack of proper validation, reproducibility, and transparency [ 2 ]. Particularly with respect to transparency, black box models, which are increasingly used for prediction tasks in clinical contexts, do not provide the rationale behind the prediction in order to justify a clinical decision [ 3 , 4 ]. In fact, studies found that “physicians need to understand artificial intelligence (AI) methods and systems sufficiently to be able to trust an algorithm’s predictions—or know how to assess the trustworthiness and value of an algorithm—as a foundation for clinical recommendations” [ 5 ].

Explainable machine learning (XML) is a field focused on developing techniques to help end users understand the predictions made by complex models [ 6 ], including information concerning the underlying data and the performance of the model [ 8 ]. Following Rudin [ 7 ], we adopted the definition of XML as the use of additional post hoc models to explain a primary black-box model, in contrast to inherently interpretable models. However, the effectiveness of the various approaches to explainability is dependent upon well-designed and highly usable user interfaces [ 9 ], and Abdul et al [ 10 ] pointed out that much of the work within the domains of AI and ML has not focused on usability or practical interpretability. Indeed, as Liao et al [ 8 ] discussed, current work provides limited guidance on actualizing guidelines in user interfaces.

As discussed by Schwartz et al [ 11 ], clinician involvement in the design of ML clinical decision support has primarily been used to validate the clinical accuracy of underlying models developed by the researchers. A recent review by Chen et al [ 12 ] of explainable AI and ML medical imaging design found no evidence of end-user clinical involvement in the design of explainability models and a highly limited number of articles that documented an empirical assessment of explainability claims with end users. These findings mirror our previous unpublished work, which looked at the broader state of XML in clinical decision support and found the same low engagement of end users in the empirical assessment of XML decision support applications.

Our study used a design thinking [ 13 , 14 ] approach to explore how constant engagement of clinician end users could provide insights on how to improve the alignment of XML decision support to actual end-user needs and address challenges related to presenting explainability in a decision support interface. To this end, we identified a relevant ML decision support tool targeted toward COVID-19 via clinician focus groups. We then developed a clinician-facing interface for the quantification of COVID-19 severity from chest x-ray images with XML. We tested the resulting prototype via structured interviews with clinicians to verify the domain-appropriate representation, potential actionability, and consistency of the tool.

Ethical Considerations

Ethical approval for this study was granted by the Centre intégré universitaire de santé et de services sociaux (CIUSSS) of West Central Montreal psychosocial research ethics committee (Project 2022-2838) and by the NRC research ethics board (Project 2021-101). Informed consent was received from all participants. In all analysis and research documents, participant-identifying data were replaced by a code. No compensation was offered to any participants of the study.

Design Thinking Approach

We chose a design thinking approach to optimize clinician involvement in the creation of an XML-based clinical decision support system (CDSS). Design thinking is a process for solving complex problems that emphasizes iteration and rapid prototyping to maximize end-user involvement in generating a usable solution. Stanford University Design School describes 5 key phases of the design thinking approach, namely, empathize , define , ideate , prototype , and test . Table 1 provides an overview of the work presented in this manuscript according to design thinking phases. For each phase, we define the objective, associated research activities, end-user involvement, and supplementary theoretical frameworks used to add robustness to our work.

To simplify the structure of the paper, we have chosen to report the majority of research activities conducted in the empathize, define, and ideate phases in the Methods section. The Results section is primarily focused on the outputs of the prototype and test phases.

a N/A: not applicable.

b NASSS: nonadoption, abandonment, scale-up, spread, and sustainability of technology in health care.

Empathize Phase

The objective of the empathize phase was to better understand the motivation and experiences of potential end users and to consult with experts on the problem in question. In this phase, we conducted three key research activities: (1) a rapid review to identify clinical use cases for ML or AI that could benefit from explainability and be useful to clinicians in the context of the COVID-19 pandemic; (2) focus groups with physicians designed to review the output of the rapid review and to elicit data that would help the team better understand the scope and nature of a tool that would be most useful to physicians in an integrated health care network, responding to the COVID-19 pandemic; and (3) a scoping review to better understand the existing XML CDSS in health care, the associated design frameworks used for explainability, and the research methods used to study end-user perceptions.

During our rapid review, we searched Scopus, the World Health Organization (WHO) COVID-19 publication database, and the Dialog Proquest COVID-19 database to identify systematic reviews, literature reviews, or surveys of AI or ML technologies used to support clinicians in a pandemic (COVID-19). We identified 65 review articles, of which the 7 most pertinent were used to separate the cited papers within the reviews into 8 broad categories of applications. We selected 4 of these categories to present to stakeholders as candidate applications based on assessment of their clinical need, applicability to hospital settings, relevance to our current research field, interest to clinicians, and feasibility in the chosen clinical setting. The selected categories were large-scale COVID-19 screening; detection, diagnosis, or prognosis of COVID-19; predicting recovery, mortality, or severity of COVID-19 patients; and hospital resource management.

After performing a nonexhaustive scan of additional publications that fit these categories, we retained a total of 37 peer-reviewed articles describing ML implementations that met the following criteria: explainability would benefit the technique, and the associated data or code were available for implementation. These articles were explored to further select 3 themes, each with its own clinical use case for an ML application, which cross-cut the previously described categories. The first theme was screening: algorithms for tools that may help with COVID-19 screening by predicting the risk of COVID-19 in undiagnosed patients through the analysis of text-based telehealth notes or triage notes in the emergency room. The second theme was prognosis: algorithms for tools that may help predict the severity of COVID-19 infections and the prognosis and risk of intensive care unit (ICU) admission through the analysis of chest x-ray images. The third theme was long COVID: algorithms for tools that may predict the likelihood of long-term implications (long COVID) of COVID-19 through patient-reported outcomes.

Two focus groups were conducted with 7 physicians to identify which of the 3 use cases were suitable for use in clinical decision support. Participants represented a broad range of medical specialties and had experience providing COVID-19–related care in a variety of venues ( Textbox 1 ).

Medical subspecialties

  • Emergency medicine
  • Intensive care
  • Palliative care
  • Family medicine
  • Diagnostic medicine
  • Internal medicine

COVID-19 care venues

  • Long-term care facilities
  • Family medicine centers
  • Emergency departments
  • Intensive care units
  • COVID-19 acute care wards

For each category of possible tools (screening, prognosis, and long COVID), we used the following interview guide questions to seed the discussions: (1) How might a clinical decision support tool focused on (insert tool type) be useful in the context of our health care sites? (additional prompts: Could you describe what you see as the value of this type of tool for clinicians [doctors, nurses, and others]? Could you describe what you see as the value of this type of tool for patients?); (2) If we assume that we can access the required data to make the tool work, what additional challenges might a tool like this be associated with? (additional prompts: Any specialized additional clinical knowledge needed? Would it require dramatic changes to existing care protocols or workflows? Any special characteristic of our patient population?); and (3) What types of information or data points would be most crucial in an explanation of the prediction being discussed?

All data from the focus groups were transcribed and loaded into NVivo software (QSR International) for thematic coding.

In parallel, a scoping review of the use of XML for decision support in health care was conducted, using the methods proposed by Levac et al [ 16 ]. We generally found very few studies that described testing or methods to collect end-user perceptions of explainability, and even fewer studies that referenced any design theory or framework in the development of decision support tools.

Define Phase

The objective of the define phase was to synthesize the findings from the empathize phase and formally define the scope of the problem. For the focus group data, we applied the framework developed to study the nonadoption, abandonment, scale-up, spread, and sustainability of technology in health care (NASSS) to analyze physician feedback on the possible tools for development. The NASSS framework comprises 7 domains: condition, technology, value proposition, adopters, organizations, wider system, and embedding over time [ 15 ]. While traditionally used to analyze technology implementations, it can also be used to “generate a rich and situated narrative of the multiple influences on a complex project” [ 17 ] and to assess in advance whether a given technology will be adopted in a health care setting. Following Fereday and Muir-Cochrane [ 18 ], we used a hybrid deductive-inductive thematic analysis. One researcher completed an initial deductive coding pass based a priori on the domains of the NASSS framework. During coding, 2 members of the research team met frequently to review challenges in the coding process and to identify new subcodes for each domain. A detailed summary of the NASSS framework domains and sample coded participant data are provided in Multimedia Appendix 1 . The objective of this analysis was to better understand the suitability of each tool for adoption within the organization. We present a brief summary of the analysis in Table 2 .

Physicians felt that the characteristics of long COVID were still unclear and difficult to define, and thus, it would be inappropriate to develop a long COVID tool. There was strong disapproval of a screening tool that makes predictions based on analysis of free text in the patient medical records, given the possibility of incorrect information, inconsistent completion of records, missing information, and false reporting by patients.

Physicians were more receptive to the prognosis tool. Their familiarity with x-ray images and more trust in the image data source increased support for this tool. Physicians not only considered this a more useful application but also considered the warning of the impending prognosis to be important.

a NASSS: nonadoption, abandonment, scale-up, spread, and sustainability of technology in health care.

b Positive sentiment from physicians.

c Negative sentiment from physicians.

Ideate Phase

In this phase, the research team met continuously to build various approaches for both the user experience (UX) implementation and the underlying explainability approaches used in the tool. As discussed in the Introduction section, increasing clinician trust in ML-based CDSS applications is seen as a key driver of their use in actual practice. In line with this principle, the team adopted an evaluation framework published by Tonekaboni et al [ 4 ], which presents a series of metrics that can help assess clinician trust in an explanation provided by an ML CDSS tool. The framework includes 3 metrics. The first metric is domain appropriate representation: the degree to which the tool provides adequate information to the end user within the context of the specific clinical setting and workflow. The second metric is potential actionability: the degree to which the tool facilitates appropriate decisions or “next steps” in the care of the patient. The third metric is consistency: the degree to which changes in the tool’s predictions and corresponding explanations can be explored to determine consistency.

These metrics were incorporated as design guidelines in the ideate phase and then used as the primary metrics in the prototype and test phases.

Prototype Phase

The objective of the prototype phase was to develop low-cost physical representations of the tool that allow for more detailed end-user feedback and more opportunities to iterate on the design of the solution.

For domain appropriate representation, we focused on providing a succinct summary of additional COVID-19–relevant information that physicians would likely find relevant in the context of a prognostic prediction based solely on chest x-ray images. This included the addition of a subset of patient vitals, laboratory values, and history, including symptom onset.

For potential actionability, we assumed the most important visual component of the tool would be the chest x-ray and corresponding explainability features. The team chose to implement a heat map–based visualization approach that would highlight areas of the image that most significantly contributed to the x-ray severity score. The technical implementation is explained in more detail below.

Our assumption was that this would quickly provide clinicians with the information they needed to make appropriate decisions concerning the next steps of the patient’s care trajectory. To provide context for the severity score, we considered a model of ICU admission that assesses risk by defining 3 risk categories (low, medium, and high), each with an associated likelihood of admission.
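As a concrete illustration of this mapping, a minimal sketch follows; the cutoff values and function name are hypothetical placeholders, not the thresholds used in the actual risk model:

```python
def categorize_icu_risk(severity_score, low_cutoff=5.0, high_cutoff=9.0):
    """Map a total x-ray severity score (opacity 0-6 plus geographic
    extent 0-8, so 0-14 overall) to a coarse ICU-admission risk
    category. The cutoffs here are illustrative placeholders."""
    if severity_score < low_cutoff:
        return "low"
    if severity_score < high_cutoff:
        return "medium"
    return "high"
```

In a deployed tool, each category would also carry the empirically estimated likelihood of ICU admission for scores in that band.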

Finally, for consistency, we planned for the clinician to be able to click on multiple imaging results in the patient timeline such that the clinician can compare the ML predictions across images and make assessments as to the consistency of the predictions.

The prototype wireframe in Figure 1 illustrates the basic design and different domain considerations.


ML Model: Developing Explainability

Based on the selected application, we chose the algorithm of Cohen et al [ 19 ] to include in the prototype application. Their work predicts the level of lung opacity and the geographic extent of disease regions from the input x-ray data, using a deep neural network. The main network consists of a network pretrained on large public non–COVID-19 data sets followed by 2 regression networks, one for each of the opacity and extent outputs. We computed total disease severity as the sum of the 2 neural network outputs. Training of the regression networks was performed using COVID-19 data obtained by radiologist scoring of chest x-ray images with the following scores: (1) the extent of involvement of ground-glass opacity for each lung (for a total between 0 and 8), and (2) the degree of opacity for each lung (for a total between 0 and 6).
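As a rough illustration (not the authors' code), the scoring scheme above can be sketched as follows. The per-lung sub-ranges (0–4 for extent, 0–3 for opacity) are our assumption, inferred from the stated totals of 0–8 and 0–6:

```python
def total_severity(extent_left, extent_right, opacity_left, opacity_right):
    """Combine per-lung radiologist-style scores into a total severity score.

    Assumed ranges: extent of ground-glass involvement 0-4 per lung
    (total 0-8); degree of opacity 0-3 per lung (total 0-6). The model in
    the paper predicts the two totals with separate regression heads; here
    we simply sum them, giving a severity score in the range 0-14.
    """
    extent = extent_left + extent_right      # 0..8
    opacity = opacity_left + opacity_right   # 0..6
    assert 0 <= extent <= 8 and 0 <= opacity <= 6, "scores out of range"
    return extent + opacity

# Example: mildly diseased left lung, moderately diseased right lung.
print(total_severity(1, 3, 1, 2))  # -> 7
```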

For our purposes, we were able to leverage their latest publicly available implementation and database [ 20 ], thus allowing us to only focus on developing the explainability methods. The multisite data set consisted of posteroanterior deidentified chest x-ray images of patients with varying COVID-19 severity, and each x-ray image had an associated disease severity score obtained by radiologists. Many of the patients had x-ray images from several time points, and the number of time points per patient was not consistent across patients.

We aimed to explore local post-hoc explainability approaches as we needed to explain specific instances of pre-existing models. We focused on model-agnostic methods as they would be useful in broader contexts and be applicable independent of the clinicians’ choice of model. In particular, we selected LIME (local interpretable model-agnostic explanations) [ 21 ] to interpret the model output, owing to the simplicity of its implementation and the perceived intuitiveness of its results. In our case, LIME explained the predicted severity as a linear function of the contribution, positive or negative, that each area in the x-ray makes to the total severity score. We also found that the output produced by LIME, which consists of contiguous regions, is more consistent with users’ expectations.
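To make the LIME idea concrete, the following is a toy, dependency-free sketch of the underlying mechanism (not the LIME package itself, and not the authors' implementation): perturb which patches of the image are "visible", query the black-box model on each perturbation, and fit a locally weighted ridge regression whose coefficients act as per-patch contributions to the score:

```python
import math
import random

def lime_patch_weights(predict, n_patches, n_samples=500,
                       kernel_width=0.75, ridge=1e-3, seed=0):
    """Toy LIME-style explanation for a patch-based model (illustrative only).

    `predict` maps a binary mask (tuple of 0/1 per patch, 1 = patch kept)
    to a scalar severity score. We sample perturbed masks, weight each by
    an exponential kernel on its distance to the unperturbed image, and fit
    a weighted ridge regression; the coefficients approximate each patch's
    positive or negative contribution to the score.
    """
    rng = random.Random(seed)
    masks, ys, ws = [], [], []
    for _ in range(n_samples):
        m = [rng.randint(0, 1) for _ in range(n_patches)]
        d = 1.0 - sum(m) / n_patches  # fraction of patches removed
        masks.append(m)
        ys.append(predict(tuple(m)))
        ws.append(math.exp(-(d ** 2) / kernel_width ** 2))
    # Solve (X^T W X + ridge*I) beta = X^T W y, with an intercept column.
    p = n_patches + 1
    X = [[1.0] + [float(v) for v in m] for m in masks]
    A = [[ridge * (i == j) for j in range(p)] for i in range(p)]
    b = [0.0] * p
    for x, y, w in zip(X, ys, ws):
        for i in range(p):
            b[i] += w * x[i] * y
            for j in range(p):
                A[i][j] += w * x[i] * x[j]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for j in range(c, p):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][j] * beta[j]
                              for j in range(r + 1, p))) / A[r][r]
    return beta[1:]  # per-patch contributions (intercept dropped)

# Toy "model": patch 0 raises severity by 3, patch 2 lowers it by 1.
weights = lime_patch_weights(lambda m: 3 * m[0] - 1 * m[2] + 2, n_patches=4)
print([round(w, 2) for w in weights])  # ~[3.0, 0.0, -1.0, 0.0]
```

The real LIME package additionally segments the image into superpixels (eg, with Quickshift, as described below) and perturbs pixels rather than abstract patches, but the surrogate-fitting step is the same in spirit.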

Low Fidelity Prototype

Using our wireframe and the results from the ML model, a static low fidelity prototype of the tool was developed using Figma ( Figure 2 ) [ 22 ].

We defined 3 levels of risk of ICU admission (low, medium, and high) based on the severity score, using the observed ICU admission rates and associated severity scores in our test data set. We defined severity score ranges for each risk level such that similarity in severity was maximized within each range. The risk was then calculated as the average probability of admission within each category. Our data set contained 950 x-ray images representing 472 unique patients. A subset of 398 x-ray images for 162 unique patients with information on ICU stay was used to calculate the risk of admission.
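The bucketing logic can be sketched as follows; the thresholds and records here are hypothetical placeholders, as the paper does not report the exact severity ranges it derived:

```python
def risk_categories(records, low_max=4, med_max=9):
    """Illustrative sketch (thresholds are hypothetical, not from the paper).

    `records` is a list of (severity_score, admitted_to_icu) pairs, with
    admitted_to_icu as 0 or 1. Severity scores are bucketed into low /
    medium / high ranges, and the risk for each category is the observed
    ICU admission rate within that range.
    """
    buckets = {"low": [], "medium": [], "high": []}
    for severity, admitted in records:
        if severity <= low_max:
            buckets["low"].append(admitted)
        elif severity <= med_max:
            buckets["medium"].append(admitted)
        else:
            buckets["high"].append(admitted)
    # Average admission probability per bucket (None if a bucket is empty).
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}

data = [(2, 0), (3, 0), (4, 1), (7, 0), (8, 1), (12, 1), (13, 1)]
print(risk_categories(data))
# -> {'low': 0.333..., 'medium': 0.5, 'high': 1.0}
```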

For the explainability component, we used the LIME Python package [ 23 ] to explain the predicted severity as a linear function of the contributions from image patches that compose the full x-ray image. Each image was subdivided into patches using the Quickshift Segmentation algorithm [ 24 ]. We found that the resulting explanation was unintuitive upon visual inspection, as patches indicating high contributions to increasing severity did not appear to be diseased. In addition, the low fidelity prototype displayed importance values for all patches and proved too confusing during iterative testing. The patches showing a negative contribution were cluttering the display and were unnecessary to show highly diseased regions. This was addressed in future iterations.

Following the design thinking approach [ 25 ], we met with 2 clinicians who had participated in the focus group sessions and asked them to provide individual feedback during informal sessions of 30 to 60 minutes. We asked for their overall impression of the tool prototype design, as well as comments on specific design choices made by the research team to improve UX and explainability. These sessions were conducted over Microsoft Teams. Follow-up was conducted via email as the team iterated on the comments and ideas from physicians.


The objective of the test phase was to gain more insight into the working prototype. The high fidelity (software) prototype was developed in Python using the Plotly Dash framework [ 26 ], a library that allows for quick prototyping of user interfaces.

The interactive prototype was tested with 5 physicians from the initial focus group sessions. Following Doshi-Velez and Kim [ 27 ], we used an application-grounded evaluation approach in which intended end users used the tool to simulate prognostic prediction in the context of a 1-hour semistructured interview. We began the sessions by asking clinicians to explore the application with no guidance and then facilitated a discussion about the various features of the tool by presenting 3 separate patient cases. The questions from the interview guide were as follows: (1) What is your overall impression of what the tool is presenting to you? (overall impression); (2) How well does the tool provide you with necessary contextual information about the case? (domain representation; additional probes: Is there too much information or is there missing information? Is the information poorly organized or is the presentation of the information confusing?); (3) How does the tool help or hinder your ability to make a treatment decision or take action with the patient? (actionability; additional probes: Is the prediction the tool is making clear? Is the tool adequately transparent with regard to the certainty of the prediction? Are there any complementary data points you feel are missing for you to make a decision? Do you feel this tool could be shared with a patient in its current state?); and (4) As we show different cases, what are your impressions of how changes in the prediction are explained by the tool? (consistency; additional probes: In cases where the tool shows the progression of predictions for a single patient, does the tool adequately explain the changes? In cases where the tool shows a range of different patients, are the differences in predictions adequately explained by the tool?).

Prototype Phase (Low Fidelity)

In evaluating domain appropriate representation, the major information elements required by physicians were present in the first prototype; however, physicians did mention that the following elements needed to be added: (1) A more prevalent display of the date of COVID-19 diagnosis and date of symptom onset; (2) A more prevalent display for vaccination dates if available; (3) The method and volume of current oxygen in L/min; and (4) Additional laboratory test values for procalcitonin, C-reactive protein, and interleukin-6 (indicators of infection or inflammation that could be treated).

As mentioned above in the ideate phase, our objective with this design was to present the prediction in the most significant quadrant of the screen to ensure potential actionability; however, both our clinician testers commented that presenting the image first violated the basic approach for clinical assessment and decision-making. While we assumed that the additional information displayed in the interface to provide patient context would only be looked at if needed, clinicians told us that this information would be relevant and would need to be incorporated into their assessment of the quality of the ML prediction. Based on this feedback, we chose to reverse the orientation of the information for the development of the interactive prototype.

In addressing the consistency metric, while we explained to the physicians that the interactive tool would ultimately allow them to move through multiple images to compare the progression of the disease, we could not simulate this in the static image. Physicians did comment that it would be much easier if the multiple images could be displayed in comparison. They also commented on the importance of being able to turn the heat map overlay on and off so as to be able to compare the areas highlighted by the ML model with the actually affected areas on the x-ray image. This suggestion was implemented in the interactive prototype.

Finally, physicians commented that it was unclear how the consolidation in the image contributed to the severity score and how the severity score contributed to the risk of ICU admission. These were addressed in the interactive prototype through the creation of tool tip pop-ups with textual explanations.

Testing Phase (High Fidelity)

The final interactive prototype was designed based on all feedback identified from the prototyping sessions and is presented in Figure 3 .


Domain Appropriate Representation

Clinicians found the design of the prototype elegant and felt that the data displayed in the tool provided domain appropriate representation. The added contextual data (history, vitals, and laboratory values) were deemed necessary, and no items were considered superfluous. Clinicians appreciated that the domain of interest was succinctly represented on a single screen. However, one of the most important issues uncovered was the degree to which physicians erroneously assumed that the additional data present in the tool, namely vitals and laboratory values, were being included in the x-ray severity score. While the tool tips made it explicitly clear that no data other than chest x-ray data were being used in the prediction model, further work needed to be done to visually distinguish information elements used to provide additional domain context from information used in the ML model.

Based on initial feedback during the prototype phase, we specifically probed clinicians on the reorganization of the structure and sequencing of the information in the interface, and the degree to which it supported the standard thought processes clinicians would follow to reach a decision. Physicians responded well to the revised structure and information flow. Moreover, they highlighted the possibility of using the tool as an additional teaching resource with junior staff or nonradiology specialists.

Potential Actionability

In order to reduce clutter in the display of explainability compared to the low fidelity version, we showed only the patches that had a positive contribution to severity in red, where higher values had a darker color. In order to allow the end user to focus on the most predictive areas only, we tested several methods for displaying only a subset of the regions, including displaying a fixed number of patches (eg, 3, 4, and 5) or displaying patches that explained a certain percentage of the final score (eg, 80% and 90%). The selected prototype displayed only the most predictive patches that cumulatively explained 90% of the final score. The rationale was that, in cases where the top contribution comes from many patches with low and roughly equal values, showing only a fixed number of patches would omit several equally important ones. The remaining explainable model hyperparameters (maximum distance, color space and image space proximity ratio, and kernel size for the Quickshift Segmentation; regularization coefficient for the explainer’s ridge regression; and width of the explainer’s exponential kernel) were then optimized by maximizing the coefficient of determination via sequential optimization using decision trees [ 28 ]. This optimization process resulted in more intuitive and actionable explanations.
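The 90% cumulative-coverage selection rule described above can be sketched as follows (illustrative only; the patch IDs and contribution values are invented):

```python
def patches_to_display(contributions, coverage=0.90):
    """Select the most predictive positive patches that cumulatively
    explain a given fraction of the total positive contribution.

    `contributions` maps patch id -> LIME-style contribution. Negative
    patches are dropped (they cluttered the display), and positive patches
    are kept, largest first, until `coverage` of the positive total is
    reached. (A sketch of the rule in the text, not the authors' code.)
    """
    positive = sorted(((v, k) for k, v in contributions.items() if v > 0),
                      reverse=True)
    total = sum(v for v, _ in positive)
    kept, running = [], 0.0
    for value, patch in positive:
        kept.append(patch)
        running += value
        if running >= coverage * total:
            break
    return kept

scores = {"A": 5.0, "B": 3.0, "C": 1.5, "D": 0.5, "E": -2.0}
print(patches_to_display(scores))  # -> ['A', 'B', 'C'] (9.5/10 = 95%)
```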

We further explored potential actionability through the following 3 possible use case scenarios for the tool in the interviews: (1) Triage of patients presenting at the emergency department; (2) Discharge planning; and (3) Shared decision-making with patients.

In the first scenario (triage), many physicians were quick to point out the evolution of care protocols across the different waves of the pandemic. In the first and second waves, the lack of global disease knowledge caused a large number of pre-emptive ICU admissions; however, this is no longer the case. Physicians did note that they felt the tool could be very helpful in terms of planning for potential ICU admissions over time and helping manage staff resource issues.

In the second scenario (discharge planning), we proposed that the tool could be used as a final check for moving patients from high acuity care to lower levels of acuity, either discharge to home or discharge to virtual hospital care. Physicians generally felt that the tool could be useful as an additional data source to confirm an assessment of low risk.

In the third scenario (shared decision-making), we asked physicians whether they thought the tool could be helpful in shared decision-making, especially in scenarios where there may be some disagreement between a physician and a patient about a proposed next step. Physicians noted that the tool could be useful in explaining escalations in care to patients currently experiencing moderate symptoms.

Consistency

In terms of consistency, physicians unanimously appreciated the capacity to compare multiple x-ray images in the same view. They also appreciated the ability to toggle the explainability overlay; together, these features made it easier for them to assess how consistently the tool was identifying elements of the x-ray image they felt would contribute to overall disease severity. Not all physicians agreed with the tool’s assessments, but they felt that more exposure to a larger number of predictions would be necessary to gauge how much they trusted the tool.

Designing for Trust and Decision-Making

As discussed by Wang et al [ 29 ], when predictive AI is used in decision support tools, end users seek explanations to help improve their decision-making, and in cases where the tool performs in unexpected ways, explanations are critical for allowing users to identify what elements of the underlying model may be contributing to an unexpected prediction. In our case, we used saliency heat maps to show causal attribution to a severity score, where the highlighted regions represented areas with the greatest contribution to the severity score. Physicians appreciated the ability to toggle the heat map on and off to clearly identify the areas of the image that most contributed. However, multiple physicians did note that trust in the application would be built over longer term use, allowing them to assess the degree to which the application would align or deviate from their own unaided clinical assessments.

As described by Wang et al [ 29 ], this may be considered a type of heuristic representative bias, whereby past experience can lead a physician to wrongly associate a current case with similar previous cases. While our design allowed physicians to compare multiple instances of chest x-ray images for a single patient, a further iteration could incorporate features that would help to address this heuristic bias. Specifically, we could include the potential to compare an existing case to similar prototype example cases and use a dissimilarity metric to compare cases.
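Such a dissimilarity metric could be as simple as a distance between case feature vectors. The following is a hypothetical sketch (no such feature was implemented in the tool; the feature vectors and prototype cases below are invented, and features are assumed pre-normalized to [0, 1]):

```python
import math

def case_dissimilarity(case_a, case_b):
    """Hypothetical dissimilarity between two cases represented as feature
    vectors, eg, (severity score, days since symptom onset, oxygen flow),
    with each feature assumed pre-normalized to [0, 1]. Plain Euclidean
    distance is used here for simplicity.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(case_a, case_b)))

def most_similar_prototype(case, prototypes):
    """Return the prototype case with the smallest dissimilarity."""
    return min(prototypes, key=lambda p: case_dissimilarity(case, p))

prototypes = [(0.2, 0.1, 0.0), (0.6, 0.5, 0.4), (0.9, 0.8, 1.0)]
print(most_similar_prototype((0.7, 0.6, 0.5), prototypes))  # -> (0.6, 0.5, 0.4)
```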

It is also important to note that there was skepticism that a model based solely on chest x-ray images could provide predictions as good as those of a model based on multiple inputs. This highlights the design challenge of optimizing domain appropriate representation and potential actionability in the user interface. Clinicians felt that it was important to see the prediction from chest x-ray images in the context of additional clinical information that feeds into the heuristic framework they use for assessing a patient’s disease trajectory. Those additional data points clearly played a role in their likelihood to trust the explanation; however, they were not accounted for in our model. Further exploration of this challenge might include comparing the accuracy of the current model to predictions that use additional key inputs.

Design Thinking and Rapid Evolution

The COVID-19 pandemic evolved rapidly, and as such, the constant engagement with end users allowed the team to improve the potential application of the tool as well as the information displayed. During the prototype phase of development, physicians pointed out that the hospital was rapidly starting up a virtual care service for early discharge of COVID-19 patients to be cared for at home. This allowed the team to realign some of the discussion in the testing phase to assess the suitability of the tool for a unique case that did not exist in the ideate phase of development.

Moreover, it allowed us to modify the scope of information displayed in the tool to bring vaccine-related information into the main display. Again, this information only began to become available in the prototype phase of the project.

Finally, we were able to probe physicians around the applicability of the prototype as a shared decision-making tool to be used with patients, which was suggested informally during the ideate and prototype phases.

The above examples illustrate the importance of agility that is integral to the design thinking approach and represent ways in which the potential applicability and design were improved, which would not have been addressed in a traditional waterfall development approach.

Combining Design Theory With Additional Frameworks for a More Robust Approach

While design theory provides a well-established approach for continuous engagement with end users, we believe our approach of augmenting design thinking by incorporating additional conceptual frameworks helped to create a more robust collaborative tool design.

First, we used the NASSS framework during the “define” phase of the project to systematically analyze the results of our physician focus groups. This approach helped the team to quickly identify how the potential solutions would align with the various subdomains of the model. We see this as a pragmatic approach and helpful augmentation of the design thinking process to ensure the chosen design direction does not face dramatic sociotechnical barriers to development and potential implementation. Recent research into adoption of ML into clinical practice has used the NASSS framework in a similar manner. Pumplun et al [ 30 ] used the NASSS framework to identify 13 specific factors influencing the adoption of ML systems and further proposed a maturity model to be used by health care institutions to assess their readiness to adopt ML-based tools.

Similarly, we augmented the ideate, prototype, and test phases of the project by applying the evaluation metrics proposed by Tonekaboni et al [ 4 ]. This approach allowed us to begin the ideation and design process focused on core domains that would impact physician trust (domain appropriate representation, potential actionability, and consistency). It provided a consistent lens to assess both the prototype and test versions of the tool. This consistency in approach led the team to quickly identify design-specific improvements that directly led to the production of a prototype our physicians felt could provide immediate value.

Overall, the approach taken in our work can be situated in an evolving space concerned with incorporating AI technologies into health care software. Andersen et al [ 31 ] proposed a framework of 5 lenses from which to view this growing research field. Our work makes contributions to 2 of these lenses (AI as alignment with human values and AI as a design process). First, we addressed the alignment of AI as it relates to clinician trust, describing an approach to wireframing and prototyping that incorporates the use of a theoretical framework for trust in the design process itself. We described how this allows designers to gauge end-user alignment or trust in AI at multiple stages and optimize designs accordingly.

Second, as described in detail throughout this work, we propose that the alignment of AI is dependent upon integration of end users throughout the larger design process. Our work shows the importance and value of engaging end users prior to tool development, specifically in the process of assessing the broader applicability of a potential AI tool and its eventual use within actual health care environments.

Limitations

There are several limitations associated with this work. First, from an ML perspective, though we can verify the intuitiveness of an explanation, the accuracy of explainability methods has not been properly studied to date. We thus do not know how well an explanation fits the true underlying prediction, however intuitive it may appear. This is of particular concern in the case of additive feature attribution methods like LIME, where a local linear model is used to explain a potentially more complex nonlinear underlying model.

Second, we used a publicly available data set with limited data, and thus, there were several implications. For example, it was difficult to find exemplary samples where an explanation can clearly demonstrate why an algorithm deviated from the ground truth or examples that can shed light on why an algorithm may have behaved in an unpredictable way. The data set contained a lot of missing data and was limited beyond imaging information, and as such, it was challenging to find examples with the full patient state (such as vital signs, multiple time points, etc) to provide end users with the desired contextual information to make a fully informed assessment. Finally, the data set was relatively old considering the rapid evolution of COVID-19 and approaches to its treatment, and this has implications on the likelihood of ICU admission considering the state of a patient.

With regard to our focus group participants, it is important to note that only physicians were represented in this research. While this was intentional in the study design, as it is primarily physicians who will make decisions about ICU admission, our work could have benefited from the inclusion of additional health care providers, such as nurses and respiratory therapists. Indeed as Nalin [ 32 ] pointed out, a larger system perspective of the use of the proposed tool could have provided richer data in the focus group phases.

Our work set out to use a design thinking approach to develop an explainable ML (XML)–based decision support tool to assist clinicians. We augmented the design thinking approach by using the NASSS framework to help inform the development focus and direction, and added a formal evaluation framework from the report by Tonekaboni et al [ 4 ] to continuously focus our design on elements that would improve clinician trust in the tool. This research contributes to the body of health care literature that deeply integrates end users into the design and evaluation of XML in clinical decision support tools. As discussed, clinician trust is seen to be one of the key barriers to larger scale adoption of ML-based clinical decision support tools.

We believe that the approach described in our work is a unique and valuable contribution that outlines a direction for ML experts, UX designers, and clinician end users on how to collaborate in the creation of trustworthy and usable XML-based clinical decision support tools.

Acknowledgments

This research was funded by the National Research Council of Canada as part of the COVID-19 Pandemic Challenge Program. We thank Maryam Bafandkar and Yunong Shuang for their contributions to prototype design development. We thank Samira Rahimi, Amrita Sandhu, and Gauri Sharma for their contributions to the scoping review.

Data Availability

Two separate data sets were used in the completion of this work. The data sets used for developing the underlying disease severity and outcome prediction models are available through the repository of Cohen et al [ 2 , 20 ]. The focus group data generated and analyzed during the study are not publicly available owing to the consent obtained when the data were originally collected.

Authors' Contributions

MS, RH, and JH designed the study. RH, JH, and FT completed the rapid reviews. MS and VD conducted the focus groups. RH, JH, and MS developed and designed the prototype. MS, RH, JH, and VD wrote the manuscript.

Conflicts of Interest

None declared.

Description of nonadoption, abandonment, scale-up, spread, and sustainability of technology in health care (NASSS) categories and example physician comments.

  • Wiens J, Saria S, Sendak M, Ghassemi M, Liu VX, Doshi-Velez F, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med. Sep 2019;25(9):1337-1340. [ CrossRef ] [ Medline ]
  • Roberts M, Driggs D, Thorpe M, Gilbey J, Yeung M, Ursprung S, et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat Mach Intell. Mar 15, 2021;3(3):199-217. [ CrossRef ]
  • Bruckert S, Finzel B, Schmid U. The next generation of medical decision support: A roadmap toward transparent expert companions. Front Artif Intell. 2020;3:507973. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Tonekaboni S, Joshi S, McCradden M, Goldenberg A. What clinicians want: Contextualizing explainable machine learning for clinical end use. Proceedings of Machine Learning Research. 2019.:1-21. [ FREE Full text ]
  • Augmented intelligence in medicine. American Medical Association. URL: https://www.ama-assn.org/amaone/augmented-intelligence-ai [accessed 2024-03-10]
  • Gade K, Geyik S, Kenthapadi K, Mithal V, Taly A. Explainable AI in Industry: Practical Challenges and Lessons Learned. In: WWW '20: Companion Proceedings of the Web Conference 2020. 2020. Presented at: The Web Conference 2020; April 20-24, 2020; Taipei Taiwan. [ CrossRef ]
  • Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. May 2019;1(5):206-215. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Liao T, Taori R, Raji D, Schmidt L. Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning. In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks. 2021. Presented at: 35th Conference on Neural Information Processing Systems, NeurIPS 2021; December 6-14, 2021; Virtual, Online.
  • Holzinger A, Biemann C, Pattichis C, Kell D. What do we need to build explainable AI systems for the medical domain? arXiv. 2017. URL: https://arxiv.org/abs/1712.09923 [accessed 2024-03-10]
  • Abdul A, Vermeulen J, Wang D, Lim B, Kankanhalli M. Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda. In: CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 2018. Presented at: 2018 CHI Conference on Human Factors in Computing Systems; April 21-26, 2018; Montreal QC Canada. [ CrossRef ]
  • Schwartz JM, Moy AJ, Rossetti SC, Elhadad N, Cato KD. Clinician involvement in research on machine learning-based predictive clinical decision support for the hospital setting: A scoping review. J Am Med Inform Assoc. Mar 01, 2021;28(3):653-663. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chen H, Gomez C, Huang C, Unberath M. Explainable medical imaging AI needs human-centered design: guidelines and evidence from a systematic review. NPJ Digit Med. Oct 19, 2022;5(1):156. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Brown T, Katz B. Change by design. J of Product Innov Manag. Mar 07, 2011;28(3):381-383. [ CrossRef ]
  • Brown T. Design thinking. Harv Bus Rev. Jun 2008;86(6):84-92, 141. [ Medline ]
  • Greenhalgh T, Wherton J, Papoutsi C, Lynch J, Hughes G, A'Court C, et al. Beyond adoption: A new framework for theorizing and evaluating nonadoption, abandonment, and challenges to the scale-up, spread, and sustainability of health and care technologies. J Med Internet Res. Nov 01, 2017;19(11):e367. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Levac D, Colquhoun H, O'Brien KK. Scoping studies: advancing the methodology. Implement Sci. Sep 20, 2010;5:69. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Greenhalgh T, Abimbola S. The NASSS Framework - A synthesis of multiple theories of technology implementation. Stud Health Technol Inform. Jul 30, 2019;263:193-204. [ CrossRef ] [ Medline ]
  • Fereday J, Muir-Cochrane E. Demonstrating rigor using thematic analysis: A hybrid approach of inductive and deductive coding and theme development. International Journal of Qualitative Methods. Nov 29, 2016;5(1):80-92. [ CrossRef ]
  • Cohen JP, Dao L, Roth K, Morrison P, Bengio Y, Abbasi AF, et al. Predicting COVID-19 pneumonia severity on chest X-ray with deep learning. Cureus. Jul 28, 2020;12(7):e9448. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • COVID Severity. GitHub. URL: https://github.com/mlmed/covid-severity [accessed 2021-10-22]
  • Ribeiro M, Singh S, Guestrin C. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In: KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. Presented at: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13-17, 2016; San Francisco, CA, USA. [ CrossRef ]
  • Figma. URL: https://www.figma.com/ [accessed 2024-03-11]
  • LIME Python package. GitHub. URL: https://github.com/marcotcr/lime/commit/ba43ac9ee0d6153aebbe271080efacb4cec8d684 [accessed 2024-03-10]
  • Quickshift Segmentation Algorithm. GitHub. URL: https://github.com/scikit-image/scikit-image/blob/main/skimage/segmentation/_quickshift.py [accessed 2024-03-10]
  • What is Design Thinking and Why Is It So Popular? Interaction Design Foundation. URL: https:/​/www.​interaction-design.org/​literature/​article/​what-is-design-thinking-and-why-is-it-so-popular [accessed 2024-03-10]
  • Dash Enterprise. Plotly. URL: https://plotly.com/dash/ [accessed 2024-03-10]
  • Doshi-Velez F, Kim B. Towards A Rigorous Science of Interpretable Machine Learning. arXiv. URL: https://arxiv.org/abs/1702.08608 [accessed 2024-03-10]
  • scikit-optimize. GitHub. URL: https://scikit-optimize.github.io/stable/modules/generated/skopt.optimizer.forest_minimize.html [accessed 2024-03-10]
  • Wang D, Yang Q, Abdul A, Lim B. Designing Theory-Driven User-Centric Explainable AI. In: CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019. Presented at: 2019 CHI Conference on Human Factors in Computing Systems; May 4-9, 2019; Glasgow, Scotland, UK. [ CrossRef ]
  • Pumplun L, Fecho M, Wahl N, Peters F, Buxmann P. Adoption of machine learning systems for medical diagnostics in clinics: Qualitative interview study. J Med Internet Res. Oct 15, 2021;23(10):e29301. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Andersen T, Nunes F, Wilcox L, Coiera E, Rogers Y. Introduction to the special issue on human-centred AI in healthcare: Challenges appearing in the wild. ACM Trans. Comput. Hum. Interact. Jun 30, 2023;30(2):1-12. [ CrossRef ]
  • Nalin K. A distributed decision making process from a systems perspective: following 33 patients at an emergency department. ResearchGate. 2017. URL: https:/​/www.​researchgate.net/​publication/​315729386_A_distributed_decision_making_process_from_a_systems_perspective_following_33_patients_at_an_emergency_department [accessed 2024-03-10]

Edited by A Mavragani; submitted 04.07.23; peer-reviewed by K Nalin, C Manlhiot, A Oostdyk; comments to author 20.10.23; revised version received 26.01.24; accepted 19.02.24; published 16.04.24.

©Michael Shulha, Jordan Hovdebo, Vinita D’Souza, Francis Thibault, Rola Harmouche. Originally published in JMIR Formative Research (https://formative.jmir.org), 16.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.

Crime Rate Prediction Using Machine Learning and Data Mining

  • Conference paper
  • First Online: 28 November 2020
  • Cite this conference paper

  • Sakib Mahmud 18 ,
  • Musfika Nuha 18 &
  • Abdus Sattar 18  

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1248)

592 Accesses

8 Citations

Crime analysis is a methodological approach to identifying and assessing criminal patterns and trends, and crime costs our communities profoundly in a number of respects. We travel to many places for our daily purposes, and in everyday life we face numerous safety problems such as hijacking, kidnapping, and harassment. Typically, when we need to go somewhere, we first consult Google Maps, which shows one or more routes to the destination; we tend to choose the shortest route without knowing whether the path is actually safe, which is why we encounter many unpleasant situations. In this work, we apply different clustering approaches from data mining to analyze the crime rate of Bangladesh, and we use the K-nearest neighbor (KNN) algorithm to train on our dataset, drawing on both primary and secondary data. By analyzing the data, we estimate the prediction rate of different crimes for many places and apply the algorithm to determine the predicted crime rate along a path. Finally, we use this forecast rate to identify a safe route. This work will help individuals become aware of high-crime areas and discover a secure way to their destination.
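The route-selection idea in the abstract can be sketched with a small example: label each point along a candidate route with KNN over historical incident locations, then prefer the route with the lowest predicted risk. The coordinates, labels, and k value below are synthetic illustrations, not the authors' dataset; a real system would use geocoded incident reports and properly scaled features.

```python
from collections import Counter
from math import dist

# Hypothetical historical incident records: (latitude, longitude) -> risk label.
# These points and labels are made up for illustration only.
HISTORY = [
    ((23.75, 90.36), "high"),
    ((23.75, 90.37), "high"),
    ((23.76, 90.36), "high"),
    ((23.81, 90.41), "low"),
    ((23.80, 90.42), "low"),
    ((23.81, 90.42), "low"),
]

def knn_predict(point, k=3):
    """Label a (lat, lon) point by majority vote of its k nearest incidents."""
    nearest = sorted(HISTORY, key=lambda rec: dist(rec[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def route_risk(route, k=3):
    """Fraction of points along a candidate route predicted 'high'."""
    preds = [knn_predict(p, k) for p in route]
    return preds.count("high") / len(preds)

# Two candidate routes between the same endpoints; prefer the lower-risk one.
route_a = [(23.75, 90.36), (23.76, 90.37)]  # passes through the high-crime cluster
route_b = [(23.80, 90.41), (23.81, 90.42)]  # passes through the low-crime cluster
safer = min([route_a, route_b], key=route_risk)
print(route_risk(route_a), route_risk(route_b))  # 1.0 0.0
```

If additional features such as hour of day were included, they would need scaling before a Euclidean distance is meaningful; a production system would also weight incidents by recency and severity, which the clustering step described in the paper could supply.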


Lin, Y., Chen, T., Yu, L.: Using machine learning to assist crime prevention. In: 2017 sixth IIAI International Congress on Advanced Applied Science (IIAI-AAI) (2017)

Munasinghe, M., Perera, H., Udeshini, S., Weerasinghe, R.: Machine learning based criminal short listing using modus operandi features (2015). https://doi.org/10.1109/icter.2015.7377669

Chauhan, C., Sehgal, S.: A review: crime analysis exploitation data processing techniques and algorithms, pp. 21–25 (2017). https://doi.org/10.1109/ccaa.2017.8229823

Anon: A review: crime analysis using data mining techniques and algorithms. [online] Available at https://www.researchgate.net/publication/322000460_A_review_Crime_analysis_using_data_mining_techniques_and_algorithms . Accessed 30 Aug 2019

Kerr, J.: Vancouver police go high tech to predict and prevent crime before it happens. Vancouver Courier , July 23, 2017. [Online] Available https://www.vancourier.com/news/vancouver-police-go-high-tech-topredict-and-prevent-crime-before-it-happens-1.21295288 . Accessed 09 Aug 2018

Marchant, R., Haan, S., Clancey, G., Cripps, S.: Applying machine learning to criminology: semi parametric spatial demographic Bayesian regression. Security Inform. 7 (1) (2018)

Acknowledgements

We are very grateful to Daffodil International University for offering us the chance to take part in the independent research study that led to this work. Many thanks to Mr. Abdus Sattar, Assistant Professor, Department of Computer Science and Engineering at Daffodil International University, for the innumerable discussions and feedback that helped us complete this work successfully.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Daffodil International University, 102 Sukrabadh, Mirpur Road, Dhaka, 1207, Bangladesh

Sakib Mahmud, Musfika Nuha & Abdus Sattar

Corresponding author

Correspondence to Abdus Sattar .

Editor information

Editors and Affiliations

Sikkim Manipal Institute of Technology, Majhitar, Sikkim, India

Samarjeet Borah

Ratika Pradhan

Department of Computer Science and Engineering, JIS University, Kolkata, West Bengal, India

Nilanjan Dey

GLA University, Mathura, Uttar Pradesh, India

Phalguni Gupta

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mahmud, S., Nuha, M., Sattar, A. (2021). Crime Rate Prediction Using Machine Learning and Data Mining. In: Borah, S., Pradhan, R., Dey, N., Gupta, P. (eds) Soft Computing Techniques and Applications. Advances in Intelligent Systems and Computing, vol 1248. Springer, Singapore. https://doi.org/10.1007/978-981-15-7394-1_5

DOI : https://doi.org/10.1007/978-981-15-7394-1_5

Published : 28 November 2020

Publisher Name : Springer, Singapore

Print ISBN : 978-981-15-7393-4

Online ISBN : 978-981-15-7394-1

eBook Packages : Intelligent Technologies and Robotics (R0)
