The weather was quite warm on the evening of March 22, 2012. It had been a particularly pleasant walk from the train station to the bar in Lakeview, a pretty decent neighborhood in Chicago. As I sipped my gin and tonic, I listened to Brett Goldstein, the Chief Data Officer of Chicago, talk about using distress call data to predict violent events before they occurred. This was a meetup of the Data Science Chicago group, and Adam Pah wrote about it in a blog post. Along with about a hundred other people, I enjoyed hearing more about the serious efforts to increase transparency, make data publicly available through practical interfaces, and find new ways of using the City’s tremendous amounts of data. But I was particularly interested in his work on predictive analytics. The police have limited resources. If we could use past distress call and crime records to accurately calculate the probability of a new event occurring in each neighborhood during each time slot, we could maximize the efficiency of these limited resources. Instead of reacting to a crime, increased patrols in an area could prevent it.
I left the bar and felt perfectly safe as I went home. Back then, I lived in Rogers Park, a not so decent neighborhood in Chicago. The night before, a man had been stabbed half a block away from my building. That very day, at 2:30 pm, a person had been shot dead right in front of my building. The police had said that two sub-factions of a gang were at war, and apparently I lived in the middle of the battlefield. I was biting my nails, strongly wishing that the data scientists in the city would hurry up and start predicting this stuff accurately, so that I wouldn't die from a stray bullet as I carried oranges home from the grocery store.
The academic approach to predicting and preventing violent events has all the characteristics of the new Big Data Age. Our capability to collect, store, and analyze large amounts of data has increased exponentially. When it comes to efficient use of information, not all of our mathematical models and predictive algorithms can surpass human instinct, our brains’ powerful, complex heuristics of pattern recognition. However, these (perhaps not as smart) algorithms have an advantage over our almighty brains: the sheer amount of data they can incorporate without any bias. As we come up with cleverer ways to treat all this data, it might just be possible to get one step further than human detectives’ current expert analysis built on personal experience. An even more likely path is to develop models that can simplify huge data sets and convert them into a form that expert humans can understand instinctively, thereby using all the tools we have: math, computers, data, and brains with experience.
The U.S. military has been using this approach for some time now. Data on insurgent attacks were used to predict new incidents in the Iraq and Afghanistan campaigns. However, it is still unclear how effective these new counter-insurgency techniques turned out to be. In a military campaign, it is difficult to decouple other effects and evaluate such efforts precisely. A recent Nature news article reports that these military methods are now being evaluated on domestic ground, with police using military tactics to stop gangs and drug dealers. The counter-insurgency doctrine is being tested in Springfield, Massachusetts. It includes data collection techniques developed by the military, social network analysis to identify key gang leaders, and computational methods to predict possible crimes. Kevin Kit Parker from Harvard University and his team are collecting data of their own to assess the success of this approach. Preliminary results are impressive. According to John Barbieri, deputy chief of police, crime rates in Springfield have dropped 62% since the first year of implementation.
Clearly, there is an interesting ethical discussion here about treating citizens as potential criminals, and about how far predicting violent events could go, perhaps even into the territory of human rights and the presumption of innocence. But setting that aside and focusing on potential crime rates in specific locations, I find current developments in data analysis very promising. It may take some time to perfect, though. The Big Data Age is quite new and shiny, and its tools are still improving. There are many pitfalls we are still learning to avoid. I can’t help but think of an imaginary, terrible episode of Numb3rs, the TV show where they used math to solve crimes. I imagine a scientist looking at a lot of points on a graph depicting crime statistics. He says, “Let’s look at this on a log-log plot.” And when he does, his eyes shine brightly. He draws a straight line through the middle of the clearly curved, horribly noisy data, which spans a single decade. “I’ve got it,” he shouts with joy. “It’s a power law!”
I have no doubt that one day, soon, we will master predictive analytics of crime and our neighborhoods will be safer. This is not a case of “In a future with jetpacks, Robocop will stop crime before it happens.” Preventing at least some violence is within our reach. The only question on my mind on that day was “Will I survive the gang wars to see the day?” Instead of waiting to find out, I moved to a much safer neighborhood. I am happy to report that my predictive analytics were successful: I am still alive.