It’s no big secret that we live in a surveillance state. The average American is caught on CCTV camera an estimated 75 times a day. Meanwhile an average Londoner, the world’s most photographed person, is snapped on public and private security cameras an estimated 300 times every 24 hours.
But if you thought that the future was just more cameras in more places, you’d better think again. Thanks to breakthroughs in fields like computer vision, tomorrow’s surveillance society is going to be a whole lot more advanced. The amount of information that can be extracted from video footage is increasing all the time. As a result, instead of simply producing static recordings made for future reference, or live feeds viewed by bored workers watching banks of monitors at the same time, CCTV is getting smarter. Way smarter. I’m talking multiple orders of magnitude smarter, and a whole lot more mobile, too.
The eye in the sky
One such new technology, the so-called Aerial Suspicious Analysis (ASANA), is the work of computer vision company Skylark Labs. ASANA is a drone-based security system, designed to spot suspicious activity in crowds of hundreds of people from anywhere between 10 feet and 300 feet. Drones equipped with its cloud-based technology can hover over large gatherings of people and accurately identify behavior such as fighting, kicking, stabbing, fighting with weapons, and more. That’s alongside non-suspicious activities like high-fiving, dancing, and hugging, used to make the system more robust to false alerts. The system is also being modified to spot suspicious objects, such as unattended bags. Alerts can then be sent to the relevant authorities.
“The system first detects the posture of every human in a scene,” founder and CEO Amarjot Singh told Digital Trends. “The human pairs which are within certain predetermined proximity are jointly analyzed, based on the motion of the postures, to determine if an alert should be generated. Once it is determined that the pair is involved in a suspicious event, the posture of the individual human is further analyzed to determine if one of them, or both, are responsible for the alert — after which one or both are tracked through the crowd. The system also can re-identify the individuals if they move out of the visual field of the drone.”
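The proximity step Singh describes, in which only people standing close to one another are analyzed together, can be illustrated with a minimal Python sketch. This is not Skylark Labs’ code: the function name `close_pairs`, the use of a single 2D point per person, and the distance threshold are all assumptions made for illustration, and the pose detection and motion analysis themselves are left out.

```python
from itertools import combinations
from math import hypot


def close_pairs(positions, max_distance):
    """Return pairs of person IDs whose positions lie within max_distance.

    positions: dict mapping a person ID to an (x, y) point, e.g. the center
    of that person's detected posture (hypothetical input format).
    max_distance: the "predetermined proximity" threshold (assumed units).
    """
    pairs = []
    # Sort by ID so the output order is deterministic.
    for (id_a, pos_a), (id_b, pos_b) in combinations(sorted(positions.items()), 2):
        distance = hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])
        if distance <= max_distance:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical positions for three people in one frame.
frame = {"p1": (0.0, 0.0), "p2": (1.0, 0.0), "p3": (10.0, 10.0)}
candidate_pairs = close_pairs(frame, max_distance=2.0)
```

In a system like the one described, only the pairs returned here would go on to the more expensive motion analysis, which is presumably how the approach stays practical for crowds of hundreds.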
An A.I. graduate from Cambridge University and postdoctoral fellow at Stanford University, Singh said that he was inspired to develop the technology after the 2013 Boston Marathon bombings and the 2017 Manchester Arena bombing, in which a combined 26 people died and many more were wounded.
“The current CCTV type surveillance systems were not enough to identify these attackers in time,” Singh said. “This is due to the limited field of view of [CCTV cameras], which makes it possible for aggressive individuals to avoid detection. A surveillance system mounted on the drone is likely to capture these individuals due to its large field of view. Attacks like these could be prevented in future if surveillance cameras can automatically spot suspicious behavior, like someone leaving a bag unattended for a long period.”
Major security breaches are frequently drivers of enhanced security technology. As I’ve written before, the tragic events of 9/11, prior to which an image of hijackers Mohamed Atta and Abdulaziz Alomari passing through airport security was not flagged, prompted a flurry of investment in facial recognition systems. As Skylark Labs shows, however, spotting wrongdoers can be accomplished in plenty of ways that go beyond identifying the face of a wanted criminal.
Challenges to solve
Could such pose-spotting technology be used to identify potential troublemakers by their behavior before they cause any actual trouble? “Certain signs of aggression can be captured by these A.I. systems, but it might not be a good idea to use those to flag people as they might not necessarily act on it,” Singh said. “That being said, it could also prevent certain crimes from happening if … aggression is detected in advance.”
It’s worth noting, of course, that such smart cameras don’t necessarily have to be used to identify aggressive behavior. There are plenty of scenarios in which it might be desirable to spot certain other behavior in advance so that it can be acted upon. For instance, authorities have long used computer vision algorithms on the London Underground subway system to watch for potential suicide jumpers. The tech works by looking for individuals who wait on platforms for at least 10 minutes and miss multiple available trains during that time. If this occurs, the A.I. triggers an alarm.
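The waiting rule described above is simple enough to sketch in a few lines of Python. This is an illustration of the stated rule only, not the system used on the Underground: the names `PlatformTrack` and `should_alert` are hypothetical, "multiple" trains is assumed to mean two or more, and a real system would first need computer vision to track each person on the platform.

```python
from dataclasses import dataclass

DWELL_THRESHOLD_S = 10 * 60   # "at least 10 minutes", as described in the article
MISSED_TRAINS_THRESHOLD = 2   # assumption: "multiple" read as two or more


@dataclass
class PlatformTrack:
    """One tracked person on a platform (hypothetical structure)."""
    arrived_at: float        # time first seen, in seconds
    still_present_at: float  # most recent time seen, in seconds


def should_alert(track, train_departures):
    """Apply the rule: waited long enough AND let several trains leave."""
    dwell = track.still_present_at - track.arrived_at
    # Count trains that departed while the person was on the platform.
    missed = sum(
        1 for t in train_departures
        if track.arrived_at <= t <= track.still_present_at
    )
    return dwell >= DWELL_THRESHOLD_S and missed >= MISSED_TRAINS_THRESHOLD


# Someone who has waited 12 minutes while three trains departed.
long_wait = should_alert(PlatformTrack(0, 720), [180, 420, 660])
```

Requiring both conditions matters: a long wait alone could just mean a delayed service, while missed trains alone could mean someone waiting for a different line.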
Similarly, recent European Union proposals state that all new cars on the road, as of mid-2022, must be equipped with special driver monitoring systems capable of telling whether the person behind the wheel is distracted. To help with this, Israeli computer vision startup Eyesight Technologies has developed in-car monitoring technology that can spot when drivers are tired, using a smartphone, or even smoking at the wheel. The system can then send out an alert.
Needless to say, this area nonetheless presents a possible ethical minefield. As could be seen by the numerous people who piled up on both sides of the Apple vs. FBI debate from several years ago, when Apple refused to create a backdoor into its iOS operating system to help investigators, there is no consensus opinion when it comes to surveillance.
Things become even dicier when you add in the possibly related topic of predictive policing, or the possibility that similar technology could be used to target and intimidate marginalized groups by spotting other forms of banned behavior. In certain countries that could be used to surveil same-sex couples exhibiting affectionate behavior. In other scenarios it could be exploited to alert authorities to gathering crowds in areas where no crowds are expected.
Enter the Panopticon
The model for today’s surveillance technologies is often said to be the Panopticon, a prison concept designed by the English social theorist Jeremy Bentham in the 18th century. The Panopticon asks us to imagine a central watchtower in a courtyard, surrounded by a circular arrangement of cells. There is a guard in the watchtower, which shines a bright light outwards so that they can see everyone in the cells.
However, the people in the cells are unable to see the watchman. As a result, they assume that they are always being closely observed. As the French philosopher Michel Foucault observed in his 1975 book Discipline and Punish, the result is a form of asymmetrical surveillance: the prisoners, despite outnumbering the guard, moderate their own behavior under the assumption that they are being watched.
While, theoretically, anyone (or even no-one) could be in the Panopticon’s hypothetical watchtower, the way subjects moderate their behavior depends on their regard for the person that is supposedly watching them. A theoretical prisoner is going to act differently if they’re being watched by a seasoned veteran, capable of spotting even the most surreptitious of behavior, compared to a novice.
We have yet to see what behavior-moderating impact the next generation of surveillance tools will have on the populace. Combine pose-predicting technology in public with the increased use of facial recognition everywhere from bars to hotels, and people are going to be more aware than ever that they are constantly being watched. In some contexts, that could be a great thing. In others, it could be concerning.
“I think this kind of technology can be used to develop effective safety and security systems that can aid law enforcement and can ensure the safety of individuals in public places,” Singh said. “Therefore I think the development of these systems is a good thing and is also necessary. That being said, I think these systems should be used with caution and should be monitored for specific applications like defense, similar to nuclear technology.”