Highly Scalable Distributed Architecture for Spatio-Temporal Analytics on Large-Scale Camera Networks
CCTV camera networks have proliferated to unprecedented levels, with an estimated one billion cameras installed globally. These cameras play a crucial role in ensuring safety and security for individuals, businesses, and governments. However, the sheer volume of video data they generate poses significant challenges for storage, analysis, and query processing. In response to these challenges, Dr. Suresh Purini and his team at the International Institute of Information Technology (IIIT), Hyderabad, undertook a project titled "Highly Scalable Distributed Architecture for Spatio-Temporal Analytics on Large-Scale Camera Networks."
Project Objectives
The primary objectives of this project, led by Dr. Suresh Purini and co-led by Dr. Ravi Kiran Sarvadevabhatla, were as follows:
- Build a distributed infrastructure for video stream ingestion.
- Build and optimize the query processing system.
- Benchmark the system performance on a variety of data sets.
- Build the deployment infrastructure and demonstrate the entire project.
Achieving the Objectives
The project achieved all of its stated objectives. The team built a distributed infrastructure capable of ingesting video streams from a large number of cameras, developed and optimized a query processing system, benchmarked the system's performance, and built a deployment infrastructure to demonstrate the complete system.
Contributions to the Field
The significance of this project lies in its ability to address complex spatio-temporal analytics challenges in large-scale camera networks. The key contributions and insights include:
Semantic Scene Analysis (SSA)
The project delves into SSA, a cutting-edge area of research at the intersection of computer vision and natural language processing. SSA involves generating textual descriptions of scenes through object recognition and establishing relationships among recognized objects.
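The idea of turning recognized objects and their relationships into text can be illustrated with a minimal sketch. It is not the project's SSA pipeline: the `Detection` type, the `describe_scene` function, and the single "left of" relationship are hypothetical, and the detections are assumed to come from some upstream object recognizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One recognized object (hypothetical structure, for illustration only)."""
    label: str
    x: float  # horizontal center of the bounding box, in pixels
    y: float  # vertical center of the bounding box, in pixels


def describe_scene(detections):
    """Return one sentence per horizontally adjacent pair of objects."""
    # Order objects from left to right, then relate each object to its neighbor.
    ordered = sorted(detections, key=lambda d: d.x)
    return [
        f"{left.label} is left of {right.label}"
        for left, right in zip(ordered, ordered[1:])
    ]
```

A real system would use richer relationships (for example, distance, direction of travel, or interaction), but the output has the same character: short text derived from detections.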
Cloud-Fog Architecture
The team proposes a novel cloud-fog distributed system architecture. In this architecture, deep learning pipelines deployed on fog nodes analyze video streams from connected CCTV cameras and generate textual Scene Description Records (SDRs). Only these SDRs, rather than the video itself, are transmitted to cloud data centers, which reduces network bandwidth consumption and congestion.
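To give a rough sense of why sending text instead of video saves bandwidth, the sketch below encodes a made-up SDR as JSON and compares its size with one uncompressed frame. The source does not specify the SDR format, so the `SceneDescriptionRecord` fields, the JSON encoding, and the helper names are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SceneDescriptionRecord:
    """Hypothetical SDR layout; the actual record format is not given in the source."""
    camera_id: str
    timestamp: float  # seconds since the epoch
    objects: list = field(default_factory=list)  # e.g. [{"label": "car", "color": "red"}]


def encode_sdr(sdr):
    """Serialize an SDR to compact UTF-8 JSON, as a fog node might before upload."""
    return json.dumps(asdict(sdr), separators=(",", ":")).encode("utf-8")


def raw_frame_bytes(width, height, channels=3):
    """Size of one uncompressed 8-bit frame, used only as a point of comparison."""
    return width * height * channels


sdr = SceneDescriptionRecord(
    camera_id="cam-001",  # made-up identifier
    timestamp=1700000000.0,
    objects=[{"label": "car", "color": "red"}, {"label": "truck", "color": "white"}],
)
payload = encode_sdr(sdr)
print(len(payload), "bytes of text vs", raw_frame_bytes(1920, 1080), "bytes per raw frame")
```

Real camera streams are compressed, so the actual ratio is smaller than this comparison suggests; the point is only that a textual record is far more compact than the pixels it describes.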
Scalability and Extensibility
The architecture allows for easy scalability by adding more fog nodes and mini data centers as needed. It is also highly extensible, accommodating different deep learning models and complex query types.
Real-Time Vehicle Pursuit
As a case study, the project demonstrates a real-time vehicle pursuit algorithm that leverages the system's capabilities to handle multiple spatio-temporal queries and coordinate across multiple data centers.
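A pursuit of this kind can be thought of as a chain of spatio-temporal queries, each centered on the most recent sighting. The sketch below is a simplified illustration under stated assumptions, not the project's algorithm: the `Sighting` type, the planar kilometer coordinates, the fixed time window and search radius, and the in-memory list standing in for the SDR store are all hypothetical, and coordination across data centers is not modeled.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Sighting:
    """A vehicle observation derived from an SDR (hypothetical structure)."""
    vehicle_id: str
    camera_id: str
    x: float  # camera position, km (planar approximation)
    y: float
    t: float  # observation time, seconds


def spatio_temporal_query(sightings, vehicle_id, t_start, t_end, center, radius_km):
    """Return sightings of one vehicle inside a time window and a circular region."""
    cx, cy = center
    hits = [
        s for s in sightings
        if s.vehicle_id == vehicle_id
        and t_start <= s.t <= t_end
        and hypot(s.x - cx, s.y - cy) <= radius_km
    ]
    return sorted(hits, key=lambda s: s.t)


def pursue(sightings, vehicle_id, start, window_s, radius_km):
    """Follow a vehicle by re-querying around its latest known sighting."""
    track = [start]
    while True:
        last = track[-1]
        hits = spatio_temporal_query(
            sightings, vehicle_id, last.t, last.t + window_s, (last.x, last.y), radius_km
        )
        later = [s for s in hits if s.t > last.t]  # strictly later, so the loop ends
        if not later:
            return track
        track.append(later[0])
```

In a deployed system, each query would run against the data center covering the search region, so a pursuit that crosses region boundaries would have to hand off between data centers.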
Experimental Methodology and Artifact Availability
The researchers used a cluster of five servers as a mini data center. The fog node software was containerized so that it can be deployed on any CPU-GPU node. The video sources for the experiments were pre-recorded CCTV camera feeds from the STREETS dataset [14], so the evaluation used real-world footage. The team systematically benchmarked system performance, including vehicle-tracking accuracy, and made its artifacts available for others to build upon. The framework has considerable potential for improving video surveillance systems.
Conclusion and Future Scope
The project's success in addressing the challenges of video analytics in large-scale camera networks is a significant achievement. It opens up new possibilities for various domains beyond road camera networks, including surveillance networks in public places like malls and airports.
In the future, the team plans to explore the application of their architecture in different domains, leveraging semantic scene analysis and cloud-fog distributed systems to improve video analytics. The architecture's scalability, extensibility, and efficiency make it a promising solution for addressing the data deluge generated by the ever-expanding network of CCTV cameras worldwide.