
I'm a second-year Ph.D. student advised by Dr. Joy Arulraj in the School of Computer Science at the Georgia Institute of Technology. I earned my Bachelor's degree in Computer Science from the Indian Institute of Technology, Kanpur, in 2017 and a Master's degree from the Georgia Institute of Technology in 2021. Currently, I'm leading the research initiative to develop a Video Database Management System (EVA) at the Georgia Tech Databases Lab. I have interned with the Cloud SQL team at Google and the SQL team at Snowflake.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  GitHub

Research Interests

My research interests lie at the intersection of data management and machine learning. Specifically, I am developing a new video database management system, EVA, tailored to querying videos efficiently and accurately at scale. My research focuses on improving the resource efficiency, query capabilities, and usability of video database management systems by developing novel query optimization and execution algorithms.

News

  • New! 12/21 - EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views was accepted to SIGMOD 2022.
  • 10/21 - Mentoring undergraduate and Master's lab members collaborating on our academic Video Database Management System, EVA.
  • 9/21 - Started working on MAMBA - accelerating approximate query processing in video analytics.
  • 9/21 - Submitted a research draft of EVA to SIGMOD 2022.

Publications

EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views

Zhuangdi Xu*, Gaurav Tarlok Kakkar*, Joy Arulraj, Kishore Ramachandran

Paper  /  Poster

Advances in deep learning have led to a resurgence of interest in video analytics. In an exploratory video analytics pipeline, a data scientist often starts by searching for a global trend and then iteratively refines the query until they identify the desired local trend. These queries tend to have overlapping computation and often differ in their predicates. However, these predicates are computationally expensive to evaluate since they contain user-defined functions (UDFs) that wrap around deep learning models.

In this paper, we present EVA, a video database management system (VDBMS) that automatically materializes and reuses the results of expensive UDFs to facilitate faster exploratory data analysis. It differs from the state-of-the-art (SOTA) reuse algorithms in traditional DBMSs in three ways. First, it focuses on reusing the results of UDFs as opposed to those of sub-plans. Second, it takes a symbolic approach to analyze predicates and identify the degree of overlap between queries. Third, it factors reuse into UDF evaluation cost and uses the updated cost function in critical query optimization decisions like predicate reordering and model selection. Our empirical analysis of EVA demonstrates that it accelerates exploratory video analytics workloads by 4× with a negligible storage overhead (1.001×). We demonstrate that the reuse algorithm in EVA complements the specialized filters adopted in SOTA VDBMSs.
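
The core idea of reusing UDF results across overlapping queries can be illustrated with a minimal sketch. This is not EVA's implementation: the class `MaterializedUDF`, the stand-in model `fake_detector`, and the helper `run_query` are hypothetical names chosen for illustration, and the sketch omits the symbolic predicate analysis and cost-based optimization described above. It only shows that a second query over an overlapping frame range pays for the frames it has not seen before.

```python
from typing import Callable, Dict, Iterable, List


class MaterializedUDF:
    """Wraps an expensive per-frame function and stores its results by frame id."""

    def __init__(self, fn: Callable[[int], str]) -> None:
        self.fn = fn
        self.view: Dict[int, str] = {}  # frame id -> stored UDF result
        self.calls = 0                  # number of real (expensive) evaluations

    def __call__(self, frame_id: int) -> str:
        # Evaluate the wrapped function only for frames not yet materialized.
        if frame_id not in self.view:
            self.calls += 1
            self.view[frame_id] = self.fn(frame_id)
        return self.view[frame_id]


def fake_detector(frame_id: int) -> str:
    # Hypothetical stand-in for a deep learning model: an arbitrary labeling rule.
    return "car" if frame_id % 3 == 0 else "bus"


def run_query(udf: MaterializedUDF, frames: Iterable[int], label: str) -> List[int]:
    # Return the frame ids whose UDF result matches the requested label.
    return [f for f in frames if udf(f) == label]


detector = MaterializedUDF(fake_detector)

# First exploratory query: every frame in 0..9 must be evaluated.
first = run_query(detector, range(0, 10), "car")
calls_after_first = detector.calls

# Refined query with a different predicate over an overlapping range:
# frames 5..9 are served from the stored results, only 10..14 are evaluated.
second = run_query(detector, range(5, 15), "bus")
calls_after_second = detector.calls

print(first, calls_after_first)
print(second, calls_after_second)
```

In this toy setting the stored results are keyed by frame id alone; a real system would also need to account for which model and which inputs produced each result.
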
Key Projects

Exploratory Video Analytics: A Video Database Management System

Gaurav Tarlok Kakkar, Joy Arulraj

Code

EVA is a visual data management system (think MySQL for videos). It provides a simple, declarative, SQL-like interface to a wide range of commonly used computer vision models, which enables querying of visual data in user-facing applications. It improves throughput by introducing sampling, filtering, and caching techniques. It improves accuracy by introducing state-of-the-art model specialization and selection algorithms.