BackdropAI: AI-powered context-based background changes for immersive live streaming

Padmaja



Description:

Traditional video conferencing and live-streaming platforms often lack the ability to change backgrounds dynamically based on the content being presented, which makes the experience less engaging and immersive for viewers. Our challenge is to develop an AI-powered solution that seamlessly alters the background to match the context of the content being presented, catering to use cases such as news reading, visualising verbal ideas, real-time presentations, and speech-based video searches.

Using only the Python ecosystem, the speakers built a tool that dynamically changes the background of a live-streaming video based on the context of what the speaker is saying. The tool takes the speaker’s video and audio as input and processes them with OpenCV and Whisper. The core machine learning component generates images from the audio input, which was achieved using DALL-E.
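The data flow described above (audio in, transcript, short prompt, generated background out) can be pictured with a minimal sketch. This is not the speakers' implementation: the class name `BackdropPipeline` and the three injected callables are hypothetical stand-ins for a speech-to-text model such as Whisper, the LLM summarisation step listed in the outline, and an image generator such as DALL-E. The choice to regenerate only when the prompt changes is also an assumption, added because image generation is typically the slowest step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BackdropPipeline:
    """Audio chunk -> transcript -> short prompt -> background image.

    All three callables are hypothetical placeholders; a real system would
    wrap a speech-to-text model, an LLM summariser, and an image generator.
    """

    transcribe: Callable[[bytes], str]
    summarise: Callable[[str], str]
    generate_image: Callable[[str], bytes]
    last_prompt: Optional[str] = None
    background: Optional[bytes] = None

    def update(self, audio_chunk: bytes) -> Optional[bytes]:
        """Process one audio chunk and return the current background."""
        text = self.transcribe(audio_chunk).strip()
        if not text:
            # Silence or an empty transcript: keep the current backdrop.
            return self.background
        prompt = self.summarise(text)
        if prompt != self.last_prompt:
            # Assumption: only call the slow image generator on a topic change.
            self.background = self.generate_image(prompt)
            self.last_prompt = prompt
        return self.background


# Demo with trivial stubs instead of real models (illustration only).
demo = BackdropPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    summarise=lambda text: text.split()[-1].lower(),
    generate_image=lambda prompt: f"image-for:{prompt}".encode("utf-8"),
)
demo_background = demo.update(b"Today we talk about monsoon")
```

Passing the models in as callables keeps the sketch runnable without any API keys and makes each stage easy to swap or mock when measuring latency.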

The speakers built the front end of the system with the Streamlit Python package and used streamlit-webrtc for the real-time media streams. Streamlit is aimed at data scientists who want to rapidly build UIs and deploy machine learning models using only Python (no knowledge of HTML, CSS, or JavaScript needed).

In this talk, the speakers show how they built the tool using only Python, discuss their design choices, and describe the challenges they faced and how they solved them.

The structure of the talk is as below:

  • Problem Statement Overview
  • Architecture overview
  • Data Preprocessing using Whisper and OpenCV
  • BackdropAI components
    • Text summarisation using LLM
    • Image generation using DALL-E
    • Merging live video with changing backgrounds
  • Building web application using Streamlit
  • Deploying the application on Cloud
  • Challenges
    • Latency
    • Scale up
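
The "Merging live video with changing backgrounds" item in the outline above can be illustrated with per-pixel mask compositing. The proposal does not say how the speaker is separated from the original background, so this sketch assumes a segmentation mask is already available (1.0 where the speaker is, 0.0 elsewhere). The function name `composite` and the nested-list image representation are illustrative choices made to keep the example dependency-free; with OpenCV, frames would normally be NumPy arrays and the same formula would be applied in vectorised form.

```python
from typing import List, Tuple

Pixel = Tuple[int, ...]      # (R, G, B), each channel 0-255
Image = List[List[Pixel]]    # rows of pixels
Mask = List[List[float]]     # 1.0 = speaker, 0.0 = background


def composite(frame: Image, background: Image, mask: Mask) -> Image:
    """Blend the live frame over a new background.

    For every channel: out = m * frame + (1 - m) * background,
    where m is the mask value at that pixel.
    """
    out: Image = []
    for frame_row, bg_row, mask_row in zip(frame, background, mask):
        row: List[Pixel] = []
        for fg, bg, m in zip(frame_row, bg_row, mask_row):
            row.append(tuple(round(m * f + (1.0 - m) * b) for f, b in zip(fg, bg)))
        out.append(row)
    return out


# A 1x2 "frame": the left pixel belongs to the speaker, the right one does not.
live_frame = [[(255, 0, 0), (255, 0, 0)]]
generated_bg = [[(0, 0, 255), (0, 0, 255)]]
speaker_mask = [[1.0, 0.0]]
merged = composite(live_frame, generated_bg, speaker_mask)
```

Fractional mask values give soft edges around the speaker, which tends to look more natural than a hard cut-out.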

Prerequisites:

Basic Python skills and introductory knowledge of AI.

Speaker Info:

Manisha is a Data Scientist at Glance, InMobi Group, with five years of professional experience. She currently works on Glance's Personalization team, where she specializes in developing end-to-end recommendation systems. She previously worked as a software developer at VMware, which strengthened her expertise in building large-scale software systems.

Padmaja is a Data Scientist working at Glance. Her focus is on enhancing customer serviceability using machine learning. She has experience working in personalization, monetization, and customer acquisition. In the past, she has used Deep Learning for automatic music generation and natural language understanding. She has previously given talks at PyCon India, PyCon US, and DevConf India. She has a strong passion for dance and loves going on adventurous trips.

Speaker Links:

Manisha LinkedIn: https://www.linkedin.com/in/r-manisha

Padmaja LinkedIn: https://www.linkedin.com/in/padmajavb/

Padmaja GitHub: https://github.com/PadmajaVB

Section: Data Science, AI & ML
Type: Talks
Target Audience: Beginner