Efficient ML: Achieving Low Latency in Real-Time Systems

SIDDHARTH SAHANI (~siddharth8) | 09 Jun, 2024

Description:

Kayzen's real-time bidding system processes more than 125 billion ad requests a day from across the internet. Under a strict Service Level Agreement (SLA), the system aims to handle the majority of these requests within a 100-150 millisecond round-trip latency. Each ad auction involves hundreds of competing advertisers vying for placement. Critical stages of this process, including audience targeting, advertiser objective optimization, and ad selection, rely heavily on machine learning (ML) algorithms. Performing real-time inference and serving predictions at this scale under tight SLAs demands a robust and agile ML infrastructure.
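To put these figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes traffic is spread evenly over the day, which real ad traffic is not, so peak load will be higher. It also treats the whole 100-150 ms window as time spent in the system. The function names are illustrative and are not part of Kayzen's stack.

```python
# Back-of-envelope load estimate from the figures quoted in the description.
# Assumption: requests arrive uniformly over 24 hours (real traffic has peaks).
DAILY_REQUESTS = 125_000_000_000   # "more than 125 billion daily ad requests"
SECONDS_PER_DAY = 24 * 60 * 60
SLA_WINDOW_S = (0.100, 0.150)      # 100-150 ms round-trip latency


def average_rps(daily_requests: int) -> float:
    """Average requests per second under the uniform-traffic assumption."""
    return daily_requests / SECONDS_PER_DAY


def in_flight(rps: float, latency_s: float) -> float:
    """Little's law: mean concurrency = arrival rate x mean time in system."""
    return rps * latency_s


if __name__ == "__main__":
    rps = average_rps(DAILY_REQUESTS)
    print(f"average load: {rps:,.0f} requests/s")
    for latency in SLA_WINDOW_S:
        concurrent = in_flight(rps, latency)
        print(f"in flight at {latency * 1000:.0f} ms: {concurrent:,.0f} requests")
```

Under these assumptions the average load works out to roughly 1.45 million requests per second, with on the order of 145,000 to 217,000 requests in flight at any moment.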

In this session, we'll delve into the complexities of building a low-latency machine learning system that can support the volume and throughput requirements of ad serving. We'll explore how to engineer an adaptable, scalable framework that delivers the real-time predictions crucial to the ad auction process. This involves ML models trained frequently on vast amounts of ever-evolving ad transaction data. We'll also discuss the lessons learned along the way, including strategies for performance optimization, latency reduction, and system reliability.

Introduction
- Brief introduction to Kayzen's ad-serving system and its scale
- Emphasis on the importance of low-latency systems in real-time ad auctions

Understanding the Ad-Serving Process
- Overview of the ad-serving workflow
- Explanation of the Service Level Agreements (SLAs) and latency requirements
- Introduction to the challenges in handling over 125 billion daily ad requests

Key Stages in the Ad Auction Process
- Detailed walkthrough of audience targeting, optimization of advertiser objectives, and ad selection
- Role of machine learning (ML) algorithms in these stages
- Importance of real-time predictions in the ad auction process

Building a Scalable and Extensible System
- Strategies used by Kayzen to handle high volume and throughput demands
- Discussion on the frequent training of ML models using large, dynamic datasets
- Methods for optimizing performance and reducing latency in real-time model serving

Challenges and Solutions
- Common challenges encountered in building low-latency ML systems
- Lessons learned and best practices for ensuring system reliability and performance
- Techniques for feature transformation, enrichment, and caching to improve latency
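As one illustration of how feature caching can cut latency, here is a minimal Python sketch of an in-process cache with a time-to-live (TTL). It is a generic pattern and makes no claim about how Kayzen implements caching. The names `TTLFeatureCache` and `enrich_user` are hypothetical, and the sketch is neither thread-safe nor size-bounded.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLFeatureCache:
    """Hypothetical in-process cache: reuse enriched features until they expire."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: Hashable,
                       compute: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: skip the expensive enrichment call.
            self.hits += 1
            return entry[1]
        # Missing or expired: recompute and store with a new expiry time.
        self.misses += 1
        value = compute(key)
        self._store[key] = (now + self._ttl, value)
        return value


def enrich_user(user_id: Hashable) -> dict:
    """Stand-in for a slow lookup (e.g. a remote feature store); hypothetical."""
    return {"user_id": user_id, "id_length": len(str(user_id))}


if __name__ == "__main__":
    cache = TTLFeatureCache(ttl_seconds=30.0)
    cache.get_or_compute("user-1", enrich_user)  # miss: calls enrich_user
    cache.get_or_compute("user-1", enrich_user)  # hit: served from memory
    print(cache.hits, cache.misses)
```

The TTL trades freshness for speed: a longer TTL avoids more lookups but serves staler features, so the right value depends on how quickly the underlying data changes.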

Conclusion
- Summary of key points covered in the talk
- Final thoughts on the future of low-latency ML systems in ad tech
- Q&A session

Prerequisites:

Basic Understanding of Machine Learning:

Familiarity with common ML concepts and algorithms
Understanding of how ML models are trained and deployed

Knowledge of Real-Time Systems:

Basic knowledge of real-time data processing and latency requirements
Experience with real-time applications, preferably in an ad tech or similar environment

Programming Skills:

Proficiency in a programming language commonly used in ML (e.g., Python, Java)
Experience with ML frameworks such as TensorFlow, PyTorch, or similar

Experience with Ad Tech (preferred but not mandatory):

Understanding of the ad-serving ecosystem and the ad auction process
Familiarity with the concepts of audience targeting, ad selection, and optimization

Interest in System Design and Scalability:

Enthusiasm for learning about scalable system architectures
Curiosity about optimizing performance and reliability in high-volume environments

Video URL:

https://drive.google.com/file/d/1uboP8_ayUohTpUpeoZUrTvs_YC2IPgwZ/view?usp=sharing

Speaker Info:

As a Senior Machine Learning Engineer at Kayzen, Siddharth spearheads the stability of AI infrastructure and model monitoring at scale. He builds and deploys models that help advertisers and media buyers on the demand side succeed in their programmatic ad spend. He is also responsible for designing and implementing several highly effective efficiency measures, including stabilising big data storage and processing and heavily optimising the data and model pipelines.

He has also been a conference submissions reviewer at IEEE for over 4 years. Previously, he worked for the Ahmedabad-based start-ups Shipmnts & Infocusp. Siddharth is a Gold Medalist from SRM University and collaborated with IIT Madras on a Research Fellowship. He brings 8 years of experience in enterprise and cloud architecture.

He leverages this deep understanding of system architecture and machine learning to build scalable and reproducible ML architectures across industry verticals, from ed-tech to legal-tech to logistics and supply chain and, lately, ad-tech. Siddharth received the IET Trendsetter of the Year Award in both 2015 and 2016, and the Amul Vidya Bhushan Award in 2012. He has mentored hundreds of data science enthusiasts, from school kids to industry leaders, via platforms including The Climber, GreatLakes & Scaler. Besides being a speaker and judge at various seminars, conferences and panel discussions, he is a keen robotics enthusiast and is trained in contemporary dance.

Speaker Links:

Socials
LinkedIn: https://www.linkedin.com/in/siddharthsahani/
Github: https://github.com/dapperlabel
StackOverflow: https://stackoverflow.com/users/6649426/siddharth-sahani
Medium: https://medium.com/@siddharthsahani7

Links to previous talks
1. AIM MLDS 2024 (Bengaluru, February 2, 2024): https://www.youtube.com/watch?v=SzUeydIsVPs&list=PL1Osdi_5mfH6jNSR5iDl0J43GDfpEIFMV&index=25
2. IEEE Kaagaz Conference (Hyderabad, 4-5 November 2022): https://www.linkedin.com/feed/update/urn:li:activity:7001762071558148096/?updateEntityUrn=urn%3Ali%3Afs_feedUpdate%3A%28V2%2Curn%3Ali%3Aactivity%3A7001762071558148096%29
3. IEEE Epsilon (Online, April 2021): https://www.youtube.com/watch?v=Gvpua_zFxy4
4. SRM Alumni Webinar (Online, May 2020): https://www.youtube.com/watch?v=z09m2xDPzVw

Section:	Artificial Intelligence and Machine Learning
Type:	Talk
Target Audience:	Intermediate
Last Updated:	09 Jun, 2024
