Efficient ML: Achieving Low Latency in Real-Time Systems
SIDDHARTH SAHANI (~siddharth8)
Description:
Kayzen's real-time bidding system processes more than 125 billion ad requests daily from across the internet. Under a strict Service Level Agreement (SLA), the system aims to handle the majority of these requests within a round-trip latency of 100-150 milliseconds. The ad auction involves hundreds of competing advertisers, each vying for placement. Critical stages of this process, including audience targeting, advertiser objective optimization, and ad selection, rely heavily on machine learning (ML) algorithms. Performing real-time inference at this scale under tight SLAs demands a robust and agile machine learning infrastructure.
In this session, we'll delve into the complexities of building a low-latency machine learning system that can support the volume and throughput requirements of ad serving. We'll explore how to engineer an adaptable, scalable framework that delivers the real-time predictions the ad auction depends on. This involves ML models trained frequently on vast amounts of ever-evolving ad transaction data. We'll also discuss the lessons learned along the way, including strategies for performance optimization, latency reduction, and system reliability.
Introduction:
- Brief introduction to Kayzen's ad-serving system and its scale
- Emphasis on the importance of low-latency systems in real-time ad auctions
Understanding the Ad-Serving Process:
- Overview of the ad-serving workflow
- Explanation of the Service Level Agreements (SLAs) and latency requirements
- Introduction to the challenges in handling over 125 billion daily ad requests
Key Stages in the Ad Auction Process:
- Detailed walkthrough of audience targeting, optimization of advertiser objectives, and ad selection
- Role of machine learning (ML) algorithms in these stages
- Importance of real-time predictions in the ad auction process
Building a Scalable and Extensible System:
- Strategies used by Kayzen to handle high volume and throughput demands
- Discussion on the frequent training of ML models using large, dynamic datasets
- Methods for optimizing performance and reducing latency in real-time model serving
Challenges and Solutions:
- Common challenges encountered in building low-latency ML systems
- Lessons learned and best practices for ensuring system reliability and performance
- Techniques for feature transformation, enrichment, and caching to improve latency
Conclusion:
- Summary of key points covered in the talk
- Final thoughts on the future of low-latency ML systems in ad tech
- Q&A session
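To give a flavour of the caching techniques mentioned in the outline, here is a minimal sketch of a time-to-live (TTL) cache placed in front of a slow feature-enrichment lookup. It is an illustration only and not Kayzen's implementation: the class name `TTLFeatureCache`, the `loader` callable, and the TTL value are all hypothetical.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class TTLFeatureCache:
    """In-process cache that serves enriched features until they expire."""

    def __init__(
        self,
        loader: Callable[[Hashable], dict],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # hypothetical slow lookup, e.g. a feature store call
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, dict]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> dict:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: answer from memory and skip the slow lookup.
            self.hits += 1
            return entry[1]
        # Missing or expired: pay for the lookup once, then reuse the result.
        self.misses += 1
        features = self._loader(key)
        self._entries[key] = (now + self._ttl, features)
        return features
```

The sketch is deliberately small: it is not thread-safe and never evicts entries, so a production version would need locking and a size bound. The TTL trades feature freshness against how often a request has to wait on the slow lookup.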
Prerequisites:
Basic Understanding of Machine Learning:
- Familiarity with common ML concepts and algorithms
- Understanding of how ML models are trained and deployed
Knowledge of Real-Time Systems:
- Basic knowledge of real-time data processing and latency requirements
- Experience with real-time applications, preferably in an ad tech or similar environment
Programming Skills:
- Proficiency in a programming language commonly used in ML (e.g., Python, Java)
- Experience with ML frameworks such as TensorFlow, PyTorch, or similar
Experience with Ad Tech (preferred but not mandatory):
- Understanding of the ad-serving ecosystem and the ad auction process
- Familiarity with the concepts of audience targeting, ad selection, and optimization
Interest in System Design and Scalability:
- Enthusiasm for learning about scalable system architectures
- Curiosity about optimizing performance and reliability in high-volume environments
Video URL:
https://drive.google.com/file/d/1uboP8_ayUohTpUpeoZUrTvs_YC2IPgwZ/view?usp=sharing
Speaker Info:
As a Senior Machine Learning Engineer at Kayzen, Siddharth spearheads the stability of AI infrastructure and model monitoring at scale. He builds and deploys models that help advertisers and media buyers on the demand side succeed in their programmatic ad spend. He is also responsible for designing and implementing several efficiency measures, including stabilizing big data storage and processing and heavily optimizing the data and model pipelines.
He has also been a Conference Submissions Reviewer at IEEE for over 4 years. Previously, he worked for the Ahmedabad-based start-ups Shipmnts and Infocusp. Siddharth was a Gold Medalist at SRM University and collaborated with IIT Madras on a Research Fellowship. He brings 8 years of experience in enterprise and cloud architecture.
He leverages this deep understanding of system architecture and machine learning to build scalable and reproducible ML architectures across industry verticals, from ed-tech to legal-tech to logistics and supply chain and, lately, ad-tech. Siddharth received the IET Trendsetter of the Year Award in both 2015 and 2016, and the Amul Vidya Bhushan Award in 2012. He has mentored hundreds of data science enthusiasts, from school kids to industry leaders, via platforms such as The Climber, GreatLakes and Scaler. Besides being a speaker and judge at various seminars, conferences and panel discussions, he is a keen robotics enthusiast and is trained in contemporary dance.
Speaker Links:
Socials:
- LinkedIn: https://www.linkedin.com/in/siddharthsahani/
- GitHub: https://github.com/dapperlabel
- StackOverflow: https://stackoverflow.com/users/6649426/siddharth-sahani
- Medium: https://medium.com/@siddharthsahani7
Links to previous talks:
1. AIM MLDS 2024 (Bengaluru, February 2, 2024): https://www.youtube.com/watch?v=SzUeydIsVPs&list=PL1Osdi_5mfH6jNSR5iDl0J43GDfpEIFMV&index=25
2. IEEE Kaagaz Conference (Hyderabad, 4-5 November 2022): https://www.linkedin.com/feed/update/urn:li:activity:7001762071558148096/?updateEntityUrn=urn%3Ali%3Afs_feedUpdate%3A%28V2%2Curn%3Ali%3Aactivity%3A7001762071558148096%29
3. IEEE Epsilon (Online, April 2021): https://www.youtube.com/watch?v=Gvpua_zFxy4
4. SRM Alumni Webinar (Online, May 2020): https://www.youtube.com/watch?v=z09m2xDPzVw