OLAF: One Load Audit Framework for provisioning efficient infrastructure and ideal scaling setting of ML services
Aashraya Sachdeva (~aashraya)
Description:
The growing adoption of ML across industries makes it important to test how ML services perform under different loads. An ML service interacts with multiple external components, and its efficiency depends on the capacity of those components and on how well its processes and threads are tuned. Currently, no single tool addresses all of these aspects. OLAF is a tool that helps identify bottlenecks and performance issues in ML services by measuring latency and throughput under static or dynamic loads. It supports a range of resources, including microservices, databases, queues, and ML model server platforms. OLAF is aimed at ML practitioners, DevOps engineers, and other stakeholders involved in developing and deploying ML services. It also helps integrate ML performance testing into the software development life cycle (SDLC), promoting efficient infrastructure provisioning across multiple ML use cases.
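To make "latency and throughput under a static load" concrete, here is a minimal Python sketch using only the standard library. It is not OLAF's API or implementation: the names `fake_ml_service`, `timed_call`, and `run_static_load` are hypothetical, and the "service" is a local function that sleeps in place of a real inference call. A static load is assumed here to mean a fixed number of concurrent workers.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_ml_service(payload):
    """Hypothetical stand-in for an ML service call (not part of OLAF)."""
    time.sleep(0.01)  # simulate inference time
    return {"input": payload, "label": "ok"}


def timed_call(target, payload):
    """Call the target once and return the elapsed time in seconds."""
    start = time.perf_counter()
    target(payload)
    return time.perf_counter() - start


def run_static_load(target, num_requests, concurrency):
    """Send num_requests calls using a fixed number of concurrent workers."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda i: timed_call(target, i), range(num_requests))
        )
    wall_time = time.perf_counter() - wall_start

    latencies.sort()
    # Nearest-rank 95th percentile over the sorted latencies.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "requests": len(latencies),
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": latencies[p95_index],
        "throughput_rps": len(latencies) / wall_time,
    }


if __name__ == "__main__":
    report = run_static_load(fake_ml_service, num_requests=40, concurrency=4)
    print(report)
```

A dynamic load could be approximated along the same lines by calling `run_static_load` repeatedly with a changing `concurrency` value and comparing the reports.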
Prerequisites:
NA
Video URL:
TBA
Content URLs:
TBA
Speaker Links:
https://www.linkedin.com/in/aashsach/
https://www.usenix.org/conference/srecon23apac/presentation/sachdeva