Exploring GPU Alternatives for AI: A Hands-on Workshop on Gaudi 2
jaygala223
Description:
Problem
As Artificial Intelligence (AI) adoption continues to rise, organizations face a rapidly evolving landscape of AI hardware accelerators. With so many alternatives available, ranging from Graphics Processing Units (GPUs) to specialized AI chips such as Tensor Processing Units (TPUs) and Gaudi, it can be challenging to determine the most suitable hardware for a specific AI use case. This lack of familiarity with the options often leads to poor hardware choices, inefficient model deployment, and higher training and inference costs.
Workshop
This workshop aims to bridge this knowledge gap by providing hands-on training on leveraging the Gaudi 2 AI Accelerator for various AI workloads. Gaudi 2 is a purpose-built AI processor designed to accelerate deep learning training and inference efficiently.
The workshop will give attendees a comprehensive understanding of the Gaudi 2 AI Accelerator and its applications in AI workloads. It will cover key modern AI techniques such as fine-tuning large language models (LLMs), optimizing inference, building multi-modal generative AI applications, and developing Retrieval Augmented Generation (RAG) pipelines.
It will consist of four modules, each focusing on a specific AI technique. Attendees will gain hands-on experience with the Gaudi 2 processor and implement several AI models themselves. The workshop will also cover best practices for fine-tuning LLMs and deploying models.
Modules
LLM Inference and Compression (45 mins): Attendees will implement LLM inference on Gaudi 2 and learn how to compress and deploy those models (see the inference sketch after this module list).
Fine-tuning LLMs (45 mins): Attendees will fine-tune open-source models such as Llama 2 and learn best practices for fine-tuning different models (a fine-tuning sketch follows below).
Multi-modal Generative AI (45 mins): Attendees will run optimized multi-modal inference with models like Stable Diffusion (a text-to-image sketch follows below).
Developing a Hybrid RAG Pipeline (45 mins): Attendees will implement a hybrid RAG pipeline, with some components running on CPUs and others on the accelerator for speed and efficiency (a hybrid RAG sketch follows below).
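To give a flavor of module 1, here is a minimal, hypothetical sketch of causal LM inference on a Gaudi 2 device with PyTorch and Hugging Face transformers. It assumes the Habana SynapseAI stack and the habana_frameworks PyTorch bridge are installed; the Llama 2 checkpoint name is only an example, not the workshop's fixed material.

```python
# Hypothetical sketch: causal LM inference on a Gaudi 2 HPU.
# Assumes the habana_frameworks PyTorch bridge and transformers are installed.
import torch
import habana_frameworks.torch.core as htcore  # registers the "hpu" device with PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load in bfloat16 to shrink the memory footprint (a simple form of model compression)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model = model.to("hpu").eval()

inputs = tokenizer("Gaudi 2 is", return_tensors="pt").to("hpu")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```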
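For module 2, a hedged sketch of how fine-tuning might look with Hugging Face's optimum-habana library, whose GaudiTrainer mirrors the standard Trainer API. The dataset slice, hyperparameters, and default GaudiConfig below are illustrative assumptions rather than the workshop's actual exercises.

```python
# Hypothetical sketch: fine-tuning a causal LM on Gaudi with optimum-habana's GaudiTrainer.
# Dataset, checkpoint, and hyperparameters are illustrative assumptions only.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_name = "meta-llama/Llama-2-7b-hf"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tiny slice of a public dataset, just to keep the sketch self-contained
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

args = GaudiTrainingArguments(
    output_dir="./llama2-finetuned",
    use_habana=True,       # run on Gaudi HPUs
    use_lazy_mode=True,    # Habana lazy-mode graph execution
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = GaudiTrainer(
    model=model,
    args=args,
    gaudi_config=GaudiConfig(),  # default settings; tuned configs are also published on the Hub
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```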
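For module 3, a hedged sketch of text-to-image generation on Gaudi using optimum-habana's Stable Diffusion pipeline; the checkpoint and Gaudi configuration names follow the library's public examples and may differ from what the workshop uses.

```python
# Hypothetical sketch: text-to-image generation on Gaudi 2 with optimum-habana.
# Checkpoint and Gaudi config names follow the library's public examples.
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "runwayml/stable-diffusion-v1-5"  # example checkpoint
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")

pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,       # run on the HPU
    use_hpu_graphs=True,   # capture HPU graphs to cut host-side overhead
    gaudi_config="Habana/stable-diffusion",
)

images = pipeline(
    prompt="an astronaut riding a horse on the moon",
    num_inference_steps=30,
).images
images[0].save("astronaut.png")
```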
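Finally, for module 4, a minimal hybrid RAG sketch in which dense retrieval runs on the CPU (via sentence-transformers) while generation runs on the Gaudi HPU; the tiny in-memory corpus, checkpoint names, and the exact CPU/HPU split shown here are illustrative assumptions.

```python
# Hypothetical hybrid RAG sketch: retrieval on CPU, generation on the Gaudi HPU.
# The in-memory corpus and checkpoint names are illustrative only.
import torch
import habana_frameworks.torch.core as htcore  # registers the "hpu" device with PyTorch
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForCausalLM, AutoTokenizer

# --- Retrieval (CPU): embed a small corpus and pick the closest passage ---
retriever = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")
corpus = [
    "Gaudi 2 is a deep learning accelerator built for training and inference.",
    "Retrieval Augmented Generation pairs a retriever with a generative model.",
]
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)

query = "What is Gaudi 2?"
query_emb = retriever.encode(query, convert_to_tensor=True)
context = corpus[int(util.cos_sim(query_emb, corpus_emb).argmax())]

# --- Generation (HPU): answer the query conditioned on the retrieved context ---
model_name = "meta-llama/Llama-2-7b-chat-hf"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16).to("hpu")

prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to("hpu")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```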
Resources
Attendees will receive credentials to access Gaudi 2 cards and run models during the hands-on sessions.
Outcomes
By the end of this workshop, attendees will have practical experience with the Gaudi 2 AI Accelerator across a range of AI workloads. They will understand the strengths and limitations of the hardware and learn optimization techniques for running different models efficiently.
The workshop will provide a comprehensive learning experience, enabling attendees to make informed decisions for AI deployments on similar deep learning accelerators.
Prerequisites:
- A laptop with a stable internet connection
- Familiarity with Python
Speaker Info:
Jay is an AI Software Solutions Engineer at Intel, where he works on optimizing AI workloads and frameworks for Intel hardware. He has been with Intel since December 2023.
Jay is passionate about AI research and has worked with researchers at the Indian Institute of Technology (IIT) Patna and the CVSSP Lab at the University of Surrey. His research focuses on AI for social good, in areas such as medical diagnosis, cloud detection, and illegal fishing identification. He was invited as a speaker by Cohere for AI in May 2024 for his cloud detection paper from the NeurIPS 2023 CCAI Workshop.
Jay was part of the inaugural Google ML Bootcamp India in 2022, during which he earned his TensorFlow Developer Certification. He has also contributed to the KerasCV, Keras, and TensorFlow projects.
He is experienced with Python, deep learning (CV, NLP, generative AI), PyTorch, TensorFlow, Git, Docker, Kubernetes, and shell scripting.
Speaker Links:
Here are my profiles: GitHub | LinkedIn | Personal Website | Blog | Google Scholar