Fine-Tuning Insights: Lessons from Experimenting with a Large Language Model on Slack Data

Samhita Alla (~samhita9)


Description:

Large language models (LLMs) have changed how machines understand and generate human-like text, with strong results on tasks such as question answering, chatbots, and creative writing. Adapting these models to a specific use case, however, requires fine-tuning, a process that can be hard to grasp at first.

This talk walks through how fine-tuning works and covers methods that make it more efficient, such as Low-Rank Adaptation (LoRA) and 8-bit fine-tuning. Attendees will learn which factors matter most when fine-tuning a large language model, how fine-tuning differs from prompt engineering, and when to use each approach.
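To make the efficiency argument concrete, here is a minimal, standard-library-only sketch of the arithmetic behind the two methods. It is an illustration under stated assumptions, not the code used in the talk: the function names are hypothetical, and the absmax scheme shown is a simplified stand-in for what real 8-bit libraries do. LoRA freezes a d_out x d_in weight matrix and trains two small factors of rank r instead; 8-bit methods store each weight as a single-byte integer plus a shared scale.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates all d_out * d_in entries. LoRA trains two
    factors instead: B (d_out x rank) and A (rank x d_in).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


def quantize_absmax_int8(values: list[float]) -> tuple[list[int], float]:
    """Map floats to integer codes in [-127, 127] with one shared scale.

    Simplified absmax quantization (an assumption for illustration);
    production libraries add refinements such as outlier handling.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")

    codes, scale = quantize_absmax_int8([0.8, -0.3, 0.05, -1.2])
    print(codes, dequantize(codes, scale))
```

For a 4096 x 4096 matrix at rank 8, LoRA trains 65,536 values instead of 16,777,216, which is roughly 0.4% of the original. The quantization round trip loses at most half a scale step per weight, which is the price paid for storing each weight in one byte.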

Infrastructure is a significant hurdle in fine-tuning these models. Even with cloud tools like Google Colab and consumer-grade GPUs available, setting up a suitable runtime environment remains a major challenge. To address this, attendees will learn how to specify infrastructure declaratively with Flyte, so they can configure training jobs that make effective use of the compute resources needed to fine-tune large language models on their own data.
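The idea of declaring infrastructure next to the training code can be sketched in plain Python. This is a conceptual stand-in only: `ResourceSpec`, `training_task`, and `fine_tune` are hypothetical names invented for illustration and are not Flyte's API, and the resource values are placeholders. The point is that the function states what it needs, and an orchestrator reads that metadata to schedule it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical description of the compute a task requests."""
    cpu: str
    mem: str
    gpu: str


def training_task(requests: ResourceSpec) -> Callable:
    """Hypothetical decorator that attaches a resource request to a function.

    A real orchestrator would read this metadata to provision a matching
    machine; here it is only recorded on the function object.
    """
    def decorator(fn: Callable) -> Callable:
        fn.requests = requests
        return fn
    return decorator


# Placeholder values: the right sizes depend on the model and dataset.
@training_task(requests=ResourceSpec(cpu="8", mem="64Gi", gpu="1"))
def fine_tune(dataset_path: str, epochs: int = 3) -> str:
    # Placeholder body standing in for the actual training loop.
    return f"fine-tuned on {dataset_path} for {epochs} epochs"


if __name__ == "__main__":
    print(fine_tune.requests)
    print(fine_tune("slack_messages.jsonl"))
```

Keeping the request beside the function means the same code can run locally for debugging and on a GPU machine for training, with no separate provisioning script to keep in sync.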

By the end of the talk, attendees will understand the fine-tuning process, know practical methods for making it more efficient, and have a way to overcome infrastructure challenges, so they can adapt large language models to their own use cases.

Prerequisites:

Basic knowledge of Python and a general understanding of large language models.

Content URLs:

Blog post, GitHub repo

Speaker Info:

Samhita is a developer advocate at Union.ai and a former tools developer at Oracle. She is passionate about software development and technical writing, and enjoys tackling challenges in growth and developer relations. In her free time, she loves building machine learning and web applications. Committed to helping others succeed in the tech industry, she has self-published a book on navigating the complex landscape of the field.

Section: Data Science, AI & ML
Type: Talks
Target Audience: Intermediate