Leveraging Python for Efficient Data Pipelines
Pratiksha Aggarwal
Description:
Abstract: In the age of big data, building efficient and scalable data pipelines is critical for extracting valuable insights. Python, with its robust ecosystem of libraries and tools, has emerged as a powerful language for managing and processing large datasets. This talk will delve into the practical aspects of creating efficient data pipelines using Python, highlighting best practices, common pitfalls, and advanced techniques to optimize performance.
Description: As data becomes increasingly central to decision-making in businesses, the need for efficient data pipelines has never been more critical. This session aims to equip data engineers, analysts, and developers with the knowledge to build robust data pipelines using Python.
Prerequisites:
A basic understanding of Python programming and data handling is recommended.
Content URLs:
https://www.geeksforgeeks.org/data-science-vs-data-analytics/
Speaker Info:
Pratiksha Aggarwal is a dedicated software engineer with four years of experience in the tech industry, specializing in data engineering and Python programming. Over the past few years, she has developed a deep understanding of building and optimizing data pipelines to handle large-scale data processing tasks efficiently.
In addition to her professional work, Pratiksha is passionate about sharing knowledge and contributing to the Python community. She has been involved in various tech meetups and has written several articles on data engineering best practices. This talk at PyCon India 2024 will be an excellent opportunity for attendees to learn from her practical experiences and insights.
Key Skills:
Data Engineering, Python Programming, Data Pipeline Design and Optimization, ETL Processes, Parallel Processing with Dask, Workflow Orchestration with Apache Airflow and Luigi, Performance Tuning and Memory Management
Professional Experience:
Software Engineer at Aura (2023-Present): Developed and maintained scalable data pipelines. Implemented ETL processes to integrate data from various sources. Optimized data processing workflows to enhance performance.
Previous Role at Bolo Live (2021-2023): Contributed to data infrastructure projects. Collaborated with cross-functional teams to deliver data solutions.
Education:
Bachelor’s Degree in Computer Science
Previous Speaking Engagements:
Presented on data engineering topics at local tech meetups. Authored articles and tutorials on Python and data pipelines.
Speaker Links:
https://github.com/pratu098
https://medium.com/@pratiksha098
https://www.geeksforgeeks.org/data-science-vs-data-analytics/