Database Change Management - 1000s of tables across multiple environments and 1 Alembic to rule them all

Kunal Bhattacharya (~kunaljubce)




When working in production environments with thousands of tables on multiple environments, typically dev, stage, and prod, teams have a hard time keeping all environments in sync. The DevOps/Platforms teams face a similar (if not bigger) challenge during migration activities when they have to deploy the database change scripts like CREATE TABLE, ALTER TABLE, etc in some sequence only to find that the sequence provided in the DB migration scripts is not working as expected. Or worse still, some developer or analyst forgot to checkin their desired changes, which they had applied on lower environments, to the git repository.

Panic ensues and so does urgency as downstream steps are blocked till the fixes are ready.

Alembic to the rescue!

Alembic is a fully open source database migrations tool written by the authors of SQLAlchemy. It provides an ACID compliant framework wherein migration scripts for the entire database can be constructed, maintained, and executed for seamless portability on all environments. With single command upgrades and downgrades, DevOps/platforms teams no longer have to follow complicated documentations during releases and developer teams no longer have to waste times preparing and managing such documentations during every release cycle.

My talk will focus on the below topics -

  • What's alembic?
  • What are the use cases for Alembic?
  • What makes it the most reliable database change management tool available?
  • A quick demo using Alembic to migrate the database structure from one environment to the next.
  • How does it work? Any key concepts?
  • Any trade-offs v/s traditional database management done manually by following a series of steps?


Basic knowledge of SQL (DDL, DML commands) is required. Intermediate knowledge of Python including concepts of ORM would be beneficial.

Video URL:

Speaker Info:

Kunal is a Platforms Engineer at PepsiCo and has over a decade of experience working with databases in production settings of various organisations. Prior to being a platforms guy, he also worked as ETL developer and data engineer in multiple domains such as energy services, BFSI, and healthcare. He has rich expertise in building low latency distributed data processing systems on terabytes and petabytes scale and is an avid fan of Apache Spark. He is also passionate about contributing to OSS and is an active contributor to quinn, a PySpark utility package for Spark developers.

At PepsiCo, he develops and maintains the eCommerce Data Platform that ingests and processes over 3 TBs of data per day. He is playing a critical role in helping other teams build better data products by enhancing data observability and governance, architecting robust application platforms for those data products, and enhancing adoption of newer and better tech stacks to increase user delight.

Speaker Links:

GitHub LinkedIn Twitter

Section: Python in Platform Engineering and Developer Operations
Type: Talk
Target Audience: Intermediate
Last Updated: