Using the coverage database to speed up CI

AbdealiJK (~AbdealiJK) | 09 Jun, 2023

Description:

Ever sat there waiting for your pytest test cases to finish running? Or staring at the GitHub Action in your PR waiting for it to turn green ...

Writing tests is great: it improves stability and lets developers build things faster. But at some point, every project reaches a state where the test suite makes developers slower, because it just takes way too long to merge changes!

I've been in such projects more than once, and I decided to go looking for a solution to this issue. In this talk, I will discuss the available options:

The simple approaches that pytest provides out of the box
Breaking down into smaller projects
Ability to cache test results

We will dive deep into the third approach, caching test results, which helps the most when working actively on a large project. The typical development flow most projects follow is:

The developer creates a branch to work on from the current "base" branch (called "main" by default, or "dev" in many orgs)
The developer writes or modifies code in their branch and writes new tests
The developer wants to check whether the modifications cause any tests to fail

At this point, the developer wants to run two sets of tests:

First, the tests they have written
Second, the tests that may start failing due to their changes
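
Both sets start from knowing which files the branch touched relative to the base branch. A minimal sketch of that step, assuming git is available, the base branch is called "main", and test files follow pytest's default test_*.py / *_test.py naming (the function names here are hypothetical, not from the talk):

```python
import subprocess


def parse_changed_files(diff_output: str) -> list[str]:
    """Turn `git diff --name-only` output into a list of changed .py paths."""
    paths = [line.strip() for line in diff_output.splitlines()]
    return [p for p in paths if p.endswith(".py")]


def changed_files(base: str = "main") -> list[str]:
    """Ask git which Python files differ between the merge base and HEAD."""
    # `base...HEAD` (three dots) compares HEAD against the merge base,
    # so commits that landed on the base branch later are ignored.
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_changed_files(result.stdout)


def split_tests_and_sources(paths: list[str]) -> tuple[list[str], list[str]]:
    """Separate changed test files from changed source files."""
    # Assumption: test files use pytest's default naming conventions.
    tests, sources = [], []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        if name.startswith("test_") or name.endswith("_test.py"):
            tests.append(path)
        else:
            sources.append(path)
    return tests, sources
```

The changed test files roughly correspond to the first set; the changed source files are the input for finding the second set.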

We will go through how to query the coverage database of the "base" branch to find out which tests belong to the second category, and then run only those tests with pytest, which speeds up the CI environment significantly.

Prerequisites:

The participants should:

Know basic Python
Understand SQL or SQLite basics
Be comfortable writing tests and be aware of how unit tests work
Be aware of tools like git

Speaker Info:

Hi, I'm Abdeali Kothari - a.k.a Ali (if we're talking) or @AbdealiLoKo (if we're typing)
I graduated from IIT Madras and then worked with American Express, followed by Corridor Platforms where I am architecting a Decisioning platform for analytics in the Financial domain.

I've dabbled in Robotics, Operating System architectures, Machine Learning, Game Development, and Web Development across a bunch of personal projects.
Professionally, I have worked mainly in Big Data, Machine Learning, and Analytics in the Financial Domain for enterprise production use cases.

I'm a big fan of code hygiene and clean architecture, with a lot of Code Analytics experience under my belt.
I have worked mainly in Python in all the above fields for about 13 years now (since back when the first blog post telling us to stop using Python 2.x was written :D)

I'm extremely lazy, and hence an automation freak. I have created great automated test suites and CI/CD pipelines to help me remain lazy.

Speaker Links:

Previous talks I have done:

PyCon DE + PyData: Monorepos in Python - schedule, youtube, slides
PyConf Hyderabad 2022: Monorepos in Python - schedule, slides
BangPypers: Using sqlalchemy+marshmallow for faster queries - meetup, slides
BangPypers: Playwright and E2E testing - meetup, slides
FlaskCon: Enabling multi-tenancy with werkzeug - youtube
FlaskCon: Application config management - youtube
GUADEC: Static code analysis with coala

Section:	Developer tools and automation
Type:	Talks
Target Audience:	Intermediate
Last Updated:	21 Jun, 2023
