Asyncio and Web crawlers
Haren Lewis (~haren)
Asyncio, Python's library for writing concurrent code with the async/await syntax, is well suited to many I/O-bound and network-bound problems. It also forms the foundation for many modern Python asynchronous frameworks that provide high-performance network and web servers, database connection libraries, distributed task queues, and more.
In this workshop, we'll create a simple web crawler. A crawler has to wait for many responses, and fetching and processing each request one at a time quickly becomes time-consuming.
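To make the cost of waiting concrete, here is a minimal sketch that compares fetching pages one after another with fetching them concurrently. It is an illustration only: `fake_fetch`, `timed`, and the `example.com` URLs are hypothetical names, and `asyncio.sleep` stands in for real network latency so the example runs without a network connection.

```python
import asyncio
import time

# Hypothetical stand-in for a network request: sleeps instead of doing real I/O.
async def fake_fetch(url, delay=0.05):
    await asyncio.sleep(delay)
    return f"body of {url}"

async def fetch_sequentially(urls):
    # Each request starts only after the previous one has finished.
    return [await fake_fetch(u) for u in urls]

async def fetch_concurrently(urls):
    # All requests wait at the same time; gather keeps the input order.
    return await asyncio.gather(*(fake_fetch(u) for u in urls))

def timed(coro):
    # Run a coroutine to completion and report how long it took.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start

URLS = [f"https://example.com/page{i}" for i in range(5)]
sequential_pages, sequential_seconds = timed(fetch_sequentially(URLS))
concurrent_pages, concurrent_seconds = timed(fetch_concurrently(URLS))
```

Both versions return the same pages, but the sequential run takes roughly the sum of the delays while the concurrent run takes roughly the longest single delay.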
Overview of the talk:
- Introduction to Asyncio.
- A few concepts such as the event loop, coroutines, the event queue, etc.
- Traditional approach to web crawlers.
- What coroutines are.
- How generators work.
- Refactoring to asynchronous crawler using asyncio.
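As a rough preview of where the outline ends up, the sketch below shows a generator pausing and resuming, then a small asyncio crawler built from worker coroutines sharing a queue. It is one possible shape under stated assumptions, not the workshop's actual code: `FAKE_WEB`, `countdown`, `fetch_links`, `worker`, and `crawl` are hypothetical names, and an in-memory dictionary replaces real HTTP requests and HTML parsing.

```python
import asyncio

# Hypothetical in-memory "web": each page maps to the links found on it.
FAKE_WEB = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}

def countdown(n):
    # A generator pauses at each yield and resumes where it left off;
    # coroutines rely on the same pause-and-resume idea.
    while n > 0:
        yield n
        n -= 1

async def fetch_links(url):
    # Stand-in for an HTTP request plus link extraction.
    await asyncio.sleep(0.01)
    return FAKE_WEB.get(url, [])

async def worker(queue, seen):
    # Pull URLs from the queue forever; crawl() cancels us when work is done.
    while True:
        url = await queue.get()
        try:
            for link in await fetch_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.put_nowait(link)
        finally:
            queue.task_done()

async def crawl(root, num_workers=3):
    queue = asyncio.Queue()
    seen = {root}
    queue.put_nowait(root)
    workers = [asyncio.create_task(worker(queue, seen)) for _ in range(num_workers)]
    await queue.join()  # returns once every queued URL has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return seen

crawled = asyncio.run(crawl("/"))
```

The `seen` set keeps each page from being queued twice, and `queue.join()` lets the crawl finish as soon as no unprocessed URLs remain, however the work was split across workers.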
This workshop is inspired by the following article:
Just a developer, hacking his way into the tech industry