Asyncio and Web crawlers
Haren Lewis (~haren)
Description:
Asyncio, Python's standard library for writing concurrent code with the async/await syntax, is a natural fit for many I/O-bound and network-bound problems. It also forms the foundation for many modern Python asynchronous frameworks that provide high-performance network and web servers, database connection libraries, distributed task queues, and more.
In this workshop, we'll create a simple web crawler. A web crawler has to wait for many responses, and waiting on each request in turn before processing it quickly becomes time-consuming.
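To make the cost of waiting concrete, here is a minimal sketch that compares fetching pages one at a time with fetching them concurrently. It makes no real network calls: `fake_fetch`, the `URLS` list, and the 0.05-second delay are assumptions made up for illustration, with `asyncio.sleep` standing in for network latency.

```python
import asyncio
import time

# Hypothetical URLs; no real network traffic is generated.
URLS = [f"http://example.com/page{i}" for i in range(5)]

async def fake_fetch(url, delay=0.05):
    """Stand-in for a network request: sleeps instead of doing real I/O."""
    await asyncio.sleep(delay)
    return f"body of {url}"

async def crawl_sequentially(urls):
    # Each request must finish before the next one starts.
    return [await fake_fetch(url) for url in urls]

async def crawl_concurrently(urls):
    # All requests are started up front; the event loop overlaps the waits.
    return await asyncio.gather(*(fake_fetch(url) for url in urls))

def timed(coro_fn, urls):
    """Run a coroutine function to completion and measure wall-clock time."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(urls))
    return result, time.perf_counter() - start

sequential_pages, sequential_time = timed(crawl_sequentially, URLS)
concurrent_pages, concurrent_time = timed(crawl_concurrently, URLS)
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

Both versions return the same pages in the same order. The sequential version waits roughly the sum of the delays, while the concurrent one waits roughly as long as the slowest single request.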
Overview of the workshop:
- Introduction to Asyncio.
- A few core concepts: the event loop, coroutines, the event queue, etc.
- Traditional approach to web crawlers.
- What coroutines are.
- How generators work.
- Refactoring to asynchronous crawler using asyncio.
- Conclusion.
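As a rough preview of two outline items, the sketch below first shows how a generator pauses at `yield` and resumes on `send()`, then shows one possible shape for an asyncio crawler built on a work queue. This is not the workshop's actual code: the `LINKS` graph, `fetch_links`, `worker`, and `crawl` are hypothetical names, and the "site" is an in-memory dictionary rather than real HTTP.

```python
import asyncio

# Part 1: how generators work. A generator pauses at each yield
# and resumes when the caller sends it a value.
def running_total():
    total = 0
    while True:
        value = yield total   # pause here until send() supplies a value
        total += value

gen = running_total()
next(gen)                     # advance to the first yield
totals = [gen.send(n) for n in (1, 2, 3)]

# Part 2: an asynchronous crawler over a hypothetical in-memory "site".
# Each key is a page; each value is the list of links found on it.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}

async def fetch_links(url):
    """Stand-in for fetching and parsing a page (assumed helper)."""
    await asyncio.sleep(0.01)           # simulated network latency
    return LINKS.get(url, [])

async def worker(queue, seen):
    # Pull URLs off the queue forever; crawl() cancels us when done.
    while True:
        url = await queue.get()
        try:
            for link in await fetch_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.put_nowait(link)
        finally:
            queue.task_done()

async def crawl(root, max_workers=3):
    queue = asyncio.Queue()
    seen = {root}
    queue.put_nowait(root)
    workers = [asyncio.create_task(worker(queue, seen))
               for _ in range(max_workers)]
    await queue.join()                  # wait until every queued URL is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return seen

crawled = asyncio.run(crawl("/"))
print(totals, sorted(crawled))
```

The `seen` set stops the crawler from revisiting pages when links form a cycle, and `queue.join()` returns only after every queued URL has been marked done, at which point the idle workers can be cancelled safely.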
Prerequisites:
Python basics
Content URLs:
This workshop is inspired by the following article:
http://aosabook.org/en/500L/a-web-crawler-with-asyncio-coroutines.html
Speaker Info:
Just a developer, hacking his way into the tech industry
Speaker Links:
LinkedIn: https://linkedin.com/in/haren-lewis
GitHub: https://github.com/harenlewis