Scraping the web like a boss: Working with the Scrapy framework
rajdeep (~rajdeep1008)
Scrapers and crawlers are all around the web, and in this age of data, finding quality data can be time-consuming. So, instead of writing long scripts from scratch and handling every minute detail of the scraper, why not use a framework that is fast, optimized, asynchronous and easy to use?
This talk is targeted at Python developers who are new to scraping or, more specifically, to the Scrapy framework.
The aim of the talk is to demonstrate how to get started with the Scrapy framework and its asynchronous features, to discuss when to use Scrapy and when not to, and to introduce concepts such as Spiders, Items, Loaders, Selectors and Pipelines.
The goal of the talk is to get attendees started with Scrapy and to demonstrate how it works by scraping data from a website (yet to be decided).
- Basic knowledge of Python
- A basic understanding of web scraping using any other library or tool would be beneficial.
I am a final-year B.Tech student at GTBIT. I have interned as a Python developer at Inc42, where I worked on automation tasks and Django. Previously, I worked as an Android intern at Instaspaces.
I have been working with Python for a year now, trying my hand at building bots, doing web development with Flask and Django, and currently exploring data science. I got started with scraping for a personal bot project, but I now do a lot of it for data science experiments.
I have been participating in hackathons for the past two years and have won a few of them.