Web scraping using requests and beautifulsoup

by Arvind S Raj (speaking)

Section
Web Development
Technical level
Beginner

Objective

This session will enable attendees to retrieve web pages and extract information from them using requests and beautifulsoup. Requests is a simple yet elegant HTTP library that makes it easy to interact with HTTP resources on the internet; it also supports SSL certificate verification, session objects, prepared requests, OAuth authentication and more. Beautifulsoup is a library that makes it easier to extract specific data from a web page. It works on top of Python's standard HTML parser and can also use the lxml parser (which additionally provides XML parsing).
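As a rough illustration of that retrieve-then-extract workflow, here is a minimal sketch. The sample HTML, the helper names (`fetch_html`, `extract_title`, `extract_links`) and the 10-second timeout are assumptions made for this example, not material from the session itself.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical page content, used so the parsing step can run offline.
SAMPLE_HTML = """
<html>
  <head><title>Example talks</title></head>
  <body>
    <ul id="talks">
      <li><a href="/talks/1">Web scraping</a></li>
      <li><a href="/talks/2">Testing in Python</a></li>
    </ul>
  </body>
</html>
"""


def fetch_html(url, timeout=10):
    """Download a page and return its body as text (needs network access)."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.exceptions.HTTPError for 4xx/5xx responses.
    response.raise_for_status()
    return response.text


def extract_title(html):
    """Return the text of the <title> element, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")  # standard-library parser
    return soup.title.get_text(strip=True) if soup.title else None


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"])
            for a in soup.find_all("a", href=True)]


title = extract_title(SAMPLE_HTML)
links = extract_links(SAMPLE_HTML)
```

With a live site, `extract_links(fetch_html(url))` would chain the two steps. Passing `"lxml"` instead of `"html.parser"` switches parsers, provided the lxml package is installed.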

Description

The session will cover basic usage of the requests module (such as custom headers, cookie handling, exceptions and perhaps even prepared requests and session objects) and of beautifulsoup to retrieve and extract data from web pages.
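The requests features listed above might fit together roughly as follows. This is only a sketch: the URL, the user agent string, the cookie name and the helper names (`build_session`, `prepare_get`, `send`) are hypothetical, and nothing is sent over the network until `send` is called.

```python
import requests


def build_session(user_agent):
    """Create a session that carries a custom header and a cookie."""
    session = requests.Session()
    # Headers set on the session are merged into every request it prepares.
    session.headers.update({"User-Agent": user_agent})
    # Hypothetical cookie; set without a domain, so it applies to any host.
    session.cookies.set("visited", "yes")
    return session


def prepare_get(session, url, params=None):
    """Build a prepared GET request without sending it."""
    request = requests.Request("GET", url, params=params)
    return session.prepare_request(request)


def send(session, prepared, timeout=10):
    """Send a prepared request; return the response, or None on failure."""
    try:
        response = session.send(prepared, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # Base class for connection errors, timeouts and HTTP errors.
        print(f"request failed: {exc}")
        return None
    return response


session = build_session("example-scraper/0.1")
prepared = prepare_get(session, "https://example.com/search",
                       params={"q": "python"})
```

Inspecting `prepared.url` and `prepared.headers` before sending is a convenient way to check what will actually go over the wire.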

Speaker bio

I am a computer science student who loves programming in Python. I also have an interest in computer security and will be enrolling in a master's programme in that area this year. I am an active member of the FOSS Club at my university. I have used a combination of requests and beautifulsoup for a project that involved extracting information from several web pages (I can't post any code publicly; sorry).