Rewriting the Wayback machine's live web proxy in Python

by Noufal Ibrahim (speaking)

Section
Web Development
Session type
Talk
Technical level
Intermediate

Objective

This talk discusses the design and development of a high-performance web application that replaced a decade-old service without disruption.

Description

The Wayback Machine is a high-traffic website that has been online for over a decade. It was mostly a Java application. One of its components is the liveweb proxy, an HTTP proxy that archives each resource requested through it and serves as the core data source for the Wayback Machine.
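
The core idea of an archiving proxy can be sketched in a few lines of Python. This is only an illustration of the concept, not the production design: the function name `proxy_and_archive`, the injected `fetch` callable, the in-memory `archive` list and the record fields are all hypothetical, and the real system's storage format is not described here.

```python
import hashlib
import time


def proxy_and_archive(url, fetch, archive):
    """Fetch `url`, store a record of the response, and return the body.

    `fetch` is any callable that takes a URL and returns the response body
    as bytes; `archive` is any object with an `append` method. Both are
    stand-ins (assumptions) for a real HTTP client and a real archive store.
    """
    body = fetch(url)
    record = {
        "url": url,
        "fetched_at": time.time(),                 # when the resource was captured
        "sha1": hashlib.sha1(body).hexdigest(),    # digest of the captured payload
        "body": body,
    }
    archive.append(record)   # archive first, then hand the body back to the client
    return body
```

Passing the fetcher and the store in as arguments keeps the sketch independent of any particular HTTP client or on-disk format.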

The liveweb proxy was rewritten from scratch in Python and deployed on the production website, where it has been running for a few months without a single hitch. Getting there meant working around limitations in the standard library, carefully tuning parameters to balance disk I/O against memory usage, and understanding and respecting fine details of the HTTP protocol.
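
How the balance between disk I/O and memory was actually struck is not spelled out here. One common standard-library technique for this kind of trade-off, shown purely as a sketch, is to buffer small responses in memory and spill large ones to disk with `tempfile.SpooledTemporaryFile`. The function name `spool_response` and the 1 MiB default threshold are assumptions chosen for illustration.

```python
import tempfile


def spool_response(chunks, max_in_memory=1024 * 1024):
    """Buffer an iterable of byte chunks, spilling to disk past a threshold.

    Data stays in memory until more than `max_in_memory` bytes have been
    written; after that SpooledTemporaryFile moves it to a temporary file
    on disk. The threshold is the tunable knob (an assumed default here).
    Returns the rewound file object and the total number of bytes written.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    size = 0
    for chunk in chunks:
        spool.write(chunk)
        size += len(chunk)
    spool.seek(0)   # rewind so the caller can read the payload from the start
    return spool, size
```

A lower threshold reduces memory held per request at the cost of more disk writes; a higher one does the opposite.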

This talk discusses the architecture and design of the new system, how it handles the traffic volume and access patterns expected of an archiving proxy, and how it was deployed.

Speaker bio

This talk will be presented by Anand Chitipothu and Noufal Ibrahim. Both are employees of the Internet Archive and work remotely from Bangalore.

Anand is a software consultant and trainer. He has been working with the Archive since 2007. He is the coordinator of the PyCon India 2012 conference.

Noufal is a freelance trainer and consultant based in Bangalore. He is the founder of PyCon India and organiser of its first two conferences.

Comments


  • Anand B Pillai, 943 days ago

    May I suggest rewording the title of the talk? Something like "Proxying the wayback machine with Python" or similar. The current title is a bit verbose.


  • Anand B Pillai, 923 days ago

    Kindly note: your talk needs more content in the description section for evaluation. Please add more content describing your talk. Think in terms of how you plan to present the talk, and virtually walk through the slides here. Thanks, Admin.
