Learn Django, Redis and Celery by visualizing Twitter data.
Haris Ibrahim K. V. (~harisibrahimkv)
The tutorial will cover writing a Django web project from step 0. The idea is to give participants a full hands-on experience of developing a website. However, the experience has to be gratifying and something that can be built upon. Hence, along the way, we will cover fetching tweets from Twitter using the Twython library, spawning asynchronous Celery tasks to handle the archival, and visualizing a tweets-per-hour graph using Chart.js.
Redis will be introduced through three simple use cases: implementing a leaderboard, a counter and a cache. Each takes 2 to 5 lines of code, yet makes a big difference to your web application. Redis will also be used as the broker for the Celery task queue.
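As a rough illustration of how small the three Redis use cases can be, here is a minimal sketch. It is not the workshop's actual code: the key names and function names are hypothetical, and `client` is assumed to be a redis-py connection (version 3.0 or later, where `zadd` takes a mapping), for example `redis.Redis(decode_responses=True)`.

```python
import json

# Hypothetical key names, chosen only for this illustration.
TWEET_COUNT_KEY = "tweets:count"
POPULAR_KEY = "tweets:popular"
PHOTOS_CACHE_KEY = "cache:popular_photos"


def count_tweets(client, how_many):
    """Counter: atomically add to the running total of fetched tweets."""
    return client.incr(TWEET_COUNT_KEY, how_many)


def record_popularity(client, tweet_id, retweet_count):
    """Leaderboard: keep the tweet in a sorted set scored by retweet count."""
    # Assumes redis-py 3.0+, where zadd takes a {member: score} mapping.
    client.zadd(POPULAR_KEY, {str(tweet_id): retweet_count})


def top_tweets(client, limit=5):
    """Leaderboard: return the `limit` most retweeted tweets, best first."""
    return client.zrevrange(POPULAR_KEY, 0, limit - 1, withscores=True)


def cached_photos(client, compute, ttl_seconds=10):
    """Cache: reuse the stored value until it expires, else recompute it."""
    cached = client.get(PHOTOS_CACHE_KEY)
    if cached is not None:
        return json.loads(cached)
    photos = compute()
    # setex stores the value and lets Redis drop it after ttl_seconds.
    client.setex(PHOTOS_CACHE_KEY, ttl_seconds, json.dumps(photos))
    return photos
```

Passing the client in as an argument keeps the functions easy to try against a local Redis server or a stand-in object during the workshop.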
During the course of development, we will cover using models, forms, implementing views, how the ORM works, routing, templates, and how all these tie together during the flow of development.
1) Defining the problem statement. A brief introduction to what we are going to work through and achieve, so that everyone has a good framework to start from. This step will establish that:
- We are going to build a Django project.
- The end result is to fetch tweets and visualize them.
- We will use Celery to fetch the tweets without blocking the user.
- We want to know the top 5 popular tweets among all the tweets we fetch.
- We want to know how many tweets we fetched.
- We want to cache the popular pictures for 10 seconds.

2) Creating the project and discussing the required apps. This part will dive into hands-on coding after we discuss and decide on the apps that we require to complete this project. We will initiate a project and create the apps required.

3) Thinking about and implementing the models. Creating the database. We will discuss the models required to complete this project. Once we agree upon the required fields, the model will be implemented and syncdb will create the database for us.

4) Using forms for user input and the URL to be accessed. We will set up the Django form that we are going to use and validate user input. Once we finish implementing that, we will decide on and implement the URL pattern that the user visits to get their input box.

5) Firing off the Twitter API using Twython. Once we receive and validate the user input, we will learn how to call the Twitter API using the Twython library. We will run a few samples and print out the responses to check that things are fine and to get the hang of the Twitter API.

6) Introducing and implementing Celery. We will introduce Celery here: what it is used for and why you should use it, especially since in the previous step the user would have had to wait a long time after submitting the input while we fetch tweets from the Twitter API. We will implement a small task and see how it handles fetching tweets asynchronously. We will talk in a bit more detail about message queues, message brokers and exchanges.

7) Introducing and implementing Redis. To get everyone on the same page, we will explain what Redis is. We will then use Redis to handle the counter use case, where Redis keeps count of how many tweets we are fetching from the API.

8) Saving tweets. Once people are familiar with Celery and Redis, we will move on to saving all the tweets that we are fetching into the database. We will talk about ORM best practices and race conditions at this point.

9) Implementing Redis to calculate popular tweets. We will have to destroy the DB and the Redis keys, as we will talk about how to calculate the popular tweets here. Popularity is based on retweet count. The 'sorted set' data structure will be introduced and implemented here.

10) Implementing Redis as a cache. We will talk about caching and use Redis as a cache for displaying popular photos from the popular tweets that we already have.

11) Preparing to visualize! We will talk about the data required for visualization. The participants will be given a few minutes to generate that data from the database in which we have stored all the tweets. Once the time expires, we will use a ready-made solution for preparing the dataset.

12) Visualize! Once we have the dataset, we will visualize it by passing it on to the templates and using the Chart.js library to draw the chart.
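To give a feel for the dataset preparation in step 11, here is one possible sketch of bucketing tweets by hour. It is an assumption-laden illustration rather than the ready-made solution used in the workshop: the function name is hypothetical, and the timestamps are assumed to be in the `created_at` string format of Twitter API v1.1 responses, always in UTC (`+0000`).

```python
from collections import Counter
from datetime import datetime

# Assumed `created_at` format, e.g. "Mon Sep 01 10:15:00 +0000 2014".
# The "+0000" offset is matched literally rather than parsed.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"


def tweets_per_hour(created_at_values):
    """Return (labels, counts) lists, ordered by hour, for a chart."""
    buckets = Counter()
    for raw in created_at_values:
        moment = datetime.strptime(raw, CREATED_AT_FORMAT)
        # Truncate to the start of the hour so tweets share a bucket.
        buckets[moment.replace(minute=0, second=0)] += 1
    hours = sorted(buckets)
    labels = [hour.strftime("%Y-%m-%d %H:00") for hour in hours]
    counts = [buckets[hour] for hour in hours]
    return labels, counts


sample = [
    "Mon Sep 01 10:15:00 +0000 2014",
    "Mon Sep 01 10:45:30 +0000 2014",
    "Mon Sep 01 12:05:00 +0000 2014",
]
labels, counts = tweets_per_hour(sample)
print(labels)  # ['2014-09-01 10:00', '2014-09-01 12:00']
print(counts)  # [2, 1]
```

Note that hours with no tweets are simply absent from the result; whether to fill those gaps with zeros before handing the two lists to the template is a design choice left open here.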
Make an account on dev.twitter.com and register an app. Get the Consumer Key and Secret. After that, generate the OAuth token and secret. Keep these handy before coming to the workshop.
I will have an offline version ready (tweet JSONs in a text file) just in case.
You need to have the following software installed:
- Python 2.7
- SQLite 3 / PostgreSQL
- Redis 2.8
- Chart.js (http://www.chartjs.org/)
Have a virtualenv (http://virtualenv.readthedocs.or...) up and running with the following packages installed.
Any Linux distro (preferably Ubuntu). If you have Windows, please install VirtualBox with Ubuntu! If you know your way around Windows or Mac, then please feel free to use them. It is just that my dev experience on those platforms is almost nil, so I won't be of much help when you are trying to debug platform-specific errors.
I have been using Django and Python extensively only since November 2013, when I joined Eventifier. The tutorial above is based on my experience using Django, Celery and Redis for various purposes there, including the visualization feature mentioned above. Before joining Eventifier, I worked as a community manager at HasGeek. During that one year, I used to play around with a few pet projects, though nothing serious. I am one of the content writers for PyCon India.

I love to teach. I have done a couple of workshops for Bangpypers, the Bangalore Python User Group monthly meetup (http://www.meetup.com/BangPypers/). One was an Introduction to Django, where I was one of the three tutors (http://bangalore.python.org.in/blog/2014/01/18/january-meetup-report/). The second was an Introduction to Python (http://bangalore.python.org.in/blog/2014/02/15/february-meetup-report/).

I gave a beginner-level talk on Redis at PyCon Singapore 2014 (https://pycon.sg/schedule/presentation/32/). That was my first ever talk at a conference. :) See my video and transcript here: http://sosaysharis.wordpress.com/2014/06/29/an-introduction-to-redis-pycon-singapore-2014/

I have given the above workshop (except the Celery part) at both PyCon Ireland 2014 and PyCon Poland 2014. Here is the content and transcript: https://github.com/harisibrahimkv/dj_workshop

I also go around institutions teaching what I know. I have done a couple of workshops under the PythonExpress initiative: http://www.pythonexpress.in/trainers/harisibrahimkv

I want to do this tutorial because I felt that a lot more concrete knowledge could have been imparted had I given the audience a hands-on experience.

I guess that is about it. Do let me know in case you have any feedback regarding the content. Thanks!
- Blog: http://sosaysharis.wordpress.com/
- Github: http://github.com/harisibrahimkv
- Twitter: http://twitter.com/harisibrahimkv