Building a data pipeline for processing and storing Twitter data on AWS with Python

sparack | 21 Jun, 2020

Description:

Twitter data is a valuable source for academic researchers who wish to study the public conversation. Twitter's filtered stream API allows developers to get a sample of real-time Tweets as they happen. However, to process, store, and study this data effectively, you need an efficient data pipeline. In this workshop, we will build a data pipeline on Amazon Web Services (AWS) using Python to:

Stream data from the Twitter Filtered Stream Endpoint
Process this data with Amazon Simple Queue Service (SQS)
Store this Tweet data on Amazon S3
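
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the workshop's actual code: the function names `enqueue_tweets` and `drain_queue_to_s3`, the `tweets/<id>.json` key layout, and the assumption that each streamed line is a JSON object with the Tweet ID under `data.id` are all hypothetical. The `sqs` and `s3` arguments are expected to behave like boto3 clients; only `send_message`, `receive_message`, `delete_message`, and `put_object` are called.

```python
import json


def enqueue_tweets(lines, sqs, queue_url):
    """Forward each non-empty line from the stream to an SQS queue.

    `lines` is any iterable of str or bytes, one JSON Tweet per line.
    Blank lines (stream keep-alives) are skipped. Returns the number sent.
    """
    sent = 0
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        if not raw.strip():
            continue  # keep-alive or empty line, nothing to queue
        sqs.send_message(QueueUrl=queue_url, MessageBody=raw)
        sent += 1
    return sent


def drain_queue_to_s3(sqs, s3, queue_url, bucket, wait_seconds=1):
    """Read messages from SQS until the queue is empty and store each in S3.

    Assumes (hypothetically) that each message body is JSON shaped like
    {"data": {"id": "..."}}. Returns the list of S3 keys written.
    """
    stored_keys = []
    while True:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            return stored_keys
        for message in messages:
            body = message["Body"]
            tweet_id = json.loads(body)["data"]["id"]
            key = f"tweets/{tweet_id}.json"  # hypothetical key layout
            s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
            # Delete only after the object is stored, so a failed write
            # leaves the message on the queue for a retry.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            stored_keys.append(key)
```

In a real run, `sqs` and `s3` would typically be `boto3.client("sqs")` and `boto3.client("s3")`, and `lines` could come from `response.iter_lines()` on a streaming `requests` call to the filtered stream endpoint. Putting a queue between the stream and the storage step means the stream reader does no slow work itself, and a failed S3 write does not lose the Tweet.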

Prerequisites:

Ideally, participants should apply for a Twitter Developer Account and have an AWS account prior to attending this session.

Speaker Info:

Suhem Parack is a Sr. Developer Advocate at Twitter. He focuses on helping the academic research community succeed on Twitter's Developer Platform. Prior to joining Twitter, he worked as a Solutions Architect at Amazon in the Alexa org. Outside of writing code and helping developers, he enjoys reading and running.

Speaker Links:

https://developer.amazon.com/blogs/home/author/Suhem+Parack, https://blog.twitter.com/developer/en_us/authors.suhemparack.html, https://dev.to/suhemparack

Section:	Data Science, Machine Learning and AI
Type:	Workshop
Target Audience:	Intermediate
Last Updated:	21 Jun, 2020
