Building a data pipeline for processing and storing Twitter data on AWS with Python

sparack



Description:

Twitter data is a valuable source for academic researchers who study the public conversation. Twitter's filtered stream API lets developers receive real-time Tweets that match a set of filter rules as they are posted. To process, store, and study this data effectively, however, you need an efficient data pipeline. In this workshop, we will build a data pipeline on Amazon Web Services (AWS) using Python to:

  • Stream data from the Twitter Filtered Stream Endpoint
  • Process this data with Amazon Simple Queue Service (SQS)
  • Store this Tweet data on Amazon S3
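
As a rough illustration of how these three steps might fit together, here is a minimal sketch and not the workshop's actual code. It assumes the v2 filtered stream endpoint with bearer-token authentication, the `requests` and `boto3` libraries, and Tweets shaped like `{"data": {"id": ..., "text": ...}}`. The function names, the `tweets/` key prefix, and the queue URL and bucket parameters are all hypothetical.

```python
import json
from typing import Any, Iterable, Iterator

# Assumed endpoint for the v2 filtered stream.
STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"


def parse_stream_lines(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one dict per Tweet from a line-delimited JSON stream.

    Blank lines (sent by the stream as keep-alive signals) are skipped.
    """
    for raw in lines:
        if not raw.strip():
            continue
        yield json.loads(raw)


def enqueue_tweets(sqs_client: Any, queue_url: str, tweets: Iterable[dict]) -> int:
    """Send each Tweet to SQS as a JSON message; return how many were sent."""
    sent = 0
    for tweet in tweets:
        sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(tweet))
        sent += 1
    return sent


def drain_queue_to_s3(sqs_client: Any, s3_client: Any, queue_url: str, bucket: str) -> int:
    """Read one batch of messages from SQS, store each as an S3 object,
    then delete it from the queue. Returns the number of Tweets stored."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=5
    )
    stored = 0
    for message in response.get("Messages", []):
        tweet = json.loads(message["Body"])
        key = f"tweets/{tweet['data']['id']}.json"  # hypothetical key layout
        s3_client.put_object(
            Bucket=bucket, Key=key, Body=message["Body"].encode("utf-8")
        )
        # Delete only after the object is stored, so a failed write is retried.
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        stored += 1
    return stored


def stream_to_queue(bearer_token: str, queue_url: str) -> None:
    """Producer side: connect to the stream and push Tweets to SQS.

    Third-party imports are local so the helpers above can be used without them.
    """
    import boto3
    import requests

    sqs = boto3.client("sqs")
    headers = {"Authorization": f"Bearer {bearer_token}"}
    with requests.get(STREAM_URL, headers=headers, stream=True) as response:
        response.raise_for_status()
        enqueue_tweets(sqs, queue_url, parse_stream_lines(response.iter_lines()))
```

In a setup like this, `stream_to_queue` would run as a long-lived producer, while a separate worker would call `drain_queue_to_s3` in a loop. The queue decouples the two, so a slow or failed S3 write does not stall the stream connection.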

Prerequisites:

Ideally, participants should apply for a Twitter Developer Account and have an AWS account prior to attending this session.

Speaker Info:

Suhem Parack is a Sr. Developer Advocate at Twitter. He focuses on helping the academic research community succeed on Twitter's Developer Platform. Prior to joining Twitter, he worked as a Solutions Architect at Amazon in the Alexa org. Outside of writing code and helping developers, he enjoys reading and running.

Speaker Links:

https://developer.amazon.com/blogs/home/author/Suhem+Parack, https://blog.twitter.com/developer/en_us/authors.suhemparack.html, https://dev.to/suhemparack

Section: Data Science, Machine Learning and AI
Type: Workshop
Target Audience: Intermediate