Searchable Dataframes - Using Eland for In-Place, Pandas-Like Data Manipulation

kjaymiller


Description:

When exploring data in dataframes, I struggle to know in advance what information needs to be pulled down and reported on. Holding millions of records in memory only to visualize a few sections of them is a waste of resources and taxing on my computing environment. By storing object data in Elasticsearch, I can bring in data based on index patterns and load only the data that I need as I ask for it, letting my server storage handle the taxing work.

Outline (30min)

  • Discuss loading data and the quantity problem (10 minutes)
  • Show how Eland has a similar from/to model for retrieving and sending data
  • Discuss visualizing different slices of data with Pandas vs. searching for that data with Eland queries
  • Show visualizing data via Matplotlib from a Pandas and an Eland perspective
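
As a rough illustration of the from/to model in the outline, the sketch below sends a Pandas dataframe to Elasticsearch and then pulls back only one slice of it. It is a minimal sketch, not the talk's actual notebook code: the column names (`call_type`, `priority`), the sample rows, and the helper function names are hypothetical, and it assumes the `eland` package is installed and that the caller supplies a connected Elasticsearch client.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns may differ.
calls = pd.DataFrame(
    {
        "call_type": ["415", "459", "415", "10851"],
        "priority": [2, 1, 3, 1],
    }
)


def send_to_elasticsearch(pd_df, es_client, index_name):
    """Write a Pandas dataframe into an Elasticsearch index (the "to" side)."""
    # Imported here so the Pandas part of this sketch runs without eland.
    import eland as ed

    return ed.pandas_to_eland(
        pd_df,
        es_client=es_client,
        es_dest_index=index_name,  # hypothetical index name chosen by the caller
        es_if_exists="replace",    # overwrite any earlier copy of the index
        es_refresh=True,           # make the documents searchable right away
    )


def load_priority_slice(es_client, index_pattern, priority):
    """Read back only the rows with the given priority (the "from" side)."""
    import eland as ed

    # The Eland dataframe is a view over the index; rows stay in Elasticsearch.
    ed_df = ed.DataFrame(es_client=es_client, es_index_pattern=index_pattern)
    # The boolean mask becomes an Elasticsearch query instead of an in-memory scan.
    subset = ed_df[ed_df["priority"] == priority]
    # Only the matching rows are pulled into local memory.
    return ed.eland_to_pandas(subset)


# The same slice in plain Pandas, which needs every row loaded first.
in_memory_slice = calls[calls["priority"] == 1]
```

With a running cluster, `send_to_elasticsearch(calls, client, "some-index")` followed by `load_priority_slice(client, "some-index", 1)` should return the same rows as `in_memory_slice`, but the filtering happens on the server rather than in the notebook's memory.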

Prerequisites:

Basic understanding of Pandas (what a dataframe is) and of JSON.

Content URLs:

Slides - Notebook and data repo - https://github.com/kjaymiller/sd-police-call-data

Speaker Info:

Jay is a Developer Advocate at Elastic, based in San Diego, CA. A multipotentialite, Jay enjoys finding unique ways to merge his fascination with productivity, automation, and development to create tools and content that serve the tech community.

Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: