Big Data Analytics Using Apache Spark on IoT in an Industrial Way
Data science, also known as data-driven science, is an interdisciplinary field concerned with scientific methods, processes, and systems for extracting knowledge or insights from data in various forms, structured or unstructured, much like data mining. This talk walks through the steps of a data science workflow with Spark from beginning to end.
Apache Spark provides developers with an API (application programming interface) centered on a data structure called the resilient distributed dataset, or RDD: a read-only collection of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was designed to address the limitations of the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs. A MapReduce program reads input data from disk, maps a function across the input data, reduces the results of the map, and stores the reduction results on disk.
Under the MapReduce model, the data processing primitives are called mappers and reducers. Decomposing a data processing application into mappers and reducers is sometimes nontrivial. But once an application is written in MapReduce form, scaling it to run over, say, tens of thousands of machines in a cloud or cluster is little more than a configuration change.
What this talk covers:
Basic Model of Spark
- Spark Driver and Workers
- Resilient Distributed Datasets
- Why Spark performs faster
Spark with IoT
- Using Spark in an IoT Analytics Platform
- Analysis of IoT Device JSON Data
Ending notes and strong warnings
Attendees are expected to know basic Python. Some familiarity with big data, data analytics, and IoT would be helpful.
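To give a feel for the mapper and reducer primitives described in the abstract, here is a minimal sketch in plain Python rather than Spark. The sample readings, the field names `device_id` and `temp_c`, and the function names are all assumptions made up for illustration; they are not taken from the talk or from any Spark API.

```python
import json
from functools import reduce

# Hypothetical IoT readings, one JSON document per line (made up for illustration).
RAW_LINES = [
    '{"device_id": "sensor-a", "temp_c": 21.5}',
    '{"device_id": "sensor-b", "temp_c": 30.0}',
    '{"device_id": "sensor-a", "temp_c": 22.5}',
]


def mapper(line):
    """Map step: parse one JSON line into a (key, (sum, count)) pair."""
    record = json.loads(line)
    return record["device_id"], (record["temp_c"], 1)


def reducer(acc, pair):
    """Reduce step: merge one mapped pair into the running per-device totals."""
    key, (value, count) = pair
    total, n = acc.get(key, (0.0, 0))
    acc[key] = (total + value, n + count)
    return acc


def mean_temperature_by_device(lines):
    """Run map then reduce, and turn the (sum, count) totals into means."""
    totals = reduce(reducer, map(mapper, lines), {})
    return {key: total / n for key, (total, n) in totals.items()}


if __name__ == "__main__":
    print(mean_temperature_by_device(RAW_LINES))
```

Everything here runs on a single machine. In a cluster framework the same two functions would be applied to partitions of the data on many machines, which is why scaling becomes mostly a matter of configuration.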
Slides for this talk: https://www.slideshare.net/secret/F5wNhLeq9Xy98k
Shubham Sharma is an associate software engineer at certaintyinfotech, where he currently works on Spark and Hadoop. His work covers big data and analytics using Python, Pandas, Anaconda, Spark, and related tools.
He has strong roots in computer science and technology and believes it is the one thing that can make this world a better place. He is also interested in mobile application security, Hadoop security, and web development, as they are an integral part of our lives.