Real-time system monitoring using PySpark
Nandkishore Sharma (~nandkishore)
Monitoring and alerting are essential components of any system. As the number of services grows, monitoring all of them all the time becomes a mammoth task in itself. Hence there is a need for an intelligent system that monitors other systems' behaviour and notifies the appropriate stakeholders when an anomaly occurs. Here are some of the things I am going to cover:
- The need for effective anomaly detection in systems monitoring
- System metrics that matter to you: CPU, memory, disk, etc.
- Using StatsD and CollectD for data collection
- Building a useful data pipeline
- Using PySpark for real-time data processing
- Using NumPy and SciPy for business intelligence
- Implementing anomaly detection algorithms
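To give a flavour of the last step, here is a minimal sketch of one common anomaly-detection approach: flagging metric samples whose z-score exceeds a threshold, using NumPy. This is an illustration only; the function name, the sample CPU values, and the choice of z-scores are my assumptions, not necessarily the method covered in the talk.

```python
import numpy as np

def zscore_anomalies(values, threshold=3.0):
    """Flag samples whose absolute z-score exceeds the threshold.

    A simple statistical anomaly detector for a window of metric
    samples (e.g. CPU utilisation percentages). Illustrative only.
    """
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # A perfectly flat series has no outliers by this measure.
        return np.zeros(len(values), dtype=bool)
    z = np.abs(values - values.mean()) / std
    return z > threshold

# Steady CPU load with one spike at index 7
cpu = [22, 24, 23, 25, 21, 24, 23, 95, 22, 24]
print(np.flatnonzero(zscore_anomalies(cpu, threshold=2.5)))  # → [7]
```

In a streaming setting the same logic would run per micro-batch or sliding window (e.g. inside a PySpark `foreachBatch` handler), rather than over a static list as shown here.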
Participants should have some basic knowledge of systems monitoring. Familiarity with tools like New Relic or Grafana would be an added advantage, as would basic knowledge of streaming data and Kafka.
I am an Engineer at Grofers. I have worn multiple hats throughout my career, starting from full-stack engineering, to backend, to data, and now to release engineering. I co-founded the crowdsourced logistics platform DbyT, worked as a Programmer Analyst at Virginia-based RTS Labs and as a Salesforce Consultant for Richmond-based MCIM, and have worked with clients from the healthcare, mission-critical datacenter management, and payments industries.