Processing videos effectively - Piping the Parallelism out of Python
R S Nikhil Krishna (~rsnk)
Description:
Almost all of us have used VLC, simply because it's so good at what it does: it reads multiple file formats, transcodes videos, and makes basic filtering (brightness correction, etc.) effortless.
But have you ever wondered what makes VLC so efficient? VLC uses libavcodec in the backend, which gives it access to the API of FFmpeg, a key piece of video-processing software. However, one is seldom satisfied with the basic filters that libraries like libavcodec offer, which leaves us writing our own code with libraries like OpenCV and scikit-image. Such code is generally inefficient, though: it neither makes the best use of the machine's compute capabilities nor encodes the output with minimal loss.
In this talk, we will take a look at what it takes to process videos in Python as efficiently as libavcodec/FFmpeg, in terms of both speed and quality:
- Speeding up video processing (simple example: watermarking) by >2x in Python by parallelizing across frames: a) using multiple CPU cores, b) using GPU cores
- Scaling up to 4K and 8K videos
- Better quality and compression by piping the data from Python to FFmpeg
- Minimizing I/O access by storing transport streams in memory from Python
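The piping idea above can be sketched with the standard subprocess module: raw frames flow from Python straight into an FFmpeg encoder over a pipe, so no intermediate image files touch the disk. This is a minimal sketch, assuming the ffmpeg binary is on PATH; the frame size, rate, and codec flags here are illustrative, not the talk's exact settings.

```python
import subprocess

def ffmpeg_encode_cmd(width, height, fps, outfile):
    # -f rawvideo / -pix_fmt bgr24: frames arrive as raw BGR bytes on stdin,
    # which matches OpenCV's default channel order.
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo",
        "-pix_fmt", "bgr24",
        "-s", f"{width}x{height}",
        "-r", str(fps),
        "-i", "-",               # "-" means: read the input video from stdin
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        outfile,
    ]

def encode_frames(raw_frames, width, height, fps, outfile):
    # Each element of raw_frames is one frame as width*height*3 raw bytes
    # (e.g. a NumPy array's .tobytes()).
    proc = subprocess.Popen(
        ffmpeg_encode_cmd(width, height, fps, outfile),
        stdin=subprocess.PIPE,
    )
    for frame in raw_frames:
        proc.stdin.write(frame)
    proc.stdin.close()
    return proc.wait()
```

Because the encoder runs as a separate process, Python can keep producing frames while FFmpeg compresses the previous ones.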
The talk will be structured as follows:
- Quick basics of computer vision: how images and videos are stored and managed, and how to handle videos in Python using FFmpeg and OpenCV, an open-source computer vision library (5 min)
- Using piping and multiprocessing to smartly speed up and improve your video processing (15 min)
- Simple parallelism in Python using the multiprocessing module, and how it extends to CV (Computer Vision) (5 min)
- Space vs. time trade-offs in parallel video processing
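The frame-level parallelism mentioned above can be illustrated with a short sketch built on the standard-library multiprocessing module. `add_watermark` is a hypothetical stand-in for a real per-frame filter, and the "frames" are plain NumPy arrays rather than decoded video; the point is only that independent frames map cleanly onto worker processes, sidestepping the GIL.

```python
from multiprocessing import Pool

import numpy as np

def add_watermark(frame):
    # Hypothetical filter: brighten a 16x16 corner block to "mark" the frame.
    out = frame.copy()
    block = out[:16, :16].astype(np.int16) + 64
    out[:16, :16] = np.clip(block, 0, 255).astype(np.uint8)
    return out

def process_frames(frames, workers=4):
    # Each worker process handles frames independently; results come back
    # in the original order.
    with Pool(workers) as pool:
        return pool.map(add_watermark, frames)

frames = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(8)]
results = process_frames(frames)
```

For real videos, the per-process cost of pickling frames matters, which is part of the space-vs-time trade-off mentioned above.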
- Parallelizing the video-processing pipeline on the GPU using numba and cudatoolkit
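The GPU route can be sketched with numba's CUDA JIT. This is an illustrative sketch only, assuming a CUDA-capable GPU with the numba and cudatoolkit packages installed; the brightness kernel is a hypothetical stand-in for a real filter.

```python
# Sketch of a per-pixel GPU filter using numba's CUDA JIT.
# Assumes a CUDA-capable GPU plus the numba and cudatoolkit packages.
import numpy as np
from numba import cuda

@cuda.jit
def brighten(frame, out, delta):
    # One GPU thread per pixel.
    y, x = cuda.grid(2)
    if y < frame.shape[0] and x < frame.shape[1]:
        for c in range(3):
            v = frame[y, x, c] + delta
            out[y, x, c] = 255 if v > 255 else v

def brighten_on_gpu(frame, delta=32):
    d_frame = cuda.to_device(frame)           # copy frame to GPU memory
    d_out = cuda.device_array_like(frame)
    threads = (16, 16)
    blocks = ((frame.shape[0] + 15) // 16, (frame.shape[1] + 15) // 16)
    brighten[blocks, threads](d_frame, d_out, delta)
    return d_out.copy_to_host()               # copy result back to the CPU
```

The host-to-device copies bracket the kernel launch, so for small frames the transfer can dominate; the speedup appears at 4K/8K resolutions where per-pixel work outweighs the copies.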
Prerequisites:
Recommended:
- Fundamentals of computer vision
- What pipes are and how they are used
Preferable:
- Understanding of the Python GIL and multiprocessing
- Basics of what FFmpeg is, how it is used, and what makes it efficient
- Basic understanding of CUDA kernels
Content URLs:
The presentation, which is still being curated, can be viewed here
The code to generate the results can be viewed here
The audience might find these links useful; however, they are not prerequisites.
- Detailed presentation by the creator of the multiprocessing module
- Piping in Python
- Related content: Faster Video FPS in Python by shifting the decoding to another thread
Speaker Info:
At the Computer Vision and Intelligence (CVI) Group at IIT Madras, both speakers have conducted sessions introducing hundreds of college students to Python, computer vision, and AI, openly accessible at their GitHub repo.
R S Nikhil Krishna
Nikhil is a final-year student at IIT Madras. He currently leads the Computer Vision and AI team at Detect Technologies and has headed the CVI group at CFI, IIT Madras in the past. He has worked on semi-autonomous tumour detection for automated brain surgery at the Division of Remote Handling and Robotics, BARC, and on importance sampling for accelerated gradient optimization methods applied to deep learning at EPFL, Switzerland. His love for Python started about four years ago, with a multitude of computer vision projects like QR code recognition, facial expression identification, etc.
Lokesh Kumar T
Lokesh is a third-year student at IIT Madras. He currently co-heads the CVI group at CFI. He uses Python for computer vision, deep learning, and language analysis. At Detect Technologies, he has worked on automating chimney and stack inspections using computer vision and on on-board vision-based processing for drones. His interest in Python began during his stay at IIT Madras, from institute courses to CVI projects like face recognition, hand-gesture control of bots, etc.
Speaker Links:
R S Nikhil Krishna
Lokesh Kumar T