Unlocking a Simple, Extensible, and Performant Video Pipeline at Fyma with NVIDIA DeepStream

Discover how Fyma used NVIDIA DeepStream to improve the performance of its vision AI applications while speeding up development time.

Alvin Clark
7 min readintermediate
--
View Original

Overview

The article discusses how Fyma, a computer vision company, leveraged NVIDIA DeepStream to create a simple, extensible, and performant video processing pipeline. By overcoming challenges associated with live video streaming, Fyma improved their application’s performance and reduced development time significantly.

What You'll Learn

1

How to implement a video processing pipeline using NVIDIA DeepStream

2

Why using hardware acceleration improves video processing performance

3

How to utilize GStreamer components for video analytics

Prerequisites & Requirements

  • Understanding of computer vision concepts and video processing
  • Familiarity with NVIDIA DeepStream SDK(optional)
  • Experience with Python and video processing libraries

Key Questions Answered

What challenges does Fyma face with live video streaming?
Fyma encounters several challenges with live video streaming, including broken video from cameras, network issues causing interruptions, and the need for real-time processing to prevent frame accumulation. These challenges necessitate a robust video processing pipeline that can handle the infinite nature of live streams.
How did Fyma improve their video processing performance?
Fyma improved their video processing performance by implementing a custom ffmpeg encoder and transitioning to NVIDIA DeepStream, which allowed for hardware acceleration and reduced memory usage. This led to frame rates increasing up to 500 fps for a single video stream and improved accuracy by 2-3 times compared to earlier versions.
What is the significance of using DeepSORT in Fyma's application?
DeepSORT was integrated to enhance object tracking accuracy in Fyma's application. This required modifications to their custom encoder to output visual features, which improved the tracking capabilities but also increased CPU memory usage, necessitating an asynchronous tracking approach to balance resource consumption.
What benefits does Fyma gain from using NVIDIA DeepStream?
By using NVIDIA DeepStream, Fyma achieved a simpler and more extensible codebase while significantly enhancing performance. They were able to run multiple camera streams efficiently, achieving near real-time data processing and reducing the overall complexity of their application.

Key Statistics & Figures

Frame rate improvement
up to 500 fps
Achieved for a single video stream after implementing DeepStream
Accuracy improvement
2-3 times
Compared to the initial implementation before using DeepStream
Development time
less than two months
Time taken to implement DeepStream in Fyma's application

Technologies & Tools

Backend
Nvidia Deepstream
Used for building the video processing pipeline
Backend
Gstreamer
Framework for handling multimedia processing in the pipeline
Backend
Opencv
Initially used for capturing video in earlier versions
Backend
Ffmpeg
Used for video decoding and encoding
Backend
Deepsort
Improved object tracking accuracy in the application
Backend
Triton Inference Server
Enabled sharing of neural networks between camera streams

Key Actionable Insights

1
Adopting NVIDIA DeepStream can drastically enhance the performance of video processing applications.
Fyma's transition to DeepStream resulted in frame rates up to 10 times faster and improved accuracy. This demonstrates the importance of leveraging advanced frameworks to optimize resource usage and processing speed.
2
Implementing hardware acceleration is crucial for real-time video analytics.
Fyma's experience shows that moving from software to hardware decoding and encoding significantly improved their processing capabilities, emphasizing the need for hardware solutions in demanding applications.
3
Utilizing a pipeline-based architecture can simplify complex video processing tasks.
Fyma's use of GStreamer and DeepStream illustrates how a structured approach to building video pipelines can lead to better performance and easier maintenance of the codebase.

Common Pitfalls

1
Relying solely on software for video decoding and processing can severely limit performance.
Fyma's initial implementations suffered from low frame rates due to software limitations. Transitioning to hardware acceleration was essential for achieving the necessary performance levels.
2
Complex architectures can lead to uneven resource usage and increased maintenance challenges.
The introduction of asynchronous tracking in Fyma's application added complexity. Simplifying the architecture while maintaining performance is crucial for long-term success.

Related Concepts

Computer Vision
Video Processing
Real-time Analytics
Machine Learning Models