Simple streaming telemetry

Introducing gnmi-gateway: a modular, distributed, and highly available service for modern network telemetry via OpenConfig and gNMI

Netflix Technology Blog
11 min read · intermediate

Overview

The article discusses Netflix's implementation of streaming telemetry through the open-source project gnmi-gateway, which leverages the OpenConfig data model and gRPC Network Management Interface (gNMI) protocol. It highlights the challenges of traditional network management tools and presents a modular solution for collecting and distributing telemetry data from network devices.

What You'll Learn

1. How to set up and configure gnmi-gateway for streaming telemetry
2. Why OpenConfig and gNMI improve network management
3. How to implement clustering for high availability in telemetry systems

Prerequisites & Requirements

  • Understanding of network management concepts and telemetry
  • Go (Golang) 1.13 or later, git, and openssl

Key Questions Answered

What is gnmi-gateway and how does it function?
gnmi-gateway is a modular, distributed service for collecting OpenConfig-modeled streaming telemetry data over gNMI. It lets network operators collect telemetry from many kinds of network devices and distribute it to other systems, addressing the limitations of traditional network management tools.
How does Netflix ensure high availability in gnmi-gateway?
Netflix implements clustering in gnmi-gateway to allow multiple service instances to coordinate connections to targets, ensuring failure tolerance. This setup avoids duplicate connections and enables load balancing among instances, enhancing overall system reliability.
What are the key features of the gNMI protocol?
gNMI provides four RPC mechanisms: Capabilities, Get, Set, and Subscribe. The Subscribe RPC is particularly important as it streams state changes from network devices to clients, allowing real-time monitoring and management.
What challenges exist with traditional network management tools?
Traditional approaches such as SNMP and CLI screen-scraping are often unstructured and vendor-proprietary, and they depend on active polling. Polling can miss changes that occur between samples and becomes harder to scale as the number of devices grows. Streaming telemetry addresses these shortcomings by having devices push state changes to clients.
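
The difference between polling and streaming can be illustrated with a small Go sketch. It is a toy model under stated assumptions, not gNMI code: the `Update` type and the `poll`, `stream`, and `collect` functions are hypothetical names invented for this example, and the interface state is a hard-coded sequence of ticks. The poller samples every few ticks, while the stream emits an update on every change.

```go
package main

import "fmt"

// Update is a hypothetical state-change event. It only loosely resembles an
// on-change telemetry notification and is not the gNMI wire format.
type Update struct {
	Tick  int
	State string
}

// poll samples the state every `interval` ticks and records a change only
// when two consecutive samples differ.
func poll(states []string, interval int) []Update {
	var seen []Update
	last := ""
	for t := 0; t < len(states); t += interval {
		if states[t] != last {
			seen = append(seen, Update{Tick: t, State: states[t]})
			last = states[t]
		}
	}
	return seen
}

// stream pushes an update on the returned channel whenever the state changes,
// then closes the channel once the sequence is exhausted.
func stream(states []string) <-chan Update {
	ch := make(chan Update)
	go func() {
		defer close(ch)
		last := ""
		for t, s := range states {
			if s != last {
				ch <- Update{Tick: t, State: s}
				last = s
			}
		}
	}()
	return ch
}

// collect drains the channel into a slice.
func collect(ch <-chan Update) []Update {
	var out []Update
	for u := range ch {
		out = append(out, u)
	}
	return out
}

func main() {
	// The interface flaps DOWN at tick 2 and recovers at tick 3.
	states := []string{"UP", "UP", "DOWN", "UP", "UP", "UP"}
	fmt.Println("polled:  ", poll(states, 4))
	fmt.Println("streamed:", collect(stream(states)))
}
```

With a polling interval of four ticks, the poller sees only the initial state and misses the flap entirely, while the stream reports all three transitions.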

Technologies & Tools


  • gNMI (protocol): Used for streaming telemetry data from network devices.
  • OpenConfig (data model): Provides a strongly typed, vendor-agnostic data model for network devices.
  • Go (programming language): The language used to build the gnmi-gateway service.
  • gRPC (protocol): Facilitates communication between clients and gNMI targets.
  • Apache ZooKeeper (clustering tool): Used for managing clustering and connection coordination in gnmi-gateway.
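
OpenConfig data is addressed by hierarchical paths such as `/interfaces/interface[name=Ethernet1/1]/state/counters/in-octets`, where bracketed expressions select an entry in a list. As a rough illustration of that structure, the Go sketch below splits such a path into named elements with keys. The `PathElem` type and the `splitPath` and `parsePath` functions are hypothetical helpers written for this example, not part of any gNMI library, and the parser deliberately ignores escaped characters.

```go
package main

import (
	"fmt"
	"strings"
)

// PathElem is a simplified, hypothetical stand-in for one path element:
// a node name plus optional list keys.
type PathElem struct {
	Name string
	Keys map[string]string
}

// splitPath splits on "/" but leaves slashes inside [...] key expressions
// alone, so interface names such as Ethernet1/1 survive intact.
func splitPath(path string) []string {
	var parts []string
	var cur strings.Builder
	depth := 0
	for _, r := range path {
		switch {
		case r == '[':
			depth++
		case r == ']' && depth > 0:
			depth--
		}
		if r == '/' && depth == 0 {
			if cur.Len() > 0 {
				parts = append(parts, cur.String())
				cur.Reset()
			}
			continue
		}
		cur.WriteRune(r)
	}
	if cur.Len() > 0 {
		parts = append(parts, cur.String())
	}
	return parts
}

// parsePath converts a path string into elements with their keys.
// Assumption: keys have the form [k=v] and contain no escaped brackets.
func parsePath(path string) ([]PathElem, error) {
	var elems []PathElem
	for _, part := range splitPath(path) {
		elem := PathElem{Name: part, Keys: map[string]string{}}
		if i := strings.Index(part, "["); i >= 0 {
			elem.Name = part[:i]
			// "[a=1][b=2]" becomes "a=1][b=2" and then ["a=1", "b=2"].
			for _, kv := range strings.Split(strings.Trim(part[i:], "[]"), "][") {
				eq := strings.Index(kv, "=")
				if eq < 0 {
					return nil, fmt.Errorf("malformed key %q in %q", kv, part)
				}
				elem.Keys[kv[:eq]] = kv[eq+1:]
			}
		}
		elems = append(elems, elem)
	}
	return elems, nil
}

func main() {
	elems, err := parsePath("/interfaces/interface[name=Ethernet1/1]/state/counters/in-octets")
	if err != nil {
		panic(err)
	}
	for _, e := range elems {
		fmt.Println(e.Name, e.Keys)
	}
}
```

Because the same path shape applies regardless of vendor, a consumer can select, for example, every `in-octets` counter without device-specific parsing.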

Key Actionable Insights

1. Implementing gnmi-gateway can significantly enhance your network monitoring capabilities by providing real-time telemetry data.
   By leveraging streaming telemetry, network operators can avoid the pitfalls of traditional polling methods, leading to more accurate and timely insights into network performance.
2. Utilizing OpenConfig data models can standardize your network device configurations, making management easier and more efficient.
   With a vendor-agnostic approach, OpenConfig allows for consistent data structures across different devices, simplifying integration and reducing complexity.
3. Consider clustering your telemetry services to improve fault tolerance and availability.
   By coordinating multiple instances of gnmi-gateway, you can ensure that your telemetry data collection remains robust even in the face of individual service failures.
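
The coordination idea behind clustering, one owner per target with takeover on failure, can be sketched in Go as follows. This is a minimal model under stated assumptions, not gnmi-gateway's implementation: `LockRegistry` is a hypothetical in-memory stand-in for a distributed lock service such as ZooKeeper, and the instance and target names are made up.

```go
package main

import (
	"fmt"
	"sync"
)

// LockRegistry is a hypothetical in-memory stand-in for a distributed lock
// service. It models only the rule "at most one owner per target".
type LockRegistry struct {
	mu     sync.Mutex
	owners map[string]string // target name -> owning instance ID
}

// NewLockRegistry returns an empty registry.
func NewLockRegistry() *LockRegistry {
	return &LockRegistry{owners: make(map[string]string)}
}

// TryAcquire claims target for instance. It returns true if instance owns
// the target afterwards, and false if another instance already holds it.
func (r *LockRegistry) TryAcquire(target, instance string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if owner, held := r.owners[target]; held {
		return owner == instance
	}
	r.owners[target] = instance
	return true
}

// ReleaseAll drops every lock held by instance, as would happen when a
// failed instance's session with the lock service expires.
func (r *LockRegistry) ReleaseAll(instance string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for target, owner := range r.owners {
		if owner == instance {
			delete(r.owners, target)
		}
	}
}

// Owner reports which instance currently holds target, if any.
func (r *LockRegistry) Owner(target string) (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	owner, held := r.owners[target]
	return owner, held
}

func main() {
	reg := NewLockRegistry()

	// Each instance claims one target; a second claim on router-a is refused,
	// which prevents a duplicate connection.
	reg.TryAcquire("router-a", "gw-1")
	reg.TryAcquire("router-b", "gw-2")
	fmt.Println("gw-2 may connect to router-a:", reg.TryAcquire("router-a", "gw-2"))

	// gw-1 fails, its locks are released, and gw-2 takes over router-a.
	reg.ReleaseAll("gw-1")
	fmt.Println("gw-2 may connect to router-a:", reg.TryAcquire("router-a", "gw-2"))
}
```

An instance would open a connection to a target only after `TryAcquire` succeeds, so surviving instances pick up the targets of a failed peer the next time they retry.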

Common Pitfalls

1. Failing to implement proper error handling in telemetry systems can lead to data loss.
   Without robust error handling, issues during data streaming can result in missed telemetry updates, which can compromise network monitoring efforts.
2. Neglecting to secure gNMI connections can expose sensitive network data.
   The gNMI specification requires sessions to be encrypted with TLS, so running without it leaves your telemetry data vulnerable to interception.

Related Concepts

Streaming Telemetry
Network Management
OpenConfig Data Models
gRPC Communication