Syscall Auditing at Scale

If you are an engineer whose organization uses Linux in production, I have two quick questions for you: 1) How many unique outbound TCP connections have your servers made in the past hour? 2) Which processes and users initiated each of those connections? If you can answer both of these questions, fantastic! You can skip the…

Ryan Huber
11 min read · advanced

Overview

The article discusses the implementation of syscall auditing at scale, specifically through the use of the open-source tool go-audit developed by Slack. It highlights how syscall monitoring can enhance security and operational insights in Linux environments by providing detailed logging of system calls.

What You'll Learn

  1. How to implement syscall auditing using go-audit
  2. Why centralized logging improves security monitoring
  3. When to use auditd versus go-audit for syscall monitoring

Prerequisites & Requirements

  • Basic understanding of Linux syscalls and auditing concepts
  • Familiarity with Linux command-line tools like auditctl (optional)

Key Questions Answered

How does go-audit improve syscall logging compared to auditd?
go-audit converts multiline events from auditd into a single JSON blob, speaks directly to the kernel via netlink, and is designed to be highly performant. This allows for more efficient and scalable logging of syscall events, making it easier to analyze and monitor system behavior.
What are the benefits of centralized logging for syscall events?
Centralized logging allows events from many servers to be correlated in one place, so malicious activity can be identified without revealing to an attacker on a compromised host which commands are being watched. It also enables the analysis of historical data to improve security posture and incident response.
What challenges are associated with using auditd for syscall monitoring?
Auditd's log format can be difficult to work with due to its key=value structure, multiline events, and potential for interleaved messages. These challenges can hinder real-time monitoring and analysis, which is why go-audit was developed.
How much log volume can be expected when using go-audit?
The log volume is highly variable, but Slack reports logging hundreds of gigabytes per day across approximately 5500 instances. This indicates the scale at which go-audit can operate effectively in a production environment.
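
The multiline-to-JSON conversion described in the answers above can be illustrated with a minimal Python sketch. It is not go-audit's implementation (go-audit is written in Go and reads events from the kernel over netlink), and the output layout is not go-audit's schema. The function names, the sample records, and the JSON layout are all assumptions chosen for illustration. The sketch parses key=value records, groups them by the timestamp:sequence ID in the msg=audit(...) header so that interleaved events are still reassembled, and emits one JSON object per event.

```python
import json
import re
import shlex
from collections import defaultdict

# Record header, e.g. "type=SYSCALL msg=audit(1364481363.243:24287): ..."
HEADER = re.compile(r"^type=(\S+) msg=audit\((\d+\.\d+):(\d+)\):\s*(.*)$")


def parse_fields(body):
    """Split the key=value body of one record into a dict (quotes removed)."""
    fields = {}
    for token in shlex.split(body):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def merge_events(lines):
    """Group records sharing a (timestamp, sequence) ID into one JSON string each."""
    events = defaultdict(lambda: {"records": []})
    for line in lines:
        match = HEADER.match(line.strip())
        if not match:
            continue  # skip lines that are not audit records
        rec_type, timestamp, sequence, body = match.groups()
        event = events[(timestamp, sequence)]
        event["timestamp"] = timestamp
        event["sequence"] = int(sequence)
        event["records"].append({"type": rec_type, **parse_fields(body)})
    return [json.dumps(event, sort_keys=True) for event in events.values()]


# Illustrative records only: two events whose lines are interleaved.
SAMPLE = [
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=2 success=no pid=3538 uid=1000 comm="cat" exe="/bin/cat"',
    'type=SYSCALL msg=audit(1364481363.250:24288): arch=c000003e syscall=42 success=yes pid=3540 uid=1000 comm="curl" exe="/usr/bin/curl"',
    'type=CWD msg=audit(1364481363.243:24287):  cwd="/home/shadowman"',
    'type=PATH msg=audit(1364481363.243:24287): item=0 name="/etc/ssh/sshd_config"',
]

for blob in merge_events(SAMPLE):
    print(blob)
```

shlex is used here only to strip quotes from values. Real audit records can contain hex-encoded or nested-quoted values that this sketch does not handle.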

Key Statistics & Figures

  • Instances logging data: 5500. This number reflects the scale at which Slack operates its monitoring infrastructure using go-audit.
  • Daily log volume: hundreds of gigabytes. This volume highlights the extensive data generated by syscall auditing in a large production environment.

Technologies & Tools

  • go-audit (Backend): Used for syscall auditing and logging in Linux environments.
  • auditd (Backend): Traditional syscall auditing tool that go-audit aims to enhance.
  • Elasticsearch (Backend): Used for storing and querying logs generated by go-audit.

Key Actionable Insights

  1. Implementing go-audit can significantly enhance your organization's monitoring capabilities by providing detailed syscall logs. This is particularly useful for security teams needing to investigate incidents and understand system behavior in real time.
  2. Centralizing your logging infrastructure allows for better analysis and correlation of events across multiple servers. This approach can help identify patterns of malicious behavior that might be missed when monitoring individual servers.
  3. Utilizing JSON for log formatting can simplify the integration with modern logging systems like Elasticsearch. This makes it easier to set up alerts and dashboards for monitoring system activity.
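
To make the JSON-to-Elasticsearch point concrete, here is a minimal Python sketch that turns a list of already-parsed events into a request body for Elasticsearch's _bulk API, which expects newline-delimited JSON: one action line followed by one document line per event, with a trailing newline. The index name syscall-audit and the event fields (host, exe) are hypothetical, and the sketch only builds the body; sending it and handling errors are left out.

```python
import json


def to_bulk_body(events, index_name):
    """Build an NDJSON body for the _bulk API: an action line, then the document."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(event, sort_keys=True))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical events; the field names are assumptions, not go-audit's schema.
events = [
    {"host": "web-1", "exe": "/usr/bin/curl"},
    {"host": "web-2", "exe": "/bin/cat"},
]
print(to_bulk_body(events, "syscall-audit"), end="")
```

Because each event is already a single JSON object, no per-line reassembly is needed at this stage, which is the practical benefit of emitting JSON at the source.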

Common Pitfalls

  1. Filtering events on individual servers can lead to missing critical information about potential threats. This happens because attackers can exploit specific filters, and legitimate commands may also trigger alerts, leading to alert fatigue.
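
One way to picture the alternative to per-server filtering is a small Python sketch of central-side tagging: every event is kept, a watchlist that lives only on the central system marks the interesting ones, and matches are grouped across hosts. The watchlist contents, the field names (host, exe), and the function names are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical watchlist, kept only on the central system.
WATCHED_EXECUTABLES = {"/usr/bin/nc", "/usr/bin/nmap"}


def tag_events(events, watched=WATCHED_EXECUTABLES):
    """Copy every event and add a 'flagged' field; no event is dropped."""
    return [{**event, "flagged": event.get("exe") in watched} for event in events]


def flagged_hosts(events, watched=WATCHED_EXECUTABLES):
    """Map each watched executable to the sorted list of hosts that ran it."""
    hosts = defaultdict(set)
    for event in tag_events(events, watched):
        if event["flagged"]:
            hosts[event["exe"]].add(event["host"])
    return {exe: sorted(names) for exe, names in hosts.items()}


# Hypothetical events as they might arrive from several servers.
EVENTS = [
    {"host": "web-1", "exe": "/usr/bin/nc"},
    {"host": "web-2", "exe": "/bin/cat"},
    {"host": "db-1", "exe": "/usr/bin/nc"},
]
print(flagged_hosts(EVENTS))
```

Since unflagged events are retained rather than discarded, the watchlist can be changed later and re-applied to historical data.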

Related Concepts

Syscall Monitoring
Centralized Logging
Security Auditing