GitHub’s Metal Cloud

At GitHub we place an emphasis on stability, availability, and performance. A large component of ensuring we excel in these areas is deploying services on bare-metal hardware. This allows us…

Lee Reilly
8 min read · intermediate

Overview

GitHub emphasizes stability, availability, and performance, and deploys services on bare-metal hardware to achieve them. The article describes gPanel, a Ruby on Rails application for managing physical infrastructure, and outlines the automated processes for server provisioning and operating system installation.

What You'll Learn

1. How GitHub automates server provisioning with gPanel
2. Why bare-metal hardware can enhance performance and availability
3. When to use the Intelligent Platform Management Interface (IPMI) for hardware management

Prerequisites & Requirements

  • Understanding of server provisioning and operating systems
  • Familiarity with Ruby on Rails and infrastructure management tools (optional)

Key Questions Answered

How does gPanel manage physical hardware in GitHub's data centers?
gPanel is the source of truth for GitHub's data centers: it tracks and manages physical components such as cabinets, PDUs, and servers. It also automates provisioning, installing and configuring operating systems on new hardware, so that anyone in the company can bring up machines without specialized knowledge.
What is the role of IPMI in GitHub's infrastructure management?
GitHub uses IPMI for out-of-band management, which lets it monitor and control servers remotely, independent of the installed operating system. Tasks such as firmware updates and system reboots can be automated through this interface, which helps maintain the performance and reliability of its bare-metal servers.
What steps are involved in the burn-in process for new hardware?
The burn-in process includes two states: 'breakin' and 'memtesting'. During 'breakin', hardware is exercised to detect issues, while 'memtesting' uses a custom MemTest86 image to check memory integrity. If failures are detected, the hardware is marked as 'failed' for further review.
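
The burn-in flow described above can be sketched as a small state machine. This is only an illustration, not gPanel's code: the state names 'breakin', 'memtesting', and 'failed' come from the article, while the 'ready' state, the assumed order (breakin first, then memtesting), and the function names are hypothetical.

```python
from enum import Enum


class State(Enum):
    BREAKIN = "breakin"
    MEMTESTING = "memtesting"
    FAILED = "failed"
    READY = "ready"  # hypothetical name for "passed burn-in"; not from the article


# Assumed ordering: breakin runs first, then memtesting.
NEXT_ON_PASS = {
    State.BREAKIN: State.MEMTESTING,
    State.MEMTESTING: State.READY,
}


def advance(state: State, passed: bool) -> State:
    """Return the next burn-in state given the result of the current check."""
    if state not in NEXT_ON_PASS:
        raise ValueError(f"{state.value} is a terminal state")
    return NEXT_ON_PASS[state] if passed else State.FAILED


def burn_in(results: dict) -> State:
    """Walk a machine through burn-in using a mapping of state -> pass/fail."""
    state = State.BREAKIN
    while state in NEXT_ON_PASS:
        state = advance(state, results[state])
    return state
```

A machine that fails either check ends in 'failed' and waits for review; keeping the transitions in a single table makes it easy to add further checks later.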

Technologies & Tools

  • Backend: Ruby on Rails. Used to develop gPanel, the physical infrastructure management application.
  • Operating system: Ubuntu. Serves as the base for the PXE image used in server provisioning.
  • Management interface: IPMI. Facilitates out-of-band management of hardware.
  • Network booting: iPXE. Used for network booting and retrieving instructions from gPanel.
  • System information: Facter. Gathers system information during the provisioning process.
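
To show how IPMI and network booting fit together, here is a minimal sketch that builds `ipmitool` commands to make a server boot from the network on its next start and then power-cycle it. It is an assumption about how such a step could look, not GitHub's actual tooling; the host address, user name, and helper function names are hypothetical.

```python
import subprocess


def ipmi_command(host: str, user: str, *args: str) -> list[str]:
    """Build an ipmitool argv for a remote BMC over IPMI-over-LAN (lanplus).

    -E tells ipmitool to read the password from the IPMI_PASSWORD
    environment variable instead of the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def pxe_reboot_commands(host: str, user: str) -> list[list[str]]:
    """Commands to make the next boot use PXE, then power-cycle the chassis."""
    return [
        ipmi_command(host, user, "chassis", "bootdev", "pxe"),
        ipmi_command(host, user, "chassis", "power", "cycle"),
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping on the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical BMC address and user; prints the commands without running them.
    for argv in pxe_reboot_commands("10.0.0.42", "admin"):
        print(" ".join(argv))
```

Building the argument lists separately from running them keeps the commands easy to inspect and test without touching real hardware.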

Key Actionable Insights

1. A central inventory application like gPanel can streamline the management of physical hardware in a data center.
   By centralizing hardware management and automating provisioning, teams reduce the time and expertise needed to deploy new servers, which supports faster scaling.
2. IPMI extends remote management of server hardware.
   Teams can perform critical tasks such as firmware updates and system reboots without physical access, which is especially useful for distributed teams managing large data centers.
3. Automating the burn-in process helps identify hardware issues early.
   Automated testing during burn-in reduces the risk of deploying faulty hardware and so improves reliability in production.
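
The idea of a single source of truth for cabinets, PDUs, and servers can be sketched as a small in-memory inventory. This is an illustrative model only: gPanel is a Rails application whose schema the article summary does not describe, so every class, field, and state name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    serial: str
    state: str = "unprovisioned"  # hypothetical state name


@dataclass
class Cabinet:
    name: str
    pdus: list[str] = field(default_factory=list)
    servers: list[Server] = field(default_factory=list)


class Inventory:
    """In-memory stand-in for a database-backed hardware inventory."""

    def __init__(self) -> None:
        self._cabinets: dict[str, Cabinet] = {}

    def add_cabinet(self, cabinet: Cabinet) -> None:
        self._cabinets[cabinet.name] = cabinet

    def locate(self, serial: str) -> Optional[str]:
        """Return the name of the cabinet holding the server, or None."""
        for cabinet in self._cabinets.values():
            if any(s.serial == serial for s in cabinet.servers):
                return cabinet.name
        return None

    def servers_in_state(self, state: str) -> list[Server]:
        """List every server currently recorded in the given state."""
        return [
            s
            for cabinet in self._cabinets.values()
            for s in cabinet.servers
            if s.state == state
        ]
```

With one place to ask where a machine is and what state it is in, the provisioning and burn-in steps can be driven from queries rather than from an operator's memory.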

Common Pitfalls

1. Failing to automate hardware provisioning leads to inefficiency.
   Without automation, provisioning depends on specialized knowledge, which makes operations hard to scale and increases the risk of errors during deployment.
2. Neglecting the burn-in process can result in deploying faulty hardware.
   Skipping test phases such as 'breakin' and 'memtesting' may lead to unexpected failures in production that affect service availability and performance.