Hosted Search: LinkedIn Search as a managed service

LinkedIn Engineering Team
12 min read, intermediate

Overview

The article describes LinkedIn's Hosted Search, a fully managed, cloud-based search service that lets application teams add search to their products with little integration work. It traces the move from LinkedIn's legacy Search as a Service (SeaS) system to Hosted Search, focusing on the reduced operational overhead and improved developer efficiency.

What You'll Learn

1. How to integrate search functionality with minimal onboarding using Hosted Search
2. Why a clear boundary between data transformation and indexing simplifies operations
3. When to leverage Hosted Search for Global Secondary Indexes in Espresso

Prerequisites & Requirements

  • Understanding of search functionalities and cloud services

Key Questions Answered

What are the main advantages of using Hosted Search over legacy SeaS?
Hosted Search is a fully managed service that takes operational overhead off application teams so they can focus on product development. It simplifies integrating search and automates workflows, which reduces the effort of maintaining a search solution.
How does Hosted Search support Global Secondary Indexes in Espresso?
Hosted Search builds indexes from data stored in Espresso tables, and the Espresso router sends Global Secondary Index (GSI) queries to Hosted Search. Hosted Search evaluates the query across all partitions and returns the keys of the matching documents, which are then fetched from the Espresso storage nodes.
What is the role of the HS-Controller in Hosted Search?
The HS-Controller orchestrates all aspects of the Hosted Search ecosystem, managing the allocation of HS-Clusters and coordinating data transformations. It interacts with various services to ensure efficient operations, including triggering offline index creations and deployments.
What are the key components of the Hosted Search architecture?
The architecture includes tenant-indexes (TIs), HS-Clusters, and the HS-Controller, which manages resources and workflows. Nuage serves as the customer-facing portal that application teams use for onboarding and monitoring.
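
The GSI query path described above (router asks the index for matching keys, then fetches the documents from storage) can be sketched roughly as follows. This is a minimal in-memory illustration, not LinkedIn's implementation: `StorageNode`, `Router`, `build_index`, `partition_for`, and the CRC-based partitioning are all hypothetical names and choices made for this example, and the real index is built and served by Hosted Search rather than held inside the router.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical stand-in for one Espresso storage node (one partition)."""
    docs: dict = field(default_factory=dict)


def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: simple deterministic hash partitioning, chosen for the demo.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_index(nodes, field_name):
    """Build a value -> set-of-document-keys index from every partition."""
    index = {}
    for node in nodes:
        for key, doc in node.docs.items():
            index.setdefault(doc[field_name], set()).add(key)
    return index


class Router:
    """Hypothetical router: writes go to one partition, GSI queries go to the index."""

    def __init__(self, num_partitions, field_name):
        self.nodes = [StorageNode() for _ in range(num_partitions)]
        self.field_name = field_name
        self.index = {}

    def put(self, key, doc):
        self.nodes[partition_for(key, len(self.nodes))].docs[key] = doc

    def refresh_index(self):
        # Stands in for the index build that Hosted Search performs.
        self.index = build_index(self.nodes, self.field_name)

    def gsi_query(self, value):
        # Step 1: the index returns matching document keys across all partitions.
        keys = sorted(self.index.get(value, set()))
        # Step 2: each document is fetched from the storage node that owns its key.
        return [
            (key, self.nodes[partition_for(key, len(self.nodes))].docs[key])
            for key in keys
        ]


router = Router(num_partitions=3, field_name="company")
router.put("m1", {"name": "Ada", "company": "LinkedIn"})
router.put("m2", {"name": "Bo", "company": "Other"})
router.put("m3", {"name": "Cy", "company": "LinkedIn"})
router.refresh_index()
matches = router.gsi_query("LinkedIn")
```

The point of the two-step lookup is that the index only needs to hold keys, while the source-of-truth documents stay on the storage nodes.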

Key Statistics & Figures

Number of use cases onboarded in the first year of Hosted Search
70
This number surpasses the total use cases leveraging the legacy SeaS solution during its entire lifetime.
Number of verticals served by Hosted Search
40
This reflects the rapid adoption and scaling capabilities of Hosted Search compared to its predecessor.

Technologies & Tools

Database
Espresso
LinkedIn's online, distributed, fault-tolerant document store used as the source of truth for many applications.
Stream Processing
Apache Kafka
Used for handling live updates and data transformations in the Hosted Search ecosystem.
Workflow Management
Azkaban
Utilized by the HS-Controller to trigger the creation of offline indexes and their deployment.
Stream Processing
Apache Samza
Framework used for coordinating data transformations on nearline sources of truth.

Key Actionable Insights

1
Application teams should adopt Hosted Search to reduce the complexity of integrating search into their products.
By leveraging Hosted Search, teams can focus on delivering value to users without the burden of maintaining search infrastructure, thus accelerating development cycles.
2
Using the automated workflows in Hosted Search improves operational stability and reduces human error.
Automated processes such as blue-green deployments roll changes out safely, which lowers the risk of regressions and improves overall system reliability.
3
Understanding the clear separation between data transformation and indexing in Hosted Search can lead to better system design.
This separation allows teams to manage their data pipelines more effectively, improving the performance and maintainability of their search solutions.
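
As an illustration of the blue-green deployment idea mentioned in the second insight, here is a minimal sketch assuming two index slots where only one serves traffic at a time. The source does not describe how Hosted Search implements this; `BlueGreenIndex`, `build_index`, and the `validate` callback are hypothetical names introduced for this example.

```python
def build_index(docs):
    """Build a term -> sorted list of document ids from {doc_id: text}."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, []).append(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


class BlueGreenIndex:
    """Hypothetical two-slot index: one slot is live, the other is idle."""

    def __init__(self, initial_docs):
        self.slots = {"blue": build_index(initial_docs), "green": None}
        self.live = "blue"

    def deploy(self, new_docs, validate):
        # Build the candidate away from the live slot, so queries are unaffected.
        idle = "green" if self.live == "blue" else "blue"
        candidate = build_index(new_docs)
        if not validate(candidate):
            # Validation failed: the live slot keeps serving the old index.
            return False
        self.slots[idle] = candidate
        self.live = idle  # Switch traffic only after validation succeeds.
        return True

    def search(self, term):
        return self.slots[self.live].get(term.lower(), [])


serving = BlueGreenIndex({"1": "hosted search"})
ok = serving.deploy(
    {"1": "hosted search", "2": "search service"},
    validate=lambda index: "search" in index,
)
rejected = serving.deploy({}, validate=lambda index: len(index) > 0)
```

Because the previous index stays in the other slot, a failed validation (or a later rollback) never requires rebuilding what was already serving.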

Common Pitfalls

1
Underestimating the complexity of running search without a managed service.
Many teams may struggle with the operational overhead and technical challenges associated with maintaining their own search infrastructure, leading to delays and increased costs.

Related Concepts

Search as a Service (SeaS)
Cloud-based Search Solutions
Data Transformation and Indexing
Automated Deployment Strategies