Building and maintaining the skills taxonomy that powers LinkedIn's Skills Graph

LinkedIn Engineering Team
11 min read, intermediate level

Overview

The article discusses the development and maintenance of LinkedIn's skills taxonomy, which underpins the Skills Graph. It emphasizes the importance of a skills-first approach in hiring and details the methodologies, including human curation and machine learning, used to enhance the taxonomy.

What You'll Learn

1. How to leverage a skills taxonomy for better job matching
2. Why a skills-first approach can enhance talent acquisition
3. How to apply machine learning techniques to scale taxonomy construction

Prerequisites & Requirements

  • Understanding of skills taxonomy and its applications in hiring
  • Familiarity with machine learning concepts and tools (optional)

Key Questions Answered

What is the purpose of LinkedIn's skills taxonomy?
LinkedIn's skills taxonomy organizes and categorizes skills based on their hierarchical relationships, allowing for effective matching of member skills to job opportunities. It serves as a foundational vocabulary for the Skills Graph, which enhances user experiences in tools like Recruiter and LinkedIn Learning.
How does LinkedIn ensure the quality of its skills taxonomy?
The skills taxonomy is curated through a combination of human taxonomists and machine learning. This dual approach helps identify and refine skill candidates, ensuring that the taxonomy remains relevant and high-quality by reducing noise and redundancy in skill data.
What role does machine learning play in the skills taxonomy?
LinkedIn uses machine learning, specifically the KGBert model, to predict relationships between skills and to scale taxonomy construction. The model significantly improves the accuracy of identifying skill relationships, which strengthens the Skills Graph built on top of the taxonomy.
How has the skills taxonomy evolved over time?
Since February 2021, LinkedIn's skills taxonomy has grown by nearly 35%. It now encompasses nearly 39,000 skills, 374,000 aliases, and over 200,000 connections between skills. This growth reflects how quickly skills change across industries.
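
The answers above describe three building blocks: canonical skills, aliases that point to them, and hierarchical connections between skills. A minimal sketch of how such a structure could be represented is shown below. It is an illustration only, not LinkedIn's implementation: the class names, method names, skill IDs, and the three example skills are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Skill:
    """One canonical skill node (hypothetical schema, not LinkedIn's)."""
    skill_id: str
    name: str
    aliases: Set[str] = field(default_factory=set)   # abbreviations, translations
    parents: Set[str] = field(default_factory=set)   # IDs of broader skills


class SkillsTaxonomy:
    """Toy in-memory taxonomy: alias lookup plus parent/child links."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}
        self._alias_index: Dict[str, str] = {}  # normalized label -> skill_id

    def add_skill(self, skill: Skill) -> None:
        self._skills[skill.skill_id] = skill
        # Index the canonical name and every alias under a normalized key.
        for label in {skill.name, *skill.aliases}:
            self._alias_index[label.casefold()] = skill.skill_id

    def resolve(self, text: str) -> Optional[Skill]:
        """Map free text (name or alias) to its canonical skill, if known."""
        skill_id = self._alias_index.get(text.strip().casefold())
        return self._skills.get(skill_id) if skill_id else None

    def ancestors(self, skill_id: str) -> Set[str]:
        """Return the IDs of all broader skills reachable via parent links."""
        seen: Set[str] = set()
        stack = list(self._skills[skill_id].parents)
        while stack:
            current = stack.pop()
            if current in seen or current not in self._skills:
                continue
            seen.add(current)
            stack.extend(self._skills[current].parents)
        return seen


def build_example() -> SkillsTaxonomy:
    """Build a three-skill taxonomy from invented example data."""
    taxonomy = SkillsTaxonomy()
    taxonomy.add_skill(Skill("artificial_intelligence", "Artificial Intelligence", {"AI"}))
    taxonomy.add_skill(Skill("machine_learning", "Machine Learning", {"ML"},
                             {"artificial_intelligence"}))
    taxonomy.add_skill(Skill("deep_learning", "Deep Learning", {"DL"},
                             {"machine_learning"}))
    return taxonomy
```

In this sketch, resolving an alias such as "ML" yields the canonical "Machine Learning" skill, and walking parent links from a specific skill collects the broader skills that could also be matched against a job posting.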

Key Statistics & Figures

Growth of skills taxonomy
Nearly 35%
Growth since February 2021; the taxonomy now includes nearly 39,000 skills.
Number of aliases in the taxonomy
374,000
These aliases include abbreviations and translations for the skills listed.
Connections between skills
Over 200,000
These connections help establish relationships within the skills taxonomy.

Technologies & Tools

Machine Learning
KGBert
Used to predict relationships between skills and enhance the skills taxonomy.
Backend Service
Rest.li
Transforms structural information from the skills taxonomy into the Skills Graph for application use.
Data Storage
Hadoop Distributed File System
Stores the Skills Graph data for both online and offline use cases.

Key Actionable Insights

1. Implementing a skills-first approach can significantly widen your talent pool.
   By focusing on skills rather than traditional qualifications, companies can access a broader range of candidates, particularly in sectors facing talent shortages.
2. Utilizing machine learning models like KGBert can streamline the process of taxonomy construction.
   These models can enhance the accuracy and efficiency of identifying relationships between skills, making it easier to keep the skills taxonomy up to date.
3. Regularly updating your skills taxonomy is crucial for maintaining relevance in the job market.
   As new skills emerge, continuously evolving the taxonomy ensures that it reflects current industry demands and trends.

Common Pitfalls

1. Failing to regularly update the skills taxonomy can lead to outdated or irrelevant skill data.
   As industries evolve, new skills emerge, and without regular updates, the taxonomy may not reflect current job market needs.
2. Over-relying on automated systems without human oversight can introduce errors in skill categorization.
   While machine learning can enhance efficiency, human curation is essential to ensure quality and relevance in the skills taxonomy.
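
One way to guard against the second pitfall is to treat model output as suggestions that a taxonomist must approve. The sketch below assumes a model that emits a confidence score for each candidate relationship. The `CandidateRelation` type, the `triage` function, and the 0.5 threshold are hypothetical, and the source does not describe LinkedIn's actual review workflow.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CandidateRelation:
    """A model-suggested link between two skills (hypothetical schema)."""
    child: str
    parent: str
    score: float  # assumed model confidence in [0, 1]


def triage(
    candidates: List[CandidateRelation],
    review_threshold: float = 0.5,  # assumed cutoff, tune against labeled data
) -> Tuple[List[CandidateRelation], List[CandidateRelation]]:
    """Split candidates into a human review queue and a discard pile.

    Nothing is added to the taxonomy automatically: candidates at or above
    the threshold are queued for a taxonomist, highest score first, and the
    rest are dropped as likely noise.
    """
    review_queue = sorted(
        (c for c in candidates if c.score >= review_threshold),
        key=lambda c: c.score,
        reverse=True,
    )
    discarded = [c for c in candidates if c.score < review_threshold]
    return review_queue, discarded
```

Sorting the review queue by score lets reviewers spend their time on the most promising suggestions first, while the threshold keeps obviously weak predictions out of their way.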

Related Concepts

Skills-first Hiring Approach
Machine Learning in Data Curation
Dynamic Skills Taxonomy Development