XGBoost Programming Tutorials & Engineering Articles
127 XGBoost tutorials, guides, and engineering insights from NVIDIA, Uber, LinkedIn, and more
XGBoost Articles & Tutorials
The article discusses Project Aether, a tool developed by NVIDIA to facilitate the migration of CPU-based Apache Spark workloads to GPU-accelerated environments on Amazon EMR.
Navin Kumar
6 min read
Includes Code
Has Summary
--
Shopify open-sources Tangle, an ML experimentation platform built to solve six common failure modes in machine learning development.
Shopify Engineering
12 min read
Has Summary
--
The article discusses how Uber enhanced its Guidance Heatmap using deep probabilistic models to provide drivers with better insights into potential earnings.
Bob Zheng, Jane Hung, Arushi Singh, Dhruv Ghulati, Yifan Yu, Paul Frend, Elif Eser
9 min read
Has Summary
--
The article discusses the integration of XGBoost with Polars DataFrames, emphasizing the benefits of GPU acceleration for machine learning workflows.
Jiaming Yuan
7 min read
Includes Code
Has Summary
--
This article discusses how Uber has integrated explainability into its machine learning platform, Michelangelo, using Integrated Gradients (IG) to provide interpretable attributions for deep learni...
Hugh Chen, Eric Wang, Gaoyuan Huang, Howard Yu, Jia Li, Sally Lee
14 min read
Has Summary
--
This article provides insights into GPU-accelerating machine learning model training using CUDA-X Data Science, focusing on tree-based models like XGBoost, LightGBM, and CatBoost.
The article presents a comprehensive playbook developed through extensive experience in Kaggle competitions, detailing seven effective modeling techniques for handling tabular data.
The article discusses the advancements in XGBoost 3.0, particularly its ability to train with terabyte-scale datasets on a single NVIDIA Grace Hopper Superchip.
This article discusses seven drop-in replacements for popular Python libraries that can significantly speed up data science workflows by leveraging GPU acceleration.
Jamil Semaan
8 min read
Includes Code
Has Summary
--
The article discusses the introduction of cuda-cccl, a Python library that provides high-level building blocks for NVIDIA CUDA kernel fusion, enabling developers to write efficient algorithms witho...
Ashwin Srinath
5 min read
Includes Code
Has Summary
--
NVIDIA utilizes data science and machine learning to enhance chip manufacturing processes, focusing on optimizing workflows through the use of CUDA-X libraries like cuDF and cuML.
Divyansh Jain
8 min read
Includes Code
Has Summary
--
The article discusses the enhancements in the Forest Inference Library (FIL) within NVIDIA cuML 25.04, focusing on its capabilities for fast inference of tree-based models.
Dante Gama Dessavre
10 min read
Includes Code
Has Summary
--
The article discusses the application of Graph Neural Networks (GNNs) in enhancing fraud detection within financial services.
Naim
10 min read
Includes Code
Has Summary
--
The article discusses the latest enhancements in RAPIDS, including zero-code-change acceleration for Python machine learning, significant IO performance improvements, and out-of-core XGBoost capabi...
Apache, Azure, Azure Blob Storage, Dask, Gemini, Google Cloud, Google Cloud Storage, LightGBM, NetworkX, Polars, Python, scikit-learn, XGBoost
Nick Becker
9 min read
Includes Code
Has Summary
--
The article discusses how Atgenomix SeqsLab leverages NVIDIA technologies to enhance health omics analysis for precision medicine.
Yu-Ting Lin
9 min read
Has Summary
--
The article discusses the use of GPU acceleration to enhance performance in Apache Spark applications, highlighting the challenges of migrating workloads from CPUs to GPUs.
Matt Ahrens
9 min read
Includes Code
Has Summary
--
Kaggle Grandmasters David Austin, Chris Deotte, and Ruchi Bhatia shared insights on their winning strategies for data science competitions at the Google Cloud Next conference.
Jenn Yonemitsu
9 min read
Has Summary
--
The article discusses how feature engineering, particularly using NVIDIA cuDF-pandas for GPU acceleration, can significantly enhance model accuracy in Kaggle competitions involving tabular data.
This article discusses how Uber enhances personalized CRM communication using contextual bandit strategies, particularly focusing on the application of AI/ML techniques to optimize email content.
LJ (Lin) He, Yifeng Wu, Gaurav Jindal
13 min read
Has Summary
--
The article discusses how Uber utilizes Ray®, a general compute engine for Python®, to enhance the efficiency of its rides business through improved machine learning model performance and optimizat...
Kaichen Wei, Matt Walker, Peng Zhang
15 min read
Has Summary
--
The article discusses the strategies employed by the winners of the NVIDIA hackathon at ODSC West, focusing on how they utilized RAPIDS Python APIs to enhance machine learning workflows.
The article discusses the integration of CUDA-accelerated Homomorphic Encryption into Federated XGBoost, enhancing data privacy and security in federated learning environments.
Ziyue Xu
10 min read
Includes Code
Has Summary
--
The article discusses best practices for multi-GPU data analysis using RAPIDS with Dask, emphasizing the need for efficient memory management and accelerated networking.
The article discusses how NVIDIA RAPIDS can enhance causal inference on large datasets by leveraging GPU acceleration, specifically through the integration of the cuML library with the DoubleML fra...
Nick Becker
4 min read
Includes Code
Has Summary
--
The article discusses the practical implementation of Federated XGBoost using NVIDIA FLARE, highlighting its capabilities for concurrent training, fault tolerance, and experiment tracking.
Yuan-Ting Hsieh
5 min read
Includes Code
Has Summary
--
The article introduces five new technical courses offered by NVIDIA aimed at enhancing skills in AI and data science.
Apache, Apache Arrow, Apache Spark, Computer Vision, Natural Language Processing, Prompt Engineering, PyTorch, Transformer, Transformers, XGBoost
Rachel Ho
4 min read
Has Summary
--
This article provides a comprehensive guide on leveraging RAPIDS for GPU-accelerated data processing on Databricks.
The article discusses Uber's evolution in machine learning (ML) through its centralized platform, Michelangelo, highlighting its transition from predictive to generative AI.
Apache, Apache Spark, AutoML, Deep Learning, Docker, Generative AI, Hugging Face, Keras, Kubernetes, PaLM, Prompt Engineering, PyTorch, TensorFlow, XGBoost
Kai Wang, Min Cai, Joseph Wang, Eric Chen
28 min read
Has Summary
--
The article discusses the rapid adoption of federated learning (FL) and the new features introduced in NVIDIA FLARE 2.4.
AWS, Azure, Federated Learning, GPT, Graph Neural Networks, gRPC, Hugging Face, Machine Learning, Neural Networks, PyTorch, XGBoost
Chester Chen
15 min read
Includes Code
Has Summary
--
The article discusses the development of LinkedIn's 'People You May Know' (PYMK) recommendation system, detailing its architecture and the challenges faced in scaling its scoring mechanism to handl...
Parag Agrawal
7 min read
Has Summary
--
The article discusses the integration of Metaflow and NVIDIA Triton Inference Server for developing and deploying machine learning models.
Eddie Mattia
12 min read
Includes Code
Has Summary
--
The article discusses the collaboration between H2O.ai and NVIDIA to enhance AI applications in financial services through generative AI and predictive analytics.
The article discusses a novel approach to clustering large and diverse datasets by combining dimensionality reduction, recursion, and supervised machine learning.
The article discusses how LinkedIn enhances its content moderation efforts through a new framework that utilizes machine learning for dynamic content prioritization.
Abhishek Chandak
7 min read
Has Summary
--
The article discusses Spotify's innovative approach to automating content marketing to efficiently acquire users at scale.
The article discusses how to optimize multi-GPU model training using Dask and XGBoost, addressing common challenges such as out-of-memory errors.
The article discusses how GPU-accelerated data analytics can enhance machine learning (ML) projects by improving speed and scalability.
Jay Rodge
14 min read
Includes Code
Has Summary
--
The article discusses the application of federated learning to traditional machine learning methods, highlighting its advantages in communication efficiency and the ability to train models collabor...
Kris Kersten
3 min read
Has Summary
--
The article discusses Cloudflare's Constellation, a set of APIs for running low-latency AI inference tasks on their global network.
Rita Kozlov
7 min read
Has Summary
--
This article discusses the use of time-series models, specifically autoregressive recursive neural networks and XGBoost, for predicting credit defaults.
Jiwei Liu
11 min read
Includes Code
Has Summary
--
The article discusses the development of Stripe Radar, a fraud prevention solution that evaluates transactions in real-time to prevent fraud.
Ryan Drapeau
11 min read
Has Summary
--
This article discusses the process of building categories for Airbnb listings using a combination of machine learning (ML) and human review.
Mihajlo Grbovic
13 min read
Has Summary
--
The article discusses Shopify's Merlin machine learning platform, focusing on its online inference capabilities for real-time predictions.
Isaac Vidas
15 min read
Has Summary
--
The article discusses the new capability of XGBoost 1.7 to handle categorical features without manual encoding, which simplifies the training and inference processes for machine learning models.
Chris Jarrett
5 min read
Includes Code
Has Summary
--
The article discusses Spotify's evolution in machine learning (ML) infrastructure, emphasizing the integration of Ray to enhance flexibility and scalability for diverse ML practitioners.
Divita Vohra
13 min read
Includes Code
Has Summary
--
The article discusses how Uber optimizes the timing of push notifications using machine learning and linear programming.
The article discusses NVIDIA FLARE 2.2, an open-source platform for federated learning that introduces new features aimed at reducing development time and enhancing deployment efficiency.
The article discusses the importance of explainability in machine learning models, particularly through the use of SHAP (SHapley Additive Explanations) and its GPU-accelerated variant, GPUTreeShap.
Parul Pandey
14 min read
Includes Code
Has Summary
--
The article discusses how Graph Neural Networks (GNNs) and NVIDIA GPUs can optimize fraud detection in financial services.
Ashish Sardana
21 min read
Includes Code
Has Summary
--
The article discusses the challenges of deploying AI models in production and how NVIDIA Triton Inference Server addresses these challenges.
Shankar Chandrasekaran
11 min read
Includes Code
Has Summary
--