How NVIDIA Uses scikit-learn

82 articles about scikit-learn from NVIDIA's engineering team


Articles

NVIDIA

Intermediate

Training XGBoost Models with GPU-Accelerated Polars DataFrames

The article discusses the integration of XGBoost with Polars DataFrames, emphasizing the benefits of GPU acceleration for machine learning workflows.

Polars, Rust, scikit-learn, XGBoost

Jiaming Yuan

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Building an Interactive AI Agent for Lightning-Fast Machine Learning Tasks

The article discusses the development of an interactive AI agent designed to streamline machine learning workflows by leveraging GPU acceleration.

Machine Learning, Python, scikit-learn, Streamlit

Allison Ding

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

How NVIDIA DGX Spark’s Performance Enables Intensive AI Tasks

The article discusses how the NVIDIA DGX Spark supercomputer enhances performance for intensive AI tasks, providing a local alternative to cloud computing.

Fine-tuning, GPT, Hugging Face, PyTorch, scikit-learn

Allen Bourgoyne

5 min read

Has Summary

NVIDIA

Intermediate

How to GPU-Accelerate Model Training with CUDA-X Data Science

This article provides insights into GPU-accelerating machine learning model training using CUDA-X Data Science, focusing on tree-based models like XGBoost, LightGBM, and CatBoost.

CatBoost, LightGBM, Python, scikit-learn, SHAP, XGBoost

Divyansh Jain

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data

The article presents a comprehensive playbook developed through extensive experience in Kaggle competitions, detailing seven effective modeling techniques for handling tabular data.

AutoML, CatBoost, LightGBM, Polars, scikit-learn, XGBoost

Kazuki Onodera

12 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA RAPIDS 25.08 Adds New Profiler for cuML, Updates to the Polars GPU Engine, Additional Algorithm Support,

The NVIDIA RAPIDS 25.

Embedding, Polars, Python, scikit-learn

Brian Tepera

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows

This article discusses seven drop-in replacements for popular Python libraries that can significantly speed up data science workflows by leveraging GPU acceleration.

NetworkX, Polars, Python, scikit-learn, XGBoost

Jamil Semaan

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Optimizing Vector Search for Indexing and Real-Time Retrieval with NVIDIA cuVS

The article discusses the advancements in NVIDIA cuVS, a GPU-accelerated vector search library designed for high-performance indexing and low-latency retrieval.

Apache, Elasticsearch, Google Cloud, Java, Oracle, Python, Rust, scikit-learn, Vertex AI

Corey Nolet

7 min read

Has Summary

NVIDIA

Intermediate

RAPIDS Adds GPU Polars Streaming, a Unified GNN API, and Zero-Code ML Speedups

RAPIDS version 25.

Dask, Polars, Python, PyTorch, scikit-learn

Brian Tepera

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

AI in Manufacturing and Operations at NVIDIA: Accelerating ML Models with NVIDIA CUDA-X Data Science

NVIDIA utilizes data science and machine learning to enhance chip manufacturing processes, focusing on optimizing workflows through the use of CUDA-X libraries like cuDF and cuML.

Polars, Python, scikit-learn, SHAP, XGBoost

Divyansh Jain

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Supercharge Tree-Based Model Inference with Forest Inference Library in NVIDIA cuML

The article discusses the enhancements in the Forest Inference Library (FIL) within NVIDIA cuML 25.04, focusing on its capabilities for fast inference of tree-based models.

LightGBM, NumPy, Python, scikit-learn, XGBoost

Dante Gama Dessavre

10 min read

Includes Code

Has Summary

NVIDIA

Intermediate

RAPIDS Brings Zero-Code-Change Acceleration, IO Performance Gains, and Out-of-Core XGBoost

The article discusses the latest enhancements in RAPIDS, including zero-code-change acceleration for Python machine learning, significant IO performance improvements, and out-of-core XGBoost capabi...

Apache, Azure, Azure Blob Storage, Dask, Gemini, Google Cloud, Google Cloud Storage, LightGBM, NetworkX, Polars, Python, scikit-learn, XGBoost

Nick Becker

9 min read

Includes Code

Has Summary

NVIDIA

Advanced

Applying Specialized LLMs with Reasoning Capabilities to Accelerate Battery Research

The article discusses the transformative role of domain-adapted large language models (LLMs) with reasoning capabilities in accelerating battery research.

Claude, Gemini, GPT, Kubernetes, LLaMA, scikit-learn

Rucha Apte

11 min read

Has Summary

NVIDIA

Advanced

Spotlight: Accelerating the Discovery of New Battery Materials with SES AI’s Molecular Universe

The article discusses how SES AI is leveraging NVIDIA's advanced hardware and software to accelerate the discovery of new battery materials through a comprehensive mapping of the Molecular Universe.

Python, scikit-learn

Kang Xu

6 min read

Has Summary

NVIDIA

Advanced

Stacking Generalization with HPO: Maximize Accuracy in 15 Minutes with NVIDIA cuML

The article discusses the combination of stacking generalization and hyperparameter optimization (HPO) using NVIDIA's cuML library to enhance machine learning model accuracy efficiently.

Optuna, scikit-learn

Allison Ding

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Kaggle Grandmasters Unveil Winning Strategies for Data Science Superpowers

Kaggle Grandmasters David Austin, Chris Deotte, and Ruchi Bhatia shared insights on their winning strategies for data science competitions at the Google Cloud Next conference.

Google Cloud, LightGBM, MVP, scikit-learn, XGBoost

Jenn Yonemitsu

9 min read

Has Summary

NVIDIA

Advanced

Accelerating the Future of Transportation with SES AI’s NVIDIA-Powered Innovation for Electric

SES AI is leveraging NVIDIA's advanced hardware and software to revolutionize battery technology for electric vehicles (EVs) by accelerating the discovery of novel materials through AI-driven appro...

Python, scikit-learn

Wen Jie Ong

6 min read

Has Summary

NVIDIA

Intermediate

NVIDIA cuML Brings Zero Code Change Acceleration to scikit-learn

NVIDIA cuML has introduced a zero code change capability that allows data scientists and machine learning engineers to accelerate scikit-learn applications on NVIDIA GPUs without modifying existing...

NumPy, Python, scikit-learn

Siddharth Sharma

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

Accelerating Time Series Forecasting with RAPIDS cuML

The article discusses how RAPIDS cuML can accelerate time series forecasting by utilizing GPU-accelerated machine learning techniques.

Deep Learning, Python, scikit-learn

Brian Tepera

4 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Harnessing GPU Acceleration for Multi-Label Classification with RAPIDS cuML

The article discusses the challenges of multi-label classification in machine learning and how RAPIDS cuML, a GPU-accelerated library, can enhance the efficiency of these workflows.

Python, scikit-learn

Nick Becker

4 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Faster Causal Inference on Large Datasets with NVIDIA RAPIDS

The article discusses how NVIDIA RAPIDS can enhance causal inference on large datasets by leveraging GPU acceleration, specifically through the integration of the cuML library with the DoubleML fra...

Python, scikit-learn, XGBoost

Nick Becker

4 min read

Includes Code

Has Summary

NVIDIA

Intermediate

NVIDIA RAPIDS 24.10 Introduces Accelerated NetworkX with Zero Code Change, Updates for UMAP and cuDF-

The NVIDIA RAPIDS v24.

GitHub Actions, NetworkX, NumPy, Polars, Python, Rapids, scikit-learn

Nick Becker

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

Even Faster and More Scalable UMAP on the GPU with RAPIDS cuML

The article discusses the enhancements made to the UMAP dimension reduction algorithm using RAPIDS cuML, focusing on its accelerated performance on GPUs.

Docker, Python, scikit-learn

Jinsol Park

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Accelerating Predictive Maintenance in Manufacturing with RAPIDS AI

The article discusses how RAPIDS AI can accelerate predictive maintenance in manufacturing by leveraging advanced data analytics to minimize downtime and optimize maintenance schedules.

Apache, Apache Arrow, Azure, Pandas, Python, scikit-learn

Amarnath Mohan

11 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Bringing Confidentiality to Vector Search with Cyborg and NVIDIA cuVS

The article discusses the collaboration between Cyborg and NVIDIA to enhance the security of vector databases through the NVIDIA cuVS library, which accelerates encrypted vector search.

scikit-learn

Nicolas Dupont

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

Robust Scene Text Detection and Recognition: Introduction

The article discusses the significance of robust scene text detection and recognition (STDR) in various applications, emphasizing the challenges faced in recognizing text from natural scenes.

Keras, PyTorch, scikit-learn, TensorFlow

Vishal Chavan

8 min read

Has Summary

NVIDIA

Advanced

Reduce Apache Spark ML Compute Costs with New Algorithms in Spark RAPIDS ML Library

The article discusses the Spark RAPIDS ML library, an open-source Python package that accelerates Apache Spark ML applications using NVIDIA GPU technology.

Apache, Apache Spark, AWS, PySpark, Python, scikit-learn

Erik Ordentlich

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Accelerated Data Analytics: Machine Learning with GPU-Accelerated Pandas and Scikit-learn

The article discusses how GPU-accelerated data analytics can enhance machine learning (ML) projects by improving speed and scalability.

Apache, Apache Arrow, LightGBM, Machine Learning, Pandas, Python, scikit-learn, XGBoost

Jay Rodge

14 min read

Includes Code

Has Summary

NVIDIA

Intermediate

GPU-Accelerated Single-Cell RNA Analysis with RAPIDS-singlecell

The article discusses the advancements in single-cell RNA sequencing analysis using the RAPIDS-singlecell library, which leverages GPU acceleration to significantly enhance performance.

Numba, NumPy, Python, scikit-learn, SciPy

Severin Dicks

13 min read

Includes Code

Has Summary

NVIDIA

Advanced

Applying Federated Learning to Traditional Machine Learning Methods

The article discusses the application of federated learning to traditional machine learning methods, highlighting its advantages in communication efficiency and the ability to train models collabor...

Federated Learning, Machine Learning, scikit-learn, XGBoost

Kris Kersten

3 min read

Has Summary

NVIDIA

Beginner

Predicting Credit Defaults Using Time-Series Models with Recursive Neural Networks and XGBoost

This article discusses the use of time-series models, specifically autoregressive recursive neural networks and XGBoost, for predicting credit defaults.

LightGBM, Neural Networks, PyTorch, scikit-learn, TensorFlow, XGBoost

Jiwei Liu

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Limit Order Book Dataset Generation for Accelerated Short-Term Price Prediction with RAPIDS

The article discusses the generation of a Limit Order Book (LOB) dataset for short-term price prediction using RAPIDS, emphasizing the benefits of GPU acceleration in financial machine learning.

Machine Learning, Python, scikit-learn

Andrew Briand

9 min read

Includes Code

Has Summary

NVIDIA

Intermediate

A Comprehensive Guide to Interaction Terms in Linear Regression

This article provides a comprehensive guide to understanding interaction terms in linear regression, emphasizing their importance in modeling the relationship between dependent and independent vari...

Python, scikit-learn, V

Eryk Lewinson

11 min read

Includes Code

Has Summary

NVIDIA

Intermediate

A Comprehensive Overview of Regression Evaluation Metrics

This article provides a comprehensive overview of regression evaluation metrics essential for assessing machine learning model performance.

scikit-learn

Eryk Lewinson

17 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Machine Learning in Practice: Deploy an ML Model on Google Cloud Platform

This article provides a comprehensive guide on deploying machine learning models on Google Cloud Platform (GCP).

AutoML, AWS, Azure, Flask, Google Cloud, Google Cloud Functions, Google Cloud Storage, HTML, Iris, Machine Learning, Pandas, Python, scikit-learn, Serverless, Vertex AI

Kurtis Pykes

10 min read

Includes Code

Has Summary

NVIDIA

Beginner

Machine Learning in Practice: Build an ML Model

This article focuses on the practical aspects of building and training a machine learning (ML) model using Python, specifically utilizing the Iris Dataset.

Dask, Google Cloud, Iris, Machine Learning, Python, scikit-learn

Kurtis Pykes

5 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Faster HDBSCAN Soft Clustering with RAPIDS cuML

The article discusses the enhancements in the RAPIDS cuML library for performing HDBSCAN soft clustering, providing significant performance improvements over traditional CPU-based methods.

Docker, PyTorch, scikit-learn, Transformers

Nick Becker

9 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Improving Machine Learning Security Skills at a DEF CON Competition

The article discusses the emergence of machine learning (ML) security as a critical discipline at the intersection of information security and data science.

AWS, Machine Learning, scikit-learn

Joseph Lucas

7 min read

Has Summary

NVIDIA

Advanced

Achieving 100x Faster Single-Cell Modality Prediction with NVIDIA RAPIDS cuML

This article discusses the advancements in single-cell measurement technologies and how NVIDIA RAPIDS cuML can significantly accelerate single-cell modality prediction.

PyTorch, scikit-learn

Jiwei Liu

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

Solving AI Inference Challenges with NVIDIA Triton

The article discusses the challenges of deploying AI models in production and how NVIDIA Triton Inference Server addresses these challenges.

AWS, BERT, GPT, Kubernetes, LightGBM, Python, PyTorch, scikit-learn, SHAP, T5, TensorFlow, Transformer, XGBoost

Shankar Chandrasekaran

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Accelerating ETL on KubeFlow with RAPIDS

The article discusses how to accelerate ETL processes on KubeFlow using RAPIDS, a data science framework that leverages GPUs for improved performance.

Dask, Docker, Kubernetes, NumPy, Pandas, Python, scikit-learn, YAML

Jacob Tomlinson

12 min read

Includes Code

Has Summary

NVIDIA

Advanced

Faster Text Classification with Naive Bayes and GPUs

The article discusses the advantages of using Naive Bayes (NB) classifiers for text classification tasks, particularly when leveraging GPU acceleration through RAPIDS cuML.

Dask, Google Cloud, NumPy, Python, scikit-learn, SciPy

Mickael Ide

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Dealing with Outliers Using Three Robust Linear Regression Models

This article discusses the challenges posed by outliers in linear regression and presents three robust regression models—Huber regression, RANSAC regression, and Theil-Sen regression—as solutions.

scikit-learn

Eryk Lewinson

12 min read

Includes Code

Has Summary

NVIDIA

Advanced

Speed Up Machine Learning Models with Accelerated WEKA

The article discusses Accelerated WEKA, a project that integrates GPU acceleration into the WEKA machine learning software using RAPIDS libraries.

Apache, Apache Arrow, Deep Learning, Java, Machine Learning, Python, scikit-learn, XGBoost

Albert Bifet

11 min read

Has Summary

NVIDIA

Advanced

Step-by-Step Guide to Building a Machine Learning Application with RAPIDS

This article provides a comprehensive step-by-step guide for building a machine learning application using RAPIDS, a suite of open-source software libraries that leverage GPU acceleration.

Docker, Google Cloud, Machine Learning, Python, scikit-learn, SHAP, Vertex AI, XGBoost

Paul Mahler

10 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Fast Fine-Tuning of AI Transformers Using RAPIDS Machine Learning

The article discusses the fast fine-tuning of AI transformers using RAPIDS Machine Learning, highlighting the advantages of using cuML support vector machine (SVM) as a head module instead of the t...

Machine Learning, PyTorch, scikit-learn, Transformers

Jiwei Liu

6 min read

Has Summary

NVIDIA

Intermediate

Three Approaches to Encoding Time Information as Features for ML Models

This article explores three effective approaches to encoding time information as features for machine learning models, emphasizing the importance of feature engineering in improving model accuracy.

Python, scikit-learn

Eryk Lewinson

13 min read

Includes Code

Has Summary

NVIDIA

Advanced

A Data Scientist’s Guide to Gradient Descent and Backpropagation Algorithms

This article serves as a guide for Data Scientists to understand the fundamental concepts of gradient descent and backpropagation algorithms, which are essential for training Artificial Neural Netw...

Deep Learning, Neural Networks, PyTorch, scikit-learn, TensorFlow

Richmond Alake

9 min read

Has Summary

NVIDIA

Intermediate

Real-time Serving for XGBoost, Scikit-Learn RandomForest, LightGBM, and More

The article discusses the deployment of tree-based models like XGBoost and LightGBM using the NVIDIA Triton Inference Server, emphasizing its capabilities for real-time serving and GPU acceleration.

AWS, Azure, Docker, Flask, Helm, JSON, Kubernetes, LightGBM, Python, PyTorch, scikit-learn, TensorFlow, Vertex AI, XGBoost

William Hicks

7 min read

Has Summary

NVIDIA

Intermediate

Natural Language Processing First Steps: How Algorithms Understand Text

This article introduces the foundational techniques for preparing text data for Natural Language Processing (NLP) using vectorization, hashing, and tokenization.

Dask, Natural Language Processing, scikit-learn

Edward Krueger

10 min read

Has Summary