How NVIDIA Uses scikit-learn
82 articles about scikit-learn from NVIDIA's engineering team
Articles
The article discusses the integration of XGBoost with Polars DataFrames, emphasizing the benefits of GPU acceleration for machine learning workflows.
Jiaming Yuan
7 min read
Includes Code
Has Summary
--
The article discusses the development of an interactive AI agent designed to streamline machine learning workflows by leveraging GPU acceleration.
Allison Ding
7 min read
Includes Code
Has Summary
--
The article discusses how the NVIDIA DGX Spark supercomputer enhances performance for intensive AI tasks, providing a local alternative to cloud computing.
Allen Bourgoyne
5 min read
Has Summary
--
This article provides insights into GPU-accelerating machine learning model training using CUDA-X Data Science, focusing on tree-based models like XGBoost, LightGBM, and CatBoost.
The article presents a comprehensive playbook developed through extensive experience in Kaggle competitions, detailing seven effective modeling techniques for handling tabular data.
Brian Tepera
8 min read
Includes Code
Has Summary
--
This article discusses seven drop-in replacements for popular Python libraries that can significantly speed up data science workflows by leveraging GPU acceleration.
Jamil Semaan
8 min read
Includes Code
Has Summary
--
The article discusses the advancements in NVIDIA cuVS, a GPU-accelerated vector search library designed for high-performance indexing and low-latency retrieval.
Corey Nolet
7 min read
Has Summary
--
RAPIDS version 25.
Brian Tepera
6 min read
Includes Code
Has Summary
--
NVIDIA utilizes data science and machine learning to enhance chip manufacturing processes, focusing on optimizing workflows through the use of CUDA-X libraries like cuDF and cuML.
Divyansh Jain
8 min read
Includes Code
Has Summary
--
The article discusses the enhancements in the Forest Inference Library (FIL) within NVIDIA cuML 25.04, focusing on its capabilities for fast inference of tree-based models.
Dante Gama Dessavre
10 min read
Includes Code
Has Summary
--
The article discusses the latest enhancements in RAPIDS, including zero-code-change acceleration for Python machine learning, significant IO performance improvements, and out-of-core XGBoost capabi...
Apache, Azure, Azure Blob Storage, Dask, Gemini, Google Cloud, Google Cloud Storage, LightGBM, NetworkX, Polars, Python, scikit-learn, XGBoost
Nick Becker
9 min read
Includes Code
Has Summary
--
The article discusses the transformative role of domain-adapted large language models (LLMs) with reasoning capabilities in accelerating battery research.
Rucha Apte
11 min read
Has Summary
--
The article discusses how SES AI is leveraging NVIDIA's advanced hardware and software to accelerate the discovery of new battery materials through a comprehensive mapping of the Molecular Universe.
Kang Xu
6 min read
Has Summary
--
The article discusses the combination of stacking generalization and hyperparameter optimization (HPO) using NVIDIA's cuML library to enhance machine learning model accuracy efficiently.
Allison Ding
7 min read
Includes Code
Has Summary
--
Kaggle Grandmasters David Austin, Chris Deotte, and Ruchi Bhatia shared insights on their winning strategies for data science competitions at the Google Cloud Next conference.
Jenn Yonemitsu
9 min read
Has Summary
--
SES AI is leveraging NVIDIA's advanced hardware and software to revolutionize battery technology for electric vehicles (EVs) by accelerating the discovery of novel materials through AI-driven appro...
Wen Jie Ong
6 min read
Has Summary
--
NVIDIA cuML has introduced a zero code change capability that allows data scientists and machine learning engineers to accelerate scikit-learn applications on NVIDIA GPUs without modifying existing...
Siddharth Sharma
8 min read
Includes Code
Has Summary
--
The article discusses how RAPIDS cuML can accelerate time series forecasting by utilizing GPU-accelerated machine learning techniques.
Brian Tepera
4 min read
Includes Code
Has Summary
--
The article discusses the challenges of multi-label classification in machine learning and how RAPIDS cuML, a GPU-accelerated library, can enhance the efficiency of these workflows.
Nick Becker
4 min read
Includes Code
Has Summary
--
The article discusses how NVIDIA RAPIDS can enhance causal inference on large datasets by leveraging GPU acceleration, specifically through the integration of the cuML library with the DoubleML fra...
Nick Becker
4 min read
Includes Code
Has Summary
--
The NVIDIA RAPIDS v24.
Nick Becker
8 min read
Includes Code
Has Summary
--
The article discusses the enhancements made to the UMAP dimension reduction algorithm using RAPIDS cuML, focusing on its accelerated performance on GPUs.
Jinsol Park
11 min read
Includes Code
Has Summary
--
The article discusses how RAPIDS AI can accelerate predictive maintenance in manufacturing by leveraging advanced data analytics to minimize downtime and optimize maintenance schedules.
Amarnath Mohan
11 min read
Includes Code
Has Summary
--
The article discusses the collaboration between Cyborg and NVIDIA to enhance the security of vector databases through the NVIDIA cuVS library, which accelerates encrypted vector search.
Nicolas Dupont
6 min read
Includes Code
Has Summary
--
The article discusses the significance of robust scene text detection and recognition (STDR) in various applications, emphasizing the challenges faced in recognizing text from natural scenes.
Vishal Chavan
8 min read
Has Summary
--
The article discusses the Spark RAPIDS ML library, an open-source Python package that accelerates Apache Spark ML applications using NVIDIA GPU technology.
Erik Ordentlich
8 min read
Includes Code
Has Summary
--
The article discusses how GPU-accelerated data analytics can enhance machine learning (ML) projects by improving speed and scalability.
Jay Rodge
14 min read
Includes Code
Has Summary
--
The article discusses the advancements in single-cell RNA sequencing analysis using the RAPIDS-singlecell library, which leverages GPU acceleration to significantly enhance performance.
Severin Dicks
13 min read
Includes Code
Has Summary
--
The article discusses the application of federated learning to traditional machine learning methods, highlighting its advantages in communication efficiency and the ability to train models collabor...
Kris Kersten
3 min read
Has Summary
--
This article discusses the use of time-series models, specifically autoregressive recursive neural networks and XGBoost, for predicting credit defaults.
Jiwei Liu
11 min read
Includes Code
Has Summary
--
The article discusses the generation of a Limit Order Book (LOB) dataset for short-term price prediction using RAPIDS, emphasizing the benefits of GPU acceleration in financial machine learning.
Andrew Briand
9 min read
Includes Code
Has Summary
--
This article provides a comprehensive guide to understanding interaction terms in linear regression, emphasizing their importance in modeling the relationship between dependent and independent vari...
Eryk Lewinson
11 min read
Includes Code
Has Summary
--
This article provides a comprehensive overview of regression evaluation metrics essential for assessing machine learning model performance.
Eryk Lewinson
17 min read
Includes Code
Has Summary
--
This article provides a comprehensive guide on deploying machine learning models on Google Cloud Platform (GCP).
AutoML, AWS, Azure, Flask, Google Cloud, Google Cloud Functions, Google Cloud Storage, HTML, Iris, Machine Learning, Pandas, Python, scikit-learn, Serverless, Vertex AI
Kurtis Pykes
10 min read
Includes Code
Has Summary
--
This article focuses on the practical aspects of building and training a machine learning (ML) model using Python, specifically utilizing the Iris Dataset.
Kurtis Pykes
5 min read
Includes Code
Has Summary
--
The article discusses the enhancements in the RAPIDS cuML library for performing HDBSCAN soft clustering, providing significant performance improvements over traditional CPU-based methods.
Nick Becker
9 min read
Includes Code
Has Summary
--
The article discusses the emergence of machine learning (ML) security as a critical discipline at the intersection of information security and data science.
Joseph Lucas
7 min read
Has Summary
--
This article discusses the advancements in single-cell measurement technologies and how NVIDIA RAPIDS cuML can significantly accelerate single-cell modality prediction.
Jiwei Liu
7 min read
Includes Code
Has Summary
--
The article discusses the challenges of deploying AI models in production and how NVIDIA Triton Inference Server addresses these challenges.
Shankar Chandrasekaran
11 min read
Includes Code
Has Summary
--
The article discusses how to accelerate ETL processes on KubeFlow using RAPIDS, a data science framework that leverages GPUs for improved performance.
Jacob Tomlinson
12 min read
Includes Code
Has Summary
--
The article discusses the advantages of using Naive Bayes (NB) classifiers for text classification tasks, particularly when leveraging GPU acceleration through RAPIDS cuML.
Mickael Ide
11 min read
Includes Code
Has Summary
--
This article discusses the challenges posed by outliers in linear regression and presents three robust regression models—Huber regression, RANSAC regression, and Theil-Sen regression—as solutions.
Eryk Lewinson
12 min read
Includes Code
Has Summary
--
The article discusses Accelerated WEKA, a project that integrates GPU acceleration into the WEKA machine learning software using RAPIDS libraries.
Albert Bifet
11 min read
Has Summary
--
This article provides a comprehensive step-by-step guide for building a machine learning application using RAPIDS, a suite of open-source software libraries that leverage GPU acceleration.
Paul Mahler
10 min read
Includes Code
Has Summary
--
The article discusses the fast fine-tuning of AI transformers using RAPIDS Machine Learning, highlighting the advantages of using cuML support vector machine (SVM) as a head module instead of the t...
Jiwei Liu
6 min read
Has Summary
--
This article explores three effective approaches to encoding time information as features for machine learning models, emphasizing the importance of feature engineering in improving model accuracy.
Eryk Lewinson
13 min read
Includes Code
Has Summary
--
This article serves as a guide for Data Scientists to understand the fundamental concepts of gradient descent and backpropagation algorithms, which are essential for training Artificial Neural Netw...
Richmond Alake
9 min read
Has Summary
--
The article discusses the deployment of tree-based models like XGBoost and LightGBM using the NVIDIA Triton Inference Server, emphasizing its capabilities for real-time serving and GPU acceleration.
William Hicks
7 min read
Has Summary
--
This article introduces the foundational techniques for preparing text data for Natural Language Processing (NLP) using vectorization, hashing, and tokenization.
Edward Krueger
10 min read
Has Summary
--