How NVIDIA Uses Apache
145 engineering articles about Apache from NVIDIA's engineering team
Articles
The article discusses Project Aether, a tool developed by NVIDIA to facilitate the migration of CPU-based Apache Spark workloads to GPU-accelerated environments on Amazon EMR.
Navin Kumar
6 min read
Includes Code
Has Summary
--
NVIDIA's Sirius, an open-source GPU-native SQL engine, has set a new performance record on ClickBench, enhancing DuckDB with GPU-accelerated analytics.
Xiangyao Yu
6 min read
Has Summary
--
The article discusses the development of scientific AI agents using reinforcement learning (RL) techniques, specifically through the NVIDIA NeMo framework.
Christian Munley
12 min read
Includes Code
Has Summary
--
The article discusses the collaboration between IBM and NVIDIA to enhance large-scale data analytics through GPU-native Velox and NVIDIA cuDF, highlighting significant performance improvements over...
Gregory Kimball
7 min read
Has Summary
--
Train a Quadruped Locomotion Policy and Simulate Cloth Manipulation with NVIDIA Isaac Lab and Newton
This article discusses the integration of the Newton physics engine with NVIDIA Isaac Lab for training quadruped locomotion policies and simulating cloth manipulation.
The article discusses the advancements in NVIDIA cuVS, a GPU-accelerated vector search library designed for high-performance indexing and low-latency retrieval.
Corey Nolet
7 min read
Has Summary
--
The article discusses the deployment of a serverless, distributed data processing architecture using Apache Spark and NVIDIA AI on Azure.
Alexander Spiridonov
9 min read
Includes Code
Has Summary
--
The article discusses NVIDIA's advancements in molecular AI modeling through the introduction of cuEquivariance and NIM microservices, which enhance the speed and efficiency of training and inferen...
Neha Tadimeti
8 min read
Has Summary
--
The article discusses how NVIDIA cuOpt, an open-source GPU-accelerated optimization tool, enhances decision-making processes in businesses by efficiently solving complex linear programming (LP), mi...
The article discusses the application of Graph Neural Networks (GNNs) in enhancing fraud detection within financial services.
Naim
10 min read
Includes Code
Has Summary
--
The article discusses the latest enhancements in RAPIDS, including zero-code-change acceleration for Python machine learning, significant IO performance improvements, and out-of-core XGBoost capabi...
Apache, Azure, Azure Blob Storage, Dask, Gemini, Google Cloud, Google Cloud Storage, LightGBM, NetworkX, Polars, Python, scikit-learn, XGBoost
Nick Becker
9 min read
Includes Code
Has Summary
--
The article discusses how Atgenomix SeqsLab leverages NVIDIA technologies to enhance health omics analysis for precision medicine.
Yu-Ting Lin
9 min read
Has Summary
--
The article discusses the use of GPU acceleration to enhance performance in Apache Spark applications, highlighting the challenges of migrating workloads from CPUs to GPUs.
Matt Ahrens
9 min read
Includes Code
Has Summary
--
The article discusses how to accelerate Deep Learning (DL) and Large Language Model (LLM) inference using Apache Spark in cloud environments.
Apache, Apache Spark, AWS, Azure, Deep Learning, Docker, JSON, NumPy, Python, PyTorch, Semantic Search, TensorFlow, Transformers
Rishi Chandra
9 min read
Includes Code
Has Summary
--
The article discusses the integration of the 3D Gaussian Unscented Transform (3DGUT) into the gsplat library, enhancing neural rendering and scene reconstruction for realistic 3D simulations.
NVIDIA cuPyNumeric 25.03 is a fully open-source library designed as a drop-in replacement for NumPy, leveraging the Legate framework for accelerated computing.
The article discusses how to accelerate Apache Parquet scans on Apache Spark using GPUs, specifically through the RAPIDS Accelerator for Apache Spark.
Matt Ahrens
7 min read
Includes Code
Has Summary
--
NVIDIA has open-sourced the KAI Scheduler, a Kubernetes-native GPU scheduling solution under the Apache 2.0 license, originally developed for the Run:ai platform.
Ronen Dar
9 min read
Has Summary
--
This article discusses strategies for preventing GPU fragmentation in the Volcano Scheduler, focusing on an enhanced scheduling approach that integrates bin-packing with gang scheduling.
Ameya Parab
6 min read
Includes Code
Has Summary
--
The article discusses the performance and energy efficiency of the NVIDIA Grace CPU Superchip for ETL workloads, comparing it with AMD and Intel CPUs.
Gregory Kimball
6 min read
Includes Code
Has Summary
--
The article discusses how the NVIDIA RAPIDS Accelerator for Apache Spark enables zero code change for GPU-accelerated data processing, enhancing the performance of Apache Spark ML applications.
The article discusses optimizing high-performance remote I/O operations using NVIDIA KvikIO for data analysis workloads on cloud object storage services.
Tom Augspurger
8 min read
Includes Code
Has Summary
--
The article discusses how to read JSON Lines data using NVIDIA's cuDF library, reading up to 100 times faster than traditional pandas methods.
Karthikeyan Natarajan
10 min read
Includes Code
Has Summary
--
The article discusses the collaboration between BRLi and Toulouse INP to develop AI-based flood models using NVIDIA PhysicsNeMo, addressing the limitations of traditional physics-based numerical si...
Ram Cherukuri
6 min read
Has Summary
--
The article discusses the optimization of JSON processing on Apache Spark using GPU acceleration, highlighting significant performance improvements achieved by a Fortune 100 retail company.
Matt Ahrens
8 min read
Includes Code
Has Summary
--
IBM has launched Granite 3.0, a new generation of generative AI models that are compact yet deliver high accuracy and efficiency.
Maryam Ashoori
5 min read
Has Summary
--
NVIDIA has announced that its CUDA-X platform now accelerates the Polars Data Processing Library, enhancing its performance for data analytics.
Nick Becker
3 min read
Has Summary
--
The article discusses how RAPIDS AI can accelerate predictive maintenance in manufacturing by leveraging advanced data analytics to minimize downtime and optimize maintenance schedules.
Amarnath Mohan
11 min read
Includes Code
Has Summary
--
The article discusses the NVIDIA GH200 Grace Hopper Superchip, highlighting its significant advancements in energy efficiency and node consolidation for Apache Spark workloads.
Amr Elmeleegy
7 min read
Has Summary
--
The article discusses the Mistral NeMo 12B model, a next-generation language model developed by NVIDIA and Mistral, designed for high performance on a single GPU.
Anjali Shah
6 min read
Includes Code
Has Summary
--
This article provides a comprehensive guide on encoding and compression techniques for string data in the Parquet format using RAPIDS.
The article introduces five new technical courses offered by NVIDIA aimed at enhancing skills in AI and data science.
Apache, Apache Arrow, Apache Spark, Computer Vision, Natural Language Processing, Prompt Engineering, PyTorch, Transformer, Transformers, XGBoost
Rachel Ho
4 min read
Has Summary
--
This article provides a comprehensive guide on leveraging RAPIDS for GPU-accelerated data processing on Databricks.
The article discusses the application of Graph Neural Networks (GNNs) in optimizing the design and simulation of lattice structures in additive manufacturing.
Ayush Jain
6 min read
Has Summary
--
The article discusses the release of the NVIDIA NeMo Canary model, a state-of-the-art multilingual model for speech recognition and translation.
The article discusses the NVIDIA NeMo Curator framework, an open-source tool designed to streamline the data curation process for training large language models (LLMs).
Mehran Maghoumi
6 min read
Has Summary
--
The article discusses the evaluation of Retrieval-Augmented Generation (RAG) systems, emphasizing the importance of embedding models and systematic evaluation processes.
Benedikt Schifferer
14 min read
Has Summary
--
The article discusses the use of nested data types in RAPIDS libcudf for optimizing ETL workflows.
Gregory Kimball
10 min read
Includes Code
Has Summary
--
The article discusses deploying large language models (LLMs) at the edge using the NVIDIA IGX Orin Developer Kit.
Nigel Nelson
9 min read
Has Summary
--
A Stanford University team is revolutionizing cardiovascular care through AI-driven simulations that provide patient-specific blood flow visualizations.
Harpreet Sethi
8 min read
Has Summary
--
The article discusses the integration of RAPIDS and Vadalog Parallel to enhance the performance of neurosymbolic AI systems, particularly in processing large knowledge graphs.
The article discusses the Spark RAPIDS ML library, an open-source Python package that accelerates Apache Spark ML applications using NVIDIA GPU technology.
Erik Ordentlich
8 min read
Includes Code
Has Summary
--
The article discusses the optimization of Extract-Transform-Load (ETL) operations using GPUs, specifically through the NVIDIA RAPIDS Accelerator for Apache Spark.
The article discusses the use of 3D geospatial data in immersive environments, specifically through the Cesium platform.
The article discusses how the NVIDIA RAPIDS Accelerator for Apache Spark can significantly enhance the performance and cost-effectiveness of extract-transform-load (ETL) processes, particularly for...
Joel Lashmore
7 min read
Has Summary
--
The article discusses how GPU-accelerated data analytics can enhance machine learning (ML) projects by improving speed and scalability.
Jay Rodge
14 min read
Includes Code
Has Summary
--
The article discusses the integration of distributed deep learning with Apache Spark 3.4, highlighting new built-in APIs for both distributed model training and inference.
Lee Yang
6 min read
Includes Code
Has Summary
--
The article discusses NVIDIA PhysicsNeMo, a framework for developing physics-informed machine learning models, with a focus on the latest update that introduces support for Graph Neural Networks (G...
Bhoomi Gadhia
5 min read
Has Summary
--
The article discusses how Taboola integrated GPUs into their data processing pipeline to enhance efficiency and reduce costs.
Eyal Hirsch
12 min read
Includes Code
Has Summary
--
The article discusses the importance of automatic augmentation in deep learning, emphasizing its role in enhancing model accuracy by diversifying training datasets.
Kamil Tokarski
12 min read
Includes Code
Has Summary
--