How NVIDIA Uses LLaMA

9 articles about LLaMA from NVIDIA's engineering team

Other NVIDIA Technologies

Python (740), PyTorch (566), Deep Learning (505), TensorFlow (444), Docker (292), Kubernetes (251)

Other Companies Using LLaMA

Palantir (2)

Shopify (2)

Articles

NVIDIA

Advanced

Fine-Tuning LLMOps for Rapid Model Evaluation and Ongoing Optimization

The article discusses the operational challenges of deploying large language models (LLMs) and introduces LLMOps as a framework for managing their lifecycle.

Azure, Fine-tuning, Git, JSON, Kubernetes, LLaMA, Microservices, MLflow

Liad Levi-Raz

12 min read

Includes Code

Has Summary

NVIDIA

Intermediate

AI Aims to Bring Order to the Law

Stanford University researchers developed an LLM system called STARA to streamline legal research by identifying redundant and outdated laws.

LLaMA, PyTorch

Elias Wolfberg

4 min read

Has Summary

NVIDIA

Advanced

Applying Specialized LLMs with Reasoning Capabilities to Accelerate Battery Research

The article discusses the transformative role of domain-adapted large language models (LLMs) with reasoning capabilities in accelerating battery research.

Claude, Gemini, GPT, Kubernetes, LLaMA, scikit-learn

Rucha Apte

11 min read

Has Summary

NVIDIA

Intermediate

Customize Generative AI Models for Enterprise Applications with Llama 3.1

The article discusses the Llama 3.1 collection of large language models (LLMs) and their applications in enterprise settings.

Generative AI, HTML, Kubernetes, LangChain, LLaMA, LlamaIndex, RLHF

Chintan Patel

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

Seamlessly Deploying a Swarm of LoRA Adapters with NVIDIA NIM

The article discusses the deployment of LoRA (Low-Rank Adaptation) fine-tuned models using NVIDIA NIM, highlighting the advantages of customizing large language models (LLMs) for specific tasks.

Hugging Face, Large Language Models, LLaMA

Shashank Verma

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Develop ML and AI with Metaflow and Deploy with NVIDIA Triton Inference Server

The article discusses the integration of Metaflow and NVIDIA Triton Inference Server for developing and deploying machine learning models.

AWS, FastAPI, Fine-tuning, gRPC, HTTPS, Kubernetes, LightGBM, LLaMA, Python, XGBoost

Eddie Mattia

12 min read

Includes Code

Has Summary

NVIDIA

Advanced

Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM, Now Publicly Available

NVIDIA has released TensorRT-LLM, an open-source library designed to optimize inference performance for large language models (LLMs) on NVIDIA GPUs.

Docker, Hugging Face, Large Language Models, LLaMA, Mistral, Python

Neal Vaidya

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

Announcing NVIDIA SteerLM: A Simple and Practical Technique to Customize LLMs During Inference

NVIDIA SteerLM is a novel technique designed to simplify the customization of large language models (LLMs) during inference.

Emotion, GPT, LLaMA, PaLM, Python, RLHF

Yi Dong

10 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator

The article introduces the NVIDIA NeMo Data Curator, a scalable tool designed for curating trillion-token multilingual datasets for training large language models (LLMs).

Dask, GPT, LLaMA, Python

Joseph Jennings

8 min read

Includes Code

Has Summary
