Use ROCm for AI inference — ROCm Documentation
The ROCm 7.13.0 technology preview release documentation is available in the ROCm Preview documentation. For production use, continue to use the ROCm 7.2.3 documentation.

Use ROCm for AI inference

2026-04-08

Applies to Linux

AI inference is the process of running a trained machine learning model on new data to produce predictions or classifications. It commonly involves feeding the model real-time data and making quick decisions based on its predictions.
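To make the definition concrete, the following is a minimal sketch of inference with a toy logistic regression classifier in plain Python. The weights, bias, and function names are hypothetical stand-ins for a model trained elsewhere; the sketch does not use ROCm or a GPU and only illustrates the idea of applying fixed, already-trained parameters to new data.

```python
import math

# Hypothetical parameters standing in for a model that was trained earlier.
WEIGHTS = [0.8, -0.4]
BIAS = 0.1


def predict_proba(features):
    """Return the probability of the positive class for one new sample."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))


def classify(features, threshold=0.5):
    """Turn the predicted probability into a class label (1 or 0)."""
    return 1 if predict_proba(features) >= threshold else 0


if __name__ == "__main__":
    new_sample = [2.0, 1.0]  # data the model did not see during training
    print(classify(new_sample))
```

No parameters change during these calls. That is the difference between inference and training, and it is why inference workloads are usually tuned for latency and throughput.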

Understanding the architecture and capabilities of the ROCm™ software platform is important for running AI inference well. ROCm gives inference workloads access to the high-performance computing and resource management capabilities of AMD GPUs, which leads to faster predictions and classifications on real-time data.

The following topics cover how to set up and deploy AI inference on AMD GPUs: how to install ROCm, how to use Hugging Face Transformers to manage pre-trained models for natural language processing (NLP) tasks, how to validate vLLM on AMD Instinct™ MI300X GPUs, and how to deploy trained models in production environments.
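As a rough preview of the Hugging Face Transformers workflow, the sketch below loads a pre-trained text classifier with the `pipeline` API. It assumes that a ROCm build of PyTorch and the `transformers` package are installed. The helper names `pick_device` and `build_classifier` are hypothetical, and the model name is only an example from the Hugging Face Hub, not a recommendation from this guide.

```python
def pick_device(gpu_available):
    """Map GPU availability to the `device` index accepted by `pipeline`.

    0 selects the first GPU; -1 selects the CPU.
    """
    return 0 if gpu_available else -1


def build_classifier(model_name="distilbert-base-uncased-finetuned-sst-2-english"):
    """Load a pre-trained text classifier (the model name is an assumption)."""
    # Imported lazily so pick_device() works without these packages installed.
    import torch
    from transformers import pipeline

    # ROCm builds of PyTorch report AMD GPUs through the torch.cuda API.
    device = pick_device(torch.cuda.is_available())
    return pipeline("text-classification", model=model_name, device=device)
```

Calling `build_classifier()` downloads the model on first use. Passing a string to the returned object yields a list of dictionaries with `label` and `score` keys.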

The AI Developer Hub contains AMD ROCm tutorials for training, fine-tuning, and inference that use popular machine learning frameworks on AMD GPUs.

© 2026 Advanced Micro Devices, Inc