UoA Research AI GPU Platform

Running LLMs on NeSI

This guide outlines how to deploy and run Large Language Models (LLMs) on the NeSI HPC platform using the UoA AI GPU resources.

Prerequisites

Before you begin, ensure you have:

- Docker Engine: Installed locally (e.g., Docker Desktop).
- SSH Access: SSH keys configured for accessing the VM. GatewayPorts yes must be set in sshd_config.
- DuckDNS: A domain and API key for external access.
- API Keys: Keys for OpenAI, LiteLLM, or any other service your specific model needs.
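
The SSH requirement above can be checked with a short script. This is a minimal sketch, not part of the source repository: the function name gateway_ports_enabled is hypothetical, and it assumes the setting lives in the main sshd_config file at the usual OpenSSH path on the VM. It does not follow Include directives or Match blocks.

```shell
#!/usr/bin/env bash
# Sketch: check whether an sshd_config file enables GatewayPorts.
# Assumption: the default path below is the usual OpenSSH location;
# pass a different path as the first argument if your VM differs.

gateway_ports_enabled() {
    local config="${1:-/etc/ssh/sshd_config}"
    # Succeeds only if there is an uncommented "GatewayPorts yes" line.
    grep -Eiq '^[[:space:]]*GatewayPorts[[:space:]]+yes[[:space:]]*$' "$config"
}

# Example usage on the VM:
#   gateway_ports_enabled && echo "GatewayPorts is enabled"
# Locally, confirm Docker and the Compose plugin are installed:
#   docker --version
#   docker compose version
```

If the check fails, edit sshd_config and restart the SSH service before continuing.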

Quick Start

1. Configuration: Create a .env file with your secrets (API keys, passwords).
2. Launch: Run sudo docker compose up -d to start the services.
3. HPC Job Submission: Use the provided Slurm scripts to submit your model inference job to the HPC nodes.
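
The steps above might be sketched as follows. This is an illustration under stated assumptions rather than the repository's own tooling: the helper missing_env_keys, the key names OPENAI_API_KEY and DUCKDNS_TOKEN, and the script name your-inference-job.sl are all hypothetical placeholders. Take the real variable names and script names from the repository's example files.

```shell
#!/usr/bin/env bash
# Sketch: report which required keys are missing from a .env file.
# Assumption: the key names you pass in are placeholders; use the
# names from the repository's example .env file.

missing_env_keys() {
    local env_file="$1"
    shift
    local key
    for key in "$@"; do
        # A key counts as present if the file has a non-empty KEY=value line.
        grep -Eq "^${key}=.+" "$env_file" || echo "$key"
    done
}

# Example usage (hypothetical key names and script name):
#   missing_env_keys .env OPENAI_API_KEY DUCKDNS_TOKEN
#   sudo docker compose up -d
#   sbatch your-inference-job.sl
```

Running the check before docker compose up -d surfaces an incomplete .env early, before any container starts with a missing secret.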

Repository

For the full tutorial, example .env files, and Slurm submission scripts, see the source repository:

drai-inn/llm-nesi-example