# Running LLMs on NeSI
This guide provides an overview of how to deploy and run Large Language Models (LLMs) on the NeSI HPC platform using the UoA AI GPU resources.
## Prerequisites
Before you begin, ensure you have:
- Docker Engine: Installed locally (e.g., Docker Desktop).
- SSH Access: SSH keys configured for accessing the VM (`GatewayPorts yes` required in `sshd_config`).
- DuckDNS: A domain and API key for external access.
- API Keys: OpenAI, LiteLLM, etc., as needed for your specific model.
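The `GatewayPorts` prerequisite typically enables reverse SSH tunnels, so that a service running behind the VM can be reached externally. A minimal sketch follows; the hostname, username, and port are illustrative assumptions, not values from this guide:

```
# /etc/ssh/sshd_config on the VM -- allow remotely forwarded ports
# to bind to all interfaces, not just loopback:
GatewayPorts yes

# Restart sshd for the change to take effect, e.g.:
#   sudo systemctl restart sshd

# Hypothetical reverse tunnel exposing a local service (port 8000)
# through the VM (host/user/port are placeholders):
#   ssh -N -R 0.0.0.0:8000:localhost:8000 user@your-vm.example.com
```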
## Quick Start
- Configuration: Create a `.env` file with your secrets (API keys, passwords).
- Launch: Run `sudo docker compose up -d` to start the services.
- HPC Job Submission: Use the provided Slurm scripts to submit your model inference job to the HPC nodes.
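The configuration and launch steps above can be sketched as follows. The variable names in the `.env` file (`OPENAI_API_KEY`, `LITELLM_MASTER_KEY`) are illustrative assumptions; use whichever secrets your deployment actually needs.

```shell
# 1. Configuration: create a .env file with your secrets.
#    (Key names below are assumptions, not from this guide.)
cat > .env <<'EOF'
OPENAI_API_KEY=sk-replace-me
LITELLM_MASTER_KEY=sk-replace-me
EOF

# Sanity-check that the expected keys are present before launching:
for key in OPENAI_API_KEY LITELLM_MASTER_KEY; do
  grep -q "^${key}=" .env || echo "missing: ${key}"
done

# 2. Launch: start the services in the background (requires Docker;
#    run manually on the VM):
#   sudo docker compose up -d
```

Keeping secrets in `.env` rather than in the compose file itself means the compose file can be committed to version control while the secrets stay local.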
## Repository
For the full tutorial, example `.env` files, and Slurm submission scripts, please refer to the source repository:
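For orientation before opening the repository, a Slurm submission script of the kind referenced above generally has the following shape. Every directive and name here is an illustrative assumption, not the repository's actual script:

```
#!/bin/bash
#SBATCH --job-name=llm-inference   # job name is a placeholder
#SBATCH --gpus-per-node=1          # GPU count is an assumption
#SBATCH --time=01:00:00            # adjust to your model's runtime
#SBATCH --mem=32G                  # memory is an assumption

# Illustrative only -- use the scripts provided in the repository.
# run_inference.sh is a hypothetical wrapper, not from this guide.
srun ./run_inference.sh
```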