GPU: pay-as-you-go for AI work
Spin up T4, A100, or H100 GPUs from RunPod. SSH in, run your model, stop the pod. Pay by the second.
What it does
Real GPUs are rare on laptops, expensive on workstations, and overkill 95% of the time. CelDrive's GPU layer lets you spin one up only when you need it: for training, inference, image / video generation, or anything else that wants real CUDA cores.
celdrive gpu start --type T4|A100|H100 provisions a RunPod pod with a PyTorch image (CUDA, cuDNN, PyTorch, transformers, diffusers, all pre-installed) and returns SSH credentials. From there you SSH in like any other Linux box and run your workload, or use celdrive gpu run <cmd> for one-shot jobs that ship your current directory up, run, and pull results back.
Pods boot in 60–90 seconds. State is ephemeral: when you stop the pod, the disk goes with it. Anything you want to keep goes back to your CelDrive folder, which the pod has mounted automatically.
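Because the pod's disk disappears on stop, it helps to copy results out as the last step of a job. Here is a minimal sketch of that step. The `persist_outputs` helper and its default patterns are illustrative and not part of CelDrive. The destination you pass should be wherever your CelDrive folder is mounted inside the pod; that path is not specified here, so the one in the usage comment is a placeholder.

```python
import shutil
from pathlib import Path


def persist_outputs(work_dir, keep_dir, patterns=("*.pt", "logs")):
    """Copy matching files and directories from work_dir into keep_dir.

    Hypothetical helper: the patterns are examples, not a CelDrive convention.
    """
    work_dir, keep_dir = Path(work_dir), Path(keep_dir)
    keep_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for src in sorted(work_dir.glob(pattern)):
            dest = keep_dir / src.name
            if src.is_dir():
                # Merge into an existing directory rather than failing.
                shutil.copytree(src, dest, dirs_exist_ok=True)
            else:
                shutil.copy2(src, dest)
            copied.append(dest)
    return copied


# Usage (placeholder path, substitute your actual CelDrive mount point):
# persist_outputs(".", "/path/to/celdrive/mount/run-001")
```

Anything that does not match a pattern stays on the ephemeral disk and is lost when the pod stops.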
Pricing tiers
RunPod community-cloud rates, fetched live by the CLI at session start. Billed by the second: no monthly minimum, no idle subscription. Stop the pod the moment you're done.
| GPU | VRAM | ~Price | Best for |
|---|---|---|---|
| T4 | 16 GB | ~$0.39/hr | Stable Diffusion, Whisper, small Llama models, dev / iteration |
| A100 80 GB | 80 GB | ~$1.89/hr | Llama-3 70B inference (quantized), Mistral fine-tuning, serious training |
| H100 80 GB | 80 GB | ~$3.49/hr | Frontier model training / fine-tuning, fastest possible turn-around |
Per-second billing is RunPod's actual model: a 23-minute job on a T4 costs ~$0.15, not a full hour. If you boot a pod and stop it after 90 seconds because something didn't work, you owe ~$0.01.
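The arithmetic behind those figures is just the hourly rate prorated to seconds. A small sketch, using the approximate T4 rate from the table (live rates will differ) and a function name chosen for illustration:

```python
import math


def session_cost(rate_per_hour: float, seconds: int) -> float:
    """Dollar cost of a session billed per second at an hourly rate."""
    return rate_per_hour * seconds / 3600


def hourly_rounded_cost(rate_per_hour: float, seconds: int) -> float:
    """For comparison only: what the same session would cost if billed in whole hours."""
    return rate_per_hour * math.ceil(seconds / 3600)


T4_RATE = 0.39  # approximate $/hr from the table; an assumption, not a live quote

print(f"23-minute job:   ${session_cost(T4_RATE, 23 * 60):.2f}")
print(f"90-second abort: ${session_cost(T4_RATE, 90):.2f}")
print(f"23 min if billed hourly: ${hourly_rounded_cost(T4_RATE, 23 * 60):.2f}")
```

At $0.39/hr, 23 minutes is 1,380 seconds, so the cost is 0.39 × 1380 / 3600 ≈ $0.15; the 90-second case is 0.39 × 90 / 3600 ≈ $0.01.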
What you'd use it for
- Stable Diffusion: generate at full speed without a local NVIDIA card; works fine on T4, faster on A100
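- Llama-3 / Mistral / Mixtral inference: A100 80 GB fits Llama-3 70B once quantized (about 35 GB of weights at 4-bit; 16-bit weights alone are about 140 GB); T4 handles 7-8B models, most comfortably at 8-bit or 4-bit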
- Whisper transcription: multi-hour audio in minutes on a T4
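- Fine-tuning small to mid-size models: LoRAs on 7B-30B class models; full fine-tunes are realistic only toward the small end of that range on a single 80 GB card, because gradients and optimizer state need several times the memory of the weights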
- Video transcoding: NVENC on a T4 chews through H.264 / H.265 (the T4's hardware encoder has no AV1 support)
- 3D rendering: Blender Cycles / OptiX accelerated; cheaper than render farms for one-off shots
Honest about model fit: don't try to run frontier-scale models (GPT-4-class, 175B+) on a T4. They don't fit and they'd be unbearably slow if they did. Match the GPU to the workload: T4 for dev and small-model inference, A100 for serious work, H100 when wall-clock time is worth more than the hourly rate.
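A quick way to sanity-check fit before booting anything is to estimate the memory the weights alone need: parameter count times bytes per parameter. The sketch below does that and picks the cheapest tier with room to spare. The 1.2× headroom factor for KV cache and activations is a rough assumption, not a measured figure, and the prices are the approximate ones from the table.

```python
# (name, VRAM in GB, approximate $/hr) taken from the pricing table
GPUS = [
    ("T4", 16, 0.39),
    ("A100", 80, 1.89),
    ("H100", 80, 3.49),
]


def weights_gb(params_billion: float, bits_per_param: int = 16) -> float:
    """Memory for the weights alone: parameters x bytes per parameter."""
    return params_billion * bits_per_param / 8


def cheapest_fit(params_billion, bits_per_param=16, headroom=1.2):
    """Cheapest single GPU whose VRAM covers weights times a headroom factor.

    headroom is an assumed allowance for KV cache and activations.
    Returns None when no single GPU in the table is large enough.
    """
    need = weights_gb(params_billion, bits_per_param) * headroom
    fits = [gpu for gpu in GPUS if gpu[1] >= need]
    return min(fits, key=lambda gpu: gpu[2])[0] if fits else None


print(weights_gb(70))        # 16-bit 70B weights: 140.0 GB
print(cheapest_fit(70, 4))   # 4-bit 70B
print(cheapest_fit(8, 8))    # 8-bit 8B
print(cheapest_fit(175))     # 16-bit 175B: no single card fits
```

Under these assumptions a 4-bit 70B model lands on the A100, an 8-bit 8B model lands on the T4, and a 175B model at 16-bit fits on none of the three, which matches the guidance above. Long contexts and large batches need more headroom than the default here.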
The flow
Three commands, same shape as compute. Output is real, billing is real.
$ celdrive gpu start --type T4
→ requesting community pod (T4 16GB, PyTorch image)...
✓ pod ready in 73s · $0.39/hr · billed by the second
ssh: ssh celdrive-gpu (auto-configured)
$ celdrive gpu run "python train.py"
→ uploading cwd...
→ running on T4...
Epoch 1/10 loss=0.412 acc=0.873 [00:42]
Epoch 2/10 loss=0.301 acc=0.911 [00:39]
...
→ pulling output (model.pt, logs/)...
$ celdrive gpu stop
✓ pod terminated · billed 23 min · $0.15
The dashboard exposes the same controls: GPU panel → pick a type → Start session → SSH info appears. Live cost ticks up as the pod runs so you see exactly what the session is spending.
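Since billing runs until the pod stops, scripted jobs benefit from a guaranteed stop. The following sketch wraps the three commands so that `celdrive gpu stop` runs even when the job fails. It assumes the CLI exits non-zero on failure, which is not confirmed here; `run_on_gpu` and its `runner` parameter are illustrative names, with `runner` left injectable so the flow can be dry-run without a real pod.

```python
import subprocess


def run_on_gpu(command, gpu_type="T4", runner=subprocess.run):
    """Start a pod, run one command, and always stop the pod afterwards.

    Assumes the celdrive CLI signals failure with a non-zero exit code.
    """
    runner(["celdrive", "gpu", "start", "--type", gpu_type], check=True)
    try:
        runner(["celdrive", "gpu", "run", command], check=True)
    finally:
        # Runs on success and on failure, so a crashed job does not keep billing.
        runner(["celdrive", "gpu", "stop"], check=True)


# Usage:
# run_on_gpu("python train.py", gpu_type="T4")
```

If `start` itself fails, the sketch does not call `stop`, on the assumption that no pod was created.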
Setup gotcha
Until your RunPod account has credit, celdrive gpu start returns payment_required with a link to top up your RunPod account.
This is a real RunPod constraint, not something we add. We could front the credit and bill you on top, but that adds a markup and a billing surface we'd rather not own. RunPod has the same arrangement with everyone else who uses their API, and the top-up is a five-minute, one-time setup. CelDrive storage, memory, and compute are all unaffected by it.
If you'd rather not deal with RunPod directly, the dashboard walks you through the top-up the first time you click Start session on the GPU panel.