8 Sep, 2025
$ nvidia-detector
nvidia-driver-575
$ sudo apt install -y nvidia-driver-575
Now you have to reboot to let the latest driver load.
Then:
$ nvidia-smi
Mon Sep  8 16:01:57 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03     CUDA Version: 12.9      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   44C    P0             14W /   60W |      11MiB /  4096MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            4028      G   /usr/bin/gnome-shell                      1MiB |
+-----------------------------------------------------------------------------------------+
Then install the PyTorch builds for the version of CUDA you need: the cu129 in the URL below means CUDA 12.9, so replace it with the number shown in the top right of the nvidia-smi output above.
$ python -m venv venv
$ venv/bin/pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu129
Some of the downloads are over 1GB, but at the end you can check that everything works:
$ venv/bin/python -c "import torch; print(torch.version.cuda); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
12.9
True
NVIDIA GeForce RTX 3050 Ti Laptop GPU
That means torch can use CUDA.
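For reference, the usual pattern for running tensors and models on the GPU is the standard device idiom below; the Hello, World example that follows keeps everything on the CPU.

import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models, via the same .to(device) call) are moved onto the device explicitly
x = torch.tensor([[10.0]]).to(device)
print(x.device)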
Here's a Hello, World example saved as 1.py:
import torch
import torch.nn as nn
import torch.optim as optim

# A single linear neuron: y = wx + b
model = nn.Sequential(nn.Linear(1, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training data following the rule y = 2x - 2
xs = torch.tensor([[-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]], dtype=torch.float32)
ys = torch.tensor([[-4.0], [-2.0], [0.0], [2.0], [4.0], [6.0]], dtype=torch.float32)

for _ in range(500):
    optimizer.zero_grad()
    outputs = model(xs)
    loss = criterion(outputs, ys)
    loss.backward()
    optimizer.step()

# Predict y for x = 10 (the true answer is 18)
with torch.no_grad():
    print(model(torch.tensor([[10.0]], dtype=torch.float32)))
And here's the output:
$ time venv/bin/python3 1.py
tensor([[17.9774]])
real 0m3.070s
user 0m3.159s
sys 0m1.358s
This example is similar to one in "AI and Machine Learning for Coders in PyTorch", which I'm enjoying working through. I hope they don't mind me being inspired by it to demo how I set up PyTorch, since it is such a trivial example.
If I make the learning rate smaller, like lr=0.001, then the code runs in the same time but doesn't get to the right solution in the iterations given:
$ time venv/bin/python3 1.py
tensor([[13.4347]])
real 0m3.069s
user 0m3.156s
sys 0m1.351s
But with 5,000 training iterations instead of 500 it does, and it doesn't take much longer to train:
$ time venv/bin/python3 1.py
tensor([[17.9591]])
real 0m2.434s
user 0m3.850s
sys 0m0.317s
With a bigger learning rate it gets the right answer fast, in just 50 iterations:
$ time venv/bin/python3 1.py
tensor([[17.9770]])
real 0m3.005s
user 0m3.093s
sys 0m1.342s
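Rather than rerunning with different settings, you can also watch the convergence directly. A small tweak to the training loop above (printing every 100 iterations is just an arbitrary choice) shows the loss as it falls:

for i in range(500):
    optimizer.zero_grad()
    outputs = model(xs)
    loss = criterion(outputs, ys)
    loss.backward()
    optimizer.step()
    # Print the loss every 100 iterations to see how quickly it shrinks
    if i % 100 == 0:
        print(i, loss.item())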
You can print the weight and bias of a single neuron like this:
# Grab the single Linear layer and read out its learned parameters
layer = model[0]
weights = layer.weight.detach().numpy()
bias = layer.bias.detach().numpy()
print("Weights:", weights)
print("Bias:", bias)
Weights: [[1.9971641]]
Bias: [-1.9912081]
The weight is close to 2 and the bias close to -2, which matches the y = 2x - 2 rule behind the training data. From there you can think about other types of layers, like convolution layers and embedding layers, which do different things even though the underlying maths is similar.
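For example, here's a sketch (untrained, just showing the shapes involved) of a convolution layer and an embedding layer:

import torch
import torch.nn as nn

# A convolution layer slides learned filters over an image
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
image = torch.randn(1, 3, 64, 64)  # a batch of one 64x64 RGB image
print(conv(image).shape)  # torch.Size([1, 16, 62, 62])

# An embedding layer is a learned lookup table from ids to vectors
emb = nn.Embedding(num_embeddings=1000, embedding_dim=8)
ids = torch.tensor([1, 5, 42])
print(emb(ids).shape)  # torch.Size([3, 8])

Under the hood both are still just learned parameters fitted by the same kind of gradient-descent loop as the single neuron above.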