Chapter 5: Data Types and Devices
“Precision and placement — the twin pillars of efficient tensor computation.”
🧬 5.1 torch.dtype: The Soul of a Tensor
Every tensor in PyTorch carries a data type that defines the precision and nature of its values.
Common torch.dtype values:
| dtype | Description | Bits |
|---|---|---|
| torch.float32 | 32-bit floating point (default) | 32 |
| torch.float64 | 64-bit floating point (aka double) | 64 |
| torch.float16 | 16-bit floating point (half precision) | 16 |
| torch.bfloat16 | Brain float: float32's exponent range with reduced precision (used on TPUs and recent GPUs) | 16 |
| torch.int32 | 32-bit integer | 32 |
| torch.int64 | 64-bit integer (aka long) | 64 |
| torch.bool | Boolean (stored as one byte per element) | 8 |
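To see these defaults in action, here is a quick sanity check of inferred dtypes and per-element storage (only standard torch calls, nothing project-specific):
import torch

print(torch.tensor([1, 2, 3]).dtype)      # torch.int64 (Python ints default to int64)
print(torch.tensor([1.0, 2.0]).dtype)     # torch.float32 (Python floats default to float32)
print(torch.tensor([True, False]).dtype)  # torch.bool

# element_size() reports bytes per element, matching the table above
print(torch.zeros(1, dtype=torch.float16).element_size())  # 2
print(torch.zeros(1, dtype=torch.bool).element_size())     # 1 (booleans occupy a full byte)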
How to specify dtype:
x = torch.tensor([1, 2, 3], dtype=torch.float64)
print(x.dtype) # torch.float64
Casting between types:
x = x.to(torch.float16)    # .to() returns a new tensor with the requested dtype
x = x.int()                # Shortcut for .to(torch.int32)
x = x.type(torch.float32)  # Equivalent cast via .type(); .to() is the more common spelling
🔬 float32 is the sweet spot for training: fast and accurate. But for inference? float16 (or bfloat16) is often enough.
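To make that tradeoff concrete, half precision has a much narrower range and fewer significant digits than float32; torch.finfo exposes the limits:
import torch

print(torch.finfo(torch.float32).max)   # ~3.4e38
print(torch.finfo(torch.float16).max)   # 65504.0, anything larger overflows to inf
print(torch.finfo(torch.bfloat16).max)  # ~3.4e38, float32-like range but less precision

# Overflow in half precision
x = torch.tensor([70000.0], dtype=torch.float16)
print(x)  # tensor([inf], dtype=torch.float16)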
5.2 torch.device: The Tensor's Location
A tensor doesn’t just live in memory — it lives on a device.
cpu_tensor = torch.tensor([1.0])    # Lives on the CPU
gpu_tensor = cpu_tensor.to('cuda')  # Copy on the GPU; cpu_tensor itself stays on the CPU
You can also create tensors directly on a device:
device = torch.device('cuda')
x = torch.zeros(3, 3, device=device)
Detecting and using available GPUs:
if torch.cuda.is_available():
    print("CUDA ready! Let's party.")
else:
    print("Stuck on CPU. Meh.")
💡 For multi-GPU systems: use 'cuda:0', 'cuda:1', etc.
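A common pattern is to pick the device once, falling back to the CPU when no GPU is present, and to address individual cards by index on multi-GPU machines (a minimal sketch):
import torch

# Fall back to the CPU when CUDA is unavailable
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# On a multi-GPU box, pick a specific card by index
if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    device = torch.device('cuda:1')  # the second GPU

x = torch.zeros(3, 3, device=device)
print(x.device)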
5.3 Default Data Type Settings
Sometimes you want to change the default floating-point dtype globally. PyTorch lets you do this:
torch.set_default_dtype(torch.float64)
torch.zeros(3) # Now float64
torch.tensor([1.0]) # Now float64
torch.get_default_dtype() # torch.float64
Useful when training scientific models (need precision) or optimizing inference (want float16).
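Because the default is process-wide, it is easy to leak a changed dtype into unrelated code; one defensive pattern (a sketch, not a built-in PyTorch context manager) is to save and restore the previous default:
import torch

previous = torch.get_default_dtype()
torch.set_default_dtype(torch.float64)
try:
    w = torch.zeros(3)  # float64 while the override is active
finally:
    torch.set_default_dtype(previous)  # restore the old default for the rest of the program

print(w.dtype)               # torch.float64
print(torch.zeros(3).dtype)  # back to the previous default (normally torch.float32)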
5.4 Intro to Mixed Precision
In modern deep learning (especially with GPUs like RTX, A100, etc.), mixed precision is the name of the game.
What’s Mixed Precision?
Training with both:
- float32 (for critical values like loss gradients)
- float16 or bfloat16 (for speed and memory savings)
Why use it?
- 🚄 Faster training with Tensor Cores
- 📉 Less memory usage = bigger models (see the quick memory check below)
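To put a rough number on the memory savings, the same 1024x1024 tensor takes half the bytes in half precision:
import torch

full = torch.zeros(1024, 1024, dtype=torch.float32)
half = torch.zeros(1024, 1024, dtype=torch.float16)

print(full.nelement() * full.element_size())  # 4194304 bytes (4 MiB)
print(half.nelement() * half.element_size())  # 2097152 bytes (2 MiB)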
How to use it?
Start with PyTorch’s AMP (Automatic Mixed Precision):
from torch.cuda.amp import autocast

with autocast():
    output = model(input)             # ops inside the block run in float16 where it is safe
    loss = criterion(output, target)
⚠️ We’ll cover this fully in Chapter 17: Using Torch with CUDA. For now, just know it’s a killer optimization strategy.
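As a preview of Chapter 17, a full training step usually pairs autocast with a GradScaler so that small float16 gradients don't underflow; model, dataloader, optimizer, and criterion below are placeholders for your own objects:
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()

for input, target in dataloader:      # placeholder data loader
    optimizer.zero_grad()
    with autocast():
        output = model(input)
        loss = criterion(output, target)
    scaler.scale(loss).backward()     # scale the loss so gradients stay representable in float16
    scaler.step(optimizer)            # unscales gradients, then runs the optimizer step
    scaler.update()                   # adjusts the scale factor for the next iteration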
5.5 Summary
- torch.dtype controls a tensor’s precision and data interpretation.
- torch.device decides where the tensor lives: CPU or GPU.
- You can set default dtypes, move tensors across devices, and leverage mixed precision for high-performance computing.
- These two concepts are subtle but powerful when optimizing both training and inference.