Garbage Collection (GC)

Garbage Collection (GC) is the automatic memory management process that identifies and reclaims memory occupied by objects the program can no longer reach. It is critical in AI and deep learning contexts, where Python's reference counting and CUDA memory management interact in ways that can cause VRAM leaks, training crashes, and subtle performance degradation.

What Is Garbage Collection?

Why GC Matters for AI Systems

Python's Reference Counting

Every Python object has a reference count (ob_refcnt). When you do:

    x = MyTensor()   # refcount = 1
    y = x            # refcount = 2
    del x            # refcount = 1
    del y            # refcount = 0, object freed immediately
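The counting rule above can be observed without reading ob_refcnt directly. The following is a minimal sketch that assumes CPython (other interpreters may free objects later). MyTensor is a hypothetical stand-in class, not a real tensor type, and a weak reference serves as a probe because it does not add to the reference count.

```python
import weakref


class MyTensor:
    """Hypothetical stand-in for a large object such as a tensor."""


x = MyTensor()
probe = weakref.ref(x)   # weak reference: observes the object without keeping it alive

y = x                    # second strong reference to the same object
del x                    # one strong reference (y) remains
alive_after_del_x = probe() is not None

del y                    # last strong reference gone: CPython frees the object at once
alive_after_del_y = probe() is not None

print(alive_after_del_x, alive_after_del_y)
```

The object survives the first del because y still refers to it, and disappears on the second del without any collector pass.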

Reference Cycles (not freed by reference counting alone):

    class Node:
        def __init__(self):
            self.next = None

    a = Node(); b = Node()
    a.next = b; b.next = a   # cycle: neither is freed when a and b go out of scope
    del a; del b             # refcount still 1 for each (the cycle prevents zero)

Python's cyclic GC detects and breaks these cycles, but it runs periodically (when allocation counts cross a threshold), not at the moment the last outside reference disappears.
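To see the delay in practice, the sketch below builds the two-node cycle, drops both names, and checks whether the object is still alive before and after an explicit collection. It assumes CPython. Automatic collection is switched off for the duration so the result does not depend on when the collector happens to run, and the probe variable is an illustrative name.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.next = None


gc_was_enabled = gc.isenabled()
gc.disable()                      # keep the automatic collector out of the demo

a = Node()
b = Node()
a.next = b
b.next = a                        # a -> b -> a forms a reference cycle
probe = weakref.ref(a)            # weak reference used only to observe liveness

del a, b                          # names gone, but each node still references the other
alive_before_collect = probe() is not None

unreachable = gc.collect()        # run the cycle detector explicitly
alive_after_collect = probe() is not None

if gc_was_enabled:
    gc.enable()                   # restore the previous collector state

print(alive_before_collect, alive_after_collect, unreachable)
```

gc.collect() returns the number of unreachable objects it found, which gives a cheap signal for how much cyclic garbage a piece of code produces.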

Common GC-Related Bugs in AI Code

Accumulating Computational Graphs:

    losses = []
    for batch in dataloader:
        loss = model(batch)
        losses.append(loss)   # BUG: stores tensor + entire gradient graph

Fix:

    losses.append(loss.item())   # Detaches from graph, stores plain float
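The difference between the two append calls can be checked directly. This is a small CPU-only sketch, assuming PyTorch is installed; the linear model, the random batches, and the squared-mean loss are placeholders chosen for illustration rather than anything from a real training setup.

```python
import torch

model = torch.nn.Linear(4, 1)                      # tiny stand-in model
batches = [torch.randn(8, 4) for _ in range(3)]    # stand-in for a dataloader

graph_losses = []   # anti-pattern: each entry keeps its autograd graph alive
float_losses = []   # fix: plain Python floats with no graph attached

for batch in batches:
    loss = model(batch).pow(2).mean()
    graph_losses.append(loss)
    float_losses.append(loss.item())

# A non-None grad_fn means the tensor still references its computation graph.
print(all(entry.grad_fn is not None for entry in graph_losses))
print(all(isinstance(value, float) for value in float_losses))
```

On a GPU, .item() also waits for pending kernels to finish, so in hot loops it may be preferable to keep loss.detach() and convert to floats less often.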

Storing Tensors in Class Attributes:

    self.last_output = model_output   # BUG: holds VRAM until next forward pass

Fix:

    self.last_output = model_output.detach().cpu()   # Move to CPU, detach
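As an illustration of the fix, the sketch below wraps the assignment in a small helper class. OutputCache and its remember method are hypothetical names invented for this example, and the code runs on CPU so it can be tried without a GPU.

```python
import torch


class OutputCache:
    """Hypothetical helper that remembers the most recent model output."""

    def __init__(self):
        self.last_output = None

    def remember(self, model_output: torch.Tensor) -> None:
        # detach() drops the link to the autograd graph; cpu() copies the data
        # off the GPU (and returns the tensor unchanged if it is already on CPU).
        self.last_output = model_output.detach().cpu()


model = torch.nn.Linear(4, 2)
output = model(torch.randn(3, 4))

cache = OutputCache()
cache.remember(output)

print(output.grad_fn is not None)          # the live output still carries its graph
print(cache.last_output.grad_fn is None)   # the cached copy does not
print(cache.last_output.device.type)
```

If the cached value is needed on the GPU later, storing model_output.detach() alone releases the graph while leaving the data in VRAM.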

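Logging with Tensor Values:

    logger.info(f"Loss: {loss}")       # OK if loss is a float
    logger.info(f"Output: {output}")   # Costly if output is a CUDA tensor: formatting forces a GPU sync and a device-to-host copy

The formatted string itself does not keep the tensor alive. Retention becomes a risk when the tensor object is passed to the logger as a lazy argument, for example logger.info("Output: %s", output), and a handler buffers the log record: the record then references the tensor, and its graph, until the record is discarded.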

CUDA Memory Management

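PyTorch's caching allocator reduces the cost of CUDA malloc/free by keeping freed memory in a cache rather than returning it to CUDA immediately. Later allocations are served from that cache, which avoids expensive CUDA mallocs. One consequence is that tools such as nvidia-smi can show memory as in use even after the tensors that occupied it have been freed.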

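torch.cuda.empty_cache(): releases cached blocks that no live tensor occupies back to the CUDA driver. It does not free memory that tensors still reference.

gc.collect(): runs Python's cyclic collector so that tensors kept alive only by reference cycles are actually freed. Call it before empty_cache(), since only freed blocks can be released.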

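    import gc
    import torch

    del large_model            # Drop this reference to the model
    gc.collect()               # Free objects kept alive only by reference cycles
    torch.cuda.empty_cache()   # Return cached, unused blocks to CUDA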
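The cleanup sequence can be wrapped in a helper that also reports what the allocator holds afterwards. This is a sketch under assumptions: release_cuda_memory is a hypothetical name, and the function falls back to zeros on machines without a GPU. torch.cuda.memory_allocated() counts bytes held by live tensors, while torch.cuda.memory_reserved() counts bytes held by the caching allocator.

```python
import gc

import torch


def release_cuda_memory() -> dict:
    """Collect cyclic garbage, then return cached CUDA blocks to the driver.

    Returns allocator statistics in bytes (all zeros when no GPU is present).
    """
    gc.collect()                      # free unreachable Python objects first
    stats = {"allocated": 0, "reserved": 0}
    if torch.cuda.is_available():
        torch.cuda.empty_cache()      # release cached blocks no tensor is using
        stats["allocated"] = torch.cuda.memory_allocated()  # bytes held by live tensors
        stats["reserved"] = torch.cuda.memory_reserved()    # bytes held by the allocator
    return stats


stats = release_cuda_memory()
print(stats)
```

A large gap between reserved and allocated after this call usually points to fragmentation, whereas a high allocated value means some live reference is still holding tensors.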

GC Tuning for Long Training Runs

Disable automatic GC in tight training loops (prevents GC pauses):

    import gc
    gc.disable()   # Manual control

    # ... training loop ...

    if step % 100 == 0:
        gc.collect()   # Periodic manual collection
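One way to keep this pattern from leaking into the rest of the program is a context manager that restores the collector's previous state on exit, even if training raises. The sketch below is an assumption about how this could be packaged: manual_gc is a hypothetical helper and the loop body is a placeholder.

```python
import gc
from contextlib import contextmanager


@contextmanager
def manual_gc():
    """Disable automatic cyclic GC inside the block, restoring the prior state after."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


collections_run = 0
with manual_gc():
    for step in range(1, 301):        # stand-in for the training loop
        # ... training step would go here ...
        if step % 100 == 0:
            gc.collect()              # periodic manual collection
            collections_run += 1

print(collections_run)
```

Reference counting keeps working while the cyclic collector is disabled, so only objects caught in cycles wait for the manual collect call.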

For inference services, tune GC thresholds to reduce pause frequency:

    gc.set_threshold(10000, 20, 20)   # Increase collection thresholds
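When changing thresholds in a service, it can help to record the previous values so the change is reversible. A minimal sketch, assuming CPython: raise_gc_thresholds is a hypothetical helper, and its defaults simply repeat the values used above rather than a tuned recommendation.

```python
import gc


def raise_gc_thresholds(gen0: int = 10000, gen1: int = 20, gen2: int = 20) -> tuple:
    """Apply higher collection thresholds and return the previous ones."""
    previous = gc.get_threshold()
    gc.set_threshold(gen0, gen1, gen2)
    return previous


previous = raise_gc_thresholds()
current_gen0 = gc.get_threshold()[0]   # allocation threshold for the youngest generation
print(previous, current_gen0)

gc.set_threshold(*previous)            # restore the original settings
```

The first value controls how many net container allocations trigger a young-generation pass, so raising it trades fewer pauses for more uncollected cyclic garbage between passes. The default values and the meaning of the later generations differ across CPython versions, so it is worth checking gc.get_threshold() on the target interpreter.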

GC in AI systems is the invisible memory management layer that silently determines whether long training runs complete or crash. By understanding Python reference counting, CUDA caching allocation, and their interaction, AI engineers can eliminate the class of frustrating "why does training OOM at step N?" bugs that consume hours of debugging time.
