Introduction
Python Garbage Collection: When you write a Python program, it needs memory to store data. But what happens when you no longer need some of that data? If Python doesn’t remove it, your program can slow down or even crash because it’s using too much memory.
To fix this, Python has a system called garbage collection. It automatically cleans up memory by deleting data that is no longer needed. This helps your program run faster and use less memory.
But here’s the problem—sometimes, garbage collection itself can slow down your program. If Python spends too much time cleaning up memory, your code might pause or run slower than expected.
That’s why it’s important to understand:
- How Python manages memory
- Why garbage collection is useful
- How garbage collection can sometimes cause problems
In this blog post, I’ll explain everything in simple terms so you can optimize Python’s garbage collection and make your programs run smoothly.
How Python’s Garbage Collection Works

When you create something in Python (like a list, a number, or an object), Python stores it in memory.
But once you’re done with that thing — and you’re not using it anymore — Python needs a way to clean it up. Just like taking out the trash when you’re done eating chips.
This “cleaning up” is called garbage collection.
1. Reference Counting: “How many people are using this?”
Python keeps a counter for every object — it’s like asking:
“How many variables are using me right now?”
That count is called a reference count.
Example:
a = [1, 2, 3] # You create a list. Python says: "This list has 1 user (a)."
b = a # Now 'b' is also using the same list. "Now 2 users!"
So Python knows this list is still being used.
But if you delete both:
del a
del b
Now nobody is using the list. So Python says:
“Cool, I can throw this away!”
And it frees the memory.
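If you want to watch the counter yourself, the standard library’s sys.getrefcount can report it. A small sketch (note the reported number is always one higher than you might expect, because the call itself holds a temporary reference):

```python
import sys

a = [1, 2, 3]
print(sys.getrefcount(a))  # 2: the name 'a' plus the call's own temporary reference

b = a                      # a second user appears
print(sys.getrefcount(a))  # 3

del b                      # ...and leaves again
print(sys.getrefcount(a))  # back to 2
```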
2. Cyclic Garbage Collection: “Oh no, a loop!”
Sometimes, two things refer to each other, like this:
class Person:
    def __init__(self, name):
        self.name = name
        self.friend = None

p1 = Person("Alice")
p2 = Person("Bob")
p1.friend = p2
p2.friend = p1
These two objects are holding hands — they’re keeping each other alive. Even if you delete p1 and p2, Python says:
“Wait… something is still using them! Oh… it’s themselves.”
This is a reference cycle, and reference counting alone can’t handle it.
That’s when Python’s secret tool kicks in: a little vacuum cleaner called the cyclic garbage collector. It looks for loops like this and cleans them up for you.
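You can actually watch the vacuum cleaner at work: gc.collect() returns how many unreachable objects it found, so after deleting both friends the count should be nonzero. A minimal sketch:

```python
import gc

class Person:
    def __init__(self, name):
        self.name = name
        self.friend = None

p1 = Person("Alice")
p2 = Person("Bob")
p1.friend = p2
p2.friend = p1

gc.collect()          # clear out any unrelated garbage first
del p1
del p2                # only the cycle keeps the two objects alive now

found = gc.collect()  # the cyclic collector hunts down the loop
print(found)          # nonzero: the stranded cycle was found and freed
```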
3. The gc Module: Python’s Cleanup Button
Python also gives you a tool to peek into garbage collection yourself. It’s called the gc module.
import gc
gc.collect() # You can run this to tell Python: "Please clean up now."
You can even turn it off (but usually don’t):
gc.disable() # Turns off automatic cleanup
gc.enable() # Turns it back on
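A quick sketch for checking whether automatic cleanup is currently on, using gc.isenabled():

```python
import gc

print(gc.isenabled())  # True: automatic collection is on by default

gc.disable()
print(gc.isenabled())  # False while it's paused

gc.enable()
print(gc.isenabled())  # True again
```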
TL;DR (But not a summary, just a reminder in plain words)
- Python keeps track of how many times something is being used.
- If no one is using it, it gets deleted automatically.
- When two things are stuck in a loop, Python has a special cycle detector to clean that up.
- If you want to see or control it, use the gc module.
Identifying Performance Bottlenecks in Python

Sometimes your Python program runs slower than expected or uses too much memory. That’s called a performance bottleneck—like when traffic piles up on one tiny bridge while the rest of the highway is clear.
One sneaky reason this happens? Too much garbage collection.
How Too Much Garbage Collection Slows You Down
Remember, Python’s garbage collector is supposed to clean up objects you’re not using anymore.
But if your program creates tons of short-lived objects, Python ends up spending way too much time cleaning, and not enough time doing the actual work.
It’s like if you kept pausing a movie every 5 seconds just to wipe the remote. Eventually, you’re not watching the movie anymore—you’re just cleaning nonstop.
Symptoms:
- Your program slows down randomly.
- Memory usage goes up and down weirdly.
- CPU usage spikes even when your code isn’t doing much.
Using gc.get_stats() to See What’s Going On
Python gives you tools to see how the garbage collector is behaving. One simple one is:
import gc
stats = gc.get_stats()
print(stats)
This returns one dictionary per generation, with numbers like:
- collections: how many times that generation’s collector has run
- collected: how many objects it has freed
- uncollectable: how many objects it found but couldn’t free
Tip: If the collector is running too often, or freeing a ton of stuff every time, your code might be creating objects like crazy.
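Concretely, each dictionary that gc.get_stats() returns has collections, collected, and uncollectable counts. A quick sketch for printing them per generation:

```python
import gc

# One stats dictionary per generation (0, 1, and 2)
for gen, stats in enumerate(gc.get_stats()):
    print(f"Gen {gen}: ran {stats['collections']} times, "
          f"freed {stats['collected']} objects, "
          f"{stats['uncollectable']} uncollectable")
```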
Profiling Memory with objgraph
Want to see what’s taking up memory? Try objgraph, a cool library that helps you visualize which objects are hanging around too long.
Step-by-step:
1. First install it:
pip install objgraph
2. Then use it like this:
import objgraph
# Show top 10 object types taking memory
objgraph.show_most_common_types()
# See what’s keeping a specific object type alive
objgraph.show_backrefs(objgraph.by_type('dict')[0], filename='graph.png')
That last line creates a diagram (you’ll find it in your folder as graph.png) that shows what’s keeping your dictionary objects from being deleted.
Dig Even Deeper with tracemalloc
tracemalloc is another Python tool. It lets you track memory usage over time and see exactly where it’s coming from.
Basic usage:
import tracemalloc
tracemalloc.start()
# Run your code
my_big_function()
# Take a snapshot
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
# Print top 10 memory hogs
for stat in top_stats[:10]:
    print(stat)
Now you can see which lines of your code are using the most memory. That’s gold when you’re trying to find bottlenecks!
Quick Summary
If your program is slow or memory-hungry, garbage collection might be running too much.
- Check out gc.get_stats() to get a behind-the-scenes look at Python’s garbage collector.
- Use objgraph to spot which objects are lingering in memory longer than they should.
- Turn to tracemalloc when you need to track down which parts of your code are eating up the most memory.
Optimizing Garbage Collection for Better Performance
Sometimes Python’s garbage collector acts like an overenthusiastic cleaner — tidying up too often or at the wrong time, slowing your code down.
Let’s see how to tune it and make things faster.

1. Disabling and Enabling Garbage Collection
Sometimes, you know your code is about to create a bunch of temporary objects, like inside a big loop or data processing job. You don’t want Python sweeping the floor in the middle of that.
So, you tell Python:
“Hey, hold off on cleaning for a bit.”
That’s where this comes in:
import gc

gc.disable()  # Stop garbage collection for now

# Do heavy work here
for i in range(1000000):
    obj = [1] * 100  # Creating lots of small objects

gc.enable()  # Turn it back on
Use this carefully, though! Always turn it back on, or you’ll end up with memory piling up like dirty dishes.
2. Tuning Thresholds with gc.set_threshold()
Python’s garbage collector works in three generations:
- Gen 0 collects things your program only uses for a short time.
- If something stays around a little longer, it gets moved to Gen 1.
- And if it lasts even longer, it ends up in Gen 2, where Python checks it less often.
Each generation has a threshold: when it’s crossed, Python runs the collector.
You can adjust these numbers with:
gc.set_threshold(700, 10, 10)
This tells Python:
- Run a Gen 0 collection once 700 more objects have been allocated than freed.
- Collect Gen 1 only after every 10 Gen 0 collections, and Gen 2 only after every 10 Gen 1 collections.
This helps if:
- Your program creates a lot of objects quickly.
- You want to reduce how often GC interrupts your code.
To see the current thresholds:
print(gc.get_threshold())
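As a sketch (the replacement threshold values here are arbitrary, not a recommendation), you can watch the live per-generation counters with gc.get_count() and adjust the thresholds temporarily:

```python
import gc

# The live counters that get compared against the thresholds
print(gc.get_count())      # three numbers, one per generation

# Temporarily raise the Gen 0 threshold so it fires less often
old = gc.get_threshold()
gc.set_threshold(5000, 20, 20)
print(gc.get_threshold())  # (5000, 20, 20)

gc.set_threshold(*old)     # restore the previous settings
```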
3. Managing Large Object Creation Efficiently
Let’s say you’re processing big data chunks or building tons of temporary lists, dicts, etc. If you don’t manage that carefully:
- You use too much memory.
- GC keeps interrupting you.
What to do:
- Reuse objects if possible (e.g., clear a list instead of making a new one).
- Break big tasks into smaller batches.
- Disable GC during heavy object creation, then re-enable and run gc.collect() after:
gc.disable()
# Create stuff...
gc.enable()
gc.collect()
This makes memory handling more predictable.
4. Using __slots__ to Save Memory
By default, Python stores each object’s attributes in a dictionary. This makes it easy to add new attributes on the fly—but that flexibility comes at a cost: extra memory.
If you’re creating lots of objects from a class and you already know which attributes they’ll have, you can use __slots__ to save memory.
With __slots__, Python skips the internal dictionary and stores attributes more efficiently.
Example:
class Person:
    __slots__ = ['name', 'age']  # Only these 2 attributes allowed

    def __init__(self, name, age):
        self.name = name
        self.age = age
Benefits:
- Less memory per object.
- Faster attribute access.
- Can’t accidentally add new attributes (which is also a good thing sometimes).
Use this when you have lots of small objects, like millions of nodes in a graph, people in a database, etc.
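A quick way to see the difference in action (PlainPoint and SlottedPoint are made-up names for this sketch): a slotted instance has no per-instance __dict__, and adding an attribute outside __slots__ raises AttributeError.

```python
class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p, s = PlainPoint(1, 2), SlottedPoint(1, 2)
print(hasattr(p, '__dict__'))  # True: attributes live in a per-instance dict
print(hasattr(s, '__dict__'))  # False: attributes live in fixed slots

try:
    s.z = 3  # not declared in __slots__
except AttributeError:
    print("SlottedPoint rejects attributes outside __slots__")
```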
Recap
Trick | What it does | When to use it |
---|---|---|
gc.disable() / gc.enable() | Pause GC while doing heavy work | During large object creation |
gc.set_threshold() | Adjust when GC should run | If GC runs too often |
Manual batching + gc.collect() | Clean up on your terms | After finishing a task |
__slots__ | Use less memory per object | When you create lots of class instances |
What’s a Memory Leak in Python?
Let’s say you’re done using some data in your program — like a list, object, or file — and you expect Python to throw it away and free up space.
But sometimes, Python keeps holding onto it. Even though your program doesn’t need it anymore.
That’s called a memory leak.
It’s like finishing your lunch but never throwing away the plate. If you keep doing that every day, your table fills up. Eventually, there’s no space left, and things slow down.

Why Does This Happen?
Python is supposed to clean up memory using garbage collection. But sometimes, it can’t clean everything — especially when:
- Two things are pointing at each other. They’re like friends saying, “Hey, don’t throw me away — he still needs me!”
- You forgot to remove something from a global list, so Python thinks you still need it.
Okay, How Do We Fix That?
We can use tools and tricks to find where memory is leaking. Let’s go over them one at a time — super simple.
1. The weakref Trick
Normally, if you point to an object, Python won’t delete it.
But weakref lets you point to it without protecting it. That way, Python can still throw it away when it’s done.
import weakref

class MyData:
    pass

obj = MyData()

# weakref is like a soft hold — not a strong grip
weak_obj = weakref.ref(obj)
print(weak_obj())  # Shows the object

del obj
print(weak_obj())  # Now shows None — because the object was deleted
Why use it? To avoid holding on to objects you don’t really need.
2. Circular References Are Trouble
Let’s say:
class A:
    def __init__(self):
        self.b = None

class B:
    def __init__(self):
        self.a = None
Then you write:
a = A()
b = B()
a.b = b
b.a = a
Now, a and b are pointing at each other. Even if you delete a and b, reference counting on its own can’t clean them up.
You can fix that by:
Cleaning Manually:
import gc
gc.collect() # This tells Python: "Go clean now!"
Or using weakref for one of the links.
3. Tools to Catch Leaks
These help you see what’s not getting cleaned.
pympler
pip install pympler
Then:
from pympler import muppy, summary
all_objects = muppy.get_objects()
sum_obj = summary.summarize(all_objects)
summary.print_(sum_obj)
This shows what kind of objects are still hanging around.
objgraph
pip install objgraph
Then:
import objgraph
objgraph.show_growth(limit=3) # Shows what’s growing in memory
You can even make a picture that shows who is holding onto what.
guppy / heapy
This one is for deep memory inspection.
pip install guppy3
Then:
from guppy import hpy
h = hpy()
print(h.heap()) # Shows memory used by different types of objects
In Plain Words
Problem | What to Do |
---|---|
Objects staying alive too long | Check if you’re keeping references in global vars |
Circular references | Use gc.collect() or weakref |
Can’t find memory leak | Use pympler , objgraph , or guppy |
The Memory Leak Example (with Circular References)
Here’s a piece of code that leaks memory without us realizing:
import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None  # Points to another Node

def create_leak():
    a = Node("A")
    b = Node("B")
    a.partner = b
    b.partner = a
    return a, b

# Turn on debugging for GC
gc.set_debug(gc.DEBUG_UNCOLLECTABLE)

# Disable auto-GC so we can inspect manually
gc.disable()

for _ in range(1000):
    a, b = create_leak()
    del a
    del b

# Force garbage collection
unreachable = gc.collect()
print(f"Unreachable objects: {unreachable}")
print("Garbage objects:", gc.garbage)
What’s Wrong Here?
- a and b are pointing at each other, so their reference counts never drop to zero.
- Even after we delete a and b, reference counting alone can’t free them.
- With automatic GC disabled, those cycles pile up, which is a memory leak in practice.
- When we call gc.collect(), Python’s cyclic collector detects the stuck objects and frees them; the “unreachable” count shows how many had accumulated.
How to Fix It — Step by Step
Step 1: Use weakref to break the cycle
We’ll change one of the references into a weak reference, so it doesn’t count as a “real” hold.
import gc
import weakref

class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

def create_fixed():
    a = Node("A")
    b = Node("B")
    a.partner = weakref.ref(b)  # ✅ weak reference
    b.partner = a               # Still a strong reference
    return a, b

# Clean slate
gc.set_debug(gc.DEBUG_UNCOLLECTABLE)
gc.disable()

for _ in range(1000):
    a, b = create_fixed()
    del a
    del b

unreachable = gc.collect()
print(f"Unreachable objects: {unreachable}")
print("Garbage objects:", gc.garbage)
Step 2: What changed?
- No memory leak now.
- Garbage collector can clean up properly.
- gc.garbage will be empty.
- Memory stays under control even after 1,000 iterations.
Bonus Tip: Show What’s Leaking with objgraph
pip install objgraph
Then in your code:
import objgraph
objgraph.show_growth(limit=5) # Shows which objects are growing too much
In Short:
Problem | Fix |
---|---|
Objects reference each other (circle) | Use weakref to break the circle |
Still leaking? | Use gc.collect() and check gc.garbage |
Not sure what’s leaking? | Use objgraph , pympler , or guppy3 |
Best Practices for Efficient Memory Management

1. Avoid Unnecessary Object Creation
What this means:
Don’t keep making new objects if you can reuse existing ones. Each object you create takes up memory.
Bad:
def build_list_bad():
    my_list = []
    for i in range(10000):
        my_list.append(str(i))  # Creates a new string each time
    return my_list
Good:
def build_list_good():
    my_list = [str(i) for i in range(10000)]  # More memory-efficient
    return my_list
Tip: Don’t create large objects inside loops unless needed. Reuse stuff when you can.
2. Use Generators and Iterators Instead of Lists
If you don’t need everything at once, don’t load everything at once.
What’s the difference?
- List = Stores all values in memory.
- Generator = Yields one value at a time. Much lighter.
Bad (memory-heavy):
def get_squares():
    return [i*i for i in range(10**6)]
Good (memory-light):
def get_squares_gen():
    for i in range(10**6):
        yield i*i
Now you’re not filling memory with a million squares at once. You’re handing them out one at a time when needed.
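You can verify the size difference with sys.getsizeof: the list object holds a million results, while the generator object only holds its paused state (exact byte counts vary by Python version and platform).

```python
import sys

squares_list = [i * i for i in range(10**6)]
squares_gen = (i * i for i in range(10**6))

# The list stores a million results; the generator stores only its state
print(f"list:      {sys.getsizeof(squares_list):,} bytes")  # several megabytes
print(f"generator: {sys.getsizeof(squares_gen):,} bytes")   # a few hundred bytes
```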
3. Use Context Managers (the with Statement)
Why?
Whenever you open something — like a file or a network connection — Python doesn’t always close it automatically.
That’s where context managers save you. They clean up when done.
Bad:
f = open("data.txt", "r")
data = f.read()
# forgot to close!
Good:
with open("data.txt", "r") as f:
    data = f.read()
# file is auto-closed, even if something crashes
Works for:
- Files
- Database connections
- Network sockets
- Threads
- Even custom objects (you can write your own context managers too)
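Writing your own is easy with contextlib. This sketch (managed_buffer is a made-up example resource) shows the release step running automatically when the block ends:

```python
from contextlib import contextmanager

# managed_buffer is a made-up resource for this sketch
@contextmanager
def managed_buffer(size):
    buffer = bytearray(size)   # "acquire" the resource
    try:
        yield buffer           # hand it to the with-block
    finally:
        buffer.clear()         # "release" runs even if the block raises
        print("buffer released")

with managed_buffer(1024) as buf:
    buf[0] = 255
    print(len(buf))  # 1024
```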
4. Use Manual Garbage Collection (Only When Needed)
Python usually takes care of memory for you, but sometimes — like in real-time systems or long-running apps — you may want to manually tell it to clean up.
How?
import gc
gc.collect() # Force garbage collection
When to use:
- If you’re processing millions of records
- Memory usage keeps climbing
- If you’re debugging a memory leak
But don’t overdo it — Python’s garbage collector is usually smart enough. Only step in when needed.
Step-by-Step: Using memory_profiler to Measure Memory Line by Line
First, install the tool:
pip install memory-profiler
Now, write this Python script:
from memory_profiler import profile

@profile
def using_list():
    result = [i * i for i in range(10**6)]
    return result

@profile
def using_generator():
    result = (i * i for i in range(10**6))
    for _ in result:
        pass

if __name__ == "__main__":
    using_list()
    using_generator()
Run it like this:
python -m memory_profiler your_script.py
What You’ll See:
You’ll get line-by-line memory usage. The function with the list will use a lot more memory compared to the one with the generator.
Bonus: Track Memory Over Time with tracemalloc
No need to install anything: tracemalloc has been built into Python since version 3.4. Just import it.
Sample script using tracemalloc:
import tracemalloc

def waste_memory():
    big_list = [x ** 2 for x in range(10**6)]
    return big_list

tracemalloc.start()
waste_memory()

current, peak = tracemalloc.get_traced_memory()
print(f"Current memory usage: {current / 10**6:.2f} MB")
print(f"Peak memory usage: {peak / 10**6:.2f} MB")
tracemalloc.stop()
Output (your exact numbers will vary):
Current memory usage: 2.5 MB
Peak memory usage: 85.3 MB
That “peak” tells you how much was used at the worst moment.
Which Should You Use?
Tool | Use When… |
---|---|
memory_profiler | You want line-by-line memory use |
tracemalloc | You want overall memory tracking, or want to compare snapshots |
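The snapshot-comparison workflow the table mentions looks roughly like this: take a snapshot before and after the suspect code, then diff them with compare_to to see which lines allocated the most in between.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Allocate something noticeable between the two snapshots
data = [list(range(100)) for _ in range(1000)]

after = tracemalloc.take_snapshot()

# Diff the snapshots: which lines allocated the most since 'before'?
for stat in after.compare_to(before, 'lineno')[:3]:
    print(stat)

tracemalloc.stop()
```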
Real-World Use Cases & Examples

1. Optimizing Garbage Collection in High-Performance Applications
Real-World Case:
You’re building a real-time analytics dashboard (say with WebSockets or FastAPI), processing thousands of user events per second.
Problem:
Python’s garbage collector kicks in too often, interrupting your event loop, causing lags or dropped messages.
Strategy:
- Disable automatic GC during critical performance windows.
- Re-enable or manually trigger it at safe points.
Example:
import gc

def handle_critical_events():
    gc.disable()  # Don't let GC interrupt
    for _ in range(100000):
        process_event()
    gc.enable()
    gc.collect()  # Clean up manually

def process_event():
    # Simulated event handling
    x = {"data": "payload" * 100}
Result: Smoother performance with fewer pauses.
2. Memory Management in Data-Intensive Applications (like Pandas)
Real-World Case:
You’re doing ETL or ML preprocessing with millions of rows in Pandas.
Problem:
Memory keeps growing. Eventually, Python crashes with a MemoryError.
Strategy:
- Delete unused DataFrames explicitly with del, then run manual garbage collection.
- Monitor GC pressure with gc.get_stats().
- Use chunking instead of loading entire files.
Example:
import pandas as pd
import gc

# Load large CSV in chunks
chunks = pd.read_csv("huge_data.csv", chunksize=50000)

for chunk in chunks:
    result = chunk.groupby("user_id").sum()
    # ...process result...
    del chunk
    gc.collect()  # Free memory explicitly
Result: Memory is kept under control even with huge datasets.
3. How Major Frameworks Handle Garbage Collection
Django & Flask:
- They don’t touch garbage collection directly, but long-running apps (like gunicorn or uWSGI workers) can build up memory.
- Use memory leak detection tools (like heapy, objgraph) if memory keeps rising.
Tip for Web Apps:
Use gc.collect() during low-traffic hours or idle server moments (e.g., via a cron job or middleware).
import gc
from django.utils.deprecation import MiddlewareMixin

class MemoryCleanupMiddleware(MiddlewareMixin):
    def process_response(self, request, response):
        if should_clean():  # Your custom condition
            gc.collect()
        return response
Pandas:
Pandas objects (like DataFrame, Series) can hold onto memory, especially when chained or sliced. Use:
- .copy() to avoid memory leaks from views.
- del to remove large intermediate steps.
- gc.collect() after dropping columns or rows.
df = df.drop(columns=["big_column"])
gc.collect()
Wrap-up Cheatsheet:
Use Case | What to Do |
---|---|
Real-time apps (FastAPI, async) | Disable GC during tight loops |
Large datasets (Pandas, ETL) | Chunk, delete, collect |
Web servers (Django, Flask) | Run GC at low-load times |
Memory pressure debugging | Use gc , objgraph , tracemalloc |
Conclusion: Garbage Collection Doesn’t Have to Be a Mystery
Memory management in Python might sound boring at first, but once you see how garbage collection works under the hood — and how much control you actually have — it starts to feel more like a superpower than a chore.
We covered a lot:
- How Python decides when to collect garbage (reference counting + cyclic GC),
- How to spot and fix memory leaks,
- Real-world examples from web apps, data pipelines, and high-performance systems,
- And the tools and tricks (like gc, tracemalloc, objgraph, pympler) that make your life easier.
In short: Python tries to take care of memory for you — but when things get messy, you can step in and make it better.
Pro tip before you go:
If your Python program keeps eating memory like it’s at an all-you-can-eat buffet, it’s probably time to bring in gc.collect() and a few smart memory tricks.
FAQ: Python Garbage Collection & Memory Management
What is garbage collection, and why should I care?
Garbage collection is how Python cleans up unused memory behind the scenes. It helps prevent your program from using up all your RAM. If you’re building something that runs for a long time — like a web server or a data pipeline — caring about memory can keep your app fast and stable.
When should I call gc.collect() in my code?
Use gc.collect() when you know your program just finished doing something memory-heavy (like processing a huge file or dataset). It’s especially useful in loops, long-running services, or data pipelines where automatic collection might not keep up.
How do I know if I have a memory leak?
If memory usage keeps growing even when it shouldn’t, you might have a leak. You can use tools like tracemalloc, objgraph, or pympler to trace what’s still hanging around in memory — even after you’re done with it.
Do I need to worry about garbage collection in my web app?
Not usually, but if you’re running a high-traffic site or notice memory climbing over time, you can add gc.collect() during quiet times or use tools like gunicorn’s memory limits to keep things in check.
External Resources
📦 Bonus: External Resources Kit for Python Memory Management
Want to keep learning after this post? Here’s a handpicked list of practical resources to help you understand how Python handles memory—and how you can take control.
🧠 Official Docs & Tools
- Python gc module (official docs) – Manually control garbage collection and tune performance.
- tracemalloc – Track memory allocations over time.
- objgraph GitHub repo – Visualize object references to catch leaks.
- Pympler Docs – Analyze memory usage with real-time reporting tools.
📖 Tutorials & Talks
- Real Python: Memory Management in Python – Great beginner-friendly breakdown of memory behavior in Python.
- PyCon Talk: Fast Python, Slow Python – Learn how memory and speed interact in real-world Python apps.