How python allocates memory internally?

Certainly! Let's delve deeper into how Python allocates memory internally, exploring the underlying mechanisms, data structures, and algorithms involved. We'll focus on CPython, the reference implementation of Python, as it provides concrete examples of memory management in Python.

Overview of Python's Memory Management

Python's memory management is a complex system that involves several components working together:

  1. Reference Counting: Each object keeps track of how many references point to it. When the reference count drops to zero, the object is immediately deallocated.

  2. Garbage Collection (GC): A cyclic garbage collector detects and collects groups of objects involved in reference cycles that reference counting alone cannot dispose of.

  3. Memory Allocators:

    • Raw Memory Allocator: Low-level memory allocation using C's malloc() and free().

    • Object-Specific Allocators: Specialized allocators for different types of objects.

    • Pymalloc: A specialized allocator optimized for small objects (≤ 512 bytes).

Let's examine each of these components in detail, with examples to illustrate their operation.

Reference Counting in Depth

Reference counting is the primary memory management technique in Python. Every Python object contains a field ob_refcnt, which tracks the number of references to that object.

How Reference Counting Works

  • Incrementing Reference Count: Whenever a new reference to an object is created, the reference count is incremented.

  • Decrementing Reference Count: When a reference is deleted or goes out of scope, the reference count is decremented.

  • Deallocation: If the reference count reaches zero, the object's memory is deallocated immediately.

Example: Reference Counting with Lists

import sys

a = [1, 2, 3]
b = a
print(sys.getrefcount(a))  # Output: 3

Explanation:

  • Initial Reference: When a is created, its reference count is 1.

  • Assignment to b: Assigning b = a creates a new reference to the same list, incrementing the count to 2.

  • getrefcount() Function: The sys.getrefcount() function returns the reference count plus one additional reference used as an argument to the function itself, hence the output is 3.

Deleting References

del a
print(sys.getrefcount(b))  # Output: 2
  • Deleting a: The reference count decreases to 1 (but getrefcount() returns 2 due to the function argument).

  • Object Still Exists: Since b still references the list, it is not deallocated.

Garbage Collection and Cyclic References

Reference counting alone cannot handle cyclic references—situations where objects reference each other, forming a cycle.

Example: Cyclic References

class Node:
    def __init__(self):
        self.next = None

node1 = Node()
node2 = Node()

node1.next = node2
node2.next = node1

del node1
del node2

Explanation:

  • Cycle Formation: node1 and node2 reference each other.

  • Deletion: Deleting node1 and node2 decreases their reference counts but not to zero due to the cycle.

  • Garbage Collector: Python's cyclic garbage collector periodically searches for such cycles and deallocates them.

How the Garbage Collector Works

  • Generations: Objects are grouped into three generations based on their longevity. New objects start in the youngest generation.

  • Thresholds: The garbage collector runs when the number of allocations minus deallocations exceeds certain thresholds for each generation.

  • Cycle Detection: It uses algorithms to detect unreachable cycles and collects them.

Memory Allocation via Pymalloc

Pymalloc is an allocator optimized for small objects, which are prevalent in Python applications.

Pymalloc Structure

  • Arenas: Large contiguous memory blocks (256 KiB each) obtained from the system allocator.

  • Pools: Arenas are divided into pools of 4 KiB each.

  • Blocks: Pools are further divided into blocks of varying sizes (from 8 bytes upwards in multiples of 8 bytes).

Allocation Process

  1. Request: When an object needs memory, the size is rounded up to the nearest multiple of 8 bytes.

  2. Pool Selection: Pymalloc selects a pool that has blocks of the required size.

  3. Block Allocation: A free block within the pool is assigned to the object.

  4. Memory Reuse: When an object is deallocated, its block is marked as free for future allocations.

Example: Allocating Small Objects

a = 10
b = 20
c = "Hello"
  • Integers and Strings: These small objects are allocated via pymalloc.

  • Efficient Allocation: By reusing blocks, pymalloc reduces fragmentation and improves allocation speed.

Large Objects

  • System Allocator: Objects larger than 512 bytes bypass pymalloc and are allocated directly using the system's malloc() and free().

  • Example: Large lists or data structures consume more memory and are managed differently.

Memory Management for Different Object Types

Immutable Objects (Integers, Strings, Tuples)

  • Caching: Small integers (typically from -5 to 256) and interned strings are cached and reused.

  • No Deallocation: Cached objects remain in memory for the program's lifetime.

Example: Integer Caching

a = 100
b = 100
print(a is b)  # Output: True
  • Same Object: Both a and b point to the same integer object in memory.

Mutable Objects (Lists, Dictionaries, Custom Classes)

  • Dynamic Resizing: Lists and dictionaries can change size, requiring reallocation.

  • Over-allocation Strategy: Lists over-allocate memory to minimize reallocations.

Example: List Allocation

import sys

lst = []
print(sys.getsizeof(lst))  # Initial size

lst.append(1)
print(sys.getsizeof(lst))  # Size after appending
  • Memory Overhead: The list pre-allocates extra space to accommodate future elements.

Memory Allocation in Custom Objects

Example: Custom Class with __slots__

By default, instances of custom classes store attributes in a dictionary (__dict__), which can consume more memory.

class MyClass:
    def __init__(self, value):
        self.value = value
  • Memory Usage: Each instance has a __dict__ consuming additional memory.

Using __slots__ to Optimize Memory

class MyClass:
    __slots__ = ('value',)
    def __init__(self, value):
        self.value = value
  • Reduced Memory: __slots__ tells Python to allocate space for the specified attributes directly, without a __dict__.

  • Memory Allocation: The instance's memory footprint is smaller, and attribute access is faster.

Memory Allocation Steps with __slots__

  1. Class Definition: Python creates descriptors for each slot.

  2. Instance Creation: Memory is allocated for the instance, including space for the slots.

  3. Attribute Assignment: Values are stored directly in the allocated slots.

Deep Dive into Pymalloc

Block Management

  • Free Lists: For each block size, pymalloc maintains a free list of available blocks.

  • Allocation Algorithm:

    • First Fit: Allocates the first available block that fits the requested size.

    • Pool Replenishment: If no blocks are available, a new pool is created or fetched.

Memory Fragmentation Handling

  • Pooling Strategy: By dividing memory into pools and blocks, pymalloc reduces fragmentation.

  • Recycling Pools: Empty pools are returned to the arena, and completely unused arenas can be released back to the system.

Memory Overhead

  • Metadata Storage: Each pool and arena has associated metadata (e.g., usage counts), consuming additional memory.

  • Trade-offs: The overhead is justified by the performance gains in allocation and deallocation.

Interaction with Operating System Memory Management

  • System Calls: Python uses system calls like malloc(), free(), mmap(), and munmap() to request and release memory.

  • Virtual Memory: The operating system manages virtual memory pages, which Python's memory allocator utilizes.

Example: Large Memory Allocation

large_list = [0] * 10**7  # Allocate a list with 10 million zeros
  • Memory Request: Python requests a large block of memory from the system allocator.

  • Impact on OS: The operating system may need to allocate additional memory pages to satisfy the request.

Monitoring and Debugging Memory Usage

Tools and Techniques

  • sys.getsizeof(): Returns the size of an object in bytes.

  • Memory Profilers: Tools like memory_profiler can track memory usage over time.

  • Debug Builds: Python can be compiled with debugging options to track memory allocation and deallocation.

Example: Using memory_profiler

from memory_profiler import profile

@profile
def allocate_memory():
    large_list = [0] * 10**7
    return large_list

allocate_memory()
  • Output: The profiler reports the memory usage before and after the function execution.

Best Practices for Memory Management in Python

  • Reuse Objects: Reusing existing objects can reduce allocation overhead.

  • Use Generators: Generators consume less memory than lists because they yield items one at a time.

  • Limit Scope: Variables should be limited to the necessary scope to allow for earlier deallocation.

  • Avoid Cycles: Where possible, design data structures to avoid cyclic references.

  • Use __slots__: For classes with many instances, using __slots__ can reduce memory usage.

Advanced Topics

Custom Memory Allocators

  • allocators Module: Python allows customization of memory allocators using the allocators module (in C extensions).

  • Third-Party Allocators: External libraries like jemalloc or tcmalloc can replace the default system allocator for performance improvements.

Memory Pools in Extension Modules

  • Extension Types: C extension modules can implement custom memory pools for specific object types.

  • Example: NumPy uses its own memory management strategies for large arrays.

Conclusion

Python's memory allocation is a sophisticated system designed to balance performance and memory efficiency. Understanding the intricacies of:

  • Reference Counting: Immediate deallocation of objects without references.

  • Garbage Collection: Handling of cyclic references and unreachable objects.

  • Pymalloc and Allocators: Efficient allocation of small objects and interaction with the system allocator.