Commit 14547a0
Update InternalDocs/garbge_collector.md
Co-Authored-By: Zanie Blue <contact@zanie.dev>
1 parent 00a7baa commit 14547a0


InternalDocs/garbage_collector.md

Lines changed: 55 additions & 124 deletions
@@ -107,7 +107,7 @@ As is explained later in the
 [Optimization: reusing fields to save memory](#optimization-reusing-fields-to-save-memory)
 section, these two extra fields are normally used to keep doubly linked lists of all the
 objects tracked by the garbage collector (these lists are the GC generations, more on
-that in the [Optimization: incremental collection](#Optimization-incremental-collection) section), but
+that in the [Optimization: generations](#Optimization-generations) section), but
 they are also reused to fulfill other purposes when the full doubly linked list
 structure is not needed as a memory optimization.

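As context for the hunk above: which objects carry the extra GC header fields and sit on the collector's linked lists can be observed from Python with `gc.is_tracked()`. A minimal sketch, not part of the commit:

```python
import gc

# Atomic, immutable objects can never be part of a reference cycle,
# so the collector never links them into its lists of tracked objects.
print(gc.is_tracked(42))    # False: ints are not tracked
print(gc.is_tracked("s"))   # False: strings are not tracked

# Container objects get the extra GC header fields and are tracked.
print(gc.is_tracked([]))    # True: lists are tracked
```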
@@ -199,22 +199,22 @@ unreachable:

 ```pycon
 >>> import gc
->>> 
+>>>
 >>> class Link:
 ...     def __init__(self, next_link=None):
 ...         self.next_link = next_link
-... 
+...
 >>> link_3 = Link()
 >>> link_2 = Link(link_3)
 >>> link_1 = Link(link_2)
 >>> link_3.next_link = link_1
 >>> A = link_1
 >>> del link_1, link_2, link_3
->>> 
+>>>
 >>> link_4 = Link()
 >>> link_4.next_link = link_4
 >>> del link_4
->>> 
+>>>
 >>> # Collect the unreachable Link object (and its .__dict__ dict).
 >>> gc.collect()
 2
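The doctest in this hunk can also be exercised as a plain script. A minimal sketch (the `destroyed` list and the `__del__` hook are illustrative additions, not part of the commit) that observes the collector reclaiming the self-referential cycle:

```python
import gc

destroyed = []

class Link:
    def __init__(self, next_link=None):
        self.next_link = next_link

    def __del__(self):
        # Record that the collector actually destroyed the object.
        destroyed.append("collected")

# A self-referential cycle: `del` drops the name, but the object's
# refcount never reaches zero because the cycle keeps it alive, so
# only the cyclic collector can reclaim it.
link_4 = Link()
link_4.next_link = link_4
del link_4

assert not destroyed   # still alive: the cycle holds a reference
gc.collect()           # finds and breaks the unreachable cycle
assert destroyed == ["collected"]
```

Since PEP 442 (Python 3.4), objects with `__del__` methods that are part of cycles are collected like any other, so the finalizer is a reliable way to watch the collection happen.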
@@ -350,88 +350,42 @@ follows these steps in order:
 the reference counts fall to 0, triggering the destruction of all unreachable
 objects.

-Optimization: incremental collection
-====================================
+Optimization: generations
+=========================

-In order to bound the length of each garbage collection pause, the GC implementation
-for the default build uses incremental collection with two generations.
+In order to limit the time each garbage collection takes, the GC
+implementation for the default build uses a popular optimization:
+generations.

 Generational garbage collection takes advantage of what is known as the weak
 generational hypothesis: Most objects die young.
 This has proven to be very close to the reality of many Python
 programs as many temporary objects are created and destroyed very quickly.

 To take advantage of this fact, all container objects are segregated into
-two generations: young and old. Every new object starts in the young generation.
-Each garbage collection scans the entire young generation and part of the old generation.
-
-The time taken to scan the young generation can be controlled by controlling its
-size, but the size of the old generation cannot be controlled.
-In order to keep pause times down, scanning of the old generation of the heap
-occurs in increments.
-
-To keep track of what has been scanned, the old generation contains two lists:
-
-* Those objects that have not yet been scanned, referred to as the `pending` list.
-* Those objects that have been scanned, referred to as the `visited` list.
-
-To detect and collect all unreachable objects in the heap, the garbage collector
-must scan the whole heap. This whole heap scan is called a full scavenge.
-
-Increments
-----------
-
-Each full scavenge is performed in a series of increments.
-For each full scavenge, the combined increments will cover the whole heap.
-
-Each increment is made up of:
-
-* The young generation
-* The old generation's least recently scanned objects
-* All objects reachable from those objects that have not yet been scanned this full scavenge
-
-The surviving objects (those that are not collected) are moved to the back of the
-`visited` list in the old generation.
-
-When a full scavenge starts, no objects in the heap are considered to have been scanned,
-so all objects in the old generation must be in the `pending` space.
-When all objects in the heap have been scanned a cycle ends, and all objects are moved
-to the `pending` list again. To avoid having to traverse the entire list, which list is
-`pending` and which is `visited` is determined by a field in the `GCState` struct.
-The `visited` and `pending` lists can be swapped by toggling this bit.
-
-Correctness
------------
-
-The [algorithm for identifying cycles](#Identifying-reference-cycles) will find all
-unreachable cycles in a list of objects, but will not find any cycles that are
-even partly outside of that list.
-Therefore, to be guaranteed that a full scavenge will find all unreachable cycles,
-each cycle must be fully contained within a single increment.
-
-To make sure that no partial cycles are included in the increment we perform a
-[transitive closure](https://en.wikipedia.org/wiki/Transitive_closure)
-over reachable, unscanned objects from the initial increment.
-Since the transitive closure of objects reachable from an object must be a (non-strict)
-superset of any unreachable cycle including that object, we are guaranteed that a
-transitive closure cannot contain any partial cycles.
-We can exclude scanned objects, as they must have been reachable when scanned.
-If a scanned object becomes part of an unreachable cycle after being scanned, it will
-not be collected at this time, but it will be collected in the next full scavenge.
+three spaces/generations. Every new
+object starts in the first generation (generation 0). The previous algorithm is
+executed only over the objects of a particular generation and if an object
+survives a collection of its generation it will be moved to the next one
+(generation 1), where it will be surveyed for collection less often. If
+the same object survives another GC round in this new generation (generation 1)
+it will be moved to the last generation (generation 2) where it will be
+surveyed the least often.

 > [!NOTE]
 > The GC implementation for the free-threaded build does not use incremental collection.
 > Every collection operates on the entire heap.

+
 In order to decide when to run, the collector keeps track of the number of object
 allocations and deallocations since the last collection. When the number of
 allocations minus the number of deallocations exceeds `threshold0`,
-collection starts. `threshold1` determines the fraction of the old
-collection that is included in the increment.
-The fraction is inversely proportional to `threshold1`,
-as historically a larger `threshold1` meant that old generation
-collections were performed less frequently.
-`threshold2` is ignored.
+collection starts. Initially only generation 0 is examined. If generation 0 has
+been examined more than `threshold_1` times since generation 1 has been
+examined, then generation 1 is examined as well. With generation 2,
+things are a bit more complicated; see
+[Collecting the oldest generation](#Collecting-the-oldest-generation) for
+more information.

 These thresholds can be examined using the
 [`gc.get_threshold()`](https://docs.python.org/3/library/gc.html#gc.get_threshold)
@@ -440,7 +394,7 @@ function:
 ```pycon
 >>> import gc
 >>> gc.get_threshold()
-(700, 10, 10)
+(2000, 10, 10)
 ```

 The content of these generations can be examined using the
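The thresholds changed in this hunk are also adjustable at runtime via `gc.set_threshold()`. A minimal sketch, hedged because the default value of `threshold0` is version-dependent (as the diff itself shows, 700 versus 2000):

```python
import gc

t0, t1, t2 = gc.get_threshold()
print((t0, t1, t2))  # e.g. (700, 10, 10) or (2000, 10, 10), version-dependent

# Raising threshold0 makes generation-0 collections run less often:
# more net allocations (allocations minus deallocations) are needed
# before an automatic collection is triggered.
gc.set_threshold(t0 * 2, t1, t2)
assert gc.get_threshold() == (t0 * 2, t1, t2)

gc.set_threshold(t0, t1, t2)  # restore the original thresholds
```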
@@ -453,84 +407,61 @@ specifically in a generation by calling `gc.collect(generation=NUM)`.
 ...     pass
 ...
 >>> # Move everything to the old generation so it's easier to inspect
->>> # the young generation.
+>>> # the younger generation.
 >>> gc.collect()
 0
 >>> # Create a reference cycle.
 >>> x = MyObj()
 >>> x.self = x
->>> 
->>> # Initially the object is in the young generation.
+>>>
+>>> # Initially the object is in the youngest generation.
 >>> gc.get_objects(generation=0)
 [..., <__main__.MyObj object at 0x7fbcc12a3400>, ...]
->>> 
+>>>
 >>> # After a collection of the youngest generation the object
->>> # moves to the old generation.
+>>> # moves to the next generation.
 >>> gc.collect(generation=0)
 0
 >>> gc.get_objects(generation=0)
 []
 >>> gc.get_objects(generation=1)
-[]
->>> gc.get_objects(generation=2)
 [..., <__main__.MyObj object at 0x7fbcc12a3400>, ...]
 ```

+Collecting the oldest generation
+--------------------------------
+
+In addition to the various configurable thresholds, the GC only triggers a full
+collection of the oldest generation if the ratio `long_lived_pending / long_lived_total`
+is above a given value (hardwired to 25%). The reason is that, while "non-full"
+collections (that is, collections of the young and middle generations) will always
+examine roughly the same number of objects (determined by the aforementioned
+thresholds) the cost of a full collection is proportional to the total
+number of long-lived objects, which is virtually unbounded. Indeed, it has
+been remarked that doing a full collection every <constant number> of object
+creations entails a dramatic performance degradation in workloads which consist
+of creating and storing lots of long-lived objects (for example, building a large list
+of GC-tracked objects would show quadratic performance, instead of linear as
+expected). Using the above ratio, instead, yields amortized linear performance
+in the total number of objects (the effect of which can be summarized thusly:
+"each full garbage collection is more and more costly as the number of objects
+grows, but we do fewer and fewer of them").
+

 Optimization: excluding reachable objects
 =========================================

 An object cannot be garbage if it can be reached. To avoid having to identify
-reference cycles across the whole heap, we can reduce the amount of work done
-considerably by first identifying objects reachable from objects known to be
-alive. These objects are excluded from the normal cyclic detection process.
-
-The default and free-threaded build both implement this optimization but in
-slightly different ways.
-
-Finding reachable objects for the default build GC
---------------------------------------------------
-
-This works by first moving most reachable objects to the `visited` space.
-Empirically, most reachable objects can be reached from a small set of global
-objects and local variables. This step does much less work per object, so
-reduces the time spent performing garbage collection by at least half.
-
-> [!NOTE]
-> Objects that are not determined to be reachable by this pass are not necessarily
-> unreachable. We still need to perform the main algorithm to determine which objects
-> are actually unreachable.
-We use the same technique of forming a transitive closure as the incremental
-collector does to find reachable objects, seeding the list with some global
-objects and the currently executing frames.
-
-This phase moves objects to the `visited` space, as follows:
-
-1. All objects directly referred to by any builtin class, the `sys` module, the `builtins`
-   module and all objects directly referred to from stack frames are added to a working
-   set of reachable objects.
-2. Until this working set is empty:
-   1. Pop an object from the set and move it to the `visited` space
-   2. For each object directly reachable from that object:
-      * If it is not already in `visited` space and it is a GC object,
-        add it to the working set
-
-
-Before each increment of collection is performed, the stacks are scanned
-to check for any new stack frames that have been created since the last
-increment. All objects directly referred to from those stack frames are
-added to the working set.
-Then the above algorithm is repeated, starting from step 2.
-
+reference cycles across the whole heap, the free-threaded build first identifies
+objects reachable from objects known to be alive. These objects are excluded
+from the normal cyclic detection process.

 Finding reachable objects for the free-threaded GC
 --------------------------------------------------

 Within the `gc_free_threading.c` implementation, this is known as the "mark
-alive" pass or phase. It is similar in concept to what is done for the default
-build GC. Rather than moving objects between double-linked lists, the
-free-threaded GC uses a flag in `ob_gc_bits` to track if an object is
-found to be definitely alive (not garbage).
+alive" pass or phase. The free-threaded GC uses a flag in `ob_gc_bits` to track
+if an object is found to be definitely alive (not garbage).

 To find objects reachable from known alive objects, known as the "roots", the
 `gc_mark_alive_from_roots()` function is used. Root objects include
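The generational promotion restored by this commit can be checked from Python as a plain script. A minimal sketch, assuming a default-build CPython 3.8+ (where `gc.get_objects()` accepts a `generation` argument); note an automatic collection between the two snapshots could in principle perturb the result:

```python
import gc

class MyObj:
    pass

gc.collect()  # start clean: push pre-existing objects out of the youngest generation

x = MyObj()
x.self = x    # a reference cycle, so only the cyclic GC can ever reclaim it

# A new object starts in the youngest generation (generation 0).
assert any(o is x for o in gc.get_objects(generation=0))

# After surviving a collection of generation 0 it is promoted to generation 1.
gc.collect(generation=0)
assert any(o is x for o in gc.get_objects(generation=1))
```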