Crash report
What happened?
Ok, this is hard to reproduce reliably, and may not even be fixable, but I'm reporting for completeness here.
If the kernel has to grow the C stack while there is extreme memory pressure, the growth can fail, but the Python C stack guard still believes there's plenty of space, so the process overflows the stack and segfaults.
It's tricky to reproduce reliably, because you have to avoid ordinary allocation failures (which unwind the stack) and also avoid tripping Python's own stack overflow checks. So it only triggers where there is a high C frame to Python frame ratio and the only allocations along the way are small objects that still succeed. (I found this by fuzzing deeply nested imports, but the reproducer below is more reliable.)
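To make the gap concrete, here is a minimal Linux-only sketch that compares how much stack is currently mapped with how far the stack is allowed to grow. `parse_stack_region` and `report` are hypothetical helper names, not part of the reproducer. It is my assumption that the guard's limits track the maximum extent rather than the mapped size, which is consistent with the LLDB session further down but which I have not confirmed in the source.

```python
def parse_stack_region(maps_text):
    """Return (start, end) of the [stack] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[stack]":
            # First field looks like "ffffffeeb000-1000000000000".
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    return None


def report():
    """Print the mapped main-thread stack size next to RLIMIT_STACK."""
    import resource  # Unix-only module, imported lazily on purpose

    # Linux-only: /proc/self/maps lists this process's memory mappings.
    with open("/proc/self/maps") as maps:
        region = parse_stack_region(maps.read())
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)

    if region is not None:
        print(f"mapped stack: {(region[1] - region[0]) // 1024} KiB")
    if soft == resource.RLIM_INFINITY:
        print("RLIMIT_STACK (soft): unlimited")
    else:
        print(f"RLIMIT_STACK (soft): {soft // 1024} KiB")
```

Calling `report()` early in a process typically shows a mapped stack much smaller than the rlimit. The difference is stack the kernel has not yet had to provide, and that growth is the step that fails here.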
The script below should reproduce it, but you may need to tweak the PRESSURE_START_KIB and PRESSURE_STOP_KIB constants depending on the exact stack/memory setup:
repro2.py
import faulthandler
import os
import resource
AS_LIMIT_MIB = int(os.environ.get("AS_LIMIT_MIB", "512"))
PRESSURE_START_KIB = int(os.environ.get("PRESSURE_START_KIB", "506817"))
PRESSURE_STOP_KIB = int(os.environ.get("PRESSURE_STOP_KIB", "506832"))
SPARE_KIB = 256
TUPLE_DEPTH = 5000
PRESSURE_CHUNKS = (
    16 * 1024 * 1024,
    4 * 1024 * 1024,
    1024 * 1024,
    256 * 1024,
    64 * 1024,
    16 * 1024,
    4096,
)


def nested(depth):
    obj = 1
    for _ in range(depth):
        obj = (obj,)
    return obj


faulthandler.enable(all_threads=True)
# Build the objects before applying memory pressure, so the comparison is the
# first deep native-stack operation after the spare address space is released.
left = nested(TUPLE_DEPTH)
right = nested(TUPLE_DEPTH)
as_limit = AS_LIMIT_MIB * 1024 * 1024
resource.setrlimit(resource.RLIMIT_AS, (as_limit, as_limit))
for pressure_kib in range(PRESSURE_START_KIB, PRESSURE_STOP_KIB + 1, 1):
    spare = bytearray(SPARE_KIB * 1024)
    pressure = []
    remaining = pressure_kib * 1024
    try:
        for size in PRESSURE_CHUNKS:
            while remaining >= size:
                pressure.append(bytearray(size))
                remaining -= size
    except MemoryError:
        print(f"{pressure_kib}KiB: MemoryError before compare", flush=True)
        break
    del spare
    print(f"{pressure_kib}KiB:", flush=True)
    print(left == right, flush=True)
And a dockerfile that reproduces it on aarch64 (for me):
Dockerfile
FROM ubuntu:latest
ARG CPYTHON_REF=main
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        clang \
        git \
        libbz2-dev \
        libffi-dev \
        libgdbm-dev \
        liblzma-dev \
        libncurses-dev \
        libreadline-dev \
        libsqlite3-dev \
        libssl-dev \
        libxml2-dev \
        libxslt1-dev \
        libzstd-dev \
        make \
        pkg-config \
        python3 \
        tk-dev \
        zlib1g-dev && \
    rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/python/cpython /src/cpython && \
    cd /src/cpython && \
    git checkout --detach "$CPYTHON_REF"
RUN cd /src/cpython && \
    ./configure --prefix=/opt/cpython --without-ensurepip && \
    make -j"$(nproc)" && \
    make install
COPY repro2.py /src/repro2.py
ENV AS_LIMIT_MIB=512
ENV PRESSURE_START_KIB=506817
ENV PRESSURE_STOP_KIB=506832
ENV PYTHONUNBUFFERED=1
# CMD ["/opt/cpython/bin/python3", "/src/repro2.py"]
CMD ["/bin/sh", "-c", "set -xe && while true; do /opt/cpython/bin/python3 /src/repro2.py; sleep 0.2; done"]
Here's LLDB showing that the fault is at the stack pointer (fault address=0xffffffeebfc0), and yet the Python C stack guard's hard limit is still about 6.9 MiB (7225888 bytes) below us:
LLDB session
(lldb) thread info
thread #1: tid = 101, 0x0000aaaaaafeb430 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1100, name = 'python3', stop reason = signal SIGSEGV: address not mapped to object (fault address=0xffffffeebfc0)
(lldb) register read sp fp pc
sp = 0x0000ffffffeec020
fp = 0x0000ffffffeec020
pc = 0x0000aaaaaafeb430 python3.16`PyObject_RichCompare at object.c:1100
(lldb) memory region $sp
[0x0000ffffffeeb000-0x0001000000000000) rw- [stack]
(lldb) script import lldb, os; p=lldb.debugger.GetSelectedTarget().process.GetProcessID(); print("".join(l for l in open(f"/proc/{p}/maps") if "[stack]" in l))
ffffffeeb000-1000000000000 rw-p 00000000 00:00 0 [stack]
(lldb) bt 20
* thread #1, name = 'python3', stop reason = signal SIGSEGV: address not mapped to object (fault address=0xffffffeebfc0)
* frame #0: 0x0000aaaaaafeb430 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1100
frame #1: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1135:11
frame #2: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at tupleobject.c:755:17
frame #3: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1064:15 [inlined]
frame #4: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1113:21
frame #5: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1135:11
frame #6: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at tupleobject.c:755:17
frame #7: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1064:15 [inlined]
frame #8: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1113:21
frame #9: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1135:11
frame #10: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at tupleobject.c:755:17
frame #11: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1064:15 [inlined]
frame #12: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1113:21
frame #13: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1135:11
frame #14: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at tupleobject.c:755:17
frame #15: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1064:15 [inlined]
frame #16: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1113:21
frame #17: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1135:11
frame #18: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3ed0, w=0x0000fffff7a35f90, op=2) at tupleobject.c:755:17
frame #19: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3ed0, w=0x0000fffff7a35f90, op=2) at object.c:1064:15 [inlined]
(lldb) p ((_PyThreadStateImpl*)tstate)->c_stack_top
(uintptr_t) 281474976706560
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_top
(uintptr_t) 0x0000fffffffff000
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_soft_limit
(uintptr_t) 0x0000ffffff810000
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 0x0000ffffff808000
(lldb) register read sp
sp = 0x0000ffffffeec220
(lldb) p/d (uintptr_t)$sp - ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 7225888
(lldb) p/x (uintptr_t)$sp - ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 0x00000000006e4220
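The arithmetic behind those numbers, as a small sketch. Every constant is copied from the session above. The variable names are mine, and reading `c_stack_top - c_stack_hard_limit` as "the guard's budget" is my interpretation, not something taken from the CPython source.

```python
# Values copied from the LLDB session (aarch64, main thread).
STACK_REGION_START = 0x0000FFFFFFEEB000  # low end of the [stack] mapping
STACK_REGION_END = 0x0001000000000000    # high end of the [stack] mapping
C_STACK_TOP = 0x0000FFFFFFFFF000         # tstate->c_stack_top
C_STACK_HARD_LIMIT = 0x0000FFFFFF808000  # tstate->c_stack_hard_limit
SP = 0x0000FFFFFFEEC220                  # from the final `register read sp`

# How much stack the kernel has actually mapped so far.
mapped_stack = STACK_REGION_END - STACK_REGION_START
# How much stack the guard is prepared to allow before raising.
guard_budget = C_STACK_TOP - C_STACK_HARD_LIMIT
# Distance from the current stack pointer down to the hard limit.
headroom = SP - C_STACK_HARD_LIMIT
# Distance from the current stack pointer down to the end of the mapping.
left_in_mapping = SP - STACK_REGION_START

for name, value in [
    ("mapped stack", mapped_stack),
    ("guard budget", guard_budget),
    ("headroom to hard limit", headroom),
    ("left in mapping", left_in_mapping),
]:
    print(f"{name:>24}: {value:>9} bytes ({value / 2**20:.2f} MiB)")
```

So the guard budgets roughly 8 MiB and sees nearly 7 MiB of headroom, while only about 1.1 MiB is actually mapped and the stack pointer is within a few KiB of the bottom of that mapping.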
As I mentioned at the top, I don't even know how you'd defend against this, but we have a record now.
Note: this also reproduces with address space limits up to 50 GB if you alter the constants accordingly, so while 512 MB is quite small, this does apply generally.
AI statement: The crash was identified using normal AFL++ fuzzing. I used ChatGPT and Claude to streamline the debugging and reproducer development, but all findings were double-checked, reproduced, tested, and verified by me within my (somewhat limited) knowledge of Python internals. If you have questions about this, feel free to ask.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.16.0a0 (heads/main:540b3d0a7fa, Jun 1 2026, 10:44:26) [GCC 15.2.0]