Segfault from c stack overflow when under memory pressure #150722

@stestagg

Crash report

What happened?

OK, this is hard to reproduce reliably and may not even be fixable, but I'm reporting it here for completeness.

If the kernel needs to grow the C stack of the Python process while there is extreme memory pressure, that allocation can fail. Python's C stack guard still believes there is plenty of space, so the process segfaults on what is effectively a stack overflow.
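To make the mismatch a little more concrete, here is a minimal, Linux-only sketch that prints the nominal stack limit next to the size of the `[stack]` region that is actually mapped at the moment. It assumes the guard's limits are based on the nominal stack size rather than on what is currently mapped, which is what the LLDB numbers later in this report suggest but which this sketch does not prove. The helper names (`parse_stack_region`, `describe_stack`) are made up for illustration; the sample line is the one from the LLDB session in this report.

```python
import os
import resource

# Sample /proc/<pid>/maps line, copied from the LLDB session in this report.
SAMPLE_MAPS = (
    "ffffffeeb000-1000000000000 rw-p 00000000 00:00 0"
    "                         [stack]\n"
)


def parse_stack_region(maps_text):
    """Return (start, end) of the [stack] mapping, or None if there is none."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[stack]":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    return None


def describe_stack():
    """Print the nominal stack limit and the currently mapped stack size."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    limit = "unlimited" if soft == resource.RLIM_INFINITY else f"{soft // 1024} KiB"
    with open("/proc/self/maps") as maps:
        region = parse_stack_region(maps.read())
    if region is None:
        print(f"RLIMIT_STACK={limit}, no [stack] mapping found")
    else:
        start, end = region
        print(f"RLIMIT_STACK={limit}, mapped now={(end - start) // 1024} KiB")


# Only meaningful on Linux, where /proc/self/maps exists.
if os.path.exists("/proc/self/maps"):
    describe_stack()
```

On a typical Linux process the mapped size is much smaller than the limit, because the kernel grows the main thread's stack on demand. That on-demand growth is the step that can fail under memory pressure.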

It's a bit tricky to reproduce reliably, because you need to avoid ordinary allocation failures (which unwind the stack) and also avoid Python's own stack overflow checks. So it triggers where there is a high ratio of C frames to Python frames and only small object allocations to dodge. (I found this by fuzzing deeply nested imports, but the reproducer below is more reliable.)
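As an illustration of what a high ratio of C frames to Python frames looks like, the sketch below compares nested tuples whose innermost element records how many Python frames are on the stack when its `__eq__` runs. The `Probe` class and the helper names are hypothetical and not part of the reproducer; the depth of 200 is chosen only to stay well within any recursion limit.

```python
import sys


def python_depth():
    """Count the Python frames currently on the stack."""
    frame = sys._getframe()
    count = 0
    while frame is not None:
        count += 1
        frame = frame.f_back
    return count


class Probe:
    """Leaf object that records the Python frame depth seen inside __eq__."""

    seen_depth = None

    def __eq__(self, other):
        Probe.seen_depth = python_depth()
        return True


def nested(leaf, depth):
    """Wrap leaf in `depth` levels of 1-tuples."""
    obj = leaf
    for _ in range(depth):
        obj = (obj,)
    return obj


def probe_depth(tuple_depth):
    """Compare two nested tuples and return the depth the leaf observed."""
    left = nested(Probe(), tuple_depth)
    right = nested(Probe(), tuple_depth)
    Probe.seen_depth = None
    equal = left == right  # recursion happens in C, inside tuple comparison
    return equal, Probe.seen_depth


shallow_equal, shallow = probe_depth(1)
deep_equal, deep = probe_depth(200)
print(shallow, deep)
```

Both calls report the same Python frame depth, because every extra level of tuple nesting adds C frames only (the `PyObject_RichCompare` / `tuple_richcompare` chain visible in the backtrace further down) and no Python frames.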

I have code that should reproduce it, but you may need to tweak the PRESSURE_START_KIB/PRESSURE_STOP_KIB constants depending on the exact stack/memory setup:

repro2.py
import faulthandler
import os
import resource


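# Address-space limit (RLIMIT_AS) applied to this process, in MiB.
AS_LIMIT_MIB = int(os.environ.get("AS_LIMIT_MIB", "512"))
# Inclusive range of memory-pressure sizes to sweep, in KiB.
PRESSURE_START_KIB = int(os.environ.get("PRESSURE_START_KIB", "506817"))
PRESSURE_STOP_KIB = int(os.environ.get("PRESSURE_STOP_KIB", "506832"))
# Size of the spare buffer that is freed just before the comparison, in KiB.
SPARE_KIB = 256
# Nesting depth of the tuples being compared.
TUPLE_DEPTH = 5000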

PRESSURE_CHUNKS = (
    16 * 1024 * 1024,
    4 * 1024 * 1024,
    1024 * 1024,
    256 * 1024,
    64 * 1024,
    16 * 1024,
    4096,
)


def nested(depth):
    obj = 1
    for _ in range(depth):
        obj = (obj,)
    return obj


faulthandler.enable(all_threads=True)

# Build the objects before applying memory pressure, so the comparison is the
# first deep native-stack operation after the spare address space is released.
left = nested(TUPLE_DEPTH)
right = nested(TUPLE_DEPTH)

as_limit = AS_LIMIT_MIB * 1024 * 1024
resource.setrlimit(resource.RLIMIT_AS, (as_limit, as_limit))

for pressure_kib in range(PRESSURE_START_KIB, PRESSURE_STOP_KIB + 1):
    spare = bytearray(SPARE_KIB * 1024)
    pressure = []
    remaining = pressure_kib * 1024

    try:
        for size in PRESSURE_CHUNKS:
            while remaining >= size:
                pressure.append(bytearray(size))
                remaining -= size
    except MemoryError:
        print(f"{pressure_kib}KiB: MemoryError before compare", flush=True)
        break

    del spare
    print(f"{pressure_kib}KiB:", flush=True)
    print(left == right, flush=True)
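Since the right pressure window depends on the machine, one option is to drive the reproducer from a small wrapper that tries successive windows and stops at the first run killed by SIGSEGV. This is only a sketch: `windows`, `died_from_segv` and `sweep` are hypothetical helpers, and the window width of 16 KiB simply mirrors the default START/STOP range above.

```python
import os
import signal
import subprocess
import sys


def windows(start_kib, stop_kib, width_kib):
    """Yield inclusive (start, stop) windows covering start_kib..stop_kib."""
    lo = start_kib
    while lo <= stop_kib:
        hi = min(lo + width_kib - 1, stop_kib)
        yield lo, hi
        lo = hi + 1


def died_from_segv(returncode):
    """subprocess reports death by signal N as a return code of -N."""
    return returncode == -signal.SIGSEGV


def sweep(script, start_kib, stop_kib, width_kib=16, python=sys.executable):
    """Run `script` once per window; return the first window that segfaults."""
    for lo, hi in windows(start_kib, stop_kib, width_kib):
        env = dict(
            os.environ,
            PRESSURE_START_KIB=str(lo),
            PRESSURE_STOP_KIB=str(hi),
        )
        proc = subprocess.run([python, script], env=env)
        if died_from_segv(proc.returncode):
            return lo, hi
    return None
```

For example, `sweep("/src/repro2.py", 500000, 510000)` would walk that (arbitrarily chosen) range in 16 KiB steps. Because the crash is not deterministic, a window that passes once is not proof that it can never fail.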

And a Dockerfile that reproduces it on aarch64 (for me):

Dockerfile
FROM ubuntu:latest

ARG CPYTHON_REF=main
ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        clang \
        git \
        libbz2-dev \
        libffi-dev \
        libgdbm-dev \
        liblzma-dev \
        libncurses-dev \
        libreadline-dev \
        libsqlite3-dev \
        libssl-dev \
        libxml2-dev \
        libxslt1-dev \
        libzstd-dev \
        make \
        pkg-config \
        python3 \
        tk-dev \
        zlib1g-dev && \
    rm -rf /var/lib/apt/lists/*

RUN git clone https://github.com/python/cpython /src/cpython && \
    cd /src/cpython && \
    git checkout --detach "$CPYTHON_REF"

RUN cd /src/cpython && \
    ./configure --prefix=/opt/cpython --without-ensurepip && \
    make -j"$(nproc)" && \
    make install

COPY repro2.py /src/repro2.py

ENV AS_LIMIT_MIB=512
ENV PRESSURE_START_KIB=506817
ENV PRESSURE_STOP_KIB=506832

ENV PYTHONUNBUFFERED=1

# CMD ["/opt/cpython/bin/python3", "/src/repro2.py"]
CMD ["/bin/sh", "-c", "set -xe && while true; do /opt/cpython/bin/python3 /src/repro2.py; sleep 0.2; done"]

Here's an LLDB session showing that the fault is at the stack pointer (fault address=0xffffffeebfc0, just below sp), and yet the Python C stack guard's hard limit is still about 6.9 MiB below us:

LLDB session
(lldb) thread info
thread #1: tid = 101, 0x0000aaaaaafeb430 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1100, name = 'python3', stop reason = signal SIGSEGV: address not mapped to object (fault address=0xffffffeebfc0)

(lldb) register read sp fp pc
      sp = 0x0000ffffffeec020
      fp = 0x0000ffffffeec020
      pc = 0x0000aaaaaafeb430  python3.16`PyObject_RichCompare at object.c:1100
(lldb) memory region $sp
[0x0000ffffffeeb000-0x0001000000000000) rw- [stack]
(lldb) script import lldb, os; p=lldb.debugger.GetSelectedTarget().process.GetProcessID(); print("".join(l for l in open(f"/proc/{p}/maps") if "[stack]" in l))
ffffffeeb000-1000000000000 rw-p 00000000 00:00 0                         [stack]

(lldb) bt 20
* thread #1, name = 'python3', stop reason = signal SIGSEGV: address not mapped to object (fault address=0xffffffeebfc0)
  * frame #0: 0x0000aaaaaafeb430 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1100
    frame #1: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3d40, w=0x0000fffff7a35e00, op=2) at object.c:1135:11
    frame #2: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at tupleobject.c:755:17
    frame #3: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1064:15 [inlined]
    frame #4: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1113:21
    frame #5: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3d90, w=0x0000fffff7a35e50, op=2) at object.c:1135:11
    frame #6: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at tupleobject.c:755:17
    frame #7: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1064:15 [inlined]
    frame #8: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1113:21
    frame #9: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3de0, w=0x0000fffff7a35ea0, op=2) at object.c:1135:11
    frame #10: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at tupleobject.c:755:17
    frame #11: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1064:15 [inlined]
    frame #12: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1113:21
    frame #13: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3e30, w=0x0000fffff7a35ef0, op=2) at object.c:1135:11
    frame #14: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at tupleobject.c:755:17
    frame #15: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1064:15 [inlined]
    frame #16: 0x0000aaaaaafeb594 python3.16`PyObject_RichCompare(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1113:21
    frame #17: 0x0000aaaaaafeafac python3.16`PyObject_RichCompareBool(v=0x0000fffff79d3e80, w=0x0000fffff7a35f40, op=2) at object.c:1135:11
    frame #18: 0x0000aaaaab05ac80 python3.16`tuple_richcompare(v=0x0000fffff79d3ed0, w=0x0000fffff7a35f90, op=2) at tupleobject.c:755:17
    frame #19: 0x0000aaaaaafeb678 python3.16`do_richcompare(tstate=0x0000aaaaab802580, v=0x0000fffff79d3ed0, w=0x0000fffff7a35f90, op=2) at object.c:1064:15 [inlined]

(lldb) p ((_PyThreadStateImpl*)tstate)->c_stack_top
(uintptr_t) 281474976706560
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_top
(uintptr_t) 0x0000fffffffff000
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_soft_limit
(uintptr_t) 0x0000ffffff810000
(lldb) p/x ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 0x0000ffffff808000

(lldb) register read sp
      sp = 0x0000ffffffeec220
(lldb) p/d (uintptr_t)$sp - ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 7225888
(lldb) p/x (uintptr_t)$sp - ((_PyThreadStateImpl*)tstate)->c_stack_hard_limit
(uintptr_t) 0x00000000006e4220
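The arithmetic from the session can be checked with a few lines of Python. All addresses below are copied from the LLDB output above; the variable names on the left are just labels for this sketch.

```python
MIB = 1024 * 1024

# Values copied from the LLDB session above.
c_stack_top = 0x0000fffffffff000
c_stack_soft_limit = 0x0000ffffff810000
c_stack_hard_limit = 0x0000ffffff808000
sp = 0x0000ffffffeec220
stack_map_start = 0x0000ffffffeeb000
stack_map_end = 0x0001000000000000

# Distance from the stack pointer down to the guard's hard limit.
headroom = sp - c_stack_hard_limit            # 7225888 bytes (0x6e4220)
# Stack range the guard considers usable, from top to hard limit.
guarded_span = c_stack_top - c_stack_hard_limit   # 8351744 bytes (0x7f7000)
# Stack range the kernel had actually mapped.
mapped_span = stack_map_end - stack_map_start     # 1134592 bytes (0x115000)

for label, value in [
    ("sp above hard limit", headroom),
    ("guarded span", guarded_span),
    ("mapped [stack] region", mapped_span),
]:
    print(f"{label}: {value} bytes ({value / MIB:.2f} MiB)")
```

So the guard treats roughly 8 MiB as usable while only about 1.1 MiB was mapped at the time of the crash, and the stack pointer was still well above both the soft and the hard limit.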

As I mentioned at the top, I don't even know how you'd defend against this, but we have a record now.

Note: this also reproduces with an address-space limit of up to 50 GB if you alter the limits accordingly, so while 512 MiB is quite small, this does apply generally.

AI statement: The crash was identified using normal AFL++ fuzzing. I used ChatGPT and Claude to streamline the debugging and reproducer development, but all findings were double-checked, reproduced, tested, and verified by me within my (somewhat limited) knowledge of Python internals. If you have questions about this, feel free to ask.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.16.0a0 (heads/main:540b3d0a7fa, Jun 1 2026, 10:44:26) [GCC 15.2.0]

Labels

type-crash: A hard crash of the interpreter, possibly with a core dump