Skip to content

Align more error messages with CPython 3.14.5#7988

Merged
youknowone merged 7 commits into
RustPython:mainfrom
ShaharNaveh:ruff-error-msg-1
May 28, 2026
Merged

Align more error messages with CPython 3.14.5#7988
youknowone merged 7 commits into
RustPython:mainfrom
ShaharNaveh:ruff-error-msg-1

Conversation

@ShaharNaveh
Copy link
Copy Markdown
Contributor

@ShaharNaveh ShaharNaveh commented May 27, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Improved parser error diagnostics with enhanced specificity for common issues including unclosed brackets, indentation errors, missing tokens, and invalid assignments, providing clearer and more actionable error messages.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

📝 Walkthrough

Walkthrough

This PR refactors RustPython's parse error handling in the compiler. Error type imports are updated to use ParseErrorType directly, and the public error struct fields are simplified from qualified paths. The core from_ruff_parse_error function is rewritten from if/else chains to a match statement, adding specialized diagnostic handling for EOF, indentation, missing commas, and invalid assignments.

Changes

Parse Error Diagnostics

Layer / File(s) Summary
Error type declarations and imports
crates/compiler/src/lib.rs
Direct imports of LexicalErrorType and ParseErrorType from ruff_python_parser replace the qualified parser::ParseErrorType path. Public CompileErrorType::Parse and ParseError::error field types are updated to use ParseErrorType directly.
Parse error conversion and diagnostic generation
crates/compiler/src/lib.rs
from_ruff_parse_error function refactored from nested if/else to match over ParseErrorType variants, with specialized diagnostic handlers for EOF unclosed-bracket cases, indentation errors with location adjustments, missing-comma detection (Comma/Int token pair), invalid assignment targets (parses expression slice and generates messages based on ast::Expr shape), invalid named assignment targets (custom "assignment expressions" message), and default location normalization when ranges end at column 1.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • RustPython/RustPython#7933: Both PRs modify how CompileError::Parse/ParseErrorType variants (especially ParseErrorType::ExpectedToken) are converted into user-facing diagnostics/messages, just in different layers/files (from_ruff_parse_error in the compiler vs SyntaxErrorInfo::analyze_compile_error in the VM).
  • RustPython/RustPython#7928: Both PRs update RustPython's handling of parser ParseErrorType variants to produce more precise, CPython-aligned user-facing error messages/caret behavior (main PR in from_ruff_parse_error diagnostic conversion; retrieved PR in VM syntax-error construction via analyze_compile_error).

Suggested reviewers

  • youknowone

Poem

🐰 A parse error caught mid-flight,
Now matches patterns, oh what sight!
From tangled ifs to match arrays bright,
Diagnostics dance with newfound might,
Syntax errors show the proper light! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title references aligning error messages with CPython 3.14.5, which directly matches the core objective of improving error message compatibility shown in the raw summary's description of parse error conversion logic enhancements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] test: cpython/Lib/test/test_named_expressions.py (TODO: 7)

dependencies:

dependent tests: (no tests depend on named_expressions)

[ ] test: cpython/Lib/test/test_syntax.py (TODO: 210)

dependencies:

dependent tests: (no tests depend on syntax)

[ ] test: cpython/Lib/test/test_generators.py (TODO: 11)
[ ] test: cpython/Lib/test/test_genexps.py (TODO: 5)
[x] test: cpython/Lib/test/test_generator_stop.py
[x] test: cpython/Lib/test/test_yield_from.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on generator)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@ShaharNaveh ShaharNaveh marked this pull request as ready for review May 27, 2026 17:20
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/compiler/src/lib.rs (1)

101-114: ⚡ Quick win

Extract the repeated range-normalization block.

The loc/end_loc setup and the "column 1 on the next line" fixup are duplicated across four arms. Pulling that into a small helper would keep the message-specific branches much smaller and reduce drift when these diagnostics change again.

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".

Also applies to: 118-129, 161-172, 180-191

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/compiler/src/lib.rs` around lines 101 - 114, Extract the duplicated
loc/end_loc setup and "column 1 on the next line" adjustment into a small helper
(e.g., normalize_parse_range or compute_parse_locations) that takes the Parse
Error's error.location and a reference to source_code and returns (loc,
end_loc); replace each arm that currently calls source_code.source_location(...,
PositionEncoding::Utf8) and performs the prev_line_end fix with a single call to
that helper, then only supply the message-specific ParseErrorType (like
ParseErrorType::OtherError(msg)) in each branch so the parsing arms no longer
duplicate the range-normalization logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/compiler/src/lib.rs`:
- Around line 88-94: The UTF-8 column fallback uses Unicode scalar count which
underestimates byte offsets for PositionEncoding::Utf8; in the fallback inside
the block that builds end_loc (using SourceLocation and loc.character_offset)
compute line_end_col from the line's byte length (use line.len()) instead of
line.chars().count() + 1 so character_offset remains measured in UTF-8 bytes;
keep the OneIndexed::new(...) and the existing unwrap_or(loc.character_offset)
logic but pass the byte-based column value.
- Around line 61-80: The match is currently borrowing error.error (match
&error.error) but some arms return error.error, causing a move-from-borrow
compile error; fix by matching the owned value instead (e.g., destructure error
first and use match error.error { ... }) or by returning a clone of the error
variant in those arms (e.g., error.error.clone()); update the branch that
handles Lexical(LexicalErrorType::Eof) to still set is_unclosed_bracket, compute
loc/end_loc via source_code.source_location and then return either
(ParseErrorType::OtherError(msg), loc, end_loc) or the owned/cloned original
error value for the fallback arm so you no longer move out of a borrowed
reference.

---

Nitpick comments:
In `@crates/compiler/src/lib.rs`:
- Around line 101-114: Extract the duplicated loc/end_loc setup and "column 1 on
the next line" adjustment into a small helper (e.g., normalize_parse_range or
compute_parse_locations) that takes the Parse Error's error.location and a
reference to source_code and returns (loc, end_loc); replace each arm that
currently calls source_code.source_location(..., PositionEncoding::Utf8) and
performs the prev_line_end fix with a single call to that helper, then only
supply the message-specific ParseErrorType (like
ParseErrorType::OtherError(msg)) in each branch so the parsing arms no longer
duplicate the range-normalization logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 8ea2731e-a951-4835-ae3f-a69cfee1284b

📥 Commits

Reviewing files that changed from the base of the PR and between ca412fc and 86984a0.

⛔ Files ignored due to path filters (4)
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_genexps.py is excluded by !Lib/**
  • Lib/test/test_named_expressions.py is excluded by !Lib/**
  • Lib/test/test_syntax.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/compiler/src/lib.rs

Comment on lines +61 to +80
let (error_type, location, end_location) = match &error.error {
ParseErrorType::Lexical(LexicalErrorType::Eof) => {
if let Some((bracket_char, bracket_offset)) = find_unclosed_bracket(source_text) {
let bracket_text_size = ruff_text_size::TextSize::new(bracket_offset as u32);
let loc =
source_code.source_location(bracket_text_size, PositionEncoding::Utf8);
let end_loc = SourceLocation {
line: loc.line,
character_offset: loc.character_offset.saturating_add(1),
};
let msg = format!("'{bracket_char}' was never closed");
is_unclosed_bracket = true;
(ParseErrorType::OtherError(msg), loc, end_loc)
} else {
let loc =
source_code.source_location(error.location.start(), PositionEncoding::Utf8);
let end_loc =
source_code.source_location(error.location.end(), PositionEncoding::Utf8);
(error.error, loc, end_loc)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the referenced ranges
sed -n '40,120p' crates/compiler/src/lib.rs
echo '---'
sed -n '160,220p' crates/compiler/src/lib.rs

# Locate the function that contains this match (if needed)
rg -n "from_ruff_parse_error|find_unclosed_bracket|match &error\.error" crates/compiler/src/lib.rs

# Run clippy to surface any compile-time ownership errors
cargo clippy -q

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the exact match block and the other referenced ranges
nl -ba crates/compiler/src/lib.rs | sed -n '45,120p'
echo '---'
nl -ba crates/compiler/src/lib.rs | sed -n '160,240p'

# Locate every occurrence of the suspected match
rg -n "match\s+&error\.error|error\.error\s*," crates/compiler/src/lib.rs

# Find the ParseErrorType definition (to see if it is Copy/Clone-only, etc.)
rg -n "enum\s+ParseErrorType" -S . | head -n 50
rg -n "struct\s+ParseError|enum\s+ParseError" -S . | head -n 50

Repository: RustPython/RustPython

Length of output: 108


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Helper to print with line numbers without nl
print_lines () {
  awk 'NR>='$2' && NR<='$3'{printf "%d\t%s\n", NR, $0}' "$1"
}

FILE="crates/compiler/src/lib.rs"

echo "=== match &error.error block (around line ~61-90) ==="
print_lines "$FILE" 55 110

echo
echo "=== additional referenced ranges (83-97, 179-194) ==="
print_lines "$FILE" 80 105
echo "---"
print_lines "$FILE" 170 210

echo
echo "=== occurrences of match &error.error ==="
rg -n "match\s+&error\.error" "$FILE"

echo
echo "=== any returns of error.error inside that match ==="
rg -n "match\s+&error\.error|return\s+\(error\.error|,\s*error\.error\s*," "$FILE"

echo
echo "=== what type is error (search for ParseError construction/fields) ==="
rg -n "ParseError|parse error|error\.error|struct\s+ParseError|enum\s+ParseError" "$FILE"

Repository: RustPython/RustPython

Length of output: 8431


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Try to find the ruff_python_parser source in the local cargo registry
RUST_REG_SRC="${CARGO_HOME:-$HOME/.cargo}/registry/src"
if [ ! -d "$RUST_REG_SRC" ]; then
  echo "No cargo registry source dir at: $RUST_REG_SRC" >&2
  exit 0
fi

# Search for ParseErrorType definition and the OtherError variant
rg -n "pub\s+enum\s+ParseErrorType" "$RUST_REG_SRC" -S --no-mmap | head -n 20 || true
rg -n "OtherError" "$RUST_REG_SRC" -S --no-mmap | rg -n "ParseErrorType" | head -n 50 || true

# Also search for an impl Copy for ParseErrorType (should be absent if it contains String)
rg -n "impl\s+.*Copy\s+for\s+ParseErrorType" "$RUST_REG_SRC" -S --no-mmap | head -n 20 || true

Repository: RustPython/RustPython

Length of output: 10058


🏁 Script executed:

#!/bin/bash
set -euo pipefail

RUST_REG_SRC="${CARGO_HOME:-$HOME/.cargo}/registry/src"

# Find and print the ParseErrorType definition block
def_file="$(rg -l --no-mmap "enum\s+ParseErrorType" "$RUST_REG_SRC" | rg "rustpython-ruff_python_parser-" | head -n 1)"
echo "ParseErrorType definition file: $def_file"

if [ -z "${def_file:-}" ]; then
  echo "ParseErrorType definition file not found" >&2
  exit 0
fi

# Print around the enum definition
line="$(rg -n --no-mmap "enum\s+ParseErrorType" "$def_file" | head -n 1 | cut -d: -f1)"
start=$((line-10))
end=$((line+60))
echo "--- definition context ($start-$end) ---"
sed -n "${start},${end}p" "$def_file"

echo
# Check for Copy/Clone derives/impls
rg -n --no-mmap "derive\([^\)]*\bCopy\b" "$def_file" || true
rg -n --no-mmap "impl\s+.*Copy\s+for\s+ParseErrorType" "$RUST_REG_SRC" | head -n 20 || true

echo
# Print the ParseError struct definition (to confirm field names/types)
parse_def_file="$(rg -l --no-mmap "struct\s+ParseError" "$RUST_REG_SRC" | rg "rustpython-ruff_python_parser-" | head -n 1)"
echo "ParseError definition file: $parse_def_file"
parse_line="$(rg -n --no-mmap "struct\s+ParseError" "$parse_def_file" | head -n 1 | cut -d: -f1)"
start2=$((parse_line-10))
end2=$((parse_line+60))
sed -n "${start2},${end2}p" "$parse_def_file"

Repository: RustPython/RustPython

Length of output: 5621


Fix move-out-of-borrowed error.error in match &error.error

ParseErrorType is not Copy (e.g., OtherError(String)), so the arms that return error.error while matching on &error.error will fail to compile (“cannot move out of borrowed content”). Match on the owned error.error (destructure error first, then match error.error { ... other => (other, loc, end_loc) }) or return error.error.clone() in those arms.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/compiler/src/lib.rs` around lines 61 - 80, The match is currently
borrowing error.error (match &error.error) but some arms return error.error,
causing a move-from-borrow compile error; fix by matching the owned value
instead (e.g., destructure error first and use match error.error { ... }) or by
returning a clone of the error variant in those arms (e.g.,
error.error.clone()); update the branch that handles
Lexical(LexicalErrorType::Eof) to still set is_unclosed_bracket, compute
loc/end_loc via source_code.source_location and then return either
(ParseErrorType::OtherError(msg), loc, end_loc) or the owned/cloned original
error value for the fallback arm so you no longer move out of a borrowed
reference.

Comment on lines +88 to +94
let line_idx = loc.line.to_zero_indexed();
let line = source_text.split('\n').nth(line_idx).unwrap_or("");
let line_end_col = line.chars().count() + 1; // 1-indexed, past last char
let end_loc = SourceLocation {
line: loc.line,
character_offset: loc.character_offset.saturating_add(1),
character_offset: ruff_source_file::OneIndexed::new(line_end_col)
.unwrap_or(loc.character_offset),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "line_end_col|PositionEncoding::Utf8|SourceLocation|character_offset|OneIndexed" crates/compiler/src/lib.rs
echo "----"
sed -n '60,120p' crates/compiler/src/lib.rs
echo "----"
# Find the definition/usage expectations for these types
rg -n "enum PositionEncoding|struct OneIndexed|OneIndexed<|type OneIndexed|PositionEncoding::Utf8" crates -S
echo "----"
# Search for similar fallback logic elsewhere in the repo
rg -n "chars\\(\\)\\.count\\(\\) \\+ 1|line\\.chars\\(\\)\\.count\\(\\)\\s*\\+\\s*1|PositionEncoding::Utf8.*chars\\(\\)\\.count" crates -S

Repository: RustPython/RustPython

Length of output: 11305


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for the crate in the workspace/repo
fd -t f "lib.rs" crates -H | rg -n "ruff_source_file" || true
fd -t f "*ruff_source_file*" . || true

# Search for the actual definition of PositionEncoding / OneIndexed in the repo
rg -n "enum PositionEncoding|struct OneIndexed|OneIndexed::new|PositionEncoding::Utf8" . || true

# If definitions aren't in-repo, find the dependency version to use for targeted source lookup
rg -n "ruff_source_file" Cargo.toml Cargo.lock . || true

Repository: RustPython/RustPython

Length of output: 8598


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Try to locate the dependency source in the local cargo registry
echo "Searching local cargo registry for PositionEncoding..."
rg -n "enum PositionEncoding" ~/.cargo/registry/src 2>/dev/null || true
rg -n "PositionEncoding::Utf8" ~/.cargo/registry/src 2>/dev/null || true

# Print candidate crate directories if found
ls -R ~/.cargo/registry/src 2>/dev/null | rg -n "rustpython-ruff_source_file|ruff_source_file" || true

Repository: RustPython/RustPython

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

ls -la ~
echo "----"
ls -la ~/.cargo || true
echo "----"
ls -la ~/.cargo/registry || true
echo "----"
ls -la ~/.cargo/registry/src || true

# If present, search for the crate directory name
if [ -d ~/.cargo/registry/src ]; then
  fd -t d rustpython-ruff_source_file ~/.cargo/registry/src || true
  fd -t d ruff_source_file ~/.cargo/registry/src || true
fi

Repository: RustPython/RustPython

Length of output: 973


🌐 Web query:

ruff_source_file PositionEncoding Utf8 meaning character_offset units

💡 Result:

In the context of the Ruff source code and its PositionEncoding implementation, the term character_offset represents the position of a character within a specific line, and its numerical unit is determined by the PositionEncoding variant selected [1][2]. Ruff defines three primary encoding variants in its PositionEncoding enum to handle how character offsets are measured, particularly for compatibility with Language Server Protocol (LSP) requirements and non-BMP (Basic Multilingual Plane) characters [1][2]: - Utf8: character_offset is measured in bytes from the start of the line. - Utf16: character_offset is measured in UTF-16 code units (16-bit units), which is common for LSP clients like VS Code or Neovim [3][1][2]. - Utf32: character_offset is measured in Unicode scalar values (essentially counting the number of actual characters/code points) [1][2]. When Ruff performs operations such as mapping between internal TextSize (which are byte-based) and SourceLocation (which uses a character_offset to specify column position), it uses the chosen PositionEncoding to correctly calculate the offset [4][1][5]. This mechanism ensures that diagnostics, quick-fixes, and editor markers remain accurate regardless of whether a file contains multi-byte characters or non-BMP symbols (like emojis) that would otherwise cause misalignment if only simple byte-counting were used [1][5].

Citations:


Fix UTF-8 column fallback for IndentationError

PositionEncoding::Utf8 uses character_offset measured in UTF-8 bytes, but this fallback computes the column via line.chars().count() + 1 (Unicode scalars). That under-reports the offset on non-ASCII lines. Use the line’s byte length (line.len()) to keep the offset consistent.

Suggested fix
-                let line_end_col = line.chars().count() + 1; // 1-indexed, past last char
+                let line_end_col = line.len() + 1; // UTF-8 bytes, 1-indexed, past last char
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let line_idx = loc.line.to_zero_indexed();
let line = source_text.split('\n').nth(line_idx).unwrap_or("");
let line_end_col = line.chars().count() + 1; // 1-indexed, past last char
let end_loc = SourceLocation {
line: loc.line,
character_offset: loc.character_offset.saturating_add(1),
character_offset: ruff_source_file::OneIndexed::new(line_end_col)
.unwrap_or(loc.character_offset),
let line_idx = loc.line.to_zero_indexed();
let line = source_text.split('\n').nth(line_idx).unwrap_or("");
let line_end_col = line.len() + 1; // UTF-8 bytes, 1-indexed, past last char
let end_loc = SourceLocation {
line: loc.line,
character_offset: ruff_source_file::OneIndexed::new(line_end_col)
.unwrap_or(loc.character_offset),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/compiler/src/lib.rs` around lines 88 - 94, The UTF-8 column fallback
uses Unicode scalar count which underestimates byte offsets for
PositionEncoding::Utf8; in the fallback inside the block that builds end_loc
(using SourceLocation and loc.character_offset) compute line_end_col from the
line's byte length (use line.len()) instead of line.chars().count() + 1 so
character_offset remains measured in UTF-8 bytes; keep the OneIndexed::new(...)
and the existing unwrap_or(loc.character_offset) logic but pass the byte-based
column value.

@youknowone youknowone merged commit e80a14b into RustPython:main May 28, 2026
27 checks passed
JamesClarke7283 added a commit to JamesClarke7283/RustPython that referenced this pull request May 29, 2026
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType
variants to CPython's exact wording in CompileError::from_ruff_parse_error,
and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so
those messages match too (previously they leaked raw ruff strings). Adds a
few codegen/symtable checks.

Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use
{kind} as import target"; string-prefix incompatibility and "invalid
character 'X' (U+XXXX)"; parenthesized def/lambda params; missing
default/argument value; dict ':' / value syntax; "'elif' block follows an
'else' block"; raise-from; comprehension 'if'; ternary statement keywords;
match "case ... as <target>" -> "cannot use {kind} as pattern target" and
"case ... as _"; __debug__ as def/class/type-param/except name; "name 'x'
is nonlocal and global"; generic type-parameter wording.

Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message"
doctest markers and @expectedfailure decorators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JamesClarke7283 added a commit to JamesClarke7283/RustPython that referenced this pull request May 29, 2026
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType
variants to CPython's exact wording in CompileError::from_ruff_parse_error,
and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so
those messages match too (previously they leaked raw ruff strings). Adds a
few codegen/symtable checks.

Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use
{kind} as import target"; string-prefix incompatibility and "invalid
character 'X' (U+XXXX)"; parenthesized def/lambda params; missing
default/argument value; dict ':' / value syntax; "'elif' block follows an
'else' block"; raise-from; comprehension 'if'; ternary statement keywords;
match "case ... as <target>" -> "cannot use {kind} as pattern target" and
"case ... as _"; __debug__ as def/class/type-param/except name; "name 'x'
is nonlocal and global"; generic type-parameter wording.

Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message"
doctest markers and @expectedfailure decorators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JamesClarke7283 added a commit to JamesClarke7283/RustPython that referenced this pull request May 30, 2026
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType
variants to CPython's exact wording in CompileError::from_ruff_parse_error,
and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so
those messages match too (previously they leaked raw ruff strings). Adds a
few codegen/symtable checks.

Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use
{kind} as import target"; string-prefix incompatibility and "invalid
character 'X' (U+XXXX)"; parenthesized def/lambda params; missing
default/argument value; dict ':' / value syntax; "'elif' block follows an
'else' block"; raise-from; comprehension 'if'; ternary statement keywords;
match "case ... as <target>" -> "cannot use {kind} as pattern target" and
"case ... as _"; __debug__ as def/class/type-param/except name; "name 'x'
is nonlocal and global"; generic type-parameter wording.

Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message"
doctest markers and @expectedfailure decorators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JamesClarke7283 added a commit to JamesClarke7283/RustPython that referenced this pull request May 30, 2026
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType
variants to CPython's exact wording in CompileError::from_ruff_parse_error,
and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so
those messages match too (previously they leaked raw ruff strings). Adds a
few codegen/symtable checks.

Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use
{kind} as import target"; string-prefix incompatibility and "invalid
character 'X' (U+XXXX)"; parenthesized def/lambda params; missing
default/argument value; dict ':' / value syntax; "'elif' block follows an
'else' block"; raise-from; comprehension 'if'; ternary statement keywords;
match "case ... as <target>" -> "cannot use {kind} as pattern target" and
"case ... as _"; __debug__ as def/class/type-param/except name; "name 'x'
is nonlocal and global"; generic type-parameter wording.

Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message"
doctest markers and @expectedfailure decorators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JamesClarke7283 added a commit to JamesClarke7283/RustPython that referenced this pull request Jun 3, 2026
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType
variants to CPython's exact wording in CompileError::from_ruff_parse_error,
and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so
those messages match too (previously they leaked raw ruff strings). Adds a
few codegen/symtable checks.

Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use
{kind} as import target"; string-prefix incompatibility and "invalid
character 'X' (U+XXXX)"; parenthesized def/lambda params; missing
default/argument value; dict ':' / value syntax; "'elif' block follows an
'else' block"; raise-from; comprehension 'if'; ternary statement keywords;
match "case ... as <target>" -> "cannot use {kind} as pattern target" and
"case ... as _"; __debug__ as def/class/type-param/except name; "name 'x'
is nonlocal and global"; generic type-parameter wording.

Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message"
doctest markers and @expectedfailure decorators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants