Align all remaining error messages with CPython#7993
Conversation
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] test: cpython/Lib/test/test_unicodedata.py (TODO: 22) dependencies: dependent tests: (no tests depend on unicode) [ ] test: cpython/Lib/test/test_patma.py (TODO: 13) dependencies: dependent tests: (no tests depend on patma) [ ] test: cpython/Lib/test/test_generators.py (TODO: 11) dependencies: dependent tests: (no tests depend on generator) [ ] test: cpython/Lib/test/test_syntax.py (TODO: 17) dependencies: dependent tests: (no tests depend on syntax) [x] test: cpython/Lib/test/test_dict.py (TODO: 6) dependencies: dependent tests: (no tests depend on dict) [x] test: cpython/Lib/test/test_named_expressions.py (TODO: 1) dependencies: dependent tests: (no tests depend on named_expressions) [x] test: cpython/Lib/test/test_str.py (TODO: 7) dependencies: dependent tests: (no tests depend on str) Legend:
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughTightens symbol validation (rejecting ChangesSymbol validation and parse error alignment
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)
4718-4729: ⚡ Quick winExtract duplicated type parameter validation into a helper method.
The type parameter
__debug__validation logic is duplicated identically in both function and class compilation paths. Extract this into a shared helper to improve maintainability and prevent future divergence.♻️ Proposed refactor to eliminate duplication
Add a helper method to the compiler struct:
fn validate_type_params_no_debug(&self, type_params: Option<&ast::TypeParams>) -> CompileResult<()> { if let Some(params) = type_params { for tp in ¶ms.type_params { let tp_name = match tp { ast::TypeParam::TypeVar(t) => &t.name, ast::TypeParam::TypeVarTuple(t) => &t.name, ast::TypeParam::ParamSpec(t) => &t.name, }; if tp_name.as_str() == "__debug__" { return Err(self.error(CodegenErrorType::Assign("__debug__"))); } } } Ok(()) }Then replace the duplicated blocks with a single call in each location:
- // Reject `def f[__debug__](): ...` type parameter (mirrors class defs). - if let Some(params) = type_params { - for tp in ¶ms.type_params { - let tp_name = match tp { - ast::TypeParam::TypeVar(t) => &t.name, - ast::TypeParam::TypeVarTuple(t) => &t.name, - ast::TypeParam::ParamSpec(t) => &t.name, - }; - if tp_name.as_str() == "__debug__" { - return Err(self.error(CodegenErrorType::Assign("__debug__"))); - } - } - } + // Reject `def f[__debug__](): ...` type parameter (mirrors class defs). + self.validate_type_params_no_debug(type_params)?;Apply the same simplification in the class compilation path (lines 5307-5318).
Also applies to: 5307-5318
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/codegen/src/compile.rs` around lines 4718 - 4729, Extract the duplicated "__debug__" type-parameter check into a new helper on the compiler impl (e.g. fn validate_type_params_no_debug(&self, type_params: Option<&ast::TypeParams>) -> CompileResult<()>) that iterates params.type_params, extracts the name from ast::TypeParam variants, and returns Err(self.error(CodegenErrorType::Assign("__debug__"))) if a name equals "__debug__", otherwise Ok(()). Replace the duplicated validation blocks in the function compilation path and the class compilation path with a single call to this helper, keeping existing types (ast::TypeParams, CompileResult) and error construction unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/compiler/src/lib.rs`:
- Around line 986-1025: The helper parenthesized_param_message currently only
inspects the current physical line, missing multiline cases; update it to scan
backwards from line_start through previous lines to find a preceding "def " or
"lambda" token and then evaluate parenthesis depth across that full span.
Specifically, for "def " (use def_idx search but on the concatenated slice from
the found def position to the error start) walk characters from that def
position counting '(' and ')' to determine if the error position is inside the
function parameter list (return the CPython message if inside), and for "lambda"
search backwards similarly to find a lambda token without a ':' between it and
the error (consider spanning newlines). Keep the existing return messages and
reuse parenthesized_param_message, but replace single-line checks with these
backward-scanning, depth-aware checks so multiline parameter lists are detected.
- Around line 1397-1419: chunk_has_bare_assignment currently treats any
alphanumeric LHS as an identifier; update it to reject literals/keywords by
first checking the trimmed lhs start and exact content: ensure lhs is non-empty
and its first byte is ASCII alphabetic or b'_' (reject if it starts with a
digit, quote, '(' '[' '{', etc.), then ensure the whole lhs matches the
identifier shape (bytes().all(...) as you already do) and is not one of the
Python literal/keyword tokens like "True", "False", or "None" (use a small
static set and compare lhs.eq_ignore_ascii_case or exact match as appropriate).
Return false for those cases so chunk_has_bare_assignment only returns true for
real identifiers.
- Around line 1682-1698: The current is_in_case_pattern only checks the current
line for a "case " header so it misses multi-line case headers; change the logic
to scan backwards from range.start() to find the nearest preceding non-empty
line whose trimmed start begins with "case " (use the existing start and
line_start variables as anchors) and treat the location as being in a case
pattern if such a "case " header exists with indentation (case_indent) less than
the current line and there is no '=' token between the header start and the
range.start(); update the code around is_in_case_pattern, keeping references to
start, line_start, rest, and case_indent, to iterate previous lines (instead of
only checking source[line_start..]) and ensure multiline headers are recognized
before falling back to the match-search logic.
---
Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4718-4729: Extract the duplicated "__debug__" type-parameter check
into a new helper on the compiler impl (e.g. fn
validate_type_params_no_debug(&self, type_params: Option<&ast::TypeParams>) ->
CompileResult<()>) that iterates params.type_params, extracts the name from
ast::TypeParam variants, and returns
Err(self.error(CodegenErrorType::Assign("__debug__"))) if a name equals
"__debug__", otherwise Ok(()). Replace the duplicated validation blocks in the
function compilation path and the class compilation path with a single call to
this helper, keeping existing types (ast::TypeParams, CompileResult) and error
construction unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: c64ba8a6-f09d-45ff-8156-2a8997556f5b
⛔ Files ignored due to path filters (4)
Lib/test/test_genexps.pyis excluded by!Lib/**Lib/test/test_named_expressions.pyis excluded by!Lib/**Lib/test/test_patma.pyis excluded by!Lib/**Lib/test/test_syntax.pyis excluded by!Lib/**
📒 Files selected for processing (5)
crates/codegen/src/compile.rscrates/codegen/src/symboltable.rscrates/compiler/src/lib.rscrates/vm/src/stdlib/_ast.rscrates/vm/src/vm/vm_new.rs
4383c21 to
a84b1a4
Compare
|
@coderabbitai review The duplicated |
|
✅ Actions performedReview triggered.
|
ShaharNaveh
left a comment
There was a problem hiding this comment.
tysm for working on this!
On one hand I can't argue with the value this brings, and on the other it feels a bit ridiculous because we are just reimplementing the lexer & parser atp.
I'll wait for @youknowone inputs
05843ed to
19195ed
Compare
|
@ShaharNaveh @youknowone is this okay to merge? |
Continues RustPython#7928/RustPython#7933/RustPython#7988. Translates many more ruff ParseErrorType variants to CPython's exact wording in CompileError::from_ruff_parse_error, and routes ast.parse() / compile(PyCF_ONLY_AST) through the same path so those messages match too (previously they leaked raw ruff strings). Adds a few codegen/symtable checks. Covered: aug-assign/delete/set/dict/f-string/t-string targets; "cannot use {kind} as import target"; string-prefix incompatibility and "invalid character 'X' (U+XXXX)"; parenthesized def/lambda params; missing default/argument value; dict ':' / value syntax; "'elif' block follows an 'else' block"; raise-from; comprehension 'if'; ternary statement keywords; match "case ... as <target>" -> "cannot use {kind} as pattern target" and "case ... as _"; __debug__ as def/class/type-param/except name; "name 'x' is nonlocal and global"; generic type-parameter wording. Lib/test: drop the now-passing "# TODO: RUSTPYTHON; Wrong error message" doctest markers and @expectedfailure decorators. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`eval` calls the host `print` function, so rust-lld reported it as an undefined symbol and the wasm32-unknown-unknown build failed to link (`kv_get`/`kv_put` are unused, so they were GC'd and did not error). Annotate the `extern "C"` block with `#[link(wasm_import_module = "env")]` so the linker emits the host functions as wasm imports from the `env` module, matching the wasmer host runtime in wasm-runtime/src/main.rs. Verified: `cargo build` (the CI "check wasm32-unknown without js" step) now links and produces the .wasm. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
test_dictcomps.test_illegal_assignment, test_fstring.test_invalid_string_prefixes and test_unicode_identifiers.test_invalid now pass thanks to this PR's error-message alignment, so their @unittest.expectedFailure markers caused "unexpected success" failures in CI. Remove the obsolete markers (same cleanup already applied to test_syntax/test_genexps/test_named_expressions/test_patma). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The identical `__debug__` type-parameter rejection in compile_function_def and compile_class_def is now a single shared helper, eliminating the duplicated block (addresses a CodeRabbit review note). Behavior-preserving: `def f[__debug__]`/`class C[__debug__]` still raise "cannot assign to __debug__" and valid type params still compile. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Shahar Naveh <50263213+ShaharNaveh@users.noreply.github.com>
- Report "Invalid star expression" for bare leading `*` in set/dict displays and non-call parenthesised groups (`{*}`, `(*,)`)
- Collapse double-comma in dict/set/list displays (`{1:2,, 3}`, `[1,, 2]`) to "invalid syntax"
- Add is_bare_star_first_in_group helper
- Drop stale "Is this intended to be part of the string?" uppercase-message entry
The bare-star helpers only checked that the error-adjacent token was `*`,
so a binary multiply with a missing operand was mislabeled "Invalid star
expression". `{1 *}`, `(1 *)`, `f(a *)` (and `f(g(a *))`, `{(x) *}`, …) now
correctly report "invalid syntax" like CPython. Require the `*` to start
its slot — mirroring `is_invalid_star_in_subscript` — and share the check
via `slot_starts_with_bare_star`.
The dict-value arm emitted "expression expected after dictionary key and
':'" for any unparseable value; CPython reserves that message for a
genuinely empty value and says "invalid syntax" otherwise. Restrict it to
empty values via `dict_value_is_empty`, so `{1: *}`, `{1: **}`, `{1: 2*}`,
`{1: +}`, `{1: not}`, … now match CPython.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
19195ed to
669fc00
Compare
Summary
Continues the parser error-message alignment from #7928 / #7933 / #7988. Translates many more ruff
ParseErrorTypevariants to CPython 3.14.5's exact wording inCompileError::from_ruff_parse_error(using the source slice / parsed AST kind), plus a few codegen/symtable checks.crates/vm/src/stdlib/_ast.rsnow routes its parse errors throughfrom_ruff_parse_error, soast.parse()/compile(..., PyCF_ONLY_AST)produce the same CPython-aligned messages as the exec path (they previously leaked raw ruff strings — and this fixes a regression where the var-param message diverged on the AST path).What's aligned
cannot use {attribute,subscript,tuple,list,literal,function call} as import target;import X from Y→ "Did you mean to use 'from ... import ...' instead?"'u' and 'b' prefixes are incompatible);invalid character 'X' (U+XXXX):and value syntax'elif' block follows an 'else' block;raise from; comprehensionif/ unparenthesized target; ternary statement-keyword hintscase … as <target>→cannot use {kind} as pattern target, andcase … as _→cannot use '_' as a target__debug__as adef/class/type-param/except-handler name →cannot assign to __debug__name 'x' is nonlocal and global; generic type-parameter wording (… cannot be used within the definition of a generic)This drops the now-passing
# TODO: RUSTPYTHON; Wrong error messagedoctest markers and@expectedFailuredecorators acrosstest_syntax.py,test_genexps.py,test_named_expressions.py, andtest_patma.py.Not covered (intentionally left marked)
A few
test_syntax.pycases need ruff-parser or deeper codegen changes and remain marked rather than emitting a silently-wrong message: the type-commentbare *case,class C(x for x in L)andf((x)=2)(ruff accepts these without a parse error), anddict(...); x $ y(duplicate-keyword vs lexer ordering). Runtime-error messages in unrelated subsystems (struct-sequence, format-spec, datetime,__import__, ast__replace__) are out of scope for this parser-focused change.Verification
cargo clippy -p rustpython-compiler -p rustpython-codegen -- -D warningsclean;cargo fmt.ast.parse().test_syntax test_genexps test_named_expressions test_patma test_type_params test_scope test_grammar test_compile test_exceptionsall pass (961 tests).🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
_as a capture/store targetChores