fix: Fix ValueError in signal handling for Trino worker threads#6428
fix: Fix ValueError in signal handling for Trino worker threads#6428mbm-codes wants to merge 2 commits into
Conversation
Signed-off-by: MBM <mihir105@gmail.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a147b7de22
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| """signal.signal() should work fine in main thread.""" | ||
| cursor = MagicMock() | ||
| # Should not raise any exception in main thread | ||
| query = Query(query_text="SELECT 1", cursor=cursor) |
There was a problem hiding this comment.
Restore process signal handlers after the test
When this test runs, Query.__init__ installs process-global SIGINT/SIGTERM handlers that point at this test's mock cursor, and the test never restores the previous handlers. Any later test or CI shutdown signal in the same pytest process can be intercepted by query.cancel() instead of the runner's normal KeyboardInterrupt/termination handling, which can make the rest of the suite ignore interrupts or hang until forcibly killed. Patch signal.signal in this test or save and restore the old handlers in a finally.
Useful? React with 👍 / 👎.
Signed-off-by: MBM <mihir105@gmail.com>
jyejare
left a comment
There was a problem hiding this comment.
This PR fixes a ValueError that occurs when Query objects are instantiated in worker threads by guarding signal.signal() calls with a main thread check. The fix is simple and effective, with good test coverage demonstrating both main thread and worker thread scenarios.
| query = Query(query_text="SELECT 1", cursor=cursor) | ||
| assert query.query_text == "SELECT 1" | ||
| # Expected signal.signal to be called twice for SIGINT and SIGTERM |
There was a problem hiding this comment.
[Suggestion] Test should verify signal handler registration details
The test only checks call_count but should also verify that the correct signals and handler function are being registered for more thorough validation.
Current code:
+ # Expected signal.signal to be called twice for SIGINT and SIGTERM
+ assert mock_signal.call_count == 2Suggested:
| query = Query(query_text="SELECT 1", cursor=cursor) | |
| assert query.query_text == "SELECT 1" | |
| # Expected signal.signal to be called twice for SIGINT and SIGTERM | |
| # Verify signal handlers are registered correctly | |
| assert mock_signal.call_count == 2 | |
| mock_signal.assert_any_call(signal.SIGINT, query.cancel) | |
| mock_signal.assert_any_call(signal.SIGTERM, query.cancel) |
|
|
||
| assert not errors, f"Unexpected ValueError in worker thread: {errors[0]}" |
There was a problem hiding this comment.
[Suggestion] Improve error message handling
The error message assumes errors[0] exists, but this could cause an IndexError if errors is empty. Consider a more defensive approach.
Current code:
+ assert not errors, f"Unexpected ValueError in worker thread: {errors[0]}"Suggested:
| assert not errors, f"Unexpected ValueError in worker thread: {errors[0]}" | |
| assert not errors, f"Unexpected ValueError in worker thread: {errors[0] if errors else 'No errors captured'}" |
What this PR does / why we need it:
Guards
signal.signal()calls inQuery.__init__()with athreading.current_thread() is threading.main_thread()check.When the Trino offline store is served via Arrow Flight,
Query.__init__()runs in a worker thread. Python restrictssignal.signal()to the main thread only, causing:ValueError: signal only works in main thread of the main interpreterWhich issue(s) this PR fixes:
Fixes #6424
Checks
git commit -s)Testing Strategy
Misc
Added regression tests covering both main thread and worker thread scenarios.
Signal handling is preserved for CLI/interactive usage. Worker thread usage silently skips signal registration, which is the correct behavior when running under Arrow Flight.