fix: Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query #6311

Merged
ntkathole merged 1 commit into feast-dev:master from abhijeet-dhumal:fix/spark-pull-all-select-star-empty-feature-cols-v2
Apr 22, 2026


@abhijeet-dhumal
Contributor

What

pull_all_from_table_or_query always builds an explicit SELECT projection from join_key_columns + feature_name_columns + timestamp_fields. When feature_name_columns=[] — the "read all source columns" signal already used by FeatureBuilder.get_column_info for ray and pandas transformation modes — the generated SQL becomes:

SELECT user_id, event_timestamp
FROM s3a://bucket/reviews/
WHERE event_timestamp BETWEEN ...

All raw feature columns (rating, text, helpful_vote, etc.) are silently dropped. The UDF receives a 2-column DataFrame and every aggregation returns null or fails.

Why it happens

# no guard for empty feature_name_columns
(fields_with_aliases, aliases) = _get_fields_with_aliases(
  fields=join_key_columns + feature_name_columns + timestamp_fields,
  ...
)
fields_with_alias_string = ", ".join(fields_with_aliases)
# → "user_id, event_timestamp"  when feature_name_columns=[]
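
The dropped columns can be reproduced without Spark. The sketch below is a simplified stand-in, not the Feast source: `build_projection` is a hypothetical helper that mirrors only the list concatenation and join shown above, and it omits the aliasing done by `_get_fields_with_aliases`.

```python
from typing import List


def build_projection(
    join_key_columns: List[str],
    feature_name_columns: List[str],
    timestamp_fields: List[str],
) -> str:
    """Hypothetical stand-in: join the three column lists into a SELECT list."""
    fields = join_key_columns + feature_name_columns + timestamp_fields
    return ", ".join(fields)


# With explicit feature columns, the projection includes them.
with_features = build_projection(
    ["user_id"], ["rating", "text"], ["event_timestamp"]
)

# With an empty feature list, only the join key and timestamp survive,
# so every raw feature column is missing from the query result.
without_features = build_projection(["user_id"], [], ["event_timestamp"])

print(with_features)
print(without_features)
```

Nothing in the concatenation distinguishes "no feature columns requested" from "all source columns requested", which is why the empty list needs an explicit guard.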

@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/spark-pull-all-select-star-empty-feature-cols-v2 branch from 8d6cc24 to 9e84c26 Compare April 22, 2026 15:24
@abhijeet-dhumal abhijeet-dhumal changed the title Fix: if feature_name_columns is empty, use SELECT * so the UDF recei… fix(spark): use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query Apr 22, 2026
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/spark-pull-all-select-star-empty-feature-cols-v2 branch from 9e84c26 to 49f5031 Compare April 22, 2026 15:30
@abhijeet-dhumal abhijeet-dhumal marked this pull request as ready for review April 22, 2026 15:31
@abhijeet-dhumal abhijeet-dhumal requested a review from a team as a code owner April 22, 2026 15:31

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


@ntkathole ntkathole changed the title fix(spark): use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query fix: Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query Apr 22, 2026
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/spark-pull-all-select-star-empty-feature-cols-v2 branch from 49f5031 to 6c8b25c Compare April 22, 2026 15:40
…ll_from_table_or_query

pull_all_from_table_or_query always builds an explicit SELECT projection
from join_key_columns + feature_name_columns + timestamp_fields.
When feature_name_columns=[] — the "read all source columns" signal used
by FeatureBuilder.get_column_info for BatchFeatureView with
TransformationMode.PYTHON, ray, and pandas — the generated SQL becomes:

  SELECT user_id, event_timestamp FROM source WHERE ...

All raw feature columns (rating, text, helpful_vote, …) are silently
dropped. The UDF receives a 2-column DataFrame and every aggregation
returns null or fails.

Fix: guard on feature_name_columns being non-empty before building the
explicit projection; fall through to SELECT * when it is empty.

Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
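
The guard described in the commit message can be sketched as follows. This is an illustration under assumptions, not the merged diff: `select_clause` and `build_query` are hypothetical names, aliasing is omitted, and the actual change in `pull_all_from_table_or_query` may differ in detail.

```python
from typing import List


def select_clause(
    join_key_columns: List[str],
    feature_name_columns: List[str],
    timestamp_fields: List[str],
) -> str:
    """Hypothetical helper: explicit projection, or * when no features are named."""
    if not feature_name_columns:
        # An empty list is treated as the "read all source columns" signal.
        return "*"
    return ", ".join(join_key_columns + feature_name_columns + timestamp_fields)


def build_query(source: str, clause: str) -> str:
    """Hypothetical helper: assemble the SELECT statement (WHERE clause omitted)."""
    return f"SELECT {clause} FROM {source}"


# Empty feature list: fall through to SELECT *.
all_columns_query = build_query(
    "source", select_clause(["user_id"], [], ["event_timestamp"])
)

# Non-empty feature list: keep the explicit projection.
explicit_query = build_query(
    "source", select_clause(["user_id"], ["rating"], ["event_timestamp"])
)

print(all_columns_query)
print(explicit_query)
```

With this shape, existing callers that pass explicit feature columns see the same projection as before, and only the empty-list case changes.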
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/spark-pull-all-select-star-empty-feature-cols-v2 branch from 6c8b25c to 6d0c527 Compare April 22, 2026 16:05
@ntkathole ntkathole merged commit e1b1d2d into feast-dev:master Apr 22, 2026
30 of 34 checks passed