← 返回首页
fix: Invalid column names in get_historical_features when there are field mappings on join keys by aloysius-lim · Pull Request #4886 · feast-dev/feast · GitHub
Skip to content

Navigation Menu

Toggle navigation
Sign in
Appearance settings
Search or jump to...

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Appearance settings
Resetting focus

fix: Invalid column names in get_historical_features when there are field mappings on join keys#4886

Merged
franciscojavierarceo merged 3 commits into
feast-dev:masterfrom
aloysius-lim:fix-get-historical-features-join-key-field-mapping
Feb 3, 2025
Merged

fix: Invalid column names in get_historical_features when there are field mappings on join keys#4886
franciscojavierarceo merged 3 commits into
feast-dev:masterfrom
aloysius-lim:fix-get-historical-features-join-key-field-mapping

Conversation

Copy link
Copy Markdown
Contributor

aloysius-lim commented Jan 3, 2025
edited
Loading

What this PR does / why we need it:

When there are field mappings on an entity's join keys, get_historical_features() fails because it does not use the original column name in the data source.

This is needed when the source data cannot be modified, and a field mapping is used to map the source data's column to the join key.

Example:

from feast import Entity, Field, FeatureStore, FeatureView from feast.types import Float32, String from feast.infra.offline_stores.contrib.spark_offline_store.spark_source import SparkSource # Initialize Feature Store. store = FeatureStore(...) # "driver_id" is used in the Feature Store. driver = Entity(name="driver", join_keys=["driver_id"]) # Using SparkSource as an example, but this applies to other sources. # Source data contains a primary key called "id". This is mapped to the join key "driver_id". driver_stats_src = SparkSource( name="driver_stats", field_mapping={"id": "driver_id"}, path=..., file_format=..., ) driver_stats_fv = FeatureView( name="driver_stats", source= driver_stats_src, entities=[driver], schema=[ # join key must be specified in the schema, else it is not included in driver_stats_fv.entity_columns Field(name="driver_id", dtype=String), Field(name="stat1", dtype=Float32), Field(name="stat2", dtype=Float32), ] ) # Get historical features store.get_historical_features( entity_df=..., features=[ "driver_stats:stat1", "driver_stats:stat2", ]

Without this fix (Spark example):

pyspark.errors.exceptions.captured.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `driver_id` cannot be resolved. Did you mean one of the following? [`id`, `stat1`, `stat2`]

Underlying Spark query:

driver_stats__subquery AS ( SELECT event_timestamp as event_timestamp, created as created_timestamp, -- Here is the problem. driver_id AS driver_id, stat1 as stat1, stat2 as stat2 FROM `feast_entity_df_677a1a6fd13443c6b0e8ccc059b25f01` WHERE event_timestamp <= '2025-01-05T14:00:00' )

With this fix, the field mapping is respected and the query becomes:

driver_stats__subquery AS ( SELECT event_timestamp as event_timestamp, created as created_timestamp, id AS driver_id, stat1 as stat1, stat2 as stat2 FROM `feast_entity_df_677a1a6fd13443c6b0e8ccc059b25f01` WHERE event_timestamp <= '2025-01-05T14:00:00' )

Which issue(s) this PR fixes:

#4889

Misc

This only works if the entity join key is specified in the schema of the FeatureView (see above).

aloysius-lim requested review from a team as code owners January 3, 2025 07:24
aloysius-lim force-pushed the fix-get-historical-features-join-key-field-mapping branch from 9c6ca6e to b0a70ac Compare January 3, 2025 07:26
Copy link
Copy Markdown
Contributor

Thanks for your contribution!
I’m wondering why we don’t have an issue to track this error: if you can’t find an existing issue, could you please open one before?

Copy link
Copy Markdown
Contributor Author

Thanks for your contribution! I’m wondering why we don’t have an issue to track this error: if you can’t find an existing issue, could you please open one before?

I added issue #4889 for this.

Copy link
Copy Markdown
Member

Do you mind rebasing? I fixed an issue so after you rebase it should go away.

aloysius-lim force-pushed the fix-get-historical-features-join-key-field-mapping branch from b0a70ac to 84c355f Compare January 3, 2025 23:16
Copy link
Copy Markdown
Contributor Author

I've rebased, but the tests are still failing due to other errors with Keycloak.

Copy link
Copy Markdown
Member

Can you rebase again? Sorry I reverted my most recent change. 😅

Signed-off-by: Aloysius Lim <aloysius.lim@cuezen.com>
Signed-off-by: Aloysius Lim <aloysius.lim@cuezen.com>
Signed-off-by: Aloysius Lim <aloysius.lim@cuezen.com>
aloysius-lim force-pushed the fix-get-historical-features-join-key-field-mapping branch from 84c355f to 9b60d43 Compare January 10, 2025 14:56
Copy link
Copy Markdown
Member

franciscojavierarceo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

lgtm!

franciscojavierarceo merged commit c9aca2d into feast-dev:master Feb 3, 2025
franciscojavierarceo pushed a commit that referenced this pull request Feb 4, 2025
# [0.44.0](v0.43.0...v0.44.0) (2025-02-04) ### Bug Fixes * Adding periodic check to fix the sporadic failures of the operator e2e tests. ([#4952](#4952)) ([1d086be](1d086be)) * Adding the feast-operator/bin to the .gitignore directory. Somehow it… ([#5005](#5005)) ([1a027ee](1a027ee)) * Changed Env Vars for e2e tests ([#4975](#4975)) ([fa0084f](fa0084f)) * Fix GitHub Actions to pass authentication ([#4963](#4963)) ([22b9138](22b9138)), closes [#4937](#4937) [#4939](#4939) [#4941](#4941) [#4940](#4940) [#4943](#4943) [#4944](#4944) [#4945](#4945) [#4946](#4946) [#4947](#4947) [#4948](#4948) [#4951](#4951) [#4954](#4954) [#4957](#4957) [#4958](#4958) [#4959](#4959) [#4960](#4960) [#4962](#4962) * Fix showing selected navigation item in UI sidebar ([#4969](#4969)) ([8ac6a85](8ac6a85)) * Invalid column names in get_historical_features when there are field mappings on join keys ([#4886](#4886)) ([c9aca2d](c9aca2d)) * Read project data from the 'projects' key while loading the registry state in the Feast UI ([#4772](#4772)) ([cb81939](cb81939)) * Remove grpcurl dependency from Operator ([#4972](#4972)) ([439e0b9](439e0b9)) * Removed the dry-run flag to test and we will add it back later. ([#5007](#5007)) ([d112b52](d112b52)) * Render UI navigation items as links instead of buttons ([#4970](#4970)) ([1267703](1267703)) * Resolve Operator CRD bloat due to long field descriptions ([#4985](#4985)) ([7593bb3](7593bb3)) * Update manifest to add feature server image for odh ([#4973](#4973)) ([6a1c102](6a1c102)) * Updating release workflows to refer to yml instead of yaml ([#4935](#4935)) ([02b0a68](02b0a68)) * Use locally built feast-ui package in dev feature-server image ([#4998](#4998)) ([0145e55](0145e55)) ### Features * Added OWNERS file for OpenshiftCI ([#4991](#4991)) ([86a2ee8](86a2ee8)) * Adding Milvus demo to examples ([#4910](#4910)) ([2daf852](2daf852)) * Adding retrieve_online_documents endpoint ([#5002](#5002)) ([6607d3d](6607d3d)) * Adding support to return additional features from vector retrieval for Milvus db ([#4971](#4971)) ([6ce08d3](6ce08d3)) * Creating/updating the stable branch after the release. ([#5003](#5003)) ([e9b53cc](e9b53cc)) * Implementing online_read for MilvusOnlineStore ([#4996](#4996)) ([92dde13](92dde13)) * Improve exception message for unsupported Snowflake data types ([#4779](#4779)) ([5992364](5992364)) * Operator add feast ui deployment ([#4930](#4930)) ([b026d0c](b026d0c)) * Updating documents to highlight v2 api for Vector Similarity Se… ([#5000](#5000)) ([32b82a4](32b82a4))
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Footer

© 2026 GitHub, Inc.