SparkRetrievalJob.persist() fails due to missing SparkSource mapping in SavedDatasetStorage.from_data_source #6261

I implemented get_historical_features() followed by job.persist() using a Spark offline store. However, job.persist() failed with a ValueError stating that the method does not support SparkSource. I found that SavedDatasetStorage.from_data_source() only maps FileSource. When I tried using a FileSource instead, it failed with an assertion error, because SparkRetrievalJob.persist specifically expects SavedDatasetSparkStorage.
Additionally, we are using the Spark offline store with a path-based SparkSource configuration (S3 with Parquet format). Going through the code, I found that the method only supports table-based SparkSource. Could you please consider adding support for path-based SparkSource configuration as well?
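
To make the gap concrete, here is a minimal, self-contained sketch of the dispatch behaviour described above and one possible shape for a fix. It does not import Feast: the classes, the registries, and the storage_from_data_source helper are hypothetical stand-ins named after the concepts in this report, and the S3 path is a placeholder. It is only meant to illustrate the missing SparkSource mapping and the path-based case, not to reproduce the real implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


# Hypothetical stand-ins for illustration only; these are NOT Feast's classes.
@dataclass
class FileSource:
    path: str


@dataclass
class SparkSource:
    table: Optional[str] = None        # table-based configuration
    path: Optional[str] = None         # path-based configuration
    file_format: Optional[str] = None  # e.g. "parquet" for path-based sources


@dataclass
class FileStorage:
    path: str


@dataclass
class SparkStorage:
    table: Optional[str] = None
    path: Optional[str] = None
    file_format: Optional[str] = None


def spark_storage_from_source(source: SparkSource) -> SparkStorage:
    """Build a Spark storage from either a table-based or a path-based source."""
    if source.table is None and source.path is None:
        raise ValueError("SparkSource needs either a table or a path.")
    return SparkStorage(
        table=source.table, path=source.path, file_format=source.file_format
    )


# Behaviour as reported: only FileSource has a mapping.
CURRENT_REGISTRY: Dict[str, Callable] = {
    "FileSource": lambda s: FileStorage(path=s.path),
}

# Proposed behaviour (an assumption): SparkSource is mapped as well.
PROPOSED_REGISTRY: Dict[str, Callable] = {
    **CURRENT_REGISTRY,
    "SparkSource": spark_storage_from_source,
}


def storage_from_data_source(source, registry: Dict[str, Callable]):
    """Look up the storage builder by the source's class name."""
    name = type(source).__name__
    if name not in registry:
        raise ValueError(f"This method does not support {name}.")
    return registry[name](source)
```

With CURRENT_REGISTRY, passing a SparkSource raises the ValueError reported here; with PROPOSED_REGISTRY, a path-based source such as SparkSource(path="s3://bucket/features/", file_format="parquet") yields a SparkStorage that carries the path and format through.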
