fix: Ambiguous truth value of array during materialization by alan-gauthier-jt · Pull Request #6259 · feast-dev/feast · GitHub

fix: Ambiguous truth value of array during materialization #6259

Merged
ntkathole merged 2 commits into feast-dev:master from alan-gauthier-jt:fix-array-materialize
Apr 14, 2026

Conversation

Contributor

alan-gauthier-jt commented Apr 10, 2026

What this PR does / why we need it:

feast materialize crashes with ValueError: The truth value of an empty array is ambiguous when a scalar feature column contains an empty numpy array (e.g. np.array([])). This is a real-world scenario when a DataFrame row has a missing value represented as an empty array rather than None or np.nan.

Root cause: In _convert_scalar_values_to_proto (sdk/python/feast/type_map.py), the null check applies not pd.isnull(value) to every value in the loop. pd.isnull() is vectorised: when value is a numpy array, it returns a boolean array instead of a scalar, and applying Python's not operator to an empty or multi-element boolean array raises ValueError. The same issue exists in:

  • the BOOL scalar path (not pd.isnull(value) in a list comprehension)
  • the UNIX_TIMESTAMP early-return path (_python_datetime_to_int_timestamp(values) called with the raw values list, including any array-like values)
  • the sample type-validation check (sample == 0)
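
The failure mode described above can be reproduced outside Feast. The following is a minimal sketch, assuming only numpy and pandas are installed; `truthiness_error` is a hypothetical helper introduced for illustration and is not part of Feast. The empty-array case may behave differently on older NumPy releases (a deprecation warning rather than an error), so the sketch uses a two-element array, which raises regardless of version.

```python
import numpy as np
import pandas as pd


def truthiness_error(value):
    """Hypothetical helper: return the ValueError message raised by
    `not pd.isnull(value)`, or None if no error is raised."""
    try:
        not pd.isnull(value)
    except ValueError as exc:
        return str(exc)
    return None


# Scalars: pd.isnull returns a single boolean, so `not` is well defined.
scalar_error = truthiness_error(3.14)

# Arrays: pd.isnull returns an element-wise boolean array ...
mask = pd.isnull(np.array([np.nan, 1.0]))

# ... and `not` on a multi-element boolean array raises ValueError.
array_error = truthiness_error(np.array([1.0, 2.0]))
```

Here `scalar_error` stays `None`, while `array_error` carries NumPy's "truth value ... is ambiguous" message.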

Fix: Before calling pd.isnull(), guard both scalar conversion loops (generic and BOOL) and the sample type-validation with an explicit isinstance(value, np.ndarray) check. Any array-like value in a scalar feature column is unmappable to a protobuf scalar field anyway, so it is safely treated as null → ProtoValue().

| Input value | Behaviour before | Behaviour after |
|---|---|---|
| np.array([]) (empty) | ValueError crash | ProtoValue() (null) |
| np.array([np.nan, 1.0]) | ValueError crash | ProtoValue() (null) |
| np.array([1.0, 2.0]) | ValueError crash | ProtoValue() (null) |
| None | ProtoValue() (null) | unchanged |
| scalar non-null | ProtoValue(field=value) | unchanged |
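
The guard ordering behind the table can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `to_scalar_or_null` is a hypothetical function, and plain `None` stands in for an empty `ProtoValue()` so the example runs without Feast installed.

```python
import numpy as np
import pandas as pd

NULL = None  # stand-in for an empty ProtoValue(); an assumption of this sketch


def to_scalar_or_null(value):
    """Hypothetical per-value conversion: arrays and nulls both become NULL."""
    # The isinstance check runs first, so `or` short-circuits and
    # pd.isnull is never called on an array.
    if isinstance(value, np.ndarray) or pd.isnull(value):
        return NULL
    return value


rows = [np.array([]), np.array([np.nan, 1.0]), np.array([1.0, 2.0]), None, 42]
converted = [to_scalar_or_null(v) for v in rows]
```

The important design point is the order of the two checks: swapping them would reintroduce the ambiguous truth-value error for array inputs.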

Which issue(s) this PR fixes:

Fixes #6255

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Misc

alan-gauthier-jt requested a review from a team as a code owner April 10, 2026 13:30

This comment was marked as resolved.

Comment thread sdk/python/feast/type_map.py Outdated
    return [ProtoValue(unix_timestamp_val=ts) for ts in int_timestamps]  # type: ignore
    out = []
    for value in values:
        if isinstance(value, np.ndarray) or (

Member

Let's extract this into a small helper and use it everywhere to avoid duplication:

    def _is_array_like(value: Any) -> bool:
        return isinstance(value, np.ndarray) or (
            hasattr(value, "__len__") and not isinstance(value, (str, bytes))
        )

Contributor Author

Done

ntkathole changed the title from "fix: ambiguous truth value of array during materialization" to "fix: Ambiguous truth value of array during materialization" on Apr 13, 2026
Comment thread sdk/python/feast/type_map.py Outdated
        out.append(ProtoValue())
    else:
        (ts,) = _python_datetime_to_int_timestamp([value])
        out.append(ProtoValue(unix_timestamp_val=ts))  # type: ignore

Member

The if/else logic calls the converter once per row; it is better to pre-filter first and convert the clean values in one batch:

    if feast_value_type == ValueType.UNIX_TIMESTAMP:
        out = [None] * len(values)
        clean_indices = []
        clean_values = []
        for i, value in enumerate(values):
            if _is_array_like(value) or value is None:
                out[i] = ProtoValue()
            else:
                clean_indices.append(i)
                clean_values.append(value)
        if clean_values:
            timestamps = _python_datetime_to_int_timestamp(clean_values)
            for i, ts in zip(clean_indices, timestamps):
                out[i] = ProtoValue(unix_timestamp_val=ts)
        return out
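
The pre-filter pattern suggested above can be tried in isolation. The sketch below is a simplified, runnable variant under stated assumptions: `to_int_timestamps` is a hypothetical stand-in for `_python_datetime_to_int_timestamp` (its real behaviour is not reproduced here), and plain integers and `None` stand in for `ProtoValue` objects.

```python
from datetime import datetime, timezone
from typing import Any, List, Optional

import numpy as np


def _is_array_like(value: Any) -> bool:
    return isinstance(value, np.ndarray) or (
        hasattr(value, "__len__") and not isinstance(value, (str, bytes))
    )


def to_int_timestamps(values: List[datetime]) -> List[int]:
    """Hypothetical batch converter; stands in for the Feast helper."""
    return [int(v.timestamp()) for v in values]


def convert_timestamps(values: List[Any]) -> List[Optional[int]]:
    out: List[Optional[int]] = [None] * len(values)  # None marks a null slot
    clean_indices: List[int] = []
    clean_values: List[datetime] = []
    for i, value in enumerate(values):
        if _is_array_like(value) or value is None:
            continue  # leave the slot as null
        clean_indices.append(i)
        clean_values.append(value)
    if clean_values:
        # One batch call for all clean values instead of one call per row.
        for i, ts in zip(clean_indices, to_int_timestamps(clean_values)):
            out[i] = ts
    return out


epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
day_two = datetime(1970, 1, 2, tzinfo=timezone.utc)
result = convert_timestamps([epoch, np.array([]), None, day_two])
```

Recording the indices of clean values keeps the output aligned with the input rows even though the nulls are skipped during conversion.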

Contributor Author

Thanks, I implemented your solution

Member

@alan-gauthier-jt I think it would be even better to fix this in https://github.com/feast-dev/feast/blob/master/sdk/python/feast/type_map.py by adding logic for scalar columns that skips array-like values when picking a sample.
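
One way to read the suggestion above is sketched below. `pick_scalar_sample` is a hypothetical function written for illustration; it is not Feast's actual sample-selection code, and it only skips `None` and array-like values (it does not, for example, treat NaN specially).

```python
from typing import Any, Iterable, Optional

import numpy as np


def _is_array_like(value: Any) -> bool:
    return isinstance(value, np.ndarray) or (
        hasattr(value, "__len__") and not isinstance(value, (str, bytes))
    )


def pick_scalar_sample(values: Iterable[Any]) -> Optional[Any]:
    """Hypothetical: return the first value that is neither None nor
    array-like, or None when no such value exists."""
    return next(
        (v for v in values if v is not None and not _is_array_like(v)),
        None,
    )


sample = pick_scalar_sample([np.array([]), None, 7, 8])
no_sample = pick_scalar_sample([None, np.array([1.0])])
```

With a sample chosen this way, a later comparison such as `sample == 0` operates on a scalar and cannot produce an element-wise array.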

Comment thread sdk/python/feast/type_map.py
Member

ntkathole left a comment

Thanks @alan-gauthier-jt looks good

Signed-off-by: Alan Gauthier <alan.gauthier@jobteaser.com>
ntkathole force-pushed the fix-array-materialize branch from 9037bdc to b91978e April 14, 2026 07:31
ntkathole merged commit d0c8984 into feast-dev:master Apr 14, 2026
3 of 6 checks passed
franciscojavierarceo pushed a commit that referenced this pull request May 4, 2026
# [0.63.0](v0.62.0...v0.63.0) (2026-05-04)

### Bug Fixes

* Add project filter to apply_data_source and delete_data_source (closes [#6206](#6206)) ([#6322](#6322)) ([96562c4](96562c4))
* Add project_id filter to SnowflakeRegistry UPDATE path ([#6243](#6243)) ([6658b71](6658b71)), closes [#6208](#6208) [#6208](#6208)
* Add subprocess timeouts to prevent test_e2e_local hanging on Dask atexit handler ([3de6556](3de6556))
* Ambiguous truth value of array during materialization ([#6259](#6259)) ([d0c8984](d0c8984))
* Auto-detect GCS/S3 registry store when registry is passed as string ([#6260](#6260)) ([7ebcf03](7ebcf03))
* **bigquery:** Prefer query over table in get_table_query_string ([#6360](#6360)) ([77ed779](77ed779)), closes [#6200](#6200)
* correct project_id scoping in get_user_metadata and delete_project ([0c469a7](0c469a7))
* disable Redis RDB persistence in test deployments ([44cd682](44cd682))
* Disable snowflake tests temporarily in CI ([#6356](#6356)) ([31d5a98](31d5a98))
* Filter empty SQL commands at execute_snowflake_statement call sites ([#6249](#6249)) ([92ffbb9](92ffbb9))
* Fix five bugs in milvus online store ([#6275](#6275)) ([212504b](212504b))
* Fix issue with apply feature view ([835cda8](835cda8))
* Fix streaming materialization for exotic sources with lazy UDF pipelines ([c07972d](c07972d))
* Handle missing features gracefully instead of panicking ([7d00b3a](7d00b3a))
* Harden informer cache with label selectors and memory optimizations ([#6242](#6242)) ([3f11356](3f11356))
* **helm:** Avoid nil pointer for metrics.enabled inside podAnnotations ([#6251](#6251)) ([c833f1a](c833f1a))
* Include git in feast server image ([fb03c46](fb03c46))
* Include StreamFeatureView in freshness metric ([#6269](#6269)) ([463f16c](463f16c))
* Pre-create S3A event log dir before SparkContext init ([#6317](#6317)) ([9feca77](9feca77))
* Remote Online Store Type Inference Error with All-NULL Columns ([#6063](#6063)) ([de67bdd](de67bdd))
* Remove selector with kustomize overlay using a JSON 6902 patch ([9107a43](9107a43))
* Resolve multiple bugs in SnowflakeRegistry and Snowflake connection handling ([#6315](#6315)) ([7e66a2e](7e66a2e))
* **spark:** BatchFeatureView with TransformationMode.PYTHON now reads all source columns ([a310eaf](a310eaf))
* **spark:** Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query ([e1b1d2d](e1b1d2d))
* Support pandas mode in feature builder and fix dask column extraction ([863315e](863315e))
* support SQL string as entity_df in RemoteOfflineStore.get_historical_features ([c559889](c559889))
* Wrap LocalOutputNode return value in ArrowTableValue for consist… ([#6286](#6286)) ([a16cd55](a16cd55))

### Features

* Add agent skills and Cursor/Claude rules for Feast development ([312eea3](312eea3))
* Add feature view versioning support to FAISS online store ([b36acb7](b36acb7))
* Add feature view versioning support to Redis and DynamoDB online stores ([#6257](#6257)) ([edf25af](edf25af)), closes [#6164](#6164) [#6163](#6163)
* Add optional 'org' in feature view ([#6288](#6288)) ([#6301](#6301)) ([608b105](608b105))
* Add RaySource, to_ray_dataset first-class method, docs, and tests ([1c98157](1c98157))
* Add TLS support for Go Feature Server ([#6229](#6229)) ([28a58d0](28a58d0))
* Add Vector Search support to MongoDBOnlineStore ([#6344](#6344)) ([c102738](c102738))
* Add versioning support to Milvus online store ([#6330](#6330)) ([3268ced](3268ced))
* Addresses performance issues in the Redis online store ([2e50da0](2e50da0))
* Allow to set gpu for ray ([5580ab4](5580ab4))
* Bump redis-py version cap from <5 to <8 ([#6339](#6339)) ([9538180](9538180))
* Expose feature_server, materialization, and openlineage configuration via FeatureStore CRD ([ec6ecfd](ec6ecfd))
* Make online_write_batch_size configurable in MaterializationConfig ([#6268](#6268)) ([d41becf](d41becf))
* Make udf optional if agg defined ([#5689](#5689)) ([#6328](#6328)) ([f630056](f630056))
* MongoDB offline store ([#6138](#6138)) ([8eebad7](8eebad7))
* Optional input_schema for ODFV ([#6308](#6308)) ([#6312](#6312)) ([f08b4e8](f08b4e8))
* Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser ([#6240](#6240)) ([dca57e8](dca57e8))
* **spark:** Add compute-on-read support for BatchFeatureView in get_… ([#6357](#6357)) ([630d9f8](630d9f8))

Development

Successfully merging this pull request may close these issues.

ValueError: The truth value of an empty array is ambiguous during materialization

3 participants
