Skip to content

Conversation

@aniketpalu
Copy link
Contributor

What this PR does / why we need it:

  • Add support for entity_df=None in RayOfflineStore.get_historical_features with start_date/end_date.
    -- Derives entity set by reading distinct join keys from each FeatureView source within the time window, applies field mappings and join_key_map, filters by timestamp, and unions aligned schemas.
    -- Adds stable event_timestamp = end_date for PIT joins.

  • Signature change: get_historical_features accepts entity_df: Optional[Union[pd.DataFrame, str]] and **kwargs.
    -- Why: Match base interface and support date-only retrieval.

Which issue(s) this PR fixes:

RHOAIENG-38643

Misc

@aniketpalu aniketpalu requested a review from a team as a code owner November 25, 2025 19:18
Copy link
Contributor

@jyejare jyejare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good initially, have some doubts.

Also needs to add tests.

return pa.Table.from_pandas(df).schema


def _compute_non_entity_dates_ray(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have make a common utility function for this, so that it can be used in all stores without repeating the code.

wdyt ?

return _filter_range


def _make_select_distinct_keys(join_keys: List[str]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should drop rows with duplicate IDs, because there could be multiple transactions per ID and we need to choose the row based on timestamp while joining the colums from another table/view. I think this is the same case with your spark PR.

Please check the postgres implementation to understand the case.

Or Am I misreading this ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants