Hi!
First and foremost, thanks for your contribution.
I'm using this dataset in my research; however, I'm having trouble using it after reading the SIGIR paper "RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System". I'm hoping you could answer the following questions:
- Could you please explain the meaning of the `a_` and `b_` prefixes in the data files? e.g., `rl4rs_dataset_a_rl` vs `rl4rs_dataset_b_rl`.
- Could you please explain the meaning of the `_rl` and `_sl` suffixes in the data files? e.g., `rl4rs_dataset_a_rl` vs `rl4rs_dataset_a_sl`.
- Do users have unique numerical identifiers? I tried a `.unique()` operation on the `user_protrait` column, but I got far more unique strings than the number of users reported in Table 2.
- Inside the `item_feature` column, how can I identify the item's numerical identifier? The paper says that the ID is inside this column but does not specify its position in the array.
- If I want to perform an offline evaluation using a traditional user-rating matrix, can I join these datasets into a single matrix, or should I keep four separate matrices (one for each data file)?
- Could you please provide or highlight the code that computes the statistics of the dataset?
- I'm currently trying to replicate Table 2; however, I do not know how to map Slate-SL, Slate-RL, SeqSlate-SL, and SeqSlate-RL to the data files.
- Related to the previous question, how can I create the Slate and SeqSlate datasets shown in Table 2?
Thanks in advance!