Questions and doubts about the dataset

Hi!

First and foremost, thanks for your contribution.

I'm using this dataset in my research; however, I'm having trouble using it after reading the SIGIR paper "RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System". I'm hoping you can answer the following questions:
1. Could you please explain what the `a_` and `b_` prefixes in the data file names mean? E.g., `rl4rs_dataset_a_rl` vs. `rl4rs_dataset_b_rl`.
2. Could you please explain what the `_rl` and `_sl` suffixes in the data file names mean? E.g., `rl4rs_dataset_a_rl` vs. `rl4rs_dataset_a_sl`.
3. Do users have unique numerical identifiers? I tried a `.unique()` operation on the `user_protrait` column, but I got far more unique strings than what is reported in Table 2.
4. Inside the `item_feature` column, how can I identify the item's numerical identifier? The paper says that the ID is inside this column but does not specify its position within the array.
5. If I want to perform an offline evaluation using a traditional user-rating matrix, can I join the datasets into a single matrix, or should I keep four separate matrices (one per data file)?
6. Could you please provide or point to the code that computes the dataset statistics?
7. I'm currently trying to replicate Table 2, but I don't know how to map Slate-SL, Slate-RL, SeqSlate-SL, and SeqSlate-RL to the data files.
8. Related to 7, how can I create the Slate and SeqSlate datasets shown in the same table?
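
For reference, the kind of check described in question 3 can be sketched as follows. This is only a minimal illustration, not the dataset's official tooling: the sample rows, the `item_id` column, the `@` delimiter, and the `count_unique` helper are all hypothetical placeholders; only the `user_protrait` column name comes from the data files.

```python
import csv
import io

# Hypothetical sample standing in for one of the data files.
# The real column layout and delimiter may differ.
SAMPLE = """user_protrait@item_id
0.1,0.2,0.3@5
0.1,0.2,0.3@7
0.4,0.5,0.6@5
"""


def count_unique(text, column, delimiter="@"):
    """Count distinct raw string values in one column of delimited text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return len({row[column] for row in reader})


# Number of distinct profile strings in the sample.
n_unique_profiles = count_unique(SAMPLE, "user_protrait")
print(n_unique_profiles)
```

Counting raw strings like this treats two rows as different users whenever any single feature value differs, which might be one reason the count exceeds the figure in Table 2.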

Thanks in advance!
