Change get_data() to always return a long form data frame with dimension IDs. #32
Draft
aaronweeden wants to merge 2 commits into ubccr:main from
WORK IN PROGRESS
Description
This PR changes the get_data() method to always return a long form data frame and to include the dimension IDs in the data. For example, given this call to get_data():
The df variable will be assigned this data frame:
Motivation and Context
Long form data are easier to process and plot.
Also, get_data() currently sometimes returns a Series instead of a DataFrame, which is difficult to work with because it often has to be converted to a DataFrame first.
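As a rough illustration of the two points above, here is a minimal pandas sketch. It does not call get_data(); the column names ("Time", "Resource", "CPU Hours") and the values are made up for the example and are not the actual output of this PR.

```python
import pandas as pd

# Hypothetical wide form result: one column per dimension value.
wide = pd.DataFrame(
    {"Resource A": [10.0, 12.0], "Resource B": [3.0, 4.0]},
    index=pd.Index(["2023-01-01", "2023-01-02"], name="Time"),
)

# Long form: one row per (time, dimension value) observation.
long_df = wide.reset_index().melt(
    id_vars="Time", var_name="Resource", value_name="CPU Hours"
)

# Hypothetical Series result: it must be converted before it can be
# processed with the same DataFrame code as the result above.
series = pd.Series(
    [10.0, 3.0],
    index=pd.Index(["Resource A", "Resource B"], name="Resource"),
    name="CPU Hours",
)
series_df = series.reset_index()
```

With the long form, a single code path (filter, group by, plot with one column per aesthetic) can handle every result, whereas the wide and Series shapes each need their own handling.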
The dimension IDs are needed because labels are not necessarily unique (e.g., two people can have the same name).
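The problem with non-unique labels can be sketched as follows. The data frame below is invented for illustration: the "Person ID" and "Person" column names and all values are assumptions, not the columns this PR actually returns.

```python
import pandas as pd

# Hypothetical long form data: two different people share one label.
long_df = pd.DataFrame(
    {
        "Time": ["2023-01", "2023-01", "2023-02", "2023-02"],
        "Person ID": [101, 202, 101, 202],
        "Person": ["Smith, Jane"] * 4,
        "CPU Hours": [5.0, 7.0, 1.0, 2.0],
    }
)

# Grouping by label silently merges the two people into one row.
by_label = long_df.groupby("Person")["CPU Hours"].sum()

# Grouping by ID keeps them separate.
by_id = long_df.groupby("Person ID")["CPU Hours"].sum()
```

Here by_label has a single entry while by_id has two, so only the ID column lets the two people be told apart.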
This PR will also be the basis for allowing multiple metrics to be requested and returned in the same data frame (which will be a separate PR) and for including multiple pieces of data about the dimension in the data frame (#35).
Tests performed
Types of changes
Checklist:
CHANGELOG.md has been updated
docs/developing.md) produces no errors
xdmod-notebooks repository as necessary, and the notebooks all run successfully