TODOs after refactoring task schema #202

@pfliu-nlp


Core file: dataset_info.jsonl

Latest version

TODO Items

  • So far, almost all ERRORs come from Google Drive links, which work only intermittently. We can move these files to S3 gradually. Since most of them belong to summarization tasks, maybe @yixinL7 and @xcfcode could help out with this part.
  • Languages should be added for several datasets.
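
To see which entries still depend on Google Drive, a small scan over the core file may help. This is only a minimal sketch: it assumes each line of dataset_info.jsonl is one JSON object, makes no assumption about field names, and simply searches every string value for a Google Drive host. The names DRIVE_HOSTS, iter_strings, and find_drive_entries are hypothetical and not part of the repository.

```python
import json
from typing import Any, Iterator

# Hosts treated as Google Drive links (an assumption for this sketch).
DRIVE_HOSTS = ("drive.google.com", "docs.google.com")


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def find_drive_entries(lines):
    """Return (line_number, record) pairs whose values mention a Drive host."""
    hits = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        texts = list(iter_strings(record))
        if any(host in text for text in texts for host in DRIVE_HOSTS):
            hits.append((number, record))
    return hits
```

Passing an open file handle for dataset_info.jsonl as `lines` would list the records that are candidates for the move to S3.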

Some other follow-up things that should be done after task refactoring:

  • IMPORTANT: update get_dataset_info.py and dataset_info.json, and make sure they can be applied to the explainaboard_web db: Post-refactoring (update get_dataset_info & dataset scripts) #203
  • update docs for newly-introduced task schema.
  • make sure all datasets include
    • languages
    • other important metadata
  • add task schemas for the following (also think about modality-dependent schemas)
    • glue-stsb
    • superglue
    • polyprompt
  • reformat the organization of some datasets
    • adv_mtl
  • add a unit test that checks the validity of the newly introduced dataset loader scripts.
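
The metadata items in the list above could be enforced with a small unit test. The following is a sketch under stated assumptions, not the project's actual test: it assumes each dataset record is a dict, and the only required field it checks is "languages", since that is the only one named in this issue. REQUIRED_FIELDS and missing_metadata are hypothetical names introduced for illustration.

```python
import unittest

# Fields every dataset record is expected to carry. Only "languages" comes
# from the TODO list; extend the tuple once other metadata is decided.
REQUIRED_FIELDS = ("languages",)


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


class MetadataTest(unittest.TestCase):
    def test_complete_record_passes(self):
        record = {"name": "example", "languages": ["en"]}
        self.assertEqual(missing_metadata(record), [])

    def test_missing_languages_is_reported(self):
        self.assertEqual(missing_metadata({"name": "example"}), ["languages"])

    def test_empty_languages_is_reported(self):
        record = {"name": "example", "languages": []}
        self.assertEqual(missing_metadata(record), ["languages"])
```

Saved as a test module, this can be run with `python -m unittest`. A real test would load the records from the core file instead of using inline examples.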
