Should OpusCleaner have the notion of a "project"? #146

@bhaddow

I am trying to understand the intended workflow for OpusCleaner.

Suppose I want to build some MT systems. I fire up OpusCleaner, download some data, apply cleaning rules until I am happy, then upload the data to the cluster for training. Maybe I come back the next day and want to create a new version of this data set, or maybe I want to train a completely different MT system.

For this, would it be useful if OC had the notion of a "project"? I open a project, add files to it, set some project-wide rules and parameters, then maybe some data-set-specific parameters. If I then want to work on a different MT system, I open a different project. I can copy the project file onto a different server and initialise it there (by downloading the files). Maybe projects could have versions, so I can track which data/rule set I used.
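
To make the idea concrete, here is a minimal sketch of what a project file might contain. Everything in it is hypothetical: `Project`, `Dataset`, the rule names and the content-hash versioning are illustrative assumptions, not part of OpusCleaner's actual API or file formats. It shows project-wide rules, per-data-set overrides, a JSON form that could be copied to another server, and a version identifier derived from the project's contents.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Dataset:
    """Hypothetical entry for one data set in a project."""
    name: str
    overrides: dict = field(default_factory=dict)  # data-set-specific parameters


@dataclass
class Project:
    """Hypothetical project: shared rules plus a list of data sets."""
    name: str
    rules: dict = field(default_factory=dict)      # project-wide rules and parameters
    datasets: list = field(default_factory=list)

    def effective_rules(self, dataset_name: str) -> dict:
        # Data-set-specific parameters take precedence over project-wide ones.
        for ds in self.datasets:
            if ds.name == dataset_name:
                return {**self.rules, **ds.overrides}
        raise KeyError(dataset_name)

    def to_json(self) -> str:
        # Sorted keys give a stable serialisation, so the hash below is reproducible.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Project":
        raw = json.loads(text)
        raw["datasets"] = [Dataset(**d) for d in raw["datasets"]]
        return cls(**raw)

    def version(self) -> str:
        # Content hash: any change to rules or data sets yields a new version id.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()[:12]


# Example with made-up data set and rule names.
project = Project(
    name="en-de-news",
    rules={"max_length": 150, "deduplicate": True},
    datasets=[Dataset("corpus-a"), Dataset("corpus-b", {"max_length": 100})],
)
restored = Project.from_json(project.to_json())  # e.g. after copying to another server
```

Here the project file records only data set names and rules, so "initialising" it on another server would mean downloading the listed files again. Using a content hash as the version is one possible choice; explicit version numbers would work as well.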

Labels: enhancement (new feature or request)
