ddbj/ddbj_curator_assistant

This project has been replaced by https://github.com/ddbj/mssassist

DDBJ Curators' Assistant

The system consists of:

  • A set of tools that automate and standardize the curation of sequence datasets submitted to DDBJ. The workflow has four steps: validation (ddbj_mss_validation); auto-correction (ddbj_autofix); automatic upload of files to the DDBJ databases (ddbj_sakura2DB); and updating the work-tracking spreadsheet (ddbj_kaeru; "kaeru 帰る" is Japanese for "leave, go home", in the sense that the work is done).
  • A database, named dblink_ddbj, that contains the most relevant information from the DDBJ databases and is designed specifically for DDBJ curators.
  • An easy-to-use search engine tool, search_dblink, for quickly and simultaneously retrieving data from a wide range of accession IDs.

Mass Dataset Documentation

ag_packages_202204_MSS_workflow

Notice! The MSS tool repository has been migrated to ddbj/mssassist. Please switch to the tools in /home/w3const/mssassist. The data tables in w3const/mssassist are updated more frequently than those in /home/andrea/script.
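
When switching, it can help to locate personal scripts or notes that still call the former location. The sketch below is a hypothetical helper (find_former_mss_paths is not part of mssassist). It assumes only the former path prefix /home/andrea/scripts/ddbj_ used in the command lines of this README, and a grep that supports recursive search (-r).

```shell
# Hypothetical helper, not part of mssassist.
# Lists every line under a directory that still references a former
# tool path (/home/andrea/scripts/ddbj_...).
find_former_mss_paths() {
  # -r: recurse into the directory
  # -n: print the line number of each match
  # -F: treat the pattern as a fixed string, not a regular expression
  # Exit status follows grep: 0 if a match was found, 1 if none.
  grep -rnF "/home/andrea/scripts/ddbj_" "${1:-.}"
}
```

For example, `find_former_mss_paths ~/bin` would print matches as file:line:text, and each one can then be changed to the corresponding /home/w3const/mssassist command.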

  1. DDBJ Mass Validation
    • A simple command-line tool that identifies the submitted files (annotation and FASTA) and checks them for inconsistencies against DDBJ rules.
    • Requirement: BioSample
    • Command line
      /home/w3const/mssassist/ddbj_mss_validation
      
    • Former command line
      bash /home/andrea/scripts/ddbj_mss_validation
      
      • Beta version
        bash /home/andrea/scripts/ddbj_mss_validation_beta
        
  2. DDBJ Autofix
    • A simple command-line tool that interactively suggests corrections for the problems detected by DDBJ Mass Validation and applies them automatically.
    • Command line
      /home/w3const/mssassist/ddbj_autofix
      
    • Former command line
      bash /home/andrea/scripts/ddbj_autofix
      
      • Beta (CAUTION! Use this version when running ddbj_mss_validation_beta)
        bash /home/andrea/scripts/ddbj_autofix_beta
        
  3. DDBJ Sakura2DB
    • An interactive command-line tool that automatically: a) identifies the file type; b) runs sakura2db (test and actual runs) on the corrected files to upload them to their respective databases at DDBJ (Tsunami); c) moves the files to the DONE directory.
    • Command line
      /home/w3const/mssassist/ddbj_sakura2DB
      
    • Former command line
      bash /home/andrea/scripts/ddbj_sakura2DB
      
      • Beta version
        bash /home/andrea/scripts/ddbj_sakura2DB_beta
      
  4. DDBJ Kaeru
    • Updates the work-tracking spreadsheet after DDBJ Sakura2DB has been run.
    • Command line
      /home/w3const/mssassist/ddbj_kaeru
      
    • Former command line
      bash /home/andrea/scripts/ddbj_kaeru
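
The four steps above are run in order, each one building on the previous one. A minimal sketch of a wrapper follows, assuming that each tool can be started without arguments (as in the command lines above) and that it returns a non-zero exit status on failure; neither assumption is confirmed by this README. The names run_mss_workflow, MSS_STEPS and MSSASSIST_DIR are hypothetical and not part of mssassist.

```shell
# Sketch only: a hypothetical wrapper, not part of mssassist.

# Directory holding the tools; override MSSASSIST_DIR to point elsewhere.
MSSASSIST_DIR="${MSSASSIST_DIR:-/home/w3const/mssassist}"

# The four steps, in the order given in this README.
MSS_STEPS="ddbj_mss_validation ddbj_autofix ddbj_sakura2DB ddbj_kaeru"

run_mss_workflow() {
  for mss_step in $MSS_STEPS; do
    # Refuse to continue if a tool is missing or not executable.
    if [ ! -x "$MSSASSIST_DIR/$mss_step" ]; then
      echo "missing or not executable: $MSSASSIST_DIR/$mss_step" >&2
      return 127
    fi
    echo "== $mss_step =="
    # Assumption: a non-zero exit status means the step failed,
    # so the later steps (upload, spreadsheet update) are skipped.
    "$MSSASSIST_DIR/$mss_step" || {
      echo "$mss_step failed; stopping" >&2
      return 1
    }
  done
}
```

Calling `run_mss_workflow` from the directory that holds the submission files would start each tool in turn. Standard input is inherited, so the interactive prompts of ddbj_autofix and ddbj_sakura2DB should still reach the terminal.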
      

DBLink DDBJ and Search DBLink Documentation

  1. DBLink DDBJ
    • Comprises essential data (dblink dataset) from the major DDBJ databases: BioProject, BioSample, Sequence Read Archive (DRA), Assembled Sequences (Mass Data) and GEA.
  2. Search DBLink
    • A simple command-line tool that queries the DBLink DDBJ database and cross-references the major DBLink datasets from DDBJ, taking as input a single file that contains different accession IDs.
