A limitation of GTDB-Tk is that it provides no guidance on the relationships among user genomes. For example, GTDB-Tk might establish that 3 user genomes belong to an Order but do not belong to any existing Family within that Order. It is then left to the user to determine whether these 3 genomes represent 1 new Family or multiple new Families. We should explore the accuracy of pplacer branch lengths and topology to see if we can resolve this using relative evolutionary divergence (RED), and potentially average nucleotide identity (ANI) if the user genomes appear to belong to the same Genus. This is a non-trivial task, as we would first need to establish that the pplacer results are similar enough to a de novo tree to be used for this purpose.
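As a rough illustration of the grouping step only, the sketch below links two user genomes whenever the RED of their most recent common ancestor (MRCA) is at or above a rank threshold, then reports the connected groups. Everything in it is an assumption made for illustration: the function name `group_by_mrca_red`, the input dictionary of pairwise MRCA RED values, the single-linkage rule, and the threshold of 0.75, which is a placeholder and not a GTDB value. It also presumes the RED values come from a tree that has already been shown to be reliable, which is exactly the open question for pplacer placements.

```python
from itertools import combinations


def group_by_mrca_red(genomes, mrca_red, rank_red_threshold):
    """Group genomes whose pairwise MRCA RED meets a rank threshold.

    Hypothetical helper, not part of GTDB-Tk.
    genomes: list of genome identifiers.
    mrca_red: dict mapping frozenset({a, b}) to the RED of the MRCA of a and b.
    rank_red_threshold: assumed RED cutoff for the rank of interest.
    """
    # Union-find structure: every genome starts as its own group.
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, halving paths.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        # RED runs from 0 at the root to 1 at the tips, so a higher MRCA RED
        # means the two genomes diverged more recently.
        if mrca_red[frozenset((a, b))] >= rank_red_threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Made-up RED values for three user genomes placed in the same Order.
genomes = ["user_1", "user_2", "user_3"]
mrca_red = {
    frozenset(("user_1", "user_2")): 0.82,
    frozenset(("user_1", "user_3")): 0.61,
    frozenset(("user_2", "user_3")): 0.61,
}

# 0.75 is a placeholder cutoff for the Family rank, chosen for illustration.
print(group_by_mrca_red(genomes, mrca_red, rank_red_threshold=0.75))
```

With these made-up values, `user_1` and `user_2` fall into one candidate Family and `user_3` into another. Single linkage is only one possible rule; a stricter alternative would require every pair within a group to meet the threshold.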