Are there recommended pipelines/methods for dereplicating LARGE custom databases? #43

Answered by bluenote-1577
MicroBTM asked this question in Q&A

@bryantmurphy The GTDB-R220 database contains the species-level dereplicated genomes from GTDB. GTDB uses its own dereplication pipeline, described in https://academic.oup.com/nar/article/50/D1/D785/6370255

For dereplicating large custom databases, possible tools include dRep (https://github.com/MrOlm/drep, quite popular) and skDER (https://github.com/raufs/skDER).
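
As a rough illustration of what running one of these tools could look like, here is a minimal sketch that assembles a `dRep dereplicate` call from Python. The helper name `build_drep_command`, the `genomes/` input folder, the `drep_out` output folder, and the 0.95 ANI and 8-thread values are all assumptions for the example, not recommendations from this thread. It relies on dRep's `-g` (genome files), `-sa` (secondary ANI threshold) and `-p` (threads) options; please check `dRep dereplicate -h` for your installed version before relying on it.

```python
import shutil
import subprocess
from pathlib import Path


def build_drep_command(genome_paths, out_dir, ani=0.95, threads=8):
    """Return the argument list for a `dRep dereplicate` run.

    Hypothetical helper for illustration only. `ani` is passed to dRep's
    secondary-clustering ANI threshold (-sa); `threads` is passed to -p.
    """
    if not genome_paths:
        raise ValueError("no genome files given")
    return [
        "dRep", "dereplicate", str(out_dir),
        "-g", *[str(p) for p in genome_paths],
        "-sa", str(ani),
        "-p", str(threads),
    ]


if __name__ == "__main__":
    # Assumed layout: one FASTA file per genome in ./genomes
    genomes = sorted(Path("genomes").glob("*.fasta"))
    if genomes and shutil.which("dRep"):
        subprocess.run(build_drep_command(genomes, "drep_out"), check=True)
    else:
        print("dRep or input genomes not found; nothing was run")
```

Building the command as a list avoids shell quoting problems with unusual file names. dRep normally also takes genome quality into account when picking representatives, so a real run may need extra options described in its documentation.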

Answer selected by MicroBTM