Plant Transcription Factor Database
v4.0 (previous versions: v1.0, v2.0, v3.0)
Thresholds for domain identification
In PlantTFDB, the bit score, rather than the E-value, was used as the cutoff for each domain. As in Pfam, each domain has two thresholds: a sequence cutoff and a domain cutoff. Thresholds for auxiliary domains (except self-built domains) and forbidden domains were retrieved directly from Pfam (v27.0). Thresholds for DNA-binding domains and self-built auxiliary domains were determined by the following method.
- GO annotations (excluding IEA annotations) were used as the first line of evidence to determine rough thresholds for each DNA-binding domain: a Trusted Cutoff (TC) and a Noise Cutoff (NC). TC is the lowest score among proteins that possess the domain and are annotated with "transcription factor activity". NC is the highest score among proteins that lack the domain or lack "transcription factor activity".
- Because GO annotations alone did not provide enough information to determine TC and NC for every domain, the Pfam cutoffs (TC and NC) were used to adjust those TCs and NCs that were poorly supported.
- TAIR annotations and UniProt annotations were also used to refine the TCs and NCs.
- HMM alignments were inspected manually to refine the TCs and NCs further, and a reasonable score between TC and NC was chosen as the final cutoff.
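The first and last steps of this method can be illustrated with a minimal Python sketch. It is not the PlantTFDB pipeline: the `Hit` record, the function names, and the example scores are hypothetical, and the midpoint rule stands in for the manual choice of "a reasonable score between TC and NC" described above. The final function shows one plausible reading of the two-threshold check, in which a domain is accepted only if both the sequence score and the domain score reach their cutoffs.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One HMM hit for a single domain (hypothetical record layout)."""
    protein_id: str
    bit_score: float
    has_domain: bool       # curated evidence that the protein carries the domain
    has_tf_activity: bool  # non-IEA GO annotation "transcription factor activity"


def rough_thresholds(hits: Iterable[Hit]) -> Tuple[Optional[float], Optional[float]]:
    """Return (TC, NC) as defined in the text, or None where no evidence exists."""
    hits = list(hits)
    # TC: lowest score among proteins with the domain AND TF activity.
    positives = [h.bit_score for h in hits if h.has_domain and h.has_tf_activity]
    # NC: highest score among proteins lacking the domain OR lacking TF activity.
    negatives = [h.bit_score for h in hits if not (h.has_domain and h.has_tf_activity)]
    tc = min(positives) if positives else None
    nc = max(negatives) if negatives else None
    return tc, nc


def choose_cutoff(tc: Optional[float], nc: Optional[float]) -> float:
    """Pick a score between NC and TC.

    Assumption: the midpoint is used here only as a placeholder for the
    manual decision made after inspecting HMM alignments.
    """
    if tc is None or nc is None:
        raise ValueError("TC or NC is undetermined; other evidence is needed")
    if nc >= tc:
        raise ValueError("NC is not below TC; manual inspection is required")
    return (tc + nc) / 2


def passes_cutoffs(sequence_score: float, domain_score: float,
                   sequence_cutoff: float, domain_cutoff: float) -> bool:
    """Accept a domain only if both bit scores reach their cutoffs."""
    return sequence_score >= sequence_cutoff and domain_score >= domain_cutoff


# Made-up example scores, for illustration only.
hits = [
    Hit("P1", 45.0, True, True),
    Hit("P2", 30.0, True, True),
    Hit("P3", 18.0, False, False),
    Hit("P4", 12.0, True, False),
]
tc, nc = rough_thresholds(hits)
cutoff = choose_cutoff(tc, nc)
```

When `rough_thresholds` returns `None` for either value, or when NC is not below TC, the sketch raises an error instead of guessing. This corresponds to the cases in which Pfam cutoffs, TAIR and UniProt annotations, and manual inspection were needed.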