Plant Transcription Factor Database
Previous versions: v1.0, v2.0, v3.0
Transcription Factor Information
The ID of a transcription factor collected in PlantTFDB. For species with genome annotation, IDs from the genome annotation were adopted directly as the PlantTFDB ID. For species without genome annotation, a unique TF ID was assigned to each TF, consisting of three letters that represent the species (e.g. Aan represents Artemisia annua) followed by 6 digits.
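As a rough illustration of the assigned ID scheme described above, the sketch below checks whether a string has the "three letters plus six digits" shape and splits it into its two parts. It assumes the two parts are written back to back with no separator and that letter case is not significant; neither detail is stated here. The function name and the example ID "Aan000123" are hypothetical.

```python
import re

# Assumed layout: a three-letter species code followed directly by six digits.
# The separator (none) and letter case (any) are assumptions, not documented rules.
ASSIGNED_TF_ID = re.compile(r"([A-Za-z]{3})(\d{6})")


def parse_assigned_tf_id(tf_id):
    """Return (species_code, serial_number), or None if the shape does not match."""
    match = ASSIGNED_TF_ID.fullmatch(tf_id)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# "Aan000123" is a made-up ID used only to show the shape.
print(parse_assigned_tf_id("Aan000123"))
# IDs adopted from a genome annotation follow their own scheme and will not match.
print(parse_assigned_tf_id("AT1G01010"))
```

A check like this can only tell assigned IDs apart from other strings by shape; it cannot confirm that an ID actually exists in the database.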
The taxonomic ID and lineage for each organism were collected from NCBI Taxonomy.
Common name
The common names of transcription factors were collected from TAIR10, MSU and UniProt.
Gene Model
The gene (data source) coding for this transcription factor.
Gene Model ID
The ID of the gene model, extracted from the original data source. Gene model IDs can be searched on the advanced search page.
Gene Model Type
The type of the gene model. There are three types of gene model in PlantTFDB:
'genome' -- gene models that came from genome annotation;
'PU_ref' -- gene models that came from PlantGDB and UniGene and were selected as the representative of a cluster of PUTs and UniGene sequences;
'PU_unref' -- gene models that came from PlantGDB and UniGene but were not selected as the representative of a cluster of PUTs and UniGene sequences.
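When processing downloaded records, the three type labels above can be handled as a small enumeration. This is only a sketch: the class and function names are hypothetical, and only the three string values come from the list above.

```python
from enum import Enum


class GeneModelType(Enum):
    """The three gene model type labels listed above."""
    GENOME = "genome"      # from genome annotation
    PU_REF = "PU_ref"      # PlantGDB/UniGene, chosen as cluster representative
    PU_UNREF = "PU_unref"  # PlantGDB/UniGene, not chosen as cluster representative


def is_from_genome_annotation(model_type):
    """True only for gene models taken from a genome annotation."""
    return model_type is GeneModelType.GENOME


def is_cluster_representative(model_type):
    """True only for PlantGDB/UniGene models selected to represent a cluster."""
    return model_type is GeneModelType.PU_REF


# Looking up by value raises ValueError for any label outside the three above.
model_type = GeneModelType("PU_ref")
print(model_type, is_cluster_representative(model_type))
```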
The source from which the gene model was obtained.
Signature Domain
The domain used to identify and classify transcription factors.
Protein Features
Domain and other features identified by InterProScan v5.
Plant Ontology
Plant Ontology (PO) annotations were downloaded from TAIR10 for A. thaliana and from the Plant Ontology Consortium for other species.
Nuclear Localization Signal
Nuclear localization signal (NLS) predicted by predictnls.
3D Structure
The best BLAST hit from PDB.
The expression description (tissue specificity and developmental stage) was collected from UniProt. The best BLAST hits from UniGene, GEO and Genevisible, together with direct links to Expression Atlas, AtGenExpress and ATTED-II, were added.
Function description
Expert-curated functional descriptions were collected from UniProt, TAIR and GeneRIF.
Manually curated regulations were collected from ATRM.
Protein-promoter and protein-protein interaction data were collected from BioGRID, IntAct, and BIND.
Mutation information was collected from UniProt, T-DNA Express, and RiceGE.
The best BLAST hits from GenBank, RefSeq, Swiss-Prot, TrEMBL and STRING.
Link Out
Links to well-known resources such as Phytozome, wikigene, iHOP, etc.
Publications related to the corresponding TF were collected from Entrez gene, GeneRIF, UniProt and ATRM.
Multiple Sequence Alignment
Protein alignment
Multiple sequence alignment of full-length transcription factors was inferred using T-Coffee (v9.03).
Domain alignment
Multiple sequence alignment of the domain was constructed using a hidden Markov model-guided method.
Phylogenetic Trees
Phylogenetic trees for TFs of a family within a single species, and for TFs within the same orthologous group, are inferred using MrBayes (v3.2.6) under the WAG model for 50,000 generations; the resulting tree is unrooted.
Phylogenetic trees for TFs of a family across all species are inferred using FastTree (v2.1.9) under the WAG model with 100 bootstrap replicates; the resulting tree is unrooted.
Quick Search
In the quick search box, you can search for a TF by its TF ID or common name.
TF Prediction Server
The TF prediction server has been upgraded in this version. The family assignment rules and thresholds determined by established methods (see details in the supplemental materials) are used to identify transcription factors in the input sequences. When users input nucleic acid sequences, ESTScan 3.0 is employed to identify the CDS regions of the input sequences and translate them into protein sequences. When the GC content of the input sequences is less than 48%, the ESTScan model trained on mRNA from Arabidopsis thaliana is used. Otherwise, the model trained on Oryza sativa is used. When "Best hit in Arabidopsis thaliana" is checked, links to the best hits in Arabidopsis thaliana are added to the results for predicted transcription factors. Users can use the server to identify TFs in multiple sequences.
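The GC-content rule for choosing the ESTScan model can be sketched as follows. Only the 48% threshold and the two species come from the description above; how the server actually computes GC content (per sequence or over the whole input, and how ambiguous bases are treated) is not stated, so this sketch assumes a per-sequence value that ignores any symbol other than A, C, G, T and U. The function names are hypothetical.

```python
GC_THRESHOLD = 0.48  # 48%, as stated in the description above


def gc_content(sequence):
    """Fraction of G and C among unambiguous bases (assumption: N etc. are ignored)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    return gc / total


def choose_estscan_model(sequence):
    """Name the species whose ESTScan model the stated rule would select."""
    if gc_content(sequence) < GC_THRESHOLD:
        return "Arabidopsis thaliana"
    return "Oryza sativa"


print(choose_estscan_model("ATATATGC"))  # GC content 2/8, below the threshold
print(choose_estscan_model("GCGCGCAT"))  # GC content 6/8, at or above the threshold
```

A sequence whose GC content is exactly 48% falls under "otherwise" and so selects the Oryza sativa model, matching the wording "less than 48%".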
(See also: TF binding site prediction, Regulation prediction, GO enrichment, TF enrichment.)