PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG021920t2
Common Name TCM_021920
Organism Theobroma cacao
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family GRAS
Protein Properties Length: 756 aa    MW: 85337.1 Da    pI: 6.5653
Description GRAS family protein
Gene Model
Gene Model ID      Type    Source  Coding Sequence
Thecc1EG021920t2   genome  CGD     View CDS
Signature Domain
No.  Domain  Score  E-value  Start  End  HMM Start  HMM End
1    GRAS    387.1  2e-118   382    753  1          374
              GRAS   1 lvelLlecAeavssgdlelaqalLarlselaspdgdpmqRlaayfteALaarlarsvselykalppsetseknsseelaalklfsevsPilkf 93 
                       l++lL++cA+av+++d + a++lL++++++ s  gd +qRla++f+ +L+arla+++s++yk l +++ts    s+ l+a+ l + ++P+ k+
  Thecc1EG021920t2 382 LRTLLIHCAQAVAADDRRSANELLKQIRQHTSRFGDGNQRLAHCFADGLEARLAGTGSQIYKGLVSKRTS---ASDILKAYLLHVAACPFRKV 471
                       6789************************************************************999999...9******************* PP

              GRAS  94 shltaNqaIleavegeervHiiDfdisqGlQWpaLlqaLasRpegppslRiTgvgspesg..skeeleetgerLakfAeelgvpfefnvlvak 184
                       sh+++N++I  a  ++ ++H+iDf+i +G+QWp+L++ L+ R+egpp+lRiTg++ p++g   +e++eetg+rLa +A+e++vpf++n+ +ak
  Thecc1EG021920t2 472 SHFICNKTINVASRKSMKLHVIDFGILYGFQWPTLIERLSLRSEGPPKLRITGIDFPQPGfrPAERVEETGRRLAAYAKEFKVPFQYNA-IAK 563
                       ***********************************************************9*****************************.7** PP

              GRAS 185 rledleleeLrvkpgEalaVnlvlqlhrlldesvsleserdevLklvkslsPkvvvvveqeadhnsesFlerflealeyysalfdsleaklpr 277
                       +++++++eeL+++++E ++Vn+ ++ ++llde+v+++s+r+ vL+l+++++P+++++   +  +n++ F++rf eal ++s++fd+le+ +pr
  Thecc1EG021920t2 564 KWDNIRVEELDIHEDEFVVVNCLYRAKNLLDETVAVDSPRNIVLNLIRKINPNIFIHGIMNGAYNAPFFVTRFREALFHFSSMFDMLETIVPR 656
                       ********************************************************************************************* PP

              GRAS 278 eseerikvErellgreivnvvacegaerrerhetlekWrerleeaGFkpvplsekaakqaklllrkvksdgyrveeesgslvlgWkdrpLvsv 370
                       e+ er+ +E+e+lgre+ nv+aceg er+er et+++W++r  +aGF ++p++ +++k a   +r ++ + + ++e+s +l++gWk+r ++++
  Thecc1EG021920t2 657 EDWERMLIEKEILGREALNVIACEGWERVERPETFKQWHARNLRAGFVQLPFGREIVKGATERVRSFYHKDFVIDEDSRWLLQGWKGRIIYAL 749
                       ********************************************************************889********************** PP

              GRAS 371 SaWr 374
                       SaW+
  Thecc1EG021920t2 750 SAWK 753
                       ***8 PP
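
The coordinates printed on each alignment row can be sanity-checked against the aligned string itself: the number of non-gap characters should equal end - start + 1. The following is a minimal sketch of that check, assuming each row has the four whitespace-separated fields shown above (name, start, aligned string, end) and that `-` and `.` mark gaps. The helper names `parse_row` and `is_consistent` are hypothetical and not part of PlantTFDB.

```python
from typing import NamedTuple


class AlignedRow(NamedTuple):
    name: str
    start: int
    aligned: str
    end: int


def parse_row(line: str) -> AlignedRow:
    """Split one alignment row into name, start, aligned string, and end."""
    name, start, aligned, end = line.split()
    return AlignedRow(name, int(start), aligned, int(end))


def is_consistent(row: AlignedRow) -> bool:
    """True when the non-gap characters span exactly start..end."""
    residues = sum(1 for ch in row.aligned if ch not in "-.")
    return row.end - row.start + 1 == residues


# Two query rows copied from the alignment above (first and last block).
rows = [
    "  Thecc1EG021920t2 382 LRTLLIHCAQAVAADDRRSANELLKQIRQHTSRFGDGNQRLAHCFADGLEARLAGTGSQIYKGLVSKRTS---ASDILKAYLLHVAACPFRKV 471",
    "  Thecc1EG021920t2 750 SAWK 753",
]

for line in rows:
    row = parse_row(line)
    print(row.name, row.start, row.end, is_consistent(row))
```

The same check applies to the GRAS model rows, where gaps are written as `.` instead of `-`. Match lines and posterior-probability lines (the ones ending in `PP`) have a different layout and are not handled by this sketch.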

Protein Features
Database         Entry ID  E-value   Start  End  InterPro ID  Description
PROSITE profile  PS50985   65.646    356    733  IPR005202    Transcription factor GRAS
Pfam             PF03514   6.9E-116  382    753  IPR005202    Transcription factor GRAS
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
Sequence
Protein Sequence    Length: 756 aa
MVMDPRFRGF SGFQLSNQTV SVFPSQPASV FPNQNSVAGP RFQNTYIDHN FREFDYHPPD  60
PTPSNMAPIS SLSHEEDPSE DCDFSDSVLR YINHILLEED MEDKSCMLQE SLDLQAAEKS  120
FYDVLGKKYP PSPSAEQNST FVYESGENPD DSFVGNYSSY FSSCSDGSSY VIDTGRMQNL  180
GDYSTTQAQS LPVSGMSQSS YSSSMASIDG LIESPNSTLQ VPDWNGEIHS IWQFRKGVEE  240
ASKFIPGSEE LFGNLEVCGV ESQESKGWTS GLVVKEEKKD EGEYSPTGSK GKKISRRDDV  300
ETEEERCSKQ AAVYSESIVR SEMFDMVLLC SSGKAPTHFT NLRESLRNGT SKNVRQNGQS  360
KGPNGGKGRG KKQNGKKEVV DLRTLLIHCA QAVAADDRRS ANELLKQIRQ HTSRFGDGNQ  420
RLAHCFADGL EARLAGTGSQ IYKGLVSKRT SASDILKAYL LHVAACPFRK VSHFICNKTI  480
NVASRKSMKL HVIDFGILYG FQWPTLIERL SLRSEGPPKL RITGIDFPQP GFRPAERVEE  540
TGRRLAAYAK EFKVPFQYNA IAKKWDNIRV EELDIHEDEF VVVNCLYRAK NLLDETVAVD  600
SPRNIVLNLI RKINPNIFIH GIMNGAYNAP FFVTRFREAL FHFSSMFDML ETIVPREDWE  660
RMLIEKEILG REALNVIACE GWERVERPET FKQWHARNLR AGFVQLPFGR EIVKGATERV  720
RSFYHKDFVI DEDSRWLLQG WKGRIIYALS AWKPA*
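
The sequence block above is printed in groups of ten residues with a running position counter at the end of each line and a trailing `*`. For downstream use it usually needs to be flattened into a plain string. The following is a minimal sketch of that cleanup, assuming the counters and spaces are pure formatting and that `*` marks the stop position; `clean_sequence` and `slice_region` are hypothetical helper names, and only the first two lines of the record are used as sample input.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop spaces, digits, and newlines, then strip a trailing '*' marker."""
    letters = re.sub(r"[^A-Za-z*]", "", block)
    return letters.rstrip("*").upper()


def slice_region(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


# First two lines of the protein sequence, copied from the record above.
sample = """MVMDPRFRGF SGFQLSNQTV SVFPSQPASV FPNQNSVAGP RFQNTYIDHN FREFDYHPPD  60
PTPSNMAPIS SLSHEEDPSE DCDFSDSVLR YINHILLEED MEDKSCMLQE SLDLQAAEKS  120"""

seq = clean_sequence(sample)
print(len(seq))
print(slice_region(seq, 1, 10))
```

Applied to the full block, `slice_region(seq, 382, 753)` would return the segment reported for the GRAS domain in the Signature Domain table, since those coordinates are 1-based and inclusive.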
3D Structure
Structure
PDB ID  E-value  Query Start  Query End  Hit Start  Hit End  Description
5b3h_A  4e-48    365          754        1          379      Protein SCARECROW
5b3h_D  4e-48    365          754        1          379      Protein SCARECROW
Functional Description
Source   Description
UniProt  Probable transcription factor involved in plant development. {ECO:0000250}.
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source     Hit ID          E-value  Description
Refseq     XP_007026998.2  0.0      PREDICTED: scarecrow-like protein 9
Refseq     XP_017976816.1  0.0      PREDICTED: scarecrow-like protein 9
Refseq     XP_017976817.1  0.0      PREDICTED: scarecrow-like protein 9
Swissprot  O80933          0.0      SCL9_ARATH; Scarecrow-like protein 9
TrEMBL     A0A061ER22      0.0      A0A061ER22_THECC; GRAS family transcription factor isoform 1
STRING     EOY07500        0.0      (Theobroma cacao)
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT2G37650.1  0.0      GRAS family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]