PlantTFDB
PlantRegMap/PlantTFDB v5.0
Plant Transcription Factor Database
Transcription Factor Information
Basic Information
TF ID Thecc1EG019956t1
Common Name TCM_019956
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family GRAS
Protein Properties Length: 1660 aa    MW: 191184 Da    pI: 6.4778
Description GRAS family protein
Gene Model
Gene Model ID | Type | Source | Coding Sequence
Thecc1EG019956t1 | genome | CGD | View CDS
Signature Domain
No. | Domain | Score | E-value | Start | End | HMM Start | HMM End
1 | GRAS | 126.9 | 2.3e-39 | 7 | 304 | 3 | 333
              GRAS   3 elLlecAeavssgdlelaqalLarlselaspdg.dpmqRlaayfteALaarlarsvselykalppsetseknsseelaalklfsevs.Pilkf 93 
                       ++L++cA+a+++++l+ a++lL+r+ +la+ +       +++yf+eAL +r ++        +++ ++              f+ +s P + f
  Thecc1EG019956t1   7 DALVACAKAIQDENLTVADSLLERIWNLAAAQSwPGESDVVKYFAEALVRRAYG--------ISSASA-------------NFNLLSpPPIYF 78 
                       68**************************88888678899**************9........222222.............233333134556 PP

              GRAS  94 shltaNqaIleavegeervHiiDfdisqGlQWpaLlqaLasRpegppslRiTgvgspesg..skeeleetgerLakfAeelgvpfe.fnvlva 183
                              aI  a  g++r H i f       W  L+++La+ +++  s+R+ +++sp  +   k + e+ ++ L+  A e g+++e  +v++a
  Thecc1EG019956t1  79 LDNFSCDAINTACMGKKRFHLITFLFLPSDDWTYLFRSLANASGNFLSVRVSVIVSPFLEkiVKIQQEKSKHDLTTAAMERGIKLEdLRVVYA 171
                       6666679*************************************************9777778888899999*************85788899 PP

              GRAS 184 krledleleeLrvkp..gEalaVnlvlqlhrlldesvsleserdevLklvkslsPkvvvvveqeadhnsesFlerflealeyysalfdsleak 274
                       ++l d++ ++ ++ +  +Ea++V   ++lh+ll++   +e      L  +++++P++v++ eq adhn+++F +r+ ++++yy   fd +e++
  Thecc1EG019956t1 172 NSLGDVDASKADFTRttDEAVIVYYRYKLHELLADVRVMER----ELLKLRQINPEIVIIEEQYADHNDSNFIKRLEKSFQYYFNRFDFYEVT 260
                       9********99999889****************77777777....55568******************************************9 PP

              GRAS 275 lpreseerikvErellgreivnvvacegaerrerhetlekWrerleeaGFkpvplseka 333
                                        r+ivn+v ceg++r erh+tl++Wr+ l++ G+ pvpl  ++
  Thecc1EG019956t1 261 ---------------YCRQIVNIVGCEGTDRLERHQTLAQWRSLLRANGLLPVPLAPDI 304
                       ...............4699***********************************98765 PP

Protein Features
Database | Entry ID | E-value | Start | End | InterPro ID | Description
PROSITE profile | PS50985 | 20.234 | 1 | 325 | IPR005202 | Transcription factor GRAS
Pfam | PF03514 | 7.9E-37 | 7 | 304 | IPR005202 | Transcription factor GRAS
PROSITE profile | PS50808 | 10.756 | 696 | 750 | IPR003656 | Zinc finger, BED-type
SMART | SM00614 | 7.6E-17 | 696 | 746 | IPR003656 | Zinc finger, BED-type
Pfam | PF02892 | 7.8E-9 | 699 | 743 | IPR003656 | Zinc finger, BED-type
SuperFamily | SSF57667 | 1.9E-7 | 699 | 747 | No hit | No description
SuperFamily | SSF53098 | 2.65E-40 | 836 | 1291 | IPR012337 | Ribonuclease H-like domain
Pfam | PF14372 | 1.1E-15 | 1068 | 1159 | IPR025525 | hAT-like transposase, RNase-H fold
Pfam | PF05699 | 8.6E-17 | 1209 | 1290 | IPR008906 | HAT, C-terminal dimerisation domain
PROSITE profile | PS50600 | 16.761 | 1443 | 1627 | IPR003653 | Ulp1 protease family, C-terminal catalytic domain
SuperFamily | SSF54001 | 1.18E-33 | 1474 | 1655 | No hit | No description
Pfam | PF02902 | 3.5E-24 | 1474 | 1653 | IPR003653 | Ulp1 protease family, C-terminal catalytic domain
Gene3D | G3DSA:3.30.310.130 | 1.1E-12 | 1502 | 1622 | No hit | No description
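The Pfam hits listed above can be read as a linear domain layout along the protein. The following is a minimal sketch, not part of the database record, that sorts the hits by start position, reports each span as a fraction of the 1660 aa length, and checks neighbouring hits for overlap. The function names `domain_layout` and `overlaps` are hypothetical, and the coordinates are assumed to be 1-based and inclusive.

```python
# Pfam rows copied from the Protein Features table:
# (entry ID, start, end, description).
PFAM_HITS = [
    ("PF03514", 7, 304, "Transcription factor GRAS"),
    ("PF02892", 699, 743, "Zinc finger, BED-type"),
    ("PF14372", 1068, 1159, "hAT-like transposase, RNase-H fold"),
    ("PF05699", 1209, 1290, "HAT, C-terminal dimerisation domain"),
    ("PF02902", 1474, 1653, "Ulp1 protease family, C-terminal catalytic domain"),
]
PROTEIN_LENGTH = 1660  # from the Protein Properties line


def domain_layout(hits, length):
    """Return hits sorted by start, each with its span and fraction of the protein."""
    layout = []
    for entry_id, start, end, desc in sorted(hits, key=lambda h: h[1]):
        span = end - start + 1  # assumes 1-based, inclusive coordinates
        layout.append((entry_id, start, end, span, span / length, desc))
    return layout


def overlaps(hits):
    """Return pairs of entry IDs where a hit starts before its predecessor ends."""
    ordered = sorted(hits, key=lambda h: h[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


for entry_id, start, end, span, frac, desc in domain_layout(PFAM_HITS, PROTEIN_LENGTH):
    print(f"{entry_id}  {start:>4}-{end:<4}  {span:>3} aa  {frac:5.1%}  {desc}")
print("overlapping neighbours:", overlaps(PFAM_HITS))
```

Only neighbouring hits are compared, which is sufficient here because the five Pfam ranges are short relative to the gaps between them; a general check would need to compare every pair.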
Gene Ontology
GO Term | GO Category | GO Description
GO:0006355 | Biological Process | regulation of transcription, DNA-templated
GO:0006508 | Biological Process | proteolysis
GO:0003677 | Molecular Function | DNA binding
GO:0008234 | Molecular Function | cysteine-type peptidase activity
GO:0046983 | Molecular Function | protein dimerization activity
Sequence
Protein Sequence    Length: 1660 aa
MSSSALDALV ACAKAIQDEN LTVADSLLER IWNLAAAQSW PGESDVVKYF AEALVRRAYG  60
ISSASANFNL LSPPPIYFLD NFSCDAINTA CMGKKRFHLI TFLFLPSDDW TYLFRSLANA  120
SGNFLSVRVS VIVSPFLEKI VKIQQEKSKH DLTTAAMERG IKLEDLRVVY ANSLGDVDAS  180
KADFTRTTDE AVIVYYRYKL HELLADVRVM ERELLKLRQI NPEIVIIEEQ YADHNDSNFI  240
KRLEKSFQYY FNRFDFYEVT YCRQIVNIVG CEGTDRLERH QTLAQWRSLL RANGLLPVPL  300
APDIWSGEHE DNGCVVFQND DGLLHFTSAW KLTDAVDHFN PISYNPIQGF NPNPALEDTV  360
RTLQVDRQAS SLNGLAAFAE IYDMLEDVCL KYELPLALTW VKGTPNGIMS GLNKKRSLSI  420
ETAYSYINCC YYYYYDYYVE KISQYRSFMQ ECAIYDIQEG QAIAGQALQS NEPFLFEPNI  480
TELRSNPFAE AAQKSGLHAA LAICLVNHYT DDVYILEFFL SSSEEKLEEP KSLALRIFED  540
LKKMKTKFVK LRVHGTEVGL QEEAIPNIPW EEMPMRSSSP ATSNDQFLNS NASRSLNVVE  600
LKDRHVVEIQ GPNGQEAATS NFHPAYLSIH ASSMAGTEHF NATNLRSYNG LLETHEPQLQ  660
EITEKNWISQ TISNIDHEIV KANRENSALP RTKQRKLVSK VWKEFTKFEE NGKQLAKCNH  720
CSKEFTGSSK SGTTHLKNHL ERCPRKKNEY QERQLKLSVK TGDLTNRDTS EGNSMFDQEK  780
SRLDLVKMII KHQYPLDVAE QEFFKSFVQN LQPMFEFQSQ ATIISDIHHI YEEEKKKLQQ  840
CFAQFACKFS LTISLWKDNL RKNAYCCLIA HFVDDDWELR RKILVFKNLE HNYGTGSIIR  900
VIQNSISEWN MSEKVCSISV DNSSLNNGIL QQIKESCLSD QVSLPSCHYY SSCTLIQDGL  960
HEIDDILLKL RKSIEYVTEL EHGKLKFQEA INQVTLQGGK STDYGPLRLD SNFSILDSAL  1020
ESRQIFCQLE QIDGHFKVNP SIEEWERALI LHSYLKGFYD NLSSFRQTHS STANTYFPQL  1080
CDMYKKFLQM EKKNYPFMMK RKFDDHWSLC NLVFAIAALL DPRLKFKFVE FSYGEIYGRD  1140
SKRQLKRFHR DLMDIYFEYA YEPRNRTTSA SVGCLTRQST ESANDSILDS FSRYASASNF  1200
NEVSSRKSDL DCYLEEPLLH LDGAFFDVLD WWRVNSERFP TLGRMAHDLL AMPVLVVPPC  1260
SDFSAVITNP AHNGLNPETM EALVCSHNWL EMPKGNDRAN HAPMQNTAKR KWEEKETREV  1320
KSCKNWNSEE TNNADKAKAS YKMLTRALPL ENDRQEGRPL KSSEPNHGKD TSGLIEIPNG  1380
SPSFDNQSEF QCYSSDESDG EIAGREQGEW REDDVRRYLL LPLTEKGRKR LNKWRNHKMS  1440
GKLIGRDKEF GVLDYKLAPL LTVPHGVETQ VKYYIDDSVV NTFFKLLKKR SDRFPKAYVS  1500
HYSFDSWIAT YLIEGSRSES QVFSWFKDEK LKDVQILFLP ACLSAHWVLF CVDTKKRTFS  1560
WLDSNISSRT SNVAEKQAIL GWFKRLLLPA FGYQNANEWP FEIRSDIPEQ KNGVDCGLFV  1620
MKYADCLTHG EFFPFTQQHM PYFRLRTFLD IYRGRLHSQ*
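The sequence above is displayed in groups of ten residues with a running count at the end of each full line. As a minimal sketch, assuming that layout holds for every line, the block can be collapsed into a single string as follows. The helper name `clean_sequence_block` is hypothetical, and only the first and last displayed lines are used here as sample input.

```python
import re


def clean_sequence_block(block: str) -> str:
    """Join a numbered, space-grouped protein block into one string."""
    pieces = []
    for line in block.strip().splitlines():
        # Drop the trailing running residue count, if the line has one.
        line = re.sub(r"\s+\d+\s*$", "", line)
        # Remove the spaces that separate the ten-residue groups.
        pieces.append(re.sub(r"\s+", "", line))
    return "".join(pieces)


# First and last lines of the displayed sequence, used as sample input.
first_line = "MSSSALDALV ACAKAIQDEN LTVADSLLER IWNLAAAQSW PGESDVVKYF AEALVRRAYG  60"
last_line = "MKYADCLTHG EFFPFTQQHM PYFRLRTFLD IYRGRLHSQ*"

sample = clean_sequence_block(first_line + "\n" + last_line)
print(len(sample), sample[:10], sample[-10:])
```

The trailing `*` is kept as displayed; strip it with `sample.rstrip("*")` if a downstream tool expects residues only.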
3D Structure
PDB ID | E-value | Query Start | Query End | Hit Start | Hit End | Description
2ckg_A | 2e-20 | 1474 | 1656 | 47 | 224 | SENTRIN-SPECIFIC PROTEASE 1
2ckg_B | 2e-20 | 1474 | 1656 | 47 | 224 | SENTRIN-SPECIFIC PROTEASE 1
2ckh_A | 2e-20 | 1474 | 1656 | 47 | 224 | SENTRIN-SPECIFIC PROTEASE 1
6nnq_A | 1e-20 | 1474 | 1656 | 46 | 223 | Sentrin-specific protease 1
Regulation -- PlantRegMap
Source | Upstream Regulator | Target Gene
PlantRegMap | Retrieve | -
Annotation -- Protein
Source | Hit ID | E-value | Description
Refseq | XP_017974961.1 | 0.0 | PREDICTED: uncharacterized protein LOC18602419 isoform X1
Refseq | XP_017974962.1 | 0.0 | PREDICTED: uncharacterized protein LOC18602419 isoform X1
TrEMBL | A0A061EJG7 | 0.0 | A0A061EJG7_THECC; Uncharacterized protein
STRING | EOY04778 | 0.0 | (Theobroma cacao)
Orthologous Group
Lineage | Orthologous Group ID | Taxa Number | Gene Number
Malvids | OGEM8809 | 4 | 20
Best hit in Arabidopsis thaliana
Hit ID | E-value | Description
AT2G01570.1 | 3e-44 | GRAS family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]