SUPERFAMILY 1.75 HMM library and genome assignments server

SUPERFAMILY 2 can be accessed from supfam.org. Please contact us if you experience any problems.

Domain assignment for ENSGGOP00000020213 from Gorilla gorilla 76_3.1

Domain architecture
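
The strong hits under "Domain assignment details" are listed by E-value, not by position along the protein. As a minimal sketch, the regions from that listing can be sorted by start coordinate to read the architecture from N-terminus to C-terminus. The names STRONG_HITS, parse_region and architecture are illustrative assumptions and are not part of SUPERFAMILY; the regions and superfamily names are copied from the listing on this page.

```python
# Strong-hit regions copied from the "Domain assignment details" listing.
# Each entry: (domain number, region string, superfamily name).
STRONG_HITS = [
    (1, "214-321", "Anthrax protective antigen"),
    (2, "1672-1752", "E set domains"),
    (3, "1933-2020", "E set domains"),
    (4, "1756-1838", "E set domains"),
    (5, "999-1076", "E set domains"),
    (6, "1081-1161", "E set domains"),
    (7, "1842-1927", "E set domains"),
    (8, "1503-1585", "E set domains"),
    (9, "3097-3218,3247-3388", "Pectin lyase-like"),
    (10, "908-986", "E set domains"),
    (11, "113-175", "E set domains"),
    (12, "1405-1483", "E set domains"),
    (13, "1590-1665", "E set domains"),
    (14, "1173-1230", "E set domains"),
    (15, "1248-1344", "Cupredoxins"),
]


def parse_region(region):
    """Split a region such as '3097-3218,3247-3388' into (start, end) pairs."""
    segments = []
    for part in region.split(","):
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def architecture(hits):
    """Return the hits ordered by the start of their first segment."""
    return sorted(hits, key=lambda hit: parse_region(hit[1])[0][0])


ordered = architecture(STRONG_HITS)
order = [number for number, _, _ in ordered]

for number, region, superfamily in ordered:
    print(f"{region:>20}  domain {number:>2}  {superfamily}")
```

Sorting on the first segment only is a simplification: domain 9 is discontinuous (two segments), and a fuller treatment would have to decide how to place split domains.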


Domain assignment details

Strong hits

Sequence:  ENSGGOP00000020213
Domain Number 1 Region: 214-321
Classification Level Classification E-value
Superfamily Anthrax protective antigen 1.83e-19
Family Anthrax protective antigen 0.012
 
Domain Number 2 Region: 1672-1752
Classification Level Classification E-value
Superfamily E set domains 3.64e-17
Family E-set domains of sugar-utilizing enzymes 0.021
 
Domain Number 3 Region: 1933-2020
Classification Level Classification E-value
Superfamily E set domains 1.23e-16
Family Other IPT/TIG domains 0.042
 
Domain Number 4 Region: 1756-1838
Classification Level Classification E-value
Superfamily E set domains 0.0000000000000056
Family Other IPT/TIG domains 0.017
 
Domain Number 5 Region: 999-1076
Classification Level Classification E-value
Superfamily E set domains 0.000000000000016
Family E-set domains of sugar-utilizing enzymes 0.012
 
Domain Number 6 Region: 1081-1161
Classification Level Classification E-value
Superfamily E set domains 0.0000000000000318
Family E-set domains of sugar-utilizing enzymes 0.037
 
Domain Number 7 Region: 1842-1927
Classification Level Classification E-value
Superfamily E set domains 0.00000000000034
Family NF-kappa-B/REL/DORSAL transcription factors, C-terminal domain 0.029
 
Domain Number 8 Region: 1503-1585
Classification Level Classification E-value
Superfamily E set domains 0.00000000000331
Family E-set domains of sugar-utilizing enzymes 0.026
 
Domain Number 9 Region: 3097-3218,3247-3388
Classification Level Classification E-value
Superfamily Pectin lyase-like 0.000000000016
Family Galacturonase 0.077
 
Domain Number 10 Region: 908-986
Classification Level Classification E-value
Superfamily E set domains 0.0000000000268
Family E-set domains of sugar-utilizing enzymes 0.02
 
Domain Number 11 Region: 113-175
Classification Level Classification E-value
Superfamily E set domains 0.0000000000467
Family E-set domains of sugar-utilizing enzymes 0.043
 
Domain Number 12 Region: 1405-1483
Classification Level Classification E-value
Superfamily E set domains 0.000000000182
Family E-set domains of sugar-utilizing enzymes 0.038
 
Domain Number 13 Region: 1590-1665
Classification Level Classification E-value
Superfamily E set domains 0.00000000038
Family Other IPT/TIG domains 0.049
 
Domain Number 14 Region: 1173-1230
Classification Level Classification E-value
Superfamily E set domains 0.00000000344
Family E-set domains of sugar-utilizing enzymes 0.02
 
Domain Number 15 Region: 1248-1344
Classification Level Classification E-value
Superfamily Cupredoxins 0.00000234
Family Plastocyanin/azurin-like 0.063
 
Weak hits

Sequence:  ENSGGOP00000020213
Domain Number - Region: 2152-2251,2335-2374,2409-2524
Classification Level Classification E-value
Superfamily Pectin lyase-like 0.0022
Family Chondroitinase B 0.088
 
Domain Number - Region: 1-83
Classification Level Classification E-value
Superfamily E set domains 0.0103
Family E-set domains of sugar-utilizing enzymes 0.052
 

Gene Ontology term assignment details

The top 10 most specific Gene Ontology terms for each namespace assigned to this domain architecture as determined by dcGO Predictor


Biological Process IC (bits) H-Score
Molecular Function IC (bits) H-Score
Cellular Component IC (bits) H-Score

Protein sequence

External link(s) Protein: ENSGGOP00000020213   Gene: ENSGGOG00000014869   Transcript: ENSGGOT00000027596
Sequence length 4086
Comment pep:known_by_projection chromosome:gorGor3.1:8:108733964:108874347:1 gene:ENSGGOG00000014869 transcript:ENSGGOT00000027596 gene_biotype:protein_coding transcript_biotype:protein_coding
Sequence
GTLITIQGRIFTDVYGSNIALSSNGKNVRILRVYVGGMPCDLLIPQSDSLYGLKLDHPNG
DMGSMVCKTTGTFIGHHNVSFILDNDYGRSFPQKMAYFVSSLNKIAMFQTYAEVTMIFPS
QGSIRGGTTLTISGRFFDQTDFPVRVLVGGEPCDILNVTENSICCKTPPKPHILKTVYPG
GRGLKLEVWNNSRPVRLEEILEYNEKTPGYMGASWVDSASYIWLMEQDTFVARFSGFLVA
PDSDVYRFYIKGDDRYAIYFSQTGLPEDKVRIAYHSANANSYFSSPTQRSDDIHLQKGKE
YYIEILLQEYRLSAFVDVGLYQYRNVYTEQQTGDAVNEEQVIKSQSTIIQEVQVITLENW
ETTNAINEVQKIKVTSPCVEANSCSLYQYRLIYNMEKTVFLSADASEFILQSALNDLWSI
KPDTVQVIRTQNPQSYVYMVTFISTRGDFDLLGYEVVEGNNVTLDITEQTKGKPNLETFT
LNWDGIASKPLTLWSSEAEFQGAVEEMVSTKCPPQIANFEEGFVVKYFRDYETDFNLEHI
NRGQKTAETDAYCGRYSLKNPAVLFDSADVKPNRRPYGDILLFPYNQLCLAYKGFLANYI
GLKFQYQDNNKITRSTDTQFTYNFAYGNNWTYTCIDLLDLVRTKYTGTNISLQRISLHKA
SESQSFYVDVVYIGHTSTISTLDEMPKRRLPALANKGIFLEHFQVNQTKTNGPTMTNQYS
VTMTSYNCSYNIPMMAVSFGQIITHETENEFVYRGNNWPGKSKIRIQRIQAASPPLSGSF
DIQAYGHILKGLPAAVSAADLQFALQSLEGMGRISVTREGTCAGYAWNIKWRSTCGKQNL
LQINDSNIIGEKANMTVTRIKEGGLFRQHVLGDLLRTPSQQPQVEVYVNGIPAKCSGDCG
FTWDSNITPLVLATSPSQGSYEEGTILTIVGSGFSPSSAVTVSVGPVGCSLLSVDEKELK
CQILNGSAGHAPVAVSMADVGLAQNVGGEEFYFVYQSQISHVWPDSGSIAGGTLLTLSGF
GFNENSKVLVGNETCNVIEGDLNRITCRTPKKTEGTVDISVTTNGFQATARDAFSYNCLQ
TPIITDFSPKVRTILGEVNLTIKGYNFGNELTQNMAVYVGGKTCQILHWNFTDIRCLLPK
LSPGKHDIYVEVRNWGFASTRDKLNSSIQYVLEVTSMFPQRGSLFGGTEITVRGFGFSTI
PAENTVLLGSIPCNVTSSSENVIKCILHSTGNIFRITNNGKDSVHGLGYAWSPSVLNVSV
GDTVAWHWQTHPFLRGIGYRIFSVSSPGSVIYDGKGFTSGRQKSTSGSFSYQFTSPGIHY
YSSGYVDEAHSIFLQGVINVLPAETRHIPLHLFVGSSEATYAYGGPENLHLGSSVAGCLA
TEPLCGLNNTRVKNSKRLLFEVSSCFSPSISNITPSSGTVNELITIIGHGFSNLPCANKV
TIGSYPCVIEESSEDSITCHIDPQNSMDVGIRETVTLTVYNLGIAINTLSNEFDRRFVLL
PNIDLVLPNAGSTTGMTRVTIKGSGFAVSSAGVKVLMGHFPCKVLSVNYTAIECETSPAA
QQLVDVDLLIHGVPAQCQENCTFSYLESITPYITGVFPNSIIGSVKVLIEGEGLGTVLED
IAVFIGNQQFRAIEVNENNITALVTPLPVGHHSVSVVVGSKGLALGKLTVSSPPVASLSP
TSGSIGGGTTLVITGNGFYPGNTTVTIGDEPCQIISINPNEVYCRTPAGTTGMVDVKIFV
NTIAYPPLLFTYALEDTPFLRGIIPSRGPPGTEIEITGSNFGFEILEISVMINNIQCNVT
MANDSVLQCIVGDHAGGTFPVMMHHKTKGSAMSTVVFEYPLNIQNINPSQGSFGGGQTMT
VTGTGFNPQNSIILVCGSECAIDRLRSDYTTLLCEIPSNNGTGAEQACEVSVVNGKDLSQ
SMTPFTYAVSLTPLITAVSPKRGSTAGGTRLTVVGSGFSENIEDVHVTIAEAKCDVEYSN
KTHIICMTDAHTLSGWAPVCVHIRGVGMAKLDNADFLYVDAWSSNFSWGGKSPPEEGSLV
VITKGQTILLDQSTPILKMLLIQGGTLIFDEADIELQAENILITDGGVLQIGTETSPFQH
KAVITLHGHLRSPELPVYGAKTLAVREGILDLHGVPVPVIWTRLAHTAKAGERILILQEA
VTWKPGDNIVIASTGHRHSQGENEKMTIASVSPDGINITLSNPLNYTHLGITVTLPDGTL
FEARAEVGILTRNILIRGSDNVEWNNKIPACPDGFDTGEFATQTCLQGKFGEEIGSDQFG
GCVMFHAPVPGANMVTGRIEYVEVFHAGQAFRLGRYPIHWHLLGDLQFKSYVRGCAIHQA
YNRAVTIHNTHHLLVERNIIYDIKGGAFFIEDGIEHGNILQYNLAVFVQQSTSLLNDDVT
PAAFWVTNPNNTIRHNAVAGGTHFGFWYRMNNHPDGPSYDRNICQKRVPLGEFFNNTVHS
QGWFGMWIFEEYFPMQTGSCTSTVPAPAIFNSFTTWNCQKGAEWVNGGALQFHNFVMVNN
YEAGIETKRILAPYVGGWGETNGAVIKNAKIVGHLDELGMGSAFCTAKGLVLPFSEGLTV
SSVHFMNFDRPNCVALGVTSISGVCNDRCGGWSAKFVDVQYSHTPNKAGFRWEHEMVMID
VDGSLTGHKGHTVIPHSSLLDPSHCTQEAEWSIGFPGSVCDASVSFHRLAFNQPSPVSLL
EKDVVLSDSFGTSIIPFQKKRLTHMSGWMALIPNANHINWYFKGVDHITNISYTSTFYGF
KEEDYVIISHNFTQNPDMFNIIDTRNGSSNPLNWNTSKNGDWHLEANTSTLYYLVSGRND
LHQSQLISGNLDPDVKDVVINFQAYCCILQDCFPVHPPSRKPIPKERPATYNLWSNDSFW
QSSRENNYTVPHPGANVIIPEGTWIVADIDMPSMERLIIWGVLELEDKYNVGAAESSYRE
VILNATYISLQGGRLIGGWEDNPFKGDLKIVLRGNHTTPDWALPEGPNQGAKVLGVFGEL
DLHGIPRSIYKTKLSETALAGSKVLSLMDAVDWQEGEEIVITTTSYDFHQTETRSIVKIL
HDHKILILNDSLSYTHFAEKYHVPGTGESYTLAADVGILSRNIKIVGEDYPGWSEDSFGA
RVLVGSFTENMMTFKGNARISNVEFYHSGQEGFRDSTDPRYAVTFLNLGQIQEHGSSYIR
GCAFHHGFSPAIGVFGTDGLDIDDNIIHFTVGEGIRIWGNANRVRGNLIALSVWPGTYQN
RKDLSSTLWHAAIEINRGTNTVLQNNVVAGFGRAGYRIDGEPCPGQFNPVEKWFDNEAHG
GLYGIYMNQDGLPGCSLIQGFTIWTCWDYGIYFQTTESVHIYNVTLVDNGMAIFPMIYMP
AAISHKISSKNVQIKSSLIVGSSPGFNCSDVLTNDDPNIELTAAHRSPRSPSEGGRSGIC
WPTFASAHNMAPRKPHAGIMSYNAISGLLDISGSTFVGFKNVCSGETNVIFITNPLNEDL
QHPIHVKNIKLVDTTEQSKIFIHRPDISKVNPSDCVDMVCDAKRKSFLRDIDGSFLGNAG
SVIPQAEYEWDGNSQVGIGDYRIPKAMLTFLNGSRIPVTEKAPHKGIIRDSTCKYLPEWQ
SYQCFGMEYAMMVIESLDPDTETRRLSPVAIMGNGYVDLINGPQDHGWCAGYTCQRRLSL
FHSIVALNKSYEVYFTGTSPQNLRLMLLNVDHNKAVLVGIFFSTLQRLDVYVNNLLVCPK
TTIWNAQQKHCELNNHLYKDQFLPNLDSTVLGENYFDGTYQMLYLLVKGTIPVEIHTATV
IFVSFQLPVATEDDFYTSHNLVKNLALFLKIPSDKIRISKIRGKSLRRKRSMGFIIEIEI
GDPPIQFLSNGTTGQMQLSELQEIAGSLGQAVILGNISSILGFNISSMSITNPLPSPSDS
GWIKVTAQPVERSAFPVHHVAFVSSLLVITQPVAAQPGQPFPQQPSVKATDSDGNCVSVG
ITALTLRAILKDSNNNQVNGLSGNTTIPFSSCWANYTDLTPLRTGKNYKIEFILDNVVGV
ESRTFSLLAESVSSSGSSSSSNSKASTVGTYAQIMTVVISCLIGRMWLLEIFMAAVSTLN
ITLSKY
Identical sequences ENSGGOP00000020213 ENSGGOP00000014539
