SUPERFAMILY 1.75 HMM library and genome assignments server

SUPERFAMILY 2 can be accessed from supfam.org. Please contact us if you experience any problems.

Domain assignment for XP_004995573.1.12839 from NCBI 2017_08 genome

Domain architecture


Domain assignment details

Strong hits

Sequence:  XP_004995573.1.12839
Domain Number 1 Region: 6507-6618
Classification Level | Classification | E-value
Superfamily | PDZ domain-like | 0.00000000000123
Family | PDZ domain | 0.022
 
Domain Number 2 Region: 4040-4158
Classification Level | Classification | E-value
Superfamily | Carbohydrate-binding domain | 0.000000165
Family | Cellulose-binding domain family III | 0.016
 
Domain Number 3 Region: 2280-2368
Classification Level | Classification | E-value
Superfamily | Carbohydrate-binding domain | 0.0000105
Family | Cellulose-binding domain family III | 0.014
 
Domain Number 4 Region: 5519-5613
Classification Level | Classification | E-value
Superfamily | Carbohydrate-binding domain | 0.0000471
Family | Cellulose-binding domain family III | 0.024
 
Weak hits

Sequence:  XP_004995573.1.12839
Domain Number - Region: 5696-5760
Classification Level | Classification | E-value
Superfamily | Type I dockerin domain | 0.000353
Family | Type I dockerin domain | 0.0045
 
Domain Number - Region: 2441-2503
Classification Level | Classification | E-value
Superfamily | Type I dockerin domain | 0.00147
Family | Type I dockerin domain | 0.0045
 
Domain Number - Region: 4230-4299
Classification Level | Classification | E-value
Superfamily | Type I dockerin domain | 0.00366
Family | Type I dockerin domain | 0.0069
 
Domain Number - Region: 3202-3306
Classification Level | Classification | E-value
Superfamily | Carbohydrate-binding domain | 0.0977
Family | Cellulose-binding domain family III | 0.059
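
The strong and weak hits above are not listed in the order they occur along the protein. As a minimal sketch (not part of the SUPERFAMILY output), the Python below re-orders them by start position and reports each region's length. The regions, superfamily names and E-values are copied from the listing. The tuple layout and function names are illustrative, and the length calculation assumes that region coordinates are inclusive residue positions.

```python
# Hits copied from the listing above:
# (start, end, superfamily, superfamily E-value, hit strength).
HITS = [
    (6507, 6618, "PDZ domain-like", 0.00000000000123, "strong"),
    (4040, 4158, "Carbohydrate-binding domain", 0.000000165, "strong"),
    (2280, 2368, "Carbohydrate-binding domain", 0.0000105, "strong"),
    (5519, 5613, "Carbohydrate-binding domain", 0.0000471, "strong"),
    (5696, 5760, "Type I dockerin domain", 0.000353, "weak"),
    (2441, 2503, "Type I dockerin domain", 0.00147, "weak"),
    (4230, 4299, "Type I dockerin domain", 0.00366, "weak"),
    (3202, 3306, "Carbohydrate-binding domain", 0.0977, "weak"),
]


def in_sequence_order(hits):
    """Return the hits sorted by start position along the protein."""
    return sorted(hits, key=lambda hit: hit[0])


def region_length(start, end):
    """Length of a region, ASSUMING both coordinates are inclusive."""
    return end - start + 1


ordered = in_sequence_order(HITS)
for start, end, name, evalue, strength in ordered:
    length = region_length(start, end)
    print(f"{start:>5}-{end:<5} {length:>4} aa  {strength:<6} {evalue:.3g}  {name}")
```

Read in this order, each of the three Type I dockerin weak hits comes directly after a Carbohydrate-binding domain hit, and the PDZ domain-like hit is the last one before the end of the 6675-residue sequence.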
 

Gene Ontology term assignment details

The top 10 most specific Gene Ontology terms for each namespace assigned to this domain architecture as determined by dcGO Predictor


Molecular Function IC (bits) H-Score
Cellular Component IC (bits) H-Score
Biological Process IC (bits) H-Score

Protein sequence

External link(s) XP_004995573.1.12839
Sequence length 6675
Comment hypothetical protein PTSG_03839 [Salpingoeca rosetta]; AA=GCF_000188695.1; RF=representative genome; TAX=946362; STAX=946362; NAME=Salpingoeca rosetta; strain=ATCC 50818; AL=Scaffold; RT=Major
Sequence
MASPQARTQQQQQQQHHKHDQQQQPQHKHQRPRFGRSSLGPAIVALLLLFLATGPAALMQ
VPAAAAAAAAAQQHHVLAQSPILKKDTQAPCSPGQFWSIDLQACAPVTTCAPGTEYQLAQ
PTASSDRTCAHVSTCGWTEIELSPPTPTSDRLCMALAASSGPRPSPVASPPPPQQQQQQQ
QQQQQQQQQQQQQLSSSMHLGDTRRHAGEAYVGGAAFQGGAQSNSREWLQALEEHSQQRM
RRQVGDSSTPLPTPVPSSTLLAPSSSSTVVPLPSSLLSATVSPTPTPAASSFSFVPTPSP
SASATLAPSPSISTSIAAPSTLIATPTTTVVAPSPTPTPTSSALVSTPTPTPTPVLCPDE
FYLLAGSCVPCRVCNATQFEESPCADNQNRVCRPLRVCGPEEYELQPPTPTSNRECTDLV
QQCLPITPNYNWLSLHVLPEDSNTISAQFLASLGAAVAGDMLISSEGAVTATLNASGTWE
NLDAFPSVYARRFDLSTAATNTSEWCVLGVPVSPDPGLSLSVGPNRIGITLPQQTPIQTA
FANFPGLVDNTAILSQPENGGAATFFNGTWYTSTDLSNHLIPGVGYVLFAQQPATLVLAS
NSSDTSDDRRRRRRDLGYGSSAAGALGEFDIGDAAAWTSQSLHTLPQSSQQQHQRQQQPA
VLSTGSDAWLVSSKKHPQQQGRVHAAADHRRATTTVTIFAQVFEDGNLISAHGSALSVWL
LDPLMTSASSATFINATYVADGPAGPMFMITAVVPAPSTAAPAPAASASQQLQQQQQQQL
LLQFAVLDADTQRVLAVEDTLSLPSDADGSAITHGHAFRPIPLSVSAPASATGNTPPSQH
QQQQQQQQQQQQQQQQQQQQQQQQQQHTFGADDFTSTAIDALSSIQLSPDTVLATTYVQV
LDEFGPVTDDGTQLAVVCGAVRVAVALLADGPAGPTFQIPMIAQSRFETHARHGCEFRFE
VQRPHTASPTVSTAGDGGDGVLECIPATSIDLLHGSGAAYRPVHCRVASETPEEALTQPA
RGDGEHAAMQLLGSSVAGSRAPLFAQFADSHQQHQQHMQQHHRRHRREASMWTPRPGLRY
AMIVHATVRYKSDLSYVQQLGSRLACVDADGVINAADLSIGPTGEPWFQLACASNLPSTS
DTIMLQVFDANSSTTVQTALPLAFAAETVLGAIDAPLVLDIETVCGPGEFELRPPTNTTD
RVCRRVRECNELQYQQQAPTATSNRICEYITNCTVNQYERVPPTNTSDRQCGTIAAYVPP
LIRVTEAINLLTFGLLGRTEDNATTIEYGAPVASAPNAAGSLTVGLGPVTASVSTRHTPQ
PAASVRIVLAGTGDVWPDDPRVDARAQVYDAAGNALTASAQVILRVLPGPQLAGADPTTV
ETTCTTTARCRLRTTLPSAWFTVGGNATVEAALAGPAPVFQSIGVLMVQPIITYSYDRDV
VVGIPNNNLFVGSTLTAVVRGDAGDYAVRSFQLTFDAPPGLRIEGVTYDGATWAATVETQ
GTSRATIVANPANPDERPSDMVFADEELCRVSLRIDNSALEGSTLQLTTTINFLSNVKGE
KIRPRDTTTPVVAQAADRRGLHVGAAHLFVTPNVPTGIIAAVPNAELLNFATVSNAAVTS
AMTVYRVTANGALTAVTSGGLTCVSANTGRVRVAVDCSAVRLDGTETLAGPASIFVALSE
RGLLLRDMPTVTVWQLDGQPRIEVADPVLNRVDGWAPDSNPDCLGPVYQHAEVTVLANFT
DASALLLDVDVSTRAAPRLTATDTRVVAFDRGNVVGQNTGFTFIHVLSAAGVRLGAAPVQ
VSDDSVVPARLDVTVVESIALQARSGAFDPLDTSTAVAAIQNTLTSEFQQAHVRVDLHLS
DDALYTLSPQDFELRTLNPLVLGVVGGPSQASTVQALGTGVGPVLYVNWTSPATCDRVFS
LATTASVNVTLPAPRDVAVTLSVNRITPVGDAASLPGVGIATQATVTVVVEYVDDSTQDL
SLDDRLVVVAQDPSALTVTRGPGSSRFAVAATNRSGTFQLLVSFQHLSITQTASIDVALA
DTLTLTTAPYPAYPGSSALSATTLSLIGGTGTFQQAQLTARLSLLNGPTFDVTTAPAIRY
SALAPNSTTEPATNITVSPAGRISPTAATTADIAASFQDIAASKVRVLVSSTPVLIAEFI
NVDVPYFLVGMRGVGSTQVRFGVRLSDDRQLPASFLFPNNNMFRFSAAFATFSLDTIAAS
VSEGGRVLLHDNLPEEATLTIAAAHNATLAVSRPFTCNLNPEFGDMDAGETQGVPLPPTV
PGATVAIPVRINTGSASVQAIKLSLQYDPAHLDFVSAVAGADWPGGAFLFTANDPPGIVD
VGGVATPFTGSAAEIAVVTLRVLEAARDMRVPLLGEVVELSDDNNDPVGPFRGVPSVAAD
LEVIVGQARRRRSEDGAGMSAAIIDAAPTLTRLRRRTRTNPLGDANADGVFSLADASFAQ
QYLTRLIFDPTYGSDFTQEQLDALDIDHNNDRNPDDVFYLASVDFRNYRFFNNISITPVS
NNTGCHVTLQTRVFEGGDVPANGNNTFVFFDLESANTAFAGFVNNTILLQGQLETRNKGP
GLNGALLRAEFVGNGTFLIEASVPSAFDNIGVSLLIATTDATDQGSNARTTRILGGSVDP
PFLYTGTVSFTLNVNPQTSVPIRLSGGYDPLDTFDNMLGSENCHRSFPCDMDEYVVVPAT
PTSAATCAACRVCNATEYETVECTMDTNRECAACDVCIGDTYQAQACTPHSNTLCLPCGN
CTSSQYIVAECTDTTPTVCGDCGACNPDEYVSANCTEFAPTMCMDCDTCDPGQYIQHPCT
TYFNTDCEDCRVCNATTEYEVSPCLLTQNRECELCDICGDEQYETTPCGAFSNRECANCS
VCLMGQYENRFCDNCSPCGSGQYILVPCQDTMDTICDDCDTCDPGFFIERNCTEDANTKC
ARCGTCPTETYISSPCTMFKNTTCSACRVCEDDEYQDRDCENGLNRICKEIRNCQADEIE
VAPPTPTSDRVCVKVVPGALPPVEFLPTLGLGFYLNNTYAVSNQVSQRAIGEFWDGSTGQ
SGTVRVQLADTTASATYMQDRLPAVTVSGVIADRVVWYDDRTVHCRVQVRDRRFDAHTES
ARVDLEATPVGFSGTAVTASCATATTGQCTIDLSVPLSWFGVVGQDGDRTVSLAYGIRGS
GSTVDAGTVTLRPAPEVTVSRDVVAVVPHRDLYRGDRFTVPVQANAEYYINTWQLRIDTT
AGEVEIESIVYDEDVWTASVLIDDDKTAASINGGPADTLSRPTGPSPGAEALCTVNLRVT
GDAPLSQASMVLVTVLSLFDSKGQQPLLTPTAATTVDRDGTRRGAGAVYIVEDSVMGVFA
WAGQGSLVNTVPLNGVGITSVVSVCAVLKSSGGCPALTSGLTCVSDEPGAVAVGGGCARL
ELSSSTDMGSAQVNVSVSHGTSGHSTLFAVRVWYPESVMLTVDDDRLSRVAGWRDSSAGC
MDRFQSSRVRARGVFVSGADRFEAVVDEQVSGYLESNDSGVMTVAGLEVQGVSAGTASVR
LQTDAGVSVLVQVADETVSVQRLQAHVVGDVLLSGVPSSVELGDEVAAVAMVQQRLTQEF
ERADVYTEAVLSDGTRLGVTEAMGLVLGSRDPVVVRIVNETMRAADEDIEAVGTGQGLLL
EATWYDESGCNGTAPVIGAGLGQVDIVLPLPERVIVDLSSSSMTYAEDAAASIPGGPPLS
VSVRVYFEFADGRRQEMTTDGRTMYDGVSADVSDVFRVVRGESAVRLVPTGNVGTGQLRV
NFTHVNVTGSASVEVVVATGLEITPRPFPTYSGSGSVVETELSPIAGSGVYQRAVFDAVI
TISDGRQYGVATAGAMAYELSRPDVLAVVGTNRVQRVGTATGPASVVVRGVFGSTFAGQS
NVTSAGVEMRLSNETLSVVSVDGLSFVSTLSGQAGVSRSQAVCGITLSDGTRWTSGALFP
SGTLALSNVLMFETSEVLAASVNGASGLVQLEGNWYEQVTLTARAVDTGASASRMFACNL
SPVTGDIDLGSATGVPVPAVSVGARFSLSVRVALGSRELASMDVTVEYDAGLLRAVGVET
GSGWPGGPFTGNLDTPGMARFGGASDPVSGTRELAVVEFEVLGDGLASLGGFVTTVSDAD
GAAIIPDGTEFVAGAVEVLTGSGPGRRRRDIAASASADGTTVTAALLSGREWVQAARERR
SERRREQQAVRVRRADSCTPKPCLTCDPVRQTGDTDGDCLFDVRDVSFARMYLNWVALED
TAQLAQVTEPQLVNLDADQNDAVNTQDVDFLLKVNFGLYHFFANLSVESVGESLDACELS
ISLRLIAGGAGAGDSPANTNQTFLYFDIESEDASLAPAFASSVVVVGSAVAGLNKGAGYN
GGLWRAAPLGDEGYFGVRVATGISLENIGLSLIQGTVSPSGVGANARTTPILGGSPDPAK
FRFTSPLRFDLVVNAQTTIEFVAGAGYDPFTAFNNTLSSLQCQAAKGCPVGFRLVENATL
TSPPVCEPIPVVTYSHPPTFSLLPPFSPSLFATADDPTRGQTQLGDVSTVFTSPRFSAAS
ISDVDVFVTPQVLQTDNPTLTVLAQGRDAAGSSFFVDGDVVVVLRDRASGQEVVDQCTAG
VDGTCALTATVPQTWLDAAPGTIDVLAYAGGSPGSARVFASVDVAAGPSPLQSRAVTAVL
PRQTLYPGTVYEAEVYANFARSIAAFSLTVNVTQSLVVERLRVDEQYWNAVSDVFSPTTA
AINAIVSDDSNLDTGEQLLCKIEFVVNETVTADMGSMSVVVEELADEFSNRLSLRGTFDP
TFAYYETRDGTQSGPRGSVYIGDDVVVGVFATLQSSAFNTASLGLGATVVPVQLNALTRS
GQVLTGAQVSAADAAACSVVPASAGTLSGDCTQLVLNATQSVPGQTMRVSFGAASSGAQA
TNYTLEAVIADQLSVVLDDAELNAVASPADCPTAVYQGSRVRVTAVIERADMTSIGVDVT
QRVLPFLRVRDSDVATLNTTLDGTLYVVGTAPGTTFVDVLFLGGDVAATAAVNVSSTPVG
IERIGVEMVTGLGAETALTASPYGFAVNVTKVAQLEFEGDVAAFKASVLYADGLREDLDH
RARGVVLAFNSTDVLQVDGTAAAGGGPALSALAVGSGVVAVNVTWTPPNCTQALAVGMQD
VVVALQQPVRVIVGVSANVLTPASDPAACAGVATRTSLVVQLEYADGHRVDATADPRTVV
DVDDAVLDAEDRLSLAPDALPGAIEVVANATATAGAAGVRVSFTHLPLAETVSVAVVRAE
QLSVLLHPYPAFAGSTNVDASPLNPIADTSVYQQALASVVLVVSSPVGQVDVSDTAVVEA
ANSSVSVVAGSRMVTVEDPAAPSGLVRATLCSISNTTQIGISQTPVFVRDIDEVSFPDTF
EGERGSQGQVTTGVLLDDGTRLPADVLTPRLRTNLPLLQFTLSAVNNGGRNGPALVVDNA
TGVVTLLGNSAGYVELVVQSRLNPNVTRSIVMAANLLPAVGDIDIGSSTGPADGGPHRVG
DEFTLAVRVNTGTYTVAAVDVTLQYNASVVEAVRVERGPDWIGGTFIATLNDPVGEIAFG
GIASGIRGMDTIALVTLRVLDAVSDPMITVMSGTINTLATANQDPVGDDTPRPMVAGDIP
IRVQPVSRRRRSDSSGSTGAMLTHAAAAATASSSSSSAPPEAVAAAARVRRAMCSEYVRG
DTNLDCAFDLNDVTFLQSYLLSLHSPSGPLSVTPEQLEELDADNSGTPDTLDALFLLRTN
FRQLRFVSTPRLSRRPAMGPTGACTYRLVVDVYTKSGPAPPEQTDVFVDFESLGMDLGAL
VSNLSFSQGSLDQLKPSPFDGFLARAQPSAANSGVWELEFTTREVLGNLTLTLIQATTDA
LGNTSPSRSLVLTGDRVSPFTYPNSLVMNVTHGAGGVAQSTVRFGPSGYNPLVETSVFDG
PCSPILITTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTTTSTSTTTTSRRPPITTSSTS
TSVPITIGPPQKSETFFESAAGIIVIVVVVLLLLCCCCCLFYVWRRRMPEEGDEESGHGM
FGRSMTYDVRDHKPHRLEHDTRNVTLVGKQKRGSLAASQALDEDNFLQVSSFPEVVSREE
DMTNGKDVETISLDLQESVGVEQGQGEAEGGEDMFETSFGTTPTSQAEGLRETEVVLSDD
GAPGGKVEEVVTPPHLSTSAYLNPPVDDTDSVQRVPVEGGSTGLYGETGAPATTYLRPPS
DSEDEATEMLGAGLAMAMADKGQGGGDEDRRASFVPNPFDPANESSSEDFYSDEEEGAGA
GEQDQERAALGPEGEREDTRAGQRDEEQQEEEEAAPPRSARSSKEAWGEPVHEPQYLDVR
PEPEQPGTDTGGAVEHVSQYLDVKPGEPGDGDGAGPSGEREHGGDGEEEDGEGGLQFFDD
DEVEFVPDDGQDAAWTESTMMEEKGSSSHSRSPEPETKVDEAEGSAETSFAMTAATTGDN
DGEAEDDDADESTDTFPSDSQQQQQQQQPRSRRPLPPAQEEEQEVKKDILPLTVYKRVGE
GVGFSIYGGTETGEAGIFVSRVDEEGPAAGIVRPDDEVLAVGLNKMRDLTSTEADLVMEE
ASGCETVKLVVARVPDKKPSADPRSTLSVGPGKFMLSKPLRSRRRSANLSGMANPSNPHQ
PDHVYDETFDVSEFA
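
The sequence above is wrapped over many lines, while the domain regions are given as residue ranges. A minimal sketch of how a region could be cut out of the joined sequence is shown below. The function names are hypothetical, and it assumes that the region coordinates are 1-based and inclusive, which this page does not state. The toy input is just the first ten residues of the sequence, re-wrapped for brevity.

```python
def join_wrapped(lines):
    """Concatenate wrapped sequence lines, dropping surrounding whitespace."""
    return "".join(line.strip() for line in lines)


def extract_region(sequence, start, end):
    """Return residues start..end, ASSUMING 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"region {start}-{end} is outside 1-{len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Toy input: the first ten residues of the sequence above, wrapped at five.
toy_lines = ["MASPQ", "ARTQQ"]
toy_sequence = join_wrapped(toy_lines)
print(toy_sequence)                        # MASPQARTQQ
print(extract_region(toy_sequence, 2, 4))  # ASP
```

Applied to the full sequence, the joined length can be checked against the stated sequence length of 6675 before any region is extracted.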
Identical sequences F2U5J3
XP_004995573.1.12839 PTSG_03839T0
