SUPERFAMILY 1.75 HMM library and genome assignments server

SUPERFAMILY 2 can be accessed from supfam.org. Please contact us if you experience any problems.

Domain assignment for A0A023SG72 from UniProt 2018_03 genome

Domain architecture


Domain assignment details

Strong hits

Sequence:  A0A023SG72
Domain  Region               Level        Classification                                        E-value
1       3248-3550            Superfamily  Trypsin-like serine proteases                         3.68e-118
                             Family       Viral cysteine protease of trypsin fold               0.0000000415
2       6433-6617            Superfamily  S-adenosyl-L-methionine-dependent methyltransferases  3.99e-71
                             Family       Nsp15 N-terminal domain-like                          0.00000886
3       3967-4119            Superfamily  Coronavirus NSP8-like                                 4.97e-62
                             Family       Coronavirus NSP8-like                                 0.00000387
4       6620-6770            Superfamily  EndoU-like                                            2.16e-56
                             Family       Nsp15 C-terminal domain-like                          0.00000499
5       4243-4365            Superfamily  Coronavirus NSP10-like                                8.24e-54
                             Family       Coronavirus NSP10-like                                0.00000537
6       4784-5089,5128-5274  Superfamily  DNA/RNA polymerases                                   2.22e-51
                             Family       RNA-dependent RNA-polymerase                          0.019
7       1115-1269            Superfamily  Macro domain-like                                     3.36e-33
                             Family       Macro domain                                          0.00000984
8       4130-4237            Superfamily  Replicase NSP9                                        1.7e-32
                             Family       Replicase NSP9                                        0.0000347
9       5585-5888            Superfamily  P-loop containing nucleoside triphosphate hydrolases  2.08e-30
                             Family       Tandem AAA-ATPase domain                              0.068
10      3846-3928            Superfamily  Coronavirus NSP7-like                                 4.18e-29
                             Family       Coronavirus NSP7-like                                 0.000097
11      854-965              Superfamily  NSP3A-like                                            7.98e-28
                             Family       NSP3A-like                                            0.00083
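The hits above are listed by superfamily E-value rather than by position along the polyprotein. As a minimal sketch, the Python below orders the eleven domains by start residue and counts how many residues the assigned regions cover. The region strings and the sequence length of 7078 are copied from this page; the names `DOMAINS`, `parse_region` and `covered_residues` are illustrative and not part of any SUPERFAMILY tool, and the coordinates are assumed to be 1-based and inclusive.

```python
# (domain number, region) pairs copied from the strong hits listed above.
DOMAINS = [
    (1, "3248-3550"),
    (2, "6433-6617"),
    (3, "3967-4119"),
    (4, "6620-6770"),
    (5, "4243-4365"),
    (6, "4784-5089,5128-5274"),
    (7, "1115-1269"),
    (8, "4130-4237"),
    (9, "5585-5888"),
    (10, "3846-3928"),
    (11, "854-965"),
]
SEQUENCE_LENGTH = 7078  # "Sequence length" reported in the protein sequence section


def parse_region(region):
    """Turn '4784-5089,5128-5274' into [(4784, 5089), (5128, 5274)]."""
    segments = []
    for part in region.split(","):
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def covered_residues(domains):
    """Count residues covered by at least one segment (assumed 1-based, inclusive)."""
    covered = set()
    for _, region in domains:
        for start, end in parse_region(region):
            covered.update(range(start, end + 1))
    return len(covered)


# Sort by the start of each domain's first segment to get N- to C-terminal order.
ordered = sorted(DOMAINS, key=lambda d: parse_region(d[1])[0][0])
order = [number for number, _ in ordered]
coverage = covered_residues(DOMAINS) / SEQUENCE_LENGTH

print("N- to C-terminal domain order:", order)
print(f"Fraction of sequence covered: {coverage:.3f}")
```

Under these assumptions the assigned regions do not overlap and cover roughly 30% of the sequence. Domain 6 is the only hit made of two segments.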
 

Gene Ontology term assignment details

The top 10 most specific Gene Ontology terms for each namespace assigned to this domain architecture as determined by dcGO Predictor


Cellular Component IC (bits) H-Score
Biological Process IC (bits) H-Score
Molecular Function IC (bits) H-Score

Protein sequence

External link(s) A0A023SG72
Sequence length 7078
Comment (tr|A0A023SG72|A0A023SG72_9BETC) ORF1ab {ECO:0000313|EMBL:AHX71944.1} KW=Complete proteome OX=1335626 OS=Middle East respiratory syndrome-related coronavirus. GN= OC=Nidovirales; Coronaviridae; Coronavirinae; Betacoronavirus.
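The comment line packs several fields into one string. The two-letter tags appear to follow the UniProt flat-file line codes (KW keywords, OX taxonomy cross-reference, OS organism, GN gene name, OC organism classification), but that reading is an assumption, as is the rule that each value runs up to the next `XX=` tag. A minimal Python sketch for splitting the line under those assumptions, with `parse_header` as a hypothetical helper name:

```python
import re

# Comment line copied from this page.
HEADER = (
    "(tr|A0A023SG72|A0A023SG72_9BETC) ORF1ab {ECO:0000313|EMBL:AHX71944.1} "
    "KW=Complete proteome OX=1335626 "
    "OS=Middle East respiratory syndrome-related coronavirus. "
    "GN= OC=Nidovirales; Coronaviridae; Coronavirinae; Betacoronavirus."
)


def parse_header(header):
    """Split the comment line into a dict of fields (assumed layout, see lead-in)."""
    # Database, accession and entry name sit inside the leading parentheses.
    db, accession, entry_name = re.match(r"\((\w+)\|(\w+)\|(\w+)\)", header).groups()
    fields = {"db": db, "accession": accession, "entry_name": entry_name}
    # Each tagged field runs from "XX=" up to the next " XX=" tag or the end of the line.
    for tag, value in re.findall(r"\b([A-Z]{2})=(.*?)(?=\s+[A-Z]{2}=|$)", header):
        fields[tag] = value.strip()
    return fields


fields = parse_header(HEADER)
for key, value in fields.items():
    print(f"{key}: {value!r}")
```

The empty `GN=` field is kept as an empty string rather than dropped, so a missing gene name stays visible in the result.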
Sequence
MSFVAGVTAQGARGTYRAALNSEKHQDHVSLTVPLCGSGNLVEKLSPWFMDGENAYEVVK
AMLLKKEPLLYVPIRLAGHTRHLPGPRVYLVERLIACENPFMVNQLAYSSSANGSLVGTT
LQGKPIGMFFPYDIELVTGKQNILLRKYGRGGYHYTPFHYERDNTSCPEWMDDFEADPKG
KYAQNLLKKLIGGDVTPVDQYMCGVDGKPISAYAFLMAKDGITKLADVEADVAARADDEG
FITLKNNLYRLVWHVERKDVPYPKQSIFTINSVVQKDGVENTPPHYFTLGCKILTLTPRN
KWSGVSDLSLKQKLLYTFYGKESLENPTYIYHSAFIECGSCGNDSWLTGNAIQGFACGCG
ASYTANDVEVQSSGMIKPNALLCATCPFAKGDSCSSNCKHSVAQLVSYLSERCNVIADSK
SFTLIFGGVAYAYFGCEEGTMYFVPRAKSVVSRIGDSIFTGCTGSWNKVTQIANMFLEQT
QHSLNFVGEFVVNDVVLAILSGTTTNVDKIRQLLKGVTLDKLRDYLADYDVAVTAGPFMD
NAINVGGTGLQYAAITAPYVVLTGLGESFKKVATIPYKVCNSVKDTLTYYAHSVLYRVFP
YDMDSGVSSFSELLFDCVDLSVASTYFLVRLLQDKTGDFMSTIITSCQTAVSKLLDTCFE
ATEATFNFLLDLAGLFRIFLRNAYVYTSQGFVVVNGKVSTLVKQVLDLLNKGMQLLHTKV
SWAGSNISAVIYSGRESLIFPSGTYYCVTTKAKSVQQDLDVILPGEFSKKQLGLLQPTDN
STTVSVTVSSNMVETVVGQLEQTNMHSPDVIVGDYVIISEKLFVRSKEEDGFAFYPACTN
GHAVPTLFRLKGGAPVKKVAFGGDQVHEVAAVRSVTVEYNIHAVLDTLLASSSLRTFVVD
KSLSIEEFADVVKEQVSDLLVKLLRGMPIPDFDLDDFIDAPCYCFNAEGDASWSSTMIFS
LHPVECDEECSEVEASDLEEGESECISETSTEQVDVSHEISDDEWAAAVDEAFPLDEAED
VTESVQEEAQPVEVPVEDIAQVVIADTLQETPVVSDTVEVPPQVVKLPSEPQTIQPEVKE
VAPVYEADTEQTQSVTVKPKRLRKKRNVDPLSNFEHKVITECVTIVLGDAIQVAKCYGES
VLVNAANTHLKHGGGIAGAINAASKGAVQKESDEYILAKGPLQVGDSVLLQGHSLAKNIL
HVVGPDARAKQDVSLLSKCYKAMNAYPLVVTPLVSAGIFGVKPAVSFDYLIREAKTRVLV
VVNSQDVYKSLTIVDIPQSLTFSYDGLRGAIRKAKDYGFTVFVCTDNSANTKVLRNKGVD
YTKKFLTVDGVQYYCYTSKDTLDDILQQANKSVGIISMPLGYVSHGLDLIQAGSVVRRVN
VPYVCLLANKEQEAILMSEDVKLNPSEDFIKHVRTNGGYNSWHLVEGELLVQDLRLNKLL
HWSDQTICYKDSVFYVVKNSTAFPFETLSACRAYLDSRTTQQLTIEVLVTVDGVNFRTVV
LNNKNTYRSQLGCVFFNGADISDTIPDEKQNGHSLYLADNLTADETKALKELYGPVDPTF
LHRFYSLKAAVHKWKMVVCDKVRSLKLSDNNCYLNAVIMTLDLLKDIKFVIPALQHAFMK
HKGGDSTDFIALIMAYGNCTFGAPDDASRLLHTVLAKAELCCSARMVWREWCNVCGIKDV
VLQGLKACCYVGVQTVEDLRARMTYVCQCGGERHRQIVEHTTPWLLLSGTPNEKLVTTST
APDFVAFNVFQGIETAVGHYVHARLKGGLILKFDSGTVSKTSDWKCKVTDVLFSGQKYSS
DCNVVRYSLDGNFRTEVDPDLSAFYVKDGKYFTSEPPVTYSPATILAGSVYTNSCLVSSD
GQPGGDAISLSFNNLLGFDSSKPVTKKYTYSFLPKEDGDVLLAEFDTYDPIYKNGAMYKG
KPILWVNKASYDTNLNKFNRASLRQIFDVAPIELENKFTPLSVESTPVEPPTVDVVALQQ
EMTIVKCKGLNKPFVKDNVSFVADDSGTPVVEYLSKEDLHTLYVDPKYQVIVLKDNVLSS
MLRLHTVESGDINVVAASGSLTRKVKLLFRASFYFKEFATRTFTATTAVGSCIKSVVRHL
GVTKGILTGCFSFVKMLFMLPLAYFSDSKLGTTEVKVSALKTAGVVTGNVVKQCCTAAVD
LSMDKLRRVDWKSTLRLLLMLCTTMVLLSSVYHLYVFNQVLSSDVMFEDAQGLKKFYKEV
RAYLGISSACDGLASAYRANSFDVPTFCANRSAMCNWCLISQDSITHYPALKMVQTHLSH
YVLNIDWLWFAFETGLAYMLYTSAFNWLLLAGTLHYFFAQTSIFVDWRSYNYAVSSAFWL
FTHIPMAGLVRMYNLLACLWLLRKFYQHVINGCKDTACLLCYKRNRLTRVEASTVVCGGK
RTFYITANGGISFCRRHNWNCVDCDTAGVGNTFICEEVANDLTTALRRPINATDRSHYYV
DSVTVKETVVQFNYRRDGQPFYERFPLCAFTNLDKLKFKEVCKTTTGIPEYNFIIYDSSD
RGQESLARSACVYYSQVLCKSILLVDSSLVTSVGDSSEIATKMFDSFVNSFVSLYNVTRD
KLEKLISTARDGVRRGDNFHSVLTTFIDAARGPAGVESDVETNEIVDSVQYAHKHDIQIT
NESYNNYVPSYVKPDSVSTSDLGSLIDCNAASVNQIVLRNSNGACIWNAAAYMKLSDALK
RQIRIACRKCNLAFRLTTSKLRANDNILSVRFTANKIVGGAPTWFNALRDFTLKGYVLAT
IIVFLCAVLMYLCLPTFSMVPVEFYEDRILDFKVLDNGIIRDVNPDDKCFANKHRSFTQW
YHEHVGGVYDNSITCPLTVAVIAGVAGARIPDVPTTLAWVNNQIIFFVSRVFANTGSVCY
TPIDEIPYKSFSDSGCILPSECTMFRDAEGRMTPYCHDPTVLPGAFAYSQMRPHVRYDLY
DGNMFIKFPEVVFESTLRITRTLSTQYCRFGSCEYAQEGVCITTNGSWAIFNDHHLNRPG
VYCGSDFIDIVRRLAVSLFQPITYFQLTTSLVLGIGLCAFLTLLFYYINKVKRAFADYTQ
CAVIAVVAAVLNSLCICFVASIPLCIVPYTALYYYATFYFTNEPAFIMHVSWYIMFGPIV
PIWMTCVYTVAMCFRHFFWVLAYFSKKHVEVFTDGKLNCSFQDAASNIFVINKDTYAALR
NSLTNDAYSRFLGLFNKYKYFSGAMETAAYREAAACHLAKALQTYSETGSDLLYQPPNCS
ITSGVLQSGLVKMSHPSGDVEACMVQVTCGSMTLNGLWLDNTVWCPRHVMCPADQLSDPN
YDALLISMTNHSFSVQKHIGAPANLRVVGHAMQGTLLKLTVDVANPSTPAYTFTTVKPGA
AFSVLACYNGRPTGTFTVVMRPNYTIKGSFLCGSCGSVGYTKEGSVINFCYMHQMELANG
THTGSAFDGTMYGAFMDKQVHQVQLTDKYCSVNVVAWLYAAILNGCAWFVKPNRTSVVSF
NEWALANQFTEFVGTQSVDMLAVKTGVAIEQLLYAIQQLYTGFQGKQILGSTMLEDEFTP
EDVNMQIMGVVMQSGVRKVTYGTAHWLFATLVSTYVIILQATKFTLWNYLFETIPTQLFP
LLFVTMAFVMLLVKHKHTFLTLFLLPVAICLTYANIVYEPTTPISSALIAVANWLAPTNA
YMRTTHTDIGVYISMSLVLVIVVKRLYNPSLSNFALALCSGVMWLYTYSIGEASSPIAYL
VFVTTLTSDYTITVFVTVNLAKVCTYAIFAYSPQLTLVFPEVKMILLLYTCLGFMCTCYF
GVFSLLNLKLRAPMGVYDFKVSTQEFRFMTANNLTAPRNSWEAMALNFKLIGIGGTPCIK
VAAMQSKLTDLKCTSVVLLSVLQQLHLEANSRAWAFCVKCHNDILAATDPSEAFEKFVSL
FATLMTFSGNVDLDALASDIFDTPSVLQATLSEFSHLATFAELEAAQKAYQEAMDSGDTS
PQVLKALQKAVNIAKNAYEKDKAVARKLERMADQAMTSMYKQARAEDKKAKIVSAMQTML
FGMIKKLDNDVLNGIISNARNGCIPLSVIPLCASNKLRVVIPDFTVWNQVVTYPSLNYAG
ALWDITVINNVDNEIVKSSDVVDSNENLTWPLVLECTRASTSAVKLQNNEIKPSGLKTMV
VSAGQEQTNCNTSSLAYYEPVQGRKMLMALLSDNAYLKWARVEGKDGFVSVELQPPCKFL
IAGPKGPEIRYLYFVKNLNNLHRGQVLGHIAATVRLQAGSNTEFASNSSVLSLVNFTVDP
QKAYLDFVNAGGAPLTNCVKMLTPKTGTGIAISVKPESTADQETYGGASVCLYCRAHIEH
PDVSGVCKYKGKFVQIPAQCVRDPVGFCLSNTPCNVCQYWIGYGCNCDSLRQAALPQSKD
SNFLNRVRGSIVNARIEPCSSGLSTDVVFRAFDICNYKAKVAGIGKYYKTNTCRFVELDD
QGHHLDSYFVVKRHTMENYELEKHCYDLLRDCDAVAPHDFFIFDVDKVKTPHIVRQRLTE
YTMMDLVYALRHFDQNSEVLKAILVKYGCCDVTYFENKLWFDFVENPSVIGVYHKLGERV
RQAILNTVKFCDHMVKAGLVGVLTLDNQDLNGKWYDFGDFVITQPGSGVAIVDSYYSYLM
PVLSMTDCLAAETHRDCDFNKPLIEWPLTEYDFTDYKVQLFEKYFKYWDQTYHANCVNCT
DDRCVLHCANFNVLFAMTMPKTCFGPIVRKIFVDGVPFVVSCGYHYKELGLVMNMDVSLH
RHRLSLKELMMYAADPAMHIASSNAFLDLRTSCFSVAALTTGLTFQTVRPGNFNQDFYDF
VVSKGFFKEGSSVTLKHFFFAQDGNAAITDYNYYSYNLPTMCDIKQMLFCMEVVNKYFEI
YDGGCLNASEVVVNNLDKSAGHPFNKFGKARVYYESMSYQEQDELFAMTKRNVIPTMTQM
NLKYAISAKNRARTVAGVSILSTMTNRQYHQKMLKSMAATRGATCVIGTTKFYGGWDFML
KTLYKDVDNPHLMGWDYPKCDRAMPNMCRIFASLILARKHGTCCTTRDRFYRLANECAQV
LSEYVLCGGGYYVKPGGTSSGDATTAYANSVFNILQATTANVSALMGANGNKIVDKEVKD
MQFDLYVNVYRSTSPDPKFVDKYYAFLNKHFSMMILSDDGVVCYNSDYAAKGYIAGIQNF
KETLYYQNNVFMSEAKCWVETDLKKGPHEFCSQHTLYIKDGDDGYFLPYPDPSRILSAGC
FVDDIVKTDGTLMVERFVSLAIDAYPLTKHEDIEYQNVFWVYLQYIEKLYKDLTGHMLDS
YSVMLCGDNSAKFWEEAFYRDLYSSPTTLQAVGSCVVCHSQTSLRCGTCIRRPFLCCKCC
YDHVIATPHKMVLSVSPYVCNAPGCGVSDVTKLYLGGMSYFCVDHRPVCSFPLCANGLVF
GLYKNMCTGSPSIVEFNRLATCDWTESGDYTLANTTTEPLKLFAAETLRATEEASKQSYA
IATIKEIVGERQLLLVWEAGKSKPPLNRNYVFTGYHITKNSKVQLGEYIFERIDYSDAVS
YKSSTTYKLTVGDIFVLTSHSVATLTAPTIVNQERYVKITGLYPTITVPEEFASHVANFQ
KSGYSKYVTVQGPPGTGKSHFAIGLAIYYPTARVVYTACSHAAVDALCEKAFKYLNIAKC
SRIIPAKARVECYDRFKVNETNSQYLFSTINALPETSADILVVDEVSMCTNYDLSIINAR
IKAKHIVYVGDPAQLPAPRTLLTRGTLEPENFNSVTRLMCNLGPDIFLSMCYRCPKEIVS
TVSALVYNNKLLAKKELSGQCFKILYKGNVTHDASSAINRPQLTFVKNFITANPAWSKAV
FISPYNSQNAVARSMLGLTTQTVDSSQGSEYQYVIFCQTADTAHANNINRFNVAITRAQK
GILCVMTSQALFESLEFTELSFTNYKLQSQIVTGLFKDCSRETSGLSPAYAPTYVSVDDK
YKTSDELCVNLNLPANVPYSRVISRMGFKLDATVPGYPKLFITREEAVRQVRSWIGFDVE
GAHASRNACGTNVPLQLGFSTGVNFVVQPVGVVDTEWGNMLTGIAARPPPGEQFKHLVPL
MHKGAAWPIVRRRIVQMLSDTLDKLSDYCTFVCWAHGFELTSASYFCKIGKEQKCCMCNR
RAAAYSSPLQSYACWTHSCGYDYVYNPFFVDVQQWGYVGNLATNHDRYCSVHQGAHVASN
DAIMTRCLAIHSCFIERVDWDIEYPYISHEKKLNSCCRIVERNVVRAALLAGSFDKVYDI
GNPKGIPIVDDPVVDWHYFDAQPLTRKVQQLFYTEDMASRFADGLCLFWNCNVPKYPNNA
IVCRFDTRVHSEFNLPGCDGGSLYVNKHAFHTPAYDVSAFRDLKPLPFFYYSTTPCEVHG
NGSMIEDIDYVPLKSAVCITACNLGGAVCRKHATEYREYMEAYNLVSASGFRLWCYKTFD
IYNLWSTFTKVQGLENIAFNVVKQGHFIGVEGELPVAVVNDKIFTKSGVNDICMFENKTT
LPTNIAFELYAKRAVRSHPDFKLLHNLQADICYKFVLWDYERSNIYGTATIGVCKYTDID
VNSALNICFDIRDNGSLEKFMSTPNAIFISDRKIKKYPCMVGPDYAYFNGAIIRDSDVVK
QPVKFYLYKKVNNEFIDPTECIYTQSRSCSDFLPLSDMEKDFLSFDSDVFIKKYGLENYA
FEHVVYGDFSHTTLGGLHLLIGLYKKQQEGHIIMEEMLKGSSTIHNYFITETNTAAFKAV
CSVIDLKLDDFVMILKSQDLGVVSKVVKVPIDLTMIEFMLWCKDGQVQTFYPRLQASADW
KPGHAMPSLFKVQNVNLERCELANYKQSIPMPRGVHMNIAKYMQLCQYLNTCTLAVPANM
RVIHFGAGSDKGIAPGTSVLRQWLPTDAIIIDNDLNEFVSDADITLFGDCVTVRVGQQVD
LVISDMYDPTTKNVTGSNESKALFFTYLCNLINNNLALGGSVAIKITEHSWSVELYELMG
KFAWWTVFCTNANASSSEGFLLGINYLGTIKENIDGGAMHANYIFWRNSTPMNLSTYSLF
DLSKFQLKLKGTPVLQLKESQINELVISLLSQGKLLIRDNDTLSVSTDVLVNTYRKLR
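To pull out the residues behind one of the assigned regions, the sequence lines can be joined into a single string and sliced. The sketch below assumes the region coordinates are 1-based and inclusive, which this page does not state explicitly. `extract_region` is a hypothetical helper, and the demonstration uses only the first 60-residue line of the sequence above so that the example stays short.

```python
def extract_region(sequence, region):
    """Return the residues for a region such as '4784-5089,5128-5274'.

    Coordinates are treated as 1-based and inclusive (an assumption);
    the segments of a split region are concatenated in the order given.
    """
    pieces = []
    for part in region.split(","):
        start, end = (int(x) for x in part.split("-"))
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"segment {part} lies outside the sequence")
        # Convert the 1-based inclusive range to a 0-based Python slice.
        pieces.append(sequence[start - 1:end])
    return "".join(pieces)


# First line of the sequence block above, used as a small stand-in.
FIRST_LINE = "MSFVAGVTAQGARGTYRAALNSEKHQDHVSLTVPLCGSGNLVEKLSPWFMDGENAYEVVK"

print(extract_region(FIRST_LINE, "1-5"))
print(extract_region(FIRST_LINE, "1-3,8-10"))
```

With the full 7078-residue sequence joined into one string, the same call with a region taken from the domain assignment details would return that domain's residues.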
Identical sequences A0A023SG72
