SUPERFAMILY
SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes.
The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent
structural protein domains at the SCOP superfamily level. A superfamily groups together domains
which have an evolutionary relationship. The annotation is produced by scanning protein sequences
from over 2,478 completely sequenced genomes against the hidden Markov models.
For each protein you can:
Submit sequences for SCOP classification
View domain organisation, sequence alignments and protein sequence details
For each genome you can:
Examine superfamily assignments, phylogenetic trees, domain organisation lists and networks
Check for over- and under-represented superfamilies within a genome
For each superfamily you can:
Inspect SCOP classification, functional annotation, Gene Ontology annotation, InterPro abstract and genome assignments
Explore taxonomic distribution of a superfamily across the tree of life
All annotation, models and the database dump are freely available for everyone to download.
Major Features
Sequence search |
Submit your protein or DNA sequence for SCOP superfamily- and family-level classification.
|
Keyword search |
Search for superfamily, family or species names, plus sequence, SCOP, PDB or hidden Markov model IDs.
|
Domain assignments |
Domain assignments, alignments and architectures for completely sequenced eukaryotic and
prokaryotic organisms, plus sequence collections.
|
Comparative genomics tools |
Browse unusual (over- and under-represented) superfamilies and families,
adjacent domain pair lists and graphs, unique domain pairs, domain combinations,
domain architecture co-occurrence networks and domain distribution across taxonomic kingdoms for each organism.
|
Genome statistics |
For each genome: number of sequences, number of sequences with assignment,
percentage of sequences with assignment, percentage total sequence coverage, number of domains assigned,
number of superfamilies assigned, number of families assigned, average superfamily size,
percentage produced by duplication, average sequence length, average length matched, number of domain pairs
and number of unique domain architectures.
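Several of these statistics are simple counts and ratios over the domain assignments. The following is a minimal sketch of how a few of them could be computed; it is not the SUPERFAMILY implementation. The function name `genome_statistics`, the input layout (a dictionary of sequence lengths and a list of `(sequence_id, superfamily_id, start, end)` tuples) and the exact definitions used here are assumptions made for illustration. In particular, coverage is taken to be matched residues divided by total residues, and average superfamily size is taken to be domains per superfamily.

```python
from collections import Counter


def genome_statistics(sequence_lengths, assignments):
    """Compute a few per-genome summary statistics.

    sequence_lengths: dict mapping sequence ID -> length in residues.
    assignments: list of (sequence_id, superfamily_id, start, end) tuples,
        with 1-based inclusive coordinates and no overlapping domains.
    Both layouts are hypothetical, chosen for this example only.
    """
    n_sequences = len(sequence_lengths)
    if n_sequences == 0:
        raise ValueError("genome has no sequences")

    # Sequences that carry at least one domain assignment.
    assigned_ids = {seq_id for seq_id, _, _, _ in assignments}
    n_assigned = len(assigned_ids)

    # One tuple per assigned domain; count domains per superfamily.
    n_domains = len(assignments)
    superfamily_counts = Counter(sf for _, sf, _, _ in assignments)
    n_superfamilies = len(superfamily_counts)

    total_residues = sum(sequence_lengths.values())
    matched_residues = sum(end - start + 1 for _, _, start, end in assignments)

    return {
        "sequences": n_sequences,
        "sequences_with_assignment": n_assigned,
        "percent_sequences_with_assignment": 100.0 * n_assigned / n_sequences,
        # Assumed definition: matched residues over all residues in the genome.
        "percent_sequence_coverage": 100.0 * matched_residues / total_residues,
        "domains": n_domains,
        "superfamilies": n_superfamilies,
        # Assumed definition: mean number of domains per assigned superfamily.
        "average_superfamily_size": (
            n_domains / n_superfamilies if n_superfamilies else 0.0
        ),
        "average_sequence_length": total_residues / n_sequences,
    }


# Toy genome: three proteins, two of which have domain assignments.
lengths = {"p1": 100, "p2": 200, "p3": 100}
domains = [
    ("p1", "sfA", 1, 50),
    ("p2", "sfA", 1, 100),
    ("p2", "sfB", 101, 150),
]
stats = genome_statistics(lengths, domains)
```

For the toy genome, two of three sequences have an assignment and 200 of 400 residues are matched, so the coverage comes out at 50 percent. Statistics such as the percentage produced by duplication depend on definitions not given here and are left out of the sketch.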
|
Gene Ontology |
Domain-centric Gene Ontology (GO) automatically annotated by Hai Fang.
|
Phenotype Ontology |
Domain-centric phenotype/anatomy ontology including Disease Ontology, Human Phenotype,
Mouse Phenotype, Worm Phenotype, Yeast Phenotype, Fly Phenotype, Fly Anatomy,
Zebrafish Anatomy, Xenopus Anatomy and Arabidopsis Plant.
|
Superfamily annotation |
InterPro abstracts for 1,052 superfamilies,
and Gene Ontology (GO) annotation for 763 superfamilies.
|
Functional annotation |
Functional annotation of SCOP 1.73 superfamilies, by Christine Vogel.
|
Phylogenetic trees |
Trees are generated using heuristic parsimony methods, and
are based on protein domain architecture data for all genomes in SUPERFAMILY.
Genome combinations, or specific clades, can be displayed as individual trees.
|
Similar domain architectures |
Find the 10 domain architectures which are most similar to a domain architecture of interest.
|
Hidden Markov models |
Produce SCOP domain assignments for your sequences using the SUPERFAMILY models.
HMM visualisation by Martin Madera, e.g. model 0045110.
|
Profile comparison |
Find remote domain matches when the HMM search fails to find a significant match.
Profile comparison (PRC), by Martin Madera, aligns and scores two profile hidden Markov models.
|
Web services |
Distributed Annotation Server and linking to SUPERFAMILY.
|
Downloads |
Sequences, assignments, models, MySQL database and scripts - updated weekly.
|