Resources related to the HMMER-SAM comparison
- Reference: Madera, M. and Gough, J. (2002) A comparison of profile
hidden Markov model procedures for remote homology detection. Nucl. Acids
Res., 30(19), 4321-4328;
online version, PS, gzipped PS, or PDF.
- ERRATA
A few points related to how HMMER handles multiple sequence alignments were
explained in the paper in a way which is potentially misleading. These are minor
details and do not affect any of our results or conclusions.
- Requiring the user to specify a match/insert distinction by hand (as SAM does)
is certainly not something to be expected of an inexpert user. We should have
criticised SAM on this point, and apologise for failing to do so. This issue does
not arise with SAM T99 seeded from a single sequence.
- Stockholm (= modified SELEX), the default HMMER alignment format, can store
all the information the SAM a2m format can, as well as additional extensive
mark-up; we apologise for not making this sufficiently clear. However, this extra
information (including the match-insert distinction) is not used by default, as
implied in the paper.
- To reiterate, using the `--hand' option and additional mark-up in the
Stockholm format, HMMER can be made to behave in a way which is identical to the
SAM default, as stated in the paper.
- Our Figure 1 and the passage
"In contrast, in a HMMER-style alignment all columns are supposed to be
aligned, though the HMMER model-building program treats the most divergent
regions as insertions."
give the impression that HMMER expects the regions it classifies as insertions to
be aligned. This is not the case: insertion regions get averaged into a single
insert state which will be the same regardless of the exact alignment in the
region. However, which regions are treated as insertions and which as matches is
(by default) decided by the program, as stated in the paper.
- More errata: In Table 1 the HMMER sequence 1bqk contains two dots. These
should be dashes!
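
To make the match/insert distinction from the errata concrete, here is a minimal Python sketch of how a SAM-style a2m alignment could be rewritten as Stockholm with a reference line for the `--hand' option. It is not the a2m2selex.pl script listed on this page, only an illustration under stated assumptions: the a2m input is already padded so that every sequence has the same length, uppercase letters and '-' mark match columns, lowercase letters and '.' mark insert columns, and an 'x' in the "#=GC RF" line marks a match column. The function name a2m_to_stockholm and the sequence names are hypothetical.

```python
def a2m_to_stockholm(a2m_text):
    """Rewrite a padded a2m alignment as Stockholm text with a #=GC RF line."""
    # Parse FASTA-style records: a '>' header line, then sequence lines.
    names, seqs = [], []
    for line in a2m_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            names.append(line[1:].split()[0])
            seqs.append("")
        elif seqs:
            seqs[-1] += line
    if not seqs:
        raise ValueError("no sequences found")

    def is_match(ch):
        # Assumed a2m convention: uppercase residues and '-' are in match
        # columns; lowercase residues and '.' are in insert columns.
        return ch.isupper() or ch == "-"

    # Derive the reference line from the first sequence.
    rf = "".join("x" if is_match(ch) else "." for ch in seqs[0])

    # Every sequence must agree on which columns are match columns.
    for name, seq in zip(names, seqs):
        if len(seq) != len(rf):
            raise ValueError(f"{name}: alignment is not padded to equal length")
        for col, ch in enumerate(seq):
            if is_match(ch) != (rf[col] == "x"):
                raise ValueError(f"{name}: column {col + 1} mixes match and insert")

    # Emit Stockholm: header, one line per sequence, RF mark-up, terminator.
    pad = max(len(n) for n in names + ["#=GC RF"])
    out = ["# STOCKHOLM 1.0"]
    for name, seq in zip(names, seqs):
        out.append(f"{name.ljust(pad)} {seq}")
    out.append(f"{'#=GC RF'.ljust(pad)} {rf}")
    out.append("//")
    return "\n".join(out) + "\n"


# Toy alignment: columns 3 and 4 are insert columns, the rest are match columns.
example = ">seq1\nACgt-D\n>seq2\nA-..CD\n"
print(a2m_to_stockholm(example))
```

The resulting file could then be given to the HMMER model-building program together with the `--hand' option, so that the match columns are taken from the reference line instead of being chosen by the program.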
- convert.pl: The original script to convert between HMMER and SAM model files,
with documentation.
- I've since added a PSI-BLAST output capability. The most recent sources are
here.
- a2m2selex.pl: Julian's
script to make HMMER work with SAM-style a2m files using the '--hand' option in
HMMER. For research purposes only: HMMER model building appears to be inferior to
SAM's, so most of the time it's much better to build the model using SAM and
convert it to HMMER using the above model-conversion script.
- Classification of globins and cupredoxins used in our paper:
This page is maintained by Martin Madera. If you have any questions or
suggestions, please feel free to email me.