Transmembrane helices are the most readily predictable
secondary structure components of proteins. They can be
predicted to a high degree of accuracy in a variety of
ways. Many of these methods compare new sequence data with
the sequence characteristics of known transmembrane domains.
However, the known transmembrane sequences are not necessarily
representative of a particular organism. We attempt to
demonstrate that parameters optimized for the known transmembrane
domains are far from optimal when predicting transmembrane
regions in a given genome. In particular, we have tested
the effect of nucleotide bias on the amino acid composition
of transmembrane helices, and hence on how they are predicted.
Our analysis shows that the nucleotide bias of a genome has
a strong and predictable influence on the frequencies
of several of the most important hydrophobic amino acids
found within transmembrane helices. Thus, we show that
nucleotide bias should be taken into account when determining
putative transmembrane domains from sequence data.
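
The link between nucleotide bias and amino acid usage can be illustrated with a minimal sketch. It is not the analysis used in this work. It only computes the GC content of a coding sequence and the frequencies of a few hydrophobic residues in its translation under the standard genetic code. The function names, the residue set `HYDROPHOBIC`, and the two toy sequences are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: an illustrative set of hydrophobic residues, not the
# set examined in the text.
HYDROPHOBIC = "AFILMV"


def gc_content(dna):
    """Return the fraction of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def residue_fractions(protein, residues=HYDROPHOBIC):
    """Return the fraction of the protein made up by each listed residue."""
    counts = Counter(protein)
    total = len(protein)
    return {r: (counts[r] / total if total else 0.0) for r in residues}


if __name__ == "__main__":
    # Toy sequences (hypothetical, for illustration only).
    at_rich = "ATGTTTATTAAATTTATATAA"
    gc_rich = "ATGGCCCTGGTGGCGCTCTGA"
    for name, cds in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
        protein = translate(cds)
        print(name, round(gc_content(cds), 2), protein, residue_fractions(protein))
```

Applied to the annotated coding sequences of whole genomes rather than toy strings, the same two quantities could be compared across organisms with differing nucleotide bias.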