Published online by Cambridge University Press: 01 March 2000
Transmembrane helices are the most readily predictable secondary structure components of proteins. They can be predicted to a high degree of accuracy in a variety of ways. Many of these methods compare new sequence data with the sequence characteristics of known transmembrane domains. However, the known transmembrane sequences are not necessarily representative of a particular organism. We attempt to demonstrate that parameters optimized for the known transmembrane domains are far from optimal when predicting transmembrane regions in a given genome. In particular, we have tested the effect of nucleotide bias upon the composition and hence the prediction characteristics of transmembrane helices. Our analysis shows that nucleotide bias of a genome has a strong and predictable influence upon the occurrences of several of the most important hydrophobic amino acids found within transmembrane helices. Thus, we show that nucleotide bias should be taken into account when determining putative transmembrane domains from sequence data.
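One way to see why nucleotide bias could shift the hydrophobic amino acid content of transmembrane helices is a simple null model, sketched below. This is not the method used in the study. It is an illustrative assumption in which each codon position is drawn independently, with G and C sharing the GC fraction equally and A and T sharing the remainder. The function name `expected_aa_freqs` and the chosen GC values are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_aa_freqs(gc):
    """Expected amino acid frequencies under a purely compositional null model.

    Assumption (not from the source): nucleotides at all three codon
    positions are independent, with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2. Stop codons are excluded and the
    remaining probabilities are renormalized to sum to 1.
    """
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must be strictly between 0 and 1")
    p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    freqs = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons
            continue
        prob = p[codon[0]] * p[codon[1]] * p[codon[2]]
        freqs[aa] = freqs.get(aa, 0.0) + prob
    total = sum(freqs.values())
    return {aa: f / total for aa, f in freqs.items()}


if __name__ == "__main__":
    # Compare some hydrophobic residues across AT-rich and GC-rich compositions.
    for gc in (0.3, 0.5, 0.7):
        f = expected_aa_freqs(gc)
        row = "  ".join(f"{aa}={f[aa]:.3f}" for aa in "FILVAG")
        print(f"GC={gc:.1f}  {row}")
```

Under this model, residues with AT-rich codons such as Phe and Ile become more frequent as GC content falls, while residues with GC-rich codons such as Ala and Gly become more frequent as it rises. Real genomes deviate from this idealization (selection and position-specific codon bias are ignored), so it serves only as a baseline for what compositional bias alone would predict.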