Domains rich in alternating arginine and serine residues (RS
domains) are frequently found in metazoan proteins involved
in pre-mRNA splicing. The RS domains of splicing factors associate
with each other and are important for the formation of
protein–protein interactions required for both constitutive
and regulated splicing. The prevalence of the RS domain in splicing
factors suggests that it might serve as a useful signature for
the identification of new proteins that function in pre-mRNA
processing, although it remains to be determined whether RS
domains also participate in other cellular functions. Using
database search and sequence clustering methods, we have identified
and categorized RS domain proteins encoded within the entire
genomes of Homo sapiens, Drosophila melanogaster,
Caenorhabditis elegans, and Saccharomyces
cerevisiae. This genome-wide survey revealed a surprising
complexity of RS domain proteins in metazoans, with functions
associated with chromatin structure, transcription by RNA
polymerase II, the cell cycle, and cell structure, as well as pre-mRNA
processing. Also identified were RS domain proteins in S.
cerevisiae with functions associated with cell structure,
osmotic regulation, and cell cycle progression. The results
thus demonstrate an effective strategy for the genomic mining
of RS domain proteins. The identification of many new proteins
using this strategy has provided a database of factors that
are candidates for forming RS domain-mediated interactions
associated with different steps in pre-mRNA processing, in addition
to other cellular functions.
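
As an illustration of what a sequence-based screen for RS domains might involve, the following minimal sketch flags regions of a protein sequence that are enriched in RS or SR dipeptides. It is not the search or clustering procedure used in this study: the function names, the 20-residue window, and the 50% coverage threshold are hypothetical choices made only for this example.

```python
def rs_dipeptide_fraction(window: str) -> float:
    """Fraction of residues in `window` that lie within an RS or SR dipeptide."""
    if not window:
        return 0.0
    covered = [False] * len(window)
    for i in range(len(window) - 1):
        if window[i:i + 2] in ("RS", "SR"):
            covered[i] = covered[i + 1] = True
    return sum(covered) / len(window)


def find_rs_regions(seq: str, window: int = 20, min_fraction: float = 0.5):
    """Return merged (start, end) regions, 0-based and end-exclusive, in which
    a sliding window meets the RS/SR coverage threshold.

    `window` and `min_fraction` are illustrative assumptions, not values
    taken from the study.
    """
    seq = seq.upper()
    regions = []
    for start in range(max(len(seq) - window + 1, 0)):
        if rs_dipeptide_fraction(seq[start:start + window]) >= min_fraction:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the previous region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Synthetic sequence with a 20-residue RS repeat flanked by alanines.
    example = "M" + "A" * 30 + "RS" * 10 + "A" * 30
    print(find_rs_regions(example))
```

A screen of this kind would only nominate candidates; any real survey would still need the database searches and clustering described above to categorize the proteins it finds.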