FMR1 is an RNA-binding protein that is either absent
or mutated in patients affected by the fragile X syndrome,
the most common inherited cause of mental retardation in
humans. Sequence analysis of the FMR1 protein has suggested
that RNA binding is related to the presence of two K-homologous
(KH) modules and an RGG box. However, no attempt has
so far been made to map the RNA-binding sites along the protein
sequence and to identify possible differential RNA-sequence
specificity. In the present article, we describe work done
to dissect FMR1 into regions with structurally and functionally
distinct properties. A semirational approach was followed
to identify four regions: an N-terminal stretch of 200
amino acids, the two KH regions, and a C-terminal stretch.
Each region was produced as a recombinant protein, purified,
and probed for its state of folding by spectroscopic
techniques. Circular dichroism and NMR spectra of the N-terminus
show that it forms secondary structure and has a strong tendency
to aggregate. Of the two homologous KH motifs, only the
first one is folded, whereas the second remains unfolded
even when it is extended both N- and C-terminally. The
C-terminus is, as expected from its amino acid composition,
nonglobular. Binding assays were then performed using the
4-nt homopolymers. Our results show that the first
KH domain, but not the second, binds to RNA, and provide
the first direct evidence for RNA binding of both the N-terminal
and the C-terminal regions. RNA binding for the N-terminus
could not be predicted from sequence analysis because no
known RNA-binding motif is identifiable in this region.
Different sequence specificity was observed for the fragments:
both the N-terminus of the protein and KH1 bind preferentially
to poly-(rG). The C-terminal region, which contains the
RGG box, is nonspecific, as it recognizes the bases with
comparable affinity. We therefore conclude that FMR1 is
a protein with multiple sites of interaction with RNA:
sequence specificity is most likely achieved by the whole
block that comprises the first ≈400 residues, whereas
the C-terminus provides a nonspecific binding surface.