The increasing intensity and frequency of floods is one of the many consequences of our changing climate. In this work, we explore ML techniques that improve the flood detection module of an operational early flood warning system. Our method exploits an unlabeled dataset of paired multi-spectral and synthetic aperture radar (SAR) imagery to reduce the labeling requirements of a purely supervised learning method. Prior works have used unlabeled data by creating weak labels out of them. However, from our experiments, we noticed that such a model still ends up learning the label mistakes in those weak labels. Motivated by knowledge distillation and semi-supervised learning, we explore the use of a teacher to train a student with the help of a small hand-labeled dataset and a large unlabeled dataset. Unlike the conventional self-distillation setup, we propose a cross-modal distillation framework that transfers supervision from a teacher trained on richer modality (multi-spectral images) to a student model trained on SAR imagery. The trained models are then tested on the Sen1Floods11 dataset. Our model outperforms the Sen1Floods11 baseline model trained on the weak-labeled SAR imagery by an absolute margin of $ 6.53\% $ intersection over union (IoU) on the test split.