Morphological reinflection generation is one of the most challenging tasks in natural language processing (NLP), especially for morphologically rich, low-resource languages such as Arabic. In this research, we investigate the ability of transformer-based models on the task of singular-to-plural Arabic noun conversion. We first pretrain a Character-BERT model on a masked language modeling task using 1,134,950 Arabic words. In the first of two proposed settings, we then apply a fusion technique to transfer the knowledge gained by the pretrained model to a full encoder–decoder transformer; in the second setting, the output Character-BERT embeddings are fused directly into the decoder. We analyze and compare the performance of the two architectures and provide an interpretability section in which we examine the models' attention patterns. We carry out this interpretation at both the macro and micro levels, including individual examples. Moreover, we provide a thorough error analysis showing the strengths and weaknesses of the proposed framework. To the best of our knowledge, this is the first work in Arabic NLP to develop an end-to-end fused-transformer deep learning model for the problem of singular-to-plural conversion.