The dimension of data-derived models is commonly restricted by the number of observations or, in the context of monitored systems, by the number of sensing nodes. This limitation is particularly relevant for structural systems, which are typically high-dimensional in nature. In the scope of physics-informed machine learning, this article proposes a framework, termed neural modal ordinary differential equations (Neural Modal ODEs), that integrates physics-based modeling with deep learning to model the dynamics of monitored, high-dimensional engineered systems. In this initial exploration, we restrict ourselves to linear or mildly nonlinear systems. We propose an architecture that couples a dynamic version of variational autoencoders with physics-informed neural ODEs (Pi-Neural ODEs). An encoder, as part of the autoencoder, learns the mapping from the first few items of observational data to the initial values of the latent variables. These initial values drive the learning of the embedded dynamics via Pi-Neural ODEs, which impose a modal model structure on the latent space. The decoder of the proposed model adopts the eigenmodes derived from an eigenanalysis of the linearized portion of a physics-based model, a process that implicitly carries the spatial relationship between degrees of freedom (DOFs). The framework is validated on a numerical example and on an experimental dataset from a scaled cable-stayed bridge, where the learned hybrid model is shown to outperform a purely physics-based modeling approach. We further demonstrate the functionality of the proposed scheme in the context of virtual sensing, that is, the recovery of generalized response quantities at unmeasured DOFs from spatially sparse data.
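
To make the latent modal structure concrete, the following is a minimal NumPy sketch under stated assumptions, not the implementation used in this work. It assumes a hypothetical 6-DOF fixed-free spring chain with unit masses, retains two modes, and uses a zero-valued placeholder (`zero_correction`) where a trained neural term would sit. A least-squares projection stands in for the learned encoder, and the names `chain_stiffness`, `modal_rhs`, and `rk4` are illustrative only.

```python
import numpy as np


def chain_stiffness(n_dof, k=1.0):
    """Stiffness matrix of a fixed-free spring chain with unit masses (toy model)."""
    K = np.zeros((n_dof, n_dof))
    for i in range(n_dof):
        K[i, i] = 2.0 * k if i < n_dof - 1 else k
        if i > 0:
            K[i, i - 1] = K[i - 1, i] = -k
    return K


def modal_rhs(z, omega, zeta, correction):
    """Latent dynamics: decoupled damped modal oscillators plus a correction term."""
    r = omega.size
    q, qd = z[:r], z[r:]
    qdd = -omega**2 * q - 2.0 * zeta * omega * qd + correction(q, qd)
    return np.concatenate([qd, qdd])


def rk4(rhs, z0, dt, n_steps):
    """Fixed-step fourth-order Runge-Kutta integration of dz/dt = rhs(z)."""
    zs = np.empty((n_steps + 1, z0.size))
    zs[0] = z0
    for i in range(n_steps):
        z = zs[i]
        k1 = rhs(z)
        k2 = rhs(z + 0.5 * dt * k1)
        k3 = rhs(z + 0.5 * dt * k2)
        k4 = rhs(z + dt * k3)
        zs[i + 1] = z + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return zs


# Eigenanalysis of the linear physics-based model (mass matrix = identity here).
n_dof, r = 6, 2
K = chain_stiffness(n_dof)
lam, Phi = np.linalg.eigh(K)      # ascending eigenvalues, orthonormal eigenvectors
omega = np.sqrt(lam[:r])          # natural frequencies of the retained modes
Phi_r = Phi[:, :r]                # decoder: retained eigenmodes, shape (n_dof, r)


def zero_correction(q, qd):
    """Placeholder for the learned term; a trained network would go here."""
    return np.zeros_like(q)


# "Measured" initial snapshot at two sensor locations (synthetic, for illustration).
q0_true = np.array([1.0, 0.5])
observed = [0, 3]
y0 = (Phi_r @ q0_true)[observed]

# Stand-in for the encoder: least-squares projection onto the retained modes.
q0_est, *_ = np.linalg.lstsq(Phi_r[observed], y0, rcond=None)

# Integrate the latent modal ODE from the estimated initial state (zero velocity).
z0 = np.concatenate([q0_est, np.zeros(r)])
zs = rk4(lambda z: modal_rhs(z, omega, 0.0, zero_correction), z0, dt=0.01, n_steps=1000)

# Decode: map modal coordinates back to all DOFs, including unmeasured ones.
x_full = zs[:, :r] @ Phi_r.T      # shape (n_steps + 1, n_dof)
```

Because the decoder is fixed by the eigenmodes, the response at every DOF follows from the latent trajectory, which is the mechanism behind the virtual sensing use case. In this sketch the correction term is zero and the damping ratio is set to zero, so the latent trajectory reduces to undamped modal oscillations.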