The given article emphasizes the development and modeling of a hybrid navigational controller to optimize the path length and time taken. The proposed navigational controller is developed by hybridizing the metaheuristic moth–flame optimization (MFO) approach and the reinforcement learning (RL) approach. Input parameters like obstacle and target locations are fed to the MFO controller that implements a proper navigational direction selection. It forwards to the RL controller, which exercises further refinement of the output turning angle around obstacles. The collaboration of the global MFO approach with the local-based RL approach helps to optimize the path traversed by the humanoid robot in an unknown environment. The major breakthrough in this article is the utilization of humanoid robots for navigation purposes between various checkpoints. The humanoid robots are placed in a cluttered environment and assigned specific target positions to complete the assigned tasks. In the case of a multi-humanoid robot system, to avoid self-collision, it requires a Petri-Net controller to be configured in the navigation system to prevent deadlock situations and enhance the smooth completion of tasks without inter-collision among the humanoid robots. Simulations and real-time experiments are undertaken using different controllers involving single- and multi-humanoid robot systems. The robustness of the proposed controller is also validated in dynamic environment. Comparisons are carried with an established navigational controller in a similar environmental setup, which proves the proposed hybrid controller to be robust and efficient.