Path planning, a critical component of mobile robotic systems, strongly affects operational efficiency and energy consumption. State-of-the-art algorithms often suffer from limited real-time adjustment capability, poor adaptation to dynamic environments, and suboptimal computational efficiency. To address these limitations, we propose a bidirectionally optimized path planning algorithm, Bidirectional Q-learning LPA* (BQ-LPA*), which incorporates three key innovations. First, to enhance the global search capability of the LPA* framework, we replace the fixed heuristic function with a Q-learning-driven adaptive heuristic mechanism that improves path quality through dynamic heuristic weighting and update strategies. Second, to improve the convergence rate and sample efficiency of Q-learning in complex environments, we integrate the LPA* framework as a source of prior knowledge, whose informed pathfinding initialization effectively reduces redundant exploration. Third, Q-learning inherently faces dimensionality challenges in high-dimensional continuous spaces, manifesting as action-space explosion, storage bottlenecks, and computational inefficiency; to mitigate these issues, we devise an LPA*-based space discretization strategy that reduces the action-space dimensionality while preserving path feasibility. Experimental results show that, compared with mainstream path planning algorithms, BQ-LPA* achieves higher accuracy and faster convergence in mobile robot path planning.
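To make the Q-learning-driven adaptive heuristic concrete, the sketch below is a minimal illustration, not the authors' implementation: it learns a tabular Q-function on a toy occupancy grid and blends the learned value estimates into the heuristic of a plain A*-style search. The grid, reward scheme, learning parameters, and blend weight `w` are all assumptions; the incremental LPA* update machinery and the discretization strategy described in the abstract are omitted, and the blended heuristic may sacrifice admissibility.

```python
import heapq
import random

# Hypothetical 5x5 occupancy grid (0 = free, 1 = obstacle); not from the paper.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def neighbors(cell):
    r, c = cell
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < 5 and 0 <= nc < 5 and GRID[nr][nc] == 0:
            yield (nr, nc)

def learn_q(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over grid cells; rewards are illustrative."""
    rng = random.Random(seed)
    q = {}  # (state, next_state) -> learned value
    for _ in range(episodes):
        s = START
        for _ in range(50):  # episode step cap
            nbrs = list(neighbors(s))
            if rng.random() < eps:
                s2 = rng.choice(nbrs)           # explore
            else:
                s2 = max(nbrs, key=lambda n: q.get((s, n), 0.0))  # exploit
            r = 10.0 if s2 == GOAL else -1.0    # assumed reward shaping
            best_next = max((q.get((s2, n), 0.0) for n in neighbors(s2)),
                            default=0.0)
            old = q.get((s, s2), 0.0)
            q[(s, s2)] = old + alpha * (r + gamma * best_next - old)
            if s2 == GOAL:
                break
            s = s2
    return q

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def plan(q, w=0.1):
    """A*-style search whose heuristic is adapted by the learned Q-values
    (a stand-in for the paper's Q-weighted LPA* heuristic)."""
    def h(s):
        v = max((q.get((s, n), 0.0) for n in neighbors(s)), default=0.0)
        # Higher learned value -> lower heuristic estimate.
        return manhattan(s, GOAL) - w * v
    frontier = [(h(START), START)]
    g = {START: 0}
    parent = {}
    while frontier:
        _, s = heapq.heappop(frontier)
        if s == GOAL:
            path = [s]
            while s in parent:
                s = parent[s]
                path.append(s)
            return path[::-1]
        for n in neighbors(s):
            ng = g[s] + 1
            if ng < g.get(n, float("inf")):
                g[n] = ng
                parent[n] = s
                heapq.heappush(frontier, (ng + h(n), n))
    return None  # goal unreachable
```

Running `plan(learn_q())` on this grid returns an obstacle-free path from `START` to `GOAL`; raising `w` leans harder on the learned values, trading optimality guarantees for learned guidance.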