US20250236314
2025-07-24
Performing operations; transporting
B60W60/0011
The patent outlines a system and method for enhancing autonomous vehicle navigation by integrating a Large Vision Language Model (LVLM) with a Model Predictive Path Integral (MPPI) controller. The LVLM processes image data from driving scenarios to provide structured driving suggestions, which the MPPI controller uses to determine optimal paths. This integration aims to improve decision-making in complex environments by leveraging machine learning models trained on extensive datasets of road scenarios.
The invention targets autonomous vehicles and intelligent transportation systems and uses machine learning to optimize navigation. The MPPI controller, a sampling-based optimizer, evaluates many candidate trajectories and selects the best one according to a cost function. The approach addresses the difficulty of balancing competing cost terms, such as track, collision, and dynamic state costs, and thereby improves path planning in scenarios with unpredictable dynamics.
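The sampling-and-scoring loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the unicycle motion model, the three cost definitions, the weight names, and every function name are assumptions made for the example. MPPI as commonly described blends the sampled perturbations with an exponentially weighted average; the sketch does that and also reports the lowest sampled cost.

```python
import math
import random

def rollout(state, controls, dt=0.1):
    """Integrate a simple unicycle model; state = (x, y, heading, speed)."""
    x, y, th, v = state
    path = []
    for accel, steer in controls:
        v = max(0.0, v + accel * dt)
        th += steer * dt
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        path.append((x, y, th, v))
    return path

def trajectory_cost(path, obstacles, weights, target_speed=5.0):
    """Weighted sum of (assumed) track, collision, and dynamic state costs."""
    # Track cost: squared lateral offset from a centerline at y = 0.
    track = sum(y * y for _, y, _, _ in path)
    # Collision cost: number of path points inside any circular obstacle.
    collision = sum(
        1.0
        for x, y, _, _ in path
        for ox, oy, radius in obstacles
        if math.hypot(x - ox, y - oy) < radius
    )
    # Dynamic state cost: squared deviation from a target speed.
    state = sum((v - target_speed) ** 2 for _, _, _, v in path)
    return (weights["track"] * track
            + weights["collision"] * collision
            + weights["state"] * state)

def mppi_step(state, nominal, obstacles, weights,
              samples=200, sigma=(1.0, 0.5), lam=1.0, rng=None):
    """One MPPI-style update of the nominal control sequence.

    Returns (updated_controls, lowest_sampled_cost).
    """
    if rng is None:
        rng = random.Random(0)
    noises, costs = [], []
    for _ in range(samples):
        # Perturb every (accel, steer) pair with Gaussian noise.
        noise = [(rng.gauss(0.0, sigma[0]), rng.gauss(0.0, sigma[1]))
                 for _ in nominal]
        controls = [(a + da, s + ds)
                    for (a, s), (da, ds) in zip(nominal, noise)]
        costs.append(trajectory_cost(rollout(state, controls), obstacles, weights))
        noises.append(noise)
    best = min(costs)
    # Softmin weights: cheaper rollouts contribute more to the update.
    w = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(w)
    updated = []
    for t, (a, s) in enumerate(nominal):
        da = sum(wi * n[t][0] for wi, n in zip(w, noises)) / total
        ds = sum(wi * n[t][1] for wi, n in zip(w, noises)) / total
        updated.append((a + da, s + ds))
    return updated, best

if __name__ == "__main__":
    start = (0.0, 0.0, 0.0, 5.0)
    plan = [(0.0, 0.0)] * 20
    cost_weights = {"track": 1.0, "collision": 100.0, "state": 1.0}
    new_plan, lowest = mppi_step(start, plan, [(5.0, 0.0, 1.0)], cost_weights)
    print(len(new_plan), lowest)
```

Subtracting the lowest cost before exponentiating keeps the weights numerically stable; the temperature `lam` controls how strongly the update favors the cheapest rollouts.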
Traditional navigation systems rely on rule-based or data-driven methods, each with significant limitations. Rule-based systems require extensive rule creation and maintenance, while data-driven methods demand vast datasets for effective learning and interpretation. LVLMs offer a solution by combining visual inputs with text-based reasoning, which enables better contextual understanding of road situations. This goes beyond purely text-based models, which cannot reason about the visual environment.
The invention features an LVLM that processes images from the vehicle's path to output driving instructions in a machine-readable format. These instructions guide the MPPI controller in calculating optimal paths by evaluating various trajectories based on assigned costs. The LVLM is pre-trained and fine-tuned with driving-related datasets to ensure accurate scene understanding and decision-making. Variations include scheduled prompts for querying the LVLM and using diverse sensor data for comprehensive environmental analysis.
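To illustrate what a machine-readable instruction might look like on the controller side, the sketch below parses and validates a JSON reply. The source does not specify the output format, so the JSON encoding, the field names (`scene`, `action`, `target_speed_mps`), the action vocabulary, and the conservative fallback are all hypothetical choices made for this example.

```python
import json
import math
from dataclasses import dataclass

# Hypothetical action vocabulary; the source does not list one.
ALLOWED_ACTIONS = {"keep_lane", "slow_down", "stop",
                   "change_lane_left", "change_lane_right"}

@dataclass(frozen=True)
class DrivingSuggestion:
    scene: str
    action: str
    target_speed_mps: float

# Assumed conservative fallback used when the reply cannot be trusted.
SAFE_DEFAULT = DrivingSuggestion("unknown", "slow_down", 0.0)

def parse_suggestion(raw: str) -> DrivingSuggestion:
    """Parse an LVLM reply, falling back to SAFE_DEFAULT on any problem."""
    try:
        data = json.loads(raw)
        scene = str(data["scene"])
        action = data["action"]
        speed = float(data["target_speed_mps"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return SAFE_DEFAULT
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return SAFE_DEFAULT
    if not math.isfinite(speed) or speed < 0.0:
        return SAFE_DEFAULT
    return DrivingSuggestion(scene, action, speed)

if __name__ == "__main__":
    reply = '{"scene": "pedestrian_crossing", "action": "slow_down", "target_speed_mps": 2.0}'
    print(parse_suggestion(reply))
    print(parse_suggestion("not json"))
```

Validating against a closed vocabulary means a malformed or unexpected reply degrades to a known behavior instead of reaching the controller unchecked.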
Integrating MPPI with the LVLM gives the autonomous vehicle high-level guidance during path planning. The LVLM's structured outputs are formatted so that the MPPI controller can parse them easily, which allows adaptive decision-making in real time. By adjusting cost coefficients according to the traffic situation the LVLM identifies, the system aims to navigate efficiently even in complex environments. MPPI's sampling-based approach can optimize cost functions that are difficult to approximate, and it selects the lowest-cost path among the sampled trajectories.
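The coefficient adjustment can be pictured as a lookup from an identified traffic situation to multipliers on the base cost weights. The sketch below is only an assumption about how such a mapping could be organized: the scene labels, base weights, and multiplier values are invented for illustration and do not come from the source.

```python
# Hypothetical base coefficients for the track, collision, and state costs.
BASE_WEIGHTS = {"track": 1.0, "collision": 100.0, "state": 0.5}

# Hypothetical per-scene multipliers; unlisted terms keep a multiplier of 1.0.
SCENE_MULTIPLIERS = {
    "pedestrian_crossing": {"collision": 5.0, "state": 2.0},
    "construction_zone": {"track": 0.5, "collision": 2.0},
}

def weights_for_scene(scene, base=BASE_WEIGHTS):
    """Return a new weight dict scaled for the scene; unknown scenes keep base values."""
    multipliers = SCENE_MULTIPLIERS.get(scene, {})
    return {name: value * multipliers.get(name, 1.0)
            for name, value in base.items()}

if __name__ == "__main__":
    for label in ("pedestrian_crossing", "construction_zone", "open_road"):
        print(label, weights_for_scene(label))
```

Keeping the adjustment as multipliers on a fixed base leaves the controller's cost function unchanged, so only the relative emphasis of each term shifts with the scene.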