Using LLMs For Autonomous Vehicles
LC-LLM is an explainable lane change prediction model that leverages the strong reasoning capabilities of LLMs.
Introduction
In order to maintain safe driving in constantly changing surroundings, autonomous vehicles need the ability to reliably anticipate the lane change intentions of nearby vehicles ahead of time and predict their future movements.
Current methods for predicting vehicle motion have considerable potential for enhancement, especially in terms of accurately foreseeing long-term trajectories and making predictions that are easier to understand.
Lane change prediction tasks are recast as language modelling problems.
Key Take Aways
Large Language Models can be leveraged due to its strong reasoning and self-explanation capabilities.
The Inference latency is a challenge of course, as responses should be sub 500 milliseconds ideally.
Off-line capabilities are of utmost importance, and having the model running locally, within the vehicle’s ecosystem.
Possessing natural language input and output capabilities afford an additional user interface via natural language.
It would be interesting to understand how this feature fits into the larger existing ecosystem of autonomous vehicle hardware, software, AI and prediction systems.
Recasting LLMs in this setting, should open our minds to other scenarios ware language models can be employed. Especially with the focus here being on explainability.
Lastly, I find the idea of intent detection fascinating, for long intents were associated with inbound customer conversations. However, this study recasts intent detection in a new milieu.
Considering the image above, the target vehicle is highlighted in green, while surrounding vehicles are shown in blue. An orange line indicates the future trajectory of the target vehicle for t time steps. The term “advanced prediction time” (T) refers to the temporal gap between the frame where a lane change occurs and the current frame.
Objective
The study uses the LLM’s capability of smart common sense reasoning to grasp complex interactive details, which boosts the accuracy of long-term predictions.
Added are explanatory elements to prompts during the analysis phase. The LC-LLM model not only predicts lane changes and trajectories but also explains its predictions, making them easier to understand.
Compared to basic models, LC-LLM improves intention prediction by 3.1%, reduces lateral trajectory prediction errors by 19.4%, and cuts longitudinal trajectory prediction errors by 38.1%. This is the first known effort to use LLMs for forecasting lane changes, demonstrating their ability to understand driving behaviour comprehensively.
Explainability
The researches realised that they can harness the power of LLMs to explain predicted lane change intentions and future trajectories, enhancing how to interpret forecasts in autonomous driving systems.
By integrating LLMs into prediction models, the aim is to provide clearer insights into why certain actions are anticipated, offering a deeper understanding of the decision-making process within these systems.
This approach not only improves the trustworthiness of autonomous driving technology but also opens avenues for further research into explainable AI within this field.
Intention Prediction
The idea of intention prediction within this research grabbed my attention. For years within chatbots, the main idea was to detect the user’s intention when initiating a conversation. Another challenge was to detect intent mid-way through a conversation, where the user digresses.
What makes intents hard in a conversational setting is the fact that intents are pre-defined classes, or groupings of expected and known user input. Hence once the intent of the user is established, the query or need of the user can be fulfilled.
Hence with this study, intent detection is again in the foreground, but the difference here is that the intent of surrounding drivers needs to be detected, and forecasted.
Predicting lane change intentions is a crucial aspect of both autonomous driving and advanced driver assistance systems (ADAS).
Studies in this area aim to precisely forecast when vehicles will change lanes, improving overall road safety and traffic efficiency by anticipating manoeuvres before they occur.
Practical Illustration
Considering the image below…
The green trajectory is the future trajectory of the vehicle, which the model needs to predict.
Observations are described and given in natural language via prompt engineering. Via supervision, the LC-LLM is fine-tuned for accurate predictions.
For inference, prompts are designed to have explainability and observability as part of their output.
The fine-tuned model is capable of predicting the target vehicle’s lane change intentions and future trajectory in the current frame, while simultaneously providing explanations for its predictions, thus improving the interpretability of the model’s outputs.
Prompt Detail
During the fine-tuning stage, the input prompts consist of two components:
A system message displayed in the upper text block &
A user message presented in the lower text block.
The user message primarily includes map details, the current state of the target vehicle, spatial relations between the target vehicle and nearby vehicles, and constructed chain-of-thought statements.
Apart from the prediction portion, the LC-LLM affords the user with natural language output on why certain decisions were made, or speech feedbackon the vehicles surroundings.
There is also scope for user input via speech, asking for advice or what the best course of action might be.
In Conclusion
The paper introduces LC-LLM, a lane change prediction model that not only forecasts lane change intentions and trajectories but also provides explanations for its predictions.
It reframes lane change prediction as a language modelling problem and fine-tunes the LLM using supervised techniques.
This approach capitalises on the LLM’s strong common sense reasoning and self-explanation abilities. Experiments demonstrate improved accuracy and interpretability of predictions.
However, the study currently focuses on highway scenarios and suggests future research extending to various traffic scenarios, including urban scenes.
It also highlights the need for compressing LLMs for faster inference and developing methods to predict lane changes for multiple vehicles simultaneously.
I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.