As autonomous driving technology evolves, the next major challenge is to create systems that not only follow rules but also drive smoothly, safely, and comfortably, much like a skilled human. This research project addresses this challenge by developing the “PrefDrive” framework, which integrates nuanced human driving preferences—such as maintaining safe distances or ensuring smooth acceleration—into autonomous driving models using Large Language Models (LLMs). The goal is to create a system that can align with a wide range of requirements, from basic operational needs like traffic rule compliance to more human-like driving behaviors.
The core of PrefDrive is its pioneering use of Direct Preference Optimization (DPO), a preference learning technique, in the autonomous driving domain. This approach trains the model by having it learn from pairs of “chosen” (desirable) and “rejected” (undesirable) driving actions for a given scenario, allowing it to discern the optimal human choice. For this research, we built and publicly released a comprehensive dataset of 74,040 driving preference sequences. By implementing memory-efficient techniques like LoRA and 4-bit quantization, we have also made advanced LLM fine-tuning accessible on consumer-grade hardware, broadening research opportunities in the field.
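To make the mechanism concrete, the sketch below shows the standard binary DPO objective that PrefDrive builds on, written in PyTorch. The function name, tensor names, and the β value are illustrative assumptions for exposition, not code from the released repository.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Binary DPO loss over a batch of (chosen, rejected) driving-action pairs.

    Each *_logps tensor holds the summed log-probability of the corresponding
    action text under the trainable policy or the frozen reference model.
    """
    # Implicit reward of each response: beta * (log pi(y|x) - log pi_ref(y|x)).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the chosen action's implicit reward above the rejected one's.
    margin = chosen_rewards - rejected_rewards
    loss = -F.logsigmoid(margin).mean()
    return loss, margin.detach()  # the margin is useful for monitoring training
```

In practice the policy here would be a LoRA-adapted, 4-bit-quantized LLM, which is what allows this optimization to run on consumer-grade GPUs as described above.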
However, real-world driving decisions are rarely a simple binary choice. For a single correct action, there are often multiple potential incorrect actions, each carrying a different degree of risk. To capture this more complex decision-making landscape, we extended the project into “Multi-PrefDrive”. This framework trains the model by pairing one “chosen” action with multiple “rejected” alternatives, such as actions that are “aggressive,” “inattentive,” or “overcautious,” enabling the model to develop a more nuanced understanding of the spectrum of possible driving errors.
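As a rough illustration of what such a preference record might look like (the field names and scenario text are hypothetical, not drawn from the released dataset):

```python
# One hypothetical Multi-PrefDrive preference record: a single chosen action
# paired with several rejected alternatives of differing risk.
sample = {
    "scenario": "Approaching a signalized intersection behind a braking lead vehicle.",
    "chosen": "Ease off the throttle and keep a roughly two-second gap.",
    "rejected": [
        {"action": "Accelerate and change lanes to pass before the light.", "label": "aggressive"},
        {"action": "Hold current speed without reacting to the lead vehicle.", "label": "inattentive"},
        {"action": "Brake hard to a full stop well before the intersection.", "label": "overcautious"},
    ],
}
```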
Multi-PrefDrive implements the Plackett-Luce preference model to rank the chosen action against the full set of rejected alternatives. Experiments in the CARLA simulator demonstrated that this multi-preference approach improves markedly on standard DPO, with an 11.0% gain in overall driving score and the largest gains in safety: the framework achieved an 83.6% reduction in infrastructure collisions and, in certain environments, eliminated traffic light violations entirely. This work validates that teaching AI to understand complex human judgment is a critical step toward creating safer and more reliable autonomous vehicles.
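A minimal sketch of the Plackett-Luce ranking loss is given below, assuming DPO-style implicit rewards have already been computed per candidate and that the rejected actions carry a fixed order from least to most undesirable; both are assumptions for exposition, and the paper's exact formulation may differ.

```python
import torch

def plackett_luce_loss(rewards):
    """Negative log-likelihood of a full ranking under the Plackett-Luce model.

    rewards: tensor of shape (batch, K + 1); column 0 is the chosen action's
    implicit reward, columns 1..K are the rejected alternatives ordered from
    least to most undesirable. Each rank position is drawn from the remaining
    candidates with softmax probability, so the NLL is a sum of log-sum-exp
    terms over suffixes.
    """
    nll = rewards.new_zeros(rewards.shape[0])
    num_candidates = rewards.shape[1]
    for k in range(num_candidates - 1):  # the last remaining item contributes log(1) = 0
        tail = rewards[:, k:]  # candidates not yet placed in the ranking
        nll = nll + torch.logsumexp(tail, dim=1) - rewards[:, k]
    return nll.mean()
```

With a single rejected alternative (K = 1) this reduces exactly to the binary DPO loss sketched earlier, which is one way to see Multi-PrefDrive as a strict generalization of PrefDrive.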
@inproceedings{Li2025d,
title = {Multi-PrefDrive: Optimizing Large Language Models for Autonomous Driving Through Multi-Preference Tuning},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://liyun0607.github.io/},
year = {2025},
date = {2025-10-19},
booktitle = {2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
address = {Hangzhou, China},
abstract = {This paper introduces Multi-PrefDrive, a framework that significantly enhances LLM-based autonomous driving through multidimensional preference tuning. Aligning LLMs with human driving preferences is crucial yet challenging, as driving scenarios involve complex decisions where multiple incorrect actions can correspond to a single correct choice. Traditional binary preference tuning fails to capture this complexity. Our approach pairs each chosen action with multiple rejected alternatives, better reflecting real-world driving decisions. By implementing the Plackett-Luce preference model, we enable nuanced ranking of actions across the spectrum of possible errors. Experiments in the CARLA simulator demonstrate that our algorithm achieves an 11.0% improvement in overall score and an 83.6% reduction in
infrastructure collisions, while showing perfect compliance with traffic signals in certain environments. Comparative analysis against DPO and its variants reveals Multi-PrefDrive's superior discrimination between chosen and rejected actions, achieving a margin value of 25, an ability that translates directly into enhanced driving performance. We implement memory-efficient techniques including LoRA and 4-bit quantization to enable deployment on consumer-grade hardware and will open-source our training code and multi-rejected dataset to advance research in LLM-based autonomous driving systems. Project Page (https://liyun0607.github.io/)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{Li2025c,
title = {PrefDrive: Enhancing Autonomous Driving through Preference-Guided Large Language Models},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://github.com/LiYun0607/PrefDrive/
https://huggingface.co/liyun0607/PrefDrive
https://huggingface.co/datasets/liyun0607/PrefDrive},
doi = {10.1109/IV64158.2025.11097672},
year = {2025},
date = {2025-06-22},
urldate = {2025-06-22},
booktitle = {36th IEEE Intelligent Vehicles Symposium (IV2025)},
address = {Cluj-Napoca, Romania},
abstract = {This paper presents PrefDrive, a novel framework that integrates driving preferences into autonomous driving models through large language models (LLMs). While recent advances in LLMs have shown promise in autonomous driving, existing approaches often struggle to align with specific driving behaviors (e.g., maintaining safe distances, smooth acceleration patterns) and operational requirements (e.g., traffic rule compliance, route adherence). We address this challenge by developing a preference learning framework that combines multimodal perception with natural language understanding. Our approach leverages Direct Preference Optimization (DPO) to fine-tune LLMs efficiently on consumer-grade hardware, making advanced autonomous driving research more accessible to the broader research community. We introduce a comprehensive dataset of 74,040 sequences, carefully annotated with driving preferences and driving decisions, which, along with our trained model checkpoints, will be made publicly available to facilitate future research. Through extensive experiments in the CARLA simulator, we demonstrate that our preference-guided approach significantly improves driving performance across multiple metrics, including distance maintenance and trajectory smoothness. Results show up to 28.1% reduction in traffic rule violations and 8.5% improvement in navigation task completion while maintaining appropriate distances from obstacles. The framework demonstrates robust performance across different urban environments, showcasing the effectiveness of preference learning in autonomous driving applications. },
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
We are part of the University of Tokyo's Graduate School of Information Science and Technology, Department of Creative Informatics, and focus on computer networks and cyber-physical systems.
Address
4F, I-REF building, Graduate School of Information Science and Technology, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo, 113-8657 Japan
Room 91B1, Bld 2 of Engineering Department, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan