
As autonomous driving technology matures, the next major challenge is no longer simply obeying rules, but driving as smoothly, safely, and comfortably as a human. In this research project we developed PrefDrive, a framework that uses large language models (LLMs), the core of recent AI advances, to build nuanced human driving preferences (for example, keeping an appropriate following distance and accelerating and braking smoothly) into an autonomous driving model. The aim is a system that can satisfy a broad range of requirements, from basic ones such as traffic-rule compliance to more human-like driving behavior.
The core of PrefDrive is its pioneering application of Direct Preference Optimization (DPO), a preference-learning technique, to autonomous driving. For a given traffic situation, the model is shown a pair consisting of a desirable driving action (chosen) and an undesirable one (rejected), and it learns from the comparison what humans consider optimal. For this work we built and released a dedicated driving-preference dataset of 74,040 sequences. By combining memory-efficiency techniques such as LoRA and 4-bit quantization, we also made it possible to fine-tune capable LLMs on ordinary lab-grade GPUs, lowering the barrier to entry for this line of research.
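For concreteness, the canonical DPO objective (Rafailov et al., 2023) trains the policy π_θ to raise the likelihood of the chosen action y_w over the rejected action y_l relative to a frozen reference model π_ref; the paper's exact formulation may differ in detail, but the standard loss is:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```

Here x is the driving context, σ is the logistic function, and β is a temperature controlling how far the policy may drift from the reference model. The memory-efficiency side can be sketched with the Hugging Face peft and bitsandbytes libraries; the base model name and hyperparameters below are illustrative assumptions, not the project's actual configuration.

```python
# Minimal sketch: LoRA fine-tuning on top of a 4-bit quantized base model.
# Model name and hyperparameters are placeholders, not the project's setup.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # store frozen base weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,   # run matmuls in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                    # low-rank adapter dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],     # adapt attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only the small adapters are trained
```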
Real-world driving decisions, however, are rarely a simple binary choice: a single correct action typically stands against several incorrect alternatives that differ in how dangerous they are. To capture this nuance, we extended the project into Multi-PrefDrive. The new framework pairs each desirable action with a set of undesirable ones labeled, for example, "too aggressive," "inattentive," or "overly cautious" (see the sketch below). This trains the model to single out the best behavior more precisely from among diverse kinds of errors.
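A minimal sketch of what one such multi-rejected training record could look like; the field names and scenario are hypothetical, not the released dataset's actual schema:

```python
# One multi-rejected preference record (illustrative schema, not the real one).
sample = {
    "context": (
        "Approaching a signalized intersection; light just turned yellow; "
        "lead vehicle 18 m ahead; ego speed 42 km/h."
    ),
    "chosen": "Ease off the throttle and brake smoothly to stop before the line.",
    "rejected": [
        {"action": "Accelerate to clear the intersection before the red.",
         "error_type": "too aggressive"},
        {"action": "Hold current speed and decide at the stop line.",
         "error_type": "inattentive"},
        {"action": "Brake hard immediately to a full stop in traffic.",
         "error_type": "overly cautious"},
    ],
}
```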
To handle multiple alternatives, Multi-PrefDrive implements the Plackett-Luce model, a probabilistic model over rankings. Experiments in the CARLA simulator showed that this approach outperforms conventional DPO, with especially dramatic gains in safety: infrastructure collisions dropped by 83.6%, and in certain environments red-light violations were eliminated entirely. These results indicate that teaching AI the nuanced structure of human judgment is essential for safer, more trustworthy autonomous driving.
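For reference, the Plackett-Luce model scores a full ranking by repeatedly selecting the top remaining candidate with a softmax over scores. Writing s(y | x) = β log(π_θ(y | x) / π_ref(y | x)) for the DPO-style implicit reward (a common choice; the paper may parameterize it differently), the probability of ranking the chosen action y_1 above K rejected alternatives y_2, …, y_{K+1} is:

```latex
P(y_1 \succ y_2 \succ \cdots \succ y_{K+1} \mid x)
  = \prod_{k=1}^{K+1}
    \frac{\exp\!\big(s(y_k \mid x)\big)}
         {\sum_{j=k}^{K+1} \exp\!\big(s(y_j \mid x)\big)}
```

Training minimizes the negative log of this probability; with a single rejected alternative (K = 1) the model reduces to the Bradley-Terry form underlying pairwise DPO.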
@inproceedings{Li2025d,
title = {Multi-PrefDrive: Optimizing Large Language Models for Autonomous Driving Through Multi-Preference Tuning},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://liyun0607.github.io/},
doi = {10.1109/IROS60139.2025.11247608},
year = {2025},
date = {2025-10-19},
urldate = {2025-10-19},
booktitle = {2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
address = {Hangzhou, China},
abstract = {This paper introduces Multi-PrefDrive, a framework that significantly enhances LLM-based autonomous driving through multidimensional preference tuning. Aligning LLMs with human driving preferences is crucial yet challenging, as driving scenarios involve complex decisions where multiple incorrect actions can correspond to a single correct choice. Traditional binary preference tuning fails to capture this complexity. Our approach pairs each chosen action with multiple rejected alternatives, better reflecting real-world driving decisions. By implementing the Plackett-Luce preference model, we enable nuanced ranking of actions across the spectrum of possible errors. Experiments in the CARLA simulator demonstrate that our algorithm achieves an 11.0% improvement in overall score and an 83.6% reduction in infrastructure collisions, while showing perfect compliance with traffic signals in certain environments. Comparative analysis against DPO and its variants reveals Multi-PrefDrive's superior discrimination between chosen and rejected actions, achieving a margin value of 25, an ability that translates directly into enhanced driving performance. We implement memory-efficient techniques including LoRA and 4-bit quantization to enable deployment on consumer-grade hardware and will open-source our training code and multi-rejected dataset to advance research in LLM-based autonomous driving systems. Project Page (https://liyun0607.github.io/)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{Li2025c,
title = {PrefDrive: Enhancing Autonomous Driving through Preference-Guided Large Language Models},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://github.com/LiYun0607/PrefDrive/
https://huggingface.co/liyun0607/PrefDrive
https://huggingface.co/datasets/liyun0607/PrefDrive},
doi = {10.1109/IV64158.2025.11097672},
year = {2025},
date = {2025-06-22},
urldate = {2025-06-22},
booktitle = {36th IEEE Intelligent Vehicles Symposium (IV2025)},
address = {Cluj-Napoca, Romania},
abstract = {This paper presents PrefDrive, a novel framework that integrates driving preferences into autonomous driving models through large language models (LLMs). While recent advances in LLMs have shown promise in autonomous driving, existing approaches often struggle to align with specific driving behaviors (e.g., maintaining safe distances, smooth acceleration patterns) and operational requirements (e.g., traffic rule compliance, route adherence). We address this challenge by developing a preference learning framework that combines multimodal perception with natural language understanding. Our approach leverages Direct Preference Optimization (DPO) to fine-tune LLMs efficiently on consumer-grade hardware, making advanced autonomous driving research more accessible to the broader research community. We introduce a comprehensive dataset of 74,040 sequences, carefully annotated with driving preferences and driving decisions, which, along with our trained model checkpoints, will be made publicly available to facilitate future research. Through extensive experiments in the CARLA simulator, we demonstrate that our preference-guided approach significantly improves driving performance across multiple metrics, including distance maintenance and trajectory smoothness. Results show up to 28.1% reduction in traffic rule violations and 8.5% improvement in navigation task completion while maintaining appropriate distances from obstacles. The framework demonstrates robust performance across different urban environments, showcasing the effectiveness of preference learning in autonomous driving applications. },
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}