
Level-4 autonomous flight of unmanned aerial vehicles (UAVs) is increasingly in demand for applications such as logistics, surveillance, and disaster response. However, ensuring safe navigation and precise path planning in dynamic environments remains a major challenge. Traditional methods focus heavily on spatial path generation under static assumptions, making them ill-suited to moving or newly emerging obstacles. Furthermore, stable self-localization is difficult in environments where GPS is unavailable. To address these issues, this project aims to construct a scientifically rigorous and certifiable autonomous flight framework by integrating 4D spatiotemporal voxel representations, predictive world models, safety-constrained reinforcement learning, and Vision-Language Model / Vision-Language-Action (VLM/VLA) modules.
The project introduces a 4D spatiotemporal voxel structure that represents static and dynamic obstacles in a unified way, with adaptive temporal resolution for predicting the future occupancy of nearby airspace. This supports efficient real-time updates as objects move. Building on this representation, world models simulate environmental evolution, predicting the trajectories of surrounding objects and environmental disturbances such as gusts or reduced visibility. Through short-horizon rollouts and counterfactual scenarios, the system turns path planning into a risk-aware, forward-looking process rather than a purely reactive one.
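As a rough illustration of the representation described above, the sketch below implements a sparse 4D voxel map with adaptive temporal resolution: near-future time slices are binned finely, far-future slices coarsely. All names and parameters (`SpatioTemporalVoxelGrid`, the resolutions, the constant-velocity obstacle model) are illustrative assumptions, not the project's actual implementation.

```python
class SpatioTemporalVoxelGrid:
    """Sparse 4D occupancy map keyed by (ix, iy, iz, it).

    Near-future time slices are discretized at `fine_dt`, slices beyond
    `fine_horizon` at `coarse_dt` (adaptive temporal resolution).
    All parameters are illustrative, not from the project.
    """

    def __init__(self, voxel=1.0, fine_dt=0.2, coarse_dt=1.0,
                 fine_horizon=2.0, horizon=6.0):
        self.voxel = voxel
        self.fine_dt = fine_dt
        self.coarse_dt = coarse_dt
        self.fine_horizon = fine_horizon
        self.horizon = horizon
        self.occupied = set()          # set of (ix, iy, iz, it) keys

    def _time_index(self, t):
        # Fine bins up to fine_horizon, coarse bins beyond it.
        if t <= self.fine_horizon:
            return round(t / self.fine_dt)
        n_fine = round(self.fine_horizon / self.fine_dt)
        return n_fine + round((t - self.fine_horizon) / self.coarse_dt)

    def _key(self, pos, t):
        ix, iy, iz = (round(c / self.voxel) for c in pos)
        return (ix, iy, iz, self._time_index(t))

    def _sample_times(self):
        # Dense time samples early in the horizon, sparse samples later.
        n_fine = int(self.fine_horizon / self.fine_dt)
        times = [i * self.fine_dt for i in range(n_fine + 1)]
        t = self.fine_horizon
        while t < self.horizon:
            t += self.coarse_dt
            times.append(t)
        return times

    def mark_dynamic(self, pos, vel):
        """Rasterize a constant-velocity prediction of a moving obstacle."""
        for t in self._sample_times():
            p = tuple(c + v * t for c, v in zip(pos, vel))
            self.occupied.add(self._key(p, t))

    def is_free(self, pos, t):
        """True if the voxel containing `pos` is predicted free at time t."""
        return self._key(pos, t) not in self.occupied
```

For example, after `grid.mark_dynamic((0.0, 0.0, 10.0), (1.0, 0.0, 0.0))`, a query at the obstacle's predicted position two seconds ahead reports the voxel as occupied, while distant voxels remain free.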
For decision-making, reinforcement learning (RL) is applied to optimize path following and obstacle avoidance, with explicit safety constraints. Offline pretraining on expert trajectories (e.g., A*, RRT*) provides efficient initial policies, while runtime shields based on control barrier functions and reachability analysis suppress unsafe actions during online learning. To enable navigation in GPS-denied environments, a VLM/VLA module integrates camera-based perception with natural-language instructions, allowing UAVs to understand and execute high-level mission commands (e.g., “survey this area and return”) even where mapping or GPS data are unreliable.
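The runtime-shield idea can be sketched as a minimal control-barrier-function filter: the RL policy's commanded velocity passes through unchanged when the barrier condition holds, and is minimally corrected otherwise. The barrier h(x) = ||x − x_obs|| − r_safe, the gain `alpha`, and the one-step projection are illustrative assumptions, not the project's actual shield design.

```python
import math

def cbf_shield(pos, vel_cmd, obstacle, r_safe=2.0, alpha=1.0):
    """Filter an RL velocity command through a distance barrier.

    Barrier: h(x) = ||x - obstacle|| - r_safe. Safety requires
    dh/dt >= -alpha * h. If the command violates this, it is minimally
    corrected along the barrier gradient (a one-step projection).
    """
    dx = [p - o for p, o in zip(pos, obstacle)]
    dist = math.sqrt(sum(d * d for d in dx))
    h = dist - r_safe                       # signed safety margin
    grad = [d / dist for d in dx]           # unit gradient of h w.r.t. pos
    h_dot = sum(g * v for g, v in zip(grad, vel_cmd))
    if h_dot >= -alpha * h:
        return list(vel_cmd)                # command is safe: pass through
    # Adding lam * grad raises h_dot by exactly lam (since ||grad|| = 1),
    # restoring the constraint with the smallest possible change.
    lam = -alpha * h - h_dot
    return [v + lam * g for v, g in zip(vel_cmd, grad)]
```

A gentle approach toward an obstacle is passed through unchanged, whereas an aggressive closing velocity is scaled back just enough to satisfy the barrier condition; this pass-through property is what lets the RL policy keep learning online without the shield distorting safe behavior.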
This research will present a new framework for Level-4 autonomous flight, combining rigorous spatiotemporal representation, learning-based control, formal safety assurance, and multimodal task understanding. Expected outcomes include measurable improvements in safety and efficiency in dynamic environments, enhanced GPS-independent localization and navigation, and a data representation foundation for future integration with Air Traffic Management (ATM) and Unmanned Traffic Management (UTM) systems. Looking forward, the project envisions extensions to multi-UAV coordination, weather-aware operations, certification frameworks based on learning assurance, and real-world flight demonstrations in logistics and emergency response scenarios.
@inproceedings{Bao2025,
title = {4D Path Planning via Spatiotemporal Voxels in Urban Airspaces},
author = {Naren Bao and Alex Orsholits and Manabu Tsukada},
doi = {10.1109/MetaCom65502.2025.00022},
year = {2025},
booktitle = {3rd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (IEEE MetaCom 2025)},
address = {Seoul, Republic of Korea},
abstract = {This paper presents an approach to four-dimensional (4D) path planning for unmanned aerial vehicles (UAVs) in complex urban environments. We introduce a spatiotemporal voxel-based representation that effectively models both spatial and temporal dimensions of urban airspaces. By integrating the 4D spatio-temporal ID framework with reinforcement learning techniques, our system generates efficient and safe flight paths while considering dynamic obstacles and environmental constraints. The proposed method combines off-line pretraining and online fine-tuning of reinforcement learning models to achieve computational efficiency without compromising path quality. Experiments conducted using PLATEAU datasets in various urban scenarios demonstrate that our approach outperforms traditional path planning algorithms by 24% in safety metrics and 18% in efficiency metrics. Our framework advances the state-of-the-art in urban air mobility by providing a scalable solution for airspace management in increasingly congested urban environments.},
}