Publication
2025
Vishal Chauhan, Anubhav Anubhav, Robin Sidhu, Yu Asabe, Kanta Tanaka, Chia-Ming Chang, Xiang Su, Ehsan Javanmardi, Takeo Igarashi, Alex Orsholits, Kantaro Fujiwara, Manabu Tsukada, "A Silent Negotiator? Cross-cultural VR Evaluation of Smart Pole Interaction Units in Dynamic Shared Spaces", In: The ACM Symposium on Virtual Reality Software and Technology (VRST2025), Montreal, Canada, 2025. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Chauhan2025b,
title = {A Silent Negotiator? Cross-cultural VR Evaluation of Smart Pole Interaction Units in Dynamic Shared Spaces},
author = {Vishal Chauhan and Anubhav Anubhav and Robin Sidhu and Yu Asabe and Kanta Tanaka and Chia-Ming Chang and Xiang Su and Ehsan Javanmardi and Takeo Igarashi and Alex Orsholits and Kantaro Fujiwara and Manabu Tsukada},
url = {https://github.com/tlab-wide/Smartpole-VR-AWSIM.git},
doi = {10.1145/3756884.3765991},
year = {2025},
date = {2025-11-12},
urldate = {2025-11-12},
booktitle = {The ACM Symposium on Virtual Reality Software and Technology (VRST2025) },
address = {Montreal, Canada},
abstract = {As autonomous vehicles (AVs) enter pedestrian-centric environments, existing vehicle-mounted external human–machine interfaces (eHMIs) often fall short in shared spaces due to line-of-sight limitations, inconsistent signaling, and increased cognitive burden on pedestrians. To address these challenges, we introduce the Smart Pole Interaction Unit (SPIU), an infrastructure-based eHMI that decouples intent signaling from vehicles and provides context-aware, elevated visual cues. We evaluate SPIU using immersive VR-AWSIM simulations in four high-risk urban scenarios: four-way intersections, autonomous mixed traffic, blindspots, and nighttime crosswalks. The experiment was developed in Japan and replicated in Norway, where forty participants engaged in 32 trials each under both SPIU-present and SPIU-absent conditions. Behavioral (response time) and subjective (acceptance scale) data were collected. Results show that SPIU significantly improves pedestrian decision-making, with reductions ranging from 40% to over 80% depending on scenario and cultural context, particularly in complex or low-visibility scenarios. Cross-cultural analyses highlight SPIU's adaptability across differing urban and social contexts. We release our open-source Smartpole-VR-AWSIM framework to support reproducibility and global advancement of infrastructure-based eHMI research through reproducible and immersive behavioral studies.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
As autonomous vehicles (AVs) enter pedestrian-centric environments, existing vehicle-mounted external human–machine interfaces (eHMIs) often fall short in shared spaces due to line-of-sight limitations, inconsistent signaling, and increased cognitive burden on pedestrians. To address these challenges, we introduce the Smart Pole Interaction Unit (SPIU), an infrastructure-based eHMI that decouples intent signaling from vehicles and provides context-aware, elevated visual cues. We evaluate SPIU using immersive VR-AWSIM simulations in four high-risk urban scenarios: four-way intersections, autonomous mixed traffic, blindspots, and nighttime crosswalks. The experiment was developed in Japan and replicated in Norway, where forty participants engaged in 32 trials each under both SPIU-present and SPIU-absent conditions. Behavioral (response time) and subjective (acceptance scale) data were collected. Results show that SPIU significantly improves pedestrian decision-making, with reductions ranging from 40% to over 80% depending on scenario and cultural context, particularly in complex or low-visibility scenarios. Cross-cultural analyses highlight SPIU's adaptability across differing urban and social contexts. We release our open-source Smartpole-VR-AWSIM framework to support reproducibility and global advancement of infrastructure-based eHMI research through reproducible and immersive behavioral studies.
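To make the reported reductions concrete, the sketch below shows one way a per-scenario response-time reduction could be computed from trial data; the column names and numbers are hypothetical and are not taken from the released Smartpole-VR-AWSIM framework or the study's data.

```python
import pandas as pd

# Hypothetical per-condition response times (seconds); not the study's released data.
trials = pd.DataFrame({
    "scenario":  ["intersection", "intersection", "night_crosswalk", "night_crosswalk"],
    "condition": ["no_spiu", "spiu", "no_spiu", "spiu"],
    "rt_s":      [4.2, 1.9, 6.1, 1.1],
})

# Mean response time per scenario and condition, then percent reduction with SPIU present
mean_rt = trials.groupby(["scenario", "condition"])["rt_s"].mean().unstack("condition")
mean_rt["reduction_pct"] = 100 * (mean_rt["no_spiu"] - mean_rt["spiu"]) / mean_rt["no_spiu"]
print(mean_rt)
```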
Yun Li, Ehsan Javanmardi, Simon Thompson, Kai Katsumata, Alex Orsholits, Manabu Tsukada, "Multi-PrefDrive: Optimizing Large Language Models for Autonomous Driving Through Multi-Preference Tuning", In: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 2025. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Li2025d,
title = {Multi-PrefDrive: Optimizing Large Language Models for Autonomous Driving Through Multi-Preference Tuning},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://liyun0607.github.io/},
year = {2025},
date = {2025-10-19},
booktitle = {2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
address = {Hangzhou, China},
abstract = {This paper introduces Multi-PrefDrive, a framework that significantly enhances LLM-based autonomous driving through multidimensional preference tuning. Aligning LLMs with human driving preferences is crucial yet challenging, as driving scenarios involve complex decisions where multiple incorrect actions can correspond to a single correct choice. Traditional binary preference tuning fails to capture this complexity. Our approach pairs each chosen action with multiple rejected alternatives, better reflecting real-world driving decisions. By implementing the Plackett-Luce preference model, we enable nuanced ranking of actions across the spectrum of possible errors. Experiments in the CARLA simulator demonstrate that our algorithm achieves an 11.0% improvement in overall score and an 83.6% reduction in infrastructure collisions, while showing perfect compliance with traffic signals in certain environments. Comparative analysis against DPO and its variants reveals Multi-PrefDrive’s superior discrimination between chosen and rejected actions, achieving a margin value of 25, which translates directly into enhanced driving performance. We implement memory-efficient techniques including LoRA and 4-bit quantization to enable deployment on consumer-grade hardware and will open-source our training code and multi-rejected dataset to advance research in LLM-based autonomous driving systems. Project Page (https://liyun0607.github.io/)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
This paper introduces Multi-PrefDrive, a framework that significantly enhances LLM-based autonomous driving through multidimensional preference tuning. Aligning LLMs with human driving preferences is crucial yet challenging, as driving scenarios involve complex decisions where multiple incorrect actions can correspond to a single correct choice. Traditional binary preference tuning fails to capture this complexity. Our approach pairs each chosen action with multiple rejected alternatives, better reflecting real-world driving decisions. By implementing the Plackett-Luce preference model, we enable nuanced ranking of actions across the spectrum of possible errors. Experiments in the CARLA simulator demonstrate that our algorithm achieves an 11.0% improvement in overall score and an 83.6% reduction in infrastructure collisions, while showing perfect compliance with traffic signals in certain environments. Comparative analysis against DPO and its variants reveals Multi-PrefDrive’s superior discrimination between chosen and rejected actions, achieving a margin value of 25, which translates directly into enhanced driving performance. We implement memory-efficient techniques including LoRA and 4-bit quantization to enable deployment on consumer-grade hardware and will open-source our training code and multi-rejected dataset to advance research in LLM-based autonomous driving systems. Project Page (https://liyun0607.github.io/)
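For readers unfamiliar with the Plackett-Luce formulation mentioned above, the following is a minimal PyTorch sketch of a preference loss that ranks one chosen action above several rejected alternatives, in the spirit of multi-preference tuning. It is not the authors' released training code; the tensor shapes, the DPO-style reward construction, and the beta value are illustrative assumptions.

```python
import torch

def multi_pref_loss(policy_chosen_logp, policy_rejected_logps,
                    ref_chosen_logp, ref_rejected_logps, beta=0.1):
    """Plackett-Luce style loss: rank one chosen action above K rejected alternatives.

    Assumed shapes: chosen tensors are (batch,), rejected tensors are (batch, K).
    Rewards follow the DPO convention beta * (log pi_policy - log pi_reference).
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)            # (B,)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)   # (B, K)

    # P(chosen ranked first) = exp(r_c) / (exp(r_c) + sum_k exp(r_k));  loss = -log P
    all_rewards = torch.cat([chosen_reward.unsqueeze(1), rejected_rewards], dim=1)
    loss = torch.logsumexp(all_rewards, dim=1) - chosen_reward
    margin = (chosen_reward.unsqueeze(1) - rejected_rewards).mean()          # chosen-vs-rejected gap
    return loss.mean(), margin

# Toy usage: batch of 4 driving decisions, each with 3 rejected alternatives
B, K = 4, 3
loss, margin = multi_pref_loss(torch.randn(B), torch.randn(B, K),
                               torch.randn(B), torch.randn(B, K))
print(float(loss), float(margin))
```

The margin returned alongside the loss corresponds to the chosen-versus-rejected reward gap that the abstract refers to.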
Vishal Chauhan, Anubhav Anubhav, Chia-Ming Chang, Xiang Su, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada, "Towards the Future of Pedestrian-AV Interaction: Human Perception vs. LLM Insights on Smart Pole Interaction Unit in Shared Spaces", In: International Journal of Human–Computer Studies (IJHCS), vol. 205, pp. 103628, 2025, ISSN: 1071-5819. Journal Article | Abstract | BibTeX | Links:
@article{Chauhan2025,
title = {Towards the Future of Pedestrian-AV Interaction: Human Perception vs. LLM Insights on Smart Pole Interaction Unit in Shared Spaces},
author = {Vishal Chauhan and Anubhav Anubhav and Chia-Ming Chang and Xiang Su and Jin Nakazato and Ehsan Javanmardi and Alex Orsholits and Takeo Igarashi and Kantaro Fujiwara and Manabu Tsukada},
doi = {10.1016/j.ijhcs.2025.103628},
issn = {1071-5819},
year = {2025},
date = {2025-09-13},
urldate = {2025-09-13},
journal = {International Journal of Human–Computer Studies (IJHCS)},
volume = {205},
pages = {103628},
abstract = {As autonomous vehicles (AVs) reshape urban mobility, establishing effective communication between pedestrians and self-driving vehicles has become a critical safety imperative. This work investigates the integration of Smart Pole Interaction Units (SPIUs) as external human–machine interfaces (eHMIs) in shared spaces and introduces an innovative approach to enhance pedestrian–AV interactions. To provide subjective evidence on SPIU usability, we conduct a group design study (“Humans”) involving 25 participants (aged 18–40). We evaluate user preferences and interaction patterns using group discussion materials, revealing that 90% of the participants strongly prefer real-time multi-AV interactions facilitated by SPIU over conventional eHMI systems, where a pedestrian must look at multiple AVs individually. Furthermore, they emphasize inclusive design through multi-sensory communication channels—visual, auditory, and tactile signals—specifically addressing the needs of vulnerable road users (VRUs), including those with impairments. To complement these non-expert, real-world insights, we employ three leading Large Language Models (LLMs) (ChatGPT-4, Gemini-Pro, and Claude 3.5 Sonnet) as “experts” due to their extensive training data. Using the advantages of the multimodal vision-language processing capabilities of these LLMs, identical questions (text and images) used in human discussions are posed to generate text responses for pedestrian–AV interaction scenarios. Responses generated from LLMs and recorded conversations from human group discussions are used to extract the most frequent words. A keyword frequency analysis from both humans and LLMs is performed with three categories, Context, Safety, and Important. Our findings indicate that LLMs employ safety-related keywords 30% more frequently than human participants, suggesting a more structured, safety-centric approach. Among LLMs, ChatGPT-4 demonstrates superior response latency, Claude shows a closer alignment with human responses, and Gemini-Pro provides structured and contextually relevant insights. Our results from “Humans” and “LLMs” establish SPIU as a promising system for facilitating trust-building and safety-ensuring interactions among pedestrians, AVs, and delivery robots. Integrating diverse stakeholder feedback, we propose a prototype SPIU design to advance pedestrian–AV interactions in shared urban spaces, positioning SPIU as crucial infrastructure hubs for safe and trustworthy navigation.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
As autonomous vehicles (AVs) reshape urban mobility, establishing effective communication between pedestrians and self-driving vehicles has become a critical safety imperative. This work investigates the integration of Smart Pole Interaction Units (SPIUs) as external human–machine interfaces (eHMIs) in shared spaces and introduces an innovative approach to enhance pedestrian–AV interactions. To provide subjective evidence on SPIU usability, we conduct a group design study (“Humans”) involving 25 participants (aged 18–40). We evaluate user preferences and interaction patterns using group discussion materials, revealing that 90% of the participants strongly prefer real-time multi-AV interactions facilitated by SPIU over conventional eHMI systems, where a pedestrian must look at multiple AVs individually. Furthermore, they emphasize inclusive design through multi-sensory communication channels—visual, auditory, and tactile signals—specifically addressing the needs of vulnerable road users (VRUs), including those with impairments. To complement these non-expert, real-world insights, we employ three leading Large Language Models (LLMs) (ChatGPT-4, Gemini-Pro, and Claude 3.5 Sonnet) as “experts” due to their extensive training data. Using the advantages of the multimodal vision-language processing capabilities of these LLMs, identical questions (text and images) used in human discussions are posed to generate text responses for pedestrian–AV interaction scenarios. Responses generated from LLMs and recorded conversations from human group discussions are used to extract the most frequent words. A keyword frequency analysis from both humans and LLMs is performed with three categories, Context, Safety, and Important. Our findings indicate that LLMs employ safety-related keywords 30% more frequently than human participants, suggesting a more structured, safety-centric approach. Among LLMs, ChatGPT-4 demonstrates superior response latency, Claude shows a closer alignment with human responses, and Gemini-Pro provides structured and contextually relevant insights. Our results from “Humans” and “LLMs” establish SPIU as a promising system for facilitating trust-building and safety-ensuring interactions among pedestrians, AVs, and delivery robots. Integrating diverse stakeholder feedback, we propose a prototype SPIU design to advance pedestrian–AV interactions in shared urban spaces, positioning SPIU as crucial infrastructure hubs for safe and trustworthy navigation.
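The keyword-frequency comparison described above can be illustrated with a short sketch; the category word lists and the example snippets below are hypothetical stand-ins, not the study's actual lexicons or transcripts.

```python
from collections import Counter
import re

# Hypothetical category lexicons; the paper's actual keyword lists are not reproduced here.
CATEGORIES = {
    "Safety":    {"safety", "collision", "risk", "stop", "yield"},
    "Context":   {"intersection", "crosswalk", "night", "traffic", "shared"},
    "Important": {"trust", "visibility", "signal", "priority", "attention"},
}

def category_frequencies(text):
    """Count how often each category's keywords occur in a transcript."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    return {cat: sum(tokens[w] for w in words) for cat, words in CATEGORIES.items()}

# Illustrative snippets standing in for human discussion transcripts and LLM responses
human_text = "participants said trust and visibility matter most at the crosswalk"
llm_text = "the model stressed safety, collision risk, and a clear stop signal at the intersection"

for label, text in [("Humans", human_text), ("LLMs", llm_text)]:
    print(label, category_frequencies(text))
```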
Shangkai Zhang, Alex Orsholits, Ehsan Javanmardi, Manabu Tsukada, "AWSIM-VR: A Tightly-Coupled Virtual Reality Extension for Human-in-the-Loop Pedestrian-Autonomous Vehicle Interaction", In: 3rd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (IEEE MetaCom 2025), Seoul, Republic of Korea, 2025. Proceedings Article | Abstract | BibTeX
@inproceedings{Zhang2025,
title = {AWSIM-VR: A Tightly-Coupled Virtual Reality Extension for Human-in-the-Loop Pedestrian-Autonomous Vehicle Interaction},
author = {Shangkai Zhang and Alex Orsholits and Ehsan Javanmardi and Manabu Tsukada},
year = {2025},
date = {2025-08-27},
booktitle = {3rd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (IEEE MetaCom 2025)},
address = {Seoul, Republic of Korea},
abstract = {Effective communication between autonomous vehicles (AVs) and pedestrians is crucial for ensuring future urban traffic safety. While external Human-Machine Interfaces (eHMIs) have emerged as promising solutions, current evaluation methodologies — particularly Virtual Reality (VR)-based studies — typically rely on scripted or pre-defined autonomous vehicle behaviors, limiting realism and neglecting pedestrians' active role in interactions. To address this, we introduce AWSIM-VR, a tightly-coupled VR extension of the AWSIM autonomous driving simulator, enabling real-time, human-in-the-loop pedestrian-AV interactions by directly integrating unmodified, real autonomous driving software (Autoware) into the simulation loop. Unlike previous systems, AWSIM-VR provides authentic, bidirectional interaction: pedestrians' actions dynamically influence vehicle decision-making and eHMI responses in real-time, closely emulating real-world AV scenarios. In user studies directly comparing AWSIM-VR to existing methodologies, participants reported significantly higher perceived realism and immersion, underscoring the importance of authentic autonomous behaviors in VR-based pedestrian interaction research. By directly utilizing production-level autonomous driving stacks, AWSIM-VR represents a significant methodological advancement, enabling more realistic, effective, and safer development and evaluation of eHMIs and AV technologies.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Effective communication between autonomous vehicles (AVs) and pedestrians is crucial for ensuring future urban traffic safety. While external Human-Machine Interfaces (eHMIs) have emerged as promising solutions, current evaluation methodologies — particularly Virtual Reality (VR)-based studies — typically rely on scripted or pre-defined autonomous vehicle behaviors, limiting realism and neglecting pedestrians' active role in interactions. To address this, we introduce AWSIM-VR, a tightly-coupled VR extension of the AWSIM autonomous driving simulator, enabling real-time, human-in-the-loop pedestrian-AV interactions by directly integrating unmodified, real autonomous driving software (Autoware) into the simulation loop. Unlike previous systems, AWSIM-VR provides authentic, bidirectional interaction: pedestrians' actions dynamically influence vehicle decision-making and eHMI responses in real-time, closely emulating real-world AV scenarios. In user studies directly comparing AWSIM-VR to existing methodologies, participants reported significantly higher perceived realism and immersion, underscoring the importance of authentic autonomous behaviors in VR-based pedestrian interaction research. By directly utilizing production-level autonomous driving stacks, AWSIM-VR represents a significant methodological advancement, enabling more realistic, effective, and safer development and evaluation of eHMIs and AV technologies.
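As a rough illustration of the human-in-the-loop coupling described above, the sketch below shows a ROS 2 node that forwards an AV pose into a VR update callback. The topic name, message type, and vr_update hook are assumptions for illustration only and are not AWSIM-VR's actual interface; running it requires a ROS 2 environment.

```python
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped

def vr_update(pose):
    # Placeholder for pushing the AV pose into the VR scene / eHMI renderer.
    print(f"AV at x={pose.position.x:.2f}, y={pose.position.y:.2f}")

class AvToVrBridge(Node):
    def __init__(self):
        super().__init__("av_to_vr_bridge")
        # Hypothetical topic name; a real setup would subscribe to the vehicle's published pose.
        self.create_subscription(PoseStamped, "/vehicle/pose", self.on_pose, 10)

    def on_pose(self, msg: PoseStamped):
        vr_update(msg.pose)

def main():
    rclpy.init()
    rclpy.spin(AvToVrBridge())
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```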
Naren Bao, Alex Orsholits, Manabu Tsukada, "4D Path Planning via Spatiotemporal Voxels in Urban Airspaces", In: 3rd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (IEEE MetaCom 2025), Seoul, Republic of Korea, 2025. Proceedings Article | Abstract | BibTeX
@inproceedings{Bao2025,
title = {4D Path Planning via Spatiotemporal Voxels in Urban Airspaces},
author = {Naren Bao and Alex Orsholits and Manabu Tsukada},
year = {2025},
date = {2025-08-27},
booktitle = {3rd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (IEEE MetaCom 2025)},
address = {Seoul, Republic of Korea},
abstract = {This paper presents an approach to four-dimensional (4D) path planning for unmanned aerial vehicles (UAVs) in complex urban environments. We introduce a spatiotemporal voxel-based representation that effectively models both spatial and temporal dimensions of urban airspaces. By integrating the 4D spatio-temporal ID framework with reinforcement learning techniques, our system generates efficient and safe flight paths while considering dynamic obstacles and environmental constraints. The proposed method combines off-line pretraining and online fine-tuning of reinforcement learning models to achieve computational efficiency without compromising path quality. Experiments conducted using PLATEAU datasets in various urban scenarios demonstrate that our approach outperforms traditional path planning algorithms by 24% in safety metrics and 18% in efficiency metrics. Our framework advances the state-of-the-art in urban air mobility by providing a scalable solution for airspace management in increasingly congested urban environments.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
This paper presents an approach to four-dimensional (4D) path planning for unmanned aerial vehicles (UAVs) in complex urban environments. We introduce a spatiotemporal voxel-based representation that effectively models both spatial and temporal dimensions of urban airspaces. By integrating the 4D spatio-temporal ID framework with reinforcement learning techniques, our system generates efficient and safe flight paths while considering dynamic obstacles and environmental constraints. The proposed method combines off-line pretraining and online fine-tuning of reinforcement learning models to achieve computational efficiency without compromising path quality. Experiments conducted using PLATEAU datasets in various urban scenarios demonstrate that our approach outperforms traditional path planning algorithms by 24% in safety metrics and 18% in efficiency metrics. Our framework advances the state-of-the-art in urban air mobility by providing a scalable solution for airspace management in increasingly congested urban environments.
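The spatiotemporal-voxel idea can be illustrated with a small, self-contained sketch: dynamic obstacles occupy (x, y, z) voxels at specific time steps, and the planner searches over (x, y, z, t) states, including a "wait" action. The grid size, obstacle set, and breadth-first search below are illustrative assumptions, not the paper's spatio-temporal ID encoding or its reinforcement-learning planner.

```python
from collections import deque

OCCUPIED = {(2, 2, 0, 3), (2, 2, 0, 4)}   # voxel (2, 2, 0) is blocked at times t=3 and t=4
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1), (0, 0, 0)]
SIZE = 6                                   # side length of the toy airspace cube

def plan_4d(start, goal, horizon=20):
    """Return a collision-free sequence of (x, y, z, t) voxels from start to goal."""
    q = deque([(start + (0,), [start + (0,)])])
    seen = {start + (0,)}
    while q:
        (x, y, z, t), path = q.popleft()
        if (x, y, z) == goal:
            return path
        for dx, dy, dz in MOVES:                       # spatial moves and waiting in place
            nxt = (x + dx, y + dy, z + dz, t + 1)      # every action consumes one time step
            if (all(0 <= c < SIZE for c in nxt[:3]) and t + 1 <= horizon
                    and nxt not in OCCUPIED and nxt not in seen):
                seen.add(nxt)
                q.append((nxt, path + [nxt]))
    return None

print(plan_4d(start=(0, 0, 0), goal=(4, 4, 0)))
```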
Yun Li, Ehsan Javanmardi, Simon Thompson, Kai Katsumata, Alex Orsholits, Manabu Tsukada, "PrefDrive: Enhancing Autonomous Driving through Preference-Guided Large Language Models", In: 36th IEEE Intelligent Vehicles Symposium (IV2025), Cluj-Napoca, Romania, 2025. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Li2025c,
title = {PrefDrive: Enhancing Autonomous Driving through Preference-Guided Large Language Models},
author = {Yun Li and Ehsan Javanmardi and Simon Thompson and Kai Katsumata and Alex Orsholits and Manabu Tsukada},
url = {https://github.com/LiYun0607/PrefDrive/
https://huggingface.co/liyun0607/PrefDrive
https://huggingface.co/datasets/liyun0607/PrefDrive},
doi = {10.1109/IV64158.2025.11097672},
year = {2025},
date = {2025-06-22},
urldate = {2025-06-22},
booktitle = {36th IEEE Intelligent Vehicles Symposium (IV2025)},
address = {Cluj-Napoca, Romania},
abstract = {This paper presents PrefDrive, a novel framework that integrates driving preferences into autonomous driving models through large language models (LLMs). While recent advances in LLMs have shown promise in autonomous driving, existing approaches often struggle to align with specific driving behaviors (e.g., maintaining safe distances, smooth acceleration patterns) and operational requirements (e.g., traffic rule compliance, route adherence). We address this challenge by developing a preference learning framework that combines multimodal perception with natural language understanding. Our approach leverages Direct Preference Optimization (DPO) to fine-tune LLMs efficiently on consumer-grade hardware, making advanced autonomous driving research more accessible to the broader research community. We introduce a comprehensive dataset of 74,040 sequences, carefully annotated with driving preferences and driving decisions, which, along with our trained model checkpoints, will be made publicly available to facilitate future research. Through extensive experiments in the CARLA simulator, we demonstrate that our preference-guided approach significantly improves driving performance across multiple metrics, including distance maintenance and trajectory smoothness. Results show up to 28.1% reduction in traffic rule violations and 8.5% improvement in navigation task completion while maintaining appropriate distances from obstacles. The framework demonstrates robust performance across different urban environments, showcasing the effectiveness of preference learning in autonomous driving applications. },
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
This paper presents PrefDrive, a novel framework that integrates driving preferences into autonomous driving models through large language models (LLMs). While recent advances in LLMs have shown promise in autonomous driving, existing approaches often struggle to align with specific driving behaviors (e.g., maintaining safe distances, smooth acceleration patterns) and operational requirements (e.g., traffic rule compliance, route adherence). We address this challenge by developing a preference learning framework that combines multimodal perception with natural language understanding. Our approach leverages Direct Preference Optimization (DPO) to fine-tune LLMs efficiently on consumer-grade hardware, making advanced autonomous driving research more accessible to the broader research community. We introduce a comprehensive dataset of 74,040 sequences, carefully annotated with driving preferences and driving decisions, which, along with our trained model checkpoints, will be made publicly available to facilitate future research. Through extensive experiments in the CARLA simulator, we demonstrate that our preference-guided approach significantly improves driving performance across multiple metrics, including distance maintenance and trajectory smoothness. Results show up to 28.1% reduction in traffic rule violations and 8.5% improvement in navigation task completion while maintaining appropriate distances from obstacles. The framework demonstrates robust performance across different urban environments, showcasing the effectiveness of preference learning in autonomous driving applications.
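For context on the preference objective underlying PrefDrive, the following is a minimal PyTorch sketch of the pairwise DPO loss on chosen versus rejected driving actions. It is not the released training code; the tensor shapes and beta value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Pairwise DPO objective: prefer the chosen driving action over the rejected one,
    measured relative to a frozen reference model. Inputs are summed token
    log-probabilities of shape (batch,)."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(margin): minimized when the chosen reward exceeds the rejected reward
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage on random log-probabilities for a batch of 8 preference pairs
B = 8
print(float(dpo_loss(torch.randn(B), torch.randn(B), torch.randn(B), torch.randn(B))))
```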
Alex Orsholits, Manabu Tsukada, "Context-Rich Interactions in Mixed Reality through Edge AI Co-Processing", In: The 39-th International Conference on Advanced Information Networking and Applications (AINA 2025), Barcelona, Spain, 2025, ISBN: 978-3-031-87771-1.Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Orsholits2025,
title = {Context-Rich Interactions in Mixed Reality through Edge AI Co-Processing},
author = {Alex Orsholits and Manabu Tsukada},
url = {https://link.springer.com/chapter/10.1007/978-3-031-87772-8_3},
doi = {10.1007/978-3-031-87772-8_3},
isbn = {978-3-031-87771-1},
year = {2025},
date = {2025-04-09},
urldate = {2025-04-09},
booktitle = {The 39-th International Conference on Advanced Information Networking and Applications (AINA 2025)},
address = {Barcelona, Spain},
abstract = {Spatial computing is evolving towards leveraging data streaming for computationally demanding applications, facilitating a shift to lightweight, untethered, and standalone devices. These devices are therefore ideal candidates for co-processing, where real-time context understanding and low-latency data streaming are fundamental for seamless, general-purpose Mixed Reality (MR) experiences. This paper demonstrates and evaluates a scalable approach to augmented contextual understanding in MR by implementing multi-modal edge AI co-processing through a Hailo-8 AI accelerator, a low-power ARM-based single board computer (SBC), and the Magic Leap 2 AR headset. The proposed system utilises the native WebRTC streaming capabilities of the Magic Leap 2 to continuously stream camera data to the edge co-processor, where a collection of vision AI models (object detection, pose estimation, face recognition, and depth estimation) is executed. The resulting inferences are then streamed back to the headset for spatial re-projection and transmitted to cloud-based systems for further integration with large-scale AI models, such as LLMs and VLMs. This seamless integration enhances real-time contextual understanding in MR while facilitating advanced multi-modal, multi-device collaboration, supporting richer, scalable spatial cognition across distributed systems.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Spatial computing is evolving towards leveraging data streaming for computationally demanding applications, facilitating a shift to lightweight, untethered, and standalone devices. These devices are therefore ideal candidates for co-processing, where real-time context understanding and low-latency data streaming are fundamental for seamless, general-purpose Mixed Reality (MR) experiences. This paper demonstrates and evaluates a scalable approach to augmented contextual understanding in MR by implementing multi-modal edge AI co-processing through a Hailo-8 AI accelerator, a low-power ARM-based single board computer (SBC), and the Magic Leap 2 AR headset. The proposed system utilises the native WebRTC streaming capabilities of the Magic Leap 2 to continuously stream camera data to the edge co-processor, where a collection of vision AI models (object detection, pose estimation, face recognition, and depth estimation) is executed. The resulting inferences are then streamed back to the headset for spatial re-projection and transmitted to cloud-based systems for further integration with large-scale AI models, such as LLMs and VLMs. This seamless integration enhances real-time contextual understanding in MR while facilitating advanced multi-modal, multi-device collaboration, supporting richer, scalable spatial cognition across distributed systems.
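The co-processing loop described above can be sketched at a high level as a producer-consumer pipeline: frames arrive from the headset, an accelerator-side model produces detections, and compact results are returned for re-projection. The frame source, detector, and result sink below are stubs standing in for the Magic Leap 2 WebRTC stream and Hailo-8 inference, which are not shown.

```python
import asyncio
import json
import random

async def frame_source(queue):
    # Stand-in for the headset's WebRTC video track
    for frame_id in range(5):
        await queue.put({"id": frame_id})
        await asyncio.sleep(0.03)          # roughly 30 fps pacing
    await queue.put(None)                  # end-of-stream marker

def run_detector(frame):
    # Stand-in for accelerator inference; returns normalized bounding boxes
    return [{"label": "person", "box": [random.random() for _ in range(4)]}]

async def co_processor(queue):
    while (frame := await queue.get()) is not None:
        detections = run_detector(frame)
        # In the real pipeline this compact JSON would be streamed back to the headset
        print(json.dumps({"frame": frame["id"], "detections": detections}))

async def main():
    queue = asyncio.Queue(maxsize=4)
    await asyncio.gather(frame_source(queue), co_processor(queue))

asyncio.run(main())
```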
Alex Orsholits, Manabu Tsukada, "Edge Vision AI Co-Processing for Dynamic Context Awareness in Mixed Reality", IEEE VR 2025, Poster, 2025, (Honorable mention).Miscellaneous | Abstract | BibTeX | Links:
@misc{Orsholits2025b,
title = {Edge Vision AI Co-Processing for Dynamic Context Awareness in Mixed Reality},
author = {Alex Orsholits and Manabu Tsukada},
url = {https://www.youtube.com/watch?v=xxahKZl4K9w
https://ieeevr.org/2025/awards/conference-awards/#poster-honorable},
doi = {10.1109/VRW66409.2025.00293},
year = {2025},
date = {2025-03-08},
urldate = {2025-03-08},
booktitle = {2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW)},
address = {Saint-Malo, France},
abstract = {Spatial computing is evolving towards leveraging data streaming for computationally demanding applications, facilitating a shift to lightweight, untethered, and standalone devices. These devices are ideal candidates for co-processing, where real-time scene context understanding and low-latency data streaming are fundamental for general-purpose Mixed Reality (MR) experiences. This poster demonstrates and evaluates a scalable approach to augmented contextual understanding in MR by implementing edge AI co-processing through a Hailo-8 AI accelerator, a low-power ARM-based single board computer (SBC), and the Magic Leap 2 AR headset. The resulting inferences are streamed back to the headset for spatial reprojection into the user’s vision.},
howpublished = {IEEE VR 2025, Poster},
note = {Honorable mention},
keywords = {},
pubstate = {published},
tppubtype = {misc}
}
Spatial computing is evolving towards leveraging data streaming for computationally demanding applications, facilitating a shift to lightweight, untethered, and standalone devices. These devices are ideal candidates for co-processing, where real-time scene context understanding and low-latency data streaming are fundamental for general-purpose Mixed Reality (MR) experiences. This poster demonstrates and evaluates a scalable approach to augmented contextual understanding in MR by implementing edge AI co-processing through a Hailo-8 AI accelerator, a low-power ARM-based single board computer (SBC), and the Magic Leap 2 AR headset. The resulting inferences are streamed back to the headset for spatial reprojection into the user’s vision.
2024
Vishal Chauhan, Anubhav Anubhav, Chia-Ming Chang, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada, "Connected Shared Spaces: Expert Insights into the Impact of eHMI and SPIU for Next-Generation Pedestrian-AV Communication", In: International Conference on Intelligent Computing and its Emerging Applications (ICEA2024), Tokyo, Japan, 2024. Proceedings Article | BibTeX
@inproceedings{Chauhan2024b,
title = {Connected Shared Spaces: Expert Insights into the Impact of eHMI and SPIU for Next-Generation Pedestrian-AV Communication},
author = {Vishal Chauhan and Anubhav Anubhav and Chia-Ming Chang and Jin Nakazato and Ehsan Javanmardi and Alex Orsholits and Takeo Igarashi and Kantaro Fujiwara and Manabu Tsukada},
year = {2024},
date = {2024-11-28},
urldate = {2024-11-28},
booktitle = {International Conference on Intelligent Computing and its Emerging Applications (ICEA2024)},
address = {Tokyo, Japan},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Alex Orsholits, Eric Nardini, Manabu Tsukada, "PLATONE: Assessing Simulation Accuracy of Environment-Dependent Audio Spatialization", In: International Conference on Intelligent Computing and its Emerging Applications (ICEA2024), Tokyo, Japan, 2024. Proceedings Article | BibTeX
@inproceedings{Orsholits2024b,
title = {PLATONE: Assessing Simulation Accuracy of Environment-Dependent Audio Spatialization},
author = {Alex Orsholits and Eric Nardini and Manabu Tsukada},
year = {2024},
date = {2024-11-28},
booktitle = {International Conference on Intelligent Computing and its Emerging Applications (ICEA2024)},
address = {Tokyo, Japan},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vishal Chauhan, Anubhav Anubhav, Chia-Ming Chang, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada, "Transforming Pedestrian and Autonomous Vehicles Interactions in Shared Spaces: A Think-Tank Study on Exploring Human-Centric Designs", In: 16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutoUI 2024), Work in Progress (WiP), pp. 1-8, California, USA, 2024. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Chauhan2024,
title = {Transforming Pedestrian and Autonomous Vehicles Interactions in Shared Spaces: A Think-Tank Study on Exploring Human-Centric Designs},
author = {Vishal Chauhan and Anubhav Anubhav and Chia-Ming Chang and Jin Nakazato and Ehsan Javanmardi and Alex Orsholits and Takeo Igarashi and Kantaro Fujiwara and Manabu Tsukada},
doi = {10.1145/3641308.3685037},
year = {2024},
date = {2024-09-22},
urldate = {2024-09-22},
booktitle = {16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutoUI 2024), Work in Progress (WiP)},
pages = {1-8},
address = {California, USA},
abstract = {Our research focuses on the smart pole interaction unit (SPIU) as an infrastructure external human-machine interface (HMI) to enhance pedestrian interaction with autonomous vehicles (AVs) in shared spaces. We extensively study SPIU with external human-machine interfaces (eHMI) on AVs as an integrated solution. To discuss interaction barriers and enhance pedestrian safety, we engaged 25 participants aged 18-40 to brainstorm design solutions for pedestrian-AV interactions, emphasising effectiveness, simplicity, visibility, and clarity. Findings indicate a preference for real-time SPIU interaction over eHMI on AVs in multiple AV scenarios. However, the combined use of SPIU and eHMI on AVs is crucial for building trust in decision-making. Consequently, we propose innovative design solutions for both SPIU and eHMI on AVs, discussing their pros and cons. This study lays the groundwork for future autonomous mobility solutions by developing human-centric eHMI and SPIU prototypes as ieHMI.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Our research focuses on the smart pole interaction unit (SPIU) as an infrastructure external human-machine interface (HMI) to enhance pedestrian interaction with autonomous vehicles (AVs) in shared spaces. We extensively study SPIU with external human-machine interfaces (eHMI) on AVs as an integrated solution. To discuss interaction barriers and enhance pedestrian safety, we engaged 25 participants aged 18-40 to brainstorm design solutions for pedestrian-AV interactions, emphasising effectiveness, simplicity, visibility, and clarity. Findings indicate a preference for real-time SPIU interaction over eHMI on AVs in multiple AV scenarios. However, the combined use of SPIU and eHMI on AVs is crucial for building trust in decision-making. Consequently, we propose innovative design solutions for both SPIU and eHMI on AVs, discussing their pros and cons. This study lays the groundwork for future autonomous mobility solutions by developing human-centric eHMI and SPIU prototypes as ieHMI.
Tokio Takada, Jin Nakazato, Alex Orsholits, Manabu Tsukada, Hideya Ochiai, Hiroshi Esaki, "Design of Digital Twin Architecture for 3D Audio Visualization in AR", In: The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024), Hong Kong, China, 2024. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Takada2024,
title = {Design of Digital Twin Architecture for 3D Audio Visualization in AR},
author = {Tokio Takada and Jin Nakazato and Alex Orsholits and Manabu Tsukada and Hideya Ochiai and Hiroshi Esaki},
doi = {10.1109/MetaCom62920.2024.00044},
year = {2024},
date = {2024-08-12},
urldate = {2024-08-12},
booktitle = {The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024)},
address = {Hong Kong, China},
abstract = {Digital twins have recently attracted attention from academia and industry as a technology connecting physical space and cyberspace. Digital twins are compatible with Augmented Reality (AR) and Virtual Reality (VR), enabling us to understand information in cyberspace. In this study, we focus on music and design an architecture for a 3D representation of music using a digital twin. Specifically, we organize the requirements for a digital twin for music and design the architecture. We establish a method to perform 3D representation in cyberspace and map the recorded audio data in physical space. In this paper, we implemented the physical space representation using a smartphone as an AR device and employed a visual positioning system (VPS) for self-positioning. For evaluation, in addition to system errors in the 3D representation of audio data, we conducted a questionnaire evaluation with several users as a user study. From these results, we evaluated the effectiveness of the implemented system. At the same time, we also found issues we need to improve in the implemented system in future works.},
key = {CREST},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Digital twins have recently attracted attention from academia and industry as a technology connecting physical space and cyberspace. Digital twins are compatible with Augmented Reality (AR) and Virtual Reality (VR), enabling us to understand information in cyberspace. In this study, we focus on music and design an architecture for a 3D representation of music using a digital twin. Specifically, we organize the requirements for a digital twin for music and design the architecture. We establish a method to perform 3D representation in cyberspace and map the recorded audio data in physical space. In this paper, we implemented the physical space representation using a smartphone as an AR device and employed a visual positioning system (VPS) for self-positioning. For evaluation, in addition to system errors in the 3D representation of audio data, we conducted a questionnaire evaluation with several users as a user study. From these results, we evaluated the effectiveness of the implemented system. At the same time, we also found issues we need to improve in the implemented system in future works.
Alex Orsholits, Yiyuan Qian, Eric Nardini, Yusuke Obuchi, Manabu Tsukada, "PLATONE: An Immersive Geospatial Audio Spatialization Platform", In: The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024), Hong Kong, China, 2024. Proceedings Article | Abstract | BibTeX | Links:
@inproceedings{Orsholits2024,
title = {PLATONE: An Immersive Geospatial Audio Spatialization Platform},
author = {Alex Orsholits and Yiyuan Qian and Eric Nardini and Yusuke Obuchi and Manabu Tsukada},
doi = {10.1109/MetaCom62920.2024.00020},
year = {2024},
date = {2024-08-12},
urldate = {2024-08-12},
booktitle = {The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024)},
address = {Hong Kong, China},
abstract = {In the rapidly evolving landscape of mixed reality (MR) and spatial computing, the convergence of physical and virtual spaces is becoming increasingly crucial for enabling immersive, large-scale user experiences and shaping inter-reality dynamics. This is particularly significant for immersive audio at city-scale, where the 3D geometry of the environment must be considered, as it drastically influences how sound is perceived by the listener. This paper introduces PLATONE, a novel proof-of-concept MR platform designed to augment urban contexts with environment-dependent spatialized audio. It leverages custom hardware for localization and orientation, alongside a cloud-based pipeline for generating real-time binaural audio. By utilizing open-source 3D building datasets, sound propagation effects such as occlusion, reverberation, and diffraction are accurately simulated. We believe that this work may serve as a compelling foundation for further research and development.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
In the rapidly evolving landscape of mixed reality (MR) and spatial computing, the convergence of physical and virtual spaces is becoming increasingly crucial for enabling immersive, large-scale user experiences and shaping inter-reality dynamics. This is particularly significant for immersive audio at city-scale, where the 3D geometry of the environment must be considered, as it drastically influences how sound is perceived by the listener. This paper introduces PLATONE, a novel proof-of-concept MR platform designed to augment urban contexts with environment-dependent spatialized audio. It leverages custom hardware for localization and orientation, alongside a cloud-based pipeline for generating real-time binaural audio. By utilizing open-source 3D building datasets, sound propagation effects such as occlusion, reverberation, and diffraction are accurately simulated. We believe that this work may serve as a compelling foundation for further research and development.
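To give a feel for the geometry behind binaural spatialization, the sketch below computes a source's azimuth relative to the listener's heading, an interaural time difference (Woodworth approximation, most accurate for sources in the front hemisphere), and simple inverse-distance attenuation. The head radius and positions are illustrative; the occlusion, reverberation, and diffraction effects that PLATONE simulates from 3D building data are not modeled here.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, assumed average head radius

def binaural_cues(listener_xy, listener_yaw_deg, source_xy):
    """Azimuth (deg, positive = right of heading), ITD (s), and 1/d gain for a 2D source."""
    dx, dy = source_xy[0] - listener_xy[0], source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)
    # Listener faces +y when yaw is 0; wrap the relative bearing into (-180, 180]
    azimuth = (math.degrees(math.atan2(dx, dy)) - listener_yaw_deg + 180) % 360 - 180
    theta = math.radians(azimuth)
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (math.sin(theta) + theta)   # Woodworth approximation
    gain = 1.0 / max(distance, 1.0)                                    # simple inverse-distance law
    return {"azimuth_deg": azimuth, "itd_s": itd, "gain": gain}

print(binaural_cues(listener_xy=(0.0, 0.0), listener_yaw_deg=0.0, source_xy=(10.0, 10.0)))
```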
Tokio Takada, Jin Nakazato, Alex Orsholits, Manabu Tsukada, Hideya Ochiai, Hiroshi Esaki, "Design of a Digital Twin Architecture for 3D Audio Visualization in AR (in Japanese)", Multimedia, Distributed, Cooperative, and Mobile (DICOMO2024) Symposium, Hanamaki, Iwate, Japan, 2024. Conference | Abstract | BibTeX
@conference{髙田季生2024,
title = {ARにおける3D音響可視化に向けたデジタルツインアーキテクチャの設計},
author = {髙田季生 and 中里仁 and Alex Orsholits and 塚田学 and 落合秀也 and 江崎浩},
year = {2024},
date = {2024-06-26},
urldate = {2024-06-26},
booktitle = {マルチメディア、分散、協調とモバイル(DICOMO2024)シンポジウム},
address = {岩手県花巻市},
abstract = {近年,デジタルツインは学界および産業界から注目を集めており,フィジカル空間とサイバー空間を繋ぐ技術として着目されている.デジタルツインはAugmented Reality(AR)やVirtual Reality(VR)との相性がよく,ユーザは複雑な物理的実体やプロセスを理解できる.本研究では,音響に焦点を当て,デジタルツインを用いた音響の立体的な表現をするためのアーキテクチャの提案を行う.具体的には,音楽向けのデジタルツインの要件を整理し,アーキテクチャの設計を行った.既に収録した音声データをサイバー空間上にて立体表現を行い,フィジカル空間にマッピングするための手法を確立する.本稿では,フィジカル空間の表現にはARデバイスとしてGoogle Pixelを用いて実装を行い,自己位置の推定についてはVPSを用いた.評価においては,音声データの立体表現におけるシステム誤差に加えて,ユーザスタディとして複数人を体験者として実地調査を行った.これらの結果より,実装システムの有効性を評価することができ,一方で実装システムの改善を必要とする課題も発見できた.これらの成果を本稿で報告する.
},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
In recent years, digital twins have attracted attention from academia and industry as a technology connecting physical space and cyberspace. Digital twins are well suited to Augmented Reality (AR) and Virtual Reality (VR), allowing users to understand complex physical entities and processes. In this study, we focus on audio and propose an architecture for representing sound three-dimensionally using a digital twin. Specifically, we organize the requirements for a digital twin for music and design the architecture. We establish a method for representing previously recorded audio data three-dimensionally in cyberspace and mapping it onto physical space. In this paper, the physical-space representation was implemented using a Google Pixel as the AR device, and a visual positioning system (VPS) was used for self-localization. For evaluation, in addition to measuring system errors in the 3D representation of the audio data, we conducted a field study with multiple participants as a user study. From these results, we were able to evaluate the effectiveness of the implemented system, while also identifying issues that require improvement in future work. We report these findings in this paper.