Publications
2025
Alex Orsholits, Manabu Tsukada, "Context-Rich Interactions in Mixed Reality through Edge AI Co-Processing", In: The 39th International Conference on Advanced Information Networking and Applications (AINA 2025), Barcelona, Spain, 2025.
@inproceedings{Orsholits2025,
title = {Context-Rich Interactions in Mixed Reality through Edge AI Co-Processing},
author = {Alex Orsholits and Manabu Tsukada},
year = {2025},
date = {2025-04-09},
booktitle = {The 39th International Conference on Advanced Information Networking and Applications (AINA 2025)},
address = {Barcelona, Spain},
abstract = {Spatial computing is evolving towards leveraging data streaming for computationally demanding applications, facilitating a shift to lightweight, untethered, and standalone devices. These devices are therefore ideal candidates for co-processing, where real-time context understanding and low-latency data streaming are fundamental for seamless, general-purpose Mixed Reality (MR) experiences. This paper demonstrates and evaluates a scalable approach to augmented contextual understanding in MR by implementing multi-modal edge AI co-processing through a Hailo-8 AI accelerator, a low-power ARM-based single board computer (SBC), and the Magic Leap 2 AR headset. The proposed system utilises the native WebRTC streaming capabilities of the Magic Leap 2 to continuously stream camera data to the edge co-processor, where a collection of vision AI models (object detection, pose estimation, face recognition, and depth estimation) is executed. The resulting inferences are then streamed back to the headset for spatial re-projection and transmitted to cloud-based systems for further integration with large-scale AI models, such as LLMs and VLMs. This seamless integration enhances real-time contextual understanding in MR while facilitating advanced multi-modal, multi-device collaboration, supporting richer, scalable spatial cognition across distributed systems.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
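The architecture this abstract describes (continuous WebRTC camera streaming to an edge co-processor, multi-model inference, results streamed back) can be sketched compactly. The following minimal Python sketch assumes the open-source aiortc WebRTC library and uses a hypothetical run_vision_models() as a stand-in for the Hailo-8 pipelines; all identifiers are illustrative, not the paper's implementation.

import asyncio
import json

from aiortc import RTCPeerConnection, RTCSessionDescription

def run_vision_models(frame):
    # Hypothetical stand-in for the Hailo-8 pipelines (object detection,
    # pose estimation, face recognition, depth estimation).
    image = frame.to_ndarray(format="bgr24")  # av.VideoFrame -> numpy image
    return {"detections": [], "poses": [], "faces": [], "depth": None}

async def answer_offer(offer_sdp: str) -> str:
    # Assumes the headset's offer negotiates a video track plus a data
    # channel, and that SDP exchange happens over separate signaling.
    pc = RTCPeerConnection()
    results = pc.createDataChannel("inferences")

    @pc.on("track")
    def on_track(track):
        if track.kind != "video":
            return

        async def consume():
            while True:
                frame = await track.recv()             # next camera frame
                inferences = run_vision_models(frame)  # edge inference
                if results.readyState == "open":
                    results.send(json.dumps(inferences))  # back to headset

        asyncio.ensure_future(consume())

    await pc.setRemoteDescription(RTCSessionDescription(offer_sdp, "offer"))
    await pc.setLocalDescription(await pc.createAnswer())
    return pc.localDescription.sdp  # answer SDP, returned via signaling

On the headset side, the returned inferences would be re-projected into the user's view; that step is device-specific and omitted here.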
2024
Vishal Chauhan, Anubhav Anubhav, Chia-Ming Chang, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada, "Connected Shared Spaces: Expert Insights into the Impact of eHMI and SPIU for Next-Generation Pedestrian-AV Communication", In: International Conference on Intelligent Computing and its Emerging Applications (ICEA2024), Tokyo, Japan, 2024.
@inproceedings{Chauhan2024b,
title = {Connected Shared Spaces: Expert Insights into the Impact of eHMI and SPIU for Next-Generation Pedestrian-AV Communication},
author = {Vishal Chauhan and Anubhav Anubhav and Chia-Ming Chang and Jin Nakazato and Ehsan Javanmardi and Alex Orsholits and Takeo Igarashi and Kantaro Fujiwara and Manabu Tsukada},
year = {2024},
date = {2024-11-28},
urldate = {2024-11-28},
booktitle = {International Conference on Intelligent Computing and its Emerging Applications (ICEA2024)},
address = {Tokyo, Japan},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Alex Orsholits, Eric Nardini, Manabu Tsukada, "PLATONE: Assessing Simulation Accuracy of Environment-Dependent Audio Spatialization", In: International Conference on Intelligent Computing and its Emerging Applications (ICEA2024), Tokyo, Japan, 2024.
@inproceedings{Orsholits2024b,
title = {PLATONE: Assessing Simulation Accuracy of Environment-Dependent Audio Spatialization},
author = {Alex Orsholits and Eric Nardini and Manabu Tsukada},
year = {2024},
date = {2024-11-28},
booktitle = {International Conference on Intelligent Computing and its Emerging Applications (ICEA2024)},
address = {Tokyo, Japan},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vishal Chauhan, Anubhav Anubhav, Chia-Ming Chang, Jin Nakazato, Ehsan Javanmardi, Alex Orsholits, Takeo Igarashi, Kantaro Fujiwara, Manabu Tsukada, "Transforming Pedestrian and Autonomous Vehicles Interactions in Shared Spaces: A Think-Tank Study on Exploring Human-Centric Designs", In: 16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutoUI 2024), Work in Progress (WiP), pp. 1-8, California, USA, 2024.
@inproceedings{Chauhan2024,
title = {Transforming Pedestrian and Autonomous Vehicles Interactions in Shared Spaces: A Think-Tank Study on Exploring Human-Centric Designs},
author = {Vishal Chauhan and Anubhav Anubhav and Chia-Ming Chang and Jin Nakazato and Ehsan Javanmardi and Alex Orsholits and Takeo Igarashi and Kantaro Fujiwara and Manabu Tsukada},
doi = {10.1145/3641308.3685037},
year = {2024},
date = {2024-09-22},
urldate = {2024-09-22},
booktitle = {16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutoUI 2024), Work in Progress (WiP)},
pages = {1-8},
address = {California, USA},
abstract = {Our research focuses on the smart pole interaction unit (SPIU) as an infrastructure external human-machine interface (HMI) to enhance pedestrian interaction with autonomous vehicles (AVs) in shared spaces. We extensively study SPIU with external human-machine interfaces (eHMI) on AVs as an integrated solution. To discuss interaction barriers and enhance pedestrian safety, we engaged 25 participants aged 18-40 to brainstorm design solutions for pedestrian-AV interactions, emphasising effectiveness, simplicity, visibility, and clarity. Findings indicate a preference for real-time SPIU interaction over eHMI on AVs in multiple AV scenarios. However, the combined use of SPIU and eHMI on AVs is crucial for building trust in decision-making. Consequently, we propose innovative design solutions for both SPIU and eHMI on AVs, discussing their pros and cons. This study lays the groundwork for future autonomous mobility solutions by developing human-centric eHMI and SPIU prototypes as ieHMI.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Alex Orsholits, Yiyuan Qian, Eric Nardini, Yusuke Obuchi, Manabu Tsukada, "PLATONE: An Immersive Geospatial Audio Spatialization Platform", In: The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024), Hong Kong, China, 2024.
@inproceedings{Orsholits2024,
title = {PLATONE: An Immersive Geospatial Audio Spatialization Platform},
author = {Alex Orsholits and Yiyuan Qian and Eric Nardini and Yusuke Obuchi and Manabu Tsukada},
year = {2024},
date = {2024-08-12},
booktitle = {The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024)},
address = {Hong Kong, China},
abstract = {In the rapidly evolving landscape of mixed reality (MR) and spatial computing, the convergence of physical and virtual spaces is becoming increasingly crucial for enabling immersive, large-scale user experiences and shaping inter-reality dynamics. This is particularly significant for immersive audio at city-scale, where the 3D geometry of the environment must be considered, as it drastically influences how sound is perceived by the listener. This paper introduces PLATONE, a novel proof-of-concept MR platform designed to augment urban contexts with environment-dependent spatialized audio. It leverages custom hardware for localization and orientation, alongside a cloud-based pipeline for generating real-time binaural audio. By utilizing open-source 3D building datasets, sound propagation effects such as occlusion, reverberation, and diffraction are accurately simulated. We believe that this work may serve as a compelling foundation for further research and development.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
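To make the environment-dependent part of the abstract concrete, here is a minimal Python sketch assuming the shapely geometry library and 2D building footprints: it tests line-of-sight occlusion between source and listener and adds a flat penalty on top of inverse-distance spreading loss. This is a toy reduction of the paper's simulation (which also covers reverberation and diffraction over 3D building data); the function names and the 12 dB penalty are illustrative assumptions.

import math

from shapely.geometry import LineString, Polygon

def occluded(listener_xy, source_xy, footprints):
    # True if any building footprint blocks the direct path.
    path = LineString([listener_xy, source_xy])
    return any(path.intersects(fp) for fp in footprints)

def gain_db(listener_xy, source_xy, footprints, occlusion_loss_db=12.0):
    # Inverse-distance spreading loss: 6 dB per doubling of distance.
    d = math.dist(listener_xy, source_xy)
    loss = 20.0 * math.log10(max(d, 1.0))
    if occluded(listener_xy, source_xy, footprints):
        loss += occlusion_loss_db  # crude stand-in for diffraction effects
    return -loss

# One rectangular building between a listener at the origin and a source.
buildings = [Polygon([(2, -1), (2, 1), (4, 1), (4, -1)])]
print(gain_db((0.0, 0.0), (6.0, 0.0), buildings))  # ~ -27.6 dB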
Tokio Takada, Jin Nakazato, Alex Orsholits, Manabu Tsukada, Hideya Ochiai, Hiroshi Esaki, "Design of Digital Twin Architecture for 3D Audio Visualization in AR", In: The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024), Hong Kong, China, 2024.
@inproceedings{Takada2024,
title = {Design of Digital Twin Architecture for 3D Audio Visualization in AR},
author = {Tokio Takada and Jin Nakazato and Alex Orsholits and Manabu Tsukada and Hideya Ochiai and Hiroshi Esaki},
year = {2024},
date = {2024-08-12},
urldate = {2024-08-12},
booktitle = {The 2nd Annual IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom 2024)},
address = {Hong Kong, China},
abstract = {Digital twins have recently attracted attention from academia and industry as a technology connecting physical space and cyberspace. Digital twins are highly compatible with Augmented Reality (AR) and Virtual Reality (VR), helping users understand information in cyberspace. In this study, we focus on music and design an architecture for a 3D representation of music using a digital twin. Specifically, we organize the requirements for a music digital twin and design the architecture accordingly. We establish a method for rendering recorded audio data three-dimensionally in cyberspace and mapping it onto physical space. In this paper, we implemented the physical-space representation using a smartphone as the AR device and employed a visual positioning system (VPS) for self-localization. For evaluation, in addition to measuring system error in the 3D representation of audio data, we conducted a questionnaire-based user study with several participants. The results demonstrate the effectiveness of the implemented system while also revealing issues to be addressed in future work.},
key = {CREST},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
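One concrete step the abstract implies is mapping audio data anchored in a world frame into the AR device frame using the pose estimated by the VPS. The numpy sketch below illustrates that transform under stated assumptions; the 4x4 pose convention, names, and toy values are illustrative rather than the paper's implementation.

import numpy as np

def world_to_device(pose_world_from_device, p_world):
    # Map a 3D point from the world (VPS) frame into the device frame.
    T = np.linalg.inv(pose_world_from_device)  # 4x4 rigid transform
    p = np.append(p_world, 1.0)                # homogeneous coordinates
    return (T @ p)[:3]

# Toy VPS output: device at (1, 0, 0), rotated 90 degrees about +y.
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
pose = np.array([[  c, 0.0,   s, 1.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [ -s, 0.0,   c, 0.0],
                 [0.0, 0.0, 0.0, 1.0]])
audio_node_world = np.array([2.0, 0.0, 0.0])
print(world_to_device(pose, audio_node_world))  # position to render in AR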
Tokio Takada, Jin Nakazato, Alex Orsholits, Manabu Tsukada, Hideya Ochiai, Hiroshi Esaki, "Design of Digital Twin Architecture for 3D Audio Visualization in AR" (in Japanese), Multimedia, Distributed, Cooperative and Mobile (DICOMO 2024) Symposium, Hanamaki, Iwate, Japan, 2024.
@conference{髙田季生2024,
title = {Design of Digital Twin Architecture for 3D Audio Visualization in AR (in Japanese)},
author = {Tokio Takada and Jin Nakazato and Alex Orsholits and Manabu Tsukada and Hideya Ochiai and Hiroshi Esaki},
year = {2024},
date = {2024-06-26},
urldate = {2024-06-26},
booktitle = {Multimedia, Distributed, Cooperative and Mobile (DICOMO 2024) Symposium},
address = {Hanamaki, Iwate, Japan},
abstract = {In recent years, digital twins have attracted attention from academia and industry as a technology connecting physical space and cyberspace. Digital twins are highly compatible with Augmented Reality (AR) and Virtual Reality (VR), helping users understand complex physical entities and processes. In this study, we focus on audio and propose an architecture for representing sound three-dimensionally using a digital twin. Specifically, we organize the requirements for a music digital twin and design the architecture. We establish a method for rendering previously recorded audio data three-dimensionally in cyberspace and mapping it onto physical space. In this paper, we implemented the physical-space representation using a Google Pixel as the AR device and used a visual positioning system (VPS) for self-localization. For evaluation, in addition to measuring system error in the 3D representation of the audio data, we conducted a field study with multiple participants as a user study. The results demonstrate the effectiveness of the implemented system, while also revealing issues that require improvement. We report these findings in this paper.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}