Beyond Short Segments: A Comprehensive Multi-Modal Salience Prediction Dataset with Standard-Length 360-Degree Videos
Authors: Li, Z., Adamczewska, N. and Tang, W.
Journal: Proceedings, 2025 IEEE International Conference on Artificial Intelligence and Extended and Virtual Reality, AIxVR 2025
Pages: 44-53
DOI: 10.1109/AIxVR63409.2025.00015
Abstract: Understanding user interactions in immersive 360-degree video environments is crucial for the quality of user experience. Complex spatial ambisonic information in 360-degree videos enriches sensory experiences but also poses unique challenges for multimodal salience prediction. Recent research has introduced various 360-degree datasets containing ambisonic sound, but these datasets primarily contain only short video segments, typically under 30 seconds. Our present study shows that extrapolating ambisonic features learnt from short video segments to longer ones leads to inaccuracies in VR video streaming. Therefore, we have developed a comprehensive multimodal standard-length (between 40 s and 240 s) 360-degree video dataset in response to the challenges of multimodal salience prediction. Based on data collected from 30 participants in mono and ambisonic audio settings, our user behaviour analysis sheds new light on the relationship between ambisonic audio distribution and viewer attention across full-length videos. The findings of our study underscore the complexity and challenges of applying a feature learning strategy from short segments to standard video lengths. They demonstrate that extrapolating what is learnt from short video segments to longer ones is generally not valid, even though it is widely done in current practice. Furthermore, we assess existing salience prediction models and introduce an efficient baseline model to evaluate the impact of different modality features in our dataset. The insights and the new dataset of our study establish a more realistic benchmark for future research on multimodal salience prediction in 360-degree videos. Our proposed dataset is available for download.
Source: Scopus