Number of Participants
Multimedia Information System Lab, National Tsing Hua University (NTHU), Taiwan
The Multimedia Information System Laboratory (MISLab) was founded in August 2012. Led by professor Min-Chun Hu, our team aims to design original methodologies and develop practical multimedia systems to meet different demands of users. Our research topics include digital signal processing, digital content analysis/editing/presentation, machine learning and artificial intelligence, computer vision and pattern recognition, human-computer interaction, computer graphics, virtual reality and augmented reality.
Website: http://mislab.cs.nthu.edu.tw/
MediaTek Inc.
MediaTek Incorporated (TWSE: 2454) is a global fabless semiconductor company that enables nearly 2 billion connected devices a year. We are a market leader in developing innovative systems-on-chip (SoC) for mobile devices, home entertainment, connectivity and IoT products. Our dedication to innovation has positioned us as a driving market force in several key technology areas, including highly power-efficient mobile technologies, automotive solutions and a broad range of advanced multimedia products such as smartphones, tablets, digital televisions, 5G, Voice Assistant Devices (VAD) and wearables. MediaTek empowers and inspires people to expand their horizons and achieve their goals through smart technology, more easily and efficiently than ever before. We work with the brands you love to make great technology accessible to everyone, and it drives everything we do. Visit www.mediatek.com for more information.
Information Technology Software Academy
Dear Participants, the champion of the IEEE AIVR 2021 Grand Challenge: Visual Attention Estimation in HMD is "Piankk" (Chun Tsao and Po-Chyi Su, National Central University). Congratulations!
To improve the experience of XR applications, visual attention estimation techniques have been developed to predict human intention so that the HMD can pre-render visual content and reduce rendering latency. However, most deep learning-based algorithms incur heavy computational cost to achieve satisfactory accuracy. This is especially challenging for embedded systems with limited resources such as computing power and memory bandwidth (e.g., standalone HMDs). In addition, progress in this research field depends on rich data, yet the number and diversity of existing datasets remain limited. For this challenge, we collected a set of 360° MR/VR videos along with the corresponding user head pose and eye gaze signals. The goal of the competition is to encourage contestants to design lightweight visual attention estimation models that can be deployed on an embedded device with constrained resources. The developed models need to not only achieve high fidelity but also run efficiently on the device.
This competition is divided into two stages: qualification and final competition.
Given the test dataset containing 360° videos, participants are asked to estimate a saliency map for each video. More precisely, each pixel has a predicted value in the range [0, 1]. Note that the goal of this challenge is to design a lightweight deep learning model suitable for constrained embedded systems. Therefore, we focus on prediction correctness, model size, computational complexity, performance optimization, and deployment on MediaTek’s Dimensity 1000+ platform.
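Per the rules above, every pixel of a submitted saliency map must lie in [0, 1]. A minimal sketch of how a raw model output might be normalized before submission; the function name and the min-max scheme are our own illustration, not part of the official rules:

```python
import numpy as np

def to_saliency_map(raw_output: np.ndarray) -> np.ndarray:
    """Min-max normalize a raw H x W score array into a saliency map in [0, 1].

    `raw_output` is a hypothetical array of unnormalized model scores;
    the challenge only requires that each pixel end up in [0, 1].
    """
    lo, hi = float(raw_output.min()), float(raw_output.max())
    if hi == lo:  # constant map: return all zeros to stay in range
        return np.zeros_like(raw_output, dtype=np.float32)
    return ((raw_output - lo) / (hi - lo)).astype(np.float32)

# Example: a 4x4 grid of raw scores
raw = np.arange(16, dtype=np.float32).reshape(4, 4)
sal = to_saliency_map(raw)
```

After normalization, `sal.min()` is 0.0 and `sal.max()` is 1.0, so the map satisfies the [0, 1] requirement regardless of the scale of the raw scores.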
With MediaTek’s platform and its heterogeneous computing capabilities, such as the CPUs, GPUs, and APUs (AI processing units) embedded in its system-on-chip products, developers gain the high performance and power efficiency needed to build AI features and applications. Developers can target these specific processing units within the system-on-chip, or let the MediaTek NeuroPilot SDK intelligently handle processing allocation for them. Please note that we use TensorFlow Lite in this challenge.
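Since the challenge uses TensorFlow Lite, a trained model must be converted to the `.tflite` format before on-device deployment. A minimal sketch using the standard `tf.lite.TFLiteConverter` API; the toy model below is purely illustrative, and platform-specific delegate settings for the Dimensity 1000+ are outside this sketch:

```python
import tensorflow as tf

# A toy stand-in for a trained saliency model (illustrative only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(1, 3, padding="same", activation="sigmoid"),
])

# Convert to TensorFlow Lite. The optional default optimizations shrink the
# model, which matters for the size/complexity criteria of this challenge.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()  # raw bytes of the .tflite flatbuffer

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

Dynamic-range quantization (the `Optimize.DEFAULT` path shown) needs no calibration data; full integer quantization would additionally require a representative dataset.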
Based on each team's points in the final evaluation, the three highest-scoring teams receive the regular awards.
Special Awards
Best accuracy award for each track (award for the highest accuracy in the final competition)
USD 600
Please note that all challenge participants entering the final competition are expected to submit a 2-page contest paper describing their work and attend IEEE AIVR 2021 (http://www.ieee-aivr.org/) to present it in the Challenge session. One conference registration covers the publication and conference fees for all co-authors. Papers will be included in the IEEE AIVR 2021 proceedings, which are published in IEEE Xplore. We still hope to run at least part of the conference on location in Taiwan, but remote presentation will be possible for participants who cannot or do not want to travel due to COVID.
TWCC Award Provisions and Eligibility
All times below are in UTC+8.
Time | Event |
---|---|
08/02/2021 | Qualification Competition Start Date; Release of Testing Data; Release of Private Testing Data (without ground truth) for Qualification |
08/16/2021 | Start Date of Result Uploading (at most three times per day) |
10/07/2021 11:59 PM UTC+8 | Qualification Competition End Date |
10/08/2021 12:00 PM UTC+8 | Finalist Announcement |
10/08/2021 | Final Competition Start Date |
10/20/2021 12:00 PM UTC+8 | Paper Submission Deadline |
11/10/2021 11:59 PM UTC+8 | Final Competition End Date |
11/12/2021 12:00 PM UTC+8 | Award Announcement |
The evaluation metrics are based on those of the Salient360! challenge [1]:
For evaluating the accuracy of predicted saliency maps, we consider the following metrics [2, 3]:
We use five metrics to evaluate the prediction results. For each metric, the team with the best prediction receives full points (20%) and the team with the worst receives zero; the remaining teams receive points in proportion to their rank. We rank the five metrics individually, and the score computed from the five ranking lists determines each team's position on the leaderboard.
$$ Score=\sum_{i} (n-R_{i})\cdot \frac{20}{n-1} $$
$$ R_{i}: \text{the team's ranking in metric } i $$
$$ n: \text{number of teams} $$
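The scoring rule above can be sketched in a few lines of Python; the rankings used below are made up purely for illustration:

```python
def challenge_score(rankings, n_teams):
    """Score = sum over metrics of (n - R_i) * 20 / (n - 1).

    `rankings` lists a team's rank R_i (1 = best) in each of the five
    metrics; rank 1 earns the full 20% for that metric, rank n earns 0.
    """
    return sum((n_teams - r) * 20 / (n_teams - 1) for r in rankings)

# A team ranked first in all 5 metrics among 6 teams scores 100:
print(challenge_score([1, 1, 1, 1, 1], n_teams=6))  # → 100.0
# A team ranked last in every metric scores 0:
print(challenge_score([6, 6, 6, 6, 6], n_teams=6))  # → 0.0
```

This shows why the formula caps the total at 100: each of the five metrics contributes at most $(n-1)\cdot\frac{20}{n-1}=20$ points.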
Reference
[1] Salient360! Challenge: https://salient360.ls2n.fr/
[2] Bylinskii, Zoya, et al. "What do different evaluation metrics tell us about saliency models?." IEEE Transactions on Pattern Analysis and Machine Intelligence 41.3 (2018): 740-757.
[3] Gutiérrez, Jesús, et al. "Toolbox and dataset for the development of saliency and scanpath models for omnidirectional/360 still images." Signal Processing: Image Communication 69 (2018): 35-42.
[4] Judd, Tilke, et al. "Learning to predict where humans look." 2009 IEEE 12th International Conference on Computer Vision. IEEE, 2009.
Min-Chun Hu, National Tsing Hua University
Wan-Lun Tsai, National Tsing Hua University
Tse-Yu Pan, National Tsing Hua University
Herman Prawiro, National Tsing Hua University
CM Cheng, MediaTek
Hsien-kai Kuo, MediaTek
Min-Hung Chen, MediaTek
Email: vae.challenge@gmail.com