121 participants

#### Topic provider

Intelligent Vision System Lab (IVS), National Chiao Tung University (NCTU), Taiwan

The Intelligent Vision System Lab (IVS) at National Chiao Tung University is directed by Professor Jiun-In Guo. We tackle practical open problems in autonomous driving research, focusing on intelligent vision processing systems, applications, and SoCs that exploit deep learning technology.
Web Site: http://ivs.ee.nctu.edu.tw/ivs/

Pervasive Artificial Intelligence Research (PAIR) Labs, Ministry of Science and Technology, Taiwan.

The Pervasive AI Research (PAIR) Labs, a national research lab funded by the Ministry of Science and Technology, Taiwan, is commissioned to achieve academic excellence, nurture local AI talent, build international linkages, and develop pragmatic approaches in applied AI technologies for the innovation and optimization of services, products, workflows, and supply chains. PAIR Labs comprises 13 distinguished research institutes that conduct research in applied AI areas.
Web Site: https://pairlabs.ai/

###### 2019/08/10 Final Announcement of MMSP 2019 Embedded Deep Learning Object Detection Model Competition

Award Winner

• Champion: R.JD
• Runner-up: Omission
• 3rd-place: Omission

MMSP 2019 Paper Invitation

Teams that achieved an mAP better than 0.46 in the final are invited to publish papers in the MMSP competition special session held at MMSP 2019. The teams are listed below:

• R.JD
• nctuai
• chenjiaqi
• IMMVP

Final Evaluation Result

| Team | Affiliation | mAP | Model Size (MByte) | Complexity (GOPS/frame) | Speed (ms/frame) | Award | MMSP paper invitation |
|---|---|---|---|---|---|---|---|
| R.JD | The College of Information Engineering of Xiangtan University | 0.5389 | 124.6 | 43.6 | 460.5 | Champion | Yes |
| nctuai | Department of Electrical Engineering, National University of Tainan | 0.4760 | 195.2 | 490.1 | 1338.9 | Not qualified (mAP<0.50) | Yes |
| chenjiaqi | The College of Information Engineering of Xiangtan University | 0.4619 | 114.0 | 339.5 | 1195.0 | Not qualified (mAP<0.50) | Yes |
| IMMVP | Institute of Information Science, Academia Sinica | 0.4605 | 57.1 | 724.9 | 510.3 | Not qualified (mAP<0.50) | Yes |
| NPUST-MIS-No.1 | Department of Management Information Systems, National Pingtung University of Science and Technology | 0.4396 | 238.4 | 115.6 | 514.5 | Not qualified (mAP<0.50) | ----- |

#### Introduction

Object detection in computer vision has been extensively studied and has made tremendous progress in recent years using deep learning methods. However, because most deep-learning-based algorithms require heavy computation, it is hard to run these models on embedded systems, which have limited computing capabilities.

In this competition, we encourage participants to design object detection models that not only fit embedded systems but also achieve high accuracy.

The goal is to design a lightweight deep learning model suitable for constrained embedded system design. We focus on model size, computation complexity and performance optimization on NVIDIA Jetson TX2.

Given the test image dataset, participants are asked to detect objects belonging to the following three classes {pedestrian, vehicle, rider} in each image, reporting the class, bounding box, and confidence of each detection.
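As a rough illustration of what one detection carries, and of the IoU test that the evaluation criteria apply at a threshold of 0.5, here is a minimal Python sketch. The `Detection` type, the corner-format box `(x_min, y_min, x_max, y_max)`, and the inclusive `>=` comparison are assumptions made for this example; the competition's actual submission format is not specified in this section.

```python
from dataclasses import dataclass

CLASSES = ("pedestrian", "vehicle", "rider")


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical layout, for illustration only)."""
    label: str         # one of CLASSES
    box: tuple         # assumed corner format: (x_min, y_min, x_max, y_max)
    confidence: float  # detector score, higher means more certain


def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(det, gt_label, gt_box, threshold=0.5):
    """True if the detection has the right class and enough overlap.

    Whether the boundary case IoU == threshold counts is an assumption here.
    """
    return det.label == gt_label and iou(det.box, gt_box) >= threshold
```

For instance, a `vehicle` detection at `(2, 0, 12, 10)` overlaps a ground-truth `vehicle` box at `(0, 0, 10, 10)` with an IoU of 80/120, so `is_match` accepts it; shifting the detection to `(5, 0, 15, 10)` drops the IoU to 50/150 and the match fails.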

This competition is divided into two stages: qualification and final competition.

• Qualification Competition: all participants submit their answers online, and a score is calculated for each submission. The top 10 teams qualify for the final round of the competition.

• Final Competition: the final score is evaluated by running each model on an NVIDIA Jetson TX2.

#### Prize Information

Champion: One team ($USD 1,500)

Runner-up: One team ($USD 1,000)

3rd-place: One team ($USD 750)

#### Activity time

| Date | Activity |
|---|---|
| 2019/06/01 | Qualification Competition Start, Release Public Testing Data |
| 2019/07/14 | Release Testing Data for Qualification |
| 2019/07/21 15:59:59 UTC | Qualification Competition End |
| 2019/07/22 | Final Competition Start, Release Private Testing Data for Final |
| 2019/07/28 15:59:59 UTC | Final Competition End |
| 2019/08/09 | Award Announcement |

#### Evaluation Criteria

Qualification Competition

The grading rule is based on the MS COCO object detection rule.

• Mean Average Precision (mAP) is used to evaluate the result.
• The intersection over union (IoU) threshold is set at 0.5.

The average precision (AP) of each class is calculated, and the mean AP (mAP) over all classes is the key metric.

The public test dataset will be released at the beginning of the competition. Participants can submit their results online to get scores and see their rank among all teams. In addition, during the qualification competition period, each team has to submit a team composition document that includes the team name, leader, team members, affiliation, contact information, etc.

A separate private test dataset will be released a week before the end of the qualification competition. All participants have to submit their results for the private test dataset within that week, and only the results for the private test dataset will be graded for qualification.

Final Competition

The finalists have to hand in a clone image of the eMMC partition on the NVIDIA Jetson TX2 board that executes the object detection package. We will restore the submitted image to a Jetson TX2 board, run the model, and grade the final score according to the following criteria:

• Mandatory criteria: mAP must be above 50% on the final test dataset.
• Model size (number of parameters * bit width used in storing the parameters) – 30 points
The team with the smallest model will get the full score (30) and the team with the largest one will get zero.
The remaining teams will get scores directly proportional to the model size difference. For example, suppose the smallest model contains 800K parameters and the largest one contains 2M parameters. A team that provides a model with 1.3M parameters will get
$$Score = 30 \times \frac{1300K - 2000K}{800K - 2000K} = 17.5$$
Example of computing model size: For a convolution layer with eight-bit parameters of (input size, output size, kernel size) = $(W_i \times H_i \times C_i,\ W_o \times H_o \times C_o,\ W_k \times H_k)$ with bias added, the model size of this layer (in bits) will be
$$(W_k \times H_k \times C_i + 1) \times C_o \times 8$$
For a fully-connected layer with eight-bit parameters of input size $W_i \times H_i \times C_i$ and output size $W_o \times H_o \times C_o$, with bias added, the model size of this layer (in bits) will be
$$(W_i \times H_i \times C_i + 1) \times W_o \times H_o \times C_o \times 8$$

• Computation Complexity (GOPs/frame) – 30 points
The team with the smallest number of GOPs per frame will get the full score (30) and the team with the largest one will get zero. The remaining teams will get scores directly proportional to the GOP value difference. For example, suppose the smallest model consumes 40 GOPs/frame and the largest one consumes 280 GOPs/frame. A team that provides a model consuming 120 GOPs/frame will get
$$Score = 30 \times \frac{120 - 280}{40 - 280} = 20$$
Example of computing the number of operations: For a convolution layer of input size $W_i \times H_i \times C_i$, output size $W_o \times H_o \times C_o$, and kernel size $W_k \times H_k$, with bias added, the number of operations per frame for this layer will be
$$(W_k \times H_k \times C_i \times 2 + 1) \times W_o \times H_o \times C_o$$
For a fully-connected layer of input size $W_i \times H_i \times C_i$ and output size $W_o \times H_o \times C_o$, with bias added, the number of operations per frame for this layer will be
$$(W_i \times H_i \times C_i \times 2 + 1) \times W_o \times H_o \times C_o$$

• Speed on NVIDIA Jetson TX2 – 40 points
The team whose model completes the detection task in the shortest time will get the full score (40) and the team whose model takes the longest time will get zero. The remaining teams will get scores directly proportional to the execution time difference. For example, suppose the fastest model takes 40 seconds and the slowest one takes 520 seconds. A team whose model takes 160 seconds will get
$$Score = 40 \times \frac{160 - 520}{40 - 520} = 30$$
The evaluation procedure times the overall process, from reading the private testing dataset of the final to completing the submission.csv file, including parsing the image list, loading images, and any other overhead needed to run detection over the whole testing dataset.
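The three graded criteria share the same linear rule: the best value among the teams earns full points, the worst earns zero, and everything in between is interpolated. The Python sketch below reproduces the three worked examples. The function names are hypothetical, and adding the three parts into one total is an assumption suggested by the 30 + 30 + 40 point split rather than something the rules state explicitly.

```python
def proportional_score(value, best, worst, full_points):
    """Linear score: `best` earns `full_points`, `worst` earns 0."""
    if best == worst:
        # The rules do not say what happens when every team ties.
        raise ValueError("best and worst must differ")
    return full_points * (value - worst) / (best - worst)


def final_score(size, ops, seconds, size_range, ops_range, time_range):
    """Total over the three criteria (assumed to be a plain sum).

    Each *_range is a (best, worst) pair, i.e. (smallest, largest)
    value observed among the teams for that criterion.
    """
    return (
        proportional_score(size, *size_range, 30)
        + proportional_score(ops, *ops_range, 30)
        + proportional_score(seconds, *time_range, 40)
    )


# Worked examples from the text above.
print(proportional_score(1300, 800, 2000, 30))  # model size in K params: 17.5
print(proportional_score(120, 40, 280, 30))     # GOPs/frame: 20.0
print(proportional_score(160, 40, 520, 40))     # seconds: 30.0
```

Under the summation assumption, a team with all three example values would total 17.5 + 20 + 30 = 67.5 points, provided its mAP also meets the mandatory criterion.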

A technical report is required that describes the model structure, complexity, execution efficiency, etc. The report will be reviewed and, if it passes the review procedure, published in the IEEE MMSP proceedings.
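When tabulating model size and complexity for such a report, the per-layer formulas from the evaluation criteria can be evaluated directly. The following is a minimal Python sketch with hypothetical function names; it returns the model size in bits and the raw operation count, and only assumes that one GOP means $10^9$ operations.

```python
def conv_size_bits(k_w, k_h, c_in, c_out, bit_width=8):
    """Model size of a convolution layer with bias, in bits."""
    return (k_w * k_h * c_in + 1) * c_out * bit_width


def fc_size_bits(in_w, in_h, in_c, out_w, out_h, out_c, bit_width=8):
    """Model size of a fully-connected layer with bias, in bits."""
    return (in_w * in_h * in_c + 1) * out_w * out_h * out_c * bit_width


def conv_ops(k_w, k_h, c_in, out_w, out_h, c_out):
    """Operations per frame for a convolution layer with bias."""
    return (k_w * k_h * c_in * 2 + 1) * out_w * out_h * c_out


def fc_ops(in_w, in_h, in_c, out_w, out_h, out_c):
    """Operations per frame for a fully-connected layer with bias."""
    return (in_w * in_h * in_c * 2 + 1) * out_w * out_h * out_c


# Illustrative layer (not from any submitted model): a 3x3 convolution
# with 16 input channels, 32 output channels, and a 56x56 output map.
bits = conv_size_bits(3, 3, 16, 32)
ops = conv_ops(3, 3, 16, 56, 56, 32)
print(bits)       # 37120 bits
print(ops / 1e9)  # operations converted to GOPs
```

Summing these values over all layers gives the totals for a whole network; layer types not covered by the two formulas (for example pooling or normalization) would need counting rules of their own, which this section does not define.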

#### Rules

• You cannot sign up to AIdea from multiple accounts and therefore you cannot submit from multiple accounts.
• Team mergers are not allowed in this competition.
• Each team can consist of a maximum of 6 team members.
• The task is open to the public. Individuals, institutions of higher education, research institutes, enterprises, or other organizations can all sign up for the task.
• A leaderboard will be set up and made publicly available.
• Multiple submissions are allowed before the deadline; the last one will be used for the final qualification decision.
• The upload date/time will be used as the tie breaker.
• Privately sharing code or data outside of teams is not permitted. It's okay to share code if made available to all participants on the forums.
• Personnel of the IVSLAB and PAIRLABS teams are not allowed to participate in the task.
• Participants who win the Championship, Runner-up or 3rd-place are required to submit a technical report to IEEE MMSP Conference 2019.
• The common honor code must be observed. Any team that violates it will be disqualified.