Abstract This article describes the artificial intelligence (AI) component of a drone for monitoring and patrolling tasks associated with disaster relief missions in specific restricted disaster scenarios, as specified by the Advanced Robotics Foundation in Japan. The AI component uses deep learning models for environment recognition and object detection. For environment recognition, we use semantic segmentation, or pixel-wise labeling, based on RGB images. Object detection is key for detecting and locating people in need. Since people appear as relatively small objects from the drone's perspective, we use both RGB and thermal images. To train our models, we created a novel, publicly available multispectral data set of people. We used a geo-location method to locate people on the ground. The semantic segmentation models were extensively tested using different feature extractors. We created two dedicated data sets, which we have made publicly available. Compared with the baseline model, the best-performing model increased the mean intersection over union (IoU) by 1.3%. Furthermore, we compared two types of person detection models. The first is an ensemble model that combines RGB and thermal information via "late fusion"; the second is a 4-channel model that combines these two types of information in an "early fusion" manner. The results suggest that the 4-channel model achieved a 40.6% increase in average precision at a stricter IoU threshold (0.75) compared with the ensemble model, and a 5.8% increase in average precision compared with the thermal model. All models were deployed and tested on the NVIDIA AGX Xavier platform. To the best of our knowledge, this study was the first to use both RGB and thermal data from the perspective of a drone for monitoring tasks.
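As a concrete illustration of the "early fusion" scheme mentioned in the abstract, the sketch below stacks an RGB image and a spatially aligned thermal image into a single 4-channel tensor before the first convolution of a detector backbone. The use of PyTorch, the layer sizes, and the module names are assumptions for illustration only; the article does not specify the authors' actual implementation.

```python
# Minimal sketch of "early fusion": RGB + thermal concatenated into a 4-channel input.
# Framework (PyTorch), layer names, and sizes are illustrative assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn


class EarlyFusionStem(nn.Module):
    """First stage of a hypothetical detector backbone accepting 4-channel input."""

    def __init__(self, out_channels: int = 64):
        super().__init__()
        # 4 input channels: R, G, B + thermal
        self.conv = nn.Conv2d(4, out_channels, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        # rgb: (N, 3, H, W), thermal: (N, 1, H, W), both aligned and normalized
        x = torch.cat([rgb, thermal], dim=1)  # (N, 4, H, W)
        return self.act(self.bn(self.conv(x)))


# Example usage with dummy, spatially aligned inputs
rgb = torch.rand(1, 3, 512, 512)
thermal = torch.rand(1, 1, 512, 512)
features = EarlyFusionStem()(rgb, thermal)
print(features.shape)  # torch.Size([1, 64, 256, 256])
```

By contrast, a "late fusion" ensemble would run separate RGB and thermal detectors and merge their detections afterwards, rather than fusing the modalities at the input.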