Object detection performs two tasks, classification and localization, simultaneously. The two tasks share a common need: robust features that effectively represent the visual appearance of objects. However, they also differ in two respects. First, classification mainly requires features from the discriminative parts of an object to determine its category, whereas localization mainly requires features from the entire object region in order to draw a bounding box. Second, classification is translation invariant, whereas localization is translation variant. To make object detection more effective, a network should be designed with both the commonalities and the differences of the two tasks in mind. In this work, we reorganize the layers of existing object detection networks into three parts that reflect these characteristics: a lower-layer feature sharing part, a layer separation part, and a feature fusion part. As a result, properly sharing, separating, and fusing the layers of existing object detection networks noticeably improved detection performance.
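The second difference can be made concrete with a toy example. The sketch below is not the proposed network; it is a minimal one-dimensional illustration in plain Python, and the helper names (`global_max_pool`, `peak_position`, `shift_right`) and the feature values are hypothetical. It assumes that a classification-style readout pools away position, while a localization-style readout reports it.

```python
# Toy illustration (assumption: a 1-D list stands in for a feature map).
# All function names and values here are hypothetical, not from the paper.

def global_max_pool(feature_map):
    """Classification-style readout: keep the strongest response, discard its position."""
    return max(feature_map)


def peak_position(feature_map):
    """Localization-style readout: report where the strongest response occurs."""
    return feature_map.index(max(feature_map))


def shift_right(feature_map, offset):
    """Translate the signal by `offset` cells, padding with zeros on the left."""
    return [0.0] * offset + feature_map[:len(feature_map) - offset]


features = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0, 0.0, 0.0]
shifted = shift_right(features, 3)

# The pooled score is unchanged by the shift (translation invariant).
same_score = global_max_pool(features) == global_max_pool(shifted)

# The peak location moves with the object (translation variant).
position_delta = peak_position(shifted) - peak_position(features)
```

Under these assumptions, the pooled score is identical before and after the shift, while the peak position moves by exactly the shift offset. This is the tension that motivates separating the layers serving the two tasks after a shared lower-layer stage.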