Issue | MATEC Web Conf., Volume 277, 2019: 2018 International Joint Conference on Metallurgical and Materials Engineering (JCMME 2018) |
---|---|
Article Number | 02028 |
Number of page(s) | 8 |
Section | Data and Signal Processing |
DOI | https://doi.org/10.1051/matecconf/201927702028 |
Published online | 02 April 2019 |
Detect-and-describe: Joint learning framework for detection and description of objects
1 School of Software, Shanghai Jiao Tong University, Shanghai, China
2 School of Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
* Corresponding author: adeelz92@gmail.com
Traditional object detection answers two questions: “what” (what is the object?) and “where” (where is the object?). The “what” part of object detection can be refined further, i.e. “what type”, “what shape”, “what material”, and so on. This shifts the task from object detection to object description. Describing an object provides additional detail that helps us understand its characteristics and attributes (“plastic boat”, not just “boat”; “glass bottle”, not just “bottle”). This additional information can implicitly be used to gain insight into unseen objects (e.g. an unknown object is “metallic” or “has wheels”), which is not possible in traditional object detection. In this paper, we present a new approach that simultaneously detects objects and infers their attributes; we call it the Detect-and-Describe (DaD) framework. DaD is a deep learning-based approach that extends object detection to object attribute prediction. We train our model on the aPascal train set and evaluate it on the aPascal test set, where we achieve 97.0% Area Under the Receiver Operating Characteristic Curve (AUC) for object attribute prediction. We also show qualitative results for attribute prediction on unseen objects, which demonstrate the effectiveness of our approach for describing unknown objects.
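To make the reported metric concrete, the following is a minimal sketch of how an AUC score can be computed for multi-label attribute predictions. It is not the authors' evaluation code: the function names `roc_auc` and `pooled_attribute_auc`, the toy labels and scores, and the choice to pool all (object, attribute) pairs into one AUC are assumptions made for illustration. The abstract does not state whether the 97.0% figure is pooled or averaged per attribute.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pooled_attribute_auc(label_rows, score_rows):
    """Pool every (object, attribute) pair and compute a single AUC.

    Assumption: pooling is one possible convention; averaging the AUC
    per attribute is another, and the two can give different numbers.
    """
    flat_labels = [y for row in label_rows for y in row]
    flat_scores = [s for row in score_rows for s in row]
    return roc_auc(flat_labels, flat_scores)


# Hypothetical toy data: 3 detected objects, 3 binary attributes each
# (for example "metallic", "has wheels", "plastic").
toy_labels = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
toy_scores = [[0.9, 0.2, 0.8], [0.3, 0.1, 0.7], [0.6, 0.4, 0.5]]
toy_auc = pooled_attribute_auc(toy_labels, toy_scores)
```

In this toy case 19 of the 20 positive/negative pairs are ranked correctly, so `toy_auc` is 0.95. The pairwise loop is quadratic in the number of predictions, which is fine for illustration; a rank-based computation would be preferable at dataset scale.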
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.