Detect-and-describe: Joint learning framework for detection and description of objects

Adeel Zafar; Umar Khalid

doi:10.1051/matecconf/201927702028


Issue		MATEC Web Conf., Volume 277 (2019): 2018 International Joint Conference on Metallurgical and Materials Engineering (JCMME 2018)


Article Number		02028
Number of page(s)		8
Section		Data and Signal Processing
DOI		https://doi.org/10.1051/matecconf/201927702028
Published online		02 April 2019

MATEC Web of Conferences 277, 02028 (2019)

Adeel Zafar¹* and Umar Khalid²

¹ School of Software, Shanghai Jiao Tong University, Shanghai, China
² School of Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China

* Corresponding author: adeelz92@gmail.com

Abstract

Traditional object detection answers two questions: “what” (what is the object?) and “where” (where is the object?). The “what” part of object detection can be refined further, i.e. “what type”, “what shape”, “what material”, etc. This shifts the task from object detection to object description. Describing an object provides additional detail that enables us to understand its characteristics and attributes (“plastic boat”, not just boat; “glass bottle”, not just bottle). This additional information can implicitly be used to gain insight into unseen objects (e.g. an unknown object is “metallic” or “has wheels”), which is not possible in traditional object detection. In this paper, we present a new approach that simultaneously detects objects and infers their attributes, which we call the Detect-and-Describe (DaD) framework. DaD is a deep learning-based approach that extends object detection to object attribute prediction. We train our model on the aPascal train set and evaluate it on the aPascal test set. We achieve 97.0% Area Under the Receiver Operating Characteristic Curve (AUC) for object attribute prediction on the aPascal test set. We also show qualitative results for attribute prediction on unseen objects, which demonstrate the effectiveness of our approach for describing unknown objects.
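As an illustration of the AUC metric mentioned in the abstract, the following is a minimal sketch of how AUC can be computed for binary attribute predictions. It is not the authors' evaluation code. The function names `roc_auc` and `mean_attribute_auc` are hypothetical, the toy labels and scores are made up, and averaging AUC over attributes is an assumption, since the abstract does not say whether the reported figure is averaged per attribute or pooled over all predictions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    higher than a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_attribute_auc(label_rows, score_rows):
    """Average the per-attribute AUC over all attribute columns.

    Each row is one object; each column is one binary attribute
    (e.g. "metallic", "has wheels"). Averaging is an assumption here.
    """
    n_attributes = len(label_rows[0])
    aucs = []
    for j in range(n_attributes):
        column_labels = [row[j] for row in label_rows]
        column_scores = [row[j] for row in score_rows]
        aucs.append(roc_auc(column_labels, column_scores))
    return sum(aucs) / len(aucs)


# Toy data: 4 objects, 2 attributes (values are illustrative only).
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 1]]
toy_scores = [[0.9, 0.2], [0.1, 0.8], [0.4, 0.3], [0.6, 0.7]]
toy_mean_auc = mean_attribute_auc(toy_labels, toy_scores)
```

The pairwise definition used here is quadratic in the number of examples, which keeps the sketch short; a rank-based implementation would be preferable for a full test set.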

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
