Ultra-rapid object categorization in real-world scenes with top-down manipulations

Bingjie Xu; Mohan S. Kankanhalli; Qi Zhao

doi:10.1371/journal.pone.0214444

Computer Science and Engineering

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Humans recognize visual objects rapidly and effortlessly. Object categorization is commonly believed to arise from the interaction between bottom-up and top-down cognitive processing. In the ultra-rapid categorization scenario, where stimuli appear briefly and response time is limited, a first sweep of feedforward information is assumed to be sufficient to discriminate whether or not an object is present in a scene. However, whether and how feedback/top-down processing is involved in such a brief duration remains an open question. To this end, we examine how different top-down manipulations, such as category level, category type, and real-world size, interact in ultra-rapid categorization. We constructed a dataset of real-world scene images with a built-in measurement of target object display size. Using this set of images, we measured ultra-rapid object categorization performance in human subjects. Standard feedforward computational models representing scene features and a state-of-the-art object detection model were employed for auxiliary investigation. The results showed influences of 1) animacy (animal, vehicle, food), 2) level of abstraction (people, sport), and 3) real-world size (four target size levels) on ultra-rapid categorization processes. These findings support the involvement of top-down processing when rapidly categorizing certain objects, such as sport at a fine-grained level. Our human vs. model comparisons also shed light on possible collaboration and integration of the two, which may be of interest to both experimental and computational vision research. All the collected images and behavioral data, as well as code and models, are publicly available at https://osf.io/mqwjz/.

Original language	English (US)
Article number	e0214444
Journal	PLOS ONE
Volume	14
Issue number	4
DOIs	https://doi.org/10.1371/journal.pone.0214444
State	Published - Apr 2019

Bibliographical note

Funding Information:
This research was funded in part by the NSF under Grant 1849107, in part by the University of Minnesota Department of Computer Science and Engineering Start-up Fund (QZ), and in part by the National Research Foundation, Prime Minister’s Office, Singapore under its Strategic Capability Research Centres Funding Initiative.

Publisher Copyright:
© 2019 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Cite this

@article{17990b0525f44a8c9055acd7de8fa428,

title = "Ultra-rapid object categorization in real-world scenes with top-down manipulations",

abstract = "Humans are able to achieve visual object recognition rapidly and effortlessly. Object categorization is commonly believed to be achieved by interaction between bottom-up and top-down cognitive processing. In the ultra-rapid categorization scenario where the stimuli appear briefly and response time is limited, it is assumed that a first sweep of feedforward information is sufficient to discriminate whether or not an object is present in a scene. However, whether and how feedback/top-down processing is involved in such a brief duration remains an open question. To this end, here, we would like to examine how different top-down manipulations, such as category level, category type and real-world size, interact in ultra-rapid categorization. We have constructed a dataset comprising real-world scene images with a built-in measurement of target object display size. Based on this set of images, we have measured ultra-rapid object categorization performance by human subjects. Standard feedforward computational models representing scene features and a state-of-the-art object detection model were employed for auxiliary investigation. The results showed the influences from 1) animacy (animal, vehicle, food), 2) level of abstraction (people, sport), and 3) real-world size (four target size levels) on ultra-rapid categorization processes. This had an impact to support the involvement of top-down processing when rapidly categorizing certain objects, such as sport at a fine grained level. Our work on human vs. model comparisons also shed light on possible collaboration and integration of the two that may be of interest to both experimental and computational vision researches. All the collected images and behavioral data as well as code and models are publicly available at https://osf.io/mqwjz/.",

author = "Xu, Bingjie and Kankanhalli, Mohan S. and Zhao, Qi",

note = "Funding Information: This research was funded in part by the NSF under Grant 1849107, in part by the University of Minnesota Department of Computer Science and Engineering Start-up Fund (QZ), and in part by the National Research Foundation, Prime Minister{\textquoteright}s Office, Singapore under its Strategic Capability Research Centres Funding Initiative. Publisher Copyright: {\textcopyright} 2019 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.",

year = "2019",

month = apr,

doi = "10.1371/journal.pone.0214444",

language = "English (US)",

volume = "14",

journal = "PLOS ONE",

issn = "1932-6203",

publisher = "Public Library of Science",

number = "4",

pages = "e0214444",

}

TY - JOUR

T1 - Ultra-rapid object categorization in real-world scenes with top-down manipulations

AU - Xu, Bingjie

AU - Kankanhalli, Mohan S.

AU - Zhao, Qi

N1 - Funding Information: This research was funded in part by the NSF under Grant 1849107, in part by the University of Minnesota Department of Computer Science and Engineering Start-up Fund (QZ), and in part by the National Research Foundation, Prime Minister’s Office, Singapore under its Strategic Capability Research Centres Funding Initiative. Publisher Copyright: © 2019 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

PY - 2019/4

Y1 - 2019/4

N2 - Humans are able to achieve visual object recognition rapidly and effortlessly. Object categorization is commonly believed to be achieved by interaction between bottom-up and top-down cognitive processing. In the ultra-rapid categorization scenario where the stimuli appear briefly and response time is limited, it is assumed that a first sweep of feedforward information is sufficient to discriminate whether or not an object is present in a scene. However, whether and how feedback/top-down processing is involved in such a brief duration remains an open question. To this end, here, we would like to examine how different top-down manipulations, such as category level, category type and real-world size, interact in ultra-rapid categorization. We have constructed a dataset comprising real-world scene images with a built-in measurement of target object display size. Based on this set of images, we have measured ultra-rapid object categorization performance by human subjects. Standard feedforward computational models representing scene features and a state-of-the-art object detection model were employed for auxiliary investigation. The results showed the influences from 1) animacy (animal, vehicle, food), 2) level of abstraction (people, sport), and 3) real-world size (four target size levels) on ultra-rapid categorization processes. This had an impact to support the involvement of top-down processing when rapidly categorizing certain objects, such as sport at a fine grained level. Our work on human vs. model comparisons also shed light on possible collaboration and integration of the two that may be of interest to both experimental and computational vision researches. All the collected images and behavioral data as well as code and models are publicly available at https://osf.io/mqwjz/.

AB - Humans are able to achieve visual object recognition rapidly and effortlessly. Object categorization is commonly believed to be achieved by interaction between bottom-up and top-down cognitive processing. In the ultra-rapid categorization scenario where the stimuli appear briefly and response time is limited, it is assumed that a first sweep of feedforward information is sufficient to discriminate whether or not an object is present in a scene. However, whether and how feedback/top-down processing is involved in such a brief duration remains an open question. To this end, here, we would like to examine how different top-down manipulations, such as category level, category type and real-world size, interact in ultra-rapid categorization. We have constructed a dataset comprising real-world scene images with a built-in measurement of target object display size. Based on this set of images, we have measured ultra-rapid object categorization performance by human subjects. Standard feedforward computational models representing scene features and a state-of-the-art object detection model were employed for auxiliary investigation. The results showed the influences from 1) animacy (animal, vehicle, food), 2) level of abstraction (people, sport), and 3) real-world size (four target size levels) on ultra-rapid categorization processes. This had an impact to support the involvement of top-down processing when rapidly categorizing certain objects, such as sport at a fine grained level. Our work on human vs. model comparisons also shed light on possible collaboration and integration of the two that may be of interest to both experimental and computational vision researches. All the collected images and behavioral data as well as code and models are publicly available at https://osf.io/mqwjz/.

UR - http://www.scopus.com/inward/record.url?scp=85064171416&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064171416&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0214444

DO - 10.1371/journal.pone.0214444

M3 - Article

C2 - 30969988

AN - SCOPUS:85064171416

SN - 1932-6203

VL - 14

JO - PLOS ONE

JF - PLOS ONE

IS - 4

M1 - e0214444

ER -
