Look Both Ways: Self-supervising Driver Gaze Estimation and Road Scene Saliency

Isaac Kasahara; Simon Stent; Hyun Soo Park

doi:10.1007/978-3-031-19778-9_8

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

We present a new on-road driving dataset, called “Look Both Ways”, which contains synchronized video of both driver faces and the forward road scene, along with ground-truth gaze data registered from eye-tracking glasses worn by the drivers. Our dataset supports the study of methods for non-intrusively estimating a driver’s focus of attention while driving, an important application area in road safety. A key challenge is that this task requires accurate gaze estimation, but supervised appearance-based gaze estimation methods often do not transfer well to real driving datasets, and in-domain ground truth to supervise them is difficult to gather. We therefore propose a method for self-supervision of driver gaze that takes advantage of the geometric consistency between the driver’s gaze direction and the saliency of the scene as observed by the driver. We formulate a 3D geometric learning framework to enforce this consistency, allowing the gaze model to supervise the scene saliency model, and vice versa. We implement a prototype of our method and test it with our dataset to show that, compared to a supervised approach, it can yield better gaze estimation and scene saliency estimation with no additional labels.
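
The geometric consistency described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole scene camera, a gaze ray already expressed in the scene-camera frame, and a known distance to the point of regard. The function names (`project_gaze_to_scene`, `gaze_saliency_nll`) and all parameters are hypothetical. The idea shown is only that a predicted gaze ray, once projected into the road-scene image, should land where the saliency map is high, so the disagreement between the two can serve as a training signal in either direction.

```python
import numpy as np


def project_gaze_to_scene(eye_pos, gaze_dir, depth, K):
    """Project the point at distance `depth` along the gaze ray into the scene image.

    Assumption: eye_pos and gaze_dir are 3-vectors in the scene-camera frame,
    depth uses the same units as eye_pos, and K is a 3x3 pinhole intrinsics matrix.
    """
    direction = gaze_dir / np.linalg.norm(gaze_dir)  # unit gaze direction
    point = eye_pos + depth * direction              # hypothesised 3D point of regard
    if point[2] <= 0:
        raise ValueError("point of regard lies behind the scene camera")
    uvw = K @ point                                  # homogeneous pixel coordinates
    return uvw[:2] / uvw[2]                          # (u, v) in pixels


def gaze_saliency_nll(saliency, pixel, eps=1e-8):
    """Negative log-probability of the projected gaze pixel under a saliency map.

    The map is normalised to sum to 1 and sampled at the nearest pixel;
    a low value means gaze and saliency agree.
    """
    prob = saliency / saliency.sum()
    height, width = prob.shape
    u = int(np.clip(round(float(pixel[0])), 0, width - 1))
    v = int(np.clip(round(float(pixel[1])), 0, height - 1))
    return -np.log(prob[v, u] + eps)
```

In a learning setup of this kind, a differentiable version of such a loss could be minimised with respect to the gaze model, the saliency model, or both; the nearest-pixel lookup here is chosen only for readability.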

Original language	English (US)
Title of host publication	Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings
Editors	Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	126-142
Number of pages	17
ISBN (Print)	9783031197772
DOIs	https://doi.org/10.1007/978-3-031-19778-9_8
State	Published - 2022
Event	17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel; Duration: Oct 23 2022 → Oct 27 2022

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13673 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	17th European Conference on Computer Vision, ECCV 2022
Country/Territory	Israel
City	Tel Aviv
Period	10/23/22 → 10/27/22

Bibliographical note

Funding Information:
This research is based on work supported by Toyota Research Institute and the NSF under IIS #1846031. The views and conclusions contained herein are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsors.

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

3D gaze
ADAS
Driving
Saliency
Self-supervised learning

Cite this

Kasahara, I., Stent, S., & Park, H. S. (2022). Look Both Ways: Self-supervising Driver Gaze Estimation and Road Scene Saliency. In S. Avidan, G. Brostow, M. Cissé, G. M. Farinella, & T. Hassner (Eds.), Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings (pp. 126-142). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13673 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19778-9_8

@inproceedings{2abdeaad90004322ad80472d5caf12b4,

title = "Look Both Ways: Self-supervising Driver Gaze Estimation and Road Scene Saliency",

abstract = "We present a new on-road driving dataset, called “Look Both Ways”, which contains synchronized video of both driver faces and the forward road scene, along with ground truth gaze data registered from eye tracking glasses worn by the drivers. Our dataset supports the study of methods for non-intrusively estimating a driver{\textquoteright}s focus of attention while driving - an important application area in road safety. A key challenge is that this task requires accurate gaze estimation, but supervised appearance-based gaze estimation methods often do not transfer well to real driving datasets, and in-domain ground truth to supervise them is difficult to gather. We therefore propose a method for self-supervision of driver gaze, by taking advantage of the geometric consistency between the driver{\textquoteright}s gaze direction and the saliency of the scene as observed by the driver. We formulate a 3D geometric learning framework to enforce this consistency, allowing the gaze model to supervise the scene saliency model, and vice versa. We implement a prototype of our method and test it with our dataset, to show that compared to a supervised approach it can yield better gaze estimation and scene saliency estimation with no additional labels.",

keywords = "3D gaze, ADAS, Driving, Saliency, Self-supervised learning",

author = "Isaac Kasahara and Simon Stent and Park, {Hyun Soo}",

note = "Funding Information: Acknowledgement. This research is based on work supported by Toyota Research Institute and the NSF under IIS \#1846031. The views and conclusions contained herein are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsors. Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 17th European Conference on Computer Vision, ECCV 2022 ; Conference date: 23-10-2022 Through 27-10-2022",

year = "2022",

doi = "10.1007/978-3-031-19778-9_8",

language = "English (US)",

isbn = "9783031197772",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "126--142",

editor = "Shai Avidan and Gabriel Brostow and Moustapha Ciss{\'e} and Farinella, {Giovanni Maria} and Tal Hassner",

booktitle = "Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings",

address = "Germany",

}

TY - GEN

T1 - Look Both Ways: Self-supervising Driver Gaze Estimation and Road Scene Saliency

T2 - 17th European Conference on Computer Vision, ECCV 2022

AU - Kasahara, Isaac

AU - Stent, Simon

AU - Park, Hyun Soo

N1 - Funding Information: Acknowledgement. This research is based on work supported by Toyota Research Institute and the NSF under IIS #1846031. The views and conclusions contained herein are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsors. Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

PY - 2022

Y1 - 2022

AB - We present a new on-road driving dataset, called “Look Both Ways”, which contains synchronized video of both driver faces and the forward road scene, along with ground truth gaze data registered from eye tracking glasses worn by the drivers. Our dataset supports the study of methods for non-intrusively estimating a driver’s focus of attention while driving - an important application area in road safety. A key challenge is that this task requires accurate gaze estimation, but supervised appearance-based gaze estimation methods often do not transfer well to real driving datasets, and in-domain ground truth to supervise them is difficult to gather. We therefore propose a method for self-supervision of driver gaze, by taking advantage of the geometric consistency between the driver’s gaze direction and the saliency of the scene as observed by the driver. We formulate a 3D geometric learning framework to enforce this consistency, allowing the gaze model to supervise the scene saliency model, and vice versa. We implement a prototype of our method and test it with our dataset, to show that compared to a supervised approach it can yield better gaze estimation and scene saliency estimation with no additional labels.

KW - 3D gaze

KW - ADAS

KW - Driving

KW - Saliency

KW - Self-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85142724367&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85142724367&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-19778-9_8

DO - 10.1007/978-3-031-19778-9_8

M3 - Conference contribution

AN - SCOPUS:85142724367

SN - 9783031197772

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 126

EP - 142

BT - Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings

A2 - Avidan, Shai

A2 - Brostow, Gabriel

A2 - Cissé, Moustapha

A2 - Farinella, Giovanni Maria

A2 - Hassner, Tal

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 23 October 2022 through 27 October 2022

ER -