Learning to Detect Scene Landmarks for Camera Localization

Tien Do; Ondrej Miksik; Joseph Degol; Hyun Soo Park; Sudipta N. Sinha

doi:10.1109/CVPR52688.2022.01085

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Modern camera localization methods that use image retrieval, feature matching, and 3D structure-based pose estimation require long-term storage of numerous scene images or a vast amount of image features. This can make them unsuitable for resource constrained VR/AR devices and also raises serious privacy concerns. We present a new learned camera localization technique that eliminates the need to store features or a detailed 3D point cloud. Our key idea is to implicitly encode the appearance of a sparse yet salient set of 3D scene points into a convolutional neural network (CNN) that can detect these scene points in query images whenever they are visible. We refer to these points as scene landmarks. We also show that a CNN can be trained to regress bearing vectors for such landmarks even when they are not within the camera's field-of-view. We demonstrate that the predicted landmarks yield accurate pose estimates and that our method outperforms DSAC*, the state-of-the-art in learned localization. Furthermore, extending HLoc (an accurate method) by combining its correspondences with our predictions boosts its accuracy even further.
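
The abstract mentions bearing vectors for scene landmarks. As a rough illustration only, and not the authors' implementation, the sketch below shows how a detected landmark pixel could be turned into a unit bearing vector under an assumed pinhole camera model. The intrinsics (`fx`, `fy`, `cx`, `cy`) and all function names are hypothetical and do not come from the paper.

```python
import math


def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Back-project pixel (u, v) to a unit bearing vector in the camera frame.

    Assumes an ideal pinhole camera with no lens distortion (an assumption
    made for this sketch, not a detail taken from the paper).
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def bearing_to_pixel(bearing, fx, fy, cx, cy):
    """Project a bearing vector to a pixel, or return None if it points behind the camera."""
    bx, by, bz = bearing
    if bz <= 0.0:
        # Not representable as a pixel; a bearing vector can still describe
        # such a direction, which a 2D detection cannot.
        return None
    return (fx * bx / bz + cx, fy * by / bz + cy)


def angular_error_deg(b1, b2):
    """Angle in degrees between two unit bearing vectors."""
    dot = sum(p * q for p, q in zip(b1, b2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))
```

A bearing vector with a non-positive z component has no pixel location, which is why a direction-based representation can describe landmarks outside the field of view. Pairing such vectors with known 3D landmark positions gives the kind of 2D-3D constraints that a standard pose solver consumes.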

Original language	English (US)
Title of host publication	Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Publisher	IEEE Computer Society
Pages	11122-11132
Number of pages	11
ISBN (Electronic)	9781665469463
DOIs	https://doi.org/10.1109/CVPR52688.2022.01085
State	Published - 2022
Event	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States
Duration	Jun 19 2022 → Jun 24 2022

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2022-June
ISSN (Print)	1063-6919

Conference

Conference	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Country/Territory	United States
City	New Orleans
Period	6/19/22 → 6/24/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

Pose estimation and tracking

Cite this

Do, T., Miksik, O., Degol, J., Park, H. S., & Sinha, S. N. (2022). Learning to Detect Scene Landmarks for Camera Localization. In Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 (pp. 11122-11132). (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2022-June). IEEE Computer Society. https://doi.org/10.1109/CVPR52688.2022.01085

Learning to Detect Scene Landmarks for Camera Localization. / Do, Tien; Miksik, Ondrej; Degol, Joseph et al.
Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022. IEEE Computer Society, 2022. p. 11122-11132 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2022-June).

Do, T, Miksik, O, Degol, J, Park, HS & Sinha, SN 2022, Learning to Detect Scene Landmarks for Camera Localization. in Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2022-June, IEEE Computer Society, pp. 11122-11132, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, United States, 6/19/22. https://doi.org/10.1109/CVPR52688.2022.01085

Do T, Miksik O, Degol J, Park HS, Sinha SN. Learning to Detect Scene Landmarks for Camera Localization. In Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022. IEEE Computer Society. 2022. p. 11122-11132. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR52688.2022.01085

@inproceedings{22733d2176c84d628c645f01fdf82eb0,
title = "Learning to Detect Scene Landmarks for Camera Localization",

abstract = "Modern camera localization methods that use image retrieval, feature matching, and 3D structure-based pose estimation require long-term storage of numerous scene images or a vast amount of image features. This can make them unsuitable for resource constrained VR/AR devices and also raises serious privacy concerns. We present a new learned camera localization technique that eliminates the need to store features or a detailed 3D point cloud. Our key idea is to implicitly encode the appearance of a sparse yet salient set of 3D scene points into a convolutional neural network (CNN) that can detect these scene points in query images whenever they are visible. We refer to these points as scene landmarks. We also show that a CNN can be trained to regress bearing vectors for such landmarks even when they are not within the camera's field-of-view. We demonstrate that the predicted landmarks yield accurate pose estimates and that our method outperforms DSAC*, the state-of-the-art in learned localization. Furthermore, extending HLoc (an accurate method) by combining its correspondences with our predictions boosts its accuracy even further.",

keywords = "Pose estimation and tracking",
author = "Tien Do and Ondrej Miksik and Joseph Degol and Park, {Hyun Soo} and Sinha, {Sudipta N.}",
note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 ; Conference date: 19-06-2022 Through 24-06-2022",
year = "2022",
doi = "10.1109/CVPR52688.2022.01085",
language = "English (US)",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "11122--11132",
booktitle = "Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022",
}

TY - GEN
T1 - Learning to Detect Scene Landmarks for Camera Localization
AU - Do, Tien
AU - Miksik, Ondrej
AU - Degol, Joseph
AU - Park, Hyun Soo
AU - Sinha, Sudipta N.
PY - 2022
Y1 - 2022

AB - Modern camera localization methods that use image retrieval, feature matching, and 3D structure-based pose estimation require long-term storage of numerous scene images or a vast amount of image features. This can make them unsuitable for resource constrained VR/AR devices and also raises serious privacy concerns. We present a new learned camera localization technique that eliminates the need to store features or a detailed 3D point cloud. Our key idea is to implicitly encode the appearance of a sparse yet salient set of 3D scene points into a convolutional neural network (CNN) that can detect these scene points in query images whenever they are visible. We refer to these points as scene landmarks. We also show that a CNN can be trained to regress bearing vectors for such landmarks even when they are not within the camera's field-of-view. We demonstrate that the predicted landmarks yield accurate pose estimates and that our method outperforms DSAC*, the state-of-the-art in learned localization. Furthermore, extending HLoc (an accurate method) by combining its correspondences with our predictions boosts its accuracy even further.

KW - Pose estimation and tracking
UR - http://www.scopus.com/inward/record.url?scp=85143512857&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85143512857&partnerID=8YFLogxK
U2 - 10.1109/CVPR52688.2022.01085
DO - 10.1109/CVPR52688.2022.01085
M3 - Conference contribution
AN - SCOPUS:85143512857
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 11122
EP - 11132
BT - Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
PB - IEEE Computer Society
T2 - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Y2 - 19 June 2022 through 24 June 2022
ER -
