Self-supervised Secondary Landmark Detection via 3D Representation Learning

Praneet Bala; Jan Zimmermann; Hyun Soo Park; Benjamin Y. Hayden

doi:10.1007/s11263-023-01804-y

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Recent technological developments have led to great advances in the computerized tracking of joints and other landmarks in moving animals, including humans. Such tracking promises important advances in biology and biomedicine. Modern tracking models depend critically on labor-intensive annotated datasets of primary landmarks by non-expert humans. However, such annotation approaches can be costly and impractical for secondary landmarks, that is, ones that reflect fine-grained geometry of animals, and that are often specific to customized behavioral tasks. Due to visual and geometric ambiguity, non-experts are often not qualified for secondary landmark annotation, which can require anatomical and zoological knowledge. These barriers significantly impede downstream behavioral studies because the learned tracking models exhibit limited generalizability. We hypothesize that there exists a shared representation between the primary and secondary landmarks because the range of motion of the secondary landmarks can be approximately spanned by that of the primary landmarks. We present a method to learn this spatial relationship of the primary and secondary landmarks in three-dimensional space, which can, in turn, self-supervise the secondary landmark detector. This 3D representation learning is generic, and can therefore be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.

Original language	English (US)
Pages (from-to)	1980-1994
Number of pages	15
Journal	International Journal of Computer Vision
Volume	131
Issue number	8
DOIs	https://doi.org/10.1007/s11263-023-01804-y
State	Published - Aug 2023

Bibliographical note

Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Keywords

Contrastive learning
Human and non-human dataset
Landmark detection
Self-supervised learning
Shared representations

Cite this

@article{e481507394c34512bebad55292486d28,

title = "Self-supervised Secondary Landmark Detection via 3D Representation Learning",

abstract = "Recent technological developments have led to great advances in the computerized tracking of joints and other landmarks in moving animals, including humans. Such tracking promises important advances in biology and biomedicine. Modern tracking models depend critically on labor-intensive annotated datasets of primary landmarks by non-expert humans. However, such annotation approaches can be costly and impractical for secondary landmarks, that is, ones that reflect fine-grained geometry of animals, and that are often specific to customized behavioral tasks. Due to visual and geometric ambiguity, non-experts are often not qualified for secondary landmark annotation, which can require anatomical and zoological knowledge. These barriers significantly impede downstream behavioral studies because the learned tracking models exhibit limited generalizability. We hypothesize that there exists a shared representation between the primary and secondary landmarks because the range of motion of the secondary landmarks can be approximately spanned by that of the primary landmarks. We present a method to learn this spatial relationship of the primary and secondary landmarks in three-dimensional space, which can, in turn, self-supervise the secondary landmark detector. This 3D representation learning is generic, and can therefore be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.",

keywords = "Contrastive learning, Human and non-human dataset, Landmark detection, Self-supervised learning, Shared representations",

author = "Praneet Bala and Jan Zimmermann and Park, {Hyun Soo} and Hayden, {Benjamin Y.}",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.",

year = "2023",

month = aug,

doi = "10.1007/s11263-023-01804-y",

language = "English (US)",

volume = "131",

pages = "1980--1994",

journal = "International Journal of Computer Vision",

issn = "0920-5691",

publisher = "Springer Netherlands",

number = "8",

}

TY - JOUR

T1 - Self-supervised Secondary Landmark Detection via 3D Representation Learning

AU - Bala, Praneet

AU - Zimmermann, Jan

AU - Park, Hyun Soo

AU - Hayden, Benjamin Y.

PY - 2023/8

Y1 - 2023/8

N2 - Recent technological developments have led to great advances in the computerized tracking of joints and other landmarks in moving animals, including humans. Such tracking promises important advances in biology and biomedicine. Modern tracking models depend critically on labor-intensive annotated datasets of primary landmarks by non-expert humans. However, such annotation approaches can be costly and impractical for secondary landmarks, that is, ones that reflect fine-grained geometry of animals, and that are often specific to customized behavioral tasks. Due to visual and geometric ambiguity, non-experts are often not qualified for secondary landmark annotation, which can require anatomical and zoological knowledge. These barriers significantly impede downstream behavioral studies because the learned tracking models exhibit limited generalizability. We hypothesize that there exists a shared representation between the primary and secondary landmarks because the range of motion of the secondary landmarks can be approximately spanned by that of the primary landmarks. We present a method to learn this spatial relationship of the primary and secondary landmarks in three-dimensional space, which can, in turn, self-supervise the secondary landmark detector. This 3D representation learning is generic, and can therefore be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.

AB - Recent technological developments have led to great advances in the computerized tracking of joints and other landmarks in moving animals, including humans. Such tracking promises important advances in biology and biomedicine. Modern tracking models depend critically on labor-intensive annotated datasets of primary landmarks by non-expert humans. However, such annotation approaches can be costly and impractical for secondary landmarks, that is, ones that reflect fine-grained geometry of animals, and that are often specific to customized behavioral tasks. Due to visual and geometric ambiguity, non-experts are often not qualified for secondary landmark annotation, which can require anatomical and zoological knowledge. These barriers significantly impede downstream behavioral studies because the learned tracking models exhibit limited generalizability. We hypothesize that there exists a shared representation between the primary and secondary landmarks because the range of motion of the secondary landmarks can be approximately spanned by that of the primary landmarks. We present a method to learn this spatial relationship of the primary and secondary landmarks in three-dimensional space, which can, in turn, self-supervise the secondary landmark detector. This 3D representation learning is generic, and can therefore be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.

KW - Contrastive learning

KW - Human and non-human dataset

KW - Landmark detection

KW - Self-supervised learning

KW - Shared representations

UR - http://www.scopus.com/inward/record.url?scp=85160812894&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85160812894&partnerID=8YFLogxK

U2 - 10.1007/s11263-023-01804-y

DO - 10.1007/s11263-023-01804-y

M3 - Article

AN - SCOPUS:85160812894

SN - 0920-5691

VL - 131

SP - 1980

EP - 1994

JO - International Journal of Computer Vision

JF - International Journal of Computer Vision

IS - 8

ER -
