TY - JOUR
T1 - Automatic Segmentation of Kidneys and Kidney Tumors
T2 - The KiTS19 International Challenge
AU - Sathianathen, Niranjan J
AU - Heller, Nicholas
AU - Tejpaul, Resha
AU - Stai, Bethany
AU - Kalapara, Arveen
AU - Rickman, Jack
AU - Dean, Joshua
AU - Oestreich, Makinna
AU - Blake, Paul
AU - Kaluzniak, Heather
AU - Raza, Shaneabbas
AU - Rosenberg, Joel
AU - Moore, Keenan
AU - Walczak, Edward
AU - Rengel, Zachary
AU - Edgerton, Zach
AU - Vasdev, Ranveer
AU - Peterson, Matthew
AU - McSweeney, Sean
AU - Peterson, Sarah
AU - Papanikolopoulos, Nikolaos
AU - Weight, Christopher J
N1 - Publisher Copyright:
Copyright © 2022 Sathianathen, Heller, Tejpaul, Stai, Kalapara, Rickman, Dean, Oestreich, Blake, Kaluzniak, Raza, Rosenberg, Moore, Walczak, Rengel, Edgerton, Vasdev, Peterson, McSweeney, Peterson, Papanikolopoulos and Weight.
PY - 2022/1/4
Y1 - 2022/1/4
AB - Purpose: Clinicians rely on imaging features to calculate the complexity of renal masses using validated scoring systems. These scoring methods are labor-intensive and subject to interobserver variability. Artificial intelligence has been increasingly used by the medical community to address such issues. However, developing reliable algorithms is usually time-consuming and costly. We created an international community-driven competition (KiTS19) to develop and identify the best system for automatic segmentation of kidneys and kidney tumors in contrast CT, and we report the results. Methods: Training and test sets of CT scans, manually annotated by trained individuals, were generated from consecutive patients undergoing renal surgery for whom demographic, clinical, and outcome data were available. The KiTS19 Challenge was a machine learning competition hosted on grand-challenge.org in conjunction with an international conference. Teams were given 3 months to develop their algorithms using a fully annotated training set of images. An unannotated test set was then released for 2 weeks, from which the average Sørensen-Dice coefficients for the kidney and tumor regions were calculated across all 90 test cases. Results: There were 100 valid submissions based on deep neural networks, but they differed in pre-processing strategies, architectural details, and training procedures. The winning team achieved a kidney Dice of 0.974 and a tumor Dice of 0.851, resulting in a composite score of 0.912. Automatic segmentation of the kidney by the participating teams performed comparably to expert manual segmentation but was less reliable when segmenting the tumor. Conclusion: Rapid advancement in automated semantic segmentation of kidney lesions is possible with relatively high accuracy when the data are released publicly and participation is incentivized. We hope that our findings will encourage further research that could enable the adoption of AI in the medical field.
KW - CT scans
KW - kidney tumors
KW - medical images
KW - renal mass
KW - semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85131254408&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131254408&partnerID=8YFLogxK
U2 - 10.3389/fdgth.2021.797607
DO - 10.3389/fdgth.2021.797607
M3 - Article
C2 - 35059687
AN - SCOPUS:85131254408
SN - 2673-253X
VL - 3
JO - Frontiers in Digital Health
JF - Frontiers in Digital Health
M1 - 797607
ER -