Survival Stacking Ensemble Model for Lung Cancer Risk Prediction

Date: 27.11.2024


Abstract

The most well-established risk factor for lung cancer (LC) is smoking, responsible for approximately 85% of cases. The Lung Cancer Risk Assessment Tool (LCRAT), a key advancement in this field, predicts individual risk from factors such as smoking habits, demographic details, personal and family medical history, and environmental exposures. This paper proposes a model with fewer features that improves on state-of-the-art performance using a simplified stacking ensemble, making it more accessible and easier to implement in routine healthcare practice. The data used in this work were derived from two cohorts in the United States: the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. On the test set, our model and LCRAT achieve AUCs of 0.799 and 0.782, respectively. In terms of the percentage of positives, within 50% of the population they detect 0.766 and 0.754 of the cases, respectively. Combining different survival models enhances robustness by mitigating the weaknesses of individual models, which in turn increases the model's efficiency and generalizability.

BIB_text

@Article {
title = {Survival Stacking Ensemble Model for Lung Cancer Risk Prediction},
pages = {155--159},
keywds = {
Cancer; ensemble models; machine learning; risk factors
},
abstract = {

The most well-established risk factor for lung cancer (LC) is smoking, responsible for approximately 85% of cases. The Lung Cancer Risk Assessment Tool (LCRAT), a key advancement in this field, predicts individual risk from factors such as smoking habits, demographic details, personal and family medical history, and environmental exposures. This paper proposes a model with fewer features that improves on state-of-the-art performance using a simplified stacking ensemble, making it more accessible and easier to implement in routine healthcare practice. The data used in this work were derived from two cohorts in the United States: the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. On the test set, our model and LCRAT achieve AUCs of 0.799 and 0.782, respectively. In terms of the percentage of positives, within 50% of the population they detect 0.766 and 0.754 of the cases, respectively. Combining different survival models enhances robustness by mitigating the weaknesses of individual models, which in turn increases the model's efficiency and generalizability.


},
isbn = {978-164368554-0},
date = {2024-11-27},
}
