Survival Stacking Ensemble Model for Lung Cancer Risk Prediction

Date: 27.11.2024


Abstract

The most well-established risk factor for lung cancer (LC) is smoking, responsible for approximately 85% of cases. The Lung Cancer Risk Assessment Tool (LCRAT), a key advancement in this field, predicts individual risk from factors such as smoking habits, demographic details, personal and family medical history, and environmental exposures. This paper proposes a model that uses fewer features and a simplified stacking ensemble to improve on state-of-the-art performance, making it more accessible and easier to implement in routine healthcare practice. The data used in this work were derived from two cohorts in the United States: the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. Our model and LCRAT achieve AUCs of 0.799 and 0.782 on the test set, respectively. In terms of the proportion of positives detected within 50% of the population, they identify 0.766 and 0.754 of the cases, respectively. Combining different survival models enhances robustness by mitigating the weaknesses of individual models, which in turn increases the efficiency and generalizability of the ensemble.
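The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes that "in the 50% of the population" means the half of individuals with the highest predicted risk, and it computes a plain binary AUC that ignores censoring and follow-up time; the paper may use a different, time-dependent definition. The function names and the toy data are hypothetical and are not taken from NLST or PLCO.

```python
def auc(scores, labels):
    """Probability that a random case is ranked above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def capture_rate(scores, labels, fraction=0.5):
    """Share of all cases found in the top `fraction` of people ranked by risk."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = int(len(scores) * fraction)  # number of people in the selected group
    return sum(labels[i] for i in order[:k]) / sum(labels)


# Toy example (illustrative only): predicted risks and observed outcomes.
risk = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
label = [1, 0, 1, 0, 0, 1, 0, 0]

print(round(auc(risk, label), 3))                # 0.733
print(round(capture_rate(risk, label, 0.5), 3))  # 0.667
```

Under this reading, a capture rate of 0.766 would mean that 76.6% of all observed cases fall within the higher-risk half of the population.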

BIB_text

@Article {
title = {Survival Stacking Ensemble Model for Lung Cancer Risk Prediction},
pages = {155-159},
keywds = {Cancer; ensemble models; machine learning; risk factors},
isbn = {978-164368554-0},
date = {2024-11-27},
}
Vicomtech
