Abstract
Breast cancer remains one of the leading causes of cancer-related deaths worldwide, making accurate survival prediction essential for improving patient care and treatment decisions. Advances in artificial intelligence, particularly deep learning (DL), offer a promising route to more accurate prediction. In this study, we compare the performance of conventional machine learning (ML) models and DL models on a dataset of 4,024 patients. We evaluate Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Support Vector Machine (SVM), Logistic Regression, and K-Nearest Neighbors (KNN), alongside a fine-tuned GPT-4 model. Our results indicate that BiLSTM outperforms the other recurrent and conventional models, achieving an F1 score of 0.95 and an accuracy of 91% for the "Alive" class, followed closely by LSTM with an F1 score of 0.94 and an accuracy of 90%. Conventional ML models such as SVM and Logistic Regression perform only moderately well in detecting the "Dead" class, with F1 scores of 0.60 and 0.58, respectively, and KNN lags behind with an F1 score of 0.47. The fine-tuned GPT-4 model also demonstrates strong predictive capability, reaching an F1 score of 0.97 for the "Alive" class and an overall accuracy of 96.67%. While DL models, particularly BiLSTM and GPT-4, prove effective in predicting survival, class imbalance still limits prediction quality for the minority "Dead" class. This study highlights the potential of deep learning in medical prognosis and underscores the need for further optimization to ensure balanced performance across all patient groups.
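
The gap between overall accuracy and minority-class F1 reported above can be illustrated with a minimal sketch. The label counts below are made up for illustration and are not the study's data; the helper names `per_class_f1` and `accuracy` are likewise hypothetical. The sketch only shows how a classifier can reach high accuracy and a high "Alive" F1 while its "Dead" F1 stays much lower when the classes are imbalanced.

```python
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Return the F1 score of each label, treating that label as the positive class."""
    scores: Dict[str, float] = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, imbalanced labels (NOT taken from the study):
# 850 "Alive" patients and 150 "Dead" patients.
y_true = ["Alive"] * 850 + ["Dead"] * 150
# Assumed predictions: 820 of the "Alive" cases and 75 of the "Dead" cases are correct.
y_pred = ["Alive"] * 820 + ["Dead"] * 30 + ["Alive"] * 75 + ["Dead"] * 75

f1 = per_class_f1(y_true, y_pred)
acc = accuracy(y_true, y_pred)

print(f"accuracy   = {acc:.3f}")
print(f"F1 (Alive) = {f1['Alive']:.3f}")
print(f"F1 (Dead)  = {f1['Dead']:.3f}")
```

With these assumed counts, accuracy and the majority-class F1 are both high while the minority-class F1 is markedly lower, which is why per-class metrics are worth reporting alongside accuracy on imbalanced survival data.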

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2025 Sulaiman Khan, Muhammad Usman, Muhammad Ahmad, Fida Ullah, Sardar Usman, Muhammad Muzamil, Ameer Hamza, Muhammad Jalal, Ildar Batyrshin, Carlos Aguilar-Ibañez (Author)