Authors: Xia Jiang, Yijun Zhou, Chuhan Xu, Adam Brufsky, Alan Wells
Abstract: A grid search, at the cost of training and testing a large number of models,
is an effective way to optimize the prediction performance of deep learning
models. A challenging aspect of grid search is managing its running time.
Without a good time-management scheme, a grid search can easily become a
task that will not finish in our lifetime. In this study, we introduce a
heuristic three-stage mechanism for managing the running time of low-budget
grid searches, together with the sweet-spot grid search (SSGS) and randomized
grid search (RGS) strategies for improving model prediction performance, and
apply them to predicting the 5-year, 10-year, and 15-year risk of breast
cancer metastasis. We develop deep
feedforward neural network (DFNN) models and optimize them through grid
searches. We conduct eight cycles of grid searches by applying our three-stage
mechanism and the SSGS and RGS strategies. We also conduct various SHAP
analyses, including unique ones that interpret the importance of the DFNN-model
hyperparameters. Our results show that grid search can greatly improve model
prediction. The grid searches we conducted improved the prediction of 5-year,
10-year, and 15-year breast cancer metastasis risk by 18.6%, 16.3%, and 17.3%,
respectively, over the average performance of all corresponding models we
trained. We not only demonstrate the best model performance but also
characterize grid searches from various aspects, such as their ability to
discover decent models and the unit grid search time. The three-stage mechanism
worked effectively: it made our low-budget grid searches feasible and
manageable while also helping to improve model prediction performance. Our SHAP
analyses identified both the clinical risk factors important for predicting the
future risk of breast cancer metastasis and the DFNN-model hyperparameters
important for predicting performance scores.
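
As a rough illustration of the general idea of a time-budgeted grid search, the
following minimal Python sketch visits a hyperparameter grid in random order
until a wall-clock budget is exhausted, then compares the best score with the
mean score of all evaluated settings. It is not the authors' three-stage
mechanism, SSGS, or RGS, which the abstract does not specify. The names
`randomized_grid_search`, `summarize`, `toy_evaluate`, `toy_grid`, and
`time_budget_s` are hypothetical, the scoring formula is made up, and reporting
the gain relative to the mean score is an assumption, since the abstract does
not say whether its percentages are relative or absolute.

```python
import itertools
import random
import time


def randomized_grid_search(grid, evaluate, time_budget_s, seed=0):
    """Score randomly ordered grid points until the time budget runs out.

    Illustrative only: `grid`, `evaluate`, and `time_budget_s` are
    hypothetical names, not the paper's three-stage mechanism or RGS.
    """
    keys = sorted(grid)
    # Enumerate the full Cartesian grid, then visit it in random order.
    points = [dict(zip(keys, values))
              for values in itertools.product(*(grid[k] for k in keys))]
    random.Random(seed).shuffle(points)

    results = []
    start = time.monotonic()
    for params in points:
        # Stop launching new models once the wall-clock budget is spent.
        if time.monotonic() - start >= time_budget_s:
            break
        results.append((evaluate(params), params))
    return results


def summarize(results):
    """Return the best setting and its gain over the mean score."""
    scores = [score for score, _ in results]
    best_score, best_params = max(results, key=lambda r: r[0])
    mean_score = sum(scores) / len(scores)
    return {
        "best_score": best_score,
        "best_params": best_params,
        "mean_score": mean_score,
        # Assumption: gain is expressed relative to the mean score.
        "relative_gain_pct": 100.0 * (best_score - mean_score) / mean_score,
    }


def toy_evaluate(params):
    """Stand-in for training and validating one model (made-up formula)."""
    return (0.5 + 0.1 * params["hidden_layers"]
            - abs(params["learning_rate"] - 0.01))


toy_grid = {"hidden_layers": [1, 2, 3], "learning_rate": [0.001, 0.01]}
results = randomized_grid_search(toy_grid, toy_evaluate, time_budget_s=60.0)
summary = summarize(results)
print(summary["best_params"], round(summary["relative_gain_pct"], 1))
```

In a real search, `toy_evaluate` would be replaced by a function that trains
one model and returns its validation score. Checking the budget before each
evaluation means a run can overshoot the budget by at most one model's training
time.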
Source: http://arxiv.org/abs/2408.07673v1