Toward a Data-Driven System for Personalized Cervical Cancer Screening
Friday, 13 November 2020 3:55pm - 4:10pm EDT
Description: Mass-screening programs for cervical cancer in the Nordic countries have proven highly effective at preventing cancer at the population level and have produced large amounts of data, centrally organized in nationwide registries. Despite this success, minimizing over-screening and under-treatment remains a major challenge. The main difficulties in deriving personalized models from cancer screening data stem from its high sparsity, irregularity and skewness. In this paper, we present a novel approach based on matrix factorization for personalized, time-dependent risk assessment of cervical cancer development. The task can be cast as time-series prediction, where the data from each woman is represented as a sparse vector along the time dimension and the data from the whole population is collected in a single matrix. We explore the latent structure of this matrix by imposing a novel temporal regularization, and derive a small number of basic profiles that describe the population.
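
To make the idea concrete, the following is a minimal sketch of matrix factorization on a sparsely observed population matrix with a temporal smoothness penalty. It is not the authors' algorithm: the abstract does not specify the regularizer, so the first-difference penalty on the time profiles, the alternating least-squares solver, the function name `fit_temporal_mf`, and all parameter values are assumptions chosen for illustration. The synthetic data are real-valued and randomly masked, which is much simpler than actual screening records.

```python
import numpy as np

def fit_temporal_mf(X, mask, rank=2, lam=0.1, lam_t=1.0, n_iters=15, seed=0):
    """Fit X ~ U @ V.T on observed entries only (hypothetical helper).

    X    : (n, T) array, one row per individual, one column per time step.
    mask : (n, T) boolean array, True where an entry was observed.
    Columns of V are the "basic profiles" over time; U holds each
    individual's weights on those profiles.
    """
    rng = np.random.default_rng(seed)
    n, T = X.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(T, rank))
    I = np.eye(rank)
    # Assumed temporal regularizer: penalize differences between the
    # profile values at consecutive time steps, lam_t * ||D V||_F^2.
    D = np.diff(np.eye(T), axis=0)        # (T-1, T) first-difference operator
    L = np.kron(D.T @ D, I)               # acts on V flattened row by row
    losses = []
    for _ in range(n_iters):
        # Exact ridge update of each individual's weights from observed entries.
        for i in range(n):
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + lam * I, Vo.T @ X[i, mask[i]])
        # Exact update of all time profiles jointly, because the temporal
        # penalty couples neighbouring time steps.
        A = lam_t * L + lam * np.eye(T * rank)
        b = np.zeros(T * rank)
        for t in range(T):
            Uo = U[mask[:, t]]
            s = slice(t * rank, (t + 1) * rank)
            A[s, s] += Uo.T @ Uo
            b[s] = Uo.T @ X[mask[:, t], t]
        V = np.linalg.solve(A, b).reshape(T, rank)
        # Regularized objective evaluated on observed entries only.
        resid = np.where(mask, X - U @ V.T, 0.0)
        losses.append(float((resid ** 2).sum()
                            + lam * ((U ** 2).sum() + (V ** 2).sum())
                            + lam_t * ((D @ V) ** 2).sum()))
    return U, V, losses

# Synthetic demo: 200 individuals, 30 time steps, about 20% of entries observed.
rng = np.random.default_rng(1)
n, T = 200, 30
t = np.linspace(0.0, 1.0, T)
profiles = np.stack([t, np.sin(np.pi * t)], axis=1)   # two smooth profiles
X_full = rng.uniform(0.0, 1.0, size=(n, 2)) @ profiles.T
mask = rng.random((n, T)) < 0.2

U, V, losses = fit_temporal_mf(X_full, mask, rank=2)
X_hat = U @ V.T   # dense reconstruction, including unobserved time points
```

Each row of `X_hat` fills in the unobserved time points for one individual, which is the sense in which a factorization of this kind can act as a time-series predictor. Both sub-problems are solved exactly, so the recorded objective should not increase from one iteration to the next.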

We validate the algorithm on both synthetic data and real data from the Cancer Registry of Norway, and demonstrate the potential for more efficient and personalized cancer screening by showing that the proposed approach can predict the risk of cervical cancer development up to 36 months ahead.
