Cancer is everywhere, and yet it seems that the more we know about it, the more we have to learn. With almost one in every two people expected to receive a cancer diagnosis within their lifetime, we are in a race against time to discover as much as possible about the many forms that cancers take. But it’s not so simple: there are hundreds of types of cancer, and while data is plentiful, every cancer is complex in its own way.
A new artificial intelligence tool developed by researchers at the National Center for Computational Sciences at the Department of Energy's Oak Ridge National Laboratory (ORNL) aims to make processing all that complex cancer data easier. The AI-based natural language processing tool is part of a DOE-National Cancer Institute collaboration known as the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C). The project aims to improve how information is extracted from textual cancer pathology reports.
"Population-level cancer surveillance is critical for monitoring the effectiveness of public health initiatives aimed at preventing, detecting, and treating cancer," said Gina Tourassi, director of the Health Data Sciences Institute and Oak Ridge National Laboratory. "Collaborating with the National Cancer Institute, my team is developing advanced artificial intelligence solutions to modernize the national cancer surveillance program by automating the time-consuming data capture effort and providing near real-time cancer reporting," adds Tourassi.
We depend on cancer registries to provide vital statistics to health professionals, researchers, and policymakers. Yet "manually extracting information is costly, time-consuming, and error-prone," explains the paper's lead author, Mohammed Alawad, a research scientist in ORNL's Computing and Computational Sciences Directorate. That’s what makes information mining so critical. "So we are developing an AI-based tool," Alawad says.
The tool's neural network can extract five characteristics from a pathology report simultaneously: primary site (the body organ), laterality (right or left organ), behavior, histological type (cell type), and histological grade (how abnormal the cells look, an indicator of how quickly the cancer is likely to grow or spread). As more data is added, the tool continues to learn and becomes more accurate.
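To make the idea of one network producing five answers at once more concrete, here is a minimal Python sketch of a multi-task classifier: a single shared layer reads the report text, and five small output "heads" each predict one characteristic from it. This is not the ORNL team's model, whose architecture is not described here. The label sets, layer sizes, hashing-based featurizer, and the `MultiTaskClassifier` class are all hypothetical, and the weights are random and untrained, so the predictions carry no meaning. Only the structure is the point.

```python
import math
import random

# Hypothetical label sets, for illustration only; real registries use
# far larger coding schemes.
TASKS = {
    "primary_site": ["breast", "lung", "colon"],
    "laterality": ["left", "right", "not_applicable"],
    "behavior": ["benign", "in_situ", "malignant"],
    "histological_type": ["adenocarcinoma", "squamous_cell_carcinoma"],
    "histological_grade": ["grade_1", "grade_2", "grade_3"],
}

VOCAB_SIZE = 64   # length of the hashed bag-of-words vector (assumption)
HIDDEN_SIZE = 8   # length of the shared representation (assumption)


def featurize(text):
    """Turn a report into a fixed-length bag-of-words count vector."""
    vec = [0.0] * VOCAB_SIZE
    for token in text.lower().split():
        # Sum of character codes: a deterministic stand-in for a real tokenizer.
        vec[sum(ord(c) for c in token) % VOCAB_SIZE] += 1.0
    return vec


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskClassifier:
    """One shared layer feeding one softmax head per task (untrained weights)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.shared = rand(HIDDEN_SIZE, VOCAB_SIZE)
        self.heads = {task: rand(len(labels), HIDDEN_SIZE)
                      for task, labels in TASKS.items()}

    def predict(self, text):
        # The shared layer is computed once and reused by every head.
        hidden = [math.tanh(h) for h in matvec(self.shared, featurize(text))]
        result = {}
        for task, labels in TASKS.items():
            probs = softmax(matvec(self.heads[task], hidden))
            best = max(range(len(labels)), key=probs.__getitem__)
            result[task] = (labels[best], probs[best])
        return result


model = MultiTaskClassifier()
prediction = model.predict("Invasive ductal carcinoma, left breast, grade 2")
```

Calling `model.predict(...)` returns one `(label, probability)` pair per characteristic from a single pass over the text. In a trained system of this shape, the shared layer is what would let evidence learned for one characteristic help with the others.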
"The next step is to launch a large-scale user study where the technology will be deployed across cancer registries to identify the most effective ways of integration in the registries' workflows. The goal is not to replace the human but rather augment the human," Tourassi said.
The study was published recently in the Journal of the American Medical Informatics Association.
Sources: Science Daily, Journal of the American Medical Informatics Association