An Intelligent Framework for an Effective Clinical Recommendation System to Predict Diseases from Multimodal Medical Data
Date
2023
Authors
Shashank
Publisher
National Institute Of Technology Karnataka Surathkal
Abstract
Over the past few decades, the enormous expansion of medical data has paved the
way for data analysis in smart healthcare systems. Data analytics in healthcare
typically involves the use of statistical and machine learning algorithms to process
and analyze clinical data in order to identify correlations and insights that can
help enhance health outcomes - in terms of automated disease prediction with
minimized human errors, a reduced readmission rate, improved clinical care at a
lower cost, and optimized hospital operations. In this direction, over the years,
there has been significant research on Health Information Systems (HIS),
particularly Clinical Recommendation Systems (CRS). A CRS offers computer-
generated suggestions and advice to healthcare professionals when making clinical
decisions. These systems evaluate patient information and propose suitable treat-
ment alternatives, considering clinical guidelines, evidence-based medicine, and
other pertinent factors. Lately, a tremendous amount of clinical data has been
acquired from various sources, including Electronic Health Records (EHRs), med-
ical imaging, laboratory tests, wearable devices, health apps, telemedicine, and
genomic data, which led to the concept of multimodality. Recent progress in deep
learning and machine learning algorithms has facilitated the use of artificial in-
telligence techniques on multimodal medical data, helping to improve diagnostic
predictions. Despite the considerable advantages offered by CRSs, their full
potential can only be realized by effectively tackling several existing challenges.
There is considerable scope for improving the ability of predictive models,
particularly with respect to multimodal medical data.
The primary objective of the research work presented in this thesis is to develop
an effective clinical recommendation system that can accurately predict abnormal-
ities from diverse types of clinical data for personalized, data-driven recommenda-
tions to healthcare providers. This study explores multiple approaches for disease
prediction using both unimodal and multimodal data sources, including diagnostic
clinical notes and radiology images. The research also presents the cross-modal
task of generating diagnostic reports from radiology images and analyzes the
effectiveness of different imaging sequences in predicting diseases. Radiology reports
contain rich information about patients’ health conditions; however, their unstructured
format makes it challenging to retrieve this valuable information. For
the unimodal text task, we propose an effective Unimodal Medical Text Embedding
Subnetwork (UM-TES) that incorporates a knowledge base trained on a large corpus
to extract textual features and predict pulmonary abnormalities from
unstructured free-text radiology reports. The benchmarking analysis revealed
that UM-TES outperformed standard NLP and ML techniques in predicting pul-
monary diseases from unstructured diagnostic reports. Diagnostic imaging plays
a critical role in modern medicine, serving as an essential tool to aid in the prog-
nosis and therapy of various health ailments, supporting essential applications of
recommendation systems. The texture and shape of the tissues in the diagnostic
images are essential aspects of diagnosis. Pulmonary abnormalities are irregular in
shape and vary in size; hence, several studies have sought to add new components to
existing deep learning techniques for acquiring multi-scale imaging features from
diagnostic chest X-rays. Towards this unimodal task of leveraging diagnostic im-
ages for disease prediction, the explainable and lightweight Unimodal Medical
Visual Encoding Subnetwork (UM-VES) is proposed to predict pulmonary abnormalities
from diagnostic chest X-ray images. The proposed model is tested
on the publicly available Open-I dataset and on data collected from a private hospital.
A comprehensive assessment showed that the proposed approach achieved
a 7% to 18% increase in accuracy compared to
existing methods.
Many contemporary DL strategies for radiology focus on a single modality of
data utilizing imaging features without considering the clinical context that pro-
vides more valuable complementary information for clinically consistent prognostic
decisions. Towards this objective, two novel multimodal medical fusion techniques,
Compact Bilinear Pooling and Deep Hadamard Product, are proposed to
integrate textual and visual medical features from clinical text reports and chest
X-rays to predict abnormalities from multimodal data. A comprehensive analysis
was conducted to compare the performance of unimodal and multimodal
models. The proposed models were applied to standard augmented data and to
synthetically generated data to check their ability to predict from new and
unseen data. The proposed multimodal models gave superior results compared
to the unimodal models. A further contribution of this work is in the
area of cross-modal medical description generation. To create accurate
and reliable radiology reports, radiologists need to be experienced and dedicate
sufficient time to reviewing medical images. However, many radiology reports end
with ambiguous conclusions, leading patients to undergo additional tests, such as
pathology or advanced imaging. To address this, we propose an encoder-decoder-
based deep learning framework to produce diagnostic radiology reports based on
chest X-ray images. Additionally, we have developed a dynamic web portal that
accepts chest X-rays as input and generates a radiology report as output. We
conducted a thorough analysis and compared the performance of our model with
other state-of-the-art deep learning approaches. Our results show that our pro-
posed model outperforms existing models in terms of BLEU score on the Indiana
University Dataset.
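
The BLEU comparison mentioned above can be illustrated with a minimal sketch. This is a simplified, unsmoothed sentence-level BLEU against a single reference, not the evaluation code used in the thesis; the function names `ngrams` and `sentence_bleu`, the whitespace tokenization, and the example sentences are assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical generated report vs. a hypothetical reference report.
    generated = "no acute cardiopulmonary abnormality"
    reference = "no acute cardiopulmonary abnormality is seen"
    print(f"BLEU-4: {sentence_bleu(generated, reference):.4f}")
```

In practice, corpus-level BLEU with smoothing and multiple references is usually preferred for report generation; the sketch only shows how clipped n-gram precision and the brevity penalty combine.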
In the medical domain, the radiologist examines multiple imaging modalities
to determine the disease outcome. Acute infarct is one such illness where radi-
ologists utilize multiple MRI sequences like DWI, T2-Flair, ADC, and SWI to
examine the prognosis. Currently, expert clinicians rely on manual interpretation
of imaging methods for diagnosing diseases. However, with the rising number
of chronic cases, this approach has become a burden on healthcare profession-
als, increasing their cognitive and diagnostic workload. Towards this multi-image
fusion task, we introduce a DL framework that includes contour-based brain segmentation
techniques and two stacked multi-channel convolutional neural networks,
SMC-CNN-M and SMC-CNN-I, to predict the disease from both multiple and in-
dividual MRI sequences. We evaluate our proposed models on a medical dataset
collected from a private hospital and compare their classification performance to
that of state-of-the-art deep learning networks. Additionally, we conduct a quan-
titative, qualitative, and ablation study on different MRI sequences to assess their
effectiveness and generate synthetic data using DCGAN to compare model per-
formance.
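
One plausible reading of "stacked multi-channel" input is that co-registered slices from the four MRI sequences are normalized and stacked as channels of a single array. The sketch below illustrates that idea only; it is an assumption, not the thesis's preprocessing pipeline. The helper names `min_max_normalize` and `stack_sequences`, the per-sequence min-max scaling, and the toy 2x2 slices are all hypothetical.

```python
def min_max_normalize(image):
    """Scale a 2-D image (list of rows) to the range [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / span for p in row] for row in image]


def stack_sequences(sequences, order=("DWI", "T2-Flair", "ADC", "SWI")):
    """Stack same-sized slices into one H x W x C nested list (assumed layout)."""
    first = sequences[order[0]]
    height, width = len(first), len(first[0])
    # All sequences must share the same slice shape before stacking.
    for name in order:
        img = sequences[name]
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError(f"{name} does not match the {height}x{width} slice shape")
    normalized = [min_max_normalize(sequences[name]) for name in order]
    # Channel index follows the position of each sequence in `order`.
    return [
        [[img[r][c] for img in normalized] for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # Toy 2x2 "slices" standing in for real, co-registered MRI data.
    slices = {
        "DWI": [[0, 10], [20, 40]],
        "T2-Flair": [[1, 1], [1, 1]],
        "ADC": [[0, 1], [2, 4]],
        "SWI": [[5, 5], [5, 15]],
    }
    stacked = stack_sequences(slices)
    print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

Passing a shorter `order` tuple, for example `("DWI",)`, gives a single-channel input, which loosely mirrors the distinction between using multiple and individual MRI sequences.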
Keywords
Unstructured Data Analysis, Multimodal Representation, Cross-modal Retrieval, Medical Image Fusion