Heterogeneous data format integration and conversion (HDFIC) using machine learning and IBM-DFDL for IoT

dc.contributor.author: Sandeep, S.
dc.contributor.author: Chandavarkar, B.R.
dc.contributor.author: Khatri, S.
dc.date.accessioned: 2026-02-04T12:25:02Z
dc.date.issued: 2024
dc.description.abstract: The future of the Internet of Things (IoT) demands the integration of synergetic applications to cater to societal needs. Examples of IoT-based confederated applications include Ambient Assisted Living with Active Healthy Ageing, CasAware with Smart Energy, Smart Gas Distribution Networks with GIS systems, and more. However, the data heterogeneity hinders integration, as these systems follow different standards, data formats, semantic models, and representations. Further, this leads to data interoperability issues in IoT. The major concern of academia and industry in the smooth integration of heterogeneous applications is interpreting different data formats and representing them in a common schema for further analysis. Existing solutions, such as message payload translation, middleware/cloud format, and Inter-IoT, are complex, time-consuming, and ineffective. Hence, this paper proposes the heterogeneous data format integration and conversion (HDFIC), a machine learning-based system to identify data formats using a Random Forest classifier and integrate them using the Data Format Description Language (DFDL). The content-based data format identification in the proposed HDFIC is trained with the standard features defined in RFC 7111, 8259, and 8996. Subsequently, the data is integrated into a single XML Schema Definition and converted into the required data format using the IBM App Connect Enterprise tool and DFDL. Finally, the performance of HDFIC is evaluated with the synergetic patient body vitals and room ambiance dataset for accuracy, data integration time, and conversion efficiency. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
dc.identifier.citation: Evolving Systems, 2024, 15, 2, pp. 375-396
dc.identifier.issn: 18686478
dc.identifier.uri: https://doi.org/10.1007/s12530-024-09568-7
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/21200
dc.publisher: Springer Nature
dc.subject: Data integration
dc.subject: Interoperability
dc.subject: Machine learning
dc.subject: Middleware
dc.subject: Semantics
dc.subject: Ambient assisted living
dc.subject: Data format
dc.subject: Description languages
dc.subject: Gas distribution network
dc.subject: GIS systems
dc.subject: Heterogeneity
dc.subject: Heterogeneous data
dc.subject: Machine-learning
dc.subject: Smart energies
dc.subject: Synergetics
dc.subject: Internet of things
dc.title: Heterogeneous data format integration and conversion (HDFIC) using machine learning and IBM-DFDL for IoT