JSON Document Clustering Based on Structural Similarity and Semantic Fusion

dc.contributor.author: Uma Priya, D.
dc.contributor.author: Santhi Thilagam, P.S.
dc.date.accessioned: 2026-02-08T16:50:04Z
dc.date.issued: 2023
dc.description.abstract: The emerging drift toward real-time applications generates massive amounts of JSON data exponentially over the web. Dealing with the heterogeneous structures of JSON document collections is challenging for efficient data management and knowledge discovery. Clustering JSON documents has become a significant issue in organizing large data collections. Existing research has focused on clustering JSON documents using structural or semantic similarity measures. However, differently annotated JSON structures are also related by the context of the JSON attributes. As a result, existing research work is unable to identify the context hidden in the schemas, emphasizing the importance of leveraging the syntactic, semantic, and contextual properties of heterogeneous JSON schemas. To address the specific research gap, this work proposes JSON Similarity (JSim), a novel approach for clustering JSON documents by combining the structural and semantic similarity scores of JSON schemas. In order to capture more semantics, the semantic fusion method is proposed, which correlates schemas using semantic as well as contextual similarity measures. The JSON documents are clustered based on the weighted similarity matrix. The results and findings show that the proposed approach outperforms the current approaches significantly. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
dc.identifier.citation: Lecture Notes on Data Engineering and Communications Technologies, 2023, Vol. 163, p. 51-62
dc.identifier.issn: 23674512
dc.identifier.uri: https://doi.org/10.56578/jemse040403
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/33633
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: Clustering
dc.subject: JSON
dc.subject: Semantic similarity
dc.subject: Structural similarity
dc.title: JSON Document Clustering Based on Structural Similarity and Semantic Fusion
