JSON Document Clustering Based on Structural Similarity and Semantic Fusion

Uma Priya, D.; Santhi Thilagam, P.S.

Date

2023

Authors

Uma Priya, D.

Santhi Thilagam, P.S.

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

The growing shift toward real-time applications generates massive, exponentially growing volumes of JSON data on the web. The heterogeneous structures of JSON document collections make efficient data management and knowledge discovery challenging, so clustering JSON documents has become an important step in organizing large data collections. Existing research has focused on clustering JSON documents using structural or semantic similarity measures. However, differently annotated JSON structures are also related by the context of their attributes, and existing work does not identify the context hidden in the schemas. This underlines the importance of leveraging the syntactic, semantic, and contextual properties of heterogeneous JSON schemas. To address this research gap, this work proposes JSON Similarity (JSim), a novel approach for clustering JSON documents by combining the structural and semantic similarity scores of JSON schemas. To capture more semantics, a semantic fusion method is proposed, which correlates schemas using semantic as well as contextual similarity measures. The JSON documents are clustered based on the weighted similarity matrix. The results show that the proposed approach significantly outperforms current approaches. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
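
The idea of combining structural and semantic scores into one weighted similarity matrix can be illustrated with a minimal Python sketch. This is not the authors' JSim implementation, which the abstract does not detail. It assumes a Jaccard score over flattened attribute paths as the structural measure, a simple overlap of attribute-name tokens as a stand-in for the semantic measure, and single-link threshold grouping as the clustering step. The function names, the weight `alpha`, and the `threshold` parameter are all hypothetical.

```python
import re
from itertools import combinations


def attribute_paths(doc, prefix=""):
    """Flatten a JSON-like value into a set of dotted attribute paths."""
    paths = set()
    if isinstance(doc, dict):
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= attribute_paths(value, path)
    elif isinstance(doc, list):
        # Array items share the path of the attribute that holds them.
        for item in doc:
            paths |= attribute_paths(item, prefix)
    return paths


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def name_tokens(paths):
    """Split attribute names into lowercase word tokens.

    Assumption: token overlap is only a crude stand-in for a real
    semantic or contextual similarity measure.
    """
    tokens = set()
    for path in paths:
        for part in path.split("."):
            # Handles snake_case and camelCase attribute names.
            found = re.findall(r"[A-Za-z][a-z]*|\d+", part)
            tokens.update(t.lower() for t in found)
    return tokens


def weighted_similarity_matrix(docs, alpha=0.5):
    """Combine structural and token-based scores with weight alpha."""
    paths = [attribute_paths(d) for d in docs]
    tokens = [name_tokens(p) for p in paths]
    n = len(docs)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        structural = jaccard(paths[i], paths[j])
        semantic = jaccard(tokens[i], tokens[j])
        sim[i][j] = sim[j][i] = alpha * structural + (1 - alpha) * semantic
    return sim


def threshold_clusters(sim, threshold=0.5):
    """Group documents whose pairwise similarity reaches the threshold.

    Assumption: single-link grouping via union-find, chosen for brevity.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if sim[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

In this sketch, two documents that name the same attribute differently, such as `firstName` and `first_name`, score low on exact path overlap but high on token overlap, so the weighted score can still place them in the same cluster.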

Keywords

Clustering, JSON, Semantic similarity, Structural similarity

Citation

Lecture Notes on Data Engineering and Communications Technologies, 2023, Vol. 163, p. 51-62

URI

https://doi.org/10.56578/jemse040403
https://idr.nitk.ac.in/handle/123456789/33633

Collections

Book Chapters
