Leveraging Structural and Semantic Measures for JSON Document Clustering

Uma Priya, D.; Santhi Thilagam, P.S.

dc.contributor.author	Uma Priya, D.
dc.contributor.author	Santhi Thilagam, P.S.
dc.date.accessioned	2026-02-04T12:27:07Z
dc.date.issued	2023
dc.description.abstract	In recent years, the growing use of smart devices and digital business opportunities has generated massive volumes of heterogeneous JSON data daily, making efficient data storage and management more difficult. Existing research supports these tasks by clustering documents using various similarity metrics. However, existing approaches have focused on either the structural or the semantic similarity of schemas. Because JSON documents are application-specific, differently annotated JSON schemas are not only structurally heterogeneous but also differ in the context of their JSON attributes. Therefore, the structural, semantic, and contextual properties of JSON schemas all need to be considered to cluster JSON documents meaningfully. This work proposes an approach that clusters heterogeneous JSON documents using a similarity fusion method, in which a similarity fusion matrix is constructed from structural, semantic, and contextual measures of JSON schemas. The experimental results demonstrate that the proposed approach significantly outperforms existing approaches. © 2023, IICM. All rights reserved.
dc.identifier.citation	Journal of Universal Computer Science, 2023, 29, 3, pp. 222-241
dc.identifier.issn	0948695X
dc.identifier.uri	https://doi.org/10.3897/jucs.86563
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/22148
dc.publisher	IICM
dc.subject	Clustering
dc.subject	Data Mining
dc.subject	JSON
dc.subject	Similarity Measures
dc.title	Leveraging Structural and Semantic Measures for JSON Document Clustering
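
The abstract describes building a similarity fusion matrix from structural, semantic, and contextual measures of JSON schemas. The record does not say how the individual measures are computed or combined, so the following is only a minimal sketch under stated assumptions: structural similarity is approximated here by Jaccard overlap of attribute paths, the semantic and contextual matrices are hypothetical placeholder values, and fusion is a plain weighted average. The function names `attribute_paths`, `structural_similarity`, and `fuse` are illustrative and do not come from the paper.

```python
from typing import Any, List, Sequence, Set

Matrix = List[List[float]]


def attribute_paths(doc: Any, prefix: str = "") -> Set[str]:
    """Collect dotted attribute paths (e.g. "user.name") from parsed JSON."""
    paths: Set[str] = set()
    if isinstance(doc, dict):
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= attribute_paths(value, path)
    elif isinstance(doc, list):
        # Array elements share the path of the attribute that holds them.
        for item in doc:
            paths |= attribute_paths(item, prefix)
    return paths


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structural_similarity(docs: Sequence[Any]) -> Matrix:
    """Assumed structural measure: pairwise Jaccard overlap of paths."""
    path_sets = [attribute_paths(d) for d in docs]
    return [[jaccard(p, q) for q in path_sets] for p in path_sets]


def fuse(matrices: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Assumed fusion rule: element-wise weighted average of the matrices."""
    if len(matrices) != len(weights):
        raise ValueError("one weight is required per matrix")
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
         for j in range(n)]
        for i in range(n)
    ]


# Two small documents that share "a" and "b" but differ below "b".
docs = [{"a": 1, "b": {"c": 2}}, {"a": 1, "b": {"d": 2}}]
structural = structural_similarity(docs)

# Hypothetical placeholder matrices standing in for real measures.
semantic = [[1.0, 0.8], [0.8, 1.0]]
contextual = [[1.0, 0.2], [0.2, 1.0]]

fused = fuse([structural, semantic, contextual], weights=[1.0, 1.0, 1.0])
```

The fused matrix could then be passed to any clustering algorithm that accepts a precomputed similarity matrix. Equal weights are used only for illustration; the weighting is an open design choice in this sketch.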
