DWDS - A Scalable, Context-Aware Framework for Distributed Web Service Discovery Using Semantics
Date
2016
Authors
Kamath S, Sowmya
Publisher
National Institute of Technology Karnataka, Surathkal
Abstract
Service-oriented Computing is a popular paradigm that lays the foundation for a robust
distributed computing infrastructure for both intra- and cross-enterprise application
integration and collaboration. It spans a diverse range of possible applications -
standalone services, software mashups combining multiple Web services, micro-services
used in the implementation of entire IT system landscapes etc. For any such
service-oriented application design, the discovery of relevant services that can provide
the desired capability is one of the basic tasks. In particular, Distributed Web service
discovery is deemed to be one of the grand challenges in Web service research, due to
the current distributed nature of services on the Web. This applies in particular to
scenarios where a large number of service offers are available in a distributed
environment like the WWW, which makes selecting a suitable service difficult,
especially given the limitations of keyword-based matching. Hence, it is very desirable
to improve this process to support efficient Web service discovery, using intelligent,
semantics based techniques.
This thesis addresses the important aspects of Web service discovery, focusing on
services available in a large scale, distributed environment, that is, the Web. The main
research contributions are towards the improvement of service discovery based on
implicit semantic information, inference of service domain knowledge, context-aware
Web service discovery and composition-oriented Web service discovery. An efficient,
scalable framework called Distributed Web service Discovery with Semantics (DWDS)
is presented to address the issue of distributed Web service discovery using semantics.
The DWDS framework also provides autonomic features like automated repository
management, redundancy control and verification, in order to maintain the validity of
the service repository. Based on the statistics collected over the course of 3 years, it was
found that the techniques proposed for developing the DWDS framework and adding
autonomic features were effective and supported scalable service discovery.
To support similarity-based service discovery and effective categorisation of services
in the repository, automatic metadata generation and similarity computation
techniques that use the inherent functional semantics of the services were incorporated.
As traditional unsupervised categorisation mechanisms like K-means clustering and
supervised techniques like classification cannot deal with the dynamic nature of the
framework, a bio-inspired incremental clustering algorithm, BI2C, based on the flocking
behaviour of birds was proposed. BI2C incrementally clusters the service collection
after each change introduced by periodic crawler runs, thus supporting the scalability
of the service repository. BI2C achieved an average speedup of more than 57% over
traditional clustering algorithms and was also able to deal with the large volume of
service descriptions available in the DWDS repository.
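The idea of clustering incrementally, rather than re-clustering the whole collection after every crawler run, can be illustrated with a minimal sketch. This is not the BI2C algorithm (its flocking rules are not given in this abstract); it is a simple threshold-based stand-in. The class name `IncrementalClusterer`, the Jaccard similarity over term sets, the threshold value and the sample services are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class IncrementalClusterer:
    """Hypothetical stand-in for an incremental clusterer (not BI2C)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cut-off
        self.clusters = []          # each: {"terms": set, "members": list}

    def add(self, name, terms):
        """Place one new service without touching existing assignments."""
        terms = set(terms)
        best, best_sim = None, 0.0
        for idx, cluster in enumerate(self.clusters):
            sim = jaccard(terms, cluster["terms"])
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= self.threshold:
            # Join the most similar cluster and extend its term profile.
            self.clusters[best]["members"].append(name)
            self.clusters[best]["terms"] |= terms
            return best
        # No cluster is similar enough: start a new one.
        self.clusters.append({"terms": set(terms), "members": [name]})
        return len(self.clusters) - 1


clusterer = IncrementalClusterer(threshold=0.3)
first = clusterer.add("WeatherForecast", {"weather", "forecast", "city"})
second = clusterer.add("CityWeather", {"weather", "city", "temperature"})
third = clusterer.add("CurrencyConvert", {"currency", "amount", "rate"})
```

In this sketch each newly crawled service is compared only against the existing cluster profiles, so the cost of an update depends on the number of clusters rather than on the size of the whole repository.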
To enable context-aware, natural language based querying during Web service
discovery in DWDS, semantics based query analysis techniques were developed. Also,
any complex queries were automatically processed to determine their constituent
sub-queries, so that composite service discovery could be supported. It was found that
semantic analysis of the user query to capture context was effective, as it resulted in a
16% improvement in precision and an increase of over 37% in recall over a
keyword-matching approach.
To extend composite service discovery, a concept of capturing the service
input/output dependencies formally through a Service Interface Graph (SIG) was
proposed. In the SIG, services are represented as nodes, and their dependencies are
captured as edges. Any user queries that have sub-queries are answered by traversing
the SIG, so that the correct invocation sequence required to satisfy the user query
requirements can be identified. Experimental evaluation showed that the proposed
method achieved an accuracy of 70.68% and effectively identified correct composition
templates in O(N^2) time.
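A Service Interface Graph of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the functions `build_sig` and `find_sequence`, the representation of a service as a pair of input and output parameter sets, exact-name matching of parameters, and the sample services are all hypothetical.

```python
from collections import deque


def build_sig(services):
    """Build the graph: an edge A -> B means some output of A feeds an input of B.

    `services` maps a service name to (inputs, outputs), both sets of
    parameter names. The pairwise comparison below is quadratic in the
    number of services.
    """
    graph = {name: set() for name in services}
    for a, (_, a_out) in services.items():
        for b, (b_in, _) in services.items():
            if a != b and a_out & b_in:
                graph[a].add(b)
    return graph


def find_sequence(services, graph, provided, goal):
    """Breadth-first traversal for an invocation sequence that yields `goal`.

    Starts from services invocable with the `provided` inputs alone and
    follows edges only to services whose inputs are all known so far.
    Returns a list of service names, or None if no sequence is found.
    """
    provided = frozenset(provided)
    starts = sorted(n for n, (ins, _) in services.items() if ins <= provided)
    queue = deque((n, [n], provided | services[n][1]) for n in starts)
    visited = set(starts)
    while queue:
        name, path, known = queue.popleft()
        if goal in services[name][1]:
            return path
        for nxt in sorted(graph[name]):
            if nxt not in visited and services[nxt][0] <= known:
                visited.add(nxt)
                queue.append((nxt, path + [nxt], known | services[nxt][1]))
    return None


# Hypothetical services: name -> (inputs, outputs)
services = {
    "GeoCode": ({"address"}, {"coordinates"}),
    "WeatherByCoords": ({"coordinates"}, {"forecast"}),
    "CurrencyConvert": ({"amount", "currency"}, {"converted_amount"}),
}
graph = build_sig(services)
sequence = find_sequence(services, graph, {"address"}, "forecast")
print(sequence)  # ['GeoCode', 'WeatherByCoords']
```

Here a query that supplies an address and asks for a forecast is answered by chaining two services. The sketch marks services as visited globally, so it returns one valid sequence rather than enumerating every possible composition.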
To summarize our contributions, our work focuses on developing a semantics based
distributed Web service discovery framework that can automatically retrieve service
descriptions available in heterogeneous sources on the Web, to build a scalable service
repository. Automated metadata generation and dynamic categorisation techniques
enable the framework to support efficient basic and composite Web service discovery in
a context-aware manner.
Keywords
Department of Information Technology