Correlation Analysis and Tensor Data Modeling In Multimodal Environmental Wireless Sensor Networks
Date
2022
Authors
G, Rajesh
Publisher
National Institute of Technology Karnataka, Surathkal
Abstract
The major challenges during data acquisition in an environmental wireless sensor
network (EWSN) architecture are the presence of outliers and missing data. Outliers
are ubiquitous in the data acquired by an EWSN owing to sensor failures, aging
effects, dwindling power, external noise, etc. Missing data at the sink node arise
chiefly from communication failures, sensor node malfunction, inadequate sampling
frequency, and the switching of sensor nodes into sleep mode. Since the data acquired
by the sensor nodes in a multimodal EWSN are spatially, temporally and attribute-wise
correlated, these correlations play a pivotal role in missing data recovery and data
prediction mechanisms.
The thesis proposes an analytical framework to characterize the (multi-attribute)
correlation between different pairs of modalities in a real-world EWSN. Monte Carlo
simulation is performed to approximately model the characteristics of the sensed
environmental data. Three classical estimates and four robust estimates of the
correlation coefficient are used to establish the correlation between two distinct
pairs of typically correlated sensed modalities in the obtained data. Stationarity
analysis of the acquired environmental variables sheds light on the best estimates of
the correlation coefficient, which can be used to predict missing or outlier-corrupted
data in a region of known slope/stationarity in the data characteristics. A novel
outlier modeling scheme using Chebyshev’s inequality is developed for the addition of
gross sparse outliers to the correlated data.
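To illustrate why robust estimates matter here, the following minimal sketch compares
one classical estimate (Pearson) with one rank-based estimate (Spearman) on a pair of
correlated series before and after a single gross outlier is added. The abstract does
not name the seven estimates used in the thesis, so the choice of these two, the
helper names, and the synthetic series `modality_a` and `modality_b` are assumptions
made for illustration only.

```python
import math


def pearson(x, y):
    """Classical Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical, perfectly correlated pair of sensed modalities.
modality_a = [float(t) for t in range(1, 21)]
modality_b = [2.0 * a + 1.0 for a in modality_a]

# One gross outlier, e.g. a faulty reading at the last sample.
corrupted_b = list(modality_b)
corrupted_b[-1] = -500.0

print("clean     pearson:", pearson(modality_a, modality_b))
print("corrupted pearson:", pearson(modality_a, corrupted_b))
print("corrupted spearman:", spearman(modality_a, corrupted_b))
```

In this toy case a single corrupted sample turns the Pearson estimate negative, while
the Spearman estimate stays at about 0.71, which is the kind of behavior that
motivates comparing classical and robust estimates.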
The multi-dimensional nature of the data acquired in EWSNs (spatial, temporal and
attribute dimensions) makes tensors a natural choice of data representation. The
inherent correlations in the acquired data cause redundancy and hence low-rankness of
the acquired data tensor. Robust tensor principal component analysis (RTPCA)
decomposes a noisy data tensor into a low-rank tensor and a sparse tensor, which can
be exploited in the data recovery process of multi-attribute EWSNs, where the low-rank
component represents the intrinsic data tensor and the sparse component represents the
gross outlier tensor. A novel probabilistic outlier modeling scheme based on the
multivariate Chebyshev’s inequality is introduced, which maps the sample population
and the associated magnitudes of outliers to the spatio-temporal correlations
inherently present in the acquired heterogeneous sensory data. The intrinsic data
recovery in EWSNs is investigated in the presence of a varying population of sparse
outliers and missing sample values.
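As a rough illustration of how Chebyshev’s inequality can tie the population of
outliers to their magnitudes, the sketch below uses the univariate bound
P(|X - mu| >= k*sigma) <= 1/k^2. For a chosen outlier fraction p, it takes
k = 1/sqrt(p) and places a fraction p of the samples at least k standard deviations
from the mean. This is not the multivariate scheme of the thesis, whose details the
abstract does not give; the function names, the random sign and offset, and the
synthetic series `clean` are all assumptions.

```python
import math
import random
import statistics


def chebyshev_k(outlier_fraction):
    """Smallest k for which Chebyshev's bound 1/k**2 equals the given fraction."""
    return 1.0 / math.sqrt(outlier_fraction)


def inject_outliers(samples, outlier_fraction, seed=0):
    """Return (corrupted copy, outlier positions, k).

    A sparse random subset of samples is replaced by values lying at least
    k standard deviations from the mean of the clean samples.
    """
    rng = random.Random(seed)
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    k = chebyshev_k(outlier_fraction)
    n_outliers = round(outlier_fraction * len(samples))
    positions = rng.sample(range(len(samples)), n_outliers)
    corrupted = list(samples)
    for i in positions:
        sign = rng.choice((-1.0, 1.0))
        # Magnitude is k*sigma plus a random extra of up to one sigma (assumption).
        corrupted[i] = mu + sign * (k + rng.random()) * sigma
    return corrupted, sorted(positions), k


# Hypothetical slowly varying sensor series (not data from the thesis).
clean = [20.0 + 0.5 * math.sin(i / 5.0) for i in range(200)]
corrupted, positions, k = inject_outliers(clean, outlier_fraction=0.05, seed=1)
print("k =", k, "outlier positions:", positions)
```

A smaller outlier fraction gives a larger k, so rarer outliers are also placed further
from the mean; the corrupted series can then serve as input to a low-rank plus sparse
recovery method.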
A robust incremental tensor decomposition (ITD) framework is also proposed in this
thesis. It processes the tensor data sequentially and performs the low-rank and sparse
decomposition faster than batch processing methods while achieving comparable recovery
accuracy. The ITD mechanism is of particular interest in scenarios where data
processing demands real-time execution.
Keywords
Environmental wireless sensor networks, correlation coefficient estimates, robust tensor principal component analysis, Chebyshev’s inequality