Effect Of Data Preprocessing On The Prediction Accuracy Of Artificial Neural Network Model in Hydrologic Time Series
Date
2013
Authors
Banhatti, Aniruddha Gopal
Publisher
National Institute of Technology Karnataka, Surathkal
Abstract
Accurate prediction of hydrological behavior in both urban and rural watersheds can
provide valuable information for urban planning, land use, the design of civil projects, and
water resources management. A hydrologic system is influenced by many factors, such as
weather, land cover, infiltration, and evapotranspiration, so it includes a substantial stochastic
dependent component and exhibits multiple time scales and highly non-linear characteristics. Hydrologic time
series are therefore often non-linear and non-stationary. Despite the high flexibility of Artificial
Neural Networks (ANN) in modeling hydrologic time series, some signals are highly
non-stationary and exhibit seasonal irregularity. In such situations, an ANN may not be able to
cope with non-stationary data unless the input and/or output data are pre-processed.
Data pre-processing refers to analyzing and transforming the input and output variables in order to
detect trends, minimize noise, highlight important relationships, and flatten the distribution of the
variables in a time series. These analyses and transformations help the model learn relevant
patterns. Pre-processing techniques that stabilize the mean and variance and remove
seasonality are often applied to remove the non-stationary aspects of data used to build
soft computing models.
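As an illustration of the kind of transformation described above, the following is a minimal sketch, not the procedure used in the thesis. It assumes a logarithmic transform to stabilize the variance and lagged differencing to stabilize the mean; the function names and the discharge values are hypothetical.

```python
import math

def log_transform(series):
    """Natural-log transform; compresses large peaks and helps stabilize variance."""
    return [math.log(x) for x in series]

def inverse_log(series):
    """Map log-space values back to the original units."""
    return [math.exp(x) for x in series]

def difference(series, lag=1):
    """Lagged differencing; lag=1 targets a trend in the mean,
    while a lag of one year (365 for daily data) would target an annual cycle."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Made-up daily discharges, for illustration only.
flows = [1200.0, 1500.0, 2100.0, 3400.0, 2900.0]
log_flows = log_transform(flows)
log_diff = difference(log_flows)
```

Forecasts produced in the transformed space have to be mapped back (here with `inverse_log`, after undoing any differencing) before they are compared with observed flows.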
In this study, different data pre-processing techniques are presented to deal with the irregular
components present in daily hydrologic time series from the Brahmaputra basin within India,
recorded at the Pandu gauging station near Guwahati city and at the Pancharatna gauging station,
150 km further downstream of Pandu. Their properties are evaluated by
performing one-step-ahead flow forecasting using ANN. Three different pre-processed datasets
are used for the analysis. Various ANN models are generated by varying the internal network
architecture under different input scenarios.
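One-step-ahead forecasting with different input scenarios is commonly set up by pairing a window of past flows with the next day's flow. The sketch below shows that construction under the assumption that an input scenario corresponds to a number of lagged daily values; the abstract does not specify the scenarios, and `make_lagged_pairs` is a hypothetical helper.

```python
def make_lagged_pairs(series, n_lags):
    """Build (input, target) pairs for one-step-ahead forecasting.

    Each input is the n_lags values preceding day t; the target is the value on day t.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets

# Toy series, for illustration only.
X, y = make_lagged_pairs([1, 2, 3, 4, 5], n_lags=2)
```

Changing `n_lags` gives one simple way to generate alternative input scenarios for the same network architecture.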
The model results were evaluated using Root Mean Square Error (RMSE) and
Mean Absolute Percentage Error (MAPE). Logarithm-based pre-processing
techniques were found to provide better forecasting performance than the other pre-processing
techniques considered.
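For reference, the two error measures can be computed as in the sketch below, which uses the standard definitions of RMSE and MAPE. The function names and sample values are illustrative and are not results from the study.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error, in the same units as the flows."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean Absolute Percentage Error, in percent; observed values must be non-zero."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))

# Made-up values, for illustration only.
obs = [100.0, 200.0]
pred = [110.0, 180.0]
print(rmse(obs, pred), mape(obs, pred))
```

RMSE weights large errors, such as missed flood peaks, more heavily, while MAPE expresses the error relative to the magnitude of the observed flow, so the two measures complement each other.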
The results indicate that detecting non-stationary aspects of the data and selecting an appropriate pre-processing technique are highly beneficial in improving the prediction performance of the ANN
model.
Keywords
Department of Applied Mechanics and Hydraulics, Brahmaputra River, Gauging Station, Pandu, Pancharatna, Guwahati, Time Series, Data Preprocessing, ANN, FFBP, Activation Function, RMSE, MAPE