Filling of data gaps in Sea Surface Temperature

Filling of Data Gaps in Sea Surface Temperature Using Hadoop-Based Neural Networks
Under the guidance of B. Sangamithra
Presented by S. Devikumari, 13 MT 8202
Wednesday, September 9, 2020

Contents
  • Abstract
  • Existing system
  • Proposed system
  • Data Processing With Hive
  • Algorithm
  • Neural network model
  • Neural network steps
  • Output
  • Conclusion

Abstract
Large-scale sea surface temperature (SST) data, generated continuously by multiple sensors, are highly significant for the analysis of huge amounts of data. However, the natural properties of satellite data present three nontrivial challenges: the large data scale makes it difficult to maintain both efficiency and accuracy; similar data increase the system load; and noise in the data set strongly influences the processing result, so it must be handled efficiently when neural networks are applied to large data sets. The data are divided into separate segments, each learned by the same network structure. A Hadoop-based framework called HBNN (Hadoop-based Backpropagation Neural Network) is proposed for forecasting on large-scale SST data. It uses the backpropagation algorithm, which provides greater efficiency and good scalability.

Existing system
Traditional methods for filling data gaps are not effective for non-stationary and non-linear time series data. When data are missing at random locations, they provide no solution.

Proposed system
The SST gap-filling (missing data) problem can be solved with Hadoop-based neural networks. We suggest an approach to the problem described above, i.e., recovery of missing data in time series using artificial neural networks.
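One way to frame gap filling as a learning problem can be sketched as follows. This is a minimal sketch, not the presenter's code: it assumes each record is a dictionary with an SST field that may be empty, and that records with a known SST serve as training examples while the others are the gaps to be estimated. The function name `split_known_and_missing` and the sample records are hypothetical.

```python
def split_known_and_missing(rows, target_key="SST"):
    """Separate records with a known target from records whose target is a gap."""
    known, missing = [], []
    for row in rows:
        value = row.get(target_key)
        # Treat an absent key, None, or a blank string as a gap.
        if value is None or str(value).strip() == "":
            missing.append(row)
        else:
            known.append(row)
    return known, missing


# Made-up records for illustration only; gaps may occur at any position.
records = [
    {"WV": "1.0", "SST": "28.5"},
    {"WV": "2.0", "SST": ""},
    {"WV": "3.0", "SST": None},
]
training_rows, gap_rows = split_known_and_missing(records)
print(len(training_rows), len(gap_rows))
```

Because the split looks at each record on its own, it does not matter where in the series the gaps fall.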

Data Processing With Hive
Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It can define databases and tables to analyze structured data. It resides on top of Hadoop to summarize Big Data, and makes querying and analysis easy. Here, Hive is used to give structure to the raw, unstructured SST data.

Hive Queries
-- Stage each raw CSV line as a single string column
create table SST_DATAtable (col_value STRING);
LOAD DATA INPATH '/home/hadoop/training/hive/SSTdata.csv' OVERWRITE INTO TABLE SST_DATAtable;
-- Target table: one string column per comma-separated field
create table SSTdata (U1 string, U2 string, WV string, CLW string, RR string, SST string);
-- Repeating the pattern n times leaves field n in capture group 1
insert overwrite table SSTdata
SELECT
  regexp_extract(col_value, '^(?:([^,]*),?){1}', 1) AS U1,
  regexp_extract(col_value, '^(?:([^,]*),?){2}', 1) AS U2,
  regexp_extract(col_value, '^(?:([^,]*),?){3}', 1) AS WV,
  regexp_extract(col_value, '^(?:([^,]*),?){4}', 1) AS CLW,
  regexp_extract(col_value, '^(?:([^,]*),?){5}', 1) AS RR,
  regexp_extract(col_value, '^(?:([^,]*),?){6}', 1) AS SST
from SST_DATAtable;
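The field-extraction pattern in these queries can be tried outside Hive. The sketch below uses Python's `re` module and assumes the intended pattern is `^(?:([^,]*),?){n}` (the spaces in the slide text look like extraction artifacts). The helper name `extract_field` and the placeholder line are hypothetical. As in Hive's `regexp_extract`, a capture group inside a repeated group keeps the text from its last repetition, so repeating n times yields field n.

```python
import re


def extract_field(line, n):
    """Return the n-th (1-based) comma-separated field of a line."""
    # The non-capturing group consumes one field plus an optional comma;
    # after n repetitions, capture group 1 holds the n-th field.
    pattern = r'^(?:([^,]*),?){' + str(n) + '}'
    match = re.match(pattern, line)
    return match.group(1) if match else None


# Placeholder tokens stand in for the six column values.
line = "u1,u2,wv,clw,rr,sst"
print(extract_field(line, 1))  # u1
print(extract_field(line, 3))  # wv
print(extract_field(line, 6))  # sst
```

If a line has fewer than n fields, the extra repetitions match empty text and the result is an empty string.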

Back propagation Algorithm
First apply the inputs to the network and work out the output. At the start, the neural network is assigned random weights for its connections. The output obtained from the given input is compared to the target output. The weights are adjusted to reduce the difference between the target and the output to a minimum. This process is repeated until a low enough difference is achieved; this stopping condition is known as the desired error. Another stopping condition is the maximum number of training epochs.
Error calculation: Error = Output × (1 − Output) × (Target − Output), where Output × (1 − Output) is the derivative of the sigmoid activation.
Weight change: let w1 be the new (trained) weight and w the initial weight. Then w1 = w + (Error × Output)
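The loop described above can be sketched for a single sigmoid neuron. This is an illustrative sketch under stated assumptions, not the HBNN implementation: the slide's update w1 = w + (Error × Output) is read here with "Output" meaning the value carried along the connection (the neuron's input, i.e. the output of the preceding unit), the learning rate is implicitly 1, and the function names, sample, and thresholds are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_neuron(samples, weights, desired_error=0.01, max_epochs=1000):
    """Train one sigmoid neuron; samples is a list of (inputs, target) pairs."""
    weights = list(weights)
    worst = float("inf")
    epoch = 0
    # Two stopping conditions: the desired error and the maximum number of epochs.
    while epoch < max_epochs and worst > desired_error:
        epoch += 1
        worst = 0.0
        for inputs, target in samples:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
            # Error = Output(1 - Output)(Target - Output)
            error = output * (1.0 - output) * (target - output)
            # Each weight moves by Error times the value on its connection.
            weights = [w + error * x for w, x in zip(weights, inputs)]
            worst = max(worst, abs(target - output))
    return weights, epoch, worst


# Random initial weights, as in the description; one made-up training sample
# whose second input is a constant 1.0 acting as a bias.
rng = random.Random(0)
start = [rng.uniform(-0.5, 0.5) for _ in range(2)]
trained, epochs, worst = train_neuron([((1.0, 1.0), 0.8)], start)
print(epochs, round(worst, 4))
```

A full backpropagation network repeats the same error calculation layer by layer, passing each output neuron's error back to the hidden neurons through the connecting weights.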

Neural Network Model

Steps for Neural Network
Step 1: Start the neural network tool using the nntool command.
Step 2: Select Import. The import window is displayed.
Step 3: Browse for the input and target datasets and load them from disk files.

Step 4: Create a Network

Step 5: Train the Perceptron
Now click the Train tab. Specify the inputs and outputs by clicking the Training Info tab, setting the inputs to the trained input and the targets to the trained target. Click Train Network to train the feed-forward backpropagation network. The following training results appear.

Training Phase

Performance of SST Data

Training State of SST Data

Regression of SST Data

SSTDATA OUTPUT

Conclusion
The main objective of this study is to retrieve missing values in measured sea surface temperature time series data. The proposed Hadoop-based neural network (HBNN) is trained and tested on filling the SST data gaps, with possible error estimation and correction. Our results show that the efficacy of the estimation procedure, and thus the reliability of the estimated missing values, depends on a number of factors.

Any Queries… Thank You
