Automated Short-term Prediction of Solar Flares using Machine Learning

Rami Qahwaji (r.s.r.qahwaji@bradford.ac.uk) and Tufan Colak (t.colak@bradford.ac.uk)
EIMC, University of Bradford, BD7 1DP, U.K.
http://spaceweather.inf.brad.ac.uk/

SIPWORKIII 08/09/06

Organisation of this talk
o Objectives & related work
o Solar data (features and activities)
o Data Association
o Machine learning algorithms
o Practical results
o Conclusions and future work

Objective
o We aim to design an automated system that provides short-term prediction of solar flares by using machine learning to establish a correlation between sunspots and solar flares.

Related Work
o Despite recent advances in solar imaging, machine learning has not been widely applied to solar data, except for verification purposes.
o Solar activity (i.e., the Wolf number) was first predicted by (Calvo et al. 1995).
o (Borda et al. 2002) described a method for the automatic detection of solar flares using a BP MLP. SVM and RBF were used for flare detection in (Qu et al. 2003).

Data?
o Data from the publicly available National Geophysical Data Centre (NGDC) sunspot group and flare catalogues are used in our study.
o NGDC keeps records of data from several observatories around the world and holds one of the most comprehensive publicly available databases of solar features and activities.

The NGDC sunspots catalogue
o The NGDC sunspot catalogue holds records of sunspot groups, supplying their date, time, location, physical properties, sunspot area and classification data.
o Two classification systems exist for sunspots: McIntosh, which depends on the size, shape and spot density of sunspots, and Mt. Wilson, which is based on the distribution of magnetic polarities within spot groups.

The NGDC Flares catalogue
o This catalogue provides the dates, start and end times of flare eruptions, their location, the NOAA number of the corresponding active region and the X-ray classification of the detected flares.
o Not all flares have associated NOAA numbers. Flares without NOAA numbers are not included in our study.

Associating Flares and Sunspots
o We investigated all the sunspot groups that were associated with flares from 01 Jan 1992 to 31 Dec 2005. The degree of association was determined from the NOAA region number and the timing information.
o A C++ platform that extracts flare and sunspot information online from the NGDC catalogues was created. Our software analysed the data for 29343 flares and 110241 sunspots and associated 1425 M and X flares with their corresponding sunspot groups.
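The association step can be illustrated with a short sketch. This is not the authors' C++ platform: the record types, the function name and the 24-hour look-back window are all assumptions made for illustration, since the talk only states that the NOAA region number and timing information are used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SunspotRecord:          # hypothetical record layout
    noaa: int                 # NOAA active region number
    observed: datetime        # observation time of the sunspot group
    mcintosh: str             # McIntosh class, e.g. "Fkc"


@dataclass
class FlareRecord:            # hypothetical record layout
    noaa: Optional[int]       # None when the catalogue gives no region number
    start: datetime           # flare start time
    xray_class: str           # X-ray class, e.g. "M2.1"


def associate(flares, sunspots, window=timedelta(hours=24)):
    """Pair each M or X flare with the most recent sunspot record of the
    same NOAA region observed no more than `window` before the flare start.
    The window length is an assumed parameter, not a value from the talk."""
    by_region = {}
    for spot in sunspots:
        by_region.setdefault(spot.noaa, []).append(spot)

    pairs = []
    for flare in flares:
        # Flares without a NOAA number are excluded, as in the study.
        if flare.noaa is None or flare.xray_class[:1] not in ("M", "X"):
            continue
        candidates = [
            spot for spot in by_region.get(flare.noaa, [])
            if timedelta(0) <= flare.start - spot.observed <= window
        ]
        if candidates:
            pairs.append((flare, max(candidates, key=lambda s: s.observed)))
    return pairs
```

Choosing the latest observation inside the window is one possible tie-break; other rules (for example, the nearest observation in either direction) would fit the same description.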

Flare Prediction [figure]

Theoretical Model [figure]

o Various neural network topologies, support vector machines (SVM) and Radial Basis Function Networks (RBFN) are optimised and compared.
o In our previous work (Qahwaji & Colak, CITSA 2006; Colak & Qahwaji, WSC 11) the performance of several NN topologies (e.g., Elman BP, FFBP and cascade FFBP) was compared, and it was concluded that CCNN provides better association between solar flares and sunspot classes.
o CCNN and RBFN are used because of their efficient performance in classification and time-series prediction (Frank et al. 1997).

SVM vs NN?
o Comparing the performance of SVMs and NNs is one of the recent trends in machine learning.
o The work reported in (Acir & Guzelis 2004), (Pal & Mather 2004), (Huang et al. 2004) and (Distante et al. 2003) supports this.
o Similar performance for SVMs was reported for flare detection in (Qu et al. 2003).

Cascade FFBP
o In cascade FFBP, the first layer has weights connecting it to the input layer. Each subsequent layer has weights connecting it to the input layer and to all previous layers.
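The cascade connectivity can be made concrete with a small forward-pass sketch. The function name, the tanh activation and the list-based weight layout are assumptions for illustration; the talk does not specify them. The point shown is only that each layer receives the original input together with the outputs of every earlier layer.

```python
import math


def cascade_forward(x, layers):
    """Forward pass through a cascade feed-forward network.

    `layers` is a list of (weights, biases) pairs. `weights[j]` is the
    weight row of neuron j; its length must equal the number of inputs
    plus the number of neurons in all earlier layers.
    The tanh activation is an assumed choice.
    """
    seen = list(x)   # network input followed by outputs of earlier layers
    out = []
    for weights, biases in layers:
        if any(len(row) != len(seen) for row in weights):
            raise ValueError("weight row length must match input plus earlier layers")
        out = [
            math.tanh(sum(w * v for w, v in zip(row, seen)) + b)
            for row, b in zip(weights, biases)
        ]
        seen.extend(out)  # later layers also see this layer's outputs
    return out
```

For a network with 2 inputs, a first layer of 2 neurons and a second layer of 1 neuron, the second-layer weight row therefore has 4 entries: 2 for the inputs and 2 for the first-layer outputs.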

SVM (Support Vector Machines)
An SVM maximises the distance between the separating hyperplane and the closest vectors of both classes.
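The distance being maximised can be computed for any candidate hyperplane. A minimal sketch, assuming a linear hyperplane w·x + b = 0 and class labels of +1 and -1 (the function name is illustrative and not taken from the talk):

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Labels are +1 or -1. A positive result means every point lies on its
    own side of the hyperplane; an SVM looks for the (w, b) that makes
    this value as large as possible.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )
```

Because the result is divided by the norm of w, rescaling w and b together leaves the margin unchanged.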

Radial Basis Function Networks (RBFN) [figure]

Optimising the Learning Algorithms
o A learning algorithm provides the best generalisation if it is optimised. A NN is optimised if the optimum topology, learning algorithm and learning time are found.
o After finding that CCNN provides the best performance, we compared 100 different CCNN topologies. We found that a CCNN with 6 hidden nodes in the first layer and 4 hidden nodes in the second layer gives the best results for CFP and CFTP.
o Similar approaches were followed for SVM and RBFN.

o Both NGDC catalogues were used. Our software analysed the data for 29343 flares and 110241 sunspots and associated 1425 M and X flares with their corresponding sunspot groups.
o The total number of samples in our training set is 2882, of which 1425 represent sunspots that produced flares.
o The remaining 1457 samples represent sunspots that existed on non-flaring days and are not related to any of the sunspot groups in the flaring samples.

The Training and Testing Sets
o NN training and testing were carried out using the statistical Jack-knife technique (Fukunaga 1990).
o For all the experiments, 80% of the samples are randomly selected and used for training, while the remaining 20% are used for testing. The experiments are repeated a number of times and the average is taken.
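The repeated 80/20 procedure described above can be sketched as follows. The function and parameter names, the default of 10 repeats and the fixed seed are assumptions; `train_and_score` stands for whatever routine trains a model on the first argument and returns a score on the second.

```python
import random


def repeated_holdout(samples, train_and_score, repeats=10,
                     train_fraction=0.8, seed=0):
    """Average the score of `train_and_score(train, test)` over several
    random splits. `repeats` and `seed` are assumed defaults; the talk
    only says the experiments are repeated a number of times."""
    rng = random.Random(seed)
    cut = int(round(train_fraction * len(samples)))
    scores = []
    for _ in range(repeats):
        shuffled = list(samples)
        rng.shuffle(shuffled)                      # new random split each time
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)
```

With 2882 samples and a training fraction of 0.8, each split in this sketch uses 2306 samples for training and 576 for testing.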

Initial Experiments
o For each sample, the training vector consists of 5 elements (3 inputs and 2 outputs).
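The slide does not list the three inputs. One plausible reading, given that the McIntosh class has three letters, is one numeric input per letter. The sketch below is built on that assumption; the ordinal scaling to the range 0 to 1 is hypothetical and is not the encoding used in the talk, and the two outputs are not reconstructed here.

```python
# Letters of the three McIntosh components, in their conventional order.
ZURICH = "ABCDEFH"        # modified Zurich class
PENUMBRA = "XRSAHK"       # penumbra of the largest spot
DISTRIBUTION = "XOIC"     # spot distribution (compactness)


def encode_mcintosh(code):
    """Map a three-letter McIntosh class such as 'Fkc' to three numbers in
    [0, 1], one per component. The ordinal scaling is an assumption made
    for illustration only."""
    if len(code) != 3:
        raise ValueError("a McIntosh class has exactly three letters")
    encoded = []
    for letter, alphabet in zip(code.upper(), (ZURICH, PENUMBRA, DISTRIBUTION)):
        index = alphabet.find(letter)
        if index < 0:
            raise ValueError(f"unknown McIntosh letter: {letter!r}")
        encoded.append(index / (len(alphabet) - 1))
    return encoded
```

Calling `encode_mcintosh("Fkc")` yields one value per component, with larger values for classes later in each component's ordering.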

Initial Experiments
o Several experiments based on the Jack-knife technique were carried out, and the best-case prediction rate for flares was 72.9%.
o This indicates that a correlation exists between the input and output sets, but the value is not high enough to provide reliable prediction of solar activity.
o To improve the learning performance, we tried to associate the classified sunspots with the sunspot cycle.

o This seemed logical because the rise and fall of solar activity coincides with the sunspot cycle (Pap et al. 1990).
o When the solar cycle is at a maximum, plenty of large active regions exist and many solar flares are detected. These decrease in number as the Sun approaches the minimum of its cycle (Pap et al. 1990).

Solar Cycle and Flares
[figure; source: Science@NASA, "Solar Minimum Explodes", 9.15.2005]

Solar Cycle Modelling: Hathaway’s Model
a represents the amplitude and is related to the rate of rise from the cycle minimum; b is related to the time in months from minimum to maximum; c gives the asymmetry of the cycle; and t0 denotes the starting time.
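The equation itself did not survive on the slide. The parameter descriptions match the cycle-shape function published by Hathaway, Wilson and Reichmann (1994), f(t) = a (t - t0)^3 / (exp((t - t0)^2 / b^2) - c), and the sketch below evaluates that form. Treating it as the exact expression used in the talk is an assumption, as are the function name and the choice to return zero before the cycle start.

```python
import math


def hathaway_sunspot_number(t, a, b, c, t0):
    """Evaluate f(t) = a (t - t0)^3 / (exp((t - t0)^2 / b^2) - c).

    t and t0 are in months. Returning 0.0 before the start of the cycle,
    and far into the tail where exp() would overflow, is a choice made
    for this sketch.
    """
    dt = t - t0
    if dt <= 0:
        return 0.0
    exponent = (dt / b) ** 2
    if exponent > 700.0:          # math.exp overflows a little above this
        return 0.0
    return a * dt ** 3 / (math.exp(exponent) - c)
```

Evaluated over the months following t0, the function rises to a single maximum and then decays, which is the asymmetric cycle shape the parameters describe.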

o For each sample, the training vector consists of 6 elements (4 inputs and 2 outputs).

o Hence, for an Fkc sunspot at solar maximum that produced an M flare, the training vector looks like this: [figure]

Conclusions
o A fully automated computer platform that uses machine learning to verify the correlation between sunspot classes and solar flares has been designed. The association and learning software will be made public shortly at http://spaceweather.inf.bradford.ac.uk/
o Our findings show a direct relation between flare eruptions and certain McIntosh classes of sunspots, such as Ekc, Fki and Fkc.
o Our findings are in accordance with (McIntosh 1990), (Warwick 1966) and (Sakurai 1970).

A hybrid system, which combines both SVM and CCNN, will give better results for flare prediction.

Future Work
o Apply image segmentation and classification algorithms to detect sunspots and classify them automatically, so that the platform is complete.
o Track individual sunspot groups over their lifetime. The development of a sunspot group can contribute to the knowledge of the machine learning systems.
o Will better prediction be achieved if the magnetic configuration of sunspots (Mt. Wilson classification) is combined with the sunspot area to replace the McIntosh classification (Sammis, Tang & Zirin 2000, ApJ)?

o Compare our findings with those of other authors who tested the correlation of the various McIntosh classes with flare rates and its application to solar flare prediction (e.g. McIntosh 1990; Bornmann & Shaw 1994, Sol. Phys. 150, p. 127; Gallagher et al. 2002, Sol. Phys. 209, p. 171; Wheatland 2004, ApJ 609, p. 1134).

Acknowledgment
o This work is supported by an EPSRC Grant (GR/T17588/01), entitled “Image Processing and Machine Learning Techniques for Short-Term Prediction of Solar Activity”.