1 INTRODUCTION
Sunspots are the most obvious phenomenon on the solar surface with high magnetic fields. Researches have shown they are evidently related to the solar cycle and other solar activities (flare, filament, coronal mass ejection [CME] etc.) (Wang et al. Reference Wang, Liu, Wang, Deng, Xu, Jing and Cao2013). Sunspot properties (including size, location, number, magnetic classification etc.) are usually applied to predict solar activities and monitor the space environment (Du & Wang Reference Du and Wang2012). Therefore, acquiring their properties is significative. Accurate recognition is essential for extracting sunspots and then able to acquire their properties. In early stage, it was manually recognised (Steinegger et al. Reference Steinegger, Vazquez, Bonet and Brandt1996), which is inefficient and not real-time, could not deal with the statistically requirements for numbers of data and needs to monitor space environment in real-time. Automatic recognition with high precision is therefore expected.
Several methods have been proposed to recognise sunspots. Zharkov et al. (Reference Zharkov, Zharkova, Ipson and Benkhalil2005) used edge-detection method and a local threshold to find the sunspot candidates, and a median filter was employed to remedy initial over segmentation of images. Curto, Blanca, & Martínez (Reference Curto, Blanca and Martínez2008) applied erosion and eroded gradient transformation to the detection of the solar limb, and then top hat operator was used to obtain valley regions on the solar disk. Watson et al. (Reference Watson, Fletcher, Dalla and Marshall2009) employed the morphological operations and then an intensity threshold to obtain candidates for the sunspot regions. Djafer, Irbah, & Meftah (Reference Djafer, Irbah and Meftah2012) adopted wavelet analysis to detect sunspots and the solar limb on Ca ii K1 Meudon images. Goel & Mathew (Reference Goel and Mathew2014) adopted a method called level set to detect and track sunspots.
Huairou Solar Observing Station (HSOS) is one of the key stations of the National Astronomical Observatories, Chinese Academy of Sciences. It has been observing sunspots for many years (Zhang et al. Reference Zhang2007). One of its telescopes can acquire the full-disk photospheric data and vector magnetic fields at the same time. Recognising sunspots based on these data will provide sunspot geometrical and magnetic properties, and is meaningful for sunspot study. Thus, a procedure for automatic recognition of sunspots in HSOS full-disk solar images is expected.
A large number of above methods had been tested to do with that, but did not work well in HSOS image. This is mainly due to the following reasons: (1) affected by ground atmosphere and with the limited diameter of telescope, some sunspots umbra and penumbra in the HSOS images are inseparable; (2) in the year of 2011, as the problem of the telescope, these are instrument noises in HSOS images. Above methods do not consider the interference by them. So in order to extract sunspots accurately from HSOS full-disk images, we propose a new automatic recognition procedure, by which a catalogue of sunspot properties is expected to generate.
This paper is arranged as follows: in Section 2, we briefly introduce the data that are used. Related theories and tools are discussed in Section 3. In Section 4, we give a detailed description of the procedure to identify the sunspots. We verify our procedure and make a discussion in Section 5. The summary is in Section 6.
2 DATA
HSOS has been equipped a 10-cm full-disk vector magnetograph since 2006 that is able to measure full-disk vector magnetic fields at FeI 5324.19 Å and obtain photospheric images simultaneously. The image frame size is 992 × 1004, with the pixel resolution of $2\,\text{arcmin}\times 2\,\text{arcmin}$ and spatial resolution better than 5 arcsec. Sunspots can be observed in the images. Combined with the synchronous vector magnetic field data, automatic identification of sunspots can get their geometrical and magnetic properties.
3 RELATED TOOLS FOR SUNSPOT RECOGNITION
Before introducing sunspots recognition procedure, the related tools will be introduced first.
3.1. Mathematical morphology
Mathematical morphology is a tool for extracting image components that are useful for representation and description (Haralick, Sternberg, & Zhuang Reference Haralick, Sternberg and Zhuang1987). The basic idea is by means of the structure of a certain morphology to measure and extract the corresponding shape of an image. Compared with the differential operator to extract edges, this method has following advantages: it is not so sensitive to noise as differential operator, meanwhile the recognised edges are smooth, continuous, and have less breakpoints.
The two basic morphological set transformations are erosion and dilation. These transformations involve the interaction between an image I (the object of interest) and a structuring set B, called the SE. A lot of other morphological operations are derived from them. In the following, erosion, dilation, closing, opening, and Bot-hat transformation will be introduced.
-
(1) Erosion
To obtain the minimum grey value of the original image I minus the one of the block B, erosion operation is defined as follows:
$$\begin{equation*} I\ominus B\! =\! \min \lbrace I(x+x^{\prime },y+y^{\prime })\, {-}\, B(x^{\prime },y^{\prime })|(x^{\prime },y^{\prime })\in D_b\rbrace . \end{equation*}$$Here, Db is the neighbourhood block.
-
(2) Dilation
Dilation is to obtain the maximum grey value of the original image plus the one of block. It is defined as follows:
$$\begin{equation*} I\oplus B\! =\! \max \lbrace I(x-x^{\prime },y-y^{\prime })\, {+}\, B(x^{\prime },y^{\prime })|(x^{\prime },y^{\prime })\in D_b\rbrace . \end{equation*}$$Here, Db mean the same as in (1).
-
(3) Closing
Closing operation is a process of dilation followed by erosion. It is defined as follows:
$$\begin{equation*} I\bullet B = (I\oplus B)\ominus B. \end{equation*}$$It is employed to fill small holes, with objects connect the adjacent objects and smooth the boundaries of the images.
-
(4) Opening
Opening generally smooths a contour in an image, breaking narrow isthmuses and eliminating thin protrusions. It is defined as follows:
$$\begin{equation*} I\circ B = (I\ominus B)\oplus B. \end{equation*}$$ -
(5) Bot-hat transformation
Bot-hat transformation is a subtraction of the original image and the closing image, it is defined as follows:
$$\begin{equation*} T = I - (I\oplus B)\ominus B. \end{equation*}$$T shows a grey valley in the original image, and highlights the boundaries between connected objects. It is able to extract dark pixels from a bright background.
3.2. Otsu algorithm
Otsu algorithm is a way to search an adaptive threshold, proposed by Otsu (Reference Otsu1979), whose idea is to divide an image into two parts: background and objective with a sharp discrepancy. The more different the two parts are, the sharper the discrepancy is. Part of the objectives wrongly divided into the background will make the discrepancy smaller. Therefore, the sharpest discrepancy means minimum probability of erroneous division. The calculation principle is as follows.
Now suppose that we dichotomise the pixels of the image I(x, y) into two classes (objects and background) by a threshold T, the objects and background pixels numbers are N 1 and N 2, respectively. The image size is indicated by M × N, the percentage of the objects pixel number is ω1, the percentage of the background ones is ω2, then we can easily verify the following relation:
ω1 and ω2 are the objects and background area probabilities, respectively (Sezgin et al. Reference Sezgin2004).
The objects and the background average grey values are shown by μ1 and μ2, respectively. The image average grey value is then by μ
It is the total mean level of the original picture.
g is given by
It represents the variances between objects and background. Substituting Equation (5) into (6) gives
Iterating all possible pixel values at the threshold T, the value which makes g the biggest will be the result.
4 RECOGNITION PROCEDURE
After the introduction to the above tools, sunspot recognition procedure based on them could be described.
In order to recognise sunspots and calculate their parameters (sunspot areas, position and so on), two steps are carried out. In the first step, the solar limb is extracted from the solar disk and the solar centre and radius are fixed. In the second, to recognise sunspots, morphological Bot-hat operation and local threshold are employed. Over segmentation of sunspots is eliminated by limitation in sunspot properties. More details are introduced in the following sections.
4.1. Extraction of solar limb
To calculate sunspot positions and areas on the solar disk, we must determine the solar centre and radius. This is achieved by extracting the solar limb. A procedure based on a morphological method and Otsu algorithm is designed. The steps are as follows (Figure 1):
-
(1) The first step is to remove sunspots and noises on the solar disk to get a clean solar disk. This is achieved by applying a closing operation (first dilation then erosion) with a structuring element (SE) to the original image in Figure 1(a). When this SE is larger than sunspots and noises, the closing operation will be able to remove them. The biggest sunspot in hundreds of images is chosen, and its radius calculated to be 30 pixels, so the SE radius is set to be 30. Then a clean solar disk is gotten in Figure 1(b).
-
(2) The second step is to shrink the solar disk in Figure 1(b). To do this, an erosion operation is employed by using an SE. This SE radius is set to be 1 pixel so as to make the radius of the solar disk 1 pixel smaller, then a shrunk solar disk is shown in Figure 1(c).
-
(3) To get the solar limb, Figure 1(c) is subtracted from Figure 1(b) and 1(d) is gotten.
-
(4) The following step is to get the pixel position of the solar limb. Otsu algorithm is applied to segment Figure 1(d), and a binary image is gotten in Figure 1(e), on which the white pixels are the position of the solar limb.
-
(5) In Figure 1(e), some CCD noises often exist near image border. In order to separate them from the solar limb, a criterion is made, that is as follows:
Figure 1(e) is set to I, and its size is M × N, a pixel in the image is set to I(x, y), if $(\frac{N}{2} - y)^2 + (\frac{M}{2} - x)^2 > M^2$ , then I(x, y) = 0.
Then CCD noises will be removed from Figure 1(e).
-
(6) In the final step, the white pixel positions in Figure 1(e) are extracted and saved for X and Y arrays, then least square fitting method is used to arrays X and Y to create a circle which is considered to be the solar limb. It is labelled with red colour and overlapped on the original image shown in Figure 1(f.)
4.2. Recognition of sunspots
Once the solar limb is extracted, we can recognise the sunspots in its interior. Due to the limited resolution of data, the sunspot umbra and penumbra could not be separated in HSOS images, so they will be treated as a whole in the processing. The recognition of sunspots is achieved by the following steps (Figure 2):
-
(1) Pre-processing: the method in Wang, Su, & Zhang (Reference Wang, Su and Zhang2008) is employed to pre-process the original image and Figure 2(a) is gotten.
-
(2) The second step is to get the gradient of the boundaries of sunspots and noises on Figure 2(a). First, a closing operation with the SE radius of 30 pixels is applied in Figure 2(a), which is larger than the radiuses of sunspots and noises so as to remove them, the definition of this SE is the same with the method in the first step of Section 4.1, and a clean solar image in Figure 2(b) is available. Then Figure 2(a) is subtracted from Figure 2(b), the gradient is shown in the resulting image of Figure 2(c).
-
(3) The third step is to separate the gradient of sunspots from noises in Figure 2(c). Here, the definition of threshold is the key, it depends on the darkness of Figure 2(c). By tests and statistics, the suitable value is 20% of the intensity range of Figure 2(c). But due to solar limb darkening, the sunspots gradient is lower at the solar limb, we make this threshold smaller to be 15% on the region of 0.8R of the solar disk (R represents the solar radius). Then sunspots candidate regions are gotten in Figure 2(d).
-
(4) The final stage is to acquire sunspots from candidates in Figure 2(d). We consider the candidates as verified sunspots in which the difference between the max grey value of a pixel and the min one is bigger than 5, and other regions are discarded.
-
(5) The sunspots are labelled in red and superimposed on the original image. The result is shown in Figure 2(e).
4.3. Sunspots recognition in images with instrument noises
As mentioned in the first paragraph, the instrument noises which appear in HSOS images make the previous methods not work well. So this procedure is used to test the effect of the recognisation of the sunspots. The result is shown in Figure 3, in which we can see the procedure performs well for these images.
5 VERIFICATION AND DISCUSSION
Two methods are adopted to verify the accuracy of the procedure. The first is to detect sunspots by our automatic procedure and manual method separately based on the existed data set, and make a comparison between them. The second is to compare two different data sets taken in the same period, we calculate the correlations of the sunspot areas taken from them by the automatic procedure. More details and results are given below.
5.1. Accuracy of automatic procedure compared with manual
Sunspot number is a basic index to reflect the level of solar activities, and is a suitable vehicle to assess the accuracy of our procedure (Hoyt & Schatten Reference Hoyt and Schatten1998). In our case, the data set is from HSOS photospheric images from November 2011 to December. The automatic and manual methods are adopted separately to calculate the sunspot numbers automatically and count them manually. Two results are listed in Table 1. The first column of the table shows the date of taking the images, the second column the number of sunspots counted by the manual method in an image, the third column the number of sunspots done by our automatic procedure, the fourth column is false rejection rate FRR (the number of sunspots detected by the manual but not by the automatic), and the last column is false acceptance rate FAR (the number of sunspots detected by the automatic but not by the manual).
FRR: the number of sunspots detected by the manual but not by the automatic.
FAR: the number of sunspots detected by the automatic but not by the manual.
In the process of manual recognition, small sunspots are easily missed due to limited seeing condition. To avoid this, the images of the Solar Dynamics Observatory/Helioseismic and Magnetic Imager (SDO/HMI) are used as reference to ensure small sunspots to be detected.
In the last row of the table, we sum the total sunspot numbers recognised by manual and automatic method respectively, and the total FRR and FAR.
Then we define the recognition rate as follows:
and the error rate as follows:
Analysing the sunspots which the automatic method detected but the manual did not, it may be caused by tiny dust or noises, such as instrument noise. The reasons why the sunspots can be detected by the manual but not by the automatic are mainly the following:
(1) For some adhesive sunspots, the manual method will deal with them as separated ones, but the automatic method regards them as one; (2) in our recognition procedure to remove noises, the criterion we set is a region whose difference between maximum grey and minimum one is larger than 5 of being a sunspot. This works well in most cases, except a few cases in which some small sunspots are still missed.
We calculate the diameters of the sunspots which are missed by the automatic method, the diameter calculation formula is
where s is the sunspot area, R is the solar disk radius acquired by the automatic solar limb detection program, and d the sunspot diameter in units of degree.
The result is shown in Table 2. From the table, we can see their diameters are all less than 2°, which means they are weak sunspots on the solar disk which have little impact on the level of solar activities.
5.2. Verification with USAF/NOAA
Sunspot area is an important indicator of the solar activity level (Carbonell & Ballester Reference Carbonell and Ballester1992), which is associated with the solar cycle and significant in space environment monitoring. To further verify the automatic method, we make use of the sunspot area data in this paper.
Figure 4 shows the comparison of the daily sunspot areas extracted by the automatic method from HSOS images during the period of 2006–2012 with those available as TXT files at the US Air Force/NOAA. The horizontal axis represents the date of taking images, the vertical the sunspot area, in units of millionths of a solar hemisphere (μHem). Due to the maintenance of the instrument from August 2009 to November 2010, HSOS data are blank.
From Figure 4, we can see that USAF/NOAA and HSOS have the same tendency, with the correlation coefficient of 95 % shown in Figure 5. This is a high accuracy of recognition which could prove the procedure works well.
To analyse the difference between Figure 4(a) and 4(b), we find the discrepancy is caused by a few factors: first, some instrument noises indistinguishable from smaller sunspots lead to false identification; second, atmospheric interference to HSOS images makes the sunspots border more indistinct; finally, although we choose and compare the images of HSOS and USAF/NOAA in the same days, their collection time may be different that the sunspot morphology may somewhat change.
5.3. Enlarge the recognised sunspots
In order to show the detection result, some sunspots are selected randomly and recognised. The detected areas are zoomed and compared with the original images, they are shown in Figure 6. From the figure, it seems that the sunspots are recognised accurately.
6 CONCLUSION
A procedure for recognition of the sunspots in full-disk photospheric solar images of HSOS is introduced. It adopts Gaussian algorithm to smooth the images at first. Then the sunspots are recognised through the morphological Bot-hat operation with a local threshold. Wrong selection of sunspots is eliminated by a criterion of limiting sunspot properties. Besides, the morphological operations and Otsu algorithm are used to extract the solar limb, which is helpful to calculate sunspot areas. Compared with the manual method, the recognition rate is 95%, the error rate is 1.2%. The correlation between USAF/NOAA and HSOS sunspots areas is in a good agreement (95%). The advantage of this procedure is that it is appropriate to detect sunspots for lower resolution images, particularly the images associated with instrument noises.
In the next step, we will focus on improving the accuracy of recognition. A new sunspot property database will be expected and released on web site, which will be helpful for a study of the solar activity.
ACKNOWLEDGEMENTS
One of the authors C. Zhao is grateful to Prof. X. J. Mao for giving beneficial advices for the manuscript. This work is supported by National Natural Science Foundation of China (11427901, 11403058, 11473039, 11178005, U1331113, U1531247, 11203036, U1331104, 11427803, 11303052, 11373040, 11573037), Ministry of Science and Technology of the People’s Republic of China (2012fy120500, 2014FY120300), and the Young Researcher Grant of National Astronomical Observatories, Chinese Academy of Sciences.