1. Introduction
Radio surveys provide unique views into the Galactic and extragalactic skies. At the frequency of the Rapid ASKAP Continuum Survey (RACS, at 887.5 MHz; McConnell et al. 2020), and more generally below a few GHz, radio emission is dominated by synchrotron radiation, the emission from relativistic electrons spiralling within magnetic fields (Condon 1992). This traces two main extragalactic populations: Active Galactic Nuclei (AGN) and Star Forming Galaxies (SFGs). For SFGs, it provides a method of obtaining unbiased star formation rates (SFR; e.g. Bell 2003; Garn et al. 2009; Davies et al. 2017; Gürkan et al. 2018), as radio emission is un-attenuated by dust. Observing synchrotron emission from AGN is important for understanding galaxy evolution, as their feedback is thought to limit the size to which galaxies can grow (see e.g. Bower et al. 2006; Fabian 2012; Harrison 2017). Within the Galaxy, radio emission is often observed from supernova remnants (see e.g. Whiteoak & Green 1996; Anderson et al. 2017), as Galactic synchrotron emission within the Galactic plane (see e.g. Haslam et al. 1982; Green et al. 1999; Murphy et al. 2007; Wang et al. 2018), and from transient and variable sources (see e.g. Thyagarajan et al. 2011; Bhandari et al. 2018). This variety of objects motivates radio surveys for advancing our understanding of the Universe.
For catalogues of extragalactic radio sources, it is important to have both large-area and deep observations. Deeper, smaller-area surveys provide observations of fainter radio populations (e.g. radio quiet quasars and SFGs, see e.g. Wilman et al. 2008; Padovani et al. 2015; Smolčić et al. 2017b) and allow galaxy evolution to be investigated to earlier times in the age of the Universe. Large-area surveys, on the other hand, allow extreme and rare AGN to be observed as well as large samples of resolved nearby SFGs. They are also crucial in providing information for radio sky models. Moreover, observations at multiple epochs of large sky areas are useful for detecting transient or variable sources (see e.g. Thyagarajan et al. 2011; Mooley et al. 2016; Nyland et al. 2020).
At $\sim$ 1 GHz, radio surveys which have observed large regions of the southern skies ( $\delta < 0^{\circ}$ ) have been dominated by the combination of the Sydney University Molonglo Sky Survey (SUMSS; Mauch et al. 2003), the Molonglo Galactic Plane Survey (MGPS; Green et al. 1999), and the updated MGPS-2 survey (Murphy et al. 2007) as well as the NRAO VLA Sky Survey (NVSS; Condon et al. 1998), complemented in the smaller overlap regions by Faint Images of the Radio Sky at Twenty-Centimeters (FIRST; Becker et al. 1995; Helfand et al. 2015). SUMSS surveyed the southern sky up to a northern-most $\delta =-30^{\circ}$ (excluding the Galactic plane $|b|<10^{\circ}$ ) at 843 MHz with $45^{\prime\prime}$ / $\sin|\delta|$ resolution at a typical sensitivity of ${\sim}1 \mathrm{\,mJy\, beam}^{-1}$ . NVSS, on the other hand, is a northern sky survey which observed to a southern-most $\delta=-40^{\circ}$ at 1.4 GHz, observing with a constant $45^{\prime\prime}$ resolution at a typical sensitivity of ${\sim}0.5 \mathrm{\,mJy\, beam}^{-1}$ . FIRST also observed at 1.4 GHz with the VLA to a deeper sensitivity of ${\sim}0.15 \mathrm{\,mJy\, beam}^{-1}$ , at $5^{\prime\prime}$ resolution around the north and south Galactic caps. However, FIRST does not probe a large fraction of the southern skies and has limited sensitivity to extended emission.
At lower frequencies, the Galactic and Extra-Galactic All-sky MWA Survey (GLEAM; Hurley-Walker et al. 2017) and TIFR GMRT Sky Survey Alternate Data Release (TGSS-ADR; Intema et al. 2017) provided observations of large regions of the southern sky. GLEAM observed south of $\delta = +30{^\circ}$ at ${\sim}2{^\prime}$ resolution, reaching a root mean square (rms) sensitivity of 6–10 $\mathrm{\,mJy\, beam}^{-1}$ in the frequency range 70–230 MHz. TGSS-ADR observed the entire sky north of $\delta=-53{^\circ}$ at higher resolution, ${\sim}25{^{\prime\prime}}$ , to an rms sensitivity of approximately $3.5 \mathrm{\,mJy\, beam}^{-1}$ at 150 MHz. At higher frequencies, the Australia Telescope Compact Array (ATCA) has conducted surveys of the southern radio sky, including the Australia Telescope 20 GHz Survey (AT20G; Murphy et al. 2010), with approximately $10^{{\prime\prime}}$ resolution, yielding a catalogue to an integrated flux density limit of 40 mJy (for $\delta<0^{\circ}$ ). All these large area surveys are crucial to improving statistics on galaxy numbers, finding rare objects, and observing nearby radio sources and resolved star forming galaxies. Moreover, the combination of low-, mid- and high-frequency radio surveys is important for spectral modelling of sources (see e.g. Clemens et al. 2008; Callingham et al. 2017; Galvin et al. 2018).
In order to observe such large areas it is advantageous to have an instrument with a large field of view. The Australian SKA Pathfinder (ASKAP; Johnston et al. 2008; Hotan et al. 2014; Hotan et al. 2021) is one such facility able to easily conduct large sky surveys. It uses phased array feeds (PAFs), which provide an instantaneous field of view of ${\sim}31 \,\mathrm{deg}^2$ . The first large sky survey with ASKAP is the Rapid ASKAP Continuum Survey (RACS), which is described in detail in Paper I (McConnell et al. 2020). RACS used 15 minute observations to image the sky south of $\delta=+41{^\circ}$ using 903 tiles with typical sensitivities of 0.25–0.3 $\mathrm{\,mJy\, beam}^{-1}$ . Each tile is defined as the mosaic of the individual 36 beams which are simultaneously observed using the PAF technology. This survey therefore provides the best knowledge of radio sources at gigahertz frequencies in the southern hemisphere to date. In the future, the Evolutionary Map of the Universe (EMU; Norris et al. 2011; Norris et al. 2021) will provide a deeper map of the southern sky to ${\sim}20\, \mu\mathrm{Jy\ beam}^{-1}$ rms, but this will require a significant increase in observation time.
In this paper, we provide the first release of the RACS Stokes I source catalogue. The layout of this paper is as follows. First, we present an overview of RACS in Section 2 and describe the observations used for this first data release. Next, we describe the cataloguing process in Section 3 and the final catalogue in Section 4. We then discuss comparisons to previous surveys in Section 5 before discussing the completeness in Section 6. In Section 7, we present the source counts, both raw and completeness-corrected, for this survey before drawing conclusions in Section 8.
2. Rapid ASKAP Continuum Survey
A detailed description of the RACS tiling and observation strategy can be found in Paper I (McConnell et al. 2020). The majority of RACS observations were initially taken over the course of 12 days during April and May 2019. Subsequently, re-observations of selected fields were taken in August–November 2019 and March–June 2020. The latter re-observations were designed to reduce the PSF variation amongst the 36 beams within each individual tile. Each observation lasted 15 minutes, and a 200 s calibrator observation (of PKS B1934–638) was typically taken within a day of each observation. These observations covered a 288 MHz bandwidth in the frequency range 744.5–1032.5 MHz. In total, 903 tiles were observed to ensure complete coverage of the sky for $\delta \leq41^{\circ}$ .
Each observation was processed using ASKAPSoft (see Cornwell et al. 2011; Guzman et al. 2019; Whiting et al. 2020, and Whiting et al. in prep). This software was specifically designed to take the raw ASKAP visibilities and produce calibrated images of the field, suitable for scientific use. The pipeline parameters used for the RACS survey are described thoroughly in Paper I. A robustness weight of 0.0 (see Briggs 1995) was used, and short baselines were removed to improve image fidelity for those observations close to the Galactic plane (baselines smaller than 35 m were excluded) or affected by solar interference (baselines smaller than 100 m were excluded)Footnote a. All RACS tile images are on a $2.5{^{\prime\prime}}$ pixel grid and cover ${\sim}31 \,\mathrm{deg}^2$ . As described in Paper I, after the tile images were formed, a flux correction was applied to the tile to account for differences between the primary beam model applied within ASKAPSoft and the response pattern determined from holography measurements.
Images of each of the 903 tiles have been released publicly on the CSIRO ASKAP Science Data Archive (CASDA, Chapman et al. 2017; Huynh et al. 2020)Footnote b. To construct the first Stokes I catalogue, we convolved each tile to a common resolution before mosaicing with neighbouring tiles to reduce sensitivity fluctuations.
2.1. Selected tiles
A combination of factors affects the resolution of beam images within an individual tile, including declination, hour angle coverage, and the flagging of data within the observations. As ASKAP uses a phased array feed system, each field is constructed from 36 individual beams. The resolution can vary from beam to beam within an individual tile as well as between neighbouring tiles. In order to retain accurate flux scale measurements, it is necessary to ensure these beams and those in neighbouring tiles have the same resolution. This is because radio images are in units of $\mathrm{Jy\,beam}^{-1}$ and so varying PSFs in neighbouring images would affect both flux density and shape measurements of sources when mosaiced.
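The dependence of the $\mathrm{Jy\,beam}^{-1}$ scale on the PSF can be illustrated with a minimal sketch. It assumes Gaussian beams and smoothing with a unit-integral kernel, in which case the brightness scale changes by the ratio of the beam areas. The function names and the 15″ × 12″ example beam are hypothetical and are not taken from the RACS scripts.

```python
import math

def gaussian_beam_area(bmaj_arcsec, bmin_arcsec):
    """Solid angle (arcsec^2) of an elliptical Gaussian beam with FWHM axes."""
    return math.pi * bmaj_arcsec * bmin_arcsec / (4.0 * math.log(2.0))

def brightness_rescale_factor(old_beam, new_beam):
    """Factor converting Jy/(old beam) to Jy/(new beam) after smoothing an
    image with a unit-integral kernel: the ratio of the two beam areas."""
    return gaussian_beam_area(*new_beam) / gaussian_beam_area(*old_beam)

# Hypothetical beam image with a 15" x 12" PSF smoothed to a 25" circular beam.
factor = brightness_rescale_factor((15.0, 12.0), (25.0, 25.0))
print(f"25 arcsec beam area: {gaussian_beam_area(25.0, 25.0):.1f} arcsec^2")
print(f"brightness scale factor: {factor:.3f}")
```

Two neighbouring images with different PSFs therefore carry different brightness units, which is why they cannot be averaged directly in a mosaic.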
To determine the fields to include in this first data release it was important to consider the aim of the survey. RACS will provide a model of the observable sky for future ASKAP surveys as well as provide an initial epoch as a benchmark for the search for variable or transient sources. For these purposes, it is important to have a resolution as high as possible in order to resolve individual sources and also to observe a large contiguous region of the sky.
However, it is not possible to simply use those individual beams which have angular resolution better than the desired criterion. This is because the holography correction, as described in Paper I, requires all 36 beams of a tile to be present. This correction accounts for the differences between the beam models assumed in the ASKAPSoft linear mosaicing function, linmos, and those determined by holography measurements. As some beams within tiles can have poor angular resolution compared to other beams within the field, not all of the 903 RACS tiles can be convolved to a desired common resolution.
Using these constraints, we decided upon a common resolution of a circular Gaussian beam with a diameter of $25^{{\prime\prime}}$ (Full Width at Half Maximum, FWHM) for the first catalogue data release. This choice of beam improves on the resolution of surveys such as SUMSS and NVSS by approximately a factor of 2, whilst also ensuring that the included observations cover the majority of the southern sky. The distribution of beam major axes (defined by the FWHM) across the entire RACS survey area released in Paper I is shown in Figure 1. All individual PSF major axes that are above $25^{{\prime\prime}}$ are shown in grey. For tiles in which there have been reobservations, the tile denoted with the column ‘SELECT=1’ in the accompanying database to Paper I was generally chosen. However for a few fields, see Table 1, a different tile to the one selected in Paper I was used. This was because the tile selected could not be convolved to $25^{{\prime\prime}}$ , whilst a different observation of the same field could be. This often resulted in these fields having larger rms values, but did result in a continuous observing area. As illustrated in Figure 1, there can be large variations in the PSF major axis across the 903 tiles of the RACS observations.
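The selection logic described above can be sketched as follows. This is an illustration only: the dictionary layout, the observation labels and the beam sizes are hypothetical, and the rule encoded is simply that a tile is usable when all 36 beams are present and none has a PSF major axis above $25^{\prime\prime}$, with the SELECT=1 observation preferred when it qualifies.

```python
TARGET_FWHM_ARCSEC = 25.0
BEAMS_PER_TILE = 36

def tile_can_be_convolved(beam_major_axes, target=TARGET_FWHM_ARCSEC):
    """True if all 36 beams are present and none exceeds the target FWHM."""
    return (len(beam_major_axes) == BEAMS_PER_TILE
            and max(beam_major_axes) <= target)

def pick_observation(observations):
    """Prefer the SELECT=1 observation; otherwise fall back to another
    observation of the same field that can reach the target resolution."""
    usable = [o for o in observations if tile_can_be_convolved(o["bmaj"])]
    usable.sort(key=lambda o: o["select"] != 1)  # SELECT=1 sorts first
    return usable[0]["label"] if usable else None

# Hypothetical field with two observations: the SELECT=1 one has a single
# beam with a 27.3" major axis, so the re-observation is used instead.
field = [
    {"label": "obs_A", "select": 1, "bmaj": [18.0] * 35 + [27.3]},
    {"label": "obs_B", "select": 0, "bmaj": [21.0] * 36},
]
chosen = pick_observation(field)
print(chosen)
```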
Figure 2 shows the coverage for the first Stokes I catalogue release area across the sky compared to all of the RACS observations. The region for this first catalogue release (blue in Figure 2) covers the majority of the southern sky with $\delta=-80{^\circ}$ to $+30{^\circ}$ and comprises 799 fields (or tiles).
2.2. Producing common resolution mosaics
In order to produce images at $25{^{\prime\prime}}$ resolution, we made use of scriptsFootnote c to convolve each of the 36 single beam images to the desired $25^{\prime\prime}$ resolution, ensuring retention of the flux scale. This process is discussed further in McConnell et al. (2020). The convolved beam images were then linearly mosaiced together using the ASKAPSoft linmos function. Each beam image was weighted according to the number of contributing visibilities, and linmos assumed a circular Gaussian beam of FWHM $1.09\,\lambda$ /D for the primary beam model of each of the individual 36 beams. Here $\lambda$ is the central wavelength of the observations ( $=c/\nu = c/887.5$ MHz ${\sim}34\,\textrm{cm}$ ) and D is the diameter of an ASKAP dish (12 m).
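As a worked check of the numbers quoted above, the assumed primary beam FWHM follows directly from $1.09\,\lambda/D$. The short calculation below uses only the values given in the text; the variable names are ours.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light
FREQ_HZ = 887.5e6           # central frequency of the observations
DISH_DIAMETER_M = 12.0      # ASKAP dish diameter

wavelength_m = C_M_PER_S / FREQ_HZ
fwhm_rad = 1.09 * wavelength_m / DISH_DIAMETER_M
fwhm_deg = math.degrees(fwhm_rad)

print(f"wavelength = {wavelength_m * 100:.1f} cm")
print(f"primary beam FWHM = {fwhm_deg:.2f} deg")
```

This gives a wavelength of about 34 cm and a primary beam FWHM of roughly 1.76 degrees for each of the 36 beams.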
Figure 3 presents images before and after convolution to $25^{\prime\prime}$ resolution for three example regions, showing the differences in images for a range of pre-convolution major axis sizes. As can be seen, convolving to a poorer resolution loses some of the fine structure that could have been observed in the sources, such as in the jets of AGN, and has led to an apparent increase in the rms for each image. However, it does provide a consistent resolution across the full region used for this first Stokes I catalogue data release. This prioritises having a reliable flux scale across the image over retaining higher, but variable, angular resolution (which is still available for all tiles in CASDA).
2.3. Tile flux corrections
As discussed in Paper I, the primary beam model assumed by linmos differs from the beam-dependent shapes revealed by holography measurements across the full ASKAP tile. This resulted in systematic and direction dependent errors in the brightness scale. Using a combination of holography and comparisons to SUMSS and NVSS, an empirical flux correction is applied to each tile. We apply this same flux correction to our linearly mosaiced, common resolution tiles. This flux correction varies across the field to a maximum correction of ${\sim}\pm30-40\%$ .
2.4. Full image mosaic
We mosaiced the convolved $25^{\prime\prime}$ , flux corrected tiles to improve the sensitivity at the edge of each tile. For each tile, adjacent tiles with overlapping regions were mosaiced together using SWARP (Bertin et al. 2002) using the weights images produced with linmos. The resultant mosaiced tile image has the same extent as the original tile image but now includes contributions from neighbouring tiles. Since all tiles undergo the same mosaicing process, neighbouring mosaiced tiles will contain overlap regions with identical image data.
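The gain in sensitivity at tile edges can be illustrated with a minimal sketch of a weighted-average combination. Two assumptions are made here that are not stated in the text: that overlapping pixels are combined as a weighted mean, and that the weights behave like inverse variances. The SWARP configuration actually used is not reproduced, and the function names are hypothetical.

```python
import math

def weighted_mosaic_pixel(values, weights):
    """Weighted mean of the overlapping pixel values at one sky position.
    Returns NaN where no tile contributes any weight."""
    total = sum(weights)
    if total == 0:
        return float("nan")
    return sum(v * w for v, w in zip(values, weights)) / total

def combined_rms(rms_values):
    """rms of the weighted mean, assuming inverse-variance weights."""
    return 1.0 / math.sqrt(sum(1.0 / r ** 2 for r in rms_values))

# Hypothetical edge pixel seen by two tiles, each with rms 0.5 mJy/beam.
pixel = weighted_mosaic_pixel([1.0, 3.0], [1.0, 3.0])
edge_rms = combined_rms([0.5, 0.5])
print(f"mosaiced value = {pixel:.2f}, combined rms = {edge_rms:.3f} mJy/beam")
```

Under these assumptions, two equally noisy overlapping tiles reduce the edge rms by a factor of $\sqrt{2}$.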
These mosaiced tiles allow users to extract images over small specific regions more easily than would be possible from a single full mosaic of the entire sky. These mosaics, as well as the source catalogue, will be released on CASDA alongside the release of the paperFootnote d. Additionally, as a full image of the sky is important for easily navigating the survey and searching for objects, one is included in the form of a HiPS (Fernique et al. 2015) image at https://www.atnf.csiro.au/research/RACS/CRACS_I2/. This HiPS image is an under-sampled version of the mosaiced images that are being released and provides a simple way for users to explore the entire RACS observations.
These mosaiced images of the tiles form the basis of the Stokes I RACS catalogue.
3. Stokes I catalogue: Individual tiles
In order to generate the first full Stokes I catalogue of the area described in Section 2.1, we first produced individual catalogues for each mosaiced tile. Further work was needed to combine these catalogues into a single Stokes I catalogue. In this section, we describe the process of extracting the initial catalogues. The merger of the tile catalogues is then described in Section 4.
We make use of the source extraction software PyBDSF (using version 1.9.1, Mohan & Rafferty 2015). PyBDSF was designed as a radio source finding tool for the LOw Frequency ARray (LOFAR; van Haarlem et al. 2013) that identifies areas of radio emission (islands) and fits these regions with 2D elliptical Gaussian components in order to produce both a ‘Gaussian component’ and ‘Source’ catalogue. The ‘Gaussian component’ catalogue (hereafter called component) consists of all the 2D Gaussians that are used to model sources in the field. As radio sources have a diverse range of morphologies, a combination of single and multiple component sources will exist within the catalogue. The source catalogue, in its default mode, joins together Gaussians within an island based on the separation of Gaussians and their flux valuesFootnote e. Because of this, Gaussians within the same island may be considered different sources. However, if necessary, it is also possible to force Gaussians of the same island to be grouped together in a single source. Details of PyBDSF and the parameters which users can specify can be found at https://www.astron.nl/citt/pybdsf/.
When using PyBDSF on individual tiles, we specified several non-default parameters:
- advanced_opts = True
- thresh = ‘hard’
- rms_box = (150,30)
- atrous_do = True
- atrous_jmax = 3
- mean_map = ‘zero’
- frequency = 887.5e6
- group_by_isl = True.
By default, PyBDSF uses a 3 $\sigma$ detection threshold to identify an island boundary (thresh_isl = 3), and a 5 $\sigma$ threshold is used to include islands within a catalogue (thresh_pix = 5). Setting thresh = ‘hard’ enforces this 5 $\sigma$ cut, and does not include a variable threshold based on the false detection rate.
We also specify the box size used by PyBDSF to generate an rms map by setting rms_box = (150,30). Whilst PyBDSF can internally determine an appropriate box size for producing the rms image, this may need to be changed if there are artefacts within the imageFootnote f. In fact, when we did not specify this parameter, the box size determined by PyBDSF could be as large as approximately 1000 pixels across. This was found to be too large, and artefacts around bright sources were being included by PyBDSF in the output catalogue. Therefore, we decided to specify a smaller box size to better account for bright sources and reduce the likelihood of artefacts being confused for real sources. The first value in rms_box, 150, is the size (in pixels) of the box used to calculate the rms. It was chosen because it appeared to reflect the scale over which artefacts influenced the image surrounding a bright source in the areas with artefacts that we investigated. The second value, 30, is the step size (in pixels) by which the box is moved to calculate the rms.
Moreover, because of the sensitivity of ASKAP to extended emission, we wanted to ensure that such sources were accurately modelled by PyBDSF. To do this, we followed advice from the PyBDSF pagesFootnote g. We set the mean_map parameter to ‘zero’ and switched on the atrous mode using atrous_do = True. We fitted up to three wavelet scales in this mode by setting atrous_jmax = 3. This allows extended emission on larger scales to be fit. As the rms appeared to vary across the field, especially around bright sources, we left rms_map at its default setting, in which an rms map is calculated for the field using the rms box size specified.
Finally, given the source density within these observations, we believe we are not limited by confusion (see Section 4.2 for more details on the source density). By setting group_by_isl = True, we made the assumption that all Gaussians within the same island are likely associated with the same source. From visual inspection of a handful of random fields, PyBDSF appeared to model the emission of resolved sources well.
After running PyBDSF we recorded three things:
• The catalogue of Gaussians identified within the image
• The catalogue of grouped sources identified within the image
• An rms image of the field
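The configuration and outputs described in this section can be sketched with the PyBDSF Python interface. This is an illustrative sketch, not the script used for RACS: the option values are those listed above, while the function name `run_pybdsf` and the output file names are hypothetical. PyBDSF is imported inside the function so that the option set can be inspected without the package installed.

```python
# Non-default PyBDSF options quoted in the text.
RACS_PYBDSF_OPTS = dict(
    advanced_opts=True,
    thresh="hard",          # fixed thresholds, no false-detection-rate threshold
    rms_box=(150, 30),      # (box size, step size) in pixels
    atrous_do=True,         # wavelet decomposition for extended emission
    atrous_jmax=3,          # up to three wavelet scales
    mean_map="zero",
    frequency=887.5e6,      # Hz
    group_by_isl=True,      # one source per island
)

def run_pybdsf(image_path, out_prefix):
    """Run PyBDSF on one mosaiced tile and save the three recorded products.
    The output names are placeholders chosen for this sketch."""
    import bdsf  # requires the PyBDSF package

    img = bdsf.process_image(image_path, **RACS_PYBDSF_OPTS)
    img.write_catalog(outfile=f"{out_prefix}.gaul.fits", format="fits",
                      catalog_type="gaul", clobber=True)   # Gaussian catalogue
    img.write_catalog(outfile=f"{out_prefix}.srl.fits", format="fits",
                      catalog_type="srl", clobber=True)    # Source catalogue
    img.export_image(outfile=f"{out_prefix}.rms.fits", img_type="rms",
                     clobber=True)                         # rms image
    return img
```

A call such as `run_pybdsf("tile_mosaic.fits", "tile_mosaic")`, with a hypothetical file name, would then produce the Gaussian catalogue, the grouped source catalogue and the rms image for that tile.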
Both the ‘Gaussian’ and ‘Source’ catalogues have scientific value. The Gaussian catalogue is useful as it can be used to de-blend the emission from close neighbouring sources which are not associated with one another. However, the ‘Source’ catalogue is useful for providing information on multi-component sources. Therefore, we construct and release both a ‘Gaussian’ and ‘Source’ catalogue associated with the data.
4. Full sky catalogue
We compose the catalogue from the PyBDSF outputs, giving each entry a unique identifier by combining the field name and the PyBDSF source/component identifiers. For example, source 0 in tile RACS_0000+12A was renamed from a Src_ID of 0 to RACS_0000+12A_0. Extra columns recording the Tile_ID associated with a source and its separation from the tile centre were also added.
Due to the overlapping tiles, a simple concatenation of all the individual catalogues would result in the duplication of sources. As the images within the overlap regions are identical, only sources detected in a given tile for which that tile centre is the closest to the source are included in the final catalogue. The source position, not the position of individual components, is used for this match. This is due to the possibility of different Gaussians within the same source near a tile boundary having different tile centres as their nearest tile. After concatenation, we checked that no sources from different tiles were separated by less than 2 pixels (i.e. $5^{\prime\prime}$ ). This only affected a very small number of sources (3 pairs, i.e. 6 sources), and the duplicates among these were removed.
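The identifier construction and the nearest-tile-centre rule can be sketched as below. This is a simplified illustration under stated assumptions: the catalogue rows are plain dictionaries, the tile centres and the second tile name (`TILE_B`) are placeholders rather than real RACS values, and the separation is computed with the haversine formula.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))

def nearest_tile(ra, dec, tile_centres):
    """ID of the tile whose centre is closest to the given position."""
    return min(tile_centres,
               key=lambda t: angular_sep_deg(ra, dec, *tile_centres[t]))

def merge_tile_catalogues(tile_catalogues, tile_centres):
    """Keep a source only in the tile whose centre is nearest to it."""
    merged = []
    for tile_id, sources in tile_catalogues.items():
        for src in sources:
            if nearest_tile(src["ra"], src["dec"], tile_centres) != tile_id:
                continue  # the same source is kept from a closer tile
            sep = angular_sep_deg(src["ra"], src["dec"], *tile_centres[tile_id])
            merged.append({**src,
                           "Source_ID": f"{tile_id}_{src['Src_ID']}",
                           "Tile_ID": tile_id,
                           "Separation_Tile_Centre": sep})
    return merged

# Placeholder centres; the same sky position is detected in both tiles.
centres = {"RACS_0000+12A": (0.0, 12.0), "TILE_B": (6.0, 12.0)}
cats = {"RACS_0000+12A": [{"Src_ID": 0, "ra": 2.0, "dec": 12.0}],
        "TILE_B": [{"Src_ID": 7, "ra": 2.0, "dec": 12.0}]}
merged = merge_tile_catalogues(cats, centres)
print([row["Source_ID"] for row in merged])
```

Matching on the source position rather than on individual Gaussians keeps all components of a source in the same tile.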
We rounded the data in each column to a given number of decimal places and also applied a further 5 $\sigma$ threshold. Whilst PyBDSF uses a 5 $\sigma$ threshold, this is based on the peak pixel value within the image, not the modelled peak flux, and can therefore be greatly affected by noise fluctuations. To ensure we have high SNR sources, we therefore apply a 5 $\sigma$ cut using the peak flux recorded in the PyBDSF catalogue and the island rms column.
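A minimal sketch of this additional cut is given below, assuming that rows are dictionaries carrying the PyBDSF `Peak_flux` and `Isl_rms` values in the same units. Whether the comparison is inclusive of exactly 5 $\sigma$ is not stated in the text, so the `>=` used here is an assumption, and the example rows are invented.

```python
def passes_snr_cut(peak_flux, isl_rms, threshold=5.0):
    """True if the modelled peak flux is at least threshold x the island rms."""
    return isl_rms > 0 and peak_flux / isl_rms >= threshold

def apply_snr_cut(rows, threshold=5.0):
    """Keep only rows whose modelled peak flux passes the SNR threshold."""
    return [r for r in rows
            if passes_snr_cut(r["Peak_flux"], r["Isl_rms"], threshold)]

# Invented rows in mJy/beam: the second has a modelled SNR of about 4.7.
rows = [
    {"Source_ID": "example_0", "Peak_flux": 1.6, "Isl_rms": 0.3},
    {"Source_ID": "example_1", "Peak_flux": 1.4, "Isl_rms": 0.3},
]
kept = apply_snr_cut(rows)
print([r["Source_ID"] for r in kept])
```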
Combining components and sources in this manner produced an initial source catalogue over the majority of the southern sky ( $\delta=-80{^\circ}$ to $+30{^\circ}$ ) of $\sim$ 2.3 million radio sources and a corresponding component catalogue of $\sim$ 2.7 million components, covering a total sky area of $30\,480 \mathrm{\,deg}^2$ . Figure 4 presents the observed density of sources across the sky using a HEALPix gridFootnote h, with an N $_{\rm side}$ value of 64, corresponding to a rough pixel size of $55^\prime$ . The apparent source density variation is discussed later in this paper.
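The quoted HEALPix pixel size can be checked with a short calculation, using the standard relation that a HEALPix map has $12\,N_{\rm side}^2$ equal-area pixels. The variable names are ours.

```python
import math

nside = 64
npix = 12 * nside ** 2                                # equal-area pixels
full_sky_deg2 = 4.0 * math.pi * math.degrees(1.0) ** 2
pixel_area_deg2 = full_sky_deg2 / npix
pixel_size_arcmin = 60.0 * math.sqrt(pixel_area_deg2)

print(f"{npix} pixels of {pixel_area_deg2:.3f} deg^2, "
      f"about {pixel_size_arcmin:.0f} arcmin across")
```

Each cell therefore covers about 0.84 $\mathrm{deg}^2$, so at the average source density of this catalogue a cell holds several tens of sources.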
4.1. Noise distribution
When PyBDSF produces the source and Gaussian catalogue of each tile, a variable rms map of each image is generated. In order to present this rms variation across this first data release, we randomly select 10 million positions across the sky in the range $\delta= -85{^\circ}$ to $+30^\circ$ . At each position, the value of the rms at that location is recorded. We plot this distribution of rms in Figure 5 on the same HEALPix grid as above, showing the median rms value within each HEALPix cell.
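One way to draw such random positions and summarise them per cell is sketched below. The sampling scheme is not specified in the text, so sampling uniformly in $\sin\delta$ (uniform per unit sky area) is an assumption. Assigning positions to HEALPix cells would need a HEALPix library and is not shown, so the cell indices are taken as given inputs.

```python
import math
import random
from collections import defaultdict
from statistics import median

def random_sky_positions(n, dec_min_deg, dec_max_deg, seed=0):
    """Draw n (RA, Dec) pairs in degrees, uniform per unit sky area."""
    rng = random.Random(seed)
    z_lo = math.sin(math.radians(dec_min_deg))
    z_hi = math.sin(math.radians(dec_max_deg))
    positions = []
    for _ in range(n):
        ra = rng.uniform(0.0, 360.0)
        dec = math.degrees(math.asin(rng.uniform(z_lo, z_hi)))
        positions.append((ra, dec))
    return positions

def median_per_cell(cell_ids, rms_values):
    """Median rms of the sampled positions falling in each cell."""
    cells = defaultdict(list)
    for cell, rms in zip(cell_ids, rms_values):
        cells[cell].append(rms)
    return {cell: median(values) for cell, values in cells.items()}

positions = random_sky_positions(1000, -85.0, 30.0)
# Invented rms samples (mJy/beam) for two cells.
summary = median_per_cell([0, 0, 0, 1], [0.2, 0.4, 0.3, 0.25])
print(len(positions), summary)
```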
The rms varies across the RACS survey due to a combination of factors. These include the proximity to bright sources, unmodelled extended emission (which may be a factor close to the Galactic plane), conditions such as the hour angle coverage of the observations, and the overlap of tiles across the sky. As shown in Figure 5, there are large rms values along the Galactic plane as well as in other regions, for example around $\delta=0{^\circ}$ . These variations arise for several reasons: extended emission in the Galactic plane, bright sources with large artefacts within a field, and the scheduling of the observations, which sets their hour angle coverage. The median rms is typically smaller at more southerly declinations compared to equatorial regions. This may be influenced by the greater overlap between neighbouring tiles or possibly due to the hour angle coverage of these observations (see Paper I).
The distribution of all rms values (from these random positions) across the field of view ( $30\,480 \ \mathrm{deg}^{2}$ ) can be seen in Figure 6 (left), and the variation of the median rms value as a function of declination within different declination bins can be seen in Figure 6 (right). This is shown both inclusive and exclusive of the Galactic plane. Across the full sky, the rms values typically have a median value of approximately $0.3 \mathrm{\,mJy\, beam}^{-1}$ ; however, this is closer to $0.2-0.25 \mathrm{\,mJy\, beam}^{-1}$ at $\delta \lesssim -50{^\circ}$ rising to values closer to $0.35-0.4 \mathrm{\,mJy\, beam}^{-1}$ near $\delta= 0{^\circ}$ .
4.2. The Galactic plane
As can be seen in Figure 5, the rms is elevated around the Galactic plane. Furthermore, as presented in Paper I, the emission around the Galactic plane includes substantial extended emission, such as supernova remnants. As these structures will be insufficiently modelled using Gaussian components, we removed the region around the Galactic plane. Therefore, whilst the images on CASDA will contain these regions, the final catalogues used for the analysis in this paper do not contain any sources where the magnitude of the Galactic latitude, $|b| <5{^\circ}$ . Also, in regions near the Large and Small Magellanic Clouds or supernova remnants sources may be poorly modelled as Gaussian components; however, these regions remain in the catalogue.
After excluding the low Galactic latitudes the final source catalogue contains 2,123,638 sources and 2,462,693 Gaussian components over ${\sim}28\,020 \mathrm{\,deg}^2$ of the sky. This corresponds to an average $\sim$ 90 components or $\sim$ 75 sources per square degree.
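The average densities follow directly from the quoted totals, as the short calculation below shows. Only numbers given in the text are used.

```python
n_sources = 2_123_638
n_components = 2_462_693
area_deg2 = 28_020

source_density = n_sources / area_deg2
component_density = n_components / area_deg2

print(f"{source_density:.1f} sources per square degree")
print(f"{component_density:.1f} components per square degree")
```

This gives approximately 76 sources and 88 components per square degree, consistent with the rounded values quoted above.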
We include with the data release the catalogue generated within the Galactic plane region defined here. However, we urge caution for any users wanting to use this catalogue for regions with $|b| <5{^\circ}$ . All further quality assessment and comparison of the catalogue to previous surveys refers solely to the catalogue with the Galactic latitude cut imposed. We note that, as can be seen in McConnell et al. (2020), there may be flux density offsets close to the Galactic plane, as well as large rms values (see Figure 6).
4.3. Catalogue columns
Using the combined PyBDSF catalogues, we use a subset of the column information when generating the final catalogues. These columns provide information on: IDs; astrometry; flux densities; shape information and other important source information. We present an example of the first 10 lines of the source catalogue in Table 2 and Gaussian component catalogue in Table 3, sorted by the Source_ID of the source/Gaussian component. We present a description of the column information for these two RACS Stokes I catalogues belowFootnote i.
4.3.1. Source catalogue
For the Source catalogue, we define the following columns:
• Source_Name - The name of the source given in the IAU convention JHHMMSS.S $\pm$ DDMMSS with the prefix RACS-DR1Footnote j
• Source_ID - The ID of the source given by the RACS tile ID added to the Src_ID generated by PyBDSF
• Tile_ID - The ID of the tile that the source was located in.
• SBID - The ID of the scheduling block associated with the observation.
• Obs_Start_Time - The time that the pointing observation started as Modified Julian Day (MJD) expressed in days.
• N_Gaus - The number of Gaussian components that were used to fit the source
• RA and Dec (and errors) - The J2000 position of the source and its associated errors
• Total_flux_Source - The total flux density of the entire source (i.e. the sum of the Gaussian components and the Total_Flux column in the PyBDSF source catalogue).
• E_Total_flux_Source_PyBDSF - The error on the total flux density from the E_Total_Flux column in PyBDSF.
• E_Total_flux_Source - The combined error on the total flux density derived by summing in quadrature the error from PyBDSF with the errors of flux density from Equation 7 of McConnell et al. (2020).
• Peak_flux (and error) - The modelled peak flux density for the source and its associated error from PyBDSF
• Maj, Min and PA (and errors) - The major axis, minor axis, and position angle of the source fit by PyBDSF
• DC_Maj, DC_Min and DC_PA (and errors) - The deconvolved major axis, minor axis and position angle of the source
• S_Code - The code from PyBDSF which defines whether a source is a single (S), multiple (M) or complex (C) source. A single source (S) is a single Gaussian source corresponding to a single island. A multiple (M) is where a single source is composed of multiple Gaussians. A complex source (C) is a source where there are multiple Gaussians which form multiple sources within an island.
• Separation_Tile_Centre - The distance between the source and the centre of the tile it is located in.
• Noise - The rms noise within the island boundary, quoted from the Isl_rms column in PyBDSF.
• Gal_lon and Gal_lat - The Galactic longitude and latitude of the source in degrees
• Flag_Close - All sources with another source within ${25}^{\prime\prime}$ are flagged with a ‘C’. For 3 pairs of sources, the two sources were so close together that their Source_Name values were identical. This affects only 3 source names out of $\sim$ 2 million, and we have flagged these with ‘CD’ in this column. Sources with no match within ${25}^{\prime\prime}$ have ‘-’ in this column.Footnote k
Unless specified, associated errors are as described in the PyBDSF documentation.
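Two of the derived columns above can be illustrated with a short sketch. Several choices here are assumptions rather than details from the text: coordinates are truncated (not rounded) when forming the name, the prefix is separated from the coordinate string by a space, and the flux-scale term from Equation 7 of McConnell et al. (2020) is passed in as an input because that equation is not reproduced here. The function names are hypothetical.

```python
import math

def racs_source_name(ra_deg, dec_deg, prefix="RACS-DR1"):
    """Build a JHHMMSS.S+/-DDMMSS style name from J2000 degrees.
    Coordinates are truncated, not rounded (an assumption)."""
    tenths_of_sec = int((ra_deg % 360.0) * 2400.0)  # RA in 0.1 s of time
    hh, rem = divmod(tenths_of_sec, 36000)
    mm, rem = divmod(rem, 600)
    ss, tenth = divmod(rem, 10)
    arcsec = int(abs(dec_deg) * 3600.0)             # Dec in whole arcsec
    dd, rem = divmod(arcsec, 3600)
    dm, ds = divmod(rem, 60)
    sign = "-" if dec_deg < 0 else "+"
    return (f"{prefix} J{hh:02d}{mm:02d}{ss:02d}.{tenth}"
            f"{sign}{dd:02d}{dm:02d}{ds:02d}")

def combined_flux_error(e_pybdsf, e_flux_scale):
    """Quadrature sum of the PyBDSF error and a flux-scale error term."""
    return math.hypot(e_pybdsf, e_flux_scale)

name = racs_source_name(10.6875, -41.25)
print(name)
print(combined_flux_error(3.0, 4.0))
```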
4.3.2. Gaussian component catalogue
For the Gaussian component catalogue, the associated columns are:
-
• Gaussian_ID —The ID corresponding to the Gaussian component constructed as the RACS tile ID added to a unique Gaussian ID for the Gaussian components in the individual tile
-
• Source_ID, Tile_ID, SBID, Obs_Start_Time and N_Gaus—as above, describing the source associated with this Gaussian component
-
• RA/Dec (and errors)—The J2000 position of the Gaussian component and its associated errors
-
• Total_Flux_Gaussian (and errors)—The modelled total flux density of each individual Gaussian component and the associated errors (as described above for the source, but now for the component flux density).
-
• Total_Flux_Source (and errors)—Total flux densities and errors as described for the source catalogue
-
• Peak_Flux (and error)—The modelled peak flux density of the Gaussian component and its associated error.
-
• Maj/Min/PA (and error)—The major and minor axes of the source (FWHM) and the position angle of the Gaussian component used to model the source
-
• Maj_DC/Min_DC/PA_DC (and errors)—The deconvolved source sizes and position angle of the Gaussian component
-
• S_Code—as in source catalogue
-
• Separation_Tile_Centre—The distance between the Gaussian component and the centre of the pointing it is located in
-
• Noise—as in source catalogue
-
• Gal_lon and Gal_lat—The Galactic longitude and latitude of the Gaussian component
More information on how the parameters in the source (*srl.fits) and Gaussian component (*gaul.fits) catalogues are produced by PyBDSF can be found through the PyBDSF documentationFootnote l.
We note here that other work may use differing terminology to the source/Gaussian definitions used in this work. For example, “source” in other work may refer to the final radio object where separated lobes and components that have not been identified by PyBDSF as differing sources but that actually come from the same physical object are combined together. This process of combining “sources” (as defined here) into the same physical object often relies on a combination of visual identification and either machine learning methods or likelihood ratios (see e.g. Banfield et al. Reference Banfield2015; Williams et al. Reference Williams2019; Galvin et al. Reference Galvin2020). The process of combining sources into objects for RACS, however, is beyond the scope of this work.
5. Comparisons with previous radio surveys
Having completed the construction of a final catalogue, we now make comparisons with previous radio surveys at various radio frequencies in order to validate the values determined from RACS.
5.1. Comparison images
We begin with a visual comparison for a handful of RACS sources and their counterpart regions in SUMSS, NVSS, and TGSS-ADR, to indicate the difference in image resolution and baseline sensitivity. We include a comparison image from the IR wavelength AllWISE survey (Cutri et al. Reference Cutri2013), to make comparisons for a nearby resolved galaxy. As all these surveys have different sky coverage, there is only a narrow declination window ( $\delta=-40^{\circ}$ to – $30^{\circ}$ ) where it is possible to make a comparison with all four surveys. To obtain these images, we make use of the cutout servers for each of the respective surveysFootnote m.
Figure 7 demonstrates the higher resolution and increased sensitivity of RACS compared to SUMSS and NVSS. The sensitivity of ASKAP to extended emission is shown to be especially important (see the upper panel) to observe the structure in the spiral arms of the resolved galaxy NGC2997. These four cutouts highlight the improvement of RACS on previous large sky southern radio surveys. These images aim to indicate the quality that can be achieved with RACS. On the other hand, there may be regions, for example, around bright sources, with poorer sensitivity compared to other surveys due to the snapshot nature of these observations and difficulties with the image processing.
Images in the RACS regions will be improved with further observations of RACS as well as with surveys such as the Evolutionary Map of the Universe (EMU; Norris et al. Reference Norris2011, Reference Nyland2021; Pennock et al. Reference Pennock2021).
5.2. Flux offsets, astrometric offsets and spectral indices
It is important to ensure an accurate flux scale and accurate astrometry compared to previous observations as well as to investigate how the measured spectral index compares to our knowledge of the radio source population. We therefore compare our results to previous large area radio surveys: GLEAM, NVSS, SUMSS and TGSS-ADR. Each of these surveys has a different angular resolution, operates at a different frequency and observes a different (although often overlapping) region of sky. Due to the differences in resolution and sensitivity, we restrict comparison to unresolved, high signal-to-noise, isolated sources. This ensures differences in the angular resolution, noise, and sensitivity to extended emission do not affect our comparisons.
5.2.1. Identifying unresolved sources
To select unresolved sources, we follow a previously-employed method (Bondi et al. Reference Bondi, Ciliegi, Schinnerer, Smolčić, Jahnke, Carilli and Zamorani2008; Smolčić et al. Reference Smolčić2017a; Shimwell et al. Reference Shimwell2019) by defining an envelope to distinguish unresolved sources from those which are resolved. To construct this envelope, we used the Gaussian component catalogue and selected those components that were classified as single sources, and detected at SNR $\,{\geq}\,$ 5, where the SNR was defined as the peak flux of the Gaussian component divided by its island rms noise. We then considered how the ratio of the integrated flux density ( $S_T$ ) to peak flux density ( $S_P$ ) varies as a function of SNR; see Figure 8.
The total flux-density $S_T$ of an unresolved source with peak brightness $S_P= S\,\mathrm{mJy\,beam}^{-1}$ is $S_T$ = S mJy, by construction. Therefore, if a source is unresolved and the synthesized beam size is a correct representation of the image resolution, the ratio of the integrated to peak flux ( $S_T/S_P$ ) should be identically 1. This ratio, however, often has scatter around 1, especially at low SNR where faint sources are more affected by noise at the source position. For our data we find that as the SNR increases, $S_T/S_P$ tends to a value of 1.025, as illustrated in Figure 8 (right panel). The source of the discrepant value of $S_T/S_P$ is unimportant for our analysis here, but must lie in some unmodelled source smearing due to effects such as uncorrected gain errors or astrometric mismatches between overlapping beams. Following previous methods (Bondi et al. Reference Bondi, Ciliegi, Schinnerer, Smolčić, Jahnke, Carilli and Zamorani2008; Smolčić et al. Reference Smolčić2017a; Shimwell et al. Reference Shimwell2019), we expect values of $S_T/S_P$ , as a function of SNR, to lie predominantly between the envelopes described by
$$S_T/{S_P}_{\pm} = 1.025 \pm A \times \mathrm{SNR}^{-B}. \qquad (1)$$
As resolved sources will have elevated values of $S_T/S_P$ , we determine values for A and B from the lower envelope $S_T/{S_P}_{-}$ and declare sources with $S_T/S_P$ > $S_T/{S_P}_{+}$ to be resolved. To generate this fit, we use equally spaced logarithmic bins in SNR. For each bin with 100 sources or more, we find the $S_T/S_P$ ratio that contains 95% of the sources with $S_T/S_P<1.025$ , indicated by the black crosses on Figure 8. These points are fit to Equation 1 using the SciPy function curve_fit. This fit to the lower envelope is determined to be: $1.025 - 0.69 \times \mathrm{SNR}^{-0.62}$ . We reflect this envelope about $S_T/S_P$ = 1.025 and define the upper envelope: $S_T/S_P = 1.025 + 0.69 \times \mathrm{SNR}^{-0.62}$ . Sources that lie below the upper envelope are determined to be unresolved. Unresolved components are shown as blue points in Figure 8 and resolved sources in grey. From this we estimate approximately 40% of RACS sources are unresolved at ${25}^{\prime\prime}$ resolution, and should therefore also be unresolved in the comparison catalogues.
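The envelope construction described above can be sketched as follows. This is a minimal illustration rather than the pipeline used for the catalogue: the function names, the number of bins and the starting guess for the fit are assumptions. Only the plateau of 1.025, the minimum of 100 sources per bin, the 95% criterion and the use of `scipy.optimize.curve_fit` are taken from the text.

```python
import numpy as np
from scipy.optimize import curve_fit

PLATEAU = 1.025  # high-SNR limit of S_T/S_P quoted in the text


def lower_envelope(snr, a, b):
    """Lower envelope of Equation 1: 1.025 - A * SNR**(-B)."""
    return PLATEAU - a * snr ** (-b)


def fit_envelope(snr, ratio, n_bins=20, min_sources=100, fraction=0.95):
    """Fit A and B to the lower envelope of S_T/S_P versus SNR.

    n_bins is an assumed value; the text only states that the bins are
    equally spaced in log(SNR).
    """
    snr = np.asarray(snr, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    edges = np.logspace(np.log10(snr.min()), np.log10(snr.max()), n_bins + 1)
    centres, points = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (snr >= lo) & (snr < hi)
        if np.count_nonzero(in_bin) < min_sources:
            continue  # skip sparsely populated bins
        below = ratio[in_bin & (ratio < PLATEAU)]
        if below.size == 0:
            continue
        # Value above which `fraction` of the sub-plateau sources lie.
        points.append(np.percentile(below, 100.0 * (1.0 - fraction)))
        centres.append(np.sqrt(lo * hi))  # geometric bin centre (assumption)
    (a, b), _ = curve_fit(lower_envelope, np.array(centres),
                          np.array(points), p0=(1.0, 0.5))
    return a, b


def is_unresolved(snr, ratio, a, b):
    """True where S_T/S_P lies on or below the reflected (upper) envelope."""
    snr = np.asarray(snr, dtype=float)
    return np.asarray(ratio) <= PLATEAU + a * snr ** (-b)
```

Because the fit is unweighted, it is driven mainly by the low-SNR bins, where the envelope departs most from the plateau.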
5.2.2. Matching catalogues
For comparison with other catalogues, RACS sources are selected according to the following criteria:
-
1. Are isolated within an angular separation of $N_{ISO}^{{\prime\prime}}$ . The value of $N_{ISO}$ is given as twice the poorer resolution (using the FWHM) of the two catalogues being compared. We apply the same ‘isolated’ criterion for the comparison radio survey.
-
2. Have a peak SNR in RACS $\geq10$
-
3. Satisfy the unresolved envelope criterion as described above.
-
4. Match the comparison radio catalogue within an angular separation of $N_{match}^{{\prime\prime}}$ . Here $N_{match}$ is taken to be $10^{\prime\prime}$ . This value corresponds to 4 pixels in the RACS images and allows for variation in the positions measured of sources, given NVSS and SUMSS have an angular resolution $\sim$ 2 times poorer than RACS.
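The isolation and matching criteria in the list above can be sketched with a brute-force cross-match. This is an illustrative example only: the function names are hypothetical, and the all-pairs separation matrix is practical only for small catalogues (a tree-based or library cross-match would be needed at the scale of RACS). The $10^{\prime\prime}$ default is the value of $N_{match}$ quoted above.

```python
import numpy as np


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + np.cos(dec1) * np.cos(dec2) * sin_dra ** 2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))) * 3600.0


def isolated(ra, dec, n_iso_arcsec):
    """True for sources with no other source of the same catalogue within n_iso."""
    ra = np.asarray(ra, dtype=float)
    dec = np.asarray(dec, dtype=float)
    sep = angular_separation_arcsec(ra[:, None], dec[:, None],
                                    ra[None, :], dec[None, :])
    np.fill_diagonal(sep, np.inf)  # ignore each source's match to itself
    return sep.min(axis=1) > n_iso_arcsec


def match(ra_a, dec_a, ra_b, dec_b, n_match_arcsec=10.0):
    """Nearest catalogue-B source for each catalogue-A source.

    Returns the index into B and a flag for separations within n_match.
    """
    ra_a = np.asarray(ra_a, dtype=float)[:, None]
    dec_a = np.asarray(dec_a, dtype=float)[:, None]
    ra_b = np.asarray(ra_b, dtype=float)[None, :]
    dec_b = np.asarray(dec_b, dtype=float)[None, :]
    sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
    nearest = sep.argmin(axis=1)
    return nearest, sep[np.arange(sep.shape[0]), nearest] <= n_match_arcsec
```

In use, `isolated` would be applied to both catalogues with $N_{ISO}$ set to twice the poorer FWHM of the pair, before the SNR and unresolved-envelope cuts and the final call to `match`.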
The resolution and frequency for each of the surveys we compare to RACS are shown in Table 4. We use sources which satisfy the match criteria to consider the offsets in flux and astrometry, as well as the measured spectral indices. The spectral index ( $\alpha$ ) is used to define the broadband radio emission as a power law of the form $S_{\nu} \propto \nu ^ {\alpha}$ , where $S_{\nu}$ is the flux density at a frequency, $\nu$ . Typically, $\alpha$ is found in catalogues to have an average value of –0.7 to –0.8 in the synchrotron dominated regime (see e.g. Condon Reference Condon1992; Mauch et al. Reference Mauch, Murphy, Buttery, Curran, Hunstead, Piestrzynski, Robertson and Sadler2003; Smolčić et al. Reference Smolčić2017a).
5.2.3. Flux offsets
We make flux density comparisons using SUMSS due to its close proximity in frequency to RACS (843 MHz for SUMSS compared to 887.5 MHz for RACS). This minimizes any effect of spectral index uncertainty on flux density comparisons. For example, assuming a nominal spectral index of $\alpha=-0.8\pm0.1$ we expect the frequency differences between RACS and SUMSS to result in a flux offset of $\pm0.5$ %, increasing to $\pm$ 5% at the frequency of NVSS resulting from the error in spectral index.
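The size of this effect follows directly from the power law $S_{\nu} \propto \nu^{\alpha}$: an error of $\Delta\alpha$ in the assumed spectral index changes the frequency-scaled flux density by a factor $(\nu_1/\nu_2)^{\Delta\alpha}$. The short sketch below evaluates this for the survey frequencies quoted in the text; the function names are illustrative.

```python
NU_RACS = 887.5   # MHz
NU_SUMSS = 843.0  # MHz
NU_NVSS = 1400.0  # MHz


def scale_flux(flux, nu_from, nu_to, alpha=-0.8):
    """Scale a flux density between frequencies assuming S ~ nu**alpha."""
    return flux * (nu_to / nu_from) ** alpha


def fractional_offset(nu_a, nu_b, delta_alpha=0.1):
    """Fractional change in the scaled flux if alpha is wrong by delta_alpha."""
    return (max(nu_a, nu_b) / min(nu_a, nu_b)) ** delta_alpha - 1.0


print(f"RACS-SUMSS: {100 * fractional_offset(NU_RACS, NU_SUMSS):.1f}%")
print(f"RACS-NVSS:  {100 * fractional_offset(NU_RACS, NU_NVSS):.1f}%")
```

The two values come out at roughly half a percent and roughly five percent respectively, in line with the figures quoted above.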
Using the matching criteria described above, 53,680 matched sources were identified. The comparison of total flux densities assuming a spectral index of $\alpha=-0.8$ can be seen in Figure 9. From this we find a median flux ratio of $1.00^{+0.16}_{-0.16}$ . The associated errors are quoted from the $16^{\textrm{th}}$ and $84^{\textrm{th}}$ percentiles. We therefore conclude that we have an accurate flux scale for our observations. This flux comparison as a function of position can also be seen in Figure 10. We present this comparison with SUMSS (Figure 10, left) and also with NVSS (Figure 10, right). Whilst the difference in frequency compared to NVSS is larger, the two panels of Figure 10 combined show the flux offsets across the majority of the coverage of RACS. Figure 10 does not appear to show significant systematic variation in the ratios of flux density as a function of position.
5.2.4. Astrometric offsets
We assess the astrometry of RACS, using matches that satisfy the selection criteria described in Section 5.2.2 for some of the catalogues described in Table 4. We define the RA offset to be: $\Delta \textrm{RA}$ = RA $_{\textrm{RACS}}$ - RA $_{\textrm{Comp}}$ where “Comp” refers to the comparison survey. The Declination offset is defined in the same way. These astrometric offsets can be seen in Figure 11. We compare to SUMSS and NVSS, but not to GLEAM due to its much larger PSF ( $\sim2{^\prime}$ ), nor to TGSS-ADR as it was tied to the astrometry of NVSS ( $\delta\geq-35{^\circ}$ ) and MGPS or SUMSS ( $\delta\leq-35{^\circ}$ ) to avoid residual astrometric errors from ionospheric interference at low frequencies. We also include a comparison to AT20G which, although it has far fewer comparison sources than NVSS and SUMSS, provides a comparison with surveys at much higher frequencies.
From this we find small median systematic offsets in both RA and Dec, |Offset| $\lesssim 0.8{^{\prime\prime}}$ , where the RA value of RACS is systematically lower than NVSS and AT20G but larger than SUMSS. Here we find RA offsets (in ′′) of: $-0.85^{+1.32}_{-1.22}$ (AT20G), $-0.71^{+2.28}_{-2.22}$ (NVSS) and $+0.46^{+4.05}_{-3.66}$ (SUMSS). The Dec offset is smaller in magnitude than for RA. The measured Dec offsets (in ′′) are: $+0.21^{+0.77}_{-0.86}$ (AT20G), $+0.31^{+2.31}_{-2.38}$ (NVSS) and $+0.12^{+2.51}_{-2.62}$ (SUMSS). However, as the pixel size of the images is $2.5^{\prime\prime}$ , these offsets are typically constrained within a pixel or two. Further discussion of the beam to beam accuracy in astrometry within the individual beam images can be found in Paper I.
The variation of astrometric offset with sky position can be seen in Figures 12 and 13 for Right Ascension and Declination respectively.
5.2.5. Spectral index comparisons
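Finally, we compare the spectral index between RACS and radio surveys at other frequencies, assuming a power law spectral energy distribution (SED) as discussed in Section 5.2.2. We define $\alpha$ here as
$$\alpha^{\rm RACS}_{\rm Comp} = \frac{\log\left(S_{\rm RACS}/S_{\rm Comp}\right)}{\log\left(\nu_{\rm RACS}/\nu_{\rm Comp}\right)},$$
where “Comp” refers to the comparison survey.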
It is important when measuring the spectral indices between matched catalogues that the sensitivity limits are considered, as these will bias spectral indices to either lower or higher values depending on the sensitivity limits and frequencies of the comparison surveys. Therefore we consider the spectral index for our matched sources both with and without a flux density cut applied. To determine the flux density cuts to apply we assume the sensitivity limits of each survey to be the 5 $\sigma$ sensitivity limits in Table 4 or the approximate 10 $\sigma$ sensitivity of RACS (taken here as 3 mJy). Using these sensitivity limits, we determine the flux density cuts that are necessary to ensure there is no bias within the range $\alpha = -0.8\pm 1.2$ , which should encompass the majority of $\alpha$ values observed (see e.g. Smolčić et al. Reference Smolčić2017a; Tiwari Reference Tiwari2019). We then apply any necessary flux cuts to avoid any bias in $\alpha$ . This flux density cut greatly reduces the number of sources available for comparisons. The histogram distribution of these spectral indices can be seen in Figure 14 (left) as well as the comparison of spectral index with flux density (right). This latter plot indicates the necessity of applying a flux limit cut when investigating the spectral index.
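As a sketch of the reasoning above, the two-point spectral index implied by the power law $S_{\nu} \propto \nu^{\alpha}$ is $\alpha = \log(S_1/S_2)/\log(\nu_1/\nu_2)$, and a flux density cut can be chosen so that a source at the cut would remain above the other survey's limit for every $\alpha$ in the range $-0.8\pm1.2$. The example below is one possible interpretation of that procedure, not necessarily the one used for Figure 14; the function names are hypothetical and the comparison limit used in the usage line is a placeholder, not a value from Table 4.

```python
import numpy as np


def two_point_alpha(s_racs, s_comp, nu_racs=887.5, nu_comp=1400.0):
    """Spectral index between two flux densities, assuming S ~ nu**alpha."""
    s_racs = np.asarray(s_racs, dtype=float)
    s_comp = np.asarray(s_comp, dtype=float)
    return np.log(s_racs / s_comp) / np.log(nu_racs / nu_comp)


def unbiased_flux_cut(own_limit, other_limit, nu_own, nu_other,
                      alpha_range=(-2.0, 0.4)):
    """Smallest flux density at nu_own that is unbiased over alpha_range.

    A source of flux density S at nu_own has S * (nu_other/nu_own)**alpha
    at nu_other. The scaling is monotonic in alpha, so only the two ends
    of the range need to be checked. The default range is -0.8 +/- 1.2.
    """
    scalings = [(nu_other / nu_own) ** a for a in alpha_range]
    needed = other_limit / min(scalings)
    return max(own_limit, needed)


# Usage with a placeholder 2.5 mJy limit for a 1400 MHz comparison survey
# and the 3 mJy RACS limit quoted in the text.
racs_cut = unbiased_flux_cut(3.0, 2.5, nu_own=887.5, nu_other=1400.0)
```

The same function can be called with the roles of the two surveys swapped to obtain the corresponding cut on the comparison catalogue.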
For spectral index comparisons, we do not consider $\alpha^{\rm RACS}_{\rm SUMSS}$ due to the small frequency offset. However, we add in a comparison to the rescaled TGSS-ADR catalogue from Hurley-Walker (Reference Hurley-Walker2017). This adjusted the flux scale of TGSS-ADR based on measurements from the GLEAM survey. For comparisons to this survey, we will use the label ‘TGSS-ADR-R’.
From Figure 14, we find a typical median $\alpha$ in the range $\sim-$ 0.6 to $-$ 0.9, encompassing the typical values expected of $\sim-$ 0.7 to $-$ 0.8. Without a flux cut applied, the median $\alpha$ and errors from the $16^{\rm th}$ and $84^{\rm th}$ percentiles are measured as: $-0.69^{+0.25}_{-0.21}$ (GLEAM), $-0.87^{+0.52}_{-0.42}$ (NVSS), $-0.64^{+0.26}_{-0.23}$ (TGSS-ADR), and $-0.62^{+0.25}_{-0.22}$ (TGSS-ADR-R). When a flux cut is applied, these are now measured as: $-0.61^{+0.32}_{-0.19}$ (GLEAM), $-0.90^{+0.43}_{-0.39}$ (NVSS), $-0.59^{+0.36}_{-0.22}$ (TGSS-ADR), and $-0.58^{+0.36}_{-0.21}$ (TGSS-ADR-R). The comparisons with the low frequency surveys of TGSS-ADR and GLEAM are closer to –0.6 to –0.7, whilst the higher frequency comparison with NVSS is more similar to –0.9. This may suggest that the RACS fluxes are slightly larger than expected from previous surveys; however, as shown in Section 5.2.3, we have a good flux comparison with SUMSS.
In general, these comparisons have shown that we have good systematic astrometric and flux characteristics compared to other surveys. Measurements of the spectral indices of RACS sources will also be improved with future RACS observations, which are planned for different frequency bands (see Paper I).
6. Completeness
We consider the completeness of our catalogue as a function of flux density. It should be close to unity at high flux densities and will decline towards zero close to the detection threshold of the survey. Completeness is affected by both the variation of rms across the survey area, which affects the detection threshold, as well as the source finder itself.
We consider the survey completeness in two forms, for unresolved sources and for a combination of both unresolved (point) and resolved sources. To investigate both of these, we use simulations in which we inject sources into the residual images (Image - Model from PyBDSF) and investigate the recovery of the injected sources with PyBDSF. These simulations are described below.
6.1. Point source detection
First, to investigate the point source completeness, we injected Gaussians with the resolution of the images i.e. a circular PSF of $25^{\prime\prime}$ FWHM into our residual images. To consider the detection of sources at a variety of realistic radio flux densities, we use the simulated catalogues from SKADS (Wilman et al. Reference Wilman2008, Reference Wilman, Jarvis, Mauch, Rawlings and Hickey2010). These simulations were created in preparation for the Square Kilometre Array (SKA) to provide realistic mock catalogues that reflect both observations from existing radio surveys as well as expectations of radio sources below current sensitivity limits.
For 5 million random positions across the range $\delta = -85{^\circ}$ to $+30^\circ$ we find the closest tile for each random source. For each tile we consider the random sources which are closest to that tile and for each source we randomly choose a flux density from SKADS and scale this from 1400 MHz to 887.5 MHz, assuming a spectral index of $\alpha=-0.8$ . We inject a Gaussian component with the simulated total flux densityFootnote n into the residual image at the random position generatedFootnote o. PyBDSF is then used to investigate the detection of simulated sources within the image, using the same parameters as in Section 3. From this the comparison of detected sources across the image can be calculated. We repeat this for each tile within the observation. We repeat this method 10 times to make multiple realisations of the simulated distribution of sources. We estimate the average completeness from the mean completeness in each flux bin considered and the error from the standard deviation across the 10 realisations.
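A minimal sketch of the injection step is given below. It assumes that the residual image is in units of flux density per beam, so that an unresolved source of total flux density $S$ is a Gaussian with peak $S$, and it uses the $2.5^{\prime\prime}$ pixel size quoted in Section 5.2.4. The function names are illustrative, and evaluating the Gaussian over the whole array (rather than a small cutout) is done only for brevity.

```python
import numpy as np

FWHM_ARCSEC = 25.0   # circular PSF of the images
PIXEL_ARCSEC = 2.5   # pixel size quoted in Section 5.2.4
SIGMA_PIX = FWHM_ARCSEC / PIXEL_ARCSEC / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def scale_skads_flux(s_1400, alpha=-0.8):
    """Scale a 1400 MHz flux density to 887.5 MHz assuming S ~ nu**alpha."""
    return s_1400 * (887.5 / 1400.0) ** alpha


def inject_point_source(residual, x, y, total_flux):
    """Return a copy of `residual` with a circular Gaussian added at (x, y).

    Assumes per-beam image units, so the peak equals the total flux
    density of the unresolved source.
    """
    yy, xx = np.indices(residual.shape)
    r2 = (xx - x) ** 2 + (yy - y) ** 2
    return residual + total_flux * np.exp(-r2 / (2.0 * SIGMA_PIX ** 2))
```

Summing the injected pixels and dividing by the beam area in pixels, $2\pi\sigma^2$, recovers the total flux density, which is a convenient check that the flux scale is preserved.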
Once all the output PyBDSF catalogues have been calculated for each field and for each simulation realisation, we compare the input sources to those measured. For each field, we match the input catalogue to the recovered catalogue and class as “recovered” those output sources which match to an input source within half the FWHM resolution of our images (i.e. $12.5^{\prime\prime}$ ). We then calculate the detection fraction using two methods.
First, within each flux density bin, we investigate the fraction of sources that have a “recovered” counterpart. The result of this can be seen in the left panel of Figure 15. This shows approximately 50% detection at $\sim$ 1.7 mJy and 95% detection at ${\sim}5.0$ mJy. From this, we also consider the overall completeness of the sources detected in the survey. To do this, we combine knowledge of the underlying flux distribution of sources from the SKADS simulations, with the fraction that are detected. For each flux density bin (logarithmically sampled), we sum the product of the detection fraction with the input source count distribution from the random sources at flux densities greater than or equal to the flux density bin being considered. This is normalised to the sum of the full input source count distribution. This overall completeness can be seen in the left hand panel of Figure 15. This suggests an overall 50% completeness at $\sim$ 0.8 mJy and 95% completeness at ${\sim}2.9$ mJy.
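The first method can be sketched as below. The function names are illustrative, and one point is an assumption: the overall completeness at a given flux density is normalised here by the input counts at or above that flux density, which is one reading of the normalisation described above and ensures the completeness tends to unity at high flux densities.

```python
import numpy as np


def detection_fraction(input_flux, recovered_mask, bin_edges):
    """Fraction of injected sources with a recovered counterpart, per bin."""
    input_flux = np.asarray(input_flux, dtype=float)
    recovered_mask = np.asarray(recovered_mask, dtype=bool)
    n_in, _ = np.histogram(input_flux, bins=bin_edges)
    n_rec, _ = np.histogram(input_flux[recovered_mask], bins=bin_edges)
    frac = np.full(n_in.shape, np.nan)  # NaN where no sources were injected
    np.divide(n_rec, n_in, out=frac, where=n_in > 0)
    return frac, n_in


def overall_completeness(frac, n_in):
    """Completeness of all sources at or above each flux density bin."""
    detected = np.nan_to_num(frac) * n_in
    # Reverse cumulative sums give totals at flux densities >= each bin.
    cum_detected = np.cumsum(detected[::-1])[::-1]
    cum_input = np.cumsum(n_in[::-1])[::-1]
    out = np.full(cum_input.shape, np.nan)
    np.divide(cum_detected, cum_input, out=out, where=cum_input > 0)
    return out
```

For example, with 100 injected sources in each of three bins and 50, 90 and 100 of them recovered, the detection fractions are 0.5, 0.9 and 1.0, and the overall completeness values are 0.8, 0.95 and 1.0.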
Second, we consider the effect of flux measurement by the source finder and how this may affect the apparent distribution of fluxes. This comparison of input to measured flux distribution can be seen in Figure 16 (left panel for point sources). As can be seen, these measured fluxes are scattered around the 1-to-1 measurement line (black line), but have a positive bias, especially at fainter flux densities. This positive bias arises from two combined effects: the measured flux of a source is affected by the noise peak or trough on which it lies and, because brighter sources are more likely to be detected, faint sources which lie on a positive noise spike are preferentially detected. Moreover, as there are more faint sources within the simulations, these are more likely to be affected by this positive bias.
To determine the point source detection fraction with this second method, we compare the binned distribution of flux densities recovered by PyBDSF to the input flux density distribution of the simulated sources injected into the image. The ratio of the output flux density distribution to the input flux density distribution is therefore a measurement of the detection fraction of point sources across the image. This can be seen as the black line in the right panel of Figure 15. As the change in flux density can be seen through this measurement, it is possible to have detection fractions larger than one. This will reflect that, due to differences between input and output flux densities, there are more sources observed in a flux density bin than were input into the simulation. This method suggests a completeness of 50% for point sources with a flux density of $\sim$ 1.8 mJy and 95% at a flux density ${\sim} 2.7$ mJy. This method will be especially important in the discussion of source counts in Section 7.
For both methods though, we determine the average detection fraction and completeness in each flux density bin by the mean value from the simulations. The associated error is then taken as the standard deviation from the simulation realisations.
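The second method, together with the averaging over realisations, can be sketched as follows. The function names are illustrative, and the use of the population standard deviation (NumPy's default) is an assumption, since the text does not specify the estimator.

```python
import numpy as np


def flux_distribution_ratio(input_flux, measured_flux, bin_edges):
    """Ratio of measured to input source numbers in each flux density bin.

    Values above one are possible because measured flux densities can
    scatter sources into a bin from neighbouring bins.
    """
    n_in, _ = np.histogram(input_flux, bins=bin_edges)
    n_out, _ = np.histogram(measured_flux, bins=bin_edges)
    ratio = np.full(n_in.shape, np.nan)  # NaN where no sources were injected
    np.divide(n_out, n_in, out=ratio, where=n_in > 0)
    return ratio


def combine_realisations(ratios):
    """Mean and standard deviation per bin across simulation realisations."""
    ratios = np.asarray(ratios, dtype=float)
    return np.nanmean(ratios, axis=0), np.nanstd(ratios, axis=0)
```

Each row passed to `combine_realisations` would hold the per-bin result from one of the 10 realisations, evaluated on a common set of bin edges.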
6.2. Resolved source completeness
Next we investigate the effect of source size on completeness, as the previous section neglects the effect of resolution bias. Resolution bias accounts for the relative difficulty in detecting extended sources compared to point sources. An extended source with the same integrated flux density as a point source will have a lower peak flux density, an effect that becomes more important at low SNR. This will be important to consider as the majority of sources in this catalogue are believed to be resolved (see Figure 8).
To investigate this, we again use the simulated sources from Wilman et al. (Reference Wilman2008; Reference Wilman, Jarvis, Mauch, Rawlings and Hickey2010), using the source size models associated with each source. SKADS sources are described by a single or a combination of components which are described by ellipses. This will therefore contain a combination of single-component sources for objects such as SFGs as well as multi-component lobed FRI and FRII sources. These simulations should therefore give a more realistic distribution of the diverse ranges of sources expected within radio surveys and will contain a combination of resolved and unresolved sources.
To consider the completeness of a realistic distribution of resolved and unresolved sources, we follow the same method as in Section 6.1, but first convolve the ellipse source model with the Gaussian PSF, ensuring the flux scale is retainedFootnote p. After running PyBDSF on each imageFootnote q, we use the same process as in Section 6.1 to compare the input and output catalogues and determine the detection fraction. This detection fraction is shown in Figure 17, and the comparison of input to measured flux densities can be seen in the right hand panel of Figure 16. This shows that 50% of sources at $\sim$ 1.8 (1.9) mJy will be detected, increasing to a 95% detection fraction at $\sim$ 8.6 (3.3) mJy using method 1 (2) described above. This indicates a poorer detection fraction than for point sources, reflecting the effect of resolution bias on the detection of sources. However, it could also reflect the fact that the simulated sources are extended and made of multiple components, so that matching within a positional radius may lead to errors in matching components to a single source and to larger offsets between the input and measured source positions. Issues due to this would also be seen in flux density comparisons in Figure 16 where the input and measured flux densities appear offset. Computing the overall completeness of our catalogue from method 1 suggests 50% completeness at $\sim$ 0.9 mJy and 95% completeness at $\sim$ 4.7 mJy.
6.3. Limitations of the simulations
We identify three separate limitations to the simulations we have used to analyse the survey completeness. First, these simulations inject sources into the residual image. Therefore, these will not account for any issues that are introduced through the calibration pipeline or any effects of CLEAN bias (see e.g. Section 7.2 of Becker et al. Reference Becker, White and Helfand1995) or time and bandwidth smearing (see e.g. Bridle & Schwab Reference Bridle and Schwab1989). Furthermore, some smearing may occur where images are mosaiced together, which could affect source detectability; this includes both when beams are mosaiced together to form a tile and where tiles are mosaiced with other neighbouring tiles. To improve this, sources could be injected into the visibilities and processed through the pipeline; however, for 799 tiles, this is an arduous process.
Second, these simulations will be affected if the source morphologies assumed in SKADS and the flux density distribution of input sources are not as accurate a representation of the underlying source distribution as expected. This may also be the case if the morphologies observed with ASKAP are more susceptible to extended emission and complex morphologies which may not be well modelled in SKADS using elliptical components. In terms of flux density distribution, though, the source counts from Wilman et al. (Reference Wilman2008) appear to reproduce observations well at the flux densities probed by RACS. This may, however, not be the case at fainter fluxes (see e.g. Smolčić et al. Reference Smolčić2017a; Mauch et al. Reference Mauch2020; Matthews et al. Reference Matthews, Condon, Cotton and Mauch2021).
Finally, these simulations may not properly account for the effect of having multiple sources located in close proximity to each other. This is because the sources are injected at random positions and so will have a uniform source density. This will not account for the clustering of real sources due to the large scale structure of the Universe. Moreover, in the matching process, there may be issues arising from simulated sources merging into a single source if they are located in close proximity. This could affect whether input and output simulated sources are matched together as well as any input to output flux ratios.
Despite these limitations, these simulations should give a good understanding of how well we are detecting realistic radio sources within our images. We shall discuss further how successful these simulations appear to be, given their effect on the measured source counts, in Section 7.
6.4. Reliability
Next we assess the reliability of these observations following the approach of Intema et al. (Reference Intema, Jagannathan, Mooley and Frail2017). This considers the contamination of noise within the catalogue by considering the source detection over the negative image (i.e. –1 $\times$ image). The technique relies on the premise that the noise in the image is symmetric. Therefore, running PyBDSF over the negative images using the same parameters as in Section 3 can indicate the distribution of positive noise which may have contaminated the final source catalogue.
We concatenate the PyBDSF catalogues from the negative images in the same way as described in Section 3, to avoid source duplication. The distribution of source flux densities from the negative image compared to the final catalogues is shown in Figure 18. The number of negative sources is small compared to real sources within the catalogue ( ${\sim} 0.3$ %), suggesting that the number of false detections within the final source catalogue is negligible compared to real sources.
6.5. Density as function of declination
Finally, in order to consider the completeness as a function of sky position, we present the variation of source density of the catalogue with declination. This is shown in Figure 19 for all sources in the source catalogue, where we have excluded the Galactic plane. The density of sources with a total flux density above or equal to the six different flux density limits quoted (1, 2, 3, 4, 5 and 10 mJy) is shown. In Figure 19, the left hand panel shows this source density as a function of declination, whilst the right hand panel indicates the area which is being considered whilst constructing the source density. The integrated total number density of sources above a given flux density is presented in Table 5.
As can be seen in Figure 19, the source density within our catalogue is approximately flat across the entire declination range observed for flux density limits $\geq 4$ mJy. In the 1 and 2 mJy flux density limit bins, on the other hand, the incompleteness limits within the data mean that the source density is more variable over the declination range. These are still relatively small variations at 3 mJy; however, for the 1 mJy limit, the density of sources around $\delta=-60{^\circ}$ is much larger than at other declinations. Moreover, due to the higher rms that can be seen in Figure 5, there is an under-density in sources at declinations of ${\sim}0{^\circ}$ to $+20^\circ$ in both the 1 and 2 mJy bins.
7. Source counts
Finally, using our finished catalogue and, having quantified the completeness within our sample, we compare the source count distribution of the radio sources identified in our catalogue to previous surveys. Whilst narrow, deep surveys help to fill in the source count distribution of faint sources, it is only with large area surveys that the source count distribution of the brightest sources can be understood. This is because these bright objects are rare. Source counts describe the number of sources within a given flux density bin per steradian on the sky. These are typically normalised by multiplying by $S^{2.5}$ (where S is the total flux density) to define the Euclidean normalised source counts (see e.g. Heywood et al. Reference Heywood, Jarvis and Condon2013, for an explanation).
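The Euclidean normalisation described above can be sketched as follows. The function name, the choice of the geometric mean as the representative flux density of each bin, and the use of Jy as the unit are assumptions made for illustration.

```python
import numpy as np


def euclidean_counts(flux_jy, area_sr, bin_edges_jy):
    """Euclidean normalised differential source counts, S**2.5 dN/dS.

    Returns the bin centres (Jy), the normalised counts (Jy**1.5 sr**-1)
    and the raw number of sources per bin.
    """
    flux = np.asarray(flux_jy, dtype=float)
    edges = np.asarray(bin_edges_jy, dtype=float)
    n, _ = np.histogram(flux, bins=edges)
    width = np.diff(edges)                    # bin widths in Jy
    centre = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centre (assumption)
    dn_ds = n / (width * area_sr)             # sources per Jy per steradian
    return centre, dn_ds * centre ** 2.5, n
```

For a population distributed uniformly in a static Euclidean universe, $dN/dS \propto S^{-2.5}$, so the normalised counts would be flat; departures from flatness then trace evolution and the mix of source populations.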
Radio source counts are constructed for most radio survey catalogues (see e.g. Bondi et al. Reference Bondi, Ciliegi, Schinnerer, Smolčić, Jahnke, Carilli and Zamorani2008; Smolčić et al. Reference Smolčić2017a; Shimwell et al. Reference Shimwell2019) and so compilations of the source counts from multiple surveys exist (e.g. de Zotti et al. Reference de Zotti, Massardi, Negrello and Wall2010). We therefore determine the source counts for our catalogue and make comparisons to the past survey source counts compiled by de Zotti et al. (Reference de Zotti, Massardi, Negrello and Wall2010) and the low frequency source counts from GLEAM from Franzen et al. (Reference Franzen, Vernstrom, Jackson, Hurley-Walker, Ekers, Heald, Seymour and White2019), which cover large sky areas, in Figure 20. Here we illustrate the source count distribution for the source catalogue discussed in Section 3. The raw count from this catalogue can be seen as the light blue circles in Figure 20.
However, as discussed in Section 6, our observations are not complete to the faintest flux density limits to which we observe. This is due to the fact that there are noise variations across the full survey area of RACS, meaning that the faintest sources cannot be detected uniformly. To make a correction for this and to correct the measured source counts to what would be observed if the rms was uniform, we make use of the detection fraction curves described in Section 6. Specifically, we use the detection fraction where we account for variations in flux density (see Figures 15 and 17 - right). We use this because the measured source counts will also be affected by any differences between the true and measured flux densities.
Using the detection fractions as a function of flux density from Sections 6.1 and 6.2, we corrected the raw counts to estimate the intrinsic source count distribution. We calculated the associated errors by adding in quadrature the errors from the Poisson statistics of the data and the errors from the completeness simulations (see Section 6.1). In Figure 20, we plot the source counts only for those sources that have a flux density (at 887.5 MHz) greater than 1 mJy. These are compared to the compilation of measured source counts from de Zotti et al. (2010) as well as the source counts from the extragalactic simulated catalogues of SKADS (both converted to 887.5 MHz assuming $\alpha=-0.8$). We apply corrections based on both the point-source-only simulations (Section 6.1; blue) and the simulations that include both point and extended sources (Section 6.2; dark blue). Both are plotted so that the effect of source size can be investigated.
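The correction described above can be sketched as follows, assuming that each raw count is divided by the detection fraction in its flux density bin and that the Poisson and completeness uncertainties are propagated to first order and combined in quadrature. The exact propagation used for Figure 20 is not specified here, so the function and argument names below are hypothetical.

```python
import numpy as np


def correct_counts(raw, raw_err, det_frac, det_frac_err):
    """Completeness-correct binned source counts (illustrative only).

    raw, raw_err           : raw counts per bin and their Poisson errors
    det_frac, det_frac_err : detection fraction per bin and its error,
                             e.g. from completeness simulations
    All inputs are arrays of the same length; det_frac must be non-zero.
    """
    raw = np.asarray(raw, dtype=float)
    raw_err = np.asarray(raw_err, dtype=float)
    det_frac = np.asarray(det_frac, dtype=float)
    det_frac_err = np.asarray(det_frac_err, dtype=float)

    # Dividing by the detection fraction scales the counts up to what
    # would be seen if every source in the bin were detected.
    corrected = raw / det_frac

    # First-order error propagation for the ratio raw / det_frac:
    # the two independent terms are added in quadrature.
    poisson_term = raw_err / det_frac
    completeness_term = raw * det_frac_err / det_frac**2
    corrected_err = np.hypot(poisson_term, completeness_term)
    return corrected, corrected_err
```

Applying the same function with the point-source-only and the point-plus-extended detection fractions would give the two corrected sets of counts that are compared in the text.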
As can be seen from Figure 20, these corrections only affect the lowest flux density bins, below approximately 5 mJy. Using the corrections from Section 6, we find that if only point sources are included, the corrected source counts remain too low at faint flux densities in comparison to previous observations. When the effect of extended sources is included, the source counts are corrected further and are in much better agreement with the source counts from de Zotti et al. (2010) and Wilman et al. (2008). However, these source counts may still be too low in the faintest flux density bins; possible explanations for this are given in Section 6.3. The resulting RACS source counts are listed in Table 6.
At high flux densities, RACS provides tight constraints on the source counts, owing to the large area coverage of the survey. Importantly, this allows the source counts to be measured at flux densities $\gtrsim 10^4$ mJy, which are not well sampled in the de Zotti et al. (2010) compilation. The source counts presented at these high flux densities do appear higher than the counts from Franzen et al. (2019). This may reflect differences in the source populations observed at lower frequencies (200 MHz).
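For reference, converting literature counts to 887.5 MHz under a single assumed spectral index can be sketched as below. The sketch assumes the convention $S_\nu \propto \nu^{\alpha}$, so that a flux density scales by $k = (\nu_{\rm to}/\nu_{\rm from})^{\alpha}$ and, because $dN/dS$ scales by $1/k$, the Euclidean normalised counts $S^{2.5}\,dN/dS$ scale by $k^{1.5}$. The function names and the 1400 MHz frequency in the usage note are illustrative and are not taken from this paper.

```python
import numpy as np


def scale_flux(flux, nu_from_mhz, nu_to_mhz, alpha=-0.8):
    """Scale flux densities between frequencies assuming S ~ nu^alpha."""
    k = (nu_to_mhz / nu_from_mhz) ** alpha
    return np.asarray(flux, dtype=float) * k


def scale_euclidean_counts(norm_counts, nu_from_mhz, nu_to_mhz, alpha=-0.8):
    """Scale S^2.5 dN/dS between frequencies for a single spectral index.

    With S' = k S, dN/dS' = (1/k) dN/dS, so S'^2.5 dN/dS' picks up k^1.5.
    The bin centres must be moved separately with scale_flux.
    """
    k = (nu_to_mhz / nu_from_mhz) ** alpha
    return np.asarray(norm_counts, dtype=float) * k**1.5
```

For example, `scale_flux(s, 1400.0, 887.5)` and `scale_euclidean_counts(n, 1400.0, 887.5)` would move both axes of a hypothetical 1400 MHz source count measurement to the RACS frequency. A single spectral index is a simplification, since real populations have a distribution of $\alpha$.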
8. Conclusions
In this paper, we have presented the first Stokes I catalogue release for the Rapid ASKAP Continuum Survey (RACS, McConnell et al. 2020). This first catalogue covers the majority of the southern sky in the declination range $-80^\circ$ to $+30^\circ$ using 799 tiles. These observations were reduced as described in McConnell et al. (2020), and in this paper we describe the process of mosaicing the observations together and producing a single source catalogue. We present the source catalogue of 2,123,638 islands and 2,462,693 components, which were derived from PyBDSF catalogues of each of the 799 tiles. These have been combined to form a full catalogue from which duplicate sources have been removed. This catalogue will be released for download from CASDA (Chapman et al. 2017; Huynh et al. 2020) with the publication of the paper.
For quality assessment, we have compared the results of this work to previous large sky radio surveys from GLEAM, NVSS, SUMSS, and TGSS-ADR. This has allowed quantification of the accuracy of the flux density scale and astrometry, along with the spectral indices implied by our data. We find good flux density agreement with SUMSS, with a RACS-to-SUMSS flux density ratio of $1.00^{+0.16}_{-0.16}$. Our median astrometric offsets from comparisons with SUMSS and NVSS appear to be limited to a pixel ($2.5^{\prime\prime}$), with most offsets constrained to less than two pixels. Finally, we find typical $\alpha$ measurements of $\sim -0.6$ when comparing to radio observations at lower frequencies than RACS and $\sim -0.9$ when comparing to surveys at higher frequencies.
We have further analysed the data using simulations to investigate the detection fraction of both point and resolved sources within our images as a function of flux density. Using these measurements, we determined that this catalogue detects 95% of point sources at $\sim 5$ mJy, leading to a 95% total point source completeness at $\sim 3$ mJy. We have shown using the detection fraction of sources (of varying size) that we can recover source count distributions similar to previous work over the range $1$ to $10^4$ mJy, and that we extend knowledge of the source count distribution to the highest flux densities ($\gtrsim 10^4$ mJy) relative to previous work.
In summary, this work has described the first large sky RACS catalogue, which provides the deepest radio observations of the southern sky to date. This is especially impressive given the brief duration of each observation and the short overall survey time. We have constructed a deep catalogue of the radio sky at 887.5 MHz, which is important for radio sky models and science. In the future, we will improve upon this first catalogue, providing further catalogues to fill in gaps over the southern sky as well as catalogues at other frequencies. This will provide more information about the spectral distribution of sources within the southern sky. Moreover, the Stokes Q and U polarisation products from RACS will be used to produce a corresponding linear polarisation catalogue, known as SPICE-RACS (Spectra and Polarization In Cutouts of Extragalactic Sources from RACS, Thomson et al. in prep).
Acknowledgements
We thank the referee for their helpful comments to improve this manuscript, and we also thank Minh Huynh for help in uploading the data to the CSIRO ASKAP Science Data Archive, CASDA. We thank H. Andernach for identifying some issues with the source naming convention during the proofs stages. The Australian SKA Pathfinder is part of the Australia Telescope National Facility which is managed by CSIRO. Operation of ASKAP is funded by the Australian Government with support from the National Collaborative Research Infrastructure Strategy. ASKAP uses the resources of the Pawsey Supercomputing Centre. Establishment of ASKAP, the Murchison Radio-astronomy Observatory, and the Pawsey Supercomputing Centre are initiatives of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. CLH acknowledges support from the Leverhulme Trust through an early career research fellowship. TM acknowledges the support of the Australian Research Council through grants FT150100099 and DP190100561. JL and JP are supported by Australian Government Research Training Program Scholarships.
The results in this paper have been derived using several packages: the healpy and HEALPix packages (Górski et al. 2005; Zonca et al. 2019), Astropy (Astropy Collaboration et al. 2013), Scipy (Virtanen et al. 2020), Numpy (van der Walt et al. 2011), matplotlib (Hunter 2007), and tqdm (https://doi.org/10.5281/zenodo.4586769). This research made use of APLpy, an open-source plotting package for Python (Robitaille & Bressert 2012; Robitaille 2019). We have also made use of programs such as ds9 (Joye & Mandel 2003), TOPCAT (Taylor 2011) and Aladin (Bonnarel et al. 2000) in order to help with this research. This publication makes use of data products from the Wide-field Infrared Survey Explorer, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration.