1. Introduction
Identifying galaxy morphological features remains something where the human eye is unsurpassed. Despite great improvements in machine learning techniques, the subtleties of various galaxy morphological properties remain best identified by experts or groups of people. Arguably, this makes these classifications prone to human error and biases. To remedy biases and scale up to the kind of imaging surveys now practical, the Galaxy Zoo project was conceived (Lintott et al. Reference Lintott2008; Willett et al. Reference Willett2013). By involving the wider public in classifying galaxies with a series of specific questions, one can both gain morphological classifications and estimate their biases and uncertainties.
Galaxy Zoo has had phenomenal success based on the work of thousands of volunteers working in tandem with professional astronomers. The results range from those that challenge our understanding of galaxy formation (e.g. Masters et al. Reference Masters2011; Smethurst et al. Reference Smethurst2015; Smethurst et al. Reference Smethurst2022; Walmsley et al. Reference Walmsley, Ferguson, Mann and Lintott2019; Walmsley et al. Reference Walmsley2021a; Walmsley et al. Reference Walmsley2021b) to a wealth of rare objects (e.g. Keel et al. Reference Keel2013; Keel et al. Reference Keel2022; Keel et al. Reference Keel2023).
Morphological classifications are useful on their own but increase dramatically in utility when combined with other information, specifically Spectral Energy Distribution (SED) fits to multi-wavelength data and spectroscopic redshifts. The Galaxy and Mass Assembly (GAMA, Driver et al. Reference Driver2009; Driver Reference Driver2021) survey is an excellent example of such a possible use. By combining MAGPHYS SED information from GAMA with specific Galaxy Zoo classifications for the survey area, subtle differences in specific star formation rates with a morphological feature can be identified (e.g. Porter-Temple et al. Reference Porter-Temple2022; Porter et al. Reference Porter2023; Smith et al. Reference Smith, Arora, Stone, Courteau and Geach2021). The improvements in accuracy, in both the morphology from voting and the properties inferred from the SED fit, for statistical and complete samples allow such subtle relations to be revealed.
Our motivation to construct a GAMA Galaxy Zoo catalogue is therefore threefold: (a) to further explore morphology relations with other inferred properties for this highly complete sample, (b) to cross-examine morphology measures in two iterations of the Galaxy Zoo based on two different imaging surveys of the same galaxies (GAMA+KiDS and DESI Legacy Survey, Fig. 1) but with completely different approaches (direct votes vs voting+zoobot machine learning predictions), and (c) to provide a training catalogue for machine learning experiments for both researchers and students.
The GAMA-KiDS Galaxy Zoo (KiDS-GZ) and the DESI-LS Galaxy Zoo (DESI-GZ) classifications are fortuitously based on mostly the same classification flow diagram for Galaxy Zoo Dr4 (Fig. 2), with the number of options for three questions changing slightly. This means the same questions were answered by the volunteers but based on different images. An additional difference between the two Galaxy Zoo catalogues is that the DESI-LS classifications catalogue is based on zoobot machine learning classifications trained on the Galaxy Zoo classifications of all of the DESI-LS imaging. This study is meant to be a useful cross-check between raw voting and machine-learning extrapolated voting.
This paper is organised as follows: in Section 2, we discuss the origin of the KiDS and DESI surveys Galaxy Zoo databases, as well as the parent GAMA survey catalogue. Section 3 briefly describes the differences in input imaging data quality from these surveys. In Section 4 we directly compare the reported voting fractions for each of the questions in Fig. 2, discussing patterns and their possible explanations. Section 5 compares the Galaxy Zoo voting with other visual classifications of GAMA galaxies. Section 6 uses the DESI Galaxy Zoo voting matched with the GAMA MAGPHYS catalogue to reproduce the results from Porter-Temple et al. (Reference Porter-Temple2022) as a test of how well DESI-GZ can be used in GAMA. In Section 7, we discuss the benefits and drawbacks of using either Galaxy Zoo catalogue for future morphological studies, and Section 8 lists our conclusions from this comparison.
2. Data
We use two iterations of the Galaxy Zoo catalogues: one based on KiDS images, made for the GAMA collaboration, and one based on the DESI imaging survey as a larger Galaxy Zoo effort and then extrapolated using zoobot (Walmsley et al. Reference Walmsley2023a). The relative coverage of these surveys, DESI and GAMA, is shown in Fig. 1. Ancillary data such as stellar mass estimates and redshifts are from the public GAMA-IV catalogues (DR4 Driver et al. Reference Driver2022). The other catalogues (target and MAGPHYS) are part of the GAMA DR2 release and available through the GAMA website.
2.1 GAMA-KiDS Galaxy Zoo
For GAMA, we use three sources: the GAMA survey itself, the KiDS survey for imaging, and finally the catalogue with voting from the Galaxy Zoo.
2.1.1 GAMA
GAMA is a combined spectroscopic and multi-wavelength imaging survey designed to study spatial structure in the nearby ( $z \lt 0.25$ ) Universe on kpc to Mpc scales (see Driver et al. Reference Driver2009, Reference Driver2022, for an overview). The survey, after completion of phase 2 (Liske et al. Reference Liske2015), consists of three equatorial regions each spanning 5 deg in Dec and 12 deg in RA, centred in RA at approximately 9h (G09), 12h (G12) and 14.5h (G15) and two Southern fields, at 02h (G02) and 23h (G23). The three equatorial regions, amounting to a total sky area of 180 deg $^2$ , were selected for this study. For the purpose of visual classification, 49 851 galaxies were selected from the equatorial fields with redshifts $z \lt 0.15$ (see below). The GAMA survey is $ \gt $ 95% redshift complete to $r \lt 19.8$ mag in all three equatorial regions (Driver et al. Reference Driver2022). We use the magphys SED fits data-products (Driver et al. Reference Driver2018) from the third GAMA data-release (DR3, Baldry et al. Reference Baldry2018). The GAMA Galaxy Zoo voting catalogue is slated to be part of the rolling DR4 (Driver et al. Reference Driver2022).
2.1.2 KiDS
The Kilo Degree Survey (KiDS, de Jong et al. Reference de Jong, Verdoes Kleijn, Kuijken and Valentijn2013; de Jong et al. Reference de Jong2015; de Jong et al. Reference de Jong2017; Kuijken et al. Reference Kuijken2019) is an optical wide-field imaging survey with the OmegaCAM camera at the VLT Survey Telescope. It has imaged 1 350 deg $^2$ in four filters (u g r i). The core science driver is mapping the large-scale matter distribution in the Universe, using weak lensing shear and photometric redshift measurements. Further science cases include galaxy evolution, Milky Way structure, detection of high-redshift clusters, and finding rare sources such as strong lenses and quasars. KiDS image quality is typically 0. $^{\prime\prime}$ 7 resolution (for sdss-r) and depths of 23.5, 25, 25.2, and 24.2 magnitude for i, r, g, and u, respectively. This imaging was the input for the Galaxy Zoo citizen science classifications (see also Kelvin et al. Reference Kelvin2018).
2.1.3 KiDS Galaxy Zoo
Information on galaxy morphology is based on the GAMA-KiDS Galaxy Zoo classification (Lintott et al. Reference Lintott2008; Kelvin et al. Reference Kelvin2018). GAMA-KiDS Galaxy Zoo data was initially provided by Lee Kelvin and Steven Bamford (private communication, 2019). Further details of this project may be found in Holwerda et al. (Reference Holwerda2019), Porter-Temple et al. (Reference Porter-Temple2022), and Porter et al. (Reference Porter2023). Briefly, Red-Green-Blue (RGB) cutouts were constructed from KiDS g-band and r-band imaging with the green channel as the mean of these. KiDS cutouts were introduced to the classification pool and mixed in with the ongoing classification efforts. For the Galaxy Zoo classification, 49,851 galaxies were selected from the equatorial fields with redshifts $z \lt 0.15$ . Galaxy Zoo provided a monumental effort with almost 2 million classifications received from over 20 000 unique users over the course of 12 months. This classification has been used by the GAMA team to identify dust lanes in edge-on galaxies (Holwerda et al. Reference Holwerda2019), searches for strong lensing galaxy pairs (Knabel et al. Reference Knabel2020), the morphology of green valley galaxies (Smith, Giroux, & Struck Reference Smith, Giroux and Struck2022a) and void galaxies (Porter et al. Reference Porter2023) and to link star formation properties to the number of spiral arms (Porter-Temple et al. Reference Porter-Temple2022). In this paper we use the visual classifications of disc galaxies from the Galaxy Zoo project; the full decision tree for the GAMA-KiDS Galaxy Zoo project is shown in Fig. 2. When talking about this catalogue, we use KiDS-GZ.
2.2 DESI Galaxy Zoo
Walmsley et al. (Reference Walmsley2023a) present the Galaxy Zoo classifications based on three-colour imaging of the Dark Energy Spectroscopic Instrument (DESI) survey (Dey et al. Reference Dey2019). DESI requires images to target its spectroscopic fibres; these are primarily provided by the DESI Legacy Surveys (DESI-LS). These images are shallower than KiDS and often lower resolution (poorer seeing), which varied from field to field (Dey et al. Reference Dey2019). The images from DESI-LS were converted into Red-Green-Blue (RGB) images for processing in the Galaxy Zoo classification pipeline.
Walmsley et al. (Reference Walmsley2023a) provide user-friendly catalogues for use in future endeavours such as these. There are two catalogues: one with the Galaxy Zoo volunteer classifications across DESI-LS, and one with zoobot predictions. The volunteer classifications were used to train zoobot (Walmsley et al. Reference Walmsley2023b), a deep-learning tool, to predict voting fractions for all DESI-LS sources. This is the catalogue shown in Fig. 1. Because DESI-LS covers a wide area, the number of galaxies with actual voting in the GAMA footprint is small. For this reason, we use the zoobot predictions catalogue to compare directly to the KiDS-GZ voting.
The zoobot (Walmsley et al. Reference Walmsley2023b) is a deep-learning tool that can be trained (and retrained) to predict Galaxy Zoo volunteer voting fractions based on RGB and greyscale images. The code is publicly available here: https://github.com/mwalmsley/zoobot. zoobot is based on the Keras Python function EfficientNetB0, an image classifier using an optimised Convolutional Neural Network structure (Tan & Le Reference Tan and Le2019). The catalogues are publicly available here: https://zenodo.org/records/8360385. Because this is a prediction, not actual voting records for these galaxies, the voting fraction for example never reaches 100% in the DESI-GZ catalogue we are using (see section 4.1 on question T00).
2.3 Catalogue matching
We match the catalogues from KiDS-GZ and the DESI-GZ using their positions (the match_coords_sky algorithm in astropy) within a match radius of two arcseconds (the width of the GAMA spectroscopic fibre). We opt to match only the GAMA equatorial fields (G09, G12, G15) and not G02 which also has Galaxy Zoo voting information because the equatorial fields have the most ancillary catalogues in the GAMA Data-releases (e.g. MAGPHYS SED fits). Previous work using the KiDS-GZ (Holwerda et al. Reference Holwerda2019; Knabel et al. Reference Knabel2020; Knabel et al. Reference Knabel2023; Porter-Temple et al. Reference Porter-Temple2022; Porter et al. Reference Porter2023) all use only the equatorial fields.
Matching the GAMA targeting catalogue and DESI-GZ results in some 35k sources in common. The G02 field has 10k sources in common but these are not used in the rest of the paper. The DESI-GZ catalogue matches to GAMA are released as part of this paper with GAMA CATAID added for future GAMA archival use.
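The positional cross-match described above can be illustrated with a minimal, pure-Python sketch. This is not the astropy routine used for the actual catalogues; it is a brute-force nearest-neighbour search over toy data, and the function names, the tuple layout (identifier, RA, Dec in degrees) and the example coordinates are all assumptions made for illustration. Only the two arcsecond match radius is taken from the text.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (inputs in degrees), haversine form."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0

def match_nearest(cat_a, cat_b, radius_arcsec=2.0):
    """For each (id, ra, dec) in cat_a, keep the nearest cat_b source within the radius."""
    matches = {}
    for id_a, ra_a, dec_a in cat_a:
        best_id, best_sep = None, radius_arcsec
        for id_b, ra_b, dec_b in cat_b:
            sep = angular_sep_arcsec(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_id, best_sep = id_b, sep
        if best_id is not None:
            matches[id_a] = (best_id, best_sep)
    return matches

# Hypothetical toy catalogues: K1 has a counterpart about 1 arcsec away, K2 has none.
kids_gz = [("K1", 135.0000, 0.5000), ("K2", 180.2000, -1.0000)]
desi_gz = [("D1", 135.0003, 0.5000), ("D2", 180.2000, -0.9980)]
matches = match_nearest(kids_gz, desi_gz)
```

The brute-force loop scales as the product of the two catalogue sizes, so for samples of tens of thousands of sources a tree-based matcher, such as the astropy one used here, is the practical choice.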
3. Data quality
3.1 Image quality
The KiDS-GZ classifications are based on RGB images with the green filter constructed from red and blue, as the green filter (sdss-r) had not been observed for all targets yet at the beginning of the Galaxy Zoo iteration. The DESI-LS survey was obtained on a variety of telescopes but with the same filter set as SDSS and KiDS. Survey depths are summarised in Table 1. Spatial resolution varies similarly to depth (0. $^{\prime\prime}$ 6–1. $^{\prime\prime}$ 2) for DESI-LS while KiDS is more consistent with an average PSF of 0. $^{\prime\prime}$ 7. On the whole, the DESI-GZ classifications are based on more complete information, for example, including a green filter, and more area, but these are extrapolated using zoobot. The GAMA-KiDS imaging is higher resolution and deeper in the two filters used.
The physical resolution in the respective surveys depends on the seeing and distance. In previous work (Smith et al. Reference Smith2022b; Porter-Temple et al. Reference Porter-Temple2022; Porter et al. Reference Porter2023; Holwerda et al. Reference Holwerda2022), samples were limited in redshift to keep sensitivity to smaller morphological features the same across the studied sample. In the case of the KiDS-GZ for example, a redshift limit of $z=0.08$ corresponds to a physical resolution of $\sim$ 1 kpc, a key resolution for morphology (Lotz, Primack, & Madau Reference Lotz, Primack and Madau2004).
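As a rough check of the quoted scale, the sketch below converts a seeing of 0.7 arcseconds at $z=0.08$ into a physical size. The cosmology (flat $\Lambda$CDM with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.3$) is an assumption for this example, not a value taken from the text, and the function names are illustrative.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Angular diameter distance in Mpc for an assumed flat LCDM cosmology."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1 + zz) ** 3 + (1 - omega_m))
    dz = z / steps
    # Trapezoidal integration of 1/E(z) from 0 to z
    integral = 0.5 * (inv_e(0.0) + inv_e(z)) * dz
    integral += sum(inv_e(i * dz) for i in range(1, steps)) * dz
    comoving = C_KM_S / h0 * integral
    return comoving / (1 + z)

def physical_scale_kpc(z, seeing_arcsec):
    """Physical size in kpc subtended by the seeing disc at redshift z."""
    theta_rad = math.radians(seeing_arcsec / 3600.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * theta_rad

scale = physical_scale_kpc(0.08, 0.7)  # close to 1 kpc under these assumptions
```

With these assumed parameters the result is just over 1 kpc, consistent with the $\sim$ 1 kpc quoted for the KiDS-GZ redshift limit.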
3.2 Number of classifiers
The histogram of the number of classifiers for each KiDS-GZ galaxy is shown in Fig. 3. The mean number is 36 classifiers for the DESI-GZ training sample. The number of classifiers is lower for KiDS-GZ, leading to possibly larger scatter in the voting fractions. We note that the DESI-GZ catalogue against which KiDS-GZ is compared is the one extrapolated by zoobot.
The trade off between the two Galaxy Zoo iterations is that the KiDS-GZ imaging may be deeper and likely higher resolution, but the DESI-GZ has a higher number of votes going into individual sources of the large training sample which is then generalised to all the DESI-GZ catalogue (Walmsley et al. Reference Walmsley2023a). If the zoobot classifier is trained well on the DESI-GZ, the accuracy should be equal to or surpassing the lower number of classifiers on the deeper KiDS data.
4. Voting comparison
Fig. 2 shows the flowchart of questions for the GAMA-KiDS classifications. We will compare the fraction of votes for each question.
Fig. 4 shows two KiDS images of ETGs examined by KiDS-GZ classifiers, and histograms of probabilities (multiples of vote fractions within relevant questions) for five endpoints within the flowchart in Fig. 2 (Glass Reference Glass2024). The chosen endpoints, Smooth (T00:A0), Edge-On (T00:A1 $ \times $ T01:A0), Spiral (T00:A1 $ \times $ T01:A1 $ \times $ T03:A0), No_Spiral (T00:A1 $ \times $ T01:A1 $ \times $ T03:A1) and Star/Artifact (T00:A2), form a complete set horizontally across Question T00 and probabilities therefore sum to 1. The threshold for morphology selection was the dominant probability above 0.4, to ensure a leading selection with only one other choice close behind, as is the case for GAMA64646.
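The endpoint probabilities described above are products of vote fractions along each branch of the flowchart. A minimal sketch follows; the answer labels (T00:A0 and so on) and the 0.4 threshold follow the text, while the function names and the example vote fractions are hypothetical.

```python
def endpoint_probabilities(f):
    """Multiply vote fractions along each branch; f maps 'Txx:Ay' to a fraction."""
    featured = f["T00:A1"]
    not_edge_on = f["T01:A1"]
    return {
        "Smooth": f["T00:A0"],
        "Edge-On": featured * f["T01:A0"],
        "Spiral": featured * not_edge_on * f["T03:A0"],
        "No_Spiral": featured * not_edge_on * f["T03:A1"],
        "Star/Artifact": f["T00:A2"],
    }

def dominant_morphology(probs, threshold=0.4):
    """Return the leading endpoint if its probability exceeds the threshold, else None."""
    label = max(probs, key=probs.get)
    return label if probs[label] > threshold else None

# Hypothetical vote fractions for a single galaxy
fractions = {
    "T00:A0": 0.50, "T00:A1": 0.45, "T00:A2": 0.05,
    "T01:A0": 0.20, "T01:A1": 0.80,
    "T03:A0": 0.25, "T03:A1": 0.75,
}
probs = endpoint_probabilities(fractions)
label = dominant_morphology(probs)
```

Because the fractions within each question sum to one, the five endpoint probabilities also sum to one, which is the completeness property noted above.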
T01-T03 are binary choices, allowing for easier comparisons using either fraction. In the case of multiple options, the voting fraction for each needs to be compared. We note that DESI-GZ and KiDS-GZ had a different number of options for T02 and T04, where the DESI-GZ questionnaire had more options. In these cases, we compare the same worded answer, with much of the difference due to a difference in choice. We compare voting fractions since the number of votes is different for KiDS-GZ and DESI-GZ as the retirement criterion – the point in voting where the image was not shown to new classifiers – was set differently between these. Most usage focuses on voting fractions to identify features, not absolute or calibrated numbers of votes.
4.1 T00: Is the galaxy in the centre of the image simply smooth and rounded, or does it have features?
This is the first question encountered to separate those galaxies that are mostly featureless (elliptical/spheroidal) and artefacts from those with a lot of substructures (Fig. 2). This is a key question as the voter is not shown the remaining detailed questions if they do not mark the objects as ‘having features’.
Unlike in KiDS-GZ, the voting fractions in DESI-GZ do not reach 100%. This is an artefact of the zoobot predictions: in the volunteer Galaxy Zoo voting on DESI-LS, 100% fractions do occur.
Fig. 5 shows the comparison between the KiDS-GZ and the DESI-GZ zoobot predicted voting fractions for the galaxies in common. This is one of the questions where we have answers for all the galaxies. T00 is the first question to be answered and thus always presented to all classifiers. A notable behaviour is that the fraction of positive votes for smooth in the DESI-GZ voting is typically higher than the KiDS-GZ one. This means Galaxy Zoo volunteers voted for galaxies to be smooth more often in DESI-GZ than in GAMA-KiDS. This DESI-GZ voting is then reflected in the zoobot training sample. The difference is likely due to the greater depth and/or higher resolution of the GAMA-KiDS images, in which volunteers are able to identify more galaxies with structure of some kind.
It also highlights a difference the zoobot prediction makes on a voting fraction: there are no 100% voting fractions in the DESI-GZ. There are some in the KiDS-GZ but experience has shown these often to be the result of a single vote, especially for questions that depend on a previous choice (T01–T08). We leave the 100% fractions in our KiDS-GZ catalogue for now but have often rejected these in analyses for this reason. An alternative approach is to renormalise these questions by the number of all volunteers who considered the object, not the number that answered the question.
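The effect of the alternative normalisation can be seen in a small sketch. The function name and the vote counts are hypothetical; the point is only that a single vote on a follow-up question yields a raw fraction of 100% but a small fraction once divided by everyone who classified the object.

```python
def vote_fractions(votes_for_answer, n_answered, n_classified):
    """Return (raw, renormalised) fractions for one answer of a follow-up question.

    raw          : votes divided by the number who answered this question
    renormalised : votes divided by the number who classified the object at all
    """
    if n_answered == 0 or n_classified == 0:
        return None, None
    return votes_for_answer / n_answered, votes_for_answer / n_classified

# Hypothetical case: 30 volunteers saw the galaxy, only one reached this question.
raw, renorm = vote_fractions(votes_for_answer=1, n_answered=1, n_classified=30)
```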
4.2 T01: Could this be a disc viewed edge-on?
Rather than an axis ratio, this is a direct question of whether this is possibly an edge-on disc galaxy. As a binary question (only two answers possible; yes or no), we only need to examine one voting fraction because the other answer’s listed fraction is the complement of the first (the two sum to one). This question immediately follows voting in favour of this galaxy having ‘features’. The comparison sample is therefore smaller because a voting fraction is only recorded if someone in both groups of volunteers has voted on this galaxy to have features. There are two populations visible in Fig. 6: galaxies with a high fraction in both DESI-GZ zoobot predictions and KiDS-GZ in favour of the edge-on perspective, and a larger group with a small fraction ( $f \lt 0.2$ ) in favour of the edge-on perspective.
The voter fractions in Fig. 6 agree quite well in case of the ‘edge-on’ question. This might make a sample of edge-on galaxies selected by this plot quite robust.
4.3 T02: Is there any sign of a bar feature through the centre of the galaxy?
This is a y/n question in the KiDS-GZ iteration, but one that was expanded to three options in DESI-GZ; none/weak/strong options for the bar. The fractions for ‘no-bar’ are compared in Fig. 7. This is the only option common to both surveys. There is reasonable agreement between the two GZ iterations but the change in options could mean that the fraction for no-bar in DESI is lower as the option for ‘weak bar’ is now available.
4.4 T03: Is there any sign of a spiral arm pattern?
The third and last binary (y/n) question. The fractions of voting are shown in Fig. 8. This is a question only answered after T00 and T01, and therefore the comparison sample is once again smaller. There is a good agreement for high fractions of voting in both DESI-GZ zoobot predictions and KiDS-GZ.
Glass et al. (2024, in preparation. Glass Reference Glass2024) showed that KiDS-GZ vote fractions can be used successfully to identify and remove weak spiral galaxies in samples of smooth early-type galaxies derived from GAMA classifications (Kelvin et al. Reference Kelvin2014; Moffett et al. Reference Moffett2016). The GAMA classifications are based on SDSS imaging (catalogue VisualMorphologyv03), at lower resolution and depth than KiDS. For disc-like early-type galaxies (ellipticals) in their sample, 19% were found to have weak spiral features using KiDS-GZ.
4.5 T04: How prominent is the central bulge, compared with the rest of the galaxy?
This is a question with three options in the KiDS-GZ but five options in the DESI-GZ and the subsequent zoobot predictions. In principle, it could be reduced to a binary one (evidence of a bulge y/n?) similar to T07.
Fig. 9 shows the ‘no bulge’ voting fractions for KiDS-GZ and DESI-GZ. This is the answer both Galaxy Zoo questions have in common. The agreement with other answers for this question is quite poor. The most likely explanation is that the DESI-GZ offered more options to classify. Many of the prominent bulges would have gotten votes for ‘large’ bulge instead. Other possible explanations include the difference in depth between DESI-GZ and KiDS-GZ and that the RGB images were constructed differently in each survey. The fact that ‘no bulge’ is consistent is encouraging.
4.6 T05: How tightly wound do the spiral arms appear?
This question is only presented if the Galaxy Zoo volunteer answers affirmative to T03. The three answers are not easily reduced to a binary question. Fig. 10 shows the voting fractions for the ‘tightly wound’ answer compared between KiDS-GZ and DESI-GZ zoobot predictions.
The reasonably good agreement between the answer fractions for KiDS-GZ and DESI-GZ zoobot predictions is reassuring that once spiral arm features are identified in either survey, there is good agreement on their appearance.
4.7 T06: How many spiral arms are there?
This is the question on which Porter-Temple et al. (Reference Porter-Temple2022) focused for their study of star formation and stellar mass properties. The option ‘more than 4’ for the number of spiral arms is functionally a vote for a flocculent spiral. In principle, this question could be reduced to a binary one of ‘grand design spiral’ or ‘flocculent’ but this has not been used. Porter-Temple et al. (Reference Porter-Temple2022) found the vast majority of objects in their sample ( $\rm log(M_*/M_\odot) \gt 9$ , $z \lt 0.08$ , for completeness and resolution reasons) to be 2-armed spirals, with much smaller but statistically significant numbers in the other categories.
Fig. 11 shows the voting fractions for KiDS-GZ and DESI-GZ zoobot predictions for all four distinct answers in question T06. The best agreement is for 2-armed spirals (the most common kind). One-armed spirals are agreed upon with a low fraction, and a similar but more diffuse version of that pattern repeats for 3-armed spirals. There is a noticeable trend with both 3- and 4-armed spirals where there is a higher voting fraction in favour of these in the KiDS-GZ compared to the DESI-GZ zoobot predictions. There is an additional answer possible in DESI-GZ (‘cannot tell’) which was not an option in KiDS-GZ. This all suggests the threshold for identifying a larger number of arms may have to be set to a lower fraction in the DESI-GZ zoobot predictions. There are different ways to ensure a galaxy has a certain number of spiral arms. One could require a simple majority ( $ \gt $ 20% for a given option within the five options in T06) or an overwhelming consensus ( $ \gt $ 50%), depending on how certain one wants to be of the selected sample. The former can still be a close call (all options have almost 1/5th of the vote), while the latter is unambiguous but has lower statistics. This question is relevant for any comparison to Porter-Temple et al. (Reference Porter-Temple2022) or Hart et al. (Reference Hart2017): ‘how well do volunteers agree on the number of spiral arms in a galaxy’ (see Section 6).
4.8 T07: Does the galaxy have a bulge at its centre?
This question is only answered if T01 is positive (view is edge-on). Three answers are possible (Fig. 2). In principle, this question can be reduced to a binary one (is there a bulge y/n?) by combining the voting of the first two options (‘boxy’ and ‘round’). Fig. 12 shows the fraction of votes in favour of ‘no bulge’. Generally speaking, the two Galaxy Zoo iterations, KiDS-GZ and DESI-GZ zoobot predictions are in broad agreement but with large scatter.
4.9 T08: How rounded is it?
This is the only dedicated question if the volunteer answers ‘Smooth’ in T00. This is exclusively for elliptical/spheroidal galaxies. Fig. 13 shows the voting fractions for all three options (round, in-between, and cigar) compared between the KiDS-GZ and DESI-GZ zoobot predictions. There is general agreement but with substantial scatter, to be expected for a slightly relative or subjective question.
4.10 T09: Is the galaxy currently merging or is there any sign of tidal debris?
This is a question with four possible answers but could be reduced to a single ‘signs of interaction y/n?’ by combining the first three answers to compare against the ‘none’ voting fraction. This was the usage in Porter et al. (Reference Porter2023) for void galaxies. The middle options between ‘merging’ and ‘none’ in this question are the only change between KiDS-GZ and DESI-GZ zoobot predictions. The middle options are ‘tidal debris’ and ‘both’ (meaning the galaxies show both signs of merging and tidal debris) for the KiDS-GZ and the middle options in DESI-GZ zoobot predictions are ‘minor’ and ‘major’ indicating the relative ratio of the galaxies in minor/major interaction.
Fig. 14 shows the comparison between the two answers the GZ iterations have in completely common (i.e. the answer is phrased the same). This is a question that is asked for all galaxies, regardless of T01 so the comparison has more statistics. Generally, there is a reasonable agreement (high fraction of no-merger in both iterations) but especially at lower fractions (more ambivalence), the scatter is higher. There is a better agreement on no-merger than on merging since the other options could draw votes away depending on the image depth (i.e. a tidal feature is visible in KiDS but not in DESI, either in the volunteer voting or the zoobot predictions).
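The reduction of T09 to a single ‘signs of interaction y/n?’ fraction, as described above, amounts to summing every answer except ‘none’. The sketch below applies this to both sets of answer labels; the function name and the vote fractions are hypothetical.

```python
def interaction_fraction(fractions, none_key="none"):
    """Sum the vote fractions of every T09 answer other than 'none'."""
    return sum(value for key, value in fractions.items() if key != none_key)

# Hypothetical vote fractions using the answer labels of each iteration
kids_t09 = {"merging": 0.10, "tidal debris": 0.15, "both": 0.05, "none": 0.70}
desi_t09 = {"merging": 0.05, "minor": 0.10, "major": 0.05, "none": 0.80}

f_kids = interaction_fraction(kids_t09)
f_desi = interaction_fraction(desi_t09)
```

Because the combined fraction is simply one minus the ‘none’ fraction, it is comparable between the two iterations even though the middle options differ.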
4.11 T10: Do you see any of these odd features in the image?
The final question is unique in that the answers are not mutually exclusive and one could vote for more than one of these features. For example, one could see both an overlapping pair of galaxies and a prominent dust lane. It is uncertain whether every volunteer realised that multiple answers are allowed.
This question was not included in the data-release by Walmsley et al. (Reference Walmsley2023a), and we do not include the comparison here. The question was undoubtedly asked but it would be difficult to inter-compare with likely low statistics as these are relatively infrequently occurring phenomena. zoobot predictions for these are difficult for the same reasons.
4.12 Correlation metrics
Figs. 5 through 14 include the Pearson ( $R_p$ ) and Spearman ( $R_s$ ) correlation coefficients (rankings). These are summarised in Table 3 including the p-values returned with each test. The Pearson ranking is an indication of how linear the relation between the two voting fractions is. The Spearman one is a ranking for a monotonic, but importantly not necessarily linear, relation between the two voting fractions.
Most of the voting between KiDS-GZ and DESI-GZ zoobot predictions is highly correlated with rankings well above 0.8. The highest agreement is on whether this disc can be viewed edge-on (T01). This is reflected in Fig. 6 with clusters at 0 and 1.
The lower correlations are often for questions where either there were more options in one of the Galaxy Zoo iterations (e.g. T04) or a suspected dependence on surface brightness (e.g. T03), or both.
For the number of spiral arms (T06), a subtle difference in surface brightness may make a difference. The agreement is the strongest for two arms, where the statistics are the highest. The agreement on mergers (T09) is surprisingly good because it is dominated by the answer that there is no evidence for an ongoing or past merger (Fig. 14).
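The difference between the two metrics can be illustrated with a minimal pure-Python sketch. It does not compute p-values and is not the code used for Table 3; the function names and the toy vote fractions are assumptions. The toy relation is monotonic but not linear, so the Spearman value is exactly one while the Pearson value falls below one.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: the Pearson coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical vote fractions: the second set is a monotonic, non-linear function of the first
kids_f = [0.1, 0.2, 0.4, 0.6, 0.9]
desi_f = [v ** 2 for v in kids_f]
r_p = pearson(kids_f, desi_f)
r_s = spearman(kids_f, desi_f)
```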
5. Other GAMA visual classifications
Previous visual classifications of the GAMA galaxies include those by the GAMA team (Driver et al. Reference Driver2022) and low-redshift quasar hosts (Stone et al. Reference Stone2023). These are visual classifications of the galaxy as a whole (Table 4). We compare the KiDS-GZ voting against these expert visual classifications.
Fig. 15 shows the overlap (58 galaxies out of 205) with the DESI-GZ sample and the one from Stone et al. (Reference Stone2023) for quasar host galaxies and question T00, the most numerous and relevant one for the morphology of the galaxy as a whole. Early type galaxies (E or S0) show a low fraction of ‘features’ votes, later types, that is, disc-dominated classes have higher fractions of votes in favour of ‘features’. This makes the Galaxy Zoo classifications consistent with the expert visual assessment in Stone et al. (Reference Stone2023) of quasar host galaxies.
There is a larger sample of overlap between the visual classifications by Driver et al. (Reference Driver2022) and the KiDS-GZ catalogue. The classifications by Driver et al. (Reference Driver2022) focus on the prominence of the bulge with respect to the galaxy as a whole. Both T00 and T04 are therefore good comparison questions (Table 2, Fig. 2).
Fig. 16 shows the distribution of voting for the first question ‘smooth or featured?’. Ellipticals have the lowest voting fractions, followed by disc-dominated classes (cBD, dBD and D). Both cBD and dBD have a high voting fraction for ‘features’, higher than disc alone (D).
Fig. 17 shows the voting for T04 (‘bulge prominence?’) for the answer ‘no bulge’. The highest fraction is the ‘pure disc’ (D), which is consistent with the KiDS-GZ vote. The diffuse bulge (dBD) is next, followed by the concentrated bulge (cBD). Ellipticals are a small fraction of the galaxies in this question as most have been filtered out by T00.
Overall the expert classifications and the KiDS-GZ classifications agree well. Figs. 16 and 17 can serve as a possible translation between Galaxy Zoo voting and expert classes (e.g. $0.2 \lt f_{T00} \lt 0.7$ and $f_{T04} \gt 0.5$ would select a fairly clean disc-only sample from the Galaxy Zoo voting).
6. Comparison to Porter-Temple+ (2022)
Using the KiDS-GZ, Porter-Temple et al. (Reference Porter-Temple2022) examined the dependence of stellar mass, star formation rate, and specific star formation rate with the number of spiral arms. They adopted a conservative approach in the identification of spiral arms by limiting the redshift to $z \lt 0.08$ , adopting a lower limit of $log(M_*/M_\odot) = 9$ , and setting a relatively high threshold for a galaxy to be classified with one, two, three, four, or five and more spiral arms ( $f \gt 0.5$ ).
Their selection criterion is shown as the red line box in Fig. 18. The KiDS-GZ classifications were limited intentionally to $z=0.15$ but we can see from the GAMA galaxies with DESI-GZ zoobot predictions in Fig. 18 that this limit is not enforced for the DESI-GZ zoobot predictions. This is not the reason there is a much higher fraction in voting for smooth galaxies in DESI-GZ zoobot predictions (Fig. 5) because that sample is limited to $z=0.15$ by the crossmatch with KiDS-GZ. The images on which the DESI-GZ zoobot predictions are based are shallower (Table 1) and thus more prone to miss lower surface brightness features (spiral arms, tidal arms, etc.).
Voting in favour of T06 options other than 2-arms shows slightly lower fractions for the same galaxies compared to the KiDS-GZ (Fig. 11, reflected in lower rankings as well). Therefore, we adopt slightly less stringent criteria to classify a galaxy with a certain number of spiral arms: we require $f_{disc}(T00) \gt 0.3$ and $f_{n-arm}(T06) \gt 0.2$ with n the number of spiral arms. We also require the redshift to be $z \lt 0.08$ , as the DESI imaging is not higher resolution than KiDS, and a minimum stellar mass of $log_{10}(M_*/M_\odot) \gt 9$ . The lower voting fraction than Porter-Temple et al. (Reference Porter-Temple2022) for a choice of n arms is needed because otherwise the statistics for any number other than $n=2$ would be too low for a comparison. We note that 5+ still suffered from too low numbers to be included in the plot.
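The selection criteria above can be written as a simple filter. The thresholds are those stated in the text; the record layout, field names and example galaxies are hypothetical. With a threshold of 0.2 a galaxy can pass for more than one value of n, and since the text does not say how such cases are resolved, this sketch simply returns every arm class that passes.

```python
from dataclasses import dataclass

@dataclass
class Galaxy:
    cataid: int       # GAMA CATAID (values below are made up)
    z: float          # spectroscopic redshift
    log_mstar: float  # log10 of the stellar mass in solar masses
    f_disc: float     # T00 vote fraction for 'features or disc'
    f_arms: dict      # T06 vote fractions keyed by arm-count label

def arm_classes(g, f_disc_min=0.3, f_arm_min=0.2, z_max=0.08, log_mstar_min=9.0):
    """Return every T06 arm-count label for which the galaxy passes all cuts."""
    if not (g.z < z_max and g.log_mstar > log_mstar_min and g.f_disc > f_disc_min):
        return []
    return [n for n, f in g.f_arms.items() if f > f_arm_min]

# Hypothetical galaxies
g1 = Galaxy(1, 0.05, 10.2, 0.8, {"1": 0.05, "2": 0.70, "3": 0.15, "4": 0.05, "5+": 0.05})
g2 = Galaxy(2, 0.12, 10.5, 0.9, {"1": 0.02, "2": 0.90, "3": 0.04, "4": 0.02, "5+": 0.02})
g3 = Galaxy(3, 0.06, 9.5, 0.6, {"1": 0.10, "2": 0.45, "3": 0.30, "4": 0.10, "5+": 0.05})
selected = {g.cataid: arm_classes(g) for g in (g1, g2, g3)}
```

Here the second galaxy is rejected by the redshift cut, and the third passes for both two and three arms, which illustrates the ambiguity that a lower threshold introduces.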
Fig. 19 shows the distribution of stellar masses for $n=$ 1, 2, 3, or 4 spiral arms. The $n=$ 5+ category did not receive enough votes for a statistically significant result. We see a rise in stellar mass with the number of spiral arms similar to that found by Porter-Temple et al. (Reference Porter-Temple2022; compare their Fig. 4).
Fig. 20 shows the star formation rate of galaxies with $n=$ 1, 2, 3, or 4 arms. Similar to Porter-Temple et al. (Reference Porter-Temple2022, their Fig. 6), we see a rise in the star formation rate with the number of spiral arms, similar to the increase with mass.
Fig. 21 shows the specific star formation rate ( $SFR/M_*$ ) of galaxies with $n=$ 1, 2, 3, or 4 arms. We see a flat or slightly declining specific star formation rate with the number of spiral arms, very similar to the subtle decline observed by Porter-Temple et al. (Reference Porter-Temple2022) in their Fig. 8.
Stellar mass, star formation rate, and specific star formation rate can be compared by setting the distribution of values for galaxies with a certain number of arms (m) against that of the population as a whole. The similarity can be tested with the Kolmogorov-Smirnov (K-S) test, whose statistic is the greatest difference between the two cumulative distributions; 0 means no difference, and 1 means completely different distributions. The K-S test values are listed in Table 5 with the p-value in brackets. The differences between the distributions are not very large and show the same trends as found by Porter-Temple et al. (Reference Porter-Temple2022). The two-armed spirals, being the most numerous, resemble the population at large the most and have the lowest K-S value, and the K-S values increase away from two arms. We conclude that with accurate inferred parameters (stellar mass, star formation rate, and specific star formation rate) one can accurately reproduce the experiment of Porter-Temple et al. (Reference Porter-Temple2022).
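To make the K-S statistic described above concrete, the following is a minimal pure-Python sketch of the two-sample statistic, that is, the largest absolute difference between two empirical cumulative distributions. It is an illustration under stated assumptions and not the implementation used for Table 5: the function name `ks_statistic` is our own, the subsample and the full population are treated as two separate samples, and no p-value is computed. In practice, a library routine such as `scipy.stats.ks_2samp` returns both the statistic and the p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are step functions, so the largest difference
    # occurs at one of the observed data values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up numbers, for illustration only.
print(ks_statistic([1, 2, 3], [1, 2, 3]))        # 0.0 (identical samples)
print(ks_statistic([1, 2, 3], [4, 5, 6]))        # 1.0 (no overlap)
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

The three calls show the two limiting cases quoted in the text (0 for identical distributions, 1 for completely separated ones) and an intermediate case.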
7. Discussion
The two Galaxy Zoo iterations agree reasonably well with each other despite the different imaging data that went into them; the classification questions differ only slightly. The same galaxies were observed in the same filters (gri), but on different telescopes, under different seeing conditions, to different depths, with a different approach to the generation of an RGB colour image, based on different numbers of classifiers, and, in the case of DESI, extrapolated by zoobot.
The correlation between answers (Table 3) shows very good (linear) agreement between the voting fractions for most of the features. This adds to the confidence that these features are present in these galaxies, independent of the origin of the voting. There is excellent agreement between KiDS-GZ and DESI-GZ zoobot predictions.
The questions of whether a volunteer sees features (T00) and whether they see spiral structure (T03) overlap somewhat; one would need to see features or disc structure to see spiral arms at all. It is therefore illustrative to compare the difference in voting fractions between the DESI/zoobot and KiDS-based Galaxy Zoo. We do so in Fig. 22. There is a clear shift of voting in the KiDS-based Galaxy Zoo towards galaxies with features in T00 but, once features have been found, the voting is mostly balanced around a difference of 0 for T03. In the DESI zoobot predictions, more galaxies are identified as ‘smooth’ but, once features are identified, the result is similar to the KiDS one. This is not distance dependent and is most likely the result of the shallower depth of DESI compared to KiDS influencing the final zoobot predictions.
There is a lot of scatter in the fractions of votes. For individual galaxies, there may be room for interpretation, a well-known effect even among expert classifiers (cf. the discussion in Nair & Abraham Reference Nair and Abraham2010a; Nair & Abraham Reference Nair and Abraham2010b). For statistical uses, however, the two catalogues agree well with each other. Apart from perhaps removing unity values in the voting fraction, not much more correction is needed in KiDS-GZ.
We check the KiDS-GZ voting against expert visual classifications by Stone et al. (Reference Stone2023) and Driver et al. (Reference Driver2022). Both agree in broad terms with the voting in the Galaxy Zoo catalogue. The distributions of voting fractions for T00 and T04 agree with the categories assigned by experts; for example, pure disc galaxies have a high voting fraction for no bulge and early types have a low voting fraction for no features. The broad agreement is another validation of the utility of this Galaxy Zoo catalogue for future uses.
It is heartening to see that previous results by Porter-Temple et al. (Reference Porter-Temple2022) are recovered here. The DESI-GZ zoobot predictions include higher redshift galaxies and are based on shallower imaging data; consistency with previous results thus strengthens their use case. Similarly, the KiDS-GZ voting fractions were not calibrated or de-biased but the higher thresholds compensate for that. Results like those in Figs. 19–21 are only possible when the accuracy on both the x-axis, that is, the certainty in morphological classification, and the y-axis, that is, the inferred galaxy property, is of equally good quality; the latter is thanks to the multiwavelength photometry (Wright et al. Reference Wright2016). The voting used by Hart et al. (Reference Hart2017), from an earlier iteration of Galaxy Zoo, was highly accurate, but the accuracy of their star formation measure did not quite match that of their Galaxy Zoo classifications, smoothing out the relation between arms and star formation rate. The combination of voting and SED accuracy allowed Porter-Temple et al. (Reference Porter-Temple2022) to improve on the Hart et al. (Reference Hart2017) result. For similar reasons, we caution against the use of Galaxy Zoo questions on morphological details ( $ \lt $ kpc in size) for redshifts over $z=0.1$ for either KiDS or DESI, as at these redshifts the $0.^{\prime\prime}7$ spatial resolution corresponds to more than a kpc.
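The resolution argument above can be checked with a short calculation of the physical scale subtended by a 0.7 arcsecond resolution element. The sketch below is illustrative only: it assumes a flat $\Lambda$CDM cosmology with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.3$ , values that are our assumption and are not taken from this paper, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458          # speed of light in km/s
H0 = 70.0                    # ASSUMED Hubble constant in km/s/Mpc
OMEGA_M = 0.3                # ASSUMED matter density (flat universe)
ARCSEC_PER_RAD = 206264.806  # arcseconds per radian


def angular_diameter_distance_mpc(z, steps=1000):
    """Angular diameter distance in Mpc for flat Lambda-CDM (trapezoid rule)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + (1.0 - OMEGA_M))

    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * dz) for i in range(1, steps))
    comoving_mpc = (C_KM_S / H0) * integral * dz
    return comoving_mpc / (1.0 + z)


def kpc_per_resolution_element(z, resolution_arcsec=0.7):
    """Physical size in kpc subtended by the resolution element at redshift z."""
    theta_rad = resolution_arcsec / ARCSEC_PER_RAD
    return angular_diameter_distance_mpc(z) * 1000.0 * theta_rad


for z in (0.05, 0.08, 0.1, 0.15):
    print(f"z = {z:.2f}: {kpc_per_resolution_element(z):.2f} kpc")
```

Under these assumed parameters, the resolution element is smaller than a kiloparsec at $z=0.05$ and slightly larger than a kiloparsec at $z=0.1$ , consistent with the caution expressed above.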
8. Conclusions
In this paper, we directly compared two different iterations of the Galaxy Zoo morphology classification based on two different imaging surveys in the three GAMA equatorial fields. The images on which the classifications are based differ in depth, construction of the RGB image, resolution, and target redshift range. The DESI-GZ catalogue is the result of zoobot predictions trained on classifications from all of DESI-LS.
We found that for individual galaxies, the voting fractions can often be quite different (several tens of percent; see Figs. 5–14). However, by and large the voting between both iterations agrees, especially for the populations at large.
Reproducing the analysis of Porter-Temple et al. (Reference Porter-Temple2022) with the DESI-GZ catalogue, we find the same trends as they did. With similar constraints on redshift, the DESI-GZ catalogue is suitable for similar work on morphological details.
We note that the DESI-GZ zoobot predictions have a higher fraction of ‘smooth’ classifications for galaxies that receive more ‘disc or features’ votes in T00 in KiDS-GZ. This is likely a combination of distance and the depth of the DESI imaging, which hide lower surface brightness features such as spiral arms and the discs of galaxies, and an effect of weighting in the zoobot classifications.
Acknowledgement
The data in this paper are the result of the efforts of the Galaxy Zoo volunteers, without whom none of this work would be possible. Their efforts are individually acknowledged at http://authors.galaxyzoo.org.
This research made use of Astropy, a community-developed core Python package for Astronomy (Astropy Collaboration et al. 2013, 2018).
Data Availability
The two catalogues will be made public on the GAMA website.
Appendix A. Catalogue Descriptions
Here we describe the GAMA KiDS-based and DESI-based catalogues that accompany this paper.
Tables A1 and A2 list the entries in the Galaxy Zoo KiDS classification catalogue. These include a CATAID to identify the GAMA source, and the totals and fractions of votes on these objects.
Table A3 is the full listing of entries in the DESI Galaxy Zoo catalogue as described in Walmsley et al. (Reference Walmsley2023a). These include the GAMA CATAID and the right ascension and declination used to match with the entries in the DESI catalogue. This catalogue contains only fractions of votes, as these are predicted by zoobot.