
Nuclear SSR-based genetic diversity and STRUCTURE analysis of Greek tomato landraces and the Greek Tomato Database (GTD)

Published online by Cambridge University Press:  28 February 2024

Androniki C. Bibi*
Affiliation:
Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece Department of Biology, University of Crete, Heraklion, Greece
John Marountas
Affiliation:
Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece
Konstantina Katsarou
Affiliation:
Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece Department of Biology, University of Crete, Heraklion, Greece
Anastasios Kollias
Affiliation:
Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece
Pavlos Pavlidis
Affiliation:
Department of Biology, University of Crete, Heraklion, Greece Institute of Computer Science (ICS), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
Eleni Goumenaki
Affiliation:
Department of Agriculture, Hellenic Mediterranean University, P.O. Box 1939, GR 71004, Heraklion, Crete, Greece
Dimitris Kafetzopoulos
Affiliation:
Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece
*Corresponding author: Androniki C. Bibi; Email: androniki_bibi@imbb.forth.gr

Abstract

Tomato has been cultivated in Greece for more than 200 years, even though it is not native to the country. With a favourable environment all year round, Greece has become a major competitor in tomato production in Europe. However, there is an increasing demand to improve the tomato crop so that it withstands harsh environmental conditions (extreme temperatures, salinity, etc.) and yields high-quality final products. Considerable effort has been devoted to crop improvement through phenotypic screening, resulting in a large number of tomato landraces. The main objective of this study was to clarify the relationships among local tomato landraces and hybrids using simple sequence repeats (SSRs), the most widely preferred molecular markers. Twenty-seven tomato landraces and two tomato hybrids cultivated in Crete, Greece, were genotyped with eleven SSR markers, and the germplasm was subjected to STRUCTURE analysis. A neighbour-joining dendrogram of the 27 landraces and the two hybrids was produced. Using Evanno's method, the STRUCTURE analysis indicated that nine ancestral populations are hidden within the group of genotypes tested. The final objective was to make these data publicly available through the first Greek relational database, the Greek Tomato Database (GTD). The GTD was developed to allow users to update and enrich the database with new and supplemental information. This work provides the first molecular fingerprint of these 27 Greek landraces, documented together with phenotypic information in the GTD.

Type
Research Article
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press on behalf of National Institute of Agricultural Botany

Introduction

In the twentieth century, tomato (Solanum lycopersicum L.) became the second most produced vegetable crop after potato (FAOSTAT 2023). Tomato belongs to the Solanaceae family and is generally considered a vegetable, although it is sometimes referred to as a fruit (Foolad, 2007). Tomato originated in South America and was probably domesticated in Mexico. It was brought to Europe by the Spanish in the 1540s and grew particularly well in Mediterranean climates. Tomato was consumed in Spain from the early 17th century (López-Terrada, 2014). At first, tomato was treated in Europe as an ‘ornamental’ and was known as the ‘golden apple’, ‘love apple’ or ‘Peruvian apple’ (Jones, 2007). Tomato was introduced to Greece in the eighteenth century. Currently, Greece ranks sixth in total tomato production in the European Union, after Spain, Italy, Portugal, Poland and the Netherlands, with 853.2 thousand tonnes (European Commission - DG Agri G2, 2021). In addition, Crete is the leading region of Greece for greenhouse production, followed by Peloponnese, Macedonia, Thessaly, Central Greece, Epirus and the Aegean Islands (Savvas et al., 2016). To meet the increasing demand for tomatoes, breeding is needed to enhance yield and improve tolerance to various stresses.

Breeding efficiency in tomato has been improved by using molecular markers to identify and transfer important alleles from germplasm to elite cultivars (Foolad, 2007). However, sufficiently polymorphic markers are lacking between closely related tomato species and within cultivars, because the majority of molecular markers were developed from polymorphisms between domesticated tomato and its wild relatives (Tanksley et al., 1992; Fulton et al., 1997; Frary et al., 2005; Sadiyah et al., 2020). Simple sequence repeat (SSR) markers, when available, are often the preferred molecular markers for marker-assisted plant breeding because they have properties suited to high-throughput genotyping, such as high reproducibility, codominant inheritance, multi-allelic variation, simple assays, low cost and easy automation (Edwards and McCouch, 2007; Vieira et al., 2016; Bhattarai et al., 2021).

In recent years, the majority of available tomato seeds have been hybrids, which are costly and sometimes lack qualities such as flavour and aroma. Hybrid seed also requires growers to buy seed every year to achieve consistent production (Mavromatis et al., 2013). In contrast to modern hybrids, tomato landraces are very heterogeneous because they have been selected for their performance in adverse, low-input agricultural environments, as well as for qualitative criteria such as aroma and flavour. Tomato is mainly propagated by seed, which enables the genetic material to be preserved over the years. Many individual and organized efforts have been made to collect seeds of different vegetable crops, including tomato. These seeds are either inferred varieties, landraces or populations of vegetables, and they are collected, preserved and cultivated organically by farmers.

Databases are a valuable tool for cataloguing and analysing biological specimens. Biological databases provide a deep knowledge store in which scientists can preserve genomic and phenotypic data (Whitehornand and Marklyn, 2001). Databases also improve understanding of the relationships between available data and give new insights into the relationships between traits, supporting better decision-making and the identification of new opportunities (Codd, 1971). Databases of this type exist for other plants. For instance, Vitis vinifera L., a plant that has been widely investigated, has multiple available databases: a European Vitis Database (http://www.eu-vitis.de/index.php), an Italian (https://vitisdb.it), a Swiss (http://www1.unine.ch/svmd), a French (http://plantgrape.plantnet-project.org/it/cepages) and a Greek Vitis Database (http://www.biology.uoc.gr/gvd). Databases also exist for spinach (http://spinachbase.org), olive (http://www.bioinfo-cbs.org/ogdd/) and rice (http://server.malab.cn/Ricyer/index.html). Existing databases for Solanum spp. include the Tomato Functional Genomics Database (TFGD) (http://ted.bti.cornell.edu/) and the Kazusa Tomato Genomics Database (https://www.kazusa.or.jp/tomato/).

Even though tomato is not native to Greece, significant effort has been devoted to documenting and understanding the wide range of phenotypic and genetic variability in Greek tomato cultivars using molecular markers (Terzopoulos and Bebeli, 2008; Gonias et al., 2019; Athinodorou et al., 2021). However, although Greece is considered an important producer, the data from previous studies are not widely available. The objective of this study was to perform molecular fingerprinting with SSR markers of 27 landraces and two hybrids cultivated in Crete, Greece. A population structure analysis was performed to delve deeper into the ancestry of these ecotypes, under the hypothesis that the genetic material is diverse and so far uncharacterized. A publicly available relational database was created, containing all available information about tomato genotypes cultivated in Greece, including the Greek and ISO-transliterated names, place of origin/location and genetic/molecular characteristics.

Materials and methods

Plant material

In 2019 and 2020, in collaboration with the nonprofit cooperative enterprise ‘Melitakes’, seeds from 27 landraces of tomato (S. lycopersicum L.) were collected for this study (Table 1), a total of 83 individuals. ‘Melitakes’ is a Social Cooperative Enterprise located in Pyrgos, Heraklion, Crete, Greece, which collects seeds of different vegetable crops, including tomato. In addition, in collaboration with the Department of Agriculture of the Mediterranean University of Crete, seeds from six tomato landraces and two hybrids cultivated in Crete were collected. The landrace names in Greek, the two hybrid names, the code names used in the analysis and the origin of the seeds are shown in Table 1.

Table 1. The code name used in the analysis for 83 individuals of the 27 tomato landraces and the two tomato hybrids

In italics is the origin of the seeds for 83 individuals of the 27 tomato landraces and the two tomato hybrids.

Fourteen landraces were grown in the greenhouse of the Department of Biology, University of Crete, and the remaining thirteen landraces and two hybrids were planted in a greenhouse on the farm of the Department of Agriculture of the Hellenic Mediterranean University of Crete. In both cases, seeds were germinated in a controlled-environment nursery, in trays filled with peat moss and perlite. When the plants reached the fourth main-stem leaf, they were transplanted into 2 L pots filled with peat moss and perlite at a ratio of 3:1. The plants were watered every three days, and a well-balanced fertilizer was used (Nitrophoska® 15-5-20 (+ 2MgO + 8S + TE), EuroChem, Greece). Young leaves were collected from all genotypes and were either stored at −80 °C or used directly for DNA extraction (Table 1).

DNA extraction

DNA was extracted using a CTAB-based protocol according to Gonias et al. (2019) and Bibi et al. (2020). Leaf tissue (0.05 g) was ground in liquid nitrogen with a mortar and pestle. The CTAB buffer was prepared with 20.5 g NaCl (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) and 5 g CTAB (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) in 215 mL ddH2O, 25 mL 1 M Tris-HCl pH 8 (Sigma Aldrich by Merck KGaA, Darmstadt, Germany), 10 mL 0.5 M EDTA pH 8, 1 μl β-mercaptoethanol (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) and 1% (w/v) PVP 360 (Sigma Aldrich by Merck KGaA, Darmstadt, Germany). CTAB buffer (0.5 mL) was added to the 50 mg (0.05 g) of ground tissue and incubated at 65 °C for 30 min with occasional vigorous shaking. Chloroform:isoamyl alcohol (24:1) (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) (0.5 mL) was added to each sample, which was then vortexed. The samples were centrifuged at 13,000 rpm for 15 min to resolve the phases. The aqueous phase was carefully pipetted into a fresh 1.5 mL tube, 0.5 mL cold isopropanol was added, and the mixture was incubated at −20 °C, at 4 °C or on ice for 1 h or overnight. The samples were centrifuged at 9000 rpm for 5 min. The supernatant was discarded and the pellet was washed in 70% ethanol: 1 mL of 70% ethanol was added, the samples were centrifuged at 13,000 rpm for 5 min, the ethanol was carefully removed and the pellet was dried for at least an hour. The precipitate was dissolved in 100 μl TE buffer by gentle inversion for at least an hour. RNase A (Qiagen, by Safe Blood Bio Analytical, Greece) (1 μl of 10 mg/mL) was added and the sample was incubated at 37 °C for 30 min (at least 15 min). Proteinase K (F. Hoffmann-La Roche Ltd) (1 μl of 1 mg/mL) was added and the samples were again incubated at 37 °C for 15–30 min.
Sodium acetate (3 M; Sigma Aldrich by Merck KGaA, Darmstadt, Germany) (10 μl) and 250 μl absolute ethanol were added. The samples were incubated at −20 °C, at 4 °C or on ice for 1 h or overnight. The samples were centrifuged at 14,000 rpm for 10 min and the supernatant was discarded. Following that, 1000 μl of 70% ethanol was added, the samples were centrifuged at 13,000 rpm for 5 min, the ethanol was carefully removed and the pellet was left to dry for at least an hour. In the final step, the DNA pellets were suspended in 100 μl 1X Tris-EDTA (TE) buffer (Serva by TechnoBioChem Ltd, Greece) and stored at −20 °C. The DNA quality was assessed on 0.8% agarose (Invitrogen™ by Thermo Fisher Scientific) gels stained with ethidium bromide (Sigma Aldrich) (10 mg/mL), and the samples were then diluted in TE to a final concentration of 10 ng/μl.
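As a quick arithmetic check of the recipe above, the following sketch computes the final concentrations of the CTAB buffer components and the volumes needed for the final dilution to 10 ng/μl. It assumes that the final buffer volume is simply the sum of the three liquid components (215 + 25 + 10 mL) and uses 58.44 g/mol as the molar mass of NaCl; the function names and the example stock concentration are illustrative and not part of the original protocol.

```python
# Illustrative check of the CTAB buffer recipe; all names are hypothetical.
NACL_MOLAR_MASS = 58.44  # g/mol, standard value for NaCl


def ctab_buffer_concentrations():
    """Return final concentrations for the recipe given in the text.

    Assumption: final volume = water + Tris-HCl stock + EDTA stock,
    ignoring any volume change when the solids dissolve.
    """
    water_ml, tris_ml, edta_ml = 215.0, 25.0, 10.0
    total_ml = water_ml + tris_ml + edta_ml            # 250 mL
    total_l = total_ml / 1000.0

    return {
        "NaCl_M": (20.5 / NACL_MOLAR_MASS) / total_l,  # mol/L
        "CTAB_percent_wv": 5.0 / total_ml * 100.0,     # g per 100 mL
        "Tris_mM": 1.0 * tris_ml / total_ml * 1000.0,  # C1*V1/V2
        "EDTA_mM": 0.5 * edta_ml / total_ml * 1000.0,  # C1*V1/V2
    }


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """C1*V1 = C2*V2: volumes of DNA stock and TE for the working dilution."""
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


conc = ctab_buffer_concentrations()
# Hypothetical stock of 50 ng/ul diluted to 10 ng/ul in 100 ul.
stock_ul, te_ul = dilution_volumes(50.0, 10.0, 100.0)
```

Under these assumptions the recipe corresponds to roughly 1.4 M NaCl, 2% (w/v) CTAB, 100 mM Tris-HCl and 20 mM EDTA.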

SSR analysis

A set of 11 SSR markers scattered throughout the genome of S. lycopersicum L. and providing a high number of alleles was selected from He et al. (2003) and Korir et al. (2014) (Table 2), and all samples were genotyped with these markers. PCR was conducted in a final volume of 10 μl, and the reaction mixture contained 1 ng/μl genomic DNA, 0.2 μM of the forward primer (labelled) and 0.2 mM of the reverse primer (unlabelled), 0.2 mM dNTP, 1U Taq DNA Polymerase (New England Biolabs, Ipswich, MA, USA), 1X Taq polymerase Mg-free buffer, and 2 mM MgCl2. The forward primers were labelled with the ABI fluorescent dyes HEX (green), ROX (red) and FAM (blue) (Eurofins Genomics). Amplifications were performed using a T100 thermal cycler (Bio-Rad Laboratories Inc., United Kingdom). The amplification conditions consisted of an initial denaturation step of 5 min at 95 °C, followed by 30 cycles of 30 s at 94 °C, 30 s at 50–62 °C and 30 s at 72 °C, with a final extension at 72 °C for 10 min. The resulting PCR products were first visualized by 0.8% agarose gel electrophoresis. Prior to SSR fractionation, up to three different primer pairs were mixed in the same well (multiplex), taking into account the size of the amplified fragments and/or the labelling of the primers. The products were loaded into the SEQ-Studio genetic analyser (Applied Biosystems, Foster City, California, USA) for SSR fractionation. During fragment analysis, the LIZ600 size standard (Applied Biosystems) was used. Allele binning and data matrix production were done in STRand, version 2.4.108 (Veterinary Genetics Lab, University of California).
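The multiplexing rule described above (loci may share a well if their dyes or their fragment size ranges distinguish them, with at most three primer pairs per well) can be sketched as a simple greedy grouping. This is only an illustration of the logic, not the procedure actually used in the study; the locus names, dyes and size ranges below are invented placeholders.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    name: str
    dye: str      # e.g. "FAM", "HEX", "ROX"
    min_bp: int   # smallest expected product size
    max_bp: int   # largest expected product size


def compatible(a: Locus, b: Locus) -> bool:
    """Two loci can share a well if dyes differ or size ranges do not overlap."""
    if a.dye != b.dye:
        return True
    return a.max_bp < b.min_bp or b.max_bp < a.min_bp


def assign_wells(loci, max_per_well=3):
    """Greedy first-fit grouping of loci into multiplex wells."""
    wells = []
    for locus in loci:
        for well in wells:
            if len(well) < max_per_well and all(compatible(locus, o) for o in well):
                well.append(locus)
                break
        else:
            # No existing well can take this locus, so open a new one.
            wells.append([locus])
    return wells


# Hypothetical loci; names, dyes and size ranges are made up for illustration.
example = [
    Locus("locus_1", "FAM", 100, 150),
    Locus("locus_2", "FAM", 140, 200),  # overlaps locus_1 on the same dye
    Locus("locus_3", "HEX", 120, 180),
    Locus("locus_4", "FAM", 220, 260),
]
wells = assign_wells(example)
well_names = [[locus.name for locus in well] for well in wells]
```

A first-fit strategy is not guaranteed to minimize the number of wells, but it is usually adequate for a panel of this size.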

Table 2. Loci names, primer sequences (forward and reverse), annealing temperature and product range of the 11 SSR loci used to fingerprint 84 genotypes of tomato (He et al., 2003; Korir et al., 2014)

Genetic analysis and neighbour-joining tree construction

For each locus, allele sizing was based on published repeat patterns (Carvalho et al., 2021). The data matrices were produced and genetic diversity measures were determined for each locus across all fingerprinted genotypes. These measures included: (i) individual locus polymorphic information content (PIC) (Botstein et al., 1980), (ii) observed heterozygosity (HO) and (iii) expected heterozygosity (HE) to determine the genetic variation. PIC, HO, HE, the estimated frequency of null alleles and the probability of identity (PI) were calculated with the CERVUS ver. 3.0.3 software package (Kalinowski et al., 2007). A similarity matrix was produced from Nei's distance matrix in GenAlEx version 6 (Peakall and Smouse, 2012). Subsequently, a neighbour-joining tree was produced using the aboot function of the poppr package in R (v. 4.1.3) to estimate the dendrogram based on Nei's genetic distance, together with bootstrap values on the branches of the tree. From the 83 tomato genotypes, 27 tomato landraces and two hybrids were used for dendrogram construction. To estimate the divergence between the different populations, pairwise Fst values were calculated according to Weir and Cockerham (1984) using GenAlEx 6 (Peakall and Smouse, 2006). Analysis of molecular variance (AMOVA) was also performed to assess the genetic structure of the 27 tomato landraces and two tomato hybrids, using GenAlEx 6.
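For readers unfamiliar with these diversity measures, the sketch below shows how HO, HE and PIC can be computed for a single locus from diploid genotypes, using the textbook formulas HE = 1 − Σp_i² and PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j² (Botstein et al., 1980). It is a minimal illustration with hypothetical function names; software such as CERVUS may apply sample-size corrections that are omitted here, so values need not match exactly.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for one diploid locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """HE = 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (x * x) * (y * y) for x, y in combinations(p, 2))
    return 1.0 - homozygosity - cross_term


# Toy genotypes (allele sizes in bp), invented for illustration only.
toy_genotypes = [(150, 154), (150, 154), (150, 150), (154, 154)]
toy_freqs = allele_frequencies(toy_genotypes)
toy_ho = observed_heterozygosity(toy_genotypes)
toy_he = expected_heterozygosity(toy_freqs)
toy_pic = pic(toy_freqs)
```

With two equally frequent alleles, as in the toy data, HE is 0.5 and PIC is 0.375, which illustrates why PIC is always at most HE.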

Population structure

The genetic structure of these individuals was analysed using STRUCTURE 2.3.4 (Pritchard et al., 2000). This software applies a Bayesian clustering algorithm to identify subpopulations, assign individuals to them and estimate population allele frequencies (Pritchard et al., 2000). The analysis was carried out with a burn-in period of 200,000 iterations and a run length of 800,000 MCMC replications. We tested a continuous series of K, from 1 to 10, in 10 independent runs. We did not introduce any prior knowledge about the origin of the population and assumed correlated allele frequencies and admixture (Falush et al., 2003). To select the optimal value of K, ΔK values (Evanno et al., 2005) were calculated using STRUCTURE HARVESTER (Earl and vonHoldt, 2012). POPHELPER (Francis, 2017) was used to analyse and visualize population structure.
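The ΔK statistic of Evanno et al. (2005) is ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the log probability of the data and s is its standard deviation across replicate runs. The sketch below computes it from run means, which is one common reading of the formula; it is an illustration rather than a reimplementation of STRUCTURE HARVESTER, and the log-likelihood values are synthetic, not the study's data.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """lnp_by_k: dict mapping K -> list of ln P(D|K), one value per run.

    Returns a dict K -> delta K for every K that has both neighbours
    (K-1 and K+1) and a non-zero standard deviation across runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    sds = {k: stdev(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_diff / sds[k]
    return delta


# Synthetic log-likelihoods for illustration only.
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-700.0, -701.0, -699.0],
    4: [-690.0, -692.0, -688.0],
    5: [-685.0, -688.0, -682.0],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
```

Note that ΔK is undefined for the smallest and largest K tested, so a series from 1 to 10 can only support K between 2 and 9.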

Multidimensional scaling analysis

Multidimensional scaling (MDS) is a computational approach used to visualize the level of similarity (or dissimilarity) between high-dimensional individuals as a configuration of points mapped into a Cartesian space (Mead, 1992). MDS is a distance-based method. Here, we applied the Reynolds distance (Reynolds et al., 1983) between the populations of the sample. The Reynolds distance (or coancestry distance) provides an estimate of the genetic drift between populations. MDS and the Reynolds distance were calculated in R (v. 4.1.3) with the packages poppr and adegenet. Since distances were calculated between populations (and not between individuals), we used the function genind2genpop from the adegenet R package to convert individual genotype data into allele counts per population.
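The conversion from individual genotypes to allele counts per population, which the text attributes to genind2genpop, can be illustrated with a few lines of Python. This is a conceptual analogue only and does not reproduce the adegenet data structures; the input layout, function name and toy genotypes are assumptions made for the example.

```python
from collections import Counter, defaultdict


def allele_counts_per_population(individuals):
    """Aggregate diploid genotypes into allele counts per population.

    individuals: iterable of (population, {locus: (allele_a, allele_b)}).
    A missing genotype is encoded as None and skipped.
    Returns {population: {locus: {allele: count}}} as plain dicts.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for population, genotype in individuals:
        for locus, alleles in genotype.items():
            if alleles is None:
                continue
            counts[population][locus].update(alleles)
    return {
        pop: {locus: dict(counter) for locus, counter in loci.items()}
        for pop, loci in counts.items()
    }


# Toy input, invented for illustration only.
toy_individuals = [
    ("pop_A", {"locus_1": (150, 154)}),
    ("pop_A", {"locus_1": (150, 150)}),
    ("pop_B", {"locus_1": (158, 158)}),
    ("pop_B", {"locus_1": None}),  # missing genotype
]
pop_counts = allele_counts_per_population(toy_individuals)
```

Population-level distances such as the Reynolds distance are then computed from the allele frequencies derived from these counts.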

The Greek Tomato Database

The data were converted to CSV files and imported into the database via the utility phpMyAdmin (www.phpmyadmin.net, version 5.1.0) which has been configured and used as the main tool for the data management. The platform was deployed to a new server and the content is served via an Apache web server (httpd.apache.org, version 2.4.41) using a Linux distribution of Ubuntu version 20.04 LTS as its operating system. The web hosting server also has a Linux Server distribution of Ubuntu Server (ubuntu.com, version 20.04.01 LTS). For the development of the website, the Laravel PHP Framework version 7.30.4 was used. The content of the website is served with the web scripting language PHP version 7.4.3. The database is stored under the MySQL server version 8.0.28.

The database is served by an Apache web server (httpd.apache.org, version 2.4.41). All work was done on a PC running Ubuntu Linux (ubuntu.com, version 20.04 LTS), and the database is supported by the open-source MySQL relational database. The server is located in the facilities of FORTH-IMBB in a specially configured Data Centre. The database can be accessed at http://139.91.75.96/tomatodb using a standard web browser. Users have access to information about the samples, such as the sample collection sites. The database also allows users to store SSR results and information about SSR primers. In addition, there are photos for each sample showing the plant, leaf, foliage and fruit, as well as a cross-section of the fruit.

Our two main tasks were, first, to create the SQL structure of each database table based on the fields found in the header of the corresponding flat text file and, second, to import the data from the old format into the new one; each flat file was therefore converted into a separate table within the same database. The conversion proceeded as follows. Based on the header fields and the kind of data each one holds, we decided which data type to use for each field. We then wrote an SQL statement for each table, and these statements were imported into phpMyAdmin to create the tables.
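The conversion steps described above (read the header, choose a type per field, create the table, load the rows) can be sketched as follows. The example uses Python's built-in sqlite3 module as a stand-in for the MySQL and phpMyAdmin workflow of the GTD, so the type names and quoting differ from MySQL; the table name, column names and rows are hypothetical.

```python
import csv
import io
import sqlite3


def infer_sql_type(values):
    """Pick INTEGER, REAL or TEXT from the values seen in one column."""
    for caster, sql_type in ((int, "INTEGER"), (float, "REAL")):
        try:
            for value in values:
                caster(value)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def import_flat_file(conn, table, text):
    """Create `table` from the CSV header and load all rows into it."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        sql_type = infer_sql_type([row[i] for row in data])
        columns.append(f'"{name}" {sql_type}')
    conn.execute(f'CREATE TABLE "{table}" ({", ".join(columns)})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return len(data)


# Hypothetical flat file; the field names are not taken from the GTD.
flat_file = "code,landrace,year\ndemo_1,demo_landrace_a,2019\ndemo_2,demo_landrace_b,2020\n"
conn = sqlite3.connect(":memory:")
n_rows = import_flat_file(conn, "samples", flat_file)
loaded = conn.execute("SELECT code, year FROM samples ORDER BY code").fetchall()
```

Inferring types from the data is a convenience for the sketch; in practice the types would be chosen by inspection, as the text describes.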

Results

Genetic diversity analysis of indigenous landraces of tomato

A total of 83 individuals were genotyped using 11 SSR loci to amplify polymorphic fragments from the 27 tomato landraces and two hybrids. All 11 SSR loci amplified efficiently and reproducibly. The genetic diversity measures determined for each locus across all fingerprinted genotypes are shown in Table 3. The number of amplified alleles (k) per SSR primer pair varied from eight for SLR5 to fourteen for SLR10, with an average of 10.727 alleles per locus. The expected heterozygosity (He) ranged from 0.440 in SLR3 to 0.813 in SLR26, with an average value of 0.6172. High heterozygosity indicates high genetic variability, whereas low heterozygosity indicates little genetic variability (Mc Donald, 2018). The level of polymorphism was assessed by calculating PIC values for each of the 11 SSR loci; the average PIC (polymorphic information content) was 0.5687. Using the 11 SSR markers in combination, the cumulative probability of identity, a measure of the probability of obtaining an identical genotype, was 8.41 × 10⁻⁹. The use of SSR markers in tomato genotypes has similarly given very low cumulative probabilities of identity in other studies (Laucou et al., 2011; Emanuelli et al., 2013; Doulati-Baneh et al., 2015). This value corresponds to a statistical potential to distinguish a large number of unrelated tomato genotypes. AMOVA was conducted to determine the variation explained by populations. The results indicated that 58% of the genetic variation (P < 0.0001) resided among populations and 42% (P < 0.0001) within populations.
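To show how a cumulative probability of identity arises from per-locus values, the sketch below uses the standard formula for unrelated individuals, PI = Σp_i⁴ + Σ_{i<j}(2p_i p_j)², and multiplies the per-locus values under the assumption that loci are independent. This is an illustration with invented allele frequencies; the exact estimator used by CERVUS (for example, any sample-size correction) may differ.

```python
from itertools import combinations
from math import prod


def probability_of_identity(freqs):
    """Per-locus PI for unrelated individuals from allele frequencies."""
    p = list(freqs)
    homozygotes = sum(x ** 4 for x in p)
    heterozygotes = sum((2 * x * y) ** 2 for x, y in combinations(p, 2))
    return homozygotes + heterozygotes


def cumulative_pi(loci_freqs):
    """Product of per-locus PI values, assuming independent loci."""
    return prod(probability_of_identity(freqs) for freqs in loci_freqs)


# Invented frequencies: two loci, each with two equally frequent alleles.
toy_loci = [[0.5, 0.5], [0.5, 0.5]]
pi_single = probability_of_identity(toy_loci[0])
pi_combined = cumulative_pi(toy_loci)
```

Because the per-locus values multiply, even moderately informative loci yield a very small combined value once eleven of them are used together.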

Table 3. Total genetic diversity of the 27 tomato landraces and the two tomato hybrids

Number of observed alleles (k), expected (He) and observed (Ho) heterozygosity, polymorphic information content (PIC), probability of null alleles (F(Null)).

STRUCTURE analysis

The genetic structure of the whole population was evaluated using STRUCTURE software. The analysis provided evidence for significant population structure in this set of cultivars. A maximum value of the rate of change in the log probability of the data was revealed at K = 9, using Evanno's method (Fig. 1a). The highest ΔK value was observed at K = 9 (Fig. 1d). The estimated logarithm of the probability of the data [L(K)] increased linearly from K = 3 up to K = 7, showing a clear point of inflection (Fig. 1c).

Figure 1. Genetic STRUCTURE of 27 tomato-landraces and two tomato-hybrids from Greece with 11 SSR considering K = 9. A) Colours of the letters represent the nine groups, defined by the K value. The vertical axis indicates the obtained ancestry groups A, B, C, to I, for each genotype. Ancestor population (a.p) A is coloured black, a.p. B is coloured blue, a.p. C is coloured light blue, a.p. D is coloured raff, a.p. E is coloured green, a.p. F is coloured yellow, a.p. G is coloured dark yellow, a.p. H is coloured dark yellow, and finally the a.p. I, is coloured red. B) The distribution of each landrace in the ancestor population that they belong to according to the analysis. C) Mean (± standard deviation) log-likelihood value of the data [L(K)] as a function of the value of K, the number of clusters, D) second-order rate of change of the log-likelihood of the data (ΔK) as a function of K, the number of clusters.

The estimated population structure inferred from the analysis identifies nine genetic groups (ancestor populations, a.p., A, B, C, D, E, F, G, H and I) and is presented graphically in Fig. 1a, supporting the hypothesis that the genetic material is diverse. All individuals were assigned to the nine ancestor populations, revealing interesting pairings that are analysed below along with the dendrogram produced (Fig. 1b).

Genetic distance analysis

A neighbour-joining tree was built based on Nei's distance matrix (Fig. 2), in which shorter branches between two landraces/hybrids indicate higher genetic similarity between them. Based on the phylogenetic tree, three main clusters were revealed, each including individuals from a variety of ancestor populations (Fig. 2a); no previous records have been reported for this group of Greek landraces. The first cluster (Fig. 2a, cluster A, black branches) includes the landraces VEG11HMU600 and Veg10K10C in a smaller cluster; Santorini K40-19; PrunenoirK38A and Bournalati K36; KerasMavI19 K37E and KerasMavFORTH K37E clustered together; Veg15 Belladona F1, Veg16 Bobcat F1 and Veg3 HMU3 clustered together; Veg1 HMU220, Bournelati K10 and Bournalati K34; and finally Veg 7 K21, Veg 12 HMU1120, Veg6 K2A and Veg2 HMU2040 clustered together. According to the dendrogram, Veg10 K10C and VEG11HMU600 group together, forming a cluster with a bootstrap value of 57 (Fig. 2a). The STRUCTURE analysis revealed that both of them have individuals that belong to ancestor populations A and F (dark blue and yellow) (Fig. 1b). The similarity of the genotypes is also depicted in the multidimensional scaling (MDS) diagram (Fig. 2b). Santorini K40-19 has a very distinct position on the tree, and the STRUCTURE analysis revealed that this tomato landrace forms an ancestor population of its own (ancestor population D: raff-blue). In the MDS plot, Santorini K40-19 also occupies a distinct position (Fig. 2b). Among the landraces in the main cluster, PrunenoirK38A and Bournelati K36 group together with a low bootstrap value of 30 (Fig. 2a). The STRUCTURE analysis revealed that Bournelati K36 belongs to ancestor population A (dark blue) and that PrunenoirK38A has individuals that belong to ancestor populations A and F (dark blue and yellow) (Fig. 1b). The MDS diagram places all landraces that belong to ancestor populations A and F (dark blue and yellow) close together (Fig. 2b).
Furthermore, KerasMavI19 K37E and KerasMavFORTH K37E group together with a high bootstrap value of 96, increasing the certainty of this cluster. The STRUCTURE analysis also shows that these landraces belong to the same ancestor population I (red, Fig. 1b), and in the MDS diagram they are very close to each other (Fig. 2b). Interestingly, the hybrid Veg16 Bobcat F1 and Veg3 HMY3 group together (bootstrap value 62) and are close to Veg15 Belladona F1 (bootstrap value 55). The STRUCTURE analysis also shows that Veg16BobcatF1 and Veg3HMY3 belong to the same ancestor population C (light blue, Fig. 1b), and in the MDS diagram they are very close to each other (Fig. 2b). Veg15 Belladona F1 has individuals that belong to ancestor populations C, E and F (light blue, green and yellow) (Fig. 1b). Bournalati K10 and Bournelati K34 are clustered together, and close by is Veg1 HMU220. Bournalati K10 and Veg1 HMU220 belong to the same ancestor population G (dark yellow), while Bournelati K34 belongs to ancestor population B (blue) (Fig. 1b). All of them appear very close in the MDS diagram (Fig. 2b). Finally, a cluster is formed by Veg 7 K21, Veg 12 HMU1120, Veg6 K2A and Veg2 HMU2040. Veg6 K2A and Veg2 HMU2040 are closer to each other (bootstrap value 46) in the MDS diagram, while the rest are further apart (Fig. 2b). The STRUCTURE analysis showed that Veg 7 K21, Veg6 K2A and Veg2 HMU2040 belong to the same ancestor population B (blue) and that Veg 12 HMU1120 has individuals that belong to ancestor populations B and G (blue and dark yellow) (Fig. 1b).

Figure 2. (A) Neighbour-joining unweighted phylogenetic similarity tree based on Nei's distance matrix. Different colours in letters represent the ancestor populations for each genotype. Cluster A (black branch), Cluster B (red branch) and Cluster C (green branch). Bootstrap values were calculated using the aboot function of the poppr package of R (v. 4.1.3). (B) MDS plot of the various individual populations. The Reynolds metric was used to estimate the distance between each population pair. Different colours represent the ancestor populations for each genotype.

In the second main cluster of the dendrogram (Fig. 2, cluster B, red branches), Agrokipiou K301 and Agrokipiou K302 group together and appear identical (bootstrap value 94). According to the STRUCTURE analysis, these landraces belong to the same ancestor population E (green, Fig. 1b, 2a). Close to them is Kipourou K10B (bootstrap value 49, ancestor population E). Another cluster is formed by Agrokipio K30 and VoidokardiaB k37A, but they are not genetically linked. The STRUCTURE analysis shows that Agrokipio K30 has individuals that belong to ancestor populations E and F (green and yellow) (Fig. 1b, 2a) and that VoidokardiaB k37A has individuals that belong to ancestor populations E and G (green and dark yellow) (Fig. 1b, 2a). In the MDS diagram, these landraces are not very close to each other; therefore, no conclusion can be drawn about their relationship (Fig. 2b).

Finally, five landraces are found in the third main cluster of the dendrogram (Fig. 2, cluster C, green branches): Bournelati K35, KerasKokFORTH K37D, Veg8 K37Z, Veg9 HMU9 and Veg13 HMU13. Bournelati K35 and KerasKokFORTH K37D group together (bootstrap value 46); according to the STRUCTURE analysis, Bournelati K35 belongs to ancestor population F (yellow) and KerasKokFORTH K37D has individuals that belong to ancestor populations F and A (yellow and dark blue) (Fig. 1b, 2a). In addition, Veg9 HMU9, Veg8 K37Z and Veg13 HMU13 group together, and according to the STRUCTURE analysis these landraces, including Veg13 HMU13, belong to the same ancestor population H (orange, Fig. 1b, 2a). In the MDS diagram, these landraces are very close to each other (Fig. 2b).

The database schema

A MySQL server available at the Institute of Molecular Biology and Biotechnology was used to create the structure of the tables needed for each type of data and the relations among them (Fig. 3a, b). An open-source web application in PHP and HTML5 was also developed for data presentation. For data management, the open-source MySQL administration platform phpMyAdmin (www.phpmyadmin.net, version 5.1.0) was installed and configured. The schema of the new database is shown in Fig. 3a. The GTD enables users to retrieve data from one or more tables with a single query. It also provides an insight into the protocols used for this research and some basic phenotypic information, including the place of origin of each landrace. Users can update and enrich the GTD with new and supplemental information for existing datasets. An overview of the GTD indicating its primary functions is presented in the Supplementary material.

Figure 3. (A) The database schema presents the data tables that have been created to store the content of the database. (B) The front page of the Greek Tomato Database (GTD).
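As a rough illustration of the single-query retrieval across related tables described in this section, the following sketch builds a small relational layout and joins it. It uses Python's built-in sqlite3 module rather than the MySQL server behind the GTD, and every table name, column name and value is a hypothetical placeholder; it is not the actual GTD schema shown in Fig. 3a.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE landrace (
            id        INTEGER PRIMARY KEY,
            code_name TEXT NOT NULL,
            origin    TEXT
        );
        CREATE TABLE ssr_genotype (
            id          INTEGER PRIMARY KEY,
            landrace_id INTEGER NOT NULL REFERENCES landrace(id),
            locus       TEXT NOT NULL,
            allele_1    INTEGER,
            allele_2    INTEGER
        );
        """
    )
    # Placeholder rows only; none of these values come from the study.
    con.executemany(
        "INSERT INTO landrace (id, code_name, origin) VALUES (?, ?, ?)",
        [(1, "DemoLandraceA", "Placeholder origin 1"),
         (2, "DemoLandraceB", "Placeholder origin 2")],
    )
    con.executemany(
        "INSERT INTO ssr_genotype (landrace_id, locus, allele_1, allele_2) "
        "VALUES (?, ?, ?, ?)",
        [(1, "LocusX", 150, 150),
         (1, "LocusY", 200, 204),
         (2, "LocusX", 150, 154)],
    )
    con.commit()
    return con


def genotypes_with_origin(con, locus):
    """Return (code_name, origin, allele_1, allele_2) for one locus in a single query."""
    return con.execute(
        """
        SELECT l.code_name, l.origin, g.allele_1, g.allele_2
        FROM ssr_genotype AS g
        JOIN landrace AS l ON l.id = g.landrace_id
        WHERE g.locus = ?
        ORDER BY l.code_name
        """,
        (locus,),
    ).fetchall()
```

Calling genotypes_with_origin(build_demo_db(), "LocusX") returns one row per placeholder landrace, combining phenotypic and marker information that is stored in separate tables.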

Discussion

It was hypothesized that Greece, and especially Crete, the leading region for greenhouse tomato production, harbours a very diverse genetic pool, and that the landraces remain diverse even though this pool has been narrowed by natural selection over the years. Eleven SSR loci were used to amplify polymorphic fragments from these genotypes, and the probability of obtaining an identical genotype was low (8.41 × 10⁻⁹), corresponding to the statistical potential to distinguish a large number of unrelated tomato genotypes. According to the literature, SSR markers have yielded a very low cumulative probability of identity in many studies (Laucou et al., 2011; Emanuelli et al., 2013; Doulati-Baneh et al., 2015; Gonias et al., 2019). The results supported the hypothesis, revealing that the 27 landraces and the two hybrids belong to nine ancestral populations (ancestry groups A, B, C, D, E, F, G, H and I).
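A cumulative probability of identity of this kind is a product of per-locus values. As a minimal sketch, assuming the commonly used formula for unrelated individuals (the sum of squared expected genotype frequencies at each locus, multiplied across loci), the calculation could look as follows. The function names and the allele frequencies are illustrative assumptions and are not taken from the study data; the text does not specify which estimator produced the reported value.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    freqs: allele frequencies at the locus (must sum to 1).
    Assumes Hardy-Weinberg proportions: PI = sum(p_i^4) + sum_{i<j} (2 p_i p_j)^2.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def cumulative_pi(loci):
    """Multiply per-locus PI values, assuming the loci are independent."""
    return prod(locus_pi(freqs) for freqs in loci)


# Hypothetical allele frequencies for three loci (not from the study).
example_loci = [
    [0.5, 0.5],
    [0.4, 0.3, 0.3],
    [0.25, 0.25, 0.25, 0.25],
]
example_cumulative_pi = cumulative_pi(example_loci)
```

Because each per-locus value is below 1, every additional polymorphic locus lowers the cumulative value, which is why a panel of eleven loci can reach a very small probability.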

There is an increasing demand to clarify the relationships among local tomato landraces. The phylogenetic tree and the multidimensional scaling (MDS) diagram supported the great genetic diversity within our set of genotypes. The Reynolds approach is considered appropriate for data with small mutation rates and adopts the infinite alleles model. Even though the method was developed for allozymes, which have a small mutation rate compared with microsatellites, it is considered appropriate for small populations, in which genetic drift considerably affects evolution. In such scenarios, which also apply to our data, SSR mutations do not show the bell-like distribution expected under the stepwise mutation model. Instead, the allelic distribution is similar to that of the infinite alleles model (Reynolds et al., 1983).
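To make the distance measure concrete, the sketch below computes a pairwise Reynolds-type distance from allele frequencies. It assumes the simplified coancestry estimate that ignores sample-size corrections, θ = Σ(p1 − p2)² / (2 Σ(1 − Σ p1·p2)), with the distance taken as −ln(1 − θ). The function name and the frequencies are hypothetical, and the exact estimator used by the software in this study may differ.

```python
from math import inf, log


def reynolds_distance(pop1, pop2):
    """Simplified Reynolds-type distance between two populations.

    pop1, pop2: lists of per-locus allele-frequency lists, with loci and
    alleles in the same order in both populations.
    Sample-size corrections are deliberately omitted (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for freqs1, freqs2 in zip(pop1, pop2):
        numerator += sum((a - b) ** 2 for a, b in zip(freqs1, freqs2))
        denominator += 1.0 - sum(a * b for a, b in zip(freqs1, freqs2))
    if denominator == 0.0:
        # Both populations are fixed for the same allele at every locus.
        return 0.0
    theta = numerator / (2.0 * denominator)
    if theta >= 1.0:
        # Populations fixed for different alleles: distance is unbounded.
        return inf
    return -log(1.0 - theta)


# Hypothetical frequencies for two populations at two loci (not study data).
pop_a = [[0.5, 0.5], [0.7, 0.2, 0.1]]
pop_b = [[0.9, 0.1], [0.2, 0.3, 0.5]]
example_distance = reynolds_distance(pop_a, pop_b)
```

A matrix of such pairwise distances is the kind of input that an MDS plot like Fig. 2b summarises in two dimensions.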

The Structure analysis, using Evanno's method, indicated that nine ancestral populations are hidden within the group of genotypes tested. Notably, the majority of the landraces are genetically distinct from the two hybrids, with two exceptions (i.e. Veg3 is identical to Veg16Bobcat and Kipourou K10B is similar to Veg15 Belladona F1). These two hybrids (Veg15Belladona F1 and Veg16Bobcat F1) were also included in the genetic analysis of Gonias et al. (2019) and, similarly to our findings, were placed close together in the dendrogram and assigned to the same ancestor population.
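The ΔK statistic of Evanno's method can be sketched in a few lines. The version below assumes the form in which the absolute second-order difference of the mean log-likelihood, |L(K+1) − 2L(K) + L(K−1)|, is divided by the standard deviation of L(K) across replicate runs. The function name and the log-likelihood values are invented for illustration and do not reproduce the runs of this study.

```python
from statistics import mean, stdev


def delta_k(runs_by_k):
    """Compute Evanno-style delta K for every interior value of K.

    runs_by_k: dict mapping K to a list of log-likelihoods L(K), one per
    replicate run (at least two runs per K).
    Returns a dict mapping K to delta K; the smallest and largest K are
    skipped because they lack a neighbour on one side.
    """
    ks = sorted(runs_by_k)
    mean_l = {k: mean(runs_by_k[k]) for k in ks}
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        if k - prev_k != 1 or next_k - k != 1:
            continue  # requires consecutive K values
        sd = stdev(runs_by_k[k])
        if sd == 0:
            continue  # delta K is undefined when replicate runs are identical
        second_diff = abs(mean_l[next_k] - 2 * mean_l[k] + mean_l[prev_k])
        result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods (not from the study).
example_runs = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-79.0, -81.0],
    4: [-79.5, -80.5],
}
example_delta_k = delta_k(example_runs)
best_k = max(example_delta_k, key=example_delta_k.get)
```

In this invented example the likelihood rises sharply up to K = 2 and then flattens, so ΔK peaks at K = 2; in the study the corresponding peak was reported at K = 9.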

These data give an insight into the relationships among the 27 tomato landraces cultivated in Crete, Greece, and allow some speculation about their origin. The Structure analysis and the MDS diagram assign the landraces to different ancestor populations, allowing us to delve deeper into their origin. There is no previous research on the ancestry of these 27 tomato landraces; however, the results for the two hybrids agree with previous findings regarding their origin (Gonias et al., 2019).

A publicly available tomato database is a necessity in Greece. The database developed here is the first Greek Tomato Database that is fully functional and publicly available. It is well known that such a database is a valuable tool for cataloguing and analysing biological specimens, and that it provides a deep knowledge store in which scientists can preserve genomic and phenotypic data (Whitehornand and Marklyn, 2001). This database will serve as a depository for the molecular fingerprint of 27 tomato ecotypes, allowing the characterization, identification and preservation of these tomato ecotypes in Greece. It is also a starting point for incorporating new genetic information to address issues such as identification, sanitation and adaptation.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S147926212300103X.

Acknowledgements

All authors would like to acknowledge Stella Hatzigeorgiou from the Hellenic Mediterranean Organization ELGO-DIMITRA, Institute of Olive, Subtropical Crops and Vitis, and Manoli Vardaki from 'Melitakes', a Social Cooperative Enterprise located in Pyrgos, Heraklion, Crete, Greece, who made a significant contribution by providing the seeds of most of the landraces analysed.

Authors' contributions

Study conception and design: ACB, DK. Acquisition of data: ACB, AK, JM, KK. Analysis and interpretation of data: ACB, JM, PP. Drafting of manuscript: ACB, KK, PP. Critical revision: ACB, KK, EG. All authors read and approved the final manuscript.

Funding statement

This work was supported by the Emblematic Action of the Greek General Secretariat for Research and Technology, Agro4Crete (Protocol Number: SAE 013), 'Emblematic research action for the agri-food sector of Crete: four institutes, four reference points' (Agro4Crete). This action is incorporated in Subproject 2, Intervention B, 'Pilotic application of new innovations in agriculture production', of the project 'National Emblematic research action for the utilization of new technologies in the agri-food', and it will be accomplished via a collaboration of four institutes: Hellenic Mediterranean University; University of Crete; Foundation of Research and Technology, Institute of Molecular Biology and Biotechnology; and Hellenic Mediterranean Organization ELGO-DIMITRA.

Competing interest

None.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors have approved the submission.

Availability of data and materials

The datasets generated and/or analysed during the current study are available in the Greek Tomato Database (GTD) at http://139.91.75.96/tomatodb.

Footnotes

Deceased.

References

Athinodorou, F, Foukas, P, Tsaniklidis, G, Kotsiras, A, Antonios, C, Costas, D, Kyratzis, AC, Tzortzakis, N and Nikoloudakis, N (2021) Morphological diversity, genetic characterization, and phytochemical assessment of the Cypriot tomato germplasm. Plants 10, 124.
Bhattarai, G, Shi, A, Kandel, DR, Solís-Gracia, N, da Silva, JA and Avila, CA (2021) Genome-wide simple sequence repeats (SSR) markers discovered from whole-genome sequence comparisons of multiple spinach accessions. Scientific Reports 11, 9999.
Bibi, AC, Gonias, ED and Doulis, AG (2020) Genetic diversity and structure analysis assessed by SSR markers in a large collection of vitis cultivars from the Island of Crete, Greece. Biochemical Genetics 58, 294–321.
Botstein, D, White, RL, Skolnick, M and Davis, RW (1980) Construction of genetic linkage map in man using restriction fragment length polymorphisms. American Journal of Human Genetics 32, 314–331.
Carvalho, J, Yadav, S, Garrido-Maestu, A, Azinheiro, S, Trujillo, I, Barros-Velázquez, J and Prado, M (2021) Evaluation of simple sequence repeats (SSR) and single nucleotide polymorphism (SNP)-based methods in olive varieties from the northwest of Spain and potential for miniaturization. Food Chemistry: Molecular Sciences 3, 100038.
Codd, E (1971) Relational completeness of data base sublanguages. In Rustin, R (ed.), Data Base Systems. pp. 65–98 in Proceedings of 6th Courant Computer Science Symposium New York, N.Y.
Doulati-Baneh, H, Mohammadi, SA, Labra, M, De Mattia, F, Bruni, I, Mezzasalma, V and Abdollahi, R (2015) Genetic characterization of some wild grape populations (Vitis Vinifera Subsp. Sylvestris) of Zagros mountains (Iran) to indentify a conservation strategy. Plant Genetic Resources: Characterization and Utilization 13, 27–35.
Earl, DA and vonHoldt, BM (2012) STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the evanno method. Conservation Genetics Resources 4, 359–361.
Edwards, JD and McCouch, SR (2007) Molecular Markers for Use in Plant Molecular Breeding and Germplasm Evaluation, In Guimarães, EP, Ruane J, Scherf BD, Sonnino A, Dargie JD (eds.). Marker – Assisted selection. Current status and future perspectives in crops, livestock, forestry and fish. Food and Agriculture Organization of the United Nations Rome, pp. 29–50.
Emanuelli, F, Lorenzi, S, Grzeskowiak, L, Catalano, V, Stefanini, M, Troggio, M, Myles, S, Martinez-Zapater, J, Zyprian, E, Moreira, FM and Stella Grando, M (2013) Genetic diversity and population structure assessed by SSR and SNP markers in a large germplasm collection of grape. BMC Plant Biology 13, 1339. doi: 10.1186/1471-2229-13-39
European Commission - DG Agri G2 (2021), The tomato market in the EU: Vol. 1: Production and area statistics, Working document DG Agri G2.
Evanno, G, Regnaut, S and Goudet, J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14, 2611–2620.
FAOSTAT (2023) Food and Agricultural Organization. Available at http://faostat.fao.org (accessed 22 January 2004).
Falush, D, Stephens, M and Pritchard, JK (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587.
Foolad, MR (2007) Genome mapping and molecular breeding of tomato. International Journal of Plant Genomics 2007, 1–52.
Francis, RM (2017) Pophelper: an R package and web app to analyse and visualize population structure. Molecular Ecology Resources 17, 27–32.
Frary, A, Xu, Y, Liu, J, Mitchell, S, Tedeschi, E and Tanksley, S (2005) Development of a set of PCR-based anchor markers encompassing the tomato genome and evaluation of their usefulness for genetics and breeding experiments. Theoretical and Applied Genetics 111, 291–312.
Fulton, TM, Nelson, JC and Tanksley, SD (1997) Introgression and DNA marker analysis of Lycopersicon peruvianum, a wild relative of the cultivated tomato, into Lycopersicon esculentum, followed through three successive backcross generations. Theoretical and Applied Genetics 95, 895–902.
Gonias, ED, Ganopoulos, I, Mellidou, I, Bibi, AC, Kalivas, A, Mylona, PV, Osanthanunkul, M, Tsaftaris, A, Madesis, P and Doulis, AG (2019) Exploring genetic diversity of tomato (Solanum lycopersicum L.) germplasm of genebank collection employing SSR and SCAR markers. Genetic Resources and Crop Evolution 66, 1295–1309.
He, C, Poysa, V and Yu, K (2003) Development and characterization of simple sequence repeat (SSR) markers and their use in determining relationships among Lycopersicon esculentum cultivars. Theoretical and Applied Genetics 106, 363–373.
Jones, JB (2007) Tomato Plant Culture: In the Field, Greenhouse, and Home Garden, 2nd Edn, Florida, USA: CRC Press.
Kalinowski, ST, Taper, ML and Marshall, TC (2007) Revising how the computer program CERVUS accommodates genotyping error increases success in paternity assignment. Molecular Ecology 16, 1099–1106.
Korir, NK, Diao, W, Tao, R, Li, X, Kayesh, E, Li, A, Zhen, W and Wang, S (2014) Genetic diversity and relationships among different tomato varieties revealed by EST-SSR markers. Genetics and Molecular Research 13, 45–53.
Laucou, V, Lacombe, T, Dechesne, F, Siret, R, Bruno, JP, Dessup, M, Dessup, T, Ortigosa, P, Parra, P, Roux, C, Santoni, S, Varès, D, Péros, JP, Boursiquot, JM and This, P (2011) High throughput analysis of grape genetic diversity as a tool for germplasm collection management. Theoretical and Applied Genetics 122, 1233–1245.
López-Terrada, M (2014) The history of the arrival of the tomato in Europe: an initial overview. Traditom, 1–17. Available at http://traditom.eu/fileadmin/traditom/downloads/TRADITOM_History_of_the_arrival_of_the_tomato_in_Europe.pdf
Mavromatis, AG, Athanasouli, V, Vellios, E, Khah, EM, Georgiadou, EC, Pavli, OI and Arvanitoyannis, IS (2013) Characterization of tomato landraces grown under organic conditions based on molecular marker analysis and determination of fruit quality parameters. Journal of Agricultural Science 5, 239. doi: 10.5539/jas.v5n2p239
Mc Donald, D (2018) Genetic Markers, ZOO 4425/5425. Retrieved July 11, 2023. Available at https://www.uwyo.edu/dbmcd/molmark/index.html.
Mead, A (1992) Review of the development of multidimensional scaling methods. The Statistician 41, 27.
Peakall, R and Smouse, PE (2006) GenAlEx 6: genetic analysis in excel. Population genetic software for teaching and research. Molecular Ecology Notes 6, 288–295.
Peakall, R and Smouse, PE (2012) GenAlEx 6.5: genetic analysis in excel. Population genetic software for teaching and research--an update. Bioinformatics (Oxford, England) 28, 2537–2539.
Pritchard, JK, Stephens, M and Donnelly, P (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959.
Reynolds, J, Weir, BS and Cockerham, CC (1983) Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105, 767–779.
Sadiyah, H, Ashari, S, Waluyo, B and Soegianto, A (2020) Genetic diversity and relationship of husk tomato (Physalis spp.) from east java province revealed by SSR markers. Biodiversitas Journal of Biological Diversity 22, 184–192. doi: 10.13057/biodiv/d220124
Savvas, D, Ropokis, A, Ntatsi, G and Kittas, C (2016) Current situation of greenhouse vegetable production in Greece. Acta Horticulturae, 443–448.
Tanksley, SD, Ganal, MW, Prince, JP, Carmen de-Vicente, M, Bonierbale, MW, Broun, P, Fulton, TM, Giovannoni, JJ, Grandillo, S, Martin, GB, Messeguer, R, Miller, JC, Miller, L, Paterson, AH, Pineda, O, Roder, MS, Wing, RA, Wu, W and Young, ND (1992) High density molecular linkage maps of the tomato and potato genomes. Genetics 132, 1141–1160.
Terzopoulos, PJ and Bebeli, PJ (2008) DNA and morphological diversity of selected Greek tomato (Solanum lycopersicum L.) landraces. Scientia Horticulturae 116, 354–361.
Vieira, MLC, Santini, L, Diniz, LA and de Freitas Munhoz, C (2016) Microsatellite markers: what they mean and why they are so useful. Genetics and Molecular Biology 39, 312–328.
Weir, BS and Cockerham, CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370.
Whitehornand, M and Marklyn, B (2001) Inside Relational Databases. Springer. Printed in Verlag London Limited 2001 in Great Britain.