Introduction
Turfgrass is ubiquitously grown in various landscapes, including home lawns, golf courses, parks, school playgrounds, and sports fields (Stier et al. Reference Stier, Steinke, Ervin, Higginson and McMaugh2013). Weed control is a constant task for turf management, as weeds compete with turfgrasses for essential nutrients, sunlight, and water, potentially compromising both aesthetics and functionality of the turf. Implementing cultural practices, such as mowing and irrigation, can reduce weed infestation but rarely achieves complete weed control (Busey Reference Busey2003; Neal Reference Neal2020). Currently, the most effective way to control weeds is the application of various preemergence and postemergence herbicides (Kraehmer et al. Reference Kraehmer, Laber, Rosinger and Schulz2014). However, weeds in natural environments are often unevenly distributed, leading to the broadcast application of herbicides across entire areas, including those without weed presence. Moreover, many commonly used herbicides, such as atrazine (photosystem II inhibitor) and monosodium methanearsonate (MSMA, arsenical herbicide), are classified as restricted-use pesticides due to their potential environmental impact (Kudsk and Streibig Reference Kudsk and Streibig2003; McElroy and Martins Reference McElroy and Martins2013; USEPA 2023a, 2023b; WSSA 2023). While manual spot spraying can significantly reduce herbicide inputs compared with broadcast applications by targeting only weed-infested areas, it is labor-intensive, time-consuming, and impractical for large-scale applications. Broadcast applications, on the other hand, are more efficient for large areas but often result in excessive herbicide use and environmental risks (Kudsk and Streibig Reference Kudsk and Streibig2003; Yu and McCullough Reference Yu and McCullough2016).
Computer vision–based automated weed detection and precision spraying technology is a promising solution for significantly reducing herbicide input and weed control costs (Bhakta et al. Reference Bhakta, Phadikar and Majumder2019; Gerhards et al. Reference Gerhards, Andujar Sanchez, Hamouz, Peteinatos, Christensen and Fernandez-Quintanilla2022; Jin et al. Reference Jin, Liu, Yang, Xie, Bagavathiannan, Hong and Chen2023b). Deep learning, a subfield of machine learning, employs multi-layered neural networks to simulate the way the human brain connects and transmits information between neurons (LeCun et al. Reference LeCun, Bengio and Hinton2015). Neural networks, the core component of deep learning, are mathematical models composed of many artificial neurons, also known as nodes or units. Each neuron receives inputs from other neurons, weights and processes these inputs through an activation function, and then passes the result to the neurons in the next layer (Yang and Wang Reference Yang and Wang2020). Deep convolutional neural networks (DCNNs) have achieved remarkable success in many applications, such as facial recognition (Singh et al. Reference Singh, Hariharan and Gupta2020), natural language processing (Chowdhary Reference Chowdhary2020), self-driving cars (Maqueda et al. Reference Maqueda, Loquercio, Gallego, García and Scaramuzza2018), and automated detection of structural flaws (Luo et al. Reference Luo, Gao, Woo and Yang2019).
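To make the neuron description above concrete, the following is a minimal PyTorch sketch of a single artificial neuron (the input values, weights, and bias are illustrative, not from the study): it computes a weighted sum of its inputs plus a bias and passes the result through an activation function.

```python
import torch

# Outputs received from upstream neurons (illustrative values).
inputs = torch.tensor([0.5, -1.2, 3.0])
# Learned connection weights and bias (illustrative values).
weights = torch.tensor([0.8, 0.1, -0.4])
bias = torch.tensor(0.2)

# Weighted sum of inputs, followed by a nonlinear activation (here, ReLU).
z = torch.dot(weights, inputs) + bias
activation = torch.relu(z)

print(activation)  # value passed on to neurons in the next layer
```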
In recent years, the application of DCNNs in agriculture has grown exponentially. For example, Zhao et al. (Reference Zhao, Ma, Yong, Zhu, Wang, Luo and Huang2023) developed a neural network capable of detecting the germination status and estimating the total number of germinated rice (Oryza sativa L.) seeds. Ahmad Loti et al. (Reference Ahmad Loti, Mohd Noor and Chang2021) documented a neural network that effectively identifies and differentiates various diseases in chili pepper (Capsicum annuum L.). Additionally, previous studies have demonstrated the efficacy of DCNNs for detecting weeds in a variety of cropping systems (Dang et al. Reference Dang, Chen, Lu and Li2023; Sharpe et al. Reference Sharpe, Schumann, Yu and Boyd2020; Yu et al. Reference Yu, Schumann, Sharpe, Li and Boyd2020). For example, researchers developed a neural network to detect weeds growing in soybean [Glycine max (L.) Merr.] (dos Santos Ferreira et al. Reference dos Santos Ferreira, Freitas, da Silva, Pistori and Folhes2017). Andrea et al. (Reference Andrea, Daniel and Misael2017) developed a neural network that accurately and reliably classifies weeds in corn (Zea mays L.) stands. Osorio et al. (Reference Osorio, Puerto, Pedraza, Jamaica and Rodríguez2020) successfully employed a neural network to detect weeds in lettuce (Lactuca sativa L.). Moreover, You et al. (Reference You, Liu and Lee2020) proposed a neural network–based semantic segmentation method to distinguish weeds from crops in complex agricultural settings.
DCNNs have also demonstrated great performance in detecting weeds growing in bermudagrass [Cynodon dactylon (L.) Pers.] turf (Jin et al. Reference Jin, Bagavathiannan, Maity, Chen and Yu2022a; Xie et al. Reference Xie, Hu, Bagavathiannan and Song2021; Yu et al. Reference Yu, Schumann, Cao, Sharpe and Boyd2019a, Reference Yu, Sharpe, Schumann and Boyd2019b). The use of DCNNs to detect weeds in both dormant and actively growing bermudagrass was first reported by Yu et al. (Reference Yu, Schumann, Cao, Sharpe and Boyd2019a), who compared three neural networks: DetectNet (NVIDIA 2016), VGGNet (Simonyan and Zisserman Reference Simonyan and Zisserman2014), and GoogLeNet (Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov and Rabinovich2015). These neural networks were evaluated for their ability to detect and classify several broadleaf weed species, including dollarweed (Hydrocotyle spp.), Florida pusley (Richardia scabra L.), and old world diamond-flower (Oldenlandia corymbosa L.), in actively growing bermudagrass turf. DetectNet achieved an excellent F1 score of 0.99, outperforming the other two neural networks. In a subsequent study, Yu et al. (Reference Yu, Schumann, Sharpe, Li and Boyd2020) documented that VGGNet achieved excellent performance, surpassing AlexNet and GoogLeNet in detecting dallisgrass (Paspalum dilatatum Poir.), doveweed [Murdannia nudiflora (L.) Brenan], smooth crabgrass [Digitaria ischaemum (Schreb.) Schreb. ex Muhl.], and tropical signalgrass [Urochloa adspersa (Trin.) R. Webster] in actively growing bermudagrass.
To develop a commercially viable smart sprayer employing a neural network model, the spray system must reliably detect weeds across different turfgrass regimes, irrespective of species, ecotype, density, growth stage, and geographic location. However, detecting and differentiating weeds from turfgrass can be challenging, particularly when dealing with grass weeds that share similar morphological characteristics with turfgrass species. In contrast, broadleaf weeds often exhibit distinct features that facilitate their identification. Previous neural networks were typically designed to identify a single or a limited number of weed species in specific geographic areas (Jin et al. Reference Jin, Bagavathiannan, Maity, Chen and Yu2022a, Reference Jin, Bagavathiannan, McCullough, Chen and Yu2022b, Reference Jin, Liu, McCullough, Chen and Yu2023a). In addition, due to phenotypic plasticity, significant morphological variations may occur among weed ecotypes from different turfgrass management regimes or geographic areas (Kerr et al. Reference Kerr, Zhebentyayeva, Saski and McCarty2019). For example, a dwarf ecotype of goosegrass [Eleusine indica (L.) Gaertn.], with an average internode length of only 0.2 cm, has been found on a golf course in Florida (Kerr et al. Reference Kerr, Zhebentyayeva, Saski and McCarty2019), in contrast to wild ecotypes with an average internode length of 7 cm (Saidi et al. Reference Saidi, Kadir and Hong2016). Consequently, these networks may struggle to detect different weed ecotypes, species, or those growing in mixed stands at varying growth stages and densities across diverse turf regimes and geographic regions. Moreover, bermudagrass is widely utilized in various turf sites, including home lawns, golf courses, school playgrounds, and sports fields.
Research has shown that training image size and quantity significantly influence the performance of neural networks in weed detection. Zhuang et al. (Reference Zhuang, Li, Bagavathiannan, Jin, Yang, Meng and Chen2022) evaluated multiple neural networks, including AlexNet and VGGNet, for the detection and classification of broadleaf weed seedlings in wheat (Triticum aestivum L.) and found that, for a small training dataset (5,500 negative and 5,500 positive images), increasing the size of the training images from 200 × 200 pixels to 300 × 300 or 400 × 400 pixels resulted in a decrease in the F1 scores of both networks. However, for larger training datasets (11,000 negative and 11,000 positive images), increasing the image size improved the performance of all studied networks. Therefore, the objectives of this research were (1) to assess the ability of image classification neural networks trained on datasets from limited geographic regions to generalize weed detection performance across diverse bermudagrass turf regimes and locations and (2) to examine how varying the size of training datasets impacts the performance of eight different neural network models.
Materials and Methods
Neural Network Models
This study evaluated eight neural network models for weed detection in turfgrass systems. These included AlexNet (Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012), a pioneering DCNN with five convolutional layers; GoogLeNet (Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov and Rabinovich2015), which employs an Inception architecture for efficient feature extraction; and VGGNet (Simonyan and Zisserman Reference Simonyan and Zisserman2014), known for its deeper structure built from multiple 3 × 3 convolution kernels. Additionally, ResNet101 and ResNet152 (He et al. Reference He, Zhang, Ren and Sun2016), which utilize residual learning to ease the training of very deep networks, were assessed. To improve multi-scale and aggregated residual feature extraction, Res2Net (Gao et al. Reference Gao, Cheng, Zhao, Zhang, Yang and Torr2019) and ResNeXt (Xie et al. Reference Xie, Girshick, Dollár, Tu and He2017) were also included. Finally, PoolFormer, a transformer-based model with a simplified MetaFormer architecture (Yu et al. Reference Yu, Luo, Zhou, Si, Zhou, Wang and Yan2022), was tested for its ability to generalize weed detection across diverse conditions. A summary of each model’s architecture and advantages is provided in Table 1.
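For context, the sketch below shows how several of these architectures can be instantiated as binary (weed vs. non-weed) classifiers using their torchvision implementations. This is illustrative only; the study does not publish its model-construction code, and Res2Net and PoolFormer are not available in torchvision, so the timm model names shown in comments are assumptions rather than the study's exact variants.

```python
import torch
from torchvision import models

# Instantiate three of the evaluated architectures as binary classifiers.
# No pretrained weights are loaded, mirroring the training-from-scratch
# setup described in "Experimental Configuration" below.
alexnet = models.alexnet(num_classes=2)
vgg16 = models.vgg16(num_classes=2)
resnet152 = models.resnet152(num_classes=2)

# Res2Net and PoolFormer are not in torchvision; one common source is the
# timm library (these architecture names are examples, not necessarily the
# study's variants):
# import timm
# res2net = timm.create_model("res2net50_26w_4s", num_classes=2)
# poolformer = timm.create_model("poolformer_s12", num_classes=2)

x = torch.randn(1, 3, 480, 480)  # one resized RGB image (see below)
logits = resnet152(x)            # shape: (1, 2) -> weed vs. non-weed scores
```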
Table 1. Eight neural networks evaluated in the study.

a Abbreviations: CNN, convolutional neural network; GPU, graphics processing unit; ILSVRC, ImageNet Large Scale Visual Recognition Challenge.
Image Acquisition
The training, validation, and testing dataset images were mainly captured in four cities in Florida and Georgia in the United States, covering various turf regimes infested with a variety of broadleaf, grass, and sedge weeds, as detailed in Table 2 and Figure 1.
Table 2. Details of training, validation, and testing dataset images. a

a Training, validation, and testing dataset images collected from four locations in Florida and Georgia, USA.

Figure 1. Image examples from the training dataset. The dataset includes Digitaria ischaemum and Paspalum dilatatum at the 3- to 5-tiller stage and Murdannia nudiflora and Oldenlandia corymbosa at full maturity before flowering.
In Bradenton, FL, USA (27.4963, −82.5745), images were captured multiple times at a golf course between October and November 2018, predominantly featuring annual grass weeds such as E. indica, D. ischaemum, and U. adspersa, with visually estimated turf cover exceeding 90% and bare soil cover below 10%.
In Tampa, FL, USA (27.9473, −82.4584), images were captured multiple times between July and December 2018 from golf courses, roadsides, and sports fields, primarily featuring M. nudiflora, R. scabra, E. indica, and D. ischaemum.
In Riverview, FL, USA (27.8139, −82.4167), images were collected from the rough of a golf course between July 2018 and February 2019. These images included broadleaf weeds such as Hydrocotyle spp. and O. corymbosa, along with grasses like E. indica, D. ischaemum, and U. adspersa. Additionally, low-density pre-flowering purple nutsedge (Cyperus rotundus L.) was present. Turfgrass cover ranged from 70% to 80%, while bare ground covered 20% to 30%.
In Georgia, USA, images were captured in July and October 2018 at the turfgrass research facility at the University of Georgia Griffin Campus (33.2608, −84.2521). These images featured low-density, pre-flowering annual sedges (Cyperus compressus L.) exhibiting a clump growth habit, as well as broadleaf weeds, grasses, and fragrant kyllinga (Kyllinga odorata Vahl). The broadleaf weeds included spotted spurge [Chamaesyce maculata (L.) Small; syn.: Euphorbia maculata L.], Virginia buttonweed (Diodia virginiana L.), and white clover (Trifolium repens L.), while the grasses included P. dilatatum, E. indica, and D. ischaemum.
To evaluate model robustness, an additional robustness testing dataset was constructed. These data were collected from 24 locations across the United States and China, as shown in Figures 2 and 3. Each test scenario in the dataset consists of 100 turfgrass images without weeds and 100 images containing weeds.

Figure 2. Images of weed species and their respective turf sites in the United States used for neural network robustness testing.

Figure 3. Images of weed species and their respective turf sites in China used for neural network robustness testing.
All images used in this study were captured using a Sony® Cyber-Shot camera (Sony, Minato, Tokyo, Japan) with a resolution of 1,920 × 1,080 pixels. They were captured between 0900 and 1700 hours under various weather and outdoor lighting conditions, including clear, cloudy, and partly cloudy conditions.
Training and Testing
To align with the input requirements and maintain compatibility across the deep learning model architectures evaluated in this study, the original images were resized to 480 × 480 pixels. These resized images were then divided into two datasets. The small training dataset contained 10,000 positive images (with weeds) and 10,000 negative images (without weeds). To construct the large training dataset, an additional 40,000 images were added—20,000 positive and 20,000 negative—resulting in 30,000 positive and 30,000 negative images. Both the validation and the testing datasets contained 1,000 positive and 1,000 negative images each.
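As one possible data pipeline, the following is a minimal sketch of assembling the resized binary dataset with torchvision. The directory names and layout are hypothetical; the paper does not describe its data-loading code.

```python
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Resize every image to 480 x 480 pixels and convert it to a tensor.
transform = transforms.Compose([
    transforms.Resize((480, 480)),
    transforms.ToTensor(),
])

# Hypothetical layout: train/positive (with weeds) and train/negative
# (without weeds); ImageFolder maps each subdirectory to a class label.
train_data = datasets.ImageFolder("train", transform=transform)
val_data = datasets.ImageFolder("val", transform=transform)

train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
val_loader = DataLoader(val_data, batch_size=16, shuffle=False)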
In this study, the eight DCNNs were trained on the small and large training datasets for 100 epochs each. After the optimal weights for each model were obtained, model performance was evaluated on the testing dataset. The best-performing model was then selected to undergo final evaluation on the robustness testing dataset. Jin et al. (Reference Jin, Bagavathiannan, McCullough, Chen and Yu2022b) employed a classification-based approach for weed detection, wherein images were segmented into grid cells and classified based on the presence of weeds. This method enables both detection and localization, with classification evaluation metrics effectively reflecting a model’s performance in weed detection.
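To illustrate that grid-cell strategy, the sketch below (hypothetical code, assuming a trained binary classifier `model`; the cell size and overlap handling are not specified in the source) splits an image into fixed-size cells and classifies each cell, so a positive prediction also localizes the weed to a grid position.

```python
import torch

def classify_grid_cells(model, image, cell=480):
    """Split an image tensor (3, H, W) into cell x cell tiles and classify
    each tile as weed (1) or non-weed (0).

    Returns a dict mapping (row, col) grid positions to predictions, which
    localizes weeds to grid cells for targeted spraying."""
    _, h, w = image.shape
    predictions = {}
    model.eval()
    with torch.no_grad():
        for i in range(0, h - cell + 1, cell):
            for j in range(0, w - cell + 1, cell):
                tile = image[:, i:i + cell, j:j + cell].unsqueeze(0)
                logits = model(tile)
                predictions[(i // cell, j // cell)] = int(logits.argmax(dim=1))
    return predictions
```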
A confusion matrix was employed to evaluate the performance of each model. This matrix compares a classifier’s predictions with the true labels, grouping them into four categories: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
In the confusion matrix:
- TP represents the number of instances correctly predicted as positive.
- FN represents the number of instances incorrectly predicted as negative when they are actually positive.
- FP represents the number of instances incorrectly predicted as positive when they are actually negative.
- TN represents the number of instances correctly predicted as negative.
In the present research, TP represents instances in which the model correctly identified the target weed; TN represents instances in which the model correctly identified images without the target weed; FP represents instances in which the model incorrectly predicted the presence of the target weed; and FN represents instances in which the model failed to detect a target weed that was present.
The confusion matrix was used to calculate various performance metrics of a classification algorithm. In this study, precision (Equation 1), recall (Equation 2), F1 score (Equation 3), and Matthews correlation coefficient (MCC) (Equation 4) were calculated using the results from the confusion matrix.
Precision measures the ability of the developed neural network to correctly identify the targets and was calculated using the following formula:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$
Recall estimates the developed neural network’s ability to identify all actual targets and was calculated using the following formula:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
The F1 score, defined as the harmonic mean of precision and recall, was calculated using the following formula:

$$\text{F1} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$
MCC is a universal evaluation metric used to assess the performance of classification models, measuring the correlation between predicted and actual labels. It applies to both binary and multiclass classification tasks. In this study, we applied MCC to evaluate the performance of binary classification models, specifically distinguishing between weed (presence) and non-weed (absence) in turfgrass images (Sokolova and Lapalme Reference Sokolova and Lapalme2009). It was calculated using the following equation:

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{4}$$
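For reference, Equations 1 to 4 can be computed directly from the confusion-matrix counts. The helper below was written for this article rather than taken from the study’s code, and the counts in the usage example are hypothetical.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1 score, and MCC (Equations 1-4)
    from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return precision, recall, f1, mcc

# Example with hypothetical counts from a 2,000-image testing dataset:
print(classification_metrics(tp=995, fp=5, tn=990, fn=10))
```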
Experimental Configuration
The experiments were conducted using the PyTorch (Meta) deep learning framework (v. 1.13.0) with CUDA 11.6 (NVIDIA). To ensure a fair comparison, none of the models were initialized with pretrained weights. All training and testing procedures were executed on a workstation equipped with an Intel® Core™ i9-10920X CPU at 3.50 GHz, an NVIDIA RTX 3080 Ti GPU, and 128 GB of memory. The operating system was Ubuntu 20.04.1.
The hyperparameter settings were as follows: the image size for the training process was set to 480 × 480 pixels, with stochastic gradient descent as the optimizer. The base learning rate was set to 0.1, and weight decay was applied with a value of 0.0001. The batch size was 16, and the learning rate policy was set to “step.” Momentum was set to 0.9, and the model was trained for 100 epochs. The output layer was configured with two nodes, corresponding to the binary classes (weed vs. non-weed), using a softmax activation function. This configuration ensured a controlled and consistent setup for evaluating the model’s performance.
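Under these stated hyperparameters, the optimizer and “step” learning-rate policy could be configured as in the sketch below. The step size and decay factor of the schedule are not reported in the text, so those values are assumptions.

```python
import torch
from torchvision import models

# Any of the eight evaluated networks could stand in here; VGGNet16 is
# shown as an example, with two output nodes (weed vs. non-weed).
model = models.vgg16(num_classes=2)

# Stochastic gradient descent with the stated base learning rate,
# momentum, and weight decay.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,
    weight_decay=0.0001,
)

# "Step" learning-rate policy: decay the rate by gamma every step_size
# epochs (step_size and gamma are assumed, not given in the text).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):  # 100 training epochs
    # ... one pass over train_loader with batch size 16 ...
    scheduler.step()
```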
Results and Discussion
Model Evaluation
Among the evaluated neural networks, VGGNet16, trained with 20,000 images, achieved the highest precision on the testing dataset, reaching 0.9990 (Table 3). It also exhibited a high recall and F1 score, with an MCC value exceeding 0.9960. Res2Net and ResNeXt152 also performed well, with accuracies of 0.9950 and 0.9970, respectively. Notably, ResNeXt152 had a recall of 0.9990, slightly surpassing VGGNet16, with similar F1 and MCC scores.
Table 3. Testing results of neural networks for classification of weeds while growing in turfgrasses.

a The small training dataset contained 10,000 positive images (with weeds) and 10,000 negative images (without weeds), while the large training dataset contained 30,000 positive and 30,000 negative images.
b MCC, Matthews correlation coefficient.
Initially, AlexNet and GoogLeNet showed lower performance, with precision values of 0.8801 and 0.7897, respectively, and MCC values below 0.9. However, their performance improved significantly when trained with 60,000 images. AlexNet’s precision increased to 0.9980, surpassing VGGNet16, and its MCC reached 0.9940. GoogLeNet achieved a perfect recall (1.0), a precision of 0.9950, and the highest MCC among all models.
ResNet152 and PoolFormer achieved a recall of 1.0, with precision values of 0.9921 and 0.9940, respectively. While ResNet101 and ResNet152 showed further improvements, VGGNet16, Res2Net, and ResNeXt152 experienced overall performance declines, possibly due to their simpler architectures and limited learning capacities (Chollet Reference Chollet2021; Hastie et al. Reference Hastie, Tibshirani and Friedman2009). Models with complex structures excelled on smaller datasets but faced overfitting risks on larger datasets (Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov and Rabinovich2015).
VGGNet16 and ResNeXt152 consistently demonstrated excellent performance across all metrics, with MCC values >0.98 and F1 scores >0.99. ResNeXt152 outperformed all models in recall and showed the best performance in the confusion matrices (Figures 4 and 5). Therefore, the two best-performing ResNeXt152 weight sets (trained on the small and large datasets, respectively) were selected for weed classification across the various turfgrass regimes.

Figure 4. Confusion matrices of models trained on the small dataset and tested on the robustness testing dataset.

Figure 5. Confusion matrices of models trained on the large dataset and tested on the robustness testing dataset.
Single Weed Species Classification
Weed classification in turfgrass systems has been identified as a critical component of precision agriculture, with recent advancements in deep learning offering promising solutions for reducing herbicide use and improving management efficiency (Beckie et al. Reference Beckie, Ashworth and Flower2019; Bhakta et al. Reference Bhakta, Phadikar and Majumder2019). Previous studies have often been confined to collecting data from specific regions for training neural networks and subsequently evaluating them within the same geographic locations (Jin et al. Reference Jin, Liu, Chen and Yu2022c; Xie et al. Reference Xie, Girshick, Dollár, Tu and He2017; Yu et al. Reference Yu, Schumann, Cao, Sharpe and Boyd2019a, Reference Yu, Sharpe, Schumann and Boyd2019b). Although these studies have demonstrated effective weed detection, the testing datasets in these works were limited to specific turfgrass management regimes and geographic locations. In this study, we tested model performance across 24 diverse scenarios in China and the United States, encompassing various weed species, ecotypes, densities, and growth stages, as detailed in Table 4.
Table 4. Robustness testing results of ResNeXt152 for classification of weeds while growing in turfgrasses. a

a Neural network was trained using small and large dataset.
b Low density indicates weed coverage in the images is visually less than 10% of the total image pixels; high density indicates the weed coverage in the images is visually more than 80% of the total image pixels.
c MCC, Matthews correlation coefficient.
The model trained with the large dataset demonstrated consistent precision exceeding 0.97 across all classifications for D. ischaemum, achieving F1 scores ranging from 0.988 to 0.998, with perfect recall of 1.0 for the first five scenarios. The model trained on the small dataset also performed strongly, with MCC values above 0.81 across the same five scenarios. However, classification of D. ischaemum in commercial landscapes was less effective, reflected in low recall (below 0.58). These results confirm that ResNeXt152, when trained on a large dataset, can accurately classify D. ischaemum at different growth stages in various turfgrass settings.
For P. dilatatum classification in Auburn, AL, USA, the model trained on the small dataset showed acceptable recall but lower precision (0.7333). In contrast, the model trained on the large dataset achieved a precision of 0.9686, emphasizing the importance of dataset size in minimizing false detections. Both models performed well in the other scenarios, with MCC values exceeding 0.95, and the network maintained excellent performance even at low P. dilatatum densities, indicating that ResNeXt152 can accurately classify P. dilatatum under various turf conditions.
For M. nudiflora classification, tests were conducted at locations in Miami, FL, and Tifton, GA, USA. Both small and large datasets yielded robust results, with precision consistently above 0.97 and F1 scores not falling below 0.96. Recall rates were optimal in three out of four scenarios, further affirming the efficacy of ResNeXt152 in M. nudiflora classification.
Testing on single weed species at additional locations revealed that the model trained with the large dataset achieved near-perfect classification (precision and recall exceeding 0.99, MCC > 0.98) for species including Hydrocotyle spp., R. scabra, green kyllinga (Kyllinga brevifolia Rottb.), O. corymbosa, D. virginiana, and T. repens.
Finally, experiments with U. adspersa at two Florida golf courses demonstrated that the model trained on a small dataset performed well in low-density scenarios (precision: 0.9745, MCC: 0.9696), while the model trained on a large dataset achieved consistent excellence (all metrics > 0.99). These results highlight ResNeXt152’s ability to classify weeds across varying densities in bermudagrass regimes.
Mixed Weed Species Classification
After confirming the accuracy of ResNeXt152 in single weed species classification, the study extended its evaluation to complex scenarios involving the coexistence of multiple weed species. As previously reported, multiple classifier neural network models, including DenseNet, EfficientNetV2, ResNet, RegNet, and VGGNet, are capable of effective weed detection and classification (Jin et al. Reference Jin, Bagavathiannan, McCullough, Chen and Yu2022b). These models achieved high accuracy, with F1 scores of at least 0.946 on the test dataset. However, the training and test images used in that study were collected from geographically proximate areas, which may limit generalizability.
In the present study, test locations spanning multiple states in the United States and provinces in China were selected to evaluate the model’s performance in classifying different weed species growing in turfgrass. The model performed exceptionally well in scenarios featuring multiple weed species. When trained on the small dataset, it achieved a precision of 1.0000 and a recall of 0.9732 for D. ischaemum and O. corymbosa on a golf course and roadside; when trained on the large dataset, the corresponding values were 0.9946 and 1.0000. In a sod farm with K. brevifolia, C. rotundus, and D. ischaemum, the model trained on the large dataset outperformed the model trained on the small dataset. In a city park with C. maculata and D. ischaemum, both models demonstrated outstanding performance, with evaluation metrics exceeding 0.99. In conclusion, ResNeXt152 proves effective in accurately classifying multiple weed species across turf management regimes. The experimental results demonstrate that data collected from a limited geographic area can be used to train a neural network capable of effectively classifying a wide variety of weed species in bermudagrass across different locations. While this study focused on bermudagrass, the methodology is potentially applicable to other cool- and warm-season turfgrass species. Future work will investigate its effectiveness for weed classification in these turfgrass types.
In summary, this study assessed the performance of eight classification neural networks trained with varying numbers of images. Through a comparative analysis, ResNeXt152 emerged as the most effective model among those evaluated. Additionally, this research highlights the practicality of utilizing a single neural network model for weed classification in turfgrass regimes with diverse uses across geographic regions in both China and the United States. The study demonstrated that ResNeXt152 achieved robust classification performance across 24 locations, covering 6 turf sites with distinct uses and 14 weed species with varying densities and growth stages, all from a single training session. Future research will focus on expanding testing and validation to additional global locations and integrating the developed neural network models into the machine vision subsystem of a smart sprayer prototype.
Funding statement
This work was supported by the Key R&D Program of Shandong Province, China (grant no. 202211070163), the National Natural Science Foundation of China (grant no. 32072498), the Taishan Scholars Program, China, and the Weifang Science and Technology Development Plan Project (grant no. 2024ZJ1097).
Competing interests
The authors declare no conflicts of interest.