Introduction
Neotoma, commonly referred to as woodrat or packrat, is a North American cricetid rodent genus distributed from southern Mexico to northern Canada (Cordero and Epps, Reference Cordero and Epps2012; Soberón and Martínez-Gordillo, Reference Soberón and Martínez-Gordillo2012; Coyner et al., Reference Coyner, Murphy and Matocq2015). Members of this genus are considered keystone species because their multichambered dens (middens) create nutrient-rich microhabitats that maintain local biodiversity (Matocq, Reference Matocq2009; Whitford and Steinberger, Reference Whitford and Steinberger2010). These middens are built with local plant and animal remains, and are consolidated with crystalized urine (Smith et al., Reference Smith, Betancourt and Brown1995, Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009). Species of Neotoma are important paleoecological indicators because their middens, and the organic materials within, often fossilize in arid regions, thus preserving aspects of biotic communities over millennia (Cordero and Epps, Reference Cordero and Epps2012; Smith et al., Reference Smith, Betancourt and Brown1995). In addition, because the intergenerational body size of some Neotoma species is highly responsive to long-term changes in ambient temperature, woodrats have been used as a ‘paleothermometer’ (Smith and Betancourt, Reference Smith and Betancourt1998, Reference Smith and Betancourt2003). Neotoma remains are therefore coveted for (paleo)ecological research and for studies of adaptive responses to environmental change across space and time (Smith et al., Reference Smith, Betancourt and Brown1995, Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009; Lyman and O’Brien, Reference Lyman and O’Brien2005; Brooke McEachern et al., Reference Brooke McEachern, Eagles-Smith, Efferson and Van Vuren2006; Smith and Betancourt, Reference Smith and Betancourt2006; Cordero and Epps, Reference Cordero and Epps2012; Becklin et al., Reference Becklin, Betancourt, Braasch, Dézerald, Díaz, González and Harbert2024).
Paleoecological studies of Neotoma have largely focused on fossilized middens and fecal pellets (e.g., Smith et al., Reference Smith, Betancourt and Brown1995, Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009; Smith and Betancourt, Reference Smith and Betancourt2003; Mychajliw et al., Reference Mychajliw, Rice, Tewksbury, Southon and Lindsey2020), in part due to the difficulty of identifying other remains such as teeth and bones to species using qualitative character observations and linear measurements (Harris, Reference Harris1984; Betancourt et al., Reference Betancourt, Devender and Martin1990; Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993; Repenning, Reference Repenning and Barnosky2004). However, midden preservation is skewed geologically towards younger localities and environmentally towards arid localities (Smith and Betancourt, Reference Smith and Betancourt2006; Smith et al., Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009), which limits the temporal and geographic scope of research compared to skeletal elements (Betancourt et al., Reference Betancourt, Devender and Martin1990; Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993). Species-level identification of Neotoma remains is important because different species exhibit different ecological niches, denning behaviors, habitat preferences, and climatic tolerances (Finley Jr., Reference Finley, Betancourt, Van Devender and Martin1990; Lyman and O’Brien, Reference Lyman and O’Brien2005; Smith et al., Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009; Soberón and Martínez-Gordillo, Reference Soberón and Martínez-Gordillo2012). Erroneous paleoecological interpretations can therefore occur if remains are misidentified.
Modern quantitative methods may help overcome the challenges of identifying Neotoma skeletal remains to species. Indeed, advancements in geometric morphometrics and complementary multivariate analyses over the last few decades have produced powerful toolkits for quantifying shape and shape change among biological subjects in a variety of systems (e.g., Zelditch et al., Reference Zelditch, Swiderski, Sheets and Fink2004; Webster and Sheets, Reference Webster, Sheets, Alroy and Hunt2010; Adams et al., Reference Adams, Rohlf and Slice2013; Baken et al., Reference Baken, Collyer, Kaliontzopoulou and Adams2021; Mitteroecker and Schaefer, Reference Mitteroecker and Schaefer2022). These techniques have made differentiating closely related, speciose, and morphologically similar species more feasible (e.g., Martínez and Di Cola, Reference Martínez and Di Cola2011; Jansky et al., Reference Jansky, Schubert and Wallace2016; Changbunjong et al., Reference Changbunjong, Prakaikowit, Maneephan, Kaewwiset, Weluwanarak, Chaiphongpachara and Dujardin2021; Wyatt et al., Reference Wyatt, Hopkins and Davis2021; Alhajeri, Reference Alhajeri2025). Geometric morphometric analyses are less subjective than qualitative character observations and offer several advantages over traditional linear morphometrics, including the ability to partition object shape from size and capture spatial distributions of shape change (Zelditch et al., Reference Zelditch, Swiderski, Sheets and Fink2004). This method allows entire shape configurations to be visualized and analyzed collectively and is therefore appropriate for quantifying complex morphological conditions such as the spatial relationships between the lophs and folds of Neotoma teeth (Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993).
Here we apply a relatively simple landmark configuration to individual teeth that are more frequently preserved in fossil assemblages than complete elements (Harris, Reference Harris1984; Repenning, Reference Repenning and Barnosky2004), thus broadening its utility across paleontological study systems. Specifically, we quantify shape variation in the lower first molar (m1) of five extant western North American species and use these data to identify fossil Neotoma m1s from Rancho La Brea (RLB), a near-coastal late Quaternary Lagerstätte in Hancock Park, Los Angeles, California (Fig. 1). Over millennia, oil-infiltrated sediments at RLB produced surficial asphalt seeps, which led to the entrapment and preservation of millions of fossil organisms representing over 600 taxa, including Neotoma (Dice, Reference Dice1925; Stock and Harris, Reference Stock and Harris1992; Holden et al., Reference Holden, Southon, Will, Kirby, Aalbu and Markey2017; Mychajliw et al., Reference Mychajliw, Rice, Tewksbury, Southon and Lindsey2020).

Figure 1. Distribution maps of the five extant Neotoma species examined here. The overview map shows the full extent of—and overlap among—each species range. Individual species maps indicate the points of sampling locations of individual specimens included in this study. Rancho La Brea, the locality where fossil Neotoma specimens were collected, is marked by a gray diamond on each map. Species range data obtained from (Patterson et al., Reference Patterson, Ceballos, Sechrest, Tognelli, Brooks, Luna, Ortega, Salazar and Young2007).
A recent study of Neotoma tooth morphology found that 2D shape outlines of upper first molars were largely unreliable for classifying species (Tomé et al., Reference Tomé, Whiteman-Jennings and Smith2020). Our analysis of discreet m1 landmarks could be complementary, however, because lower first cheek teeth often vary among closely related small mammal species and tend to be the most useful teeth for species-level identification (e.g., Thaeler, Reference Thaeler1980; Dalquest and Stangl Jr., Reference Dalquest and Stangl1983; Harris, Reference Harris1984; Dalquest et al., Reference Dalquest, Stangl and Grimes1989; Wallace, Reference Wallace2006). In addition, analysis of discreet landmarks may be advantageous over shape outlines because landmarks can capture shape disparity in homologous structures of the tooth while 2D outlines capture shape of the occlusal periphery, which can be substantially affected by ontogeny and wear (Repenning, Reference Repenning and Barnosky2004; Tomé et al., Reference Tomé, Whiteman-Jennings and Smith2020). This study addresses the question “Can 2D landmark analysis of lower first molars differentiate extant Neotoma species and identify unknown fossils?”
Methods
Study system
We examined five western North American Neotoma species currently present in California: N. albigula (white-throated woodrat), N. cinerea (bushy-tailed woodrat), N. fuscipes (dusky-footed woodrat), N. lepida (desert woodrat), and N. macrotis (big-eared woodrat). The N. lepida species complex has undergone relatively recent taxonomic revisions (Patton et al., Reference Patton, Huckaby and Álvarez-Castañeda2007; Bradley et al., Reference Bradley, Edwards, Lindsey, Bateman, Cajimat, Milazzo, Fulhorst, Matocq and Mauldin2022), and we did not attempt to morphologically differentiate among members of that complex in this study, instead we used N. lepida as a representative species. Collectively, these five representative species occupy broad geographic distributions and habitats ranging from boreal forest to desert (Smith et al., Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009). We sampled right m1s from 39 individuals of N. cinerea and 40 individuals of all other species (n = 199) collected in California between 1904 and 2015 (Fig. 1, Supplementary Table 1) from the University of California Museum of Vertebrate Zoology (MVZ). To examine morphological variation by geographic distance within species, an additional 13 N. cinerea were sampled from eastern Idaho and western South Dakota at MVZ and the South Dakota School of Mines and Technology Museum of Geology (SDSM). Populations from western South Dakota occur near the easternmost extent of the species’ range, >1,300 miles northeast of California N. cinerea (Fig. 1).
We also sampled 16 fossilized left and right m1s from four Project 23 (P23) Deposits (1, 7B, 13, and 14) at RLB (Fig. 1, Supplementary Table 1). Project 23 is the collective name for sixteen fossiliferous asphalt deposits discovered in 2006 during construction of a parking structure adjacent to RLB (Fuller et al., Reference Fuller, Fahrni, Harris, Farrell, Coltrain, Gerhart, Ward, Taylor and Southon2014; Holden et al., Reference Holden, Southon, Will, Kirby, Aalbu and Markey2017). These deposits were salvaged in 23 large wooden crates that are now housed and excavated in Hancock Park. P23 deposits provide new opportunities to acquire and study fossils of organisms from lower trophic levels such as small mammals, invertebrates, and plants that were not prioritized when RLB’s historic deposits were excavated in the early twentieth century (Stock and Harris, Reference Stock and Harris1992; Holden et al., Reference Holden, Southon, Will, Kirby, Aalbu and Markey2017; Fox et al., Reference Fox, Takeuchi, Farrell and Blois2019; Mychajliw et al., Reference Mychajliw, Rice, Tewksbury, Southon and Lindsey2020). P23 fossils are Late Pleistocene and radiocarbon dates obtained thus far range from >50,000 to approximately 26,000 14C yr BP in age (Fuller et al., Reference Fuller, Fahrni, Harris, Farrell, Coltrain, Gerhart, Ward, Taylor and Southon2014, Reference Fuller, Southon, Fahrni, Farrell, Takeuchi, Nehlich, Guiry, Richards, Lindsey and Harris2020; Holden et al., Reference Holden, Southon, Will, Kirby, Aalbu and Markey2017; Fox et al., Reference Fox, Southon, Howard, Takeuchi, Potze, Farrell, Lindsey and Blois2023). Neotoma fossils are uncommon at RLB (Dice, Reference Dice1925; Fox, Reference Fox2020) and this m1 sample exhausts what was available from those deposits at the time of sampling.
Due to the geographic and temporal positions of the RLB faunas, it is unlikely that P23 Neotoma fossils are from a different species than the five extant species/species complexes examined here. Therefore, comparison of fossil molars with these extant California species is appropriate. By doing this, however, we assume that (1) geographic ranges of Neotoma have not shifted over the last 50,000 years to the extent that species currently residing hundreds of miles from Los Angeles (e.g., N. mexicana) were present there in the past and (2) that intraspecific tooth morphology has not changed more than interspecific tooth morphology in that time. The former assumption is made under the rationale that animals with the ability to adapt in place and undergo body size or other changes in response to climate change, including Neotoma (Smith et al., Reference Smith, Betancourt and Brown1995), will likely do so rather than undergo drastic, long-distance range shifts (Lyons, Reference Lyons2003). However, if that assumption is incorrect and the RLB fossils represent a species not included in our reference dataset, morphological mismatches should be detectable between fossils and all extant species sampled.
Data acquisition and preparation
We photographed recent and fossil Neotoma teeth and dentaries using Dino-Lite Edge AM4815ZTL and Dino-Lite Edge Plus AM4917MZT digital microscopes. The orientation of the occlusal surface of each specimen and its distance from the camera lens was standardized to the extent possible to mitigate specimen presentation and imaging error (Arnqvist and Mårtensson, Reference Arnqvist and Mårtensson1998; Fruciano, Reference Fruciano2016; Fox et al., Reference Fox, Veneracion and Blois2020). No sampling filters (e.g., by age, sex, or tooth wear stage) were implemented aside from omitting very young specimens with unerupted or partially erupted dentition. However, filtered data subsets were later created to evaluate wear stage effects on m1 morphology and species classification. Image sets of recent and fossil specimens were assembled in TpsUtil 32 (Rohlf, Reference Rohlf2018a) and digitized in TpsDig 2.32 (Rohlf, Reference Rohlf2018b) by author NSF. Right m1s were landmarked when present. If right m1s were not available, images of left m1s were flipped to imitate right dentition prior to landmark digitization.
Fourteen landmarks were digitized along the m1 occlusal surface of each specimen. There is no unanimously accepted dental terminology for Neotoma (see Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993; Repenning, Reference Repenning and Barnosky2004; Martin and Zakrzewski, Reference Martin and Zakrzewski2019). Therefore, our landmark definitions follow the most recent terminology of Martin and Zakrzewski (Reference Martin and Zakrzewski2019) for m1 conids, lophids, and flexids/reentrant folds (see Table 1 and Fig. 2 for landmark definitions and placement, respectively). We mainly used type 1 and type 2 landmarks (Bookstein, Reference Bookstein1997); however, some type 3 landmarks were included to capture the morphology of the woodrat tooth periphery (Fig. 2). One landmark (landmark 14) also varied in type among specimens depending on wear stage and/or species (Fig. 2). For this landmark, different definitions are indicated depending on which condition occurs (Table 1).

Figure 2. Landmark configuration employed on the right m1 of MVZ 3925 Neotoma macrotis (top) with a well-defined metaflexid reentrant (notch between landmarks 13 and 14) and right m1 of MVZ 199799 N. lepida (bottom) that lacks a discernible metaflexid in occlusal view. See Table 1 for landmark definitions.
Table 1. Definitions of the 14-landmark configuration employed for specific differentiation of western North American Neotoma right m1s. Landmark types (i.e., 1, 2, 3; Bookstein, Reference Bookstein1997) are listed as well. Tooth terminology is adapted from Martin and Zakrzewski (Reference Martin and Zakrzewski2019) and landmark placement is shown in Figure 2.

Each recent specimen from California was digitized twice by author NSF with several weeks between sessions to quantify measurement error (ME), which can affect biological signals within geometric morphometric datasets (Arnqvist and Mårtensson, Reference Arnqvist and Mårtensson1998; Fruciano, Reference Fruciano2016; Fox et al., Reference Fox, Veneracion and Blois2020; Collyer and Adams, Reference Collyer and Adams2024a). All landmark datasets were superimposed via Generalized Procrustes analysis (GPA) using the ‘gpagen’ function in the R package geomorph v.4.0.7 (Adams et al., Reference Adams, Collyer, Kaliontzopoulou and Baken2024) before performing other analyses. Generalized Procrustes analysis translates all specimens to unit centroid size and optimally rotates them along a common coordinate system using a generalized least-squares algorithm to standardize specimen rotation, orientation, and scale (Rohlf and Slice, Reference Rohlf and Slice1990; Adams et al., Reference Adams, Collyer, Kaliontzopoulou and Baken2024).
Measurement error
Previous research has shown that artificial variation introduced from measurement error (ME) during the landmark digitization process can have a profound effect on statistical outcomes and biological interpretations (Fruciano, Reference Fruciano2016; Robinson and Terhune, Reference Robinson and Terhune2017; Fox et al., Reference Fox, Veneracion and Blois2020). Therefore, measurement error was quantified among the two landmark digitization repetitions (hereafter referred to as data repetitions one and two) using the ‘measurement.error’ function in the R package RRPP v. 2.0 (Collyer and Adams, Reference Collyer and Adams2024b). This function evaluates repeatability of multivariate data measurements obtained from the same subjects using analysis of variance (ANOVA) with a modified residual randomization in a permutation procedure (RRPP) for restricted randomization (Collyer and Adams, Reference Collyer and Adams2024a, Reference Collyer and Adamsb). Measurement error ANOVA quantifies both random and systematic (non-randomly distributed) error in repeated measurements. Random error can artificially increase data variance and obscure biological signals while systematic error can skew biological signals and subsequently alter biological inferences (Arnqvist and Mårtensson, Reference Arnqvist and Mårtensson1998; Fruciano, Reference Fruciano2016; Collyer and Adams, Reference Collyer and Adams2024a).
Intraclass correlation coefficients (ICC) were also calculated to determine the ratio of among-subject to within-subject variance based on dispersion using the ‘ICCstats’ function in RRPP (Collyer and Adams, Reference Collyer and Adams2024b). ‘ICCstats’ generates three statistics: one for the population (ICC), one for measurement agreement among subjects (ICCA), and one for consistency among repeated measurements (ICCC) (Liljequist et al., Reference Liljequist, Elfving and Roaldsen2019; Collyer and Adams, Reference Collyer and Adams2024a, Reference Collyer and Adamsb). Values closer to 1 indicate higher repeatability while values closer to 0 indicate lower repeatability among measurements. Models were run with and without a ‘groups’ parameter; the groups parameter was included to account for morphological variation among the five extant Neotoma species examined (Collyer and Adams, Reference Collyer and Adams2024a, Reference Collyer and Adamsb). Fossil unknowns and N. cinerea collected outside California were not included in analyses of measurement error.
Species classification
Measurement error is ubiquitous and impossible to eliminate completely from geometric morphometric datasets (Fruciano, Reference Fruciano2016). Yet previous research has shown that even small amounts of ME can have a substantial effect on biological signals and statistical outcomes, especially when morphological differences between target groups are subtle (Robinson and Terhune, Reference Robinson and Terhune2017; Fox et al., Reference Fox, Veneracion and Blois2020). We therefore examined discriminant analysis of principal components (DAPC) group classification statistics for both landmark data repetitions and used predicted group membership agreement among each as proxy for biological signal fidelity. DAPC first performs a principal components analysis (PCA) transformation to ensure predictor variables (i.e., landmark data) are uncorrelated (Jombart et al., Reference Jombart, Devillard and Balloux2010). Discriminant function analysis is then performed on the retained principal components to maximize variation among groups while minimizing variation within groups (Jombart et al., Reference Jombart, Devillard and Balloux2010; Jombart and Collins, Reference Jombart and Collins2022). With DAPC, the optimal number of principal components included in the discriminant analysis can be calculated to balance the tradeoff between information capture and discrimination power (more principal components retained), and model overfitting (fewer principal components retained) (Jombart et al., Reference Jombart, Devillard and Balloux2010; Jombart and Collins, Reference Jombart and Collins2022).
DAPC was conducted on both GPA-transformed landmark data repetitions of recent m1s from California using the ‘dapc’ function in the R package adegenet v. 2.1.10 (Jombart et al., Reference Jombart, Kamvar, Collins, Lustrik, Beugin, Knaus and Solymos2023). Twenty-eight x- and y-coordinate variables from the 14 landmarks (Table 1, Fig. 2) were entered as predictor variables to sort recent specimens into their known species groups. Cross-validation was implemented using the ‘xvalDapc’ function in adegenet to determine the optimal number of retained principal components based on the number of principal components that achieve the lowest root-mean-square error (RMSE) (Jombart and Collins, Reference Jombart and Collins2022). xvalDapc models were run dozens of times and the range of output values (numbers of principal components achieving lowest MSE) were recorded for each iteration. We then examined DAPC classification statistics (i.e., the percentage of recent specimens correctly assigned to their respective species groups) to determine which principal component number achieved the best compromise between discrimination power and overfitting. Hereafter, model one refers to the DAPC model constructed from landmark data repetition one and model two refers to the model constructed from data repetition two.
Landmark data from RLB fossil m1s and Neotoma cinerea from Idaho and South Dakota were entered in DAPC models as supplementary individuals. In other words, they were entered as observations to be sorted by models fitted with recent California specimen training data but were not included in model construction (Jombart and Collins, Reference Jombart and Collins2022). These “unknowns” were sorted into extant species groups using the ‘predict.dapc’ function in adegenet. Although species identities of N. cinerea outside California were known a priori, they were treated as unknowns to test the models’ ability to classify members of the same species from different geographic locations and environments. Predicted species assignments and posterior probabilities of group membership were recorded for unknowns in both data-repetition-training models. Only fossil unknowns assigned to the same species with posterior probabilities of group membership ≥ 0.80 in both models were considered to have high predictive fidelity.
Shape visualization and comparison
To visualize shape differences between extant Neotoma species from California and RLB fossils, wireframe outlines were created for the mean landmark configuration of each species using the ‘define.links’ function in geomorph (Adams et al., Reference Adams, Collyer, Kaliontzopoulou and Baken2024). Wireframes of each extant species were then overlain with the mean landmark configuration of the RLB fossil population. Shape differences between extant species and fossils were quantified via vector displacement plots using the ‘plotRefToTarget’ function in geomorph. Mean fossil shape data were entered as the ‘reference specimen’ and mean shape data of each extant species were entered as the ‘target specimen’ comparison. Thus, plotted vectors signify the direction and magnitude of landmark displacement in each extant species relative to the corresponding landmarks of the fossil population. Vector plots were also used to compare the mean landmark configuration of extant N. cinerea from California with specimens east of California to visualize m1 shape differences across space and environments. Magnifications of shape differences among all groups were set to the default value of 1.
Tooth wear
The occlusal shape of Neotoma molars can vary substantially due to wear and ontogeny among conspecific individuals (Harris, Reference Harris1984; Repenning, Reference Repenning and Barnosky2004; Tomé et al., Reference Tomé, Whiteman-Jennings and Smith2020). To evaluate the sensitivity of our species classification results to wear, we vetted our original datasets of extant Neotoma species from California (data repetitions one and two) by removing specimens near both extremes of the wear spectrum. That is, we removed specimens with nearly unworn molars (wear stage 1, early stage 2) and very heavily worn molars (stage 5) based on the wear categories of Harris (Reference Harris1984) and Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020). Twenty-eight of the 199 recent specimens initially sampled from California (∼14%) were removed in this treatment (Supplementary Table 1). We performed DAPC on these data subsets and compared wear-vetted group classification results with classification results from the original dataset repetitions. We then visualized mean shape differences (across all extant California species) between our wear-vetted and unvetted datasets using vector displacement plots. Vector plots and DAPC were conducted using the methods described in the ‘Shape visualization and comparison’ and ‘Species classification’ sections above, respectively. All analyses were done in R v 4.4; (R Core Team, 2024).
Results
Measurement error ANOVA of the two extant Neotoma species data repetitions yield small random and systematic ME values relative to specimen variance in models with and without group parameters (Table 2). Systematic ME is also small relative to random ME, but statistically significant in all cases (Table 2). Repeatability values are high across the three ICC statistics in both models, but more so in the group-excluded model (∼0.94 and ∼0.92 for group-excluded and -included models, respectively, Table 2). In both models, ICCC values are discernibly greater than ICC and ICCA values (Table 2).
Table 2. ANOVA and intraclass correlation coefficient (ICC) summary statistics for measurement error (ME) analysis of both extant Neotoma landmark dataset repetitions. Systematic ME, random ME, and ICC statistics were quantified using the ‘measurement.error’ and ‘ICCstats’ functions, respectively, in the R package ‘RRPP’ (Collyer and Adams, Reference Collyer and Adams2024b). “Without groups” = no group parameter for the five extant Neotoma species. “With groups” = group parameter for the five extant Neotoma species included.

Of the 24 principal components generated via DAPC, the optimal number of retained principal components suggested by cross-validation is 6–22. Correct classification of extant species improves with each additional principal component retained up to 16 and plateaus thereafter. Therefore, 16 of the 24 principal components were retained in the final training models. Those models correctly classify 90.5% and 84.9% of specimens across all five extant species groups in models one and two, respectively (Table 3). Correct classification rates were 75–97.5% within species groups; Neotoma cinerea achieves the highest correct classification rate overall, and N. macrotis achieves the lowest correct classification rate in both models (Table 3). DAPC scatterplots show good morphospace separation of N. cinerea from all other extant species groups (Fig. 3). Substantial group overlap occurs between N. fuscipes and N. macrotis and, to a lesser extent, between N. albigula and N. lepida (Fig. 3).

Figure 3. Discriminant analysis of principal components (DAPC) scatterplot using the first 16 of 24 principal components generated from (A) model one (the first landmarking repetition of Neotoma m1s) and (B) model two (the second repetition). Data from extant N. albigula (Na), N. cinerea (Nc), N. fuscipes (Nf), N. lepida (Nl), and N. macrotis (Nm) from California were used to train the models; the 16 Project 23 fossil unknowns (dark gray diamonds) and 13 extant N. cinerea east of California (purple squares) were entered as supplementary individuals. Inset shows the cumulative variance explained by each principal component. Black bars and gray bars represent principal components included and excluded, respectively, based on DAPC cross-validation analysis in the R package adegent (Jombart et al., Reference Jombart, Kamvar, Collins, Lustrik, Beugin, Knaus and Solymos2023).
Table 3. Discriminant analysis of principle components classification statistics of models built with extant Neotoma species data from the first (top) and second (bottom) landmarking repetition. All 28 landmark variables are included in the analysis. The first 16 of 24 principal components were retained for each repetition, as suggested via the ‘xvalDapc’ cross-validation function in the R package ‘adegent’ (Jombart et al., Reference Jombart, Kamvar, Collins, Lustrik, Beugin, Knaus and Solymos2023). Columns indicate predicted species affinities; rows indicate actual species affinities; % Correct = the percentage of specimens correctly assigned to their species group; Na = Neotoma albigula; Nc = Neotoma cinerea; Nf = Neotoma fuscipes; Nl = Neotoma lepida; Nm = Neotoma macrotis.

Most fossil unknowns from RLB are identified as N. macrotis (12 of 16 specimens in both DAPC models, Table 4). All other specimens are identified as N. fuscipes. Both models agree on the species affinity of 14 of the 16 unknown specimens, and four of the 16 specimens meet our criteria for high predictive fidelity (all identified as N. macrotis, Table 4). All fossil unknowns plot within or near the observed morphospace of N. fuscipes and N. macrotis (Fig. 3). Shape differences in vector-displacement plots are obvious between the fossil population mean and extant Neotoma albigula, N. cinerea, and N. lepida means, particularly at landmarks 5, 10, and 11 (Figs. 2 and 4). Shape differences between the fossil population and extant N. fuscipes and N. macrotis are subtle and no strong vector displacements were observed (Fig. 4). Most recent specimens of N. cinerea from Idaho and South Dakota are correctly identified as N. cinerea (11 of 13 in both DAPC models), and both models agree on the species affinity of all specimens (Table 4). Most of these eastern specimens cluster within or near the morphospace of California N. cinerea (Fig. 3). However, discernible differences in mean m1 shape configurations occur between California N. cinerea and eastern specimens at landmarks 2, 3, 6, and 12 (Fig. 2, Supplementary Fig. 1).

Figure 4. Vector displacement plots of mean m1 landmark shape configurations for the five extant Neotoma species (n = 39–40 each) examined overlain with the mean landmark shape configuration of fossil Neotoma (n = 16) from Project 23 deposits at Rancho La Brea (P23). Configurations were generated using the first landmarking repetition. Also shown is the overlay of extant N. macrotis and N. fuscipes shape configurations (lower right). Arrows indicate the magnitude and direction of vector displacement between the reference and target groups. Plots were created using the ‘define.links’ and ‘plotRefToTarget’ functions in the R package geomorph (Adams et al., Reference Adams, Collyer, Kaliontzopoulou and Baken2024).
Table 4. Classification results of 16 fossil Neotoma m1s from Project 23 Deposits 1, 7B, 13, and 14 at Rancho La Brea and extant N. cinerea east of California. Predicted species-group membership (PGM) of each specimen is listed for each DAPC training model (1 and 2) followed by their 0–1 posterior probabilities (PP) of group membership. Values closer to one indicate higher probabilities of belonging to a particular species than values closer to 0. Underlined text indicates fossil specimens assigned to the same extant species in both models with posterior probabilities ≥ 0.80 in each. Taxa in bold indicate eastern specimens of N. cinerea misclassified as other species. ID = Idaho; SD = South Dakota; LACMP = Los Angeles County Museum of Paleontology; MVZ = University of California Museum of Vertebrate Zoology; SDSM = South Dakota School of Mines and Technology Museum of Geology.

DAPC species classification models improve marginally for wear-vetted datasets versus unvetted datasets overall (91.2% from 90.5% and 87.1% from 84.9%) across all extant species for models one and two, respectively (Table 3, Supplementary Table 2). Species-specific classification results improve slightly for all species except Neotoma lepida in both models. Vector displacement plots of mean m1 shape show no discernible difference between wear-vetted and unvetted datasets for either model (Supplementary Fig. 2).
Discussion
We examined whether we could differentiate extant Neotoma species and identify unknown fossils from Late Pleistocene Rancho La Brea localities in Los Angeles, California. Overall, we found that 2D landmark analysis of lower first molars is an appropriate approach for these goals, and we were able to identify a subset of fossils at RLB as N. macrotis with reasonable certainty.
Classification results
DAPC of 28 m1 landmark-coordinate variables was conducted to determine if (1) the five extant Neotoma species can be differentiated by tooth morphology and (2) these data can be used to identify fossils of unknown species affinity. DAPC was performed on both landmark repetitions to evaluate data replicability and biological signal variability due to ME. Our DAPC training models are relatively accurate overall, correctly predicting species affinities of 84.9–90.5% of the 199 recent California specimens examined (Table 3). Thus, in general, m1s of extant Neotoma species can be differentiated using this geometric morphometric approach. Our results also are robust to tooth wear variability given the marginal improvement in DAPC classification statistics (Table 3, Supplementary Table 2) and indiscernible mean shape differences (Supplementary Fig. 2) between wear-vetted and unvetted datasets.
Analyses of ME among the two dataset repetitions indicate that landmark ME is relatively low, explaining only ∼3% of the total data variance (R 2 values, Table 2). Systematic measurement error, which is generally of greater concern for altering biological signals and interpretations than random ME (Arnqvist and Mårtensson, Reference Arnqvist and Mårtensson1998; Collyer and Adams, Reference Collyer and Adams2024a), is also low relative to random ME (η 2 values, Table 2) and specimen variation (signal-to-noise ratio (SNR) values, Table 2). However, systematic ME contributions are statistically significant in both the group-included and group-excluded models (Table 2). High ICC values (> 0.9, Table 2) indicate good repeatability and thus relatively low ME among datasets (Liljequist et al., Reference Liljequist, Elfving and Roaldsen2019; Collyer and Adams, Reference Collyer and Adams2024b). Yet discrepancies between ICCC and ICC/ICCA statistics (Table 2) further imply these datasets are affected by systematic ME (Collyer and Adams, Reference Collyer and Adams2024a).
The extent to which ME affects biological signals varies among specimens, taxa, and studies. Species classification patterns are very similar among the two DAPC models in our study. For example, predicted group memberships are most accurate for N. cinerea and N. lepida, and least accurate for N. macrotis in both models (Table 3). However, DAPC-predicted group memberships are not identical for individual specimens in both models, and neither are DAPC scatterplot patterns (Fig. 3). This suggests that ME does affect biological signal to some extent. Of greater concern is how ME affects predicted group memberships of fossil unknowns because unlike our recent specimens, it is impossible to independently verify fossil species identity (ancient DNA retrieval has thus far been unsuccessful at RLB; Gold et al., Reference Gold, Robinson, Farrell, Harris, Thalmann and Jacobs2014). Predicted group memberships were consistent among training models in 14 of the 16 fossil individuals (∼88%) and the models were able to correctly identify most individuals of extant N. cinerea (∼85%) from distant geographic locations (Table 4). Therefore, while it is impossible to determine the accuracy of our models for identifying fossil unknowns, this approach appears robust due to the relatively high accuracy of DAPC training models in sorting known extant California species, the overall consistency in their predictions of fossil species identities among landmarking repetitions, and the models’ ability to correctly identity most specimens across space, which is often used as an ecological substitute for time (Pickett, Reference Pickett and Likens1989; Blois et al., Reference Blois, Williams, Fitzpatrick, Jackson and Ferrier2013).
Most errors and inconsistencies in recent and fossil-group membership predictions occur between Neotoma fuscipes and N. macrotis (Tables 3 and 4). This is not surprising as N. macrotis was considered a subspecies of N. fuscipes until phenotypic and molecular analyses found species-level differences among them (Matocq, Reference Matocq2002a, Reference Matocqb). Indeed, DAPC and vector displacement plots show more morphological similarity between N. fuscipes and N. macrotis than any other two species examined here (Figs. 3 and 4). DAPC classifies most (12 of 16) fossil unknowns from RLB as N. macrotis and the remaining specimens are classified as N. fuscipes (Table 4, Fig. 3). Marked shape differences occur between RLB m1s and the m1s of extant N. albigula, N. cinerea, and N. lepida, particularly among landmarks 5, 10, and 11, which capture the shape of the hypoflexid and entoflexid reentrant folds (Table 1, Figs. 2 and 4). However, RLB fossils are virtually indistinguishable from extant N. fuscipes and N. macrotis (Fig. 4).
Because most fossils are classified as N. macrotis in both landmark data repetitions, and several have high predictive fidelity (Table 4, Fig. 3), the presence of N. macrotis among the P23 faunas of RLB is strongly supported. It is more difficult to discern whether N. fuscipes is also present among these faunas, or if it is an erroneous model prediction given the comparatively low identification rates of this species and elevated misclassification rates among extant N. fuscipes and N. macrotis in both models (Tables 3 and 4). Previous work has shown that unknowns assigned to groups in low frequency should be interpreted cautiously, especially when models are not completely accurate at sorting training subjects of known group affinity (Fox et al., Reference Fox, Veneracion and Blois2020). Until stronger support for N. fuscipes is provided (e.g., via larger fossil sample sizes), we attribute only one woodrat species, N. macrotis, to the P23 faunas of RLB.
Systematic paleontology
Rodentia (Bowdich, Reference Bowdich1821)
Cricetidae (Fischer, Reference Fischer1817)
Neotoma (Say and Ord, Reference Say and Ord1825)
Neotoma macrotis (Thomas, Reference Thomas1893)
Referred specimens. m1s: Los Angeles County Museum of Paleontology (LACMP) LACMP23-34168, LACMP23-35665, LACMP23-40158, LACMP23-40402.
Remarks. Four Neotoma m1s from P23 Deposit 7B (LACMP23-35665), P23 Deposit 13 (LACMP23-40158, LACMP23-40402), and P23 Deposit 14 (LACMP23-34168) are assigned to N. macrotis based on predicted group membership assignment and posterior probabilities of group membership ≥ 0.80 in both DAPC models (Table 4).
Neotoma cf. N. macrotis (Thomas, Reference Thomas1893)
Referred specimens. m1s: LACMP23-28509, LACMP23-31175, LACMP23-31976, LACMP23-33922, LACMP23-35668, LACMP23-35780, LACMP23-35782, LACMP23-35806, LACMP23-35934, LACMP23-36312, LACMP23-40662. Left dentary with m1-m2: LACMP23-40663.
Remarks. Twelve Neotoma specimens from P23 Deposit 1 (LACMP23-28509), P23 Deposit 7B (LACMP23-35668, LACMP23-35780, LACMP23-35782, LACMP23-35806, LACMP23-35934, LACMP23-36312), and P23 Deposit 14 (LACMP23-31175, LACMP23-31976, LACMP23-33922, LACMP23-40662, LACMP23-40663) are assigned to Neotoma cf. N. macrotis. This includes specimens with predicted group membership of N. fuscipes in one or both DAPC models and/or posterior probabilities of Neotoma macrotis group membership ≤ 0.80 in one or both models.
Biogeographic implications
To our knowledge, this is the first report of N. macrotis from a prehistoric locality, likely due to the taxon’s recent distinction as a unique species and due to the challenges of identifying isolated elements of Neotoma to species in general. An early study of RLB small mammals was unable to identify fossil Neotoma dentaries from Localities 2050 and 2051 to species (Dice, Reference Dice1925). It was noted, however, that N. lepida and N. macrotis (then Neotoma fuscipes macrotis) were likely candidates based on size. Similarly, a recent study of small-mammal coprolites likely belonging to Neotoma from P23 Deposit 1 noted the fecal pellets were within the size range of extant N. lepida and N. macrotis (Mychajliw et al., Reference Mychajliw, Rice, Tewksbury, Southon and Lindsey2020). This is now the third study to independently support the occurrence of N. macrotis among RLB paleofaunas and the first to provide size-independent morphological evidence for it. Conversely, occurrence of N. lepida is not supported based on the small fossil m1 sample size acquired here. While it is possible that N. lepida and other unexamined members of the N. lepida species complex (e.g., N. bryanti) are present in the RLB paleofaunas, it is unlikely that any members are represented in this sample given the morphological similarity among the N. lepida complex (Patton et al., Reference Patton, Huckaby and Álvarez-Castañeda2007; Bradley et al., Reference Bradley, Edwards, Lindsey, Bateman, Cajimat, Milazzo, Fulhorst, Matocq and Mauldin2022) and the strong differentiation between N. lepida and N. fuscipes/N. macrotis specimens (Table 3, Figs. 3 and 4). Additional m1 samples from other deposits spanning different geologic ages are needed to discern whether other Neotoma species were present in the Los Angeles region during the Late Pleistocene and Holocene.
Rancho La Brea is located within the contemporary range of N. macrotis (Fig. 1), so its presence among these paleofaunas is not surprising. Rancho La Brea is also less than 100 miles from the southernmost N. fuscipes–N. macrotis contact zone (Fig. 1) (Matocq, Reference Matocq2002a), and the range of N. fuscipes may have shifted slightly southward towards RLB as past climate change shifted regions of habitat suitability for the species during Pleistocene glacial periods. Recent projections of the geographic distribution of N. fuscipes place it in southern California at the last glacial maximum (LGM), 21,000 yr BP (Boria et al., Reference Boria, Brown, Matocq and Blois2021), presenting another line of evidence that it was potentially in the Los Angeles region in earlier glacial times captured by P23. However, morphological differentiation among extant N. fuscipes and N. macrotis is the result of complex, ongoing ecological and evolutionary processes. Clear divergence in craniodental traits at species contact zones is consistent with competition-driven character displacement, but an asymmetric pattern of morphological change between the two species occurs as distance from the contact zone increases (Matocq and Murphy, Reference Matocq and Murphy2007). Even within N. fuscipes, there is marked genotypic (Matocq, Reference Matocq2002a, b; Boria et al., Reference Boria, Brown, Matocq and Blois2021) and phenotypic (Hooper, Reference Hooper1938; Matocq, Reference Matocq2002a) variation among geographic populations in California. Those observations are evident in this study as more variation is expressed in the m1 morphological ellipse of N. fuscipes than in any other California Neotoma species examined, some of which overlap with N. macrotis (Fig. 3). Interpreting whether N. fuscipes and N. macrotis, or N. macrotis alone, is present at RLB is further complicated because these two species today have multiple contact zones, including at least one area of sympatry where they hybridize (Matocq, Reference Matocq2002a; Coyner et al., Reference Coyner, Murphy and Matocq2015). It may therefore be impossible to identify some specimens within the N. fuscipes–N. macrotis complex to species based on tooth shape alone, especially if they occur near the contemporary range intersections of both species where hybridization can occur. In such cases, genomic analyses may be required for specific identification.
Methodological developments
We have shown that landmark-based shape configurations can help overcome the challenges of identifying isolated Neotoma molars in both modern and prehistoric systems. Previously, most species-level identifications of Neotoma teeth, bones, and fecal pellets were inferred based on contemporary species distributions and size. Such methods are only appropriate if one can assume no range shifts occurred in the past and if focal species are near the size minima and maxima of the genus where little interspecific overlap occurs (e.g., N. albigula and N. cinerea; Lyman and O’Brien, Reference Lyman and O’Brien2005; N. lepida and N. cinerea; Smith et al., Reference Smith, Crawford, Harding, Lease, Murray, Raniszewski and Youberg2009).
Shape-based identification of Neotoma remains have been attempted with mixed outcomes. For example, Zakrzewski (Reference Zakrzewski, Martin and Barnosky1993) and Repenning (Reference Repenning and Barnosky2004) described morphological characters of extant and fossil Neotoma dentition and inferred biogeographic (Repenning, Reference Repenning and Barnosky2004), biostratigraphic (Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993; Repenning, Reference Repenning and Barnosky2004), phylogenetic, and evolutionary (Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993) patterns based on those characters. It was acknowledged, however, that specific relationships among fossil taxa could not be inferred, in part, because complex features such as spatial relationships between different tooth reentrant folds and lophs are difficult to quantify via direct observation (Zakrzewski, Reference Zakrzewski, Martin and Barnosky1993). Harris (Reference Harris1984) attempted to statistically differentiate m1s of eight extant Neotoma species to facilitate fossil identification using a combination of linear measurements and qualitative characters. It was shown that most specimens could be correctly sorted into their respective species groups, but only when split into species pairs prior to discriminant analysis (Harris, Reference Harris1984). Additional data corrections were implemented to account for tooth-wear differences (Harris, Reference Harris1984) and, while somewhat successful, those treatments could be difficult to replicate for other studies.
More recently, Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) used elliptical Fourier analysis of 2D shape outlines to classify upper first molars (M1s) of six Neotoma species. Only 57% of specimens were correctly classified in that study, and four of the species groups were sorted at less than 20% accuracy (Tomé et al., Reference Tomé, Whiteman-Jennings and Smith2020). Discrepancies between our classification results and those of Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) are likely due to study design differences. For example, while both studies used a geometric morphometric technique, our study examined 14 discreet landmarks placed on anatomically homologous loci of the m1, while Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) examined M1 shape using a semiautomated ‘high-density’ method generating more than 100 points per outline. It could be that m1s exhibit more interspecific variation than M1s, or ‘traditional’ landmarks could be more effective at capturing interspecific shape variation than shape outlines in this system.
The degree to which mathematically applied, high-density, geometric morphometric configurations can obscure biological signals remains unclear, but recent work suggests these methods should be interpreted cautiously (Cardini, Reference Cardini2020; Shui et al., Reference Shui, Profico and O’Higgins2023). Occlusal outlines of Neotoma molars could also be more sensitive to wear-based shape variation than our discrete landmark configuration. Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) found that removing specimens with more wear (stage 4) and less wear (stage 2) than average (stage 3) provided little improvement to species classification results. However, the most extreme wear stages (1 and 5) were excluded from that study (Tomé et al., Reference Tomé, Whiteman-Jennings and Smith2020). Our statistical and visual comparisons of wear-vetted and unvetted datasets showed little change when specimens near both wear extremes (stages 1, early 2, and 5) were removed (Table 3 versus Supplementary Table 2, Supplementary Fig. 2). Specimens exhibiting those wear stages encompassed a relatively small proportion of our original dataset (∼14%, Supplementary Table 1), and it would be beneficial to evaluate datasets with higher proportions of specimens near both extremes of the wear spectrum in the future. Nevertheless, our results suggest that data noise generated by m1 wear did not substantially alter interspecific morphological signals captured by our landmark configuration.
Other sources of variation in classification results could include differences in sampling approaches among studies. Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) examined six Neotoma species with species-group sizes between 20 and 90 while our study examined five species with group sizes between 39 and 40. Both the number of groups and group sample sizes entered in discriminant function and canonical variate analyses can have a large effect on model outcomes (Kovarovic et al., Reference Kovarovic, Aiello, Cardini and Lockwood2011). Tomé et al. (Reference Tomé, Whiteman-Jennings and Smith2020) also sampled specimens across a larger geographic range than we did and noted that M1 shape of some Neotoma species could be more affected by local diet and environment than phylogeny. For extant species with broad distributions such as N. cinerea (Fig. 1), we would thus expect more intraspecific variation to be captured if we sampled across more of their range due to local environmental adaptations. Indeed, our limited sample of N. cinerea from Idaho and South Dakota showed morphological differences from our California sample (Supplementary Fig. 1). Shape configuration differences were most prominent at landmarks 2, 3, 6, and 12, which correspond to features of the protoflexid, hypoconid, and metaconid (Table 1, Fig. 2, Supplementary Fig. 1). Despite those differences, our DAPC models were able to correctly identify most individuals to species regardless of their geographic origin (Tables 3 and 4). It is potentially significant that the two individuals misclassified outside California (SDSM 171655 and SDSM 171656) are from the eastern edge of the species’ range and exhibit wear stages 5 and 1, respectively. This suggests that, while our landmark protocol bodes well for extant and fossil Neotoma identification, interactions between phylogeny, wear/ontogeny, diet, geography, and environment may affect tooth morphology in complex ways. Therefore, additional applications of this method across a broader range of taxa, individuals, and locations are needed to better contextualize its advantages and limitations for species classification.
Summary and conclusions
We found that 2D geometric morphometric analysis of Neotoma m1s is generally effective at differentiating five extant western North American species. Most individuals were correctly classified within their respective species groups via DAPC; although higher rates of morphological overlap and misclassification occur between the two most recently diverged species, N. fuscipes and N. macrotis (Table 3, Figs. 3 and 4). By applying this method to classify fossil Neotoma m1s, we identified N. macrotis in the P23 deposits of RLB during the Late Pleistocene. Our study shows that landmark-based shape configurations can provide a size and contemporary range-independent option for identifying Neotoma remains to species. Because fossil teeth tend to preserve over a wider range of environments and time periods than fossil middens, this method could extend the spatiotemporal range of Neotoma species occurrences and further our understanding of their biogeography and paleoecology. Our relatively simple identification procedure should be applicable to other study systems and locations in North America.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/qua.2025.9.
Acknowledgments
This project was supported by National Science Foundation EAR-1623852 to JLB and American Society of Mammalogists Grant-In-Aid of Research to NSF. We thank C.J. Conroy at the UC Museum of Vertebrate Zoology and G.T. Takeuchi and A.B. Farrell at the La Brea Tar Pits and Museum for research access and support, collaborators J.L. Gill, J.D. Yeakel, and E.L. Lindsey for fruitful discussions of Rancho La Brea small mammals, and Quaternary Research editors, Melissa Pardi, and an anonymous reviewer for their constructive comments and suggestions. Fossil specimens were made available through the collective excavation, matrix sorting, and preparation efforts of many La Brea Tar Pits and Museum staff and volunteers — we thank you all.
Data availability
Specimen images and landmark coordinate data are available at Zenodo: https://doi.org/10.5281/zenodo.15127876. Analytical input data, code, and output data/figures are available on GitHub: https://github.com/bloispaleolab/Neotoma.m1.GM.
Competing interests
The authors declare no competing interests.