Macroscopic maturity staging data are widely used to distinguish between reproductive and non-reproductive individuals, on the implicit assumption that these data are accurate. The accuracy of macroscopic maturity staging of North Sea herring (Clupea harengus) has not been checked since the macroscopic scale was produced in 1961. The aim of this study was to assess the accuracy of macroscopic maturity staging of female North Sea herring by comparison with histological staging and the gonadosomatic index (GSI). Ovary samples were collected during the North Sea Herring Acoustic Survey in 2006 on board FRV ‘Scotia’ (Scotland) and in 2007 on board FRV ‘Scotia’ and RV ‘Johan Hjort’ (Norway). Commercial samples were also collected by Marine Scotland, Aberdeen, in both years. Maturity staging error was relatively low in 2006 (21%) but much higher in 2007 on board both FRV ‘Scotia’ (57%) and RV ‘Johan Hjort’ (47%). Differences in the proportion mature led to an estimated 27% under-estimation of the spawning stock biomass (SSB) in 2007, but to no change in the SSB estimate in 2006. GSI cut-off scores, estimated by means of multinomial regression models, separated immature females from both mature-active and recovering females; however, there was some overlap between mature-active and recovering individuals. We conclude that an effective and low-cost means of reducing error in herring maturity studies is the combined use of a four-point macroscopic maturity scale with routinely collected GSI data, the latter acting to validate and fine-tune macroscopic staging.
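As a minimal illustration of the two quantities compared above, the sketch below computes GSI and a staging error rate. It assumes that GSI is gonad weight expressed as a percentage of body weight, and that staging error is the fraction of fish whose macroscopic stage disagrees with the histological stage; the abstract does not state the exact definitions used in the study. The function names, stage labels, and sample values are hypothetical.

```python
def gsi(gonad_weight_g, body_weight_g):
    """Gonadosomatic index: gonad weight as a percentage of body weight.

    Assumption: body weight is the denominator; some studies use
    somatic (gonad-free) weight instead.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * gonad_weight_g / body_weight_g


def staging_error(macroscopic, histological):
    """Fraction of fish whose macroscopic stage differs from histology.

    Histological staging is treated here as the reference.
    """
    if len(macroscopic) != len(histological) or not macroscopic:
        raise ValueError("stage lists must be non-empty and of equal length")
    mismatches = sum(m != h for m, h in zip(macroscopic, histological))
    return mismatches / len(macroscopic)


# Hypothetical example data, not taken from the study.
macro = ["immature", "mature-active", "mature-active", "recovering"]
histo = ["immature", "mature-active", "recovering", "recovering"]

example_gsi = gsi(gonad_weight_g=20.0, body_weight_g=200.0)
example_error = staging_error(macro, histo)
```

A GSI cut-off would then be applied by comparing each fish's GSI against a threshold estimated from the data; the thresholds themselves are not given in the abstract and are not reproduced here.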