Visual inspection remains the most frequently applied method for detecting treatment effects in single-case designs. The advantages and limitations of visual inference are here discussed in relation to other procedures for assessing intervention effectiveness. The first part of the paper reviews previous research on visual analysis, paying special attention to the validation of visual analysts' decisions, inter-judge agreement, and false alarm and omission rates. The most relevant factors affecting visual inspection (i.e., effect size, autocorrelation, data variability, and analysts' expertise) are highlighted and incorporated into an empirical simulation study with the aim of providing further evidence about the reliability of visual analysis. Our results concur with previous studies that have reported the relationship between serial dependence and increased Type I rates. Participants with greater experience appeared to be more conservative and used more consistent criteria when assessing graphed data. Nonetheless, the decisions made by both professionals and students did not match sufficiently the simulated data features, and we also found low intra-judge agreement, thus suggesting that visual inspection should be complemented by other methods when assessing treatment effectiveness.