Publication bias is ‘the tendency on the parts of investigators, reviewers and editors to submit or accept manuscripts for publication based on the direction or strength of the study findings’ (Higgins & Green, 2009). In other words, publication bias means that studies with statistically significant results have a higher probability of being published. Studies with no statistically significant differences (or not favoring the investigational drug) may therefore not be found in commonly accessed databases such as PubMed or MEDLINE (Cipriani, Girlanda & Barbui, 2009). Turner and colleagues, who recently analyzed 74 randomized placebo-controlled antidepressant trials submitted to the Food and Drug Administration (FDA) for marketing authorization, found that approximately one-third of all studies went unpublished and that publication status was directly associated with study outcome (Turner et al., 2008). Thirty-seven of 38 studies with positive results were published, whereas only a minority of the studies considered by the FDA as having negative results were published (or they were published in a way that conveyed a positive outcome). This inflated the apparent efficacy of antidepressants: the overall effect size in the published literature was 32% larger than the effect size estimated when unpublished data were included. Even though some authors have in the past raised concerns about the reliability of unpublished data, which do not undergo any peer review process (Chalmers et al., 1987), Turner and colleagues clearly showed how important it is to have access to unpublished literature when carrying out a systematic review (SR), given that the main aim of SRs is to provide unbiased estimates of treatment effects (Purgato et al., 2010).
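To put a 32% inflation in concrete terms (the figures below are illustrative, not the exact values reported by Turner and colleagues): if the pooled effect size estimated from published trials alone were 0.41, while the estimate including unpublished trials were 0.31, the published literature would overstate efficacy by 0.41/0.31 − 1 ≈ 0.32, that is, by about 32%.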
Unpublished data refer not only to studies that are never published at all, but also to information that is not included in the study reports published in scientific journals. Retrieving unpublished evidence therefore represents a considerable challenge, and there is no standardized way of collecting it. The first step is to check whether unpublished material exists. Nowadays there are registries of randomized controlled trials in which studies must be registered before recruitment starts. ClinicalTrials.gov was started in 2000 and is now the largest single registry of clinical trials (Zarin & Tse, 2008). This database is freely accessible, and the website (http://www.clinicaltrial.gov) reports useful information: a summary of the study protocol with participant demographics and baseline characteristics, primary and secondary outcomes, and disclosure of agreements between sponsors and researchers; it is also made clear, for example, whether the recruitment phase is still ongoing or the study has been completed. A summary of study results is additionally expected to be posted (these data should be available to the public within 12 months of trial completion or within 30 days of FDA approval or clearance of a new drug). Each study has a unique ID number. Trial registration is required by local ethics committees for study approval, and a policy of the International Committee of Medical Journal Editors requires prospective trial registration as a precondition for publication.
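For readers who prefer to query the registry programmatically rather than through the website, the following is a minimal sketch in Python. It assumes the ClinicalTrials.gov v2 REST API (endpoint https://clinicaltrials.gov/api/v2/studies) and the third-party requests library; the endpoint, parameter names and JSON field paths are assumptions on our part, and the search term is purely illustrative.

```python
# Minimal sketch: list registered trials for a given search term via the
# ClinicalTrials.gov v2 REST API (assumed endpoint and JSON layout).
import requests

API_URL = "https://clinicaltrials.gov/api/v2/studies"  # assumed v2 endpoint


def list_registered_trials(term, page_size=20):
    """Return (NCT ID, overall status, brief title) tuples for studies matching `term`."""
    response = requests.get(API_URL, params={"query.term": term, "pageSize": page_size})
    response.raise_for_status()
    results = []
    for study in response.json().get("studies", []):
        protocol = study.get("protocolSection", {})
        ident = protocol.get("identificationModule", {})
        status = protocol.get("statusModule", {})
        results.append((ident.get("nctId"),
                        status.get("overallStatus"),
                        ident.get("briefTitle")))
    return results


if __name__ == "__main__":
    # Illustrative search; each NCT ID can then be checked against PubMed.
    for nct_id, status, title in list_registered_trials("paroxetine depression"):
        print(nct_id, status, title)
```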
Once researchers have collected the number of available studies and the corresponding IDs, it is possible to check whether each study has been published or not. If a study has been carried out but not published, a full report of the study results may be found on the internet.
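As a sketch of how this look-up could be partly automated, the snippet below searches PubMed for a given registry ID through NCBI's E-utilities (esearch endpoint). The assumption that the NCT number appears in the indexed record (for example as a secondary ID or in the abstract) is ours, the NCT number shown is a hypothetical placeholder, and a manual search remains advisable for trials that return no hits.

```python
# Minimal sketch: check whether a trial ID (e.g. an NCT number) retrieves any
# PubMed records, using NCBI's E-utilities esearch endpoint.
import requests

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_ids_for_trial(trial_id):
    """Return the list of PubMed IDs whose records mention `trial_id`."""
    response = requests.get(ESEARCH_URL,
                            params={"db": "pubmed", "term": trial_id, "retmode": "json"})
    response.raise_for_status()
    return response.json()["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Hypothetical NCT number: an empty list suggests (but does not prove)
    # that the study has not been published.
    print(pubmed_ids_for_trial("NCT00000000"))
```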
Websites of regulatory agencies are one option, but the most informative sources of unpublished data are the websites of pharmaceutical companies. Not all companies have study registries, and not all of those that do report results in an easy-to-use and comprehensive format. In the field of antidepressants, one of the best examples of transparency is the GSK website. As a consequence of a legal settlement between GSK and New York State, following concerns about the lack of transparency of paroxetine clinical trials in children and adolescents, in 2004 the Attorney General's office required GSK to develop a publicly accessible online results database for the timely and comprehensive posting of results of company-marketed drugs (http://www.gsk-clinicalstudyregister.com/). Fig. 1 shows the format in which study results are presented. Having access to this information may be useful even when the published paper is available, because it can help retrieve missing information (standard deviations are often not reported in the published tables), clarify the true number of randomized patients, or provide the exact figures for mean change (changes in rating scales are commonly reported only in graphs).
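As an illustration of why the missing standard deviations matter, the sketch below computes a standardized mean difference (Hedges' g, the standard small-sample-corrected form of Cohen's d) from the group means, standard deviations and sample sizes that a registry report might supply but a published table might omit; all numbers in the example are invented for demonstration only.

```python
# Minimal sketch: standardized mean difference (Hedges' g) from summary data
# that a trial registry report may provide but a published table may omit.
import math


def hedges_g(mean_drug, sd_drug, n_drug, mean_placebo, sd_placebo, n_placebo):
    """Cohen's d between two arms, with the Hedges small-sample correction."""
    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(((n_drug - 1) * sd_drug ** 2 +
                           (n_placebo - 1) * sd_placebo ** 2) /
                          (n_drug + n_placebo - 2))
    d = (mean_drug - mean_placebo) / pooled_sd
    # Small-sample correction factor: 1 - 3 / (4 * df - 1), with df = n1 + n2 - 2.
    correction = 1 - 3 / (4 * (n_drug + n_placebo) - 9)
    return d * correction


if __name__ == "__main__":
    # Invented figures: mean change on a rating scale, SD and N per arm.
    print(round(hedges_g(10.2, 7.5, 120, 7.9, 7.8, 118), 2))
```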
In SRs, the inclusion of unpublished data is of the utmost relevance, above all in fields where many studies are available. A pragmatic approach that readers may adopt to assess whether an SR is comprehensive and systematic is to check whether data from unpublished trials are included in the analysis.