Introduction
Science and medicine research initiatives are increasingly interdisciplinary, and the advances in technology demand the ability to acquire, store, analyze, and interpret data. This has led to an increasing need to integrate collaborative biostatisticians within scientific research teams to meet the demand. Many Academic Medical Centers (AMCs) facilitate efficient and reproducible research with centralized collaborative resources that provide access to quantitative experts. There are numerous models and considerations for building and maintaining these groups, including training, evaluating, and establishing realistic expectations for biostatisticians [Reference Rubio, del Junco and Bhore1–Reference Perkins5]. The training and professional development of collaborative biostatisticians (staff and faculty) are necessary to maintain successful and effective biostatistics collaborative programs in AMCs, as well as in institutions focused on clinical or translational science. This article provides a roadmap for the biostatistics collaboration process and proposes specific training strategies to support professional development of clinically integrated collaborative biostatisticians (CBs).
In scientific research, including clinical and translational research, the CB is responsible for ensuring that the design approaches and analytic methods are sound and sufficiently rigorous, so that reasonable inferences and/or accurate predictions can be made. It is imperative that CBs work closely with their clinical and/or translational collaborators as well as with research stakeholders (institution leadership, funding entity or sponsor, etc.) to ensure that results are interpreted appropriately to inform clinical practice and decision-making that will ultimately affect patient care or health outcomes. This requires CBs to provide high-quality analyses, facilitate reproducible research workflows, and communicate statistical results effectively in interdisciplinary collaborative environments. It is important to provide CBs with a clear set of principles from which they can build a solid foundation for implementing sound scientific, business, and regulatory practices. CBs must be able to articulate the purpose and importance of their work to their collaborators and stakeholders. Additionally, CBs must have broad knowledge and understanding of statistical approaches that can be applied in developing, implementing, and revising analytic methods that will enhance the quality of the research. This knowledge will also help facilitate the application of standard and sometimes simpler methods that are pragmatic, thereby increasing both efficiency and interpretability. Equally important are the skills needed to develop robust workflows to ensure that the research goal is achieved in a reproducible manner that upholds the scientific integrity of the project, and that all ethical guidelines for statistical practice are followed [6]. This requires CBs to demonstrate good judgment, effective communication, and methods that will efficiently strengthen scientific rigor. Fig. 1 displays the essential skills needed for success as a CB and provides guidance that should be incorporated in hiring and training practices. The topics and skills presented in Fig. 1 were collected from the current literature [Reference Perkins5,Reference Tobi7–Reference Begg and Vaughan10] and through anecdotal experience of the leadership of 13 biostatistics units across academia and industry who have vast experience training CBs. The goal of this article is to present and describe comprehensive training methods that can be adapted to any quantitative collaboration group when training staff and faculty CBs, including new or existing units as well as biostatisticians who are embedded within clinical units.
The Joint Task Force for Clinical Trial Competency, a diverse working group tasked with creating a core competency framework for clinical research, has developed a framework that consists of 47 competency statements within 8 competency domains, each described at the “Basic”, “Skilled”, and “Advanced” skill levels [Reference Sonstein11–Reference Hornung14]. This is a comprehensive list for clinical team members; however, it does not reference specific CB competencies or collaboration with CBs. Several articles have previously discussed the need for effective interdisciplinary collaboration and have collectively outlined core competencies needed for CBs to be successful in AMCs [Reference Tobi7–Reference Begg and Vaughan10,Reference Samsa15]. Begg and Vaughan (2011) reviewed program requirements used to teach interdisciplinary skills at ten top-rated schools and provided a review of related literature. They report that to improve the necessary skills training required for effective interdisciplinary research collaborations, some analytically focused degree programs (biostatistics, epidemiology, etc.) train students to become skilled team scientists. Such programs provide foundational education for CBs. Several articles discuss how these competencies might be integrated into educational curricula and summarize key competencies as: broad and high-level expertise in statistical theory and methods, broad understanding of specific and relevant biomedical areas, and communication and leadership [Reference Tobi7,Reference Samsa9,Reference Begg and Vaughan10,Reference Samsa15]. Previous literature focuses on developing general frameworks, defining competencies, and integrating training into curriculum design. This article specifically focuses on providing additional material concentrated on training CBs to fill gaps in the existing literature. The competencies outlined in the previous literature, along with experience by the authors, were used to develop the proposed collaborative roadmap as well as targeted training strategies for CBs in practice.
Biostatistics and other quantitatively focused degree programs sometimes include training to develop consultation or collaboration skills. However, such training often offers limited exposure to the full collaborative process. Additional training, such as experience in real collaborations and/or working with actual patient data, can be obtained through graduate research assistantships or intern experiences, particularly when in the setting of a high-volume collaborative AMC. Students who complete their degrees in graduate programs that include extensive collaborative training and are then hired into a full-time CB position will often still require additional job-based training and mentoring to equip them with the skills needed to succeed as an effective interdisciplinary collaborator. Additionally, those transitioning into a CB role from a variety of quantitative backgrounds (e.g., theoretical statistics, data science, or epidemiology) or from graduate programs other than biostatistics may need additional training related to specific methods or soft skills to excel as part of an interdisciplinary team.
In some academic institutions, biostatistics faculty take on a collaborative role, and in others, staff are also expected to serve as interdisciplinary collaborators. Spratt et al. (2017) describe the challenges of building career ladders and providing clear expectations for collaborative biostatistics faculty in an AMC [Reference Spratt16]. To the best of our knowledge, there is a paucity of literature discussing these topics for research staff (Master’s or PhD level). There are also differences in the structural placement of biostatistics faculty and staff within institutions, with some institutions hiring directly into clinical departments, while others build a centralized core group of biostatisticians who collaborate across multiple clinical disciplines. Biotechnology and pharmaceutical companies also vary widely in the structure of biostatistics units. The proposed training program is sufficiently comprehensive, such that it can be easily modified and incorporated into any of these settings.
The training methods presented in this article were initially developed by the Duke Biostatistics, Epidemiology, and Research Design Core (BERD Core) and have been in use for the last four years to train CBs who have collaborated on more than 700 projects resulting in more than 400 publications and presentations. The training addresses skills that are necessary to navigate through every phase of the research process, from research question development and study design specification through analysis, publication, and/or implementation. To outline the collaborative process, Fig. 2 provides a roadmap that describes the key phases of the biostatistics collaborative process that CBs are trained to navigate. Subsequently, the training methods and collaboration roadmap were agreed upon by leadership of biostatistics units at 13 different institutions and companies. While the material presented here is specifically proposed for biostatisticians, it could be adapted for use with other interdisciplinary research-oriented groups including but not limited to clinical or translational data scientists, biomedical informaticists, and imaging scientists. In this article, first, the collaboration roadmap for biostatistics is proposed; then, a novel training strategy for CBs to navigate this roadmap with tangible examples is presented, and finally, we discuss previous cases of the training methods.
The Biostatistics Collaboration Roadmap and Training Methods
In each of the 12 phases of the biostatistics collaboration process (Fig. 2), the CB has a key role that requires specific training to support. It is expected that the CB will ensure appropriate completion of most of the phases of the collaborative process, especially when a project manager is not involved. As such, these individuals must be able to take the initiative to move projects forward and meet appropriate deadlines. As a biostatistician gains experience with a specific clinical or scientific research area, they develop expertise regarding common outcomes, data structures, and analytic methods. As this expertise advances, collaborations in this research area become increasingly effective and efficient. Long-term collaborations are strongly encouraged, as they foster the development of strong partnerships and allow important statistical concepts and approaches to become fully integrated into the team’s research endeavors. The roadmap presented in Fig. 2 represents a generalized collaborative process. Research collaborations typically are not linear and may iterate through these steps depending on the project details, funding models, and overall project goals. CBs may join a collaboration after study design has occurred, data collection is complete, or even after a first failed try at publication. While the role of the CB throughout the process will vary by project, it is important for all CBs to understand what their contributions may involve during each phase of the collaboration.
Initial Collaboration Phases (Phases 1–4)
The first phase of the collaborative process is the initial discussion between the biostatistician and scientific collaborator(s), which optimally should occur while investigators are developing their research questions. Including the CB in this initial step allows them to fully participate in the brainstorming and development of primary and secondary research questions and associated statistical hypotheses. The CB may offer insight into the most statistically appropriate way to frame the question, as well as providing input on methods for data collection or identification of existing datasets that will be the most appropriate. The CB then needs to understand additional clinical/scientific background information, literature, data sources, related outcomes, and limitations. This process should involve ongoing conversations that ensure ample opportunity for the CB to seek clarity when not familiar with a concept. Expectations of the collaborative roles and authorship should be discussed early on within the collaborative process [17].
Statistical Analysis and Dissemination Phases (Phases 5–12)
The next phase involves outlining appropriate statistical analysis strategies that will eventually evolve into a detailed statistical analysis plan (SAP). The SAP serves to outline the study background, goals (often research hypotheses), data specifications, response and explanatory measures, and all proposed statistical methodologies that will be used to address the research questions. An example template SAP can be found on the Duke BERD Core website [18]. Once the SAP is completed and agreed upon by collaborators, data collection or acquisition of an already existing dataset can commence. The CB may or may not be involved in the day-to-day activities of data collection, but should be in regular communication with those performing this task. Prior to initiating final statistical analyses, the data should be thoroughly cleaned and reviewed, and a “locked” dataset should be created either by or under the direction of the CB. Analysis per the SAP should then be completed using this dataset, results should be summarized by the CB in the form of an analysis report, and statistical inferences from the results should be explained by the CB. In the case of a continuous improvement project, one locked dataset may not be feasible, but datasets used to make decisions at each stage should be stored and documented as appropriate to ensure reproducibility. A formal report should be created that includes all relevant information, and the CB should collaborate on the dissemination of the information (e.g., co-authoring manuscripts). After final dissemination and/or implementation is complete, the CB should package all the work in a way that allows for full reproducibility of their contributions [Reference Ioannidis19–Reference Begley and Ioannidis21]. As the CB gains expertise in the specific scientific area through long-term collaborative engagement, this can facilitate efficient navigation through all phases.
Training for the Biostatistics Collaboration Process
The training strategy, initially developed in the Duke University Biostatistics, Epidemiology, and Research Design (BERD) Core, outlines clear expectations for supervisors who are charged with training junior biostatisticians, and is sufficiently flexible to be adapted for a variety of settings. To ensure biostatisticians are able to effectively navigate the phases of the collaborative process, we propose training that focuses on 16 topic areas. These areas relate to three main core competency areas, as suggested by the current literature [Reference Tobi7]: (1) Clinical and Domain Knowledge (broad understanding of specific and relevant biomedical areas, Table 1A); (2) Statistical Expertise (broad and high-level expertise in statistical theory and methods, Table 1B); and (3) Communication and Leadership (including organizational behavior, Table 1C).
Abbreviations: CB = Collaborative Biostatistician; CITI = Collaborative Institutional Training Program; CRF = Case Report Form; DCF = Data Collection Form; EHR = Electronic Health Record; FDA = Food and Drug Administration; FOA = Funding Opportunity Announcement; GCP = Good Clinical Practice; GLP = Good Laboratory Practice; ICH = International Council for Harmonisation; IRB = Institutional Review Board; MOO = Manual of Operations; NIH = National Institutes of Health; ODS = Output Delivery System; PCORI = Patient-Centered Outcomes Research Institute; RFA = Request for Application; SAP = Statistical Analysis Plan; SOP = Standard Operating Procedure.
Clinical and domain knowledge includes topics that relate to the responsible conduct of research, knowledge, and understanding of institutional structure, as well as understanding of the clinical/scientific domain.
Statistical expertise refers to the skills needed to implement statistical approaches, develop statistical analysis plans, implement reproducible coding practices, learn new statistical methods as needed, and conduct relevant literature reviews.
Communication and Leadership refers to effective written and verbal communication skills, leading effective collaborative meetings, and time/project management. The importance of developing soft skills (e.g., communication, interpersonal skills, etc.) is also addressed in this section [Reference Zapf22,Reference Chuang-Stein23].
Table 1A–C is organized using these three broad areas, with sub-topics, to allow for training of specific skills. These tables describe the approaches that have been developed to Teach the necessary skills, monitor the Implementation of these skills, and Evaluate the proficiency of these skills (TIE). During the “Teach” stage of training, key learning objectives are formally outlined, and learning strategies are provided by the CB’s supervisor. The “Implementation” stage lists specific tasks that the CB is expected to complete in order to reinforce learning of the topics outlined in the “Teach” stage. In the essential “Evaluation” stage, the CB receives critical feedback and opportunities to identify knowledge gaps that can be addressed through iterative exploration of the TIE process to improve skills and enhance knowledge. Table 1A–C presents the TIE process for each of the three main competencies that a CB must master (see Fig. 1).
At Duke, we have implemented TIE using “train the trainer” and “near-peer” models. The supervisor trains one person and mentors the trainee until they are ready to move into a supervisor role. Their ability to supervise is evaluated using the TIE methodology by leadership in the unit. This model repeats as junior CBs mature, and additional research staff and faculty are hired. The goal is that all trained CBs, within any role and in any collaborative clinical or translational science setting, could train other CBs who are junior to them. Evaluations can be conducted informally during regular management and/or as part of a formal annual evaluation process. At Duke, evaluations are conducted via annual online surveys in which biostatistics supervisors and collaborators rate the staff member’s ability in each competency category (on a Likert scale). Regular evaluations of each skill in Table 1 are conducted by the staff member’s manager. We present examples in the next section to provide specific methods that can be used for teaching, implementing, and evaluating specific skills.
Example of TIE in Clinical and Research Domain – Fulfilling Regulatory Requirements
All CBs must be sufficiently trained in regulatory requirements for research. Failure to follow regulations can result in termination of IRB approval, withdrawal/restriction of research funding, and fines. CBs may need to be knowledgeable about NIH and FDA guidelines and regulations for clinical research, institution-specific guidelines and regulations for scientific research, and standard/good scientific research practice.
Teach
The “Teach” stage may involve the CB viewing online learning modules, attending in-person trainings, and formally reviewing NIH, FDA, or other relevant guidelines with structured discussion led by senior/supervising biostatisticians.
Implement
The CB may be required to implement institutional processes and complete all in-person and online institutional and commercial training modules (e.g., Collaborative Institutional Training Initiative Program [24]). Additionally, the CB could be tasked with explaining processes and specifics to other CBs and the study team to demonstrate mastery of the topics. In this stage, CBs would work to implement all requirements as appropriate within their active research projects and discuss questions with their supervisors.
Evaluate
Supervising biostatisticians will conduct a formal review to confirm that the CB has acquired the skills necessary to follow required regulatory guidelines and perform research and analyses in accordance with guidelines. These skills will be evaluated through detailed review of their work and verification that all required regulatory elements are in place (e.g., listing as key personnel on protocols). Supervisors will also evaluate the CB by critically reviewing all regulatory materials created by the CB to be submitted internally, or to the NIH or FDA. The final step in the Evaluation will be the CB’s successful completion of online and in-person training (indicated by receiving confirmation or certification of completion of post-training tests).
Example of TIE in Statistical Expertise – Statistical Analysis Plans (SAPs)
Creation of the SAP requires understanding of the background, research questions, and hypotheses, as well as the data structure and statistical methodology that will be used to answer the research question. CBs must effectively communicate with the study team and review relevant materials to fully understand the scope and goals of a project in order to draft the SAP. Drafting of the SAP and acquiring final approval from the scientific collaborator ensures that both parties agree on the project specifics (i.e., analytic approaches, outcomes, data, etc.) and expectations of each role within the collaborative setting [18].
Teach
The CB will work to understand why SAPs are necessary and the specific elements that should be included. They will also learn the process for creating SAPs for a variety of study types. To teach this skill, the CB can review SAPs from the supervisor’s previous projects and discuss potential issues that could have occurred if all key components were not included. A template SAP can be provided, so the CB can understand all key components. The supervisor can also task the CB with writing the first draft of an SAP, which can then be compared to the draft written by the supervisor, followed by discussion of differences. This technique will help the CB think about why they missed certain key points and help them in drafting future SAPs.
Implement
The CB critically reviews sample and example SAPs, participates in online training, and writes draft SAPs using templates provided by the supervisor. The CB may also be asked to practice explaining what SAPs are and why they are important to peers, the supervisor, or scientific collaborators (with guidance and oversight from supervisor). They will also be asked to draft SAPs for projects they are collaborating on, with supervision.
Evaluate
The supervisor will review the draft SAPs created by the CB, identify gaps and strengths, and provide constructive criticism verbally and in writing. The supervisor will provide the CB with a list of additional questions that should be clarified by the scientific collaborator. Over time, the supervisor can track how many questions the CB missed in order to evaluate growth. Evaluation of competency with this skill could also entail a reduction in the number of edits needed by the supervisor.
Example of TIE in Communication and Leadership – Using Strong Statistical Voice
A strong statistical voice in this context is considered to be the ability to confidently and accurately advocate for appropriate quantitative methodology and resulting interpretations. CBs must be able to actively listen, absorb, and convey their opinions during all aspects of collaborative research, from study design and hypothesis generation to results interpretation and dissemination. CBs must also serve as advocates for the use of appropriate methodology and interpretation, something that can be challenging when scientific collaborators are unfamiliar with statistical methods or do not obtain the results they were hoping to see [Reference Zapf22]. Using a strong statistical voice is an important skill that can affect which methods are used and how results are interpreted and disseminated. Therefore, a strong statistical voice is vital in study planning, especially when selecting outcomes of interest, designing the data collection approach, and conducting power and sample size calculations. If the CB does not effectively communicate which analytic approaches are most appropriate, studies may suffer from poor design and statistical errors or fail to achieve desired power to detect results. Similarly, if a CB does not ask the right questions or communicate effectively with a scientific collaborator to identify critical variables that should be collected and analyzed, the results may not accurately address the research questions of interest.
Teach
Skills to “Teach” include the importance of gaining an in-depth understanding of the research questions and critical evaluation of introduction and background material to ensure sufficient understanding of the motivation for the research project. Additionally, when appropriate, assertive advocacy of one’s position regarding appropriate methodology or lack of appropriate data to properly address a research question is important. This can include the ability to balance realistic modifications to statistical methodology proposed by collaborators while ensuring rigor and reproducibility of results. A clear and detailed understanding about leadership structure and how to identify senior-level biostatisticians and/or faculty to serve as support is needed.
Implement
These skills can be “Implemented” by providing examples of research statements to CBs and requiring them to translate these statements to research questions and testable hypotheses. Additionally, CBs can shadow collaborative meetings and be required to document how senior biostatisticians ask targeted questions and advise on appropriate analytical approaches.
Another approach to implement this skill involves having CBs role-play with their supervisor, so they can practice advocating for appropriate methodologies and research considerations in different types of challenging research project settings.
Evaluate
Supervising biostatisticians can then “Evaluate” these skills by observing how the CB communicates thoughts and concerns in a team meeting and through email communications. The supervisor can then provide feedback and discuss additional questions or methods that the CB should have considered. Peer-to-peer role-playing may also be an effective tool to incorporate, as it would allow the CB to focus on growth and practice without the added pressure of evaluation. Evaluation for the role-playing scenario can be targeted to evaluate specific communication skills. For example, the supervisor can play the role of the scientific collaborator who proposes the use of inappropriate methodology to answer the research question, stemming from a lack of understanding of the assumptions of the methodology. The supervisor can then evaluate how well the CB is able to engage in a conversation about why the assumptions would be violated and how effectively the CB can communicate with the “collaborator” and successfully suggest more appropriate methodology. Evaluation should also include providing critical feedback aimed at helping the CB develop reasoned explanations based on statistical validity and alternative approaches as needed.
Discussion
Given the tremendous impact research has in affecting health care decision making and health outcomes, the development of a highly skilled collaborative workforce of biostatisticians is a critical priority. CBs are becoming increasingly essential to clinical translational science, and it is imperative that CBs are trained to be mindful of their important roles, as they collaboratively design studies, analyze data, and interpret results to help collaborators disseminate study findings. CBs must help assure that data are generated, recorded, and transferred using processes and systems that are secure, efficient, and reproducible. CBs must also work as integral team members to help others (colleagues, regulators, providers/investigators, etc.) draw appropriate statistical inference. To do this, CBs must have broad methodological knowledge and the ability to consider different methods for the design and analysis of studies. If there is any conflict regarding the appropriate statistical methods to be used, the CB should be able to effectively communicate the issues to their collaborators, colleagues, and supervisors to advocate for the correct methods. If errors occur or counterintuitive results are found, the CB must have the skillset to investigate the errors in a timely fashion. Errors should be corrected, noted (as appropriate), and shared to create an iterative learning cycle of unbiased and clear interpretation. CBs should also know when to request independent review to further validate findings. It is not enough that only intent is unbiased, but the CB must work with the scientific collaborator to ensure that objective interpretations are disseminated. Adequate training of CBs to do this has the potential to directly impact the integrity of the research by facilitating more effective collaborations. The methods we present aim to address each of these important challenges in training strong CBs.
The collaborative process and training methods presented here are broad and inclusive, so that they can be adapted to a variety of collaborative situations. In long-term collaborations, all methods may apply, while in short-term or one-off collaborations, selective and relevant parts of the proposed material can be utilized. A CB who is trained in the entire process should be able to identify and apply appropriate steps of the collaborative process.
It is imperative that CBs understand the steps required for project completion and all expectations of their role. For institutions to retain CBs who have developed strong competencies, there must be sufficient time dedicated to implementing effective training methods and providing career development opportunities. If training and opportunity for advancement are not available, the institution risks losing strong researchers. The TIE strategies presented in this article have been used to train more than 22 BERD Core staff and have been extended to train 54 Biostatistics Core Training and Internship Program student interns who have collectively worked on more than 40 collaborative teams across the School of Medicine at Duke University. Initial efforts to expand this approach appear promising and could be adopted widely by the clinical and translational science community. Future work will focus on the implementation and evaluation of these training strategies in order to work toward the development of new strategies.
Acknowledgements
This publication was supported by Grant Numbers UL1 TR002377, UL1 TR002366, UL1 TR002243, UL1 TR003015, and UL1 TR003096 from the National Center for Advancing Translational Sciences (NCATS). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health. The authors would also like to thank Matthew Ray, Duke Clinical and Translational Science Institute, Duke University, for figure and table review and development.
Disclosures
SCG reports receiving consulting fees from Gilead Sciences for serving on multiple Data Monitoring Committees. SMT reports receiving consulting fees from Abbvie, Inc. for work on biosimilars and bioequivalence. Although these relationships are not perceived to represent a conflict with the present work, they have been included in the spirit of full disclosure. There are no other conflicts of interest to report.