28.1 Introduction
The regulatory governance of Artificial Intelligence and Machine Learning (AI/ML) technologies as medical devices in healthcare challenges the regulatory divide between research and clinical care, which is typical of pharmaceutical products. This chapter considers the regulatory governance of an AI/ML clinical decision support (CDS) software for the diagnosis of diabetic retinopathy as a ‘risk object’ by the Food and Drug Administration (FDA) in the United States (US). The FDA’s regulatory principles and approach may play an influential role in how other countries govern this and other software as a medical device (SaMD). The disruptions that AI/ML technologies can cause are well publicised in the lay and academic media alike, although the more serious ‘risks’ of harm are still essentially anticipatory. In some quarters, there is a prevailing sense that a ‘light-touch’ approach to regulatory governance should be adopted to ensure that the advancement of AI – particularly in ways that are expected to generate economic gain – is not unduly burdened. Hence, in response to the question of whether regulation of AI is needed now, scholars like Chris Reed have responded with a qualified ‘No’. As Reed explains, the use of the technology in medicine is already regulated by the profession, and regulation will be adapted piecemeal as new AI technologies come into use anyway.Footnote 1 On this view, a ‘wait and see’ approach is likely to produce better long-term results than hurried regulation based on a very partial understanding of what needs to be regulated. It is also perhaps consistent with this mind-set that the commercial development and application of AI and AI-based technologies remain largely unregulated.
This chapter takes a different view on the issue, and argues that the response should be a qualified ‘Yes’ instead, partly because there is already an existing regulatory framework in place that may be adapted to meet anticipated challenges. As a ‘risk object’, the regulation of AI/ML medical devices cannot be understood and managed separately from a broader ‘risk culture’ within which it is embedded. Contrary to what a ‘command-and-control’ approach suggests, regulatory governance of AI/ML medical devices should not be understood merely as the application of external forces to contain ills that must somehow be managed in order to derive the desired effects. Arguably, it is this limited conception of ‘risks’ and its relationship with regulation that gives rise to liminality. As Laurie and others clearly explain,Footnote 2 a liminal space is created contemporaneously with the uncertainties generated by new and emerging technologies. Drawing on the works of Arnold van Gennep and Victor Turner, ‘liminality’ is presented as an analytic to engage with the processual and experiential dynamics of transitional and transformational inter-structural boundary or marginal spaces. It is itself an intermediary process in a three-part pattern of experience that begins with separation from an existing order and concludes with re-integration into a new world.Footnote 3 Mapping liminal spaces and the changing boundaries entailed can help to highlight gaps in regulatory regimes.Footnote 4
Risk-based evaluation is often a feature of such liminal spaces, and when they become sites for battles of power and values, ethical issues arise. Whereas liminality has been applied to account for human experiences within regulated spaces, this chapter considers the epistemic quality of ‘risks’ and their situatedness within regulatory governance as a discursive practice and as a matter of social reality. In this respect, regulation is not necessarily extrinsic to its regulatory object, but constitutive of it. Concerns about ‘risks’ from technological innovations and the need to tame them have been central to regulatory governance.Footnote 5 Whereas governance has been a longstanding cultural phenomenon that relates to ‘the system of shared beliefs, values, customs, behaviours and artifacts that members of society use to cope with their world and with one another, and that are transmitted from generation to generation through learning’,Footnote 6 it is the regulatory turn that is especially instructive. Here, regulatory response is taken to reduce uncertainty and instability by mitigating potential risks and harms and by directing or influencing actors’ behaviour to accord with socially accepted norms and/or to promote desirable social outcomes, and regulation encompasses any instrument (legal or non-legal in character) that is designed to channel group behaviour.Footnote 7 The high connectivity of AI/ML SaMDs that are capable of adapting to their digital environment in order to optimise performance suggests that the research agenda persists beyond what may be currently limited to the pilot or feasibility stages of medical device trials. If continuous risk-monitoring is required to support the use of SaMDs in a learning healthcare system, more robust and responsive regulatory mechanisms are needed, not less.Footnote 8
28.2 AI/ML Software as Clinical Decision Support
In April 2018, the FDA granted approval for IDx-DR (DEN180001) to be marketed as the first AI diagnostic system that does not require clinician interpretation to detect greater than a mild level of diabetic retinopathy in adults diagnosed with diabetes.Footnote 9 In essence, this SaMD applies an AI algorithm to analyse images of the eye taken with a retinal camera that are uploaded to a cloud server. A screening decision is made by the device as to whether the individual concerned is detected with ‘more than mild diabetic retinopathy’ and, if so, the individual is referred to an eye care professional for medical attention. Where the screening result is negative, the individual will be rescreened in twelve months. IDx-DR was reviewed under the FDA’s De Novo premarket review pathway and was granted Breakthrough Device designation,Footnote 10 as the SaMD is novel and of low to moderate risk. On the whole, the regulatory process did not depart substantially from the existing regulatory framework for medical devices in the US. A medical device is defined broadly to include everything from low-risk adhesive bandages to sophisticated implanted devices. In the US, a similar approach is adopted in the definition of the term ‘device’ in Section 201(h) of the Federal Food, Drug and Cosmetic Act (FD&C Act).Footnote 11
For regulatory purposes, medical devices are classified based on their intended use and indications for use, degree of invasiveness, duration of use, and the risks and potential harms associated with their use. At the classification stage, a manufacturer is not expected to have gathered sufficient data to demonstrate that its proposed product meets the applicable marketing authorisation standard (e.g. data demonstrating effectiveness). Therefore, the focus of the FDA’s classification analysis is on how the product is expected to achieve its primary intended purposes.Footnote 12 The FDA has established classifications for approximately 1700 different generic types of devices and grouped them into sixteen medical specialties referred to as ‘panels’. Each of these generic types of devices is assigned to one of three regulatory classes based on the level of control necessary to assure the safety and effectiveness of the device. The class to which the device is assigned determines, among other things, the type of premarketing submission/application required for FDA clearance to market. All classes of devices are subject to General Controls,Footnote 13 which are the baseline requirements of the FD&C Act that apply to all medical devices. Special Controls are regulatory requirements for Class II devices, and are usually device-specific and include performance standards, postmarket surveillance, patient registries, special labelling requirements, premarket data requirements and operational guidelines. For Class III devices, active regulatory review in the form of premarket approval is required (see Table 28.1).
Class | Risk | Level of regulatory controls | Whether clinical trials required | Examples |
---|---|---|---|---|
I | Low | General | No | Gauze, adhesive bandages, toothbrush |
II | Moderate | General and special | Maybe | Suture, diagnostic X-rays |
III | High | General and premarket approval | Yes | Pacemakers, implantable defibrillators, spinal cord stimulators |
Clinical trials of medical devices, where required, are often non-randomised, non-blinded, do not have active control groups, and lack hard endpoints, since randomisation and blinding of patients or physicians for implantable devices will in many instances be technically challenging and ethically unacceptable.Footnote 14 Table 28.2 shows key differences between clinical trials of pharmaceuticals and those of medical devices.Footnote 15 Class I and some Class II devices may be introduced into the US market without having been tested in humans through an approval process that is based on predicates. Through what is known as the 510(k) pathway, a manufacturer needs to show that its ‘new’ device is at least as safe and effective as (or substantially equivalent to) a legally marketed predicate device (IDx-DR, by contrast, was reviewed under the De Novo pathway, as noted above).Footnote 16
Pharmaceuticals: phase | Pharmaceuticals: participants | Pharmaceuticals: purpose | Medical devices: stage | Medical devices: participants | Medical devices: purpose |
---|---|---|---|---|---|
0 (Pilot/exploratory; not all drugs undergo this phase) | 10–15 participants with disease or condition | Test very small (subtherapeutic) dosage to study effects and mechanisms | Pilot/early feasibility/first-in-human | 10–15 participants with disease or condition | Collect preliminary safety and performance data to guide development |
I (Safety and toxicity) | 10–100 healthy participants | Test safety and tolerance; determine dosing and major adverse effects | Feasibility | 20–30 participants with disease or condition | Assess safety and efficacy of near-final or final device design; guide design of pivotal study |
II (Safety and effectiveness) | 50–200 participants with disease or condition | Test safety and effectiveness; confirm dosing and major adverse effects | | | |
III (Clinical effectiveness) | >100–1000 participants with disease or condition | Test safety and effectiveness; determine drug–drug interaction and minor adverse effects | Pivotal | >100–300 participants with disease or condition | Establish clinical efficacy, safety and risks |
IV (Post-approval study) | >1000 | Collect long-term data and adverse effects | Post-approval study | >1000 | Collect long-term data and adverse effects |
The nature of regulatory control is changing; regulatory control does not arise solely through the exertion of regulatory power over a regulated entity but also acts intrinsically from within the entity itself. It is argued that risk-based regulation draws on different knowledge domains to constitute the AI/ML algorithm as a ‘risk object’, and not merely to subjugate it. Risk objectification renders the regulated entity calculable. Control does not thereby arise because the regulated entity behaves strictly in adherence to specific commands but rather because of the predictability of its actions. Where risk cannot be precisely calculated, however, liminal spaces may help to articulate various ‘scenarios’ with different degrees of plausibility. These liminal spaces are thereby themselves a means by which uncertainty is managed. Typically, owing to conditions that operate outside of direct regulatory control, liminal spaces can either help to maintain a broader regulatory space to which they are peripheral, or contribute to its re-configuration through a ‘domaining effect’. This aspect will be considered in the penultimate section of this chapter.
28.3 Re-embedding Risk and a Return to Sociality
The regulatory construction of IDx-DR as a ‘risk object’ is accomplished by linking the causal attributes of economic and social risks, and risks to human safety and agency, to its constitutive algorithms reified as a medical device.Footnote 17 This ‘risk object’ is made epistemically ‘real’ when integrated through a risk discourse, by which risk attributions and relations have come to define identities, responsibilities, and socialities. While risk objectification has been effective in paving a way forward to market approval for IDx-DR, this technological capability is pushed further into liminality. The study that supported the FDA’s approval was conducted under highly controlled conditions where a relatively small group of carefully selected patients had been recruited to test a diagnostic system that had narrow usage criteria.Footnote 18 It is questionable whether the AI/ML feature was itself tested, since the auto-didactic aspect of the algorithm was locked prior to the clinical trial, which greatly constrained the variability of the range of outputs.Footnote 19 At this stage, IDx-DR is not capable of evaluating the most severe forms of diabetic retinopathy that require urgent ophthalmic intervention. However, IDx-DR is capable of ML, which is a subset of AI and refers to a set of methods that have the ability to automatically detect patterns in data in order to predict future data trends or for decision-making under uncertain conditions.Footnote 20 Deep learning (DL) is in turn a subtype of ML (and a subfield of representation learning) that is capable of delivering a higher level of performance, and does not require a human to identify and compute the discriminatory features for it.
From the 1980s onwards, DL software has been applied in computer-aided detection systems, and the field of radiomics (a process that extracts a large number of quantitative features from medical images) is broadly concerned with computer-aided diagnosis systems, where DL has enabled the use of computer-learned tumour signatures.Footnote 21 It has the potential to detect abnormalities, make differential diagnoses and generate preliminary radiology reports in the future, but only a few methods are able to manage the wide range of radiological presentations of subtle disease states. In the foreseeable future, unsupervised AI/ML will test the limits of conventional means of regulation of medical devices.Footnote 22 The challenges to risk assessment, management and mitigation will be amplified as AI/ML medical devices change rapidly and become less predictable.Footnote 23
Regulatory conservatism reflects a particular positionality and related interests that are at stake. For many high-level policy documents on AI, competitive advantage for economic gain is a key interest.Footnote 24 This position appears to support a ‘light touch’ approach to regulatory governance of AI in order to sustain technological development and advance national economic interests. If policymakers, as a matter of socio-political construction, consider regulation as impeding technological development, then regulatory governance is unlikely to see meaningful progression. Not surprisingly, the private sector has had a dominant presence in defining the agenda and shape of AI and related technologies. While this is not in and of itself problematic, the narrow regulatory focus and absence of broader participation could be. For instance, it is not entirely clear to what extent the development of AI/ML algorithms is determined primarily by sectorial interests.Footnote 25
Initial risk assessment is essentially consequentialist in its focus on intended use of the SaMD to achieve particular clinical outcomes. Risk characterisation is abstracted to two factors:Footnote 26 (1) significance of the information provided by the SaMD to the healthcare decision; and (2) state of the healthcare situation or condition. Risk is thereby derived from ‘objective’ information that is provided by the manufacturer on intended use of the information provided by the SaMD in clinical management. Such use may be significant in one of three ways: (1) to treat or to diagnose, (2) to drive clinical management or (3) to inform clinical management. The significance of an intended use is then associated with a healthcare situation or condition (i.e. critical, serious or non-serious). Schematically, Table 28.3 presents the risk characterisation framework based on four different levels of impact on the health of patients or target populations. Level IV of the framework (e.g. SaMD that performs diagnostic image analysis for making treatment decisions in patients with acute stroke, or screens for mutable pandemic outbreak that can be highly communicable through direct contact or other means) relates to the highest impact while Level I (e.g. SaMD that analyses optical images to guide next diagnostic action of astigmatism) relates to the lowest.Footnote 27
State of healthcare situation or condition | Significance of SaMD information: treat or diagnose | Significance of SaMD information: drive clinical management | Significance of SaMD information: inform clinical management |
---|---|---|---|
Critical | IV | III | II |
Serious | III | II | I |
Non-serious | II | I | I |
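The two-factor characterisation in Table 28.3 is, in effect, a lookup from a pair of judgements to an impact level. The following is only an illustrative sketch of that lookup: the cell values are transcribed from the table, while the names `Condition`, `Significance`, `RISK_CATEGORY` and `categorise` are hypothetical and do not come from any regulatory document or software library.

```python
from enum import Enum


class Condition(Enum):
    """State of the healthcare situation or condition (rows of Table 28.3)."""
    CRITICAL = "critical"
    SERIOUS = "serious"
    NON_SERIOUS = "non-serious"


class Significance(Enum):
    """Significance of the SaMD's information to the healthcare decision (columns)."""
    TREAT_OR_DIAGNOSE = "treat or diagnose"
    DRIVE_CLINICAL_MANAGEMENT = "drive clinical management"
    INFORM_CLINICAL_MANAGEMENT = "inform clinical management"


# Cell values transcribed from Table 28.3; the structure itself is an assumption
# made for illustration.
RISK_CATEGORY = {
    (Condition.CRITICAL, Significance.TREAT_OR_DIAGNOSE): "IV",
    (Condition.CRITICAL, Significance.DRIVE_CLINICAL_MANAGEMENT): "III",
    (Condition.CRITICAL, Significance.INFORM_CLINICAL_MANAGEMENT): "II",
    (Condition.SERIOUS, Significance.TREAT_OR_DIAGNOSE): "III",
    (Condition.SERIOUS, Significance.DRIVE_CLINICAL_MANAGEMENT): "II",
    (Condition.SERIOUS, Significance.INFORM_CLINICAL_MANAGEMENT): "I",
    (Condition.NON_SERIOUS, Significance.TREAT_OR_DIAGNOSE): "II",
    (Condition.NON_SERIOUS, Significance.DRIVE_CLINICAL_MANAGEMENT): "I",
    (Condition.NON_SERIOUS, Significance.INFORM_CLINICAL_MANAGEMENT): "I",
}


def categorise(condition: Condition, significance: Significance) -> str:
    """Return the impact level (I to IV) for one cell of the table."""
    return RISK_CATEGORY[(condition, significance)]
```

On this sketch, an SaMD whose output is used to treat or diagnose in a critical situation, such as the acute stroke example above, falls in Level IV, the only cell carrying that level. What the lookup cannot capture is the prior judgement of which row and column apply, which rests on the manufacturer's statement of intended use.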
To counter the possible deepening of regulatory impoverishment, regulatory governance as concept and process will need to re-characterise risk management as a form of learning and experimentation rather than rule-based processes, thus placing stronger reliance on human capabilities to imagine alternative futures instead of quantitative ambitions to predict the future. Additionally, a regulatory approach that is based on the total product lifecycle needs to be taken up. This better accounts for modifications that will be made to the device through real-world learning and adaptation. Such adaptation enables a device to change its behaviour over time based on new data and optimise its performance in real time with the goal of improving health outcomes. As the FDA’s conventional review procedures for medical devices discussed above are not adequately responsive to assess adaptive AI/ML technologies, the FDA has proposed that a premarket review mechanism be developed.Footnote 28 This mechanism seeks to introduce a predetermined change control plan in the premarket submission, in order to give effect to the risk categorisation and risk management principles, as well as the total product lifecycle approach, of the International Medical Device Regulators Forum (IMDRF). The plan will include the types of anticipated modifications (or pre-specifications) and the associated methodology that is used to implement the changes in a controlled manner while allowing risks to patients to be managed (referred to as the Algorithm Change Protocol). In essence, the proposed changes will place on manufacturers a greater responsibility to monitor the real-world performance of their medical devices and to make the performance data available through periodic updates on what changes were made as part of the approved pre-specifications and the Algorithm Change Protocol.
In totality, these proposed changes will enable the FDA to evaluate and monitor, collaboratively with manufacturers, an AI/ML software as a medical device from its premarket development to postmarket performance. The nature of the FDA’s regulatory oversight will also become more iterative and responsive in assessing the impact of device optimisation on patient safety.
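The logic of a predetermined change control plan can be made concrete with a small sketch. It assumes, purely for illustration, that a modification is covered by the plan when its type was anticipated in the pre-specifications and it was implemented in line with the Algorithm Change Protocol. All class, field and label names below are hypothetical; the FDA proposal does not prescribe any such data structure, and what follows when a modification falls outside the plan is not modelled here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Modification:
    """A single post-market change to the device (hypothetical representation)."""
    kind: str               # illustrative label, e.g. "retraining on new data"
    follows_protocol: bool  # implemented per the Algorithm Change Protocol?


@dataclass
class ChangeControlPlan:
    """Pre-specifications plus a log supporting periodic updates."""
    pre_specifications: frozenset                   # anticipated modification types
    update_log: list = field(default_factory=list)  # (kind, covered) pairs

    def within_plan(self, mod: Modification) -> bool:
        # Covered only if the change was anticipated AND made per the protocol.
        return mod.kind in self.pre_specifications and mod.follows_protocol

    def record(self, mod: Modification) -> bool:
        """Log the change for periodic reporting; return whether the plan covers it."""
        covered = self.within_plan(mod)
        self.update_log.append((mod.kind, covered))
        return covered
```

For example, a plan created with `ChangeControlPlan(frozenset({"retraining on new data"}))` would treat a protocol-compliant retraining as covered, and would flag a change of intended use as falling outside it. Either way the change is logged, which mirrors the emphasis on transparency through periodic updates.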
As the IMDRF also explains, every SaMD will have its own risk category according to its definition statement even when it is interfaced with other SaMD, other hardware medical devices or used as a module in a larger system. Importantly, manufacturers are expected to have an appropriate level of control to manage changes during the lifecycle of the SaMD. The IMDRF labels any modifications made throughout the lifecycle of the SaMD, including its maintenance phase, as ‘SaMD Changes’.Footnote 29 Software maintenance is in turn defined in terms of post-marketing modifications that could occur in the software lifecycle processes identified by the International Organization for Standardization.Footnote 30 It is generally recognised that testing of software is not sufficient to ensure safety in its operation. Safety features need to be built into the software at the design and development stages, and supported by quality management and postmarket surveillance after the SaMD has been installed. Postmarket surveillance includes monitoring, measurement and analysis of quality data through logging and tracking of complaints, clearing technical issues, determining the causes of problems and the actions needed to address them, and identifying, collecting, analysing and reporting on critical quality characteristics of the products developed. However, monitoring software quality alone does not guarantee that the objectives for a process are being achieved.Footnote 31
As a Quality Management System (QMS) concern, the IMDRF requires that maintenance activities preserve the integrity of the SaMD without introducing new safety, effectiveness, performance and security hazards. It recommends that a risk assessment, including considerations in relation to patient safety and clinical environment and technology and systems environment, should be performed to determine if the changes affect the SaMD categorisation and the core functionality of the SaMD as set out in its definition statement. The proposed QMS complements the risk categorisation framework through its goal of incorporating good software quality and engineering practices into the device. Principles underscoring QMS are set out in terms of organisational support structure, lifecycle support processes, and a set of realisation and use processes for assuring safety, effectiveness and performance. These principles have been endorsed by the FDA in its final guidance to describe an internationally agreed upon understanding (among regulators) of clinical evaluation and principles for demonstrating the safety, effectiveness and performance of the device, and activities that manufacturers can take to clinically evaluate their device.Footnote 32
28.4 Regulatory Governance as Participatory Learning System
In this penultimate section of this chapter, it is argued that the regulatory approach considered in the preceding sections is intended to support a participatory learning system comprising at least two key features: (1) a platform and/or mechanisms that enable constructive engagement with, and participation of, members of society; and (2) the means by which a common fund of knowledges (to be explained below) may be pooled to generate an anticipatory knowledge that could guide collective action. In some instances, institutionalisation could advance this agenda, but it is beyond the scope of this manuscript to examine this possibility to a satisfactory degree.
There is a diverse range of modalities through which constituents of a society engage in collaborative learning. As Annelise Riles’s PAWORNET illustrates, each modality has its own goals, character, strengths and limitations. In her study, Riles observes that networkers did not understand themselves to share a set of values, interests or culture.Footnote 33 Instead, they understood themselves to be sharing their involvement in a certain network that was a form of institutionalised association devoted to information sharing. What defined networkers most of all was the fact that they were personally and institutionally connected or knowledgeable about the world of specific institutions and networks. In particular, it was the work of creating documents, organising conferences or producing funding proposals that generated a set of personal relations that drew people together and also created divisions of its own. In the author’s own study,Footnote 34 ethnographic findings illustrate how the ‘publics’ of human stem cell research and oocyte donation were co-produced with an institutionalised ‘bioethics-as-public-policy’ entity known as the Bioethics Advisory Body. In that context, the ‘publics’ comprised institutions and a number of individuals – often institutionally connected – that represented a diverse set of values, interests and perhaps cultures (construed in terms of their day-to-day practices at the least). These ‘publics’ resemble a network in a number of ways. They were brought into a particular set of relationships within a deliberative space created mainly by the consultation papers and reinforced through a variety of means that included public meetings, conferences, and feedback sessions.
Arguably, even individual feedback from a public outreach platform known as ‘REACH’ encompassed a certain kind of pre-existing (sub-) network that has been formed with a view to soliciting relatively more spontaneous and independent, uninvited forms of civil participatory action. But this ‘network’ is not a static one. It varied with, but was also shaped by, the broader phenomenon of science and expectations as to how science ought to be engaged. In this connection, Riles’s observation is instructive: ‘It is not that networks “reflect” a form of society, therefore, nor that society creates its artifacts … Rather, it is all within the recursivity of a form that literally speaks about itself’.Footnote 35
A ‘risk culture’ that supports learning and experimentation rather than rule-based processes must embed the operation of AI and related technologies as ‘risk objects’ within a common fund of knowledges. Legal processes are inherent to understanding the risk, such as that of a repeat sexual offence under ‘Megan’s Law’, which encompasses the US community notification statutes relating to sexual offenders.Footnote 36 Comprising three tiers, this risk assessment process determines the scope of community notification. In examining the constitutional basis of Megan’s Law, Mariana Valverde et al. observe that ‘the courts have emphasised the scientific expertise that is said to be behind the registrant risk assessment scale (RRAS) in order to argue that Megan’s Law is not a tool of punishment but rather an objective measure to regulate a social problem’.Footnote 37 However, reliance on Megan’s Law as grounded in objective scientific knowledge has given rise to an ‘intermediary knowledge in which legal actors – prosecutors and judges – are said not only to be more fair but even more reliable and accurate in determining a registrant’s risk of re-offence’.Footnote 38 In this, the study also illustrates a translation from scientific knowledge and processes to legal ones, and how the ‘law’ may be cognitively and normatively open.
Finally, the articulation of possible harms and dangers as ‘risks’ involves the generation of ‘anticipatory knowledge’, which is defined as ‘social mechanisms and institutional capacities involved in producing, disseminating, and using such forms [as] … forecasts, models, scenarios, foresight exercises, threat assessments, and narratives about possible technological and societal futures’.Footnote 39 Like Ian Hacking’s ‘looping effect’, anticipatory knowledge is about knowledge-making about the future, and could operate as a means of gap-filling. Hugh Gusterson’s study of the Reliable Replacement Warhead (RRW) Program, under which US weapons laboratories could design new and highly reliable nuclear weapons that are safe to manufacture and maintain, is illustrative of this point.Footnote 40 Gusterson shows that the struggle over the RRW Program, initiated by the US Congress in 2004, occurred across four intersecting ‘plateaus of nuclear calculations’ – geopolitical, strategic, enviropolitical, and technoscientific – each with its own contending narratives of the future. He indicates that ‘advocates must stabilise and align anticipatory knowledge from each plateau of calculation into a coherent-enough narrative of the future in the face of opponents seeking to generate and secure alternative anticipatory knowledges’.Footnote 41 Hence the interconnectedness of the four plateaus of calculation, including the trade-offs entailed, was evident in the production of anticipatory knowledge vis-à-vis the RRW Program. In addition, the issues of performativity and ‘social construction of ambiguity’ were also evident. Gusterson observes that, being craft items, no two nuclear weapons are exactly alike. However, the proscription of testing through detonation meant that both performativity and ambiguity over reliability became matters of speculation, determined through extrapolation from the past to fill knowledge ‘gaps’ in the present and future.
This attempt at anticipatory knowledge creation also prescribed a form that the future was to take. Applying a similar analysis from a legal standpoint, Graeme Laurie and others explain that foresighting as a means of devising anticipatory knowledge is neither simple opinion surveying nor mere public participation.Footnote 42 It must instead be directed at the discovery of shared values, the development of shared lexicons, the forging of a common vision of the future and the taking of steps to realise the vision with the understanding that this is being done from a position of partial knowledge about the future. As we have considered earlier on in this chapter, this visionary account captures the approach that has been adopted by the IMDRF impressively well.
28.5 Conclusion
Liminality highlights the need for a processual-oriented mode of regulation in order to recognise the flexibility and fluidity of the regulatory context (inclusive of its objects and subjects) and the need for iterative interactions, as well as to possess the capacity to provide non-directive guidance.Footnote 43 If one considers law as representing nothing more than certainty, structure and directed agency, then we should rightly be concerned as to whether the law can envision and support the creation of genuinely liminal regulatory spaces, which are typified by uncertainty, anti-structure and an absence of agency.Footnote 44 The crucial contribution of regulatory governance, however, is its conceptualisation of law as an epistemically open enterprise, in respect of which learning and experimentation are possible.