Given our reliance on evidence-based decision-making, we need confidence in scientific publishing. There are increasing concerns about difficulties publishing negative results, and non-publication of ‘inconvenient’ data. This publishing behaviour is important because it distorts the evidence base and, with it, the available guidelines. In response to this, the 2007 US Food and Drug Administration Amendments Act mandated sponsors of relevant trials to report their findings on the clinicaltrials.gov website within a year of the study end. DeVito et al downloaded data from all applicable studies between March 2018 and September 2019 (DeVito, Bacon and Goldacre, ref. 1). They found only 63% of the 4022 relevant trials had reported on the website, with only 40% making their 1-year deadline. Sponsors who ran multiple trials did much better than smaller sponsors that ran fewer trials. Interestingly, industry-led research did better than non-industry research; it might be that industry sponsors have more to lose through non-compliance, or have more professional systems in place. There has been no improvement with time as the system embeds, and crucially, the authors note the lack of enforcement. They calculated that if the rules had been applied properly, the Food and Drug Administration could have collected almost $4 billion in fines, but it has not yet issued a single one. There are some prominent clinical and academic sites ‘leading’ the chart of poor performers: you might be surprised at the names if you turn to the primary paper. Pleasingly, and impressively, the authors openly provide their data and software so that others can test and re-use them, and are maintaining an updated website so that organisations can demonstrate improvements (http://fdaaa.trialstracker.net).
Risk assessment is one key area that always benefits from improved data; Amir Sariaslan et al describe a nationwide cohort study on the incidence of being subject to, or committing, violence in people with psychiatric disorders (Sariaslan, Arseneault, Larsson, Lichtenstein and Fazel, ref. 2). This is a comprehensive study, encompassing over a quarter of a million Swedish nationals identified as having a psychiatric disorder (55% women), who were compared with over two and a half million age- and gender-matched individuals from the general population, and almost 200 000 full biological siblings without such conditions. Those with a mental health diagnosis were three to four times more likely to be both the victim and the perpetrator of violence. This equates to just under 7% of those with a psychiatric diagnosis suffering or committing violence to a degree requiring medical attention across a 10-year period. With regards to being a victim of violence, the figures are clearly considerably lower than the ‘classically quoted’ statistic of a tenfold increased risk. In part, this is a result of the inclusion of siblings as comparators, which cleverly allowed the authors to control for shared genetic and environmental confounders. Furthermore, being subject to violence was measured through identified healthcare visits or death from violence, and perpetration was measured by criminal convictions. Inevitably, this is skewed towards relatively serious levels of violence, while milder forms will be more common but harder to capture. Interestingly, schizophrenia was the only diagnosis where individuals were not found to have higher rates of being victims of violence than the general public (once comorbid substance use was accounted for). This is surprising given the vulnerabilities of such individuals, although the authors speculate it might be because of greater rates of social isolation.
We know the role of the neurotransmitter dopamine in primates, particularly its role in signalling rewarding stimuli, allied to signalling surprise and motivation, and its roles in movement and addiction. The central idea is predicated on temporal difference learning models updated in response to reward prediction error signalling: that sounds complicated – what is it? In brief, we can learn by recognising the current state of the world, selecting from a repertoire of relevant actions and then recording the outcome of the action as a positive or negative value. An example might be actions consequent on traffic signals. If, when learning to drive, you got into a car with no knowledge of, or instruction on, the world or traffic laws, you might have to undergo reinforcement (or trial-and-error) learning to acquire – the hard way – that ‘red’ means stop, not accelerate. In this unfortunate example, you would use previous rounds of trials (for example a sequence of car accidents) to eventually learn this, and that ‘green’ means go. Having an instructor to supervise your learning and advise you of these rules in advance would clearly facilitate learning.
Computationally, this idea can be naively captured by tabulating all possible combinations of state–action pairs and then trying them out to acquire the associated value (i.e. a crash, or safe passage). But of course, not every time you accelerate through a red light will you get a negative reward outcome (i.e. you might get lucky) so typically, we average the rewards obtained over a sequence of trials of each state–action pair. Future decisions on which action to select in a given state are then weighted by this average to inform (hopefully) smart decisions that result in fewer accidents and more hassle-free motoring. A feckless driver might have learned a positive value for the state–action ‘red light, choose accelerate’ because 90% of the time, they managed to run a red light and get lucky; so one hurried morning, they hit the accelerator at a red light. Their reward-dependent action-selection mechanism tells them to expect ‘you'll probably be alright’ – but this time, they get T-boned by another vehicle. The difference between what they expected and what happened is the reward prediction error and the algorithm that updates the value attached to state–action pairs is called temporal difference learning. It has since been shown that the temporal difference algorithm and reward prediction errors marry with the phasic firing of dopamine neurons in the ventral-tegmental area.
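The feckless driver's experience can be sketched in a few lines of Python. This is a minimal illustration, not a model taken from any of the papers discussed: the reward values (+1 for getting away with it, -10 for a crash), the function names and the nine-lucky-trials-then-a-crash sequence are all assumptions chosen for the example. It also keeps only the simplest ingredient, a running average nudged by the reward prediction error; a full temporal difference update would additionally include the discounted value of the next state.

```python
def update_value(value, reward, n_trials):
    """Running-average update for one state-action pair.

    The prediction error is what happened minus what was expected;
    dividing by the trial count makes `value` the mean reward so far.
    """
    prediction_error = reward - value
    new_value = value + prediction_error / n_trials
    return new_value, prediction_error


def run_driver(rewards):
    """Replay a sequence of outcomes for 'red light, choose accelerate'."""
    value = 0.0
    errors = []
    for trial, reward in enumerate(rewards, start=1):
        value, error = update_value(value, reward, trial)
        errors.append(error)
    return value, errors


# Hypothetical rewards: +1 for a lucky run, -10 for being T-boned.
LUCKY, CRASH = 1.0, -10.0
outcomes = [LUCKY] * 9 + [CRASH]

final_value, errors = run_driver(outcomes)
print("prediction error on the crash:", errors[-1])
print("value of 'red light, accelerate' afterwards:", round(final_value, 3))
```

After nine lucky runs the stored value is positive, so the crash produces a large negative prediction error, and under these assumed rewards a single crash is enough to pull the averaged value below zero.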
However, over time, the temporal difference model of learning a single positive/negative value for each state–action pair has proved inconsistent with some observed phenomena. So-called distributional reinforcement learning represents the reward prediction error as separate ‘channels’ (i.e. distinct positive and negative channels) with an associated degree of ‘optimism’ for each channel. The consequence is that for a single reward outcome, the distributional reinforcement learning system could represent a positive reward prediction error (in a pessimistic channel) and a negative reward prediction error (in an optimistic channel). In contrast, in a classical temporal difference model there is a single point at which the averaged value associated with a state–action pair switches from positive to negative (and vice versa) – this reversal point determines whether future reward prediction errors will be positive or negative for a given reward outcome.
Dabney et al predicted that ventral tegmental dopamine neurons might display behaviours consistent with this multiple-channel distributional reinforcement learning, where the different neurons have individual optimisms associated with reward outcome (Dabney, Kurth-Nelson, Uchida, Starkweather, Hassabis and Munos, ref. 3). In essence, this translates to different dopamine neurons having different reversal points. In mouse ventral tegmental dopamine neurons, they found that on delivery of variable-magnitude rewards, different neurons displayed a range of reversal points that were not explained by noise in recordings. Further, they were able to demonstrate a similar differential firing rate in upstream GABAergic neurons in the ventral tegmentum. The standard temporal difference model of reinforcement learning suggests that all dopamine neurons communicate the same magnitude and direction of reward prediction error. This new distributional reinforcement learning model allows dopamine neurons to have different reversal points such that they can signal diverse reward prediction errors for the same reward outcome and, consequently, can capture a full representation of the distribution of value for a given task domain. In other words, as the authors note, ‘the brain represents possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel’ (ref. 3).
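The idea of channels with different reversal points can be illustrated with a small Python sketch. It is an assumption-laden toy rather than the authors' model: each hypothetical channel simply scales positive and negative prediction errors by different learning rates, and the three channel names, the learning rates and the reward magnitudes (1, 5 and 9) are invented for the example. A channel that weights positive errors more heavily settles on a higher value (an optimistic reversal point), and one that weights negative errors more heavily settles lower.

```python
# Hypothetical channels: (learning rate for positive errors, for negative errors).
CHANNELS = {
    "pessimistic": (0.01, 0.09),
    "neutral": (0.05, 0.05),
    "optimistic": (0.09, 0.01),
}


def update_channel(value, reward, alpha_pos, alpha_neg):
    """Asymmetric update: the sign of the error picks the learning rate."""
    error = reward - value
    rate = alpha_pos if error > 0 else alpha_neg
    return value + rate * error


def train_channels(rewards, n_passes=1000):
    """Show every channel the same variable-magnitude rewards, repeatedly."""
    values = {name: 0.0 for name in CHANNELS}
    for _ in range(n_passes):
        for reward in rewards:
            for name, (alpha_pos, alpha_neg) in CHANNELS.items():
                values[name] = update_channel(
                    values[name], reward, alpha_pos, alpha_neg
                )
    return values


values = train_channels([1.0, 5.0, 9.0])

# Each channel's learned value acts as its reversal point: the same
# middling reward is above it for some channels and below it for others.
test_reward = 5.0
errors = {name: test_reward - value for name, value in values.items()}

for name in CHANNELS:
    print(f"{name:12s} reversal point {values[name]:5.2f}  "
          f"error for reward 5: {errors[name]:+.2f}")
```

In this toy, the same reward of 5 yields a positive prediction error in the pessimistic channel and a negative one in the optimistic channel, which is the pattern the text describes; taken together, the spread of learned values carries information about the range of rewards rather than only their mean.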
The relationship between cardiometabolic side-effects of antipsychotics and the well-established decrease in lifespan observed with psychoses has remained unclear. Most studies have short durations that do not reflect the lifetime nature of the illness. To clarify the cost–benefit relationship of long-term antipsychotic use, Taipale and colleagues conducted the largest and longest-term study of physical mortality and morbidity during antipsychotic use (Taipale, Tanskanen, Mehtälä, Vattulainen, Correll and Tiihonen, ref. 4). Using a Finnish national register, the cohort included 62 250 people, including 8719 with a first-time hospital admission for schizophrenia and no antipsychotic use in the previous year. Each patient was followed for an average of 14 years. Extrapolating antipsychotic exposure via the national prescription register, they used each person as their own control, comparing hospital admissions during periods of antipsychotic use with those during periods of no treatment. All somatic (non-psychiatric) and cardiovascular hospital admissions were charted, while mortality was tracked as all-cause, cardiovascular, or suicide.
During follow-up, just under 70% experienced admissions to hospital, with no difference in incidence for either somatic or cardiovascular admissions between antipsychotic-use and medication-free times. While these findings are at odds with the known short-term adverse effects of antipsychotic use, the authors surmise that the remission of schizophrenia seen with effective treatment allows for significant lifestyle gains in terms of healthy behaviour and better utilisation of healthcare. In addition, there was a significant positive impact on mortality (22.3%), with antipsychotic use decreasing all-cause mortality, cardiovascular mortality and suicide. The best across-the-board mortality results were seen for clozapine, and the weakest for levomepromazine, with no differences observed between long-acting injectable and oral administration for any drug. Given the size and scope of the study, it can be reasonably concluded that the roughly 15-year decrease in lifespan consistently observed in those with schizophrenia cannot be ascribed, as some had feared, to antipsychotic use but, instead, to their non-use.
Finally, the UK's Mental Health Act has been reviewed and there are wider international discussions suggesting that existing systems for involuntary detention are not fit for purpose. One strong argument is that current laws treat those with a mental disorder differently, and are thus by definition discriminatory and a breach of individuals’ human rights. Indeed, it has been proposed that they are in violation of the United Nations Convention on the Rights of Persons with Disabilities (CRPD). In a stimulating editorial, George Szmukler suggests we might be edging towards a global paradigm shift (Szmukler, ref. 5). He proposes that conventional mental health laws contain two ‘deeply rooted negative stereotypes’ of those with mental illnesses that lack underpinning evidence: first that they are incompetent to make reasoned decisions, and second that they are intrinsically dangerous. He contrasts how, just by having a mental illness, one can be involuntarily detained based on assessment of a risk that has not materialised (and might never do so), whereas those without a mental illness but with known violent tendencies could only ever be detained after committing an offence. There are some radically different models currently being proposed around the globe, and two major potential approaches are described.
The first is labelled a ‘fusion law’ that treats mental and physical health equally, and is based upon one's capacity to make decisions. Under such a model no one with decision-making capacity could ever be treated against their wishes, regardless of their risks to themselves or others. The second model, perhaps contentiously, takes mental illness as a ‘disability’ under the CRPD. With this, ‘substitute decision making’ – which we might best interpret as ‘acting in best interests’ for another – is in breach of the Convention, and individuals’ rights, will and preferences for care must be respected at all times, as they are for those without a disability (or mental illness), even if that means that they are put at risk. The CRPD encourages ‘supported decision making’ to assist those with disabilities: in physical health this is well established, for example installing a ramp for those using wheelchairs, but it remains largely unexplored in the realm of mental illness. Szmukler identifies challenges in interpreting an individual's decisions, noting how these might change, for example, because of a psychotic episode. Delineating longer-term ‘will’ in advance directives from shorter-term ‘preference’ might help here, but much remains untested. No doubt clinicians reading this will feel uncertain; in one sense that is the point – we have long ‘held control’ and that is being challenged. Change in how we manage involuntary care does feel inevitable.