I. Introduction
It is now a cliché to highlight that whilst artificial intelligence (AI) provides many opportunities, it also presents myriad risks.Footnote 1 Legal articles tend to focus on the risk that AI will undermine important normsFootnote 2 and the importance of incorporating mechanisms into the operation, or oversight, of AI as a means of offsetting said risk.Footnote 3 Amongst the norms considered, the Rule of LawFootnote 4 unsurprisingly features.Footnote 5 But the analyses of the Rule of Law are narrow. Some use the Rule of Law simply as a means of linking to their real concern, which is for the protection of fundamental rights.Footnote 6 Others speak only in negative terms about AI’s deleterious impact on the Rule of Law.Footnote 7 These accounts are incomplete. AI has the capacity to augment as well as to erode fidelity to the ideal of the Rule of Law, which may or may not require protection of fundamental rights. Rather than viewing AI only as a threat to important norms, this article’s core argument is that AI should also be presented as an opportunity to meet their demands. The article’s particular focus is on the Rule of Law in tax administration. It departs from much of the literature on the topic of AI that studies AI in order to reaffirm an essential truism – that technological progress jeopardises norms we consider important. To be clear, there is much value in these contributions as they explain precisely how the threats arise and where the focus of protections should be. What the present study attempts to do is to use the investigation of AI’s capabilities to demonstrate the ways in which jurisdictions could better attain the values that they apparently, might or should, consider important.
The first aim of this article is accordingly to rebalance the discourse on the interaction between AI and the Rule of Law. In doing so, it complements a broader argument in the literature that the power of the State can be positively harnessed to achieve important goals.Footnote 8 The second aim is to demonstrate that a jurisdiction’s adoption of either a thinner or thicker form of the Rule of Law can be deduced from analysing how that jurisdiction regulates important legal relationships. For the present article, the tax specific administrative law on the use of AI is used as a case study, though it is argued that the analysis could also be extended to other important norms and contexts. The investigation can also act as a mirror to jurisdictions, forcing them to reflect upon whether changes are needed to the regulation of AI so that their Rule of Law ethoses are respected.
The article is structured as follows. Section II presents some thoughts on the Rule of Law which serve to produce an analytical framework. Section III applies that analytical framework in order to demonstrate how AI could be harnessed to better meet the Rule of Law’s demands than presently occurs. Section IV looks at how jurisdictions, namely Germany, the Netherlands, Slovakia and the UK, have responded to the regulation of AI in tax administration and what lessons can be drawn about the Rule of Law from them.
II. Some Thoughts on the Rule of Law
Before it is possible to consider the relationship between the Rule of Law and AI, it is necessary to discuss the Rule of Law. The first part introduces the broad tenets of the concept, whilst the second part elaborates on some material distinctions between different accounts of the Rule of Law. Those distinctions have some relevance for Section III of this article but are particularly important for Section IV.Footnote 9
A. The Broad Tenets of the Rule of Law
The Rule of Law is an ideal of political morality which has the effect of elevating the importance of law and legal infrastructure in any system of governance.Footnote 10 It has two essential components – that official action is authorised by law, hereinafter referred to as the principle of legality,Footnote 11 and that laws should have certain qualities in order to perform their functions. Although writers agree as to its importance and on the need for these two basic components, the Rule of Law remains an elusive concept, the full contents of which are devoid of universal agreement. The best that we can say is that people tend to adopt either thinner or thicker versions of it.
The core function of a “thin” account of the Rule of Law is that laws should act as guidanceFootnote 12 – the “guidance function”. A person should, before committing themselves to any course of action, be able to know the legal consequences that will follow.Footnote 13 Raz, for instance, describes this as the “basic idea” of the Rule of Law.Footnote 14 As a result, legal rules should adhere to standards of (reasonable)Footnote 15 clarity and accessibility.Footnote 16 To assist with the resulting epistemicFootnote 17 demand of the Rule of Law, it is desirable for governments to inform people how the rules work.Footnote 18 Lon Fuller’s “laundry list”Footnote 19 of requirements of law – generality, publicity, prospectivity, intelligibility, consistency, practicability, stability and congruence – is conventionally highlighted at this point as a canonical thin Rule of Law framework.Footnote 20 With respect to “congruence”, an idea which will be relevant for our later discussion, there should be similarity between the law as stated and the law as applied. The guidance function of law is frustrated where a person factors legal consequences into decisions that they make only to find that officials subsequently fail to apply the law as it had been laid down.Footnote 21
What advocates of thicker accounts argue is that the Rule of Law makes further, more substantive dictates, such as requiring respect for fundamental human rightsFootnote 22 and private property.Footnote 23 Whilst thinner accounts also insist that certain rights be protected (due process rights, for instance),Footnote 24 in thicker accounts the demands for the protection of rights are more exacting. Thicker versions thus consider not just the inner (procedural and formal) morality of laws, but also what laws should regulate. Laws to that end should constrain the arbitrary exercise of power.Footnote 25 In the context of relationships between the Government and its subjects, this means constraining the exercise of government power.
B. Material Distinctions in Their Demands
Thinner and thicker accounts will agree on the importance of the guidance function and thus that individuals should be able to understand the rules, which are applied as they are written. They will also agree that there should be “accountab[ility] through law” where there are potential infractions by officials.Footnote 26 For this to be realised, there must be an infrastructure in place. There must be a court system, for instance, which is competent to issue remedies for infractionsFootnote 27 and an ability for individuals to access this court system.Footnote 28 A corresponding duty to give reasons for official decisions will arise as a result, without which individuals would be practically unable to challenge administrative decisions.Footnote 29
But they will disagree on the extent of other demands. Whereas in a thin account, laws which conform to the procedural and formal standards espoused by Fuller and others will provide the necessary authority for government action (and thus satisfy the “principle of legality”),Footnote 30 in thicker accounts the focus will be on ensuring that officials are provided only with the authority which is necessary to perform a particular function. Thicker accounts as such will insist that administrative powers are provided only by way of tightly defined rules, with safeguards built into the exercise of those powers. Whilst the thin account would only impose a narrow “principle of finality” in relation to judicial decisions (res judicata Footnote 31 – “the most important test of an independent judiciary in modern society”),Footnote 32 a thicker account would broaden the scope of the principle of finality to administrative decisions (res decisa).Footnote 33 In a criminal law context, for instance, once a decision not to prosecute has been taken by a public official, it would not be permissible to go back on that decision.Footnote 34 The principle of finality would also extend to passive inactions (in contrast to decisions not to act). It would be incumbent on public officials to take decisions in a timely manner and, to that end, the limitation period within which individuals could be pursued for alleged past offences will be relatively short.
III. Meeting the Rule of Law’s Demands
Concerns about AI, defined here following the OECD as “a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments”,Footnote 35 saturate the literature. Jobs are in jeopardy.Footnote 36 Democracy is on the line.Footnote 37 The use of AI may result in the concentration of economic power,Footnote 38 whilst also exacerbating existing social and economic inequalities.Footnote 39 Biases in AI can result in decisions which worsen the plight of groups which have historically suffered discrimination.Footnote 40 Legal scholars write about how important fundamental rights, such as the right to privacy and the right to due process,Footnote 41 or data rights, such as those enshrined in the General Data Protection Regulation (GDPR),Footnote 42 or important legal norms, such as the duty to give reasons and to take into account only relevant considerations when making decisions,Footnote 43 are endangered by AI. Many in turn propose mechanisms for using law to manage these threats and provide accountability,Footnote 44 or mechanisms for designing AI to ensure that they do not materialise.Footnote 45
The Rule of Law too is said to be under threat. Proponents of thicker accounts have argued that AI technologies are problematic because they could lead to the “domination” of humans.Footnote 46 AI power, if unchecked, could diminish respect for the Rule of Law.Footnote 47 As AI imperils fundamental rights, some authors argue that to the extent that this materialises, the Rule of Law too is undermined.Footnote 48 With respect to thinner accounts, important Rule of Law values, such as transparency and accountability, predictability and consistency, and equality before the law could be attenuated.Footnote 49 Where AI technology is opaque, humans subjected to decisions negatively affecting their interests are denuded of the possibility of challenging those decisions to ensure that they had been arrived at correctly.Footnote 50 The focus overwhelmingly is on how the Rule of Law provides a good reason to limit technology’s capabilities.
These claims are not disputed here. But it needs to be appreciated that these technologies can also be used to assist governments in meeting various Rule of Law demands. For instance, the Rule of Law encourages public authorities to make swift and consistent decisions and to achieve high levels of compliance with the underlying rules, whether that is through assisting citizens in complying with their legal obligations or enjoying their legal rights, or through being more effective in the execution of public functions. AI can be harnessed to satisfy these Rule of Law demands. However, very few scholars have alighted on this latter fact. There are a few notable exceptions. John Tasioulas recognises the potential of AI to assist with delivering congruence, though he ultimately rejects this in the narrow context of the judicial adjudication of legal disputes.Footnote 51 Aziz Huq has a slightly broader focus and highlights that AI could be better (or indeed could be worse) at satisfying Rule of Law demands, such as the duty to give reasons, the engendering of stability and certainty, or ensuring accountability through law.Footnote 52 More generally, there are scholars who have written about how law drives, shapes and enables technological development.Footnote 53 Karen Yeung, for instance, argues that there are good reasons to suggest that law will shape the use of blockchain rather than the other way round.Footnote 54 Meanwhile, the “informational economy” (an economy where the goal is to produce, accumulate and process information),Footnote 55 which underpins AI, has been facilitated by law as Julie Cohen has argued.Footnote 56
Both Tasioulas and Huq write from the perspective of the procedural claims of “thinner” accounts of the Rule of Law; Yeung discusses the Rule of Law in terms of the State’s interest in its continuing to prevail over coding, whilst the Rule of Law is present but not at the core of Cohen’s analysis. This article builds upon these contributions by specifying, in this section, how AI can facilitate greater satisfaction of the Rule of Law’s demands than presently occurs. The use of AI by tax authorities in order to assist those authorities in carrying out their legal functions can be used as a case study for teasing out this argument.
The OECD predicts that the next era of tax administration – Tax Administration 3.0 – will see a “paradigm shift” away from the present approach which relies on active, burdensome and voluntary compliance by taxpayers towards seamless and frictionless automation.Footnote 57 The OECD suggests that AI will play a crucial role in Tax Administration 3.0, increasing compliance and reducing tax collection costs.Footnote 58 Three AI technologies in particular are being used by tax authorities: machine-learning (ML), natural language processing (NLP) and expert systems.Footnote 59 ML systems analyse and draw inferences from data. They are used in predictive models often using neural networks, which act like neurons in the brain and are referred to as “black box” systems because it is not clear how the information provided (“inputs”) led to the arrival at particular decisions (“outputs”).Footnote 60 France has used a black box ML algorithm to determine local property taxation of buildings based on outlines of building size from aerial photos which are then compared with the data held by the tax authorityFootnote 61 (known as data matching). In China, data matching and black box ML have been used to review simple tax returns (cutting 90 per cent of workload for officials).Footnote 62 NLP systems understand, interpret and generate human language. The Australian Tax Office used NLP software to understand large amounts of data supplied from the Paradise Papers, a leak from Appleby of over 13 million documents relating to offshore investments.Footnote 63 Expert systems, finally, are developed by experts. Knowledge is coded into “rules” which operate with “if X, then Y” logic. As the reasoning of the system is transparent, they are a form of “white box” AI. The Canada Revenue Agency uses a white box questionnaire-based expert system to determine which AI projects to focus on in light of the tax administration’s priorities.Footnote 64
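The “if X, then Y” logic of an expert system, and the sense in which its reasoning is transparent, can be illustrated with a minimal sketch. The rules, field names and the 200-day threshold below are hypothetical and invented purely for illustration; they do not reflect any jurisdiction's tax code or any tax authority's actual system.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold, chosen only for illustration.
ILLUSTRATIVE_DAY_THRESHOLD = 200


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # the "if X" part
    conclusion: str                    # the "then Y" part


# Hypothetical rules written by a (fictional) subject expert.
RULES = [
    Rule("R1", lambda f: f["days_present"] >= ILLUSTRATIVE_DAY_THRESHOLD,
         "treat as resident"),
    Rule("R2", lambda f: f["has_home_in_country"],
         "refer to human officer"),
    Rule("R3", lambda f: True,
         "treat as non-resident"),
]


def decide(facts: dict) -> tuple[str, list[str]]:
    """Apply the rules in order and return the conclusion with a full trace."""
    trace = []
    for rule in RULES:
        fired = rule.condition(facts)
        trace.append(f"{rule.name}: {'fired' if fired else 'skipped'}")
        if fired:
            return rule.conclusion, trace
    raise ValueError("no rule matched")


conclusion, trace = decide({"days_present": 10, "has_home_in_country": True})
print(conclusion)
print(trace)
```

Because every rule that was evaluated appears in the trace, the path from inputs to output can be read off directly, which is what distinguishes a white box system of this kind from a black box model.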
Despite the apparently strong claims made about Tax Administration 3.0, the OECD has at the same time been careful in its publications to qualify when it expects its lofty hopes to be met.Footnote 65 Indeed, there is a need to be realistic about the capabilities of new technologies and to avoid the trap of embracing “technological solutionism”, as Evgeny Morozov calls it – the utopian idea that technology holds the solution to our social and personal problems (particularly if we were to prioritise the values that technology is good at advancing).Footnote 66 It is now well established in the social science literature that it is folly to adopt such an uncritical, unnuanced approach to problems ranging, for instance, from pandemicsFootnote 67 to healthcare,Footnote 68 governance,Footnote 69 criminal justiceFootnote 70 and so on.Footnote 71 To that end, there are various limitations to the capabilities of ML, NLP and expert systems (as this article will showcase) which serve to paint a nuanced portrait of how they could assist in advancing the Rule of Law’s demands for accountability through law, the guidance function of law and the principle of finality, to which we shall now turn.
A. Accountability Through Law
The Rule of Law imposes a responsibility on tax authorities to provide reasons to taxpayers for decisions taken pursuant to these powers. Whilst the problems in respect of opaque AI systems are well rehearsed in the literature,Footnote 72 and indeed will be criticised below also,Footnote 73 in the context of tax administration there is reason to think that AI could be used to generate more transparent decision-making than currently occurs.
Whereas in common law administrative law scholarship, there is a longstanding debate about whether there is or should be a general duty to give reasons,Footnote 74 in the tax administration context there are clear instances in which tax authorities are statutorily obliged to give reasons for their actions.Footnote 75 This is unsurprising given that tax is a creature of statute (in virtually all jurisdictions today)Footnote 76 and that all powers that tax authorities have which impinge on the interests of taxpayers must have a statutory basis.Footnote 77 There are broadly four categories of coercive tax authority powers: assessment powers; information powers; collection powers; and sanctioning powers. In many instances where these are exercised, the taxpayer must be given “notice”Footnote 78 (as it is termed in the UK)Footnote 79 and in many of those instances, in turn, the notice must state reasons for the exercise of the power.Footnote 80 For instance, where a tax authority suspects fraud, an auditFootnote 81 can be opened in respect of a taxpayer’s affairs in many jurisdictions up to ten years prior.Footnote 82 The taxpayer must be made aware that it is the suspicion of fraud which triggered the audit.
Algorithmic decision-making should, theoretically, make it easier for reasoned decision letters to be sent to taxpayers where tax authorities have a duty to give reasons, or to tax officials in order to explain how AI-assisted decisions have been arrived at. Brazil, for instance, is already using a system for customs inspections which explains its reasons for identifying a particular risk.Footnote 83 Indeed as Huq says, “[t]here is no reason why [an ML] adjudicative tool cannot, for example, offer an ordinary language account of its reward function, the most significant parameters in the determination of an outcome, and an account of what behaviors or factors might be changed to receive a different outcome”.Footnote 84 Authors to that end are increasingly working on the concept of explainable AI (XAI),Footnote 85 to such an extent that “explainability has evolved into its own field of research”.Footnote 86 XAI methods are broken down into “local” and “global” explanations and attempt to counteract the problem of opacity inherent in black box AI systems. Global explanations consider broadly how black box ML systems operate, for instance how “input” factors, in general, affect outputs.Footnote 87 Local explanations, by contrast, try to establish how a system arrived at a particular decision, for instance by testing the system against counterfactuals in order to narrow down how the system arrived at a particular outcome.Footnote 88 Of course, issues need to be teased out about what explainability means and, indeed, the necessary requirements of explainability will differ from context to context.Footnote 89 There is, however, a fundamental limitation to XAI, which is that we can never know the “true” reason for a decision arrived at by a black box ML systemFootnote 90 – all XAI systems are added on to ML systems (they are not an intrinsic component of the same model) and provide post-hoc explanations. They essentially try to reverse engineer the decision-making process.
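The counterfactual approach to local explanation can be sketched in a few lines. The scoring function below is a stand-in for an opaque model, and its features, weights and threshold are all hypothetical; the point is only that the explainer treats the model as a black box, changes one input at a time and reports which changes would have altered the outcome.

```python
# Stand-in for an opaque model. In practice the explainer could only call it,
# not read it. All features and weights here are invented for illustration.
def black_box_points(taxpayer: dict) -> int:
    points = 0
    points += 4 if taxpayer["cash_intensive_trade"] else 0
    points += 3 if taxpayer["declared_income"] < 20_000 else 0
    points += 2 if taxpayer["offshore_accounts"] > 0 else 0
    return points


FLAG_THRESHOLD = 5  # hypothetical cut-off for opening an audit


def is_flagged(taxpayer: dict) -> bool:
    return black_box_points(taxpayer) >= FLAG_THRESHOLD


def counterfactuals(taxpayer: dict, alternatives: dict) -> list[tuple]:
    """Return the single-feature changes that would flip the decision."""
    original = is_flagged(taxpayer)
    flips = []
    for feature, alt_value in alternatives.items():
        changed = {**taxpayer, feature: alt_value}
        if is_flagged(changed) != original:
            flips.append((feature, taxpayer[feature], alt_value))
    return flips


taxpayer = {"cash_intensive_trade": True, "declared_income": 15_000,
            "offshore_accounts": 0}
alternatives = {"cash_intensive_trade": False, "declared_income": 30_000,
                "offshore_accounts": 1}
for feature, old, new in counterfactuals(taxpayer, alternatives):
    print(f"Had {feature} been {new} rather than {old}, "
          "the decision would have differed.")
```

A sketch of this kind yields a post-hoc account of which factors mattered for the individual decision; it does not reveal the model's internal reasoning, which is the limitation noted above.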
Determining the “real motivation” behind a decision, however, is a perennial problem in legal systems. An employer may, whilst concealing their true discriminatory motivation, dismiss an employee ostensibly on the basis of justifiable reasons. Unconscious bias may influence a judge’s decision, even if s/he communicates rational reasons to support the decision. The legal systems’ solution to these conscious or unconscious biases is to use post-hoc evidence-gathering techniques to understand best how a decision was arrived at: cross-examining a witness, for instance, or analysing a simultaneous recording, inferring motivation from tone and language. These are an imperfect solution where no perfect solution exists. Given that same conundrum in an XAI context, post-hoc evidence-gathering provides precedent for (imperfectly) resolving the issue in an AI context. Local explanation and global explanation methods operate in an equivalent way to post-hoc evidence-gathering techniques in law.
With this in mind, we could envision, say in the context of a decision to open a fraud audit, an XAI decision letter. The letter would set out what amounts to fraud in law, broadly what factors the tax authority took into account in making its determination and why greater weight was placed on particular factors. Indeed, the XAI could be used in combination with tax authorities’ risk management systems (RMSs), which these days tend to be how fraud and other forms of underpayment are flagged. Based on several factors (like the type of trade, the complexity of a business, earnings level and size), taxpayers will be deemed to be at “risk” of not complying with the tax code. The more likely it is that a taxpayer will fail to fulfil their tax obligations, the “riskier” they are. The “riskier” they are, the more resources will be allocated towards auditing and monitoring the taxpayer. ML is now used to assist in determining risk levels, by recognising “correlations, structures, and anomalies in data”.Footnote 91
An XAI letter would be a significant improvement on the way things currently work. Not only are tax authorities reluctant to explain in any detail how decisions are arrived at by their RMSs,Footnote 92 there is also very little reassurance that the reasons offered for a decision are the “true” reasons. The “true” reasons could have been a function of deliberate, let alone unintentional, bias on the part of the tax official. But a critical difference is that while XAI can operate alongside an RMS, the “real motivation” of a human tax inspector’s decision can only be imperfectly “determined” long after a decision (where it is challenged by a taxpayer). XAI decision letters accordingly would put taxpayers in a better position than they currently are to challenge the relevant decisions and to persuade an independent adjudicator that the reasons which purport to justify the decision are insufficient.Footnote 93
B. The Guidance Function of Law
AI may in many ways enable tax authorities better to comply with the Rule of Law’s demand that law act as guidance both in terms of equipping people with a more complete understanding of their rights and obligations and in respect of ensuring congruence between the law as stated and applied.
Taxpayers owe a continuous positive obligation to ensure that they are complying with the law. As a result, it is unsurprising that taxpayers will need assistance in understanding their rights and obligations, and indeed such assistance promotes the Rule of Law in the sense that it renders taxpayers better equipped to understand the legal consequences of their actions.Footnote 94 Tax advisers offer that service for a fee, whilst tax authorities generallyFootnote 95 offer that service for free through offering assistance through helplines, producing guidance, hosting webinars and providing bespoke rulings. In recent years, tax authorities have invested in chatbots to provide this assistance also. Around 29 administrations studied by the OECD in 2021 used chatbotsFootnote 96 and, as of 2022, 13 had used AI to improve the service.Footnote 97 The basic idea is that taxpayers can come to the bot with a query, and that bot provides an answer to the taxpayer, whether this is about the substantive law or some procedural issue (such as which form to fill in).Footnote 98 Chatbots are usually based on NLP. These may be intent-based systems, for instance, where the chatbot tries to determine the intent behind the user’s question and then provides a pre-determined answer, or more sophisticated ML NLP systems, such as generative AI models like GPT which can formulate new content.Footnote 99 Whilst chatbots will struggle to deal with more complex legal issues and factual matrices,Footnote 100 they should, in time, be able to do a better job than human beings for the great majority of queries that tax authorities face, which involve discrete questions and relatively simple tax affairs.Footnote 101 AVIVA, a chatbot used in Spain to deal with VAT queries, for instance, reduced email traffic by 90 per cent.Footnote 102
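How an intent-based chatbot differs from a generative one can be shown with a minimal sketch. The intents, keywords and answers below are placeholders invented for illustration and do not describe AVIVA or any real tax authority's chatbot; real systems would use a trained intent classifier rather than keyword overlap.

```python
# Hypothetical intents with pre-determined (placeholder) answers.
INTENTS = {
    "filing_deadline": {
        "keywords": {"deadline", "due", "when"},
        "answer": "Placeholder answer about filing deadlines.",
    },
    "which_form": {
        "keywords": {"form", "which"},
        "answer": "Placeholder answer about which form to use.",
    },
}
FALLBACK = "Sorry, I did not understand. Please contact the helpline."


def reply(question: str) -> str:
    """Pick the intent sharing the most keywords with the question."""
    words = set(question.lower().replace("?", "").split())
    best_name, best_overlap = None, 0
    for name, intent in INTENTS.items():
        overlap = len(words & intent["keywords"])
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    if best_name is None:
        return FALLBACK
    return INTENTS[best_name]["answer"]


print(reply("When is the deadline?"))
print(reply("Which form do I need?"))
```

The bot can misread the intent behind a question, but it can only ever return an answer that was written in advance, whereas a generative model composes new text.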
The introduction of chatbots is a positive overall development for the Rule of Law (even if the true motivation behind their introduction may be cost cuttingFootnote 103). This is so even though chatbots will make mistakes, though less so in the case of intent-based as opposed to ML NLP systems. Whereas the former provide pre-determined answers on the basis of the bot’s estimation of the intent of the user’s question (and so misinterpretation can arise), the latter generate original output, which is why these chatbots are liable to hallucinate and, essentially, attempt to pass off demonstrably false information as true.Footnote 104 These issues do not undermine the claim that chatbots are positive for the Rule of Law overall. From a taxpayer’s perspective, legal certainty is desired,Footnote 105 and it is therefore important that the information provided by the chatbot can be relied upon. Such reliance could be ensured either by legal doctrines, such as legitimate expectations, or by a statutory rule binding the tax authority to statements made by the chatbot.
AI is also already, and will continue to be, used even more effectively to ensure greater levels of compliance, something which is seen clearly through the use of ML RMSs in combination with data scraping tools. Tax authorities now have access to vast quantities of data from a wide range of sources – information from taxpayers themselves, employers and financial institutions, other tax authorities, other government agencies (such as those responsible for registering companies and hosting company accounts), public information (like the electoral roll), flight sales and passenger information and social media sites.Footnote 106 Tax authorities can get quite a clear picture of a taxpayer’s affairs both from looking at their profile as built from these data sources, comparing across taxpayers with common characteristics, such as age, trade and geographical location and “network analysis” – looking at connections between taxpayer transactions.Footnote 107 That all of this information is digitalised means that tax authorities can use ML to assist in determining risk levels, by recognising “correlations, structures, and anomalies in data”.Footnote 108 And indeed, many tax authorities have adopted AI-assisted RMSs – around 52 of the 58 jurisdictions studied by the OECD in 2022 use AI-assisted RMSs to uncover previously hidden assets and identify new risks.Footnote 109 Since 2010,Footnote 110 HMRC has used a system known as CONNECT, which is a data matching and black box ML RMS tool.Footnote 111 As of 2023, it holds approximately 55 billion data itemsFootnote 112 and apparently prompts 90 per cent of HMRC investigations.Footnote 113 Though HMRC has refrained from commenting in detail on CONNECT, others have speculated about its positive utility in reducing the gap between the tax that ought to be paid and the tax that is paid, known as the tax gap.Footnote 114 In the 2005–06 tax year it was 7.5 per cent, whilst in 2018–19 it was 4.7 per cent.Footnote 115 In short, RMSs help tax authorities to narrow the tax gap and thereby ensure greater congruence between the law as stated and applied.
There are two important limitations to ML-based RMSs which mean that they are liable to err. The first is the bias risk – that errors can arise by virtue of human errors or unintentional biases of the developers and those using the systems, or because of biases in the sample on which the system trained.Footnote 116 For instance, in the infamous Toeslagenaffaire (the Dutch Childcare Benefits Scandal, in which at least 11,000 non-Dutch parents had their claims for childcare allowance, administered by the tax authority, deemed fraudulent by a black box ML RMS),Footnote 117 nationality was used as one of the input factors which trained the AI system to detect the likelihood of fraud, along with other more benign factors. But it has been speculated that the system in turn learnt from the actions taken by tax officials (who used the data from the system to inform their interventions, in the form of audits and queries, which were targeted towards specific nationalities).Footnote 118 This created a feedback loop and induced the algorithm to place greater weight on nationality when determining the likelihood of fraud.Footnote 119 The second is the inherent probability risk. As prediction models are “inherently probabilistic”, “prediction errors” are unavoidable.Footnote 120 The training of any prediction model involves choosing an error threshold which appropriately balances the risk of error with the consequences that it would be desirable to avoid.Footnote 121 In other words, it must be decided whether it is better to have false positive or false negative errors. For example, false positives in determining recidivism risk might result in people being jailed unnecessarily. False negatives for cancer screening would result in cancer going undiagnosed. In a tax context, the error threshold will necessarily be calibrated to produce more false positives than false negatives – it is better to audit taxpayers that are not liable than to fail to audit taxpayers that are.
Any false positives, however, will at the very least bring about distress on the part of the audited taxpayers – what Adam Smith famously referred to as “unnecessary trouble, vexation, and oppression … equivalent to the expense at which every man would be willing to redeem himself”.Footnote 122 Distress may be unfortunate, and to the greatest extent possible tax authorities should seek to minimise the number of false positives as a result, but distress is inevitable whether AI is used or not. More troubling are instances where severe consequences follow from false positives. In the Toeslagenaffaire, some parents had their child benefit claims disallowed because they used a childminding agency which was perceived to be fraudulent, despite the fact that perhaps up to 20 per cent of such parents were entirely innocent.Footnote 123 Under System Teleinformatyczny Izby Rozliczeniowej (The Clearing House Teleinformatic System) in Poland, a black box ML RMS used to detect VAT fraud, false positives could generate fatal consequences for business. Where the system detected a high risk of fraud, a business would have its bank account frozen for 72 hours while an investigation was carried out (extendable up to three months).Footnote 124
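The trade-off involved in choosing an error threshold can be made concrete with a small sketch. The risk scores and outcomes below are synthetic and invented for illustration; they are not drawn from any tax authority's data. Lowering the threshold at which a taxpayer is flagged reduces false negatives (non-compliant taxpayers who escape audit) at the cost of more false positives (compliant taxpayers who are audited).

```python
# Synthetic (risk score, truly non-compliant?) pairs, invented for illustration.
CASES = [
    (0.95, True), (0.85, True), (0.80, False), (0.65, True),
    (0.55, False), (0.45, True), (0.30, False), (0.10, False),
]


def errors_at(threshold: float) -> tuple[int, int]:
    """Return (false positives, false negatives) when flagging score >= threshold."""
    false_pos = sum(1 for score, bad in CASES if score >= threshold and not bad)
    false_neg = sum(1 for score, bad in CASES if score < threshold and bad)
    return false_pos, false_neg


for threshold in (0.9, 0.6, 0.4):
    fp, fn = errors_at(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data no threshold eliminates both kinds of error at once, which is the sense in which prediction errors are unavoidable and the choice of threshold is a policy decision rather than a purely technical one.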
These two limitations are also not fatal to the positive role that AI can play in facilitating congruence. Methods, such as oversight and testing, are being developed to manage bias risk.Footnote 125 The inherent probability risk prompts considerations as to the safeguards which limit a state’s response to a particular social or economic problem, safeguards being relevant for thicker accounts of the Rule of Law which focus on constraining governmental power. In a sense, the solution lies in not pre-emptively attaching consequences and instead insisting upon having a human check that measures should be taken against the accused person.
C. Principle of Finality
AI can be harnessed to commence and conclude tax investigations quickly, hence satisfying res decisa, a principle which is important in the context of thicker but not thinner Rule of Law accounts. Decisions on whether formally to commence an audit could be taken by an AI system – using the risk-based algorithm – and audits could thereafter be closed more efficiently than is the case with human decision-making today. Where an audit has been opened, expert systems could be developed using “knowledge from human experts and encode that knowledge into rules which will be applied based on the factual information” obtained from taxpayers.Footnote 126 These systems would “collect facts from users through interview-style questions and produce answers based on a decision-tree analysis”.Footnote 127
It might not be feasible to develop an expert system like this to deal with audits generally, as expert systems can become overly complex “to the point that even experts struggle to make sense of them”Footnote 128 when operating at scale. This is because one of the chief limitations of expert systems is that they require extensive input from subject experts who would need to design the questions and add the rules that should determine what different answers would mean. But this is not that dissimilar to what already occurs in certain audits. In the UK, for instance, Heather Self and I have reported that HMRC is already using pro forma questionnaires with large businesses during audits.Footnote 129 Presumably HMRC’s analysis will pivot depending on the answers supplied. It is not much of a stretch to imagine these questionnaires and reasons (turned into rules) being used in the development of an expert system. So, whilst large-scale use of expert systems for auditing generally might not be realistic, it would be feasible to free up tax authority resources by using expert systems for particular types of dispute (for instance those involving residency tests) or for particular types of taxpayer (such as large financial institutions). If successfully implemented, a reduction in limitation periods could be countenanced whereby tax authorities would be time-barred from investigating a taxpayer’s past affairs more quickly than at present.
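To make the idea of an expert system concrete, the following is a minimal sketch of the interview-style, decision-tree approach described above. The two questions and their outcomes are hypothetical and deliberately simplistic; they are not any jurisdiction's actual residency test, and the names `Node`, `TREE` and `decide` are assumptions made for illustration. The point is only that expert-authored rules can be encoded as a tree which produces both an outcome and a record of the reasoning that led to it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A yes/no question node; a leaf carries an outcome instead of a question."""
    question: Optional[str] = None
    key: Optional[str] = None
    if_yes: Optional["Node"] = None
    if_no: Optional["Node"] = None
    outcome: Optional[str] = None


# Hypothetical rules for illustration only, not a real statutory test.
TREE = Node(
    question="Did the taxpayer spend 183 or more days in the country?",
    key="days_183_or_more",
    if_yes=Node(outcome="treat as resident"),
    if_no=Node(
        question="Is the taxpayer's only home in the country?",
        key="only_home_in_country",
        if_yes=Node(outcome="treat as resident"),
        if_no=Node(outcome="refer to a human official"),
    ),
)


def decide(node, answers):
    """Walk the tree using yes/no answers; return the outcome and the reasoning trail."""
    trail = []
    while node.outcome is None:
        answer = answers[node.key]
        trail.append((node.question, "yes" if answer else "no"))
        node = node.if_yes if answer else node.if_no
    return node.outcome, trail


outcome, trail = decide(
    TREE, {"days_183_or_more": False, "only_home_in_country": True}
)
print(outcome)
for question, answer in trail:
    print(f"  {question} -> {answer}")
```

Because every outcome is accompanied by the list of questions and answers that produced it, a system of this kind is transparent by construction, in contrast with the black box ML models discussed elsewhere in the article. The sketch also shows why scaling is difficult: each additional rule must be written and maintained by a subject expert.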
IV. AI, Administrative Tax Law and Jurisdiction-Specific Insights
There is still more to be said about AI and the Rule of Law as viewed through the lens of tax administration. By looking more granularly at how the use of AI by tax authorities is legally regulated in specific jurisdictions, we can explore how different legal systems reflect their own particularised version of the Rule of Law. In other words, we can explore whether and how the laws of different jurisdictions are calibrated in response to the Rule of Law’s demands. That states might not countenance adapting their legal regimes in response to the capabilities and limitations of AI, as explored in Section III, reflects their approach to the Rule of Law. Whether thicker or thinner accounts are adopted, and the precise shape of those accounts, will also be revealed through such an examination. This methodology differs from that adopted in the literature whereby the European Convention on Human Rights (ECHR) or EU law are used to analyse tax AI,Footnote 130 instead recognising the value in looking at jurisdiction-specific approaches. This section focuses principally on Germany, the Netherlands, Slovakia and the UK, as these jurisdictions offer contrasting civil and common law approaches and also provide real world examples of cases or legislation involving the use of AI by tax authorities. It looks at three particular Rule of Law demands – the principle of legality, accountability through law and the principle of finality.
A jurisdiction-specific examination also pursues a secondary goal in that such an examination helps to spotlight instances where there is an asymmetry between a jurisdiction’s beliefs about the Rule of Law and how it operates with respect to the legal regulation of the use of AI by tax authorities, allowing corrective action to be taken. In other words, it can prompt action from policymakers who are unhappy with how the rules governing the use of AI by tax authorities do not reflect a jurisdiction’s ideal version of the Rule of Law. This point is more fully developed in the Conclusion.
Can so much be learnt about a particular jurisdiction’s approach to the Rule of Law by looking at such a narrow field? The author would contend that the answer is “yes”. An examination of domain-specific administrative lawFootnote 131 provides insights into the shape and content of the Rule of Law in a jurisdiction, given that “[a]dministrative law is closely bound up with national institutions and traditions, as well as national constitutional values and ways of operating”.Footnote 132 The legal regulation of AI by tax authorities is itself a subfield of administrative tax law (those rules peculiar to the exercise of tax functions and legal principles peculiar to the exercise of public functions). Though administrative tax law is just one of the many administrative law regimes which bears upon the relationship between the individual and the State, it is a particularly important one. Without tax, there is no modern State (subject to a few, very notable exceptions in the case of monarchical oil rich countries).Footnote 133 Tax constitutes the StateFootnote 134 and is the one area of public administration which is essentially unavoidable. In Ajay Mehrotra’s words, “taxes are among the most pervasive and persistent ways that citizens interact with their government”.Footnote 135 All of us pay taxes – maybe not taxes on income, but at least taxes on consumption, whilst registration into the tax system may also be a condition for obtaining paid employment and benefits.Footnote 136
A. The Principle of Legality
Błażej Kuźniacki, Marco Almada, Kamil Tyliński, Łukasz Górski, Beata Winogradska and Reza Zeldenrust argue that, for AI systems in tax administration to be deemed constitutional, they should only be implemented by statutory law in a sufficiently precise and predictable manner.Footnote 137 What this section will tease out is that, whilst this might be an accurate, broad summation of the principle of legality in some jurisdictions, it does not reflect the manifestation of the principle in others.
A civil law jurisdiction with a strong tradition of codifying laws, it should be unsurprising that in Germany the use of AI-assisted RMS and the automation of processes in a tax administration context are regulated by discrete and specific legislative provisions.Footnote 138 With regard to the former, section 88(5) of Abgabenordnung (the German Fiscal Code) provides that tax authorities may use automated RMSs in determining when auditing is necessary. But safeguards must be built into the models, such as ensuring that:
- a sufficient number of cases is selected, on the basis of random selection, for comprehensive review by officials;
- officials review those cases sorted out as requiring review;
- officials are able to select cases for comprehensive review;
- regular reviews are conducted to determine whether RMSs are fulfilling their objectives.
With regard to the latter, section 155(4) allows tax authorities “to use fully automated processes to conduct, correct, withdraw, revoke, cancel or amend” tax assessments, credits and prepayments and any associated administrative acts.
These provisions of German law, which are similar in content and specificity to those that can be found in the tax laws of Belgium, France, the Netherlands and Poland,Footnote 139 reflect a particular approach to the Rule of Law which regards as important not merely that public power be authorised by law but that the power be authorised by explicit and tightly defined laws. It also regards as important a further constraining function of thicker conceptions of the Rule of Law – namely, the inclusion of safeguards which make specific dictates.
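The first three safeguards listed in section 88(5) can be pictured as three separate routes into human review, as in the hedged sketch below. It does not purport to describe how any German tax office actually implements the provision. The function name, the numeric threshold and the `random_share` parameter are assumptions: the statute speaks only of a “sufficient number” of randomly selected cases and does not fix a figure. The fourth safeguard, periodic review of whether the RMS fulfils its objectives, is an institutional process and is not modelled here.

```python
import random


def select_for_review(risk_scores, threshold, random_share, official_picks, seed=None):
    """Combine three routes into human review (all parameters are illustrative).

    risk_scores:    mapping of case id -> risk score produced by the RMS
    threshold:      score at or above which the RMS sorts a case out for review
    random_share:   assumed fraction of all cases to sample at random
    official_picks: case ids that officials have chosen to review themselves
    """
    rng = random.Random(seed)

    # Route 1: cases the RMS sorts out as requiring review by officials.
    flagged = {case for case, score in risk_scores.items() if score >= threshold}

    # Route 2: a random sample drawn from the cases the RMS did not flag.
    remaining = sorted(set(risk_scores) - flagged)
    sample_size = min(len(remaining), max(1, round(random_share * len(risk_scores))))
    random_sample = set(rng.sample(remaining, sample_size))

    # Route 3: cases that officials select on their own initiative.
    manual = set(official_picks) & set(risk_scores)

    return {"flagged": flagged, "random": random_sample, "manual": manual}


scores = {"A": 0.9, "B": 0.2, "C": 0.1, "D": 0.6, "E": 0.3}
print(select_for_review(scores, threshold=0.5, random_share=0.2,
                        official_picks=["C"], seed=1))
```

The random route matters because it exposes unflagged cases to human scrutiny, which gives a tax authority some means of discovering whether the model is systematically missing, or systematically over-selecting, particular groups of taxpayers.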
In Slovakia, we see a similar approach. eKasa is a system which transmits information from electronic cash registers to the tax authority. That information is also processed by a risk-scoring algorithm, in turn used to determine which taxpayers to audit.Footnote 140 In essence, it is an RMS whereby one of the inputs is real-time transactional information. eKasa was the subject of litigation before the Constitutional Court of Slovakia because the system potentially breached rights enshrined in the Slovakian Constitution, the Charter of Fundamental Rights, the ECHR and the GDPR. The Constitutional Court found the use of eKasa to be unlawful. The Court accepted that there was a basis in law for the provision of information to the tax authority. But the Court found that this law did not permit the processing of that data through algorithmic means.
The Court’s reasoning can be understood in Rule of Law terms. The processing interfered with the constitutional right to informational self-determination found in Articles 19(3) and 22(1) of the Slovakian Constitution.Footnote 141 As a result, legislation permitting this interference must be established by law both in a formal sense and in a substantive sense – the legislation must “clearly provide the citizen with an appropriate indication of the circumstances and conditions under which the public authority is authorized to encroach on his right”.Footnote 142 But the manner in which the system operated was not known to taxpayers as the legislation left open how taxpayers’ data would be processed.Footnote 143 What the Slovakian Constitutional Court invoked here is the principle of legality.Footnote 144 Whilst the Court regarded EU law and the ECHR as supporting this assessment, the reasoning was grounded principally in the requirements of Slovakian law.Footnote 145
The Court went on to find that the legislation also breached the Constitution because it lacked safeguardsFootnote 146 which the Constitution mandated because of the risk of abuse of power or errors.Footnote 147 To be constitutional, these safeguards must not only exist on paper, but must be “effective”,Footnote 148 which may require “sufficient authorizations, financial resources and control tools”.Footnote 149 Slovakian law, the Court reasoned, also went further than the protections provided by EU law and the ECHR in this respect because it does not just apply to personal dataFootnote 150 and also imposes as obligations that which the GDPR otherwise reserves as choices for Member States.Footnote 151 As such, Slovakian law constrains the actions of public authorities, delimiting their powers to use AI-assisted RMSs in a manner which goes no further than necessary to achieve a clear purpose and incorporates strong safeguards. As with the German Fiscal Code, this invokes a version of the principle of legality which is common to thicker Rule of Law accounts.
In the Netherlands, we can also find evidence that a thicker principle of legality is followed. SyRI was a black box ML-based decision support tool used by several Dutch public authorities to help detect tax and social security fraud.Footnote 152 SyRI used data from a wide range of government agencies and looked to identify inconsistencies, anomalies and asymmetries in the data in order to build a risk profile for individuals.Footnote 153 It was passed into law in 2014Footnote 154 and was in use until it was declared unlawful by the Hague District Court (which in the Dutch judicial system is two tiers below the Supreme Court)Footnote 155 in 2020.Footnote 156 Though the projects for which SyRI was deployed were not run by the tax administration,Footnote 157 it was a system which could have been used by the tax administration. Further, it bore critical similarities to the system used by the tax authorities in the Toeslagenaffaire (thus, the Court’s findings are relevant also to understanding the legality of the system used for child benefit claimants). The Court found that the legislation infringed the right to private and family life as enshrined in Article 8 of the ECHR and that such infringement was not “necessary in a democratic society” because it was disproportionate to the (legitimate) aims pursued.Footnote 158 This was due to lack of specificity and safeguards. As to the former, the SyRI legislation did not make clear the causal link between the data used by SyRI and the level of risk which was then attributed to the data subject. 
In other words, it did not explain how the risk level was either increased or decreased depending on the presence of different factors.Footnote 159 As for the latter, the SyRI legislation was silent on the risk model used (such as the type of algorithms used in the model) and on the risk analysis method used.Footnote 160 In consequence, data subjects were unable to defend themselves “against the fact that a risk report ha[d] been submitted” about them.Footnote 161 Further, individuals not subjected to a risk report would have no way of knowing if their data had been “processed on correct grounds”.Footnote 162 The silence also meant that it was impossible to verify if the system produced discriminatory effects.Footnote 163
Whilst the judgment represents the Dutch approach to the interpretation of Article 8 of the ECHR, different Council of Europe Member States can arrive at different but legitimate interpretations as to the leeway which should be allowed to public authorities when seeking to manage pressing social issues. As the Hague Court noted, the determination of what is “necessary in a democratic society” is subject to a margin of appreciation.Footnote 164
This leads nicely into a discussion of the principle of legality in the UK where, by contrast, specificity of legal authorisation and safeguards are regarded as less important, as evidenced by section 103 of the Finance Act 2020, section 103’s background and the use of CONNECT. Section 103 provides that any responsibility imposed on HMRC can be executed by a computer, including the commencement of an audit.Footnote 165 This updates provisions which previously referred to tasks assigned to an “officer” of the tax authority.Footnote 166 It is unlikely that the old British provisions would pass muster in Germany, the Netherlands or Slovakia as they did not actually specify that the decisions would be made by a non-human. The Constitution in Slovakia, for instance, required infringements of informational self-determination, as such processing would be, to be the “subject of a public debate” as “reflected in the legislative process” and to receive “explicit approval by the legislator”.Footnote 167 Whilst the new provision would deal with some of the concerns of the Court in the eKasa case (in that a computer is actually mentioned), critically it does not specify that ML RMSs can be used, even though it is known that, since 2010, HMRC has been using the CONNECT system for that purpose.
Further, it is worth noting that the UK legislation was not introduced because of a need to ensure that there had been public debate, but simply because there had been conflicting authority on the matter.Footnote 168 By the time the legislation had actually been introduced, the Upper Tribunal in Revenue and Customs Commissioners v Rogers and another Footnote 169 had determined that the old legislation did indeed permit the taking of decisions by computers despite the legislation not specifically providing for this.Footnote 170 In short, what the Upper Tribunal had accepted in Rogers would not have been acceptable in Slovakia or presumably in Germany or the Netherlands.
Section 103 meanwhile would struggle to satisfy the norms of those other jurisdictions as it envisages the taking of highly consequential decisions without requiring explicit safeguards. Whereas in Slovakia the need for safeguards arose by virtue of the processing of data alone, whether it resulted in a decision or inaction,Footnote 171 section 103 allows for the commencement of audits by a computer into a taxpayer’s affairs up to 20 years in the past.Footnote 172
In short, by way of contrast to the thicker Rule of Law conception of the principle of legality reflected in the German Fiscal Code, the eKasa judgment and the SyRI judgment, the UK appears to adopt a significantly thinner version.
B. Accountability Through Law
Opacity of decision-making is a Rule of Law problem per se, whether involving ML AI or not, as there is no way of knowing whether the underlying decision-making was flawed (intentionally or not). The “Horizon” post office scandal is proof of this point, if it were ever needed. There, faulty IT software (Horizon) was used as the basis for pursuing and prosecuting 700+ post office workers, when the evidence was in fact tainted by myriad technical problems.Footnote 173 “Blind faith” was placed in the, resultingly, unchallenged evidence produced by the Horizon system.Footnote 174 It is hardly surprising then that black box AI models are routinely criticised in the literature. The basic idea is that the model will arrive at a conclusion, but the process by which it so arrives will be opaque. Though the article has discussed black box ML so far in terms of ML systems that are inherently complex, Jenna Burrell suggests that there are in fact three types of opacity: intentional – where for corporate or State secrecy reasons, the decision-making process is not disclosed; technical illiteracy – where the human decision-makers are unable to understand and relay to an affected person how the decision was arrived at; and inherent complexity – where even an expert cannot practicably determine how the decision came about.Footnote 175 Whatever type of opacity is in play, the result is the same for the Rule of Law: people are denied the ability to understand fully a decision that has been taken against them and as a result to defend themselves properly. Their ability to hold public authorities accountable through law is affected.
XAI can assist with inherent complexity particularly in the RMS context, though for decisions which directly determine individuals’ rights, the consensus in the literature appears to be that opaque ML models should not be used.Footnote 176 Technical illiteracy was a problem in the Toeslagenaffaire in respect of some welfare recipients.Footnote 177 The relevant tax officials concerned were “unable to explain how the AI assisted system worked” and instead told “welfare recipients they had to repay ‘because the algorithm said so’ [though the point was not put in those explicit terms]”.Footnote 178 But there is no reason to think that this will be a perennial problem – the solution lies in the upskilling and education of public officials.
Intentional opacity meanwhile is common to ML RMSs and is a policy choice which will likely continue to be adopted by states unless challenged. The exact balance between transparency, which is demanded by even a thin Rule of Law account as it enables taxpayers to defend themselves against decisions affecting them, and intentional opacity, which tax authorities regard as necessary to prevent gaming or evasion, will depend on a jurisdiction’s willingness to accept deviations from the ideal of the Rule of Law. To that end, when authors such as David Hadwick and Shimeng Lan argue that tax authorities should be transparent about the factors used in ML RMSs and their decision-making processes,Footnote 179 they are implicitly invoking the Rule of Law and its demand (even in a thin account) for useful reasons. Indeed, above, I argued that AI-assisted systems could enable tax authorities better to fulfil the reason-giving Rule of Law requirement than is currently the case.Footnote 180 And yet, ML RMSs are not transparent, which is particularly unfortunate given the bias and inherent probability risks which pertain to these systems.Footnote 181 For instance, in Germany the provision of information about automated RMSs is forbidden by section 88(5) of the German Fiscal Code if such publicity “could jeopardise the consistency and lawfulness of taxation”. In Slovakia, the use of ML RMSs is entirely non-transparent, with neither legislative provisions nor administrative guidance explaining how AI systems are used.Footnote 182 In the Dutch context, there is opacity about the way data is processed and how taxpayers are selected for audits.Footnote 183 In the UK, there is very little publicly available information about CONNECT or how HMRC’s RMS functions.
In short, Germany, Slovakia, the Netherlands and the UK regard it as legitimate, under the guise of protecting the tax base, to undermine the Rule of Law demand for transparency which would enable taxpayers to hold tax authorities to account for their use of ML RMSs.
What makes the situation all the more unfortunate is that the opaque approach is probably unnecessary for the purposes of protecting the tax base from gaming or evasion. The argument that tax collection would be disrupted by some disclosure of the risk factors is “practically false” in the context of natural persons “who cannot modify their residence, nationality or personal data with ease”.Footnote 184 Meanwhile, the basic factors used in determining risk for large businesses in the UK are disclosed,Footnote 185 something which does not appear to have disrupted the collection of tax, but for every other taxpayer the factors are not. Indeed, taxpayers with resources hire “clever tax professionals or accountants” to figure out the risk factors and this too has not jeopardised the RMSs’ functioning.Footnote 186
C. Principle of Finality
Tax authorities tend to have considerable discretion to conduct audits up to four years after the filing of a tax return (in GermanyFootnote 187 and the UK;Footnote 188 five years in the NetherlandsFootnote 189 and SlovakiaFootnote 190), and this is extended in cases where the taxpayer was negligent (to five years in GermanyFootnote 191 and six years in the UKFootnote 192), where the tax involved offshore arrangements (up to ten years in Slovakia;Footnote 193 twelve years in the NetherlandsFootnote 194) and where fraud is suspected (up to fifteen years in GermanyFootnote 195 and twenty years in the UKFootnote 196).
Despite the fact that AI could be used to reduce the time limits for tax investigations (as tax authorities are able to perform their functions more quickly thanks to technology utilising ML RMSs, NLP and expert systems), the time limits in the Netherlands, Slovakia and the UK have remained unchanged. Indeed, in Germany, the time limit where fraud is suspected increased in 2020.Footnote 197 This is notwithstanding the fact that time limits were originally set for the Tax Administration 1.0 era – a time of paper-based returns when tax authorities had a significantly more restricted view of taxpayers’ affairs. Catching mistakes and intentional non-compliance took time.Footnote 198 With increased access to taxpayer information and the capabilities of AI to speed up the audit process, the constraining function of law as espoused in thicker Rule of Law accounts would suggest that these time limits should be reduced. As tax risks can be caught much earlier today, it should no longer take tax authorities four or five years to realise there is a potential tax loss, for instance. Failing to amend the time limits when there are material changes in circumstances invokes what Yeung and Adam Harkens call a “fallacy of equivalence”Footnote 199 – the fallacy being the idea that those time limits will continue to be acceptable simply because they were accepted before.
Conversely, for those jurisdictions content with thinner Rule of Law accounts, there is no reason to think that the time limits could not be increased. With Tax Administration 1.0, one strong reason for the various statutory limitation periods was the practicality of proving an underpayment of tax (intentional or not) many years in the past.Footnote 200 Records could be lost, and important conversations and intentions forgotten. Tax collection and assessment accordingly was significantly more resource-intensive. Finality of limitation periods gave not just taxpayers an opportunity to get on with their lives, but also enabled tax authorities to focus their limited resources on present and future compliance risks. But these days, given the transition towards digitalisation of tax records and transparency of information relevant to determining liabilities, married with the ability of AI to sift through this information at a much quicker pace than human tax officials, this old rationale loses its force.
V. Conclusion
This article has investigated how AI can be analysed in light of Rule of Law demands in the tax administration context. This was distilled into two sets of analyses. In the first (Section III), the article provided examples of where AI can actively assist in meeting the Rule of Law demands for accountability, certainty, congruence and finality. In the second (Section IV), the article investigated how the Rule of Law is reflected by administrative tax law concerning AI.
The analysis in Section III can act as a mirror to jurisdictions, prompting them to consider why the opportunities to use AI to satisfy Rule of Law demands have not been seized upon. Section IV provides more incisive analysis of this “mirror” argument, forcing countries to reflect upon whether they are happy with the status quo as showcased in this article. It can be said that some states are internally inconsistent in their satisfaction of Rule of Law demands – advancing thicker versions of some principles, but undermining even thin versions in others. More concretely, there are several important jurisdiction-specific insights that can be discerned from the analysis undertaken in this article. With respect to the principle of legality, thicker accounts favour greater specificity in law about the use of AI in tax administration and the inclusion of safeguards, something which is present in Germany, the Netherlands and Slovakia, but not in the UK. Considering accountability through law, for instance, the article suggested that opacity over AI-assisted RMSs is highly problematic and probably unnecessary. Each of the jurisdictions failed to meet the demands of thinner accounts. If, on reflection, policymakers realise that the regulation of AI in tax is out of sync with the general Rule of Law ethos of the State, then corrective action can be taken. Should they do so, the manner of reform will also reflect a particular approach to the Rule of Law. David Hadwick, in that light, would insist that a synopsis of the risk factors in RMSs and how they are used should be placed in primary legislation in order to satisfy the Rule of Law.Footnote 201 But accountability through law could be protected where administrative guidance is used instead of legislation,Footnote 202 provided that the outcome is the same: that people are equipped with sufficient information to challenge tax authority decisions. 
With finality, the option, permitted by thicker accounts, to reduce time limits for tax investigations has not been exercised, though the option available under thin accounts to extend time limits has been exercised by Germany!
In addition to providing a case study in using the Rule of Law to understand the opportunities of AI, the article makes several broader contributions. First, it shows that a jurisdiction’s internalisation of Rule of Law values can be understood by looking concretely at particular legal relationships between the Government and individuals. The present article focused on tax administration, but there are myriad other fields of public law where the same analysis can be undertaken.Footnote 203 Second, its core argument can be linked to a broader hypothesis that technological progress can be harnessed for the purpose of meeting the demands of important legal norms, thereby fitting with a particular trend in the literature.Footnote 204 Third, it shows the value of using the analytical framework that can be deduced from the Rule of Law, whichever account is adopted, for critiquing legal regimes. The Rule of Law has a unique place in that it applies in the context of all legal engagements whereas there will be many instances in which, for instance, the provisions of the ECHR do not apply.Footnote 205 Finally, the article sets foundations for further research on the relationship between AI, the Rule of Law and tax administration. Given the impact that technological progress has and will continue to have on tax administration, we might ask whether tax authority powers (to assess, collect, request information and sanction) should be more tightly controlled.