What role for regulators in developing a reputable AI audit industry?

AI audits are used to verify that algorithmic systems meet regulatory expectations and do not produce harm, whether intended or unintended. Globally, regulatory requirements for AI audits are increasing rapidly:

  • the proposed EU AI Act makes conformity assessments mandatory for high-risk AI applications; typically these are internal governance audits, but external audits are required in specific high-risk cases;
  • the Canadian government has mandated algorithmic impact assessments for federal government institutions;
  • in the United States, senators have proposed the Algorithmic Accountability Act of 2022, which would require impact assessments when companies use automated systems to make critical decisions;
  • New York City passed a bill in November 2021 that mandates annual bias audits of AI systems used in hiring, a move likely to be followed by other state and local governments.

These audit requirements raise many questions: Who should these AI auditors be? What training and qualifications should they have? To what standards should algorithmic systems be evaluated? What role should auditing play in the context of demonstrating compliance?

Four UK regulators with an interest in the digital economy – telecommunications regulator Ofcom, competition regulator the CMA, privacy regulator the ICO and financial regulator the FCA (collectively, the Digital Regulation Cooperation Forum or DRCF) – have recently published a discussion paper seeking views on the potential roles regulators could play in the development of an AI audit industry.

So why should regulators be involved if the market is starting to deliver?

The DRCF asserts that regulators “have an interest in establishing trust in the audit market, so that organizations and individuals can be confident that audits are credible”. Voluntary standards also have an important role to play, with the DRCF noting that “there are often incentives for companies to comply, such as technical standards translating regulatory requirements into product or process design”.

The discussion paper noted recent positive developments in AI audit tools:

  • Tech companies have started developing technical tools for algorithmic auditing, including Facebook’s Fairness Flow, IBM’s AI 360 Toolkit, and Google’s Model Cards for Model Reporting and Fairness Indicators in TensorFlow (a sketch of the kind of check such tools automate follows this list).
  • Industry associations have worked on voluntary standards such as the IEEE’s Ethically Aligned Design.

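To illustrate the kind of check these tools automate, here is a minimal sketch in plain Python. The sample data, the group labels and the 0.8 “four-fifths” threshold are illustrative assumptions only; nothing here is taken from the tools named above.

```python
# Minimal sketch of the kind of disparity check fairness-auditing tools automate.
# The records, group labels and 0.8 "four-fifths" threshold are illustrative only.

from collections import defaultdict

def selection_rates(decisions):
    """decisions: list of (group, selected) pairs -> selection rate per group."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

def disparate_impact_ratio(decisions, reference_group):
    """Ratio of each group's selection rate to the reference group's rate."""
    rates = selection_rates(decisions)
    return {g: rates[g] / rates[reference_group] for g in rates}

if __name__ == "__main__":
    # Hypothetical audit sample: (demographic group, model approved?)
    sample = [("A", True), ("A", True), ("A", False), ("A", True),
              ("B", True), ("B", False), ("B", False), ("B", False)]
    ratios = disparate_impact_ratio(sample, reference_group="A")
    for group, ratio in ratios.items():
        flag = "review" if ratio < 0.8 else "ok"   # common "four-fifths" rule of thumb
        print(f"group {group}: ratio {ratio:.2f} -> {flag}")
```
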
While this “nascent audit ecosystem” provides a promising foundation, the DRCF expressed concern that “it risks becoming a patchwork or ‘wild west’ where entrants can enter the algorithm audit market without any quality assurance”.

Why AI Auditing Isn’t a “Tick a Box” Exercise

While AI auditing can draw on the broader auditing profession, the DRCF points out that AI auditing has its own unique challenges:

  • Machine learning algorithms are data-driven and probabilistic in nature, and the only way to know the exact output under all circumstances is to test with all possible inputs. This is nearly impossible to achieve in a test environment before an AI “goes live”, and even in the real world there is always a risk of unforeseen issues arising.
  • there can be feedback loops in which algorithms adapt their behavior based on how other algorithms (or humans) react to their actions. This can make it impossible to predict the outputs when simulating the algorithm in an isolated test environment – or even in an operating environment after an audit has been performed.
  • AI retraining is necessary to maintain performance as real-world conditions change. However, after retraining there is no guarantee that previous performance measures are still valid, and new biases or other issues may be introduced (see the sketch after this list).
  • some models are now retrained on an individual user’s device with local data, so different users will end up with models that behave in divergent ways.
  • When parts of an algorithmic system are built on elements from several different suppliers, it can be difficult to identify where in the supply chain the audit should take place.

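The retraining point in particular means that an audit is a snapshot rather than a permanent certificate. The sketch below shows one simple way an auditor might compare metric snapshots taken before and after a retrain to decide whether a re-audit is warranted; the metric names, values and tolerance are hypothetical and are not drawn from the discussion paper.

```python
# Illustrative only: compare audit metrics recorded before and after a model retrain.
# Metric names, values and the tolerance are hypothetical, not drawn from the DRCF paper.

def flag_metric_drift(before: dict, after: dict, tolerance: float = 0.02) -> list:
    """Return the metrics whose value moved by more than `tolerance` after retraining."""
    drifted = []
    for name, old_value in before.items():
        new_value = after.get(name)
        if new_value is None or abs(new_value - old_value) > tolerance:
            drifted.append((name, old_value, new_value))
    return drifted

# Hypothetical audit snapshots for a CV-screening model.
audit_before = {"accuracy": 0.91, "false_positive_rate_group_a": 0.08,
                "false_positive_rate_group_b": 0.09}
audit_after  = {"accuracy": 0.92, "false_positive_rate_group_a": 0.08,
                "false_positive_rate_group_b": 0.15}   # new bias introduced by retraining

for name, old, new in flag_metric_drift(audit_before, audit_after):
    print(f"re-audit needed: {name} moved from {old} to {new}")
```
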
Know your AI audit types

The DRCF asserts that the starting point for building a credible AI audit industry is to codify the different types of audit, illustrated below with the discussion paper’s hate speech moderation example:

“A governance audit could examine the organization’s content moderation policy, including its definition of hate speech and whether it meets relevant legal definitions. The audit could assess whether there is appropriate human oversight and determine whether the risk of system error is appropriately managed by human review. An empirical audit could involve a ‘sock puppet’ approach, where auditors create simulated users, enter content classified as harmful, harmless or ambiguous, and assess whether the system’s outputs match what would be expected in order to remain compliant. A technical audit could examine the data on which the model was trained, the optimization criteria used to train the algorithm, and relevant performance metrics, to assess whether the system effectively tackles the risk of hate speech.”

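To make the empirical “sock puppet” approach concrete, the sketch below submits a small set of pre-classified test posts to the system under audit and checks whether its decisions match the expected outcomes. The `moderate()` function, the test cases and the expected labels are hypothetical stand-ins, not part of any real moderation system or of the DRCF’s paper.

```python
# Illustrative "sock puppet" empirical audit: feed pre-classified content to the
# system under audit and compare its decisions with the expected outcomes.
# `moderate` is a hypothetical stand-in for the real moderation system's interface.

def moderate(post: str) -> str:
    """Placeholder for the system under audit; returns 'remove' or 'allow'."""
    banned_terms = {"slur1", "slur2"}          # toy rule, for demonstration only
    return "remove" if any(t in post.lower() for t in banned_terms) else "allow"

# Each test case pairs simulated user content with the outcome the policy expects.
test_cases = [
    ("Post containing slur1 aimed at a group", "remove"),   # clearly harmful
    ("A post about gardening tips",            "allow"),    # clearly harmless
    ("Quoting slur2 in a news report",         "remove"),   # ambiguous: depends on policy
]

failures = []
for post, expected in test_cases:
    actual = moderate(post)
    if actual != expected:
        failures.append((post, expected, actual))

print(f"{len(test_cases) - len(failures)}/{len(test_cases)} cases matched the expected outcome")
for post, expected, actual in failures:
    print(f"MISMATCH: {post!r}: expected {expected}, got {actual}")
```
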
The Risks of the Big Four Auditing Big Tech

While the DRCF supports the professionalization of AI auditing, it also notes that AI auditing could settle into a comfortable captive relationship between the Big Four accounting firms and major global technology companies.

The discussion paper analyzes proposals to “facilitate better audits by introducing specific algorithmic access obligations”; in effect, empowering academics and civil society groups to undertake their own audits of the AI used by corporations. The discussion paper states that “[t]he establishment of greater access obligations for research or public interest purposes and/or by certified bodies could reduce the current information asymmetries, improve public confidence and lead to more effective enforcement”.

But the discussion paper also recognized that it would be important to carefully consider the costs and benefits of any mandatory access to organizations’ systems and considered three approaches:

  • provide access only to the elements necessary to undertake an empirical audit (i.e. an audit of the results), which would respect intellectual property by not requiring access to the “black box” of the AI system;
  • control who has access to different elements of the algorithmic system, for example limiting access to respected academic institutions with expertise in AI; auditors might also be required to operate under a non-disclosure agreement covering the data they inspect;
  • expand the use of regulatory sandbox environments to test algorithmic systems and assess harms in a controlled environment. Regulators could collect data from organizations, for example on the security and bias of algorithmic systems, and could share sufficiently anonymized data with selected third parties such as researchers to enable further investigation and reporting.

The discussion paper also looked at approaches that would, in effect, “crowd-source” AI audits:

“The public can also benefit from a means of reporting alleged harms from algorithmic systems, alongside journalists, academics and civil society actors who are already raising concerns. These reports could feed into an incident reporting database that would allow regulators to prioritize audits. It could also include a form of popular petition or ‘super complaint’ mechanism whereby the public could trigger a review by a regulator, subject to reasonable constraints.”

The risk of AI audits going nowhere

Audits are only useful if there is a broader governance system that can act on the issues uncovered by an audit of an AI system and retool that system to fix the problem.

The discussion paper proposes increased powers for regulators to:

  • prohibit organizations from using the system until the organization has addressed and mitigated the harm.
  • establish red lines where algorithmic systems cannot be used based on their perceived risk to the public, building on the right to restrict the processing of personal data under the UK GDPR.
  • share insights (through co-regulatory models) that regulators gain from audits on how algorithmic systems can create harm and how this can be mitigated. This can help inform algorithmic design upfront or give companies a better understanding of how they should audit their own algorithmic systems.

The discussion paper also offers “self-help” solutions for consumers. It notes that, unlike other areas such as privacy, those harmed by poorly performing AI do not necessarily have recourse:

“The audit can tell individuals that they have been harmed, for example by a biased CV screening algorithm. It can provide them with evidence that they could use to seek redress. However, there is an apparent lack of clear mechanisms for the public or civil society to challenge results or decisions made with algorithms or seek redress.”

So, what specific roles for regulators?

Given the above issues around the growth of a credible AI audit market, the discussion paper seeks views on six hypotheses about the appropriate roles of regulators:
