In July, Deloitte submitted a 237-page assurance report to the Australian Department of Employment and Workplace Relations under a contract worth AU$440,000 (around US$290,000). The report reviewed the welfare system’s automated penalty framework and the quality of the department’s IT systems.
The report soon fell under scrutiny: University of Sydney health and welfare law researcher Chris Rudge reported publicly, including on social media, that it contained fabricated references. He identified numerous citation errors: references to sources that do not exist, academic works attributed to the wrong authors, and an unattributed quote from a federal court judge that seemed to appear out of nowhere.
Errors, Corrections, and a Partial Refund
Following an internal review, the department and Deloitte confirmed that the report contained spurious footnotes and references. Deloitte will repay the outstanding payment under the agreement as a refund following the AI errors, though the amount of the refund has not been disclosed. A revised version has since been uploaded to the department’s website; it removes the fabricated court quotation and the nonexistent academic references.
The revised report also discloses that some sections were drafted with the assistance of a generative AI tool, Azure OpenAI. The department acknowledged that the references were invalid but insisted that the report’s findings and recommendations were unaffected. When contacted, Deloitte said the issue had been resolved directly with the client and would neither confirm nor deny whether the AI tool was responsible for the hallucinations in the report.
What Was Fabricated and How It Was Uncovered
According to Rudge’s review, the initial report contained a few dozen errors; perhaps the worst was citing Professor Lisa Burton Crawford as the author of a book that was outside her field of expertise and did not exist.
Rudge further noted that the report claimed authoritative legitimacy by citing the work of a number of academics, yet gave no clear sign that its authors had actually read the material cited. He singled out the seriousness of misquoting a federal court judge in a report whose very purpose was, in part, to assess the department’s legal compliance.
Experts note that one of the best-known risks of generative AI systems is that they “hallucinate,” producing plausible but invented text to fill gaps. The fabricated references in the Deloitte report are a revealing example of AI hallucinations reaching a published document, and the episode raises questions about the reliability of AI in professional consultancy and research work.
Political and Institutional Backlash
The incident drew intense political responses. Greens Senator Barbara Pocock said Deloitte should refund the full AU$440,000 for the “ineptitude” of using AI on such a sensitive task, arguing that misquoting a judge would earn a first-year law student a failing mark.
Labor Senator Deborah O’Neill said the episode illustrated a “human intelligence problem” rather than a software flaw. Consulting firms, she argued, should be able to state plainly “this is who made this report” or “this is what the AI software produced,” and more explanation was needed about how AI would be used in analytical and official work.
She added that the case should prompt a broader review of corporate accountability when AI hallucinations in reports lead to misinformation or public embarrassment. The partial refund should serve as a warning to other consultancies about transparency and diligence.
What This Means Going Forward
This case illustrates the potential dangers of using generative AI for official and high-stakes consultancy work. Fabricating a quotation or an academic source undermines both the report’s credibility and the firm’s reputation.
Businesses using AI-generated material need to implement rigorous review and fact-checking processes before publishing it. Responsibility cannot be delegated to software.
According to Deloitte, the revised report still reflects sound analysis and advice. Public trust may now depend on how openly the firm, along with other consultancies, discloses its use of AI and its processes for securing human review.
What all this adds up to is that while AI may enhance productivity, it is not a substitute for human judgment and accountability.