AI might actually make more work for people, not less – study

TLDR

  • Generative AI was tested by Australia's ASIC and Amazon, revealing inefficiency in summarization tasks.
  • Human employees outperformed AI on every assessment criterion, scoring 81% to the AI's 47%.
  • AI responses were often "waffly" and required additional work to verify, making AI less efficient.

The use of generative artificial intelligence (AI) has exploded in popularity since ChatGPT launched in November 2022, but the efficiency of the technology has now come into question as a government trial has found AI could potentially create more work for people.

With companies worldwide scrambling to incorporate AI to stay ahead of the curve, the technology has become a major topic in the corporate world. Against a backdrop of worries about AI taking over jobs and its increasingly commonplace use (ChatGPT has reached 200 million weekly users), the trial aimed to see how the tool actually performs in a workplace setting on business-related tasks.

Amazon and Australia’s corporate regulator, the Australian Securities and Investments Commission (ASIC), ran the test earlier this year, and the outcomes have since been shared in a select committee meeting.

The test used Meta's open-source generative AI model, Llama2-70B. The model was prompted to summarize submissions with a focus on mentions of ASIC, recommendations, and references to regulation.

The tool was also asked to include context and page references. Ten staff members from the regulator were then given the same task with similar prompts.

The responses from both the AI and the human employees were then blindly assessed by a group of reviewers, who scored them on coherency, length, ASIC references, regulation references, and identification of recommendations. At the time, the reviewers weren’t aware that AI was involved.

Results of Australian government trial that pitted AI against human employees

The human summaries beat their AI counterparts on every criterion, scoring 81% compared with the technology’s 47%.

The AI did particularly badly at finding references to ASIC within the submission documents. In the results section of the trial report, the team running the experiment said: “Finding references in larger documents is a notoriously hard task for LLMs due to context window limitations and embedding strategies.

“Page references are not traditionally stored in the embedding models as the contents of PDF documents are ingested as plain text. To achieve better accuracy with this issue, substantial progress was made by splitting documents into pages and treating pages as chunks with associated metadata.”
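To make the quoted approach more concrete, here is a minimal Python sketch of treating each page as a chunk that carries its own metadata. It is an illustration only, not the trial team's code: the `PageChunk` class, the `chunk_by_page` and `find_mentions` helpers, and the sample pages are all hypothetical, and the sketch assumes the page texts have already been extracted from the PDF.

```python
from dataclasses import dataclass


@dataclass
class PageChunk:
    """One page of a document, kept together with its metadata (hypothetical)."""
    text: str
    page: int     # 1-based page number, preserved alongside the text
    source: str   # name of the document the page came from


def chunk_by_page(pages, source):
    """Turn a list of page texts into chunks that remember their page number."""
    return [
        PageChunk(text=text, page=number, source=source)
        for number, text in enumerate(pages, start=1)
    ]


def find_mentions(chunks, term):
    """Return the page numbers of chunks that mention the term (case-insensitive)."""
    needle = term.lower()
    return [chunk.page for chunk in chunks if needle in chunk.text.lower()]


# Illustrative sample data, assumed to be pre-extracted page texts.
pages = [
    "Introduction to the submission.",
    "We recommend that ASIC review its guidance.",
    "Closing remarks and references to regulation.",
]
chunks = chunk_by_page(pages, "submission.pdf")
print(find_mentions(chunks, "ASIC"))  # [2]
```

Because the page number travels with each chunk, any passage that is later retrieved can be cited by page, which is the information that is lost when a whole PDF is ingested as one run of plain text.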

Some of the AI responses were also described as “waffly” and “wordy,” lacking formatting and failing to follow the prompt's instructions satisfactorily.

The reviewers had “to refer back to the source material to confirm AI summary details,” and “assessors generally agreed that the AI outputs could potentially create more work if used (in current state), due to the need to fact check outputs, or because the source material presented information better.”

Sophie Atkinson
Tech Journalist

Sophie Atkinson is a UK-based journalist and content writer, as well as a founder of a content agency which focuses on storytelling through social media marketing. She kicked off her career with a Print Futures Award which champions young talent working in print, paper and publishing. Heading straight into a regional newsroom, after graduating with a BA (Hons) degree in Journalism, Sophie started by working for Reach PLC. Now, with five years experience in journalism and many more in content marketing, Sophie works as a freelance writer and marketer. Her areas of specialty span a wide range, including technology, business,…
