Graeme Johnston / 30 November 2023
This article addresses an interesting study, published in early November 2023, about generative AI in legal work. It was conducted by three professors at the University of Minnesota Law School – the full text is here.
My purpose is not to comment significantly on it, but simply to summarise it as objectively and succinctly as I can, in case it is of interest to others.
Summary of experiment
A clone of ChatGPT Plus / GPT-4 (referred to below as CGPT+) was used. Almost 60 law students were each assigned four tasks:
- A complaint (that is, the US litigation term for the document formally setting out a claim). No legal research required. Students provided with fact pattern and elements of relevant causes of action. Max 5 hours allowed.
- A contract to paint a house. Two pages max, plain English. Key terms provided. Max 2 hours.
- A section of an employee handbook covering a narrow issue of local statute law. Research required into the local statute. Max 1 hour.
- An advice memo. Fact pattern and four cases provided. Memo to be based on those alone, without independent research. Max 5 hours.
Students were asked to track time. They were paid a flat rate to avoid conflict of interest in time recording.
Half were asked to do tasks 1 and 2 (the complaint and the contract) with CGPT+ but tasks 3 and 4 (the handbook section and the advice memo) without. The other half were asked to do the reverse.
The groups were roughly balanced by graduation year and grades-so-far. Two hours of CGPT+ training by video were provided.
The professors then blind-graded the work, without sight of the time recorded.
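For readers who find it easier to see the design as code, here is a minimal sketch of the crossover assignment described above. It is an illustration only, not the authors' procedure: the student IDs, the count of exactly 60, the function name and the simple random shuffle are all assumptions (the study balanced the groups by graduation year and grades, which this sketch does not attempt).

```python
import random

# Task labels follow the four tasks listed above.
TASKS = ["complaint", "contract", "handbook", "memo"]


def assign_crossover(student_ids, seed=0):
    """Split students into two halves (hypothetical helper).

    The first half uses the AI tool on tasks 1 and 2 only;
    the second half uses it on tasks 3 and 4 only.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)  # assumption: plain random split, no balancing
    half = len(shuffled) // 2
    plan = {}
    for i, sid in enumerate(shuffled):
        ai_tasks = TASKS[:2] if i < half else TASKS[2:]
        plan[sid] = {task: (task in ai_tasks) for task in TASKS}
    return plan


# Hypothetical cohort of 60 students (the study had "almost 60").
plan = assign_crossover(range(60))
```

Each student ends up with two AI-assisted tasks and two unassisted ones, so every participant serves as their own comparison point across tasks.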
1. Improvement in quality of work
Small average improvement, inconsistent across the four tasks.
Interestingly, the improvement was inversely related to what I would instinctively rate as the complexity of the tasks, based on their descriptions:
- most basic (simple contract) – largest improvement, though still small
- then the complaint – smaller improvement
- handbook – insignificant improvement
- most complex (advice memo) – insignificant impairment
Important: “participants who had the worst performance without assistance from GPT-4 received the largest benefits, with little benefit to participants who were capable of producing high-quality work on their own.”
2. Reduction in time taken
This was more significant than the improvement in quality, and applied across all tasks and ability levels.
Time saving was greater for more basic tasks:
- contract 32%
- complaint 24%
- handbook 21%
- advice 12%
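To give a feel for what these percentages could mean in hours, here is a small worked sketch. The savings figures are the ones listed above; the baseline hours are purely hypothetical, borrowing each task's maximum time allowance as a stand-in because the actual average times are not given in this summary.

```python
# Average time savings reported above, as fractions.
SAVING = {"contract": 0.32, "complaint": 0.24, "handbook": 0.21, "advice": 0.12}

# Hypothetical baselines: the maximum hours allowed per task,
# NOT the times students actually recorded.
BASELINE_HOURS = {"contract": 2, "complaint": 5, "handbook": 1, "advice": 5}


def hours_with_ai(baseline_hours, saving):
    """Time remaining after applying a fractional saving to a baseline."""
    return baseline_hours * (1 - saving)


estimates = {
    task: hours_with_ai(BASELINE_HOURS[task], SAVING[task]) for task in SAVING
}

for task, hours in estimates.items():
    print(f"{task}: {BASELINE_HOURS[task]}h -> {hours:.2f}h")
```

On these assumed baselines, the largest percentage saving (the contract) does not produce the largest saving in hours, because the longer tasks start from a bigger base.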
Students who used GPT-4 reported greater satisfaction.
The authors very fairly acknowledge that their research only addresses first-order impacts but suggest that “the assumption that the future will resemble the past is likely tenuous, at best.”
They believe that their results understate the likely impact for three reasons:
(i) Specialist products for lawyers already reduce hallucination problems.
(ii) Limited training given to the students.
(iii) Likely tech improvement over time.
They also suggest that gen AI presents an opportunity for organisations to shift “relatively routine” work away from law firms.
On the other hand, they also make the point I’ve extracted as the image of this article, about how things may play out very differently when the dynamics of litigation and negotiation come into play.
Much more could be said about the implications, and the limitations. But I’ll leave it there. The whole thing is worth a read if you are interested in this area.