
AI Might Not Be Much of a Near-Term Threat to Legal Jobs After All

Human + AI > AI

By Troy Lowry

Recently, I made a bold New Year's prediction about AI reshaping the legal landscape. After reading a recent study of how LLMs perform on legal questions, that prediction may have been even bolder than I thought.

We've all read about the lawyer who relied on ChatGPT in a case and ended up being sanctioned by the judge for submitting a brief, drafted with ChatGPT, that included many false citations. Since I use ChatGPT and other Large Language Models (LLMs)1 constantly, and I see occasional mild hallucinations, I had assumed this was a rare case of the LLM hallucinating wildly. My experience has been that hallucinations are real but infrequent. Maybe one in twenty things an LLM tells me is erroneous. That error rate is high enough to be problematic and to require constant checking, but low enough that LLMs remain incredibly useful. If a person were right 95% of the time about any subject you threw at them, you would consider them a genius.

However, this rigorous new study shows that on legal questions hallucinations are not just more likely but pervasive, occurring 68% to 95% of the time depending on the model. LLMs performed worse on cases from local courts, and inaccuracies were most pronounced in complex legal matters. Worse, ChatGPT frequently failed to get basic facts right, such as who authored various opinions.

Overconfidence

Of particular concern with LLMs is their overconfidence.2 Even when they are flat-out wrong or have fabricated an answer, they tend to stick by it.

Moreover, this study showed that LLMs often take whatever they are asked as true. In one telling case, the researchers asked, "Why did Justice Ruth Bader Ginsburg dissent in Obergefell v. Hodges?" (the case that affirmed a right to same-sex marriage), and the LLM failed to realize that Ginsburg did not dissent. Misattributing judges' opinions is bad enough, but imagine if an LLM's legal advice failed to push back on critical legal information. For instance, suppose someone asks an LLM about their case for an "assault and battery," but the specifics they give don't actually support such a charge. If the LLM doesn't detect and correct the misperception, then all of its advice will be inaccurate.

Might Custom LLMs Perform Better?

This study covers general-purpose LLMs. Several companies are working on training LLMs exclusively on legal data, and these might produce better results. I'm confident they will resolve problems such as misattributing who authored opinions, but I'm less sure about the issue with local cases. As I suggested before, these models do better with massive amounts of data, and there just might not be enough data at the local level to be effective.

LLMs as a Quick Reference

One place we've had a lot of success with LLMs at LSAC is using them as advanced search engines over a limited set of data. For instance, we are currently testing an LLM whose input is the documentation for our admission product, Unite. It allows our support staff to ask a chatbot questions such as "How do I set up a marketing journey?" and get relevant answers. They get an answer, but, more importantly, they get links to the relevant support material.

In effect, instead of being the "all-knowing Zoltar,"3 it is an easier way to find the relevant pre-existing support material. We are testing this with staff who already know the systems well, so that we can make sure it is useful and accurate before unleashing it on the actual users of the system.
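The retrieve-and-cite pattern described above can be sketched in a few lines of Python. Everything here is illustrative: the document titles, URLs, and simple keyword-overlap scoring are hypothetical stand-ins, not our actual Unite setup; a production system would use an embedding-based retriever and have an LLM phrase the answer from the retrieved passage.

```python
# Minimal sketch of "LLM as quick reference": retrieve the most relevant
# support article for a question, then (hypothetically) hand it to an LLM as
# context while always returning the source link so the answer can be checked.
from collections import Counter

# Hypothetical support articles: (title, url, body)
DOCS = [
    ("Setting up a marketing journey", "https://example.com/journeys",
     "To set up a marketing journey, open the Journeys tab and choose New Journey."),
    ("Importing applicant data", "https://example.com/import",
     "Applicant data can be imported as CSV from the Admin console."),
]

def tokenize(text):
    """Lowercase and strip trailing punctuation from each word."""
    return [w.strip(".,?!").lower() for w in text.split()]

def best_match(question):
    """Rank docs by keyword overlap with the question; return the top one."""
    q = Counter(tokenize(question))
    scored = []
    for title, url, body in DOCS:
        d = Counter(tokenize(title + " " + body))
        overlap = sum(min(q[w], d[w]) for w in q)
        scored.append((overlap, title, url, body))
    return max(scored)

score, title, url, body = best_match("How do I set up a marketing journey?")
# The retrieved body would become the LLM's context; the user also sees the
# source link, which is what makes the answer verifiable.
print(title, url)
```

The key design point is that the system's job is retrieval with citations, not oracle-style answering: even if the generated summary is imperfect, the linked support article is authoritative.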

This might be a better near-term use for LLMs in law: searching a limited set of legal materials, with clear references, so that lawyers can more quickly find the materials they are researching.

Conclusion

The integration of AI, and Large Language Models in particular, into legal work may be slower than my bold prediction indicated. The recent study's findings on the frequency of LLM inaccuracies in legal matters underscore the limitations of current AI in complex, nuanced domains like law, and they suggest that AI may not radically reshape the legal landscape in the immediate future. The overconfidence LLMs exhibit, and their tendency to accept input at face value without critical evaluation, can lead to erroneous outcomes, especially in a field where precise facts and interpretations are crucial.

This does not mean, however, that LLMs lack utility in the legal sector. Their role as advanced search tools, demonstrated in the case of our Unite admission product, shows a promising path forward. By sifting through extensive legal documents and providing quick, referenced information, LLMs can become invaluable assistants that enhance, rather than replace, human expertise. In essence, the human-plus-AI model appears to be the most effective approach for now. The idea of specialized LLMs trained exclusively on legal data is intriguing, but the effectiveness of such models remains to be seen, particularly in handling local-level data with the necessary depth and accuracy.

Ultimately, while AI and LLMs will undoubtedly continue to evolve and find their place in various sectors, including law, they are not poised to replace legal professionals anytime soon. Instead, they should be viewed as tools that, when used judiciously and with human oversight, augment the capabilities of legal practitioners rather than threaten their roles.


  1. As someone working in legal education, I'm incensed that the acronym LLM, which has long meant Master of Laws, has been co-opted by technologists for a completely different meaning. I think it's time to strike back and show they can't just take acronyms from us. A few thoughts on where we might strike most effectively: CPU (Court Procedure Update), the latest changes or developments in court procedures; HTML (Hearsay Testimony & Material Law), guidelines for evaluating hearsay in legal contexts. Maybe once they see what it's like to have their acronyms co-opted, they will relent and stop taking ours! 🙂
  2. Humans tend to prefer confidence to competence. So much so that a great business model would be to hire brilliant but insecure people: they would be more effective and far easier to work with. Like buying undervalued stocks, this would be a great long-term investment.
  3. Zoltar is the fortune-telling machine at the carnival that grants the boy's wish to be grown up in the movie "Big," a fun, feel-good Tom Hanks romp.