Pre-trained vs. Untrained Legal AI: What's the Difference?

Written by Nicole Schnetzer | Sep 12, 2024 12:02:56 PM

With the rapid rise and development of Large Language Models (LLMs), organizations can now analyze large amounts of text-based data more quickly, assess contracts for potential issues more efficiently, and draft legal documents with AI assistance. However, in the legal context, a critical question arises: Should one use a specially pre-trained Legal AI for specific legal tasks, or opt for a general-purpose LLM and train it in-house? This article explores the differences and the advantages of both approaches.

Definition: What is a Pre-trained Legal AI?

A pre-trained Legal AI is an AI-based software solution designed for a specific legal use case. The AI has been pre-trained and optimized for this particular use case through a combination of Natural Language Processing (NLP) techniques, including extensive use of Large Language Models (LLMs).

The training methodology consists of the following main components:

Extracting the Right Context: This component utilizes NLP techniques to identify specific text segments within legal documents, such as contract clauses. These extracted text segments provide the right context for further analysis via LLMs.
Advanced Text Analysis with LLMs via Prompting: After the relevant text segments have been identified, LLMs are used to further analyze these segments. This analysis is guided by targeted prompting that directs specific queries to the LLM, such as verifying clauses for legal risks, determining the jurisdiction, or identifying other specific data points within contracts.
Quality Assurance with Test Sets: The results generated from prompting are evaluated using robust test sets with diverse formulations of the legal clauses. These test sets contain expected outcomes that are used to measure the accuracy and reliability of the generated answers. The prompting is refined until the results are consistently reliable and precise.

An example of this approach is the Legal Engineering Team at Legartis, which collaborates with "Machine Teachers" to optimize the AI’s capabilities in contract analysis and review.
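
The three components described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Legartis's implementation: the keyword-based `extract_clauses`, the prompt wording, the sample clauses, and `stub_llm` (a placeholder standing in for a real LLM call) are all hypothetical.

```python
import re
from typing import Callable

def extract_clauses(contract: str, keyword: str) -> list[str]:
    """Return paragraphs mentioning the keyword (crude stand-in for NLP-based extraction)."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", contract) if p.strip()]
    return [p for p in paragraphs if keyword.lower() in p.lower()]

def build_prompt(clause: str, question: str) -> str:
    """Wrap an extracted clause in a targeted question for the LLM."""
    return f"Clause:\n{clause}\n\nQuestion: {question}\nAnswer with YES or NO."

def run_test_set(ask_llm: Callable[[str], str],
                 test_set: list[tuple[str, str]],
                 question: str) -> float:
    """Share of test clauses for which the LLM answer matches the expected outcome."""
    correct = 0
    for clause, expected in test_set:
        answer = ask_llm(build_prompt(clause, question)).strip().upper()
        correct += answer == expected
    return correct / len(test_set)

def stub_llm(prompt: str) -> str:
    # Hypothetical placeholder for a real LLM call: it only looks for the
    # word "unlimited" in the clause part of the prompt.
    clause_text = prompt.split("Question:")[0]
    return "YES" if "unlimited" in clause_text.lower() else "NO"

contract = (
    "1. Term\nThis agreement runs for two years.\n\n"
    "2. Liability\nThe supplier accepts unlimited liability for all damages."
)
clauses = extract_clauses(contract, "liability")

# Test set with diverse formulations of the same legal concept.
test_set = [
    ("The supplier accepts unlimited liability for all damages.", "YES"),
    ("Liability is capped at the annual contract value.", "NO"),
    ("Each party's liability shall be without limit.", "YES"),
]
accuracy = run_test_set(stub_llm, test_set, "Does this clause impose uncapped liability?")
print(f"{len(clauses)} clause(s) extracted, test-set accuracy: {accuracy:.2f}")
```

The stub misses the third formulation ("without limit"), which is exactly the kind of gap a diverse test set is meant to expose before the prompting is refined further.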

Definition: What is an Untrained Legal AI?

An "Untrained Legal AI" refers to a Large Language Model (LLM) that has not yet been customized with company-specific or client-specific legal data. It can be a general LLM like ChatGPT or a model already pre-trained on legal topics that a company or law firm may deploy. The model can be adapted to specific requirements using techniques such as Prompt Engineering or Retrieval-Augmented Generation (RAG) without re-training the model itself. In some cases, fine-tuning on proprietary legal data is also possible, but this requires substantial technical resources and expertise. Clients must bring the necessary knowledge for quality assurance testing and adapting the model to their specific use cases.
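
To make the idea of adapting a model without re-training more concrete, the following sketch shows the basic shape of Retrieval-Augmented Generation: relevant company-specific text is retrieved and placed into the prompt. It is a simplified illustration only. Production systems typically retrieve with embeddings rather than the word-overlap scoring assumed here, and the playbook snippets, function names, and prompt wording are hypothetical.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by the number of words shared with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Insert the retrieved context into the prompt; the model itself stays unchanged."""
    context = "\n".join(retrieve(query, documents))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical company playbook snippets.
playbook = [
    "Liability must be capped at the annual contract value.",
    "Payment terms may not exceed 30 days.",
    "Governing law should be Swiss law.",
]
query = "What is our position on payment terms?"
prompt = build_rag_prompt(query, playbook)
print(prompt)
```

The resulting prompt would then be sent to whichever LLM the organization uses; the quality of the answers still has to be checked by the client's own test sets.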

Comparison: Pre-Trained Legal AI vs. Junior Lawyer

Providers of pre-trained Legal AIs use comprehensive test sets during the development phase to ensure and verify the quality and accuracy of the AI. These test sets consist of legal documents, contract clauses, and scenarios that simulate specific legal tasks. The provider analyzes the AI's results against reference answers provided by legal experts and continuously optimizes the model based on these tests. If the AI achieves a specific level of accuracy and consistency (e.g., an F-Score over 90%), its responses are deemed reliable and made available for client use.

In comparison, junior lawyers achieve an F-Score of 0.86 (86%) in identifying legal inconsistencies in contracts (source). On this measure, a pre-trained Legal AI that clears the 90% threshold can work more accurately and reliably than a junior lawyer in specific tasks like identifying missing or erroneous clauses in contracts.
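
For readers unfamiliar with the metric, the sketch below shows how an F-Score can be computed from an AI's findings and expert reference answers. The article does not state which variant is used, so this assumes the common F1 score (the harmonic mean of precision and recall). The sample data is invented for illustration.

```python
def f_score(predicted: list[bool], reference: list[bool]) -> float:
    """F1 score: harmonic mean of precision and recall (assumed variant)."""
    true_pos = sum(p and r for p, r in zip(predicted, reference))
    false_pos = sum(p and not r for p, r in zip(predicted, reference))
    false_neg = sum(r and not p for p, r in zip(predicted, reference))
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Invented example: 20 clauses, 10 of which experts marked as problematic.
reference = [True] * 10 + [False] * 10
# The AI flags 9 of the 10 problematic clauses and raises 1 false alarm.
predicted = [True] * 9 + [False] + [True] + [False] * 9

score = f_score(predicted, reference)
meets_threshold = score >= 0.9 - 1e-9  # tolerance for floating-point rounding
print(f"F-Score: {score:.2f}")
```

With nine correct findings, one miss, and one false alarm, precision and recall are both 0.9, so the F-Score is 0.9 as well.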

At 92%, Legal AI from Legartis is more precise than junior lawyers. Talk to us about your business case.

Resources Needed for Adapting an Untrained Legal AI

Organizations that choose to adapt an untrained Legal AI to their specific needs must be prepared to invest substantial resources and expertise. Many companies fail at this challenge due to the complex and costly process involved.

Key Resources and Steps:

Building a Legal Engineering Team: At least 2-3 professionals, including 1-2 lawyers, to ensure the AI processes relevant legal content correctly.
Developing Prompt Engineering Skills: Essential for optimizing AI prompts to deliver precise and context-specific answers.
Creating and Maintaining Test Sets: Comprehensive datasets containing representative legal scenarios are necessary to continuously test and adjust the AI's quality.
Time Investment and Training Experience: Fine-tuning an LLM typically takes at least one year, often longer, depending on factors like data collection, training, and iterative testing to achieve high quality (F-Score > 0.9).

Risks and Challenges:

Inconsistent Quality Results: Regular tests and adjustments are required to maintain consistent performance.
Dependency on Key Individuals: The risk of knowledge loss due to employee turnover.
Rapidly Changing Needs: Agile project management is crucial to respond to new priorities.
High Investments and Ongoing Adjustments: Continuous investments in time and resources are required due to the ongoing evolution of LLMs and prompt techniques.

Deploying an untrained Legal AI requires a high level of expertise, time, and resources. Companies must be ready to continuously invest in adapting and improving the model to meet their specific needs.

Benefits of a Pre-trained Legal AI

Lower Costs for Training & Adaptation: An untrained Legal AI requires substantial resources to be customized for specific requirements, including the costs of developing data points, adapting to specific legal terms and scenarios, and repeated testing and fine-tuning.

Example: Developing a single data point takes 4-5 working days (32-40 hours) and costs approximately 2,720 – 3,400 EUR at an hourly rate of 85 EUR. For 30 data points, the costs can add up to 81,600 – 102,000 EUR. A pre-trained Legal AI, already optimized for specific legal data points and tasks, significantly reduces these adaptation costs, as it is ready to use immediately.
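
The arithmetic behind this example can be reproduced as follows. The figures are the ones given above; the function and constant names are chosen here for illustration.

```python
HOURLY_RATE_EUR = 85
HOURS_PER_DATA_POINT = (32, 40)  # 4-5 working days of 8 hours each
DATA_POINTS = 30

def cost_range(hours_range: tuple[int, int], rate: int, count: int = 1) -> tuple[int, int]:
    """Return the (low, high) cost in EUR for `count` data points."""
    low_hours, high_hours = hours_range
    return (low_hours * rate * count, high_hours * rate * count)

per_data_point = cost_range(HOURS_PER_DATA_POINT, HOURLY_RATE_EUR)
total = cost_range(HOURS_PER_DATA_POINT, HOURLY_RATE_EUR, DATA_POINTS)

print(per_data_point)  # (2720, 3400)
print(total)           # (81600, 102000)
```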

Time Savings & Faster ROI: Training an untrained AI can take a year or longer, while a pre-trained AI can be deployed immediately. This saves time and accelerates Return on Investment (ROI).
Less Personnel Required: No need to build an in-house Legal Engineering Team or hire specialized prompt engineers, resulting in significant personnel cost savings.
Better Consistency and Quality: Pre-trained AIs offer consistently high quality due to extensive pre-training and testing phases, reducing the risk of inconsistent results and the need for additional testing.

Using a pre-trained Legal AI not only saves significant costs and time but also reduces personnel requirements and the risk of inconsistent results, making it more efficient than an untrained AI.

Benefits of an Untrained Legal AI

Investment in Internal Skills: Training internal teams in areas like prompt engineering fosters long-term competency development, ensuring the organization is future-proof.
Quick Response to Internal Requirements: With the right skills in-house, the company can quickly adapt to new requirements and changes without relying on external vendors.
Independence from Providers: No dependency on the resources or developments of an external solution partner; full control over the AI development process.
Self-Determination of Quality: The company defines and controls the AI's quality standards and can optimize them according to its own criteria.

Key Takeaway

The choice between a pre-trained and an untrained Legal AI largely depends on the specific needs of a company.

Pre-trained Legal AIs provide an immediately deployable solution with higher accuracy, lower adaptation costs, and faster ROI, ideal for companies seeking quick results with fewer internal resources. However, ongoing license and update costs may arise, and there is some dependency on the provider.

Untrained Legal AIs offer maximum flexibility and control, allowing for customization to specific requirements, but they involve high initial costs, substantial personnel effort, internal expertise, and longer implementation times. This option is better suited for companies with specialized needs that require full control over AI development.

Get to know the pre-trained Legal AI from Legartis.
