
Smoke, Mirrors, and AI Systems

Everywhere you look these days a new legal tech AI system claims to overcome all the issues that prevent lawyers from adopting AI or (more likely) all of the reasons a non-lawyer can now rely on AI rather than a licensed attorney. We’re seeing systems that boast “purpose-built” legal LLMs, applications that guarantee legal outcomes (yes, really), and sweeping statements that businesses can soon replace those pesky, high-priced lawyers with a chatbot and auto-drafter.


Sounds great, right? But how well do these claims stand up to scrutiny? What happens when we pull back the covers and look underneath?


Understanding LLMs

Let’s start with the “purpose-built” legal LLMs. While purpose-built LLMs can and do exist, most of the claims made around purpose-built legal LLMs seem to be pure baloney.


Large language models require vast amounts of training data, expensive hardware, and sophisticated software to develop human-like language skills. Very few entities in the world have access to the sheer volume of data and computing power necessary to produce foundational LLMs. OpenAI, Google, Anthropic, and Meta are among the mere handful of companies creating these foundational models. 


Purpose-built LLMs are LLMs that are not foundational-level models like OpenAI’s GPT series, but are still built from the data layer up. They have been trained only on data selected for a specific purpose, with language training designed around the language most relevant to that purpose. They might be trained only on medical data, or data related to a specific medical condition and the language of diagnosis and treatment, specifically eschewing lay terms for disease, so that they can be used for diagnostic and treatment predictions. They still require the intense compute resources, specialized hardware, and huge volumes of data necessary to develop an LLM, just not quite as much as a foundational LLM. In addition to the foundational LLM companies, which are also building various purpose-built LLMs, there are research institutions and sector leaders with data access (like Bloomberg and Thomson Reuters) building these.


Truly purpose-built LLMs are different from “fine-tuned” LLMs. Fine-tuned LLMs are not specifically built LLMs, but smaller, faster derivatives of foundational LLMs. These models are still built on – and could not exist without – those foundational LLMs. They have not been “purpose-built,” merely “purpose-configured.” These fine-tuned LLMs have great utility, particularly in areas such as citation verification, hallucination reduction, and reducing known biases in foundational LLMs. They are faster and cheaper to develop, often requiring little technical knowledge or capability. Using these fine-tuned LLMs is much like using the foundational LLMs when it comes to prompting or acting on outputs. They look and feel more like ChatGPT than traditional software, and you definitely know you are using AI (even in the unlikely event you were not bombarded with advertisements touting the fact).
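
To make “purpose-configured” concrete, here is a minimal sketch of what fine-tuning can look like in practice, assuming a provider uses OpenAI’s fine-tuning API (the file name, example data, and model name are illustrative placeholders, not a description of any particular vendor):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a small set of example prompts and ideal responses (JSONL format).
training_file = client.files.create(
    file=open("legal_examples.jsonl", "rb"),  # hypothetical training data
    purpose="fine-tune",
)

# Start a fine-tuning job against an existing foundational model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # illustrative base model
)
print(job.id, job.status)
```

Note that the entire “build” here is a couple of API calls layered on someone else’s foundational model – which remains underneath, along with its terms.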


Finally, there are applications (like ours) that are built using LLMs (ours is designed to work with a number of foundational LLMs). These applications are not LLMs at all – they merely use foundational LLMs and their outputs within a broader application. These often include other algorithmic calculations or integrated operations making use of the LLM and its outputs. Users interact with these applications the same way they interact with any other software – you might not even realize you were using AI except for the incessant marketing.
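
For contrast, here is an equally minimal sketch of the application pattern: ordinary software that happens to call a foundational LLM under the hood (again assuming OpenAI’s API; the function, prompt, and model name are hypothetical):

```python
from openai import OpenAI

client = OpenAI()  # the application's own API key; user content flows to the LLM provider

def summarize_contract(contract_text: str) -> str:
    """Hypothetical app feature wrapping a single call to a foundational LLM."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative foundational model
        messages=[
            {"role": "system", "content": "You are a contract review assistant."},
            {"role": "user", "content": contract_text},
        ],
    )
    return response.choices[0].message.content
```

The user of an application like this may never see the prompt or know where the contract text went – which is exactly why the structural questions discussed below matter.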


Those Pesky Facts

For companies trying to cut through the marketing noise surrounding AI, especially in the legal tech arena, it seems very easy to push the boundaries of puffery in product claims. This is particularly true for fine-tuned or custom GPT programs that are fighting against headlines about hallucinations in court filings and the war stories lawyers swap about obviously-AI-drafted documents forwarded from clients. Legal tech providers want to earn trust with assurances against such risks and the resultant embarrassment. This often leads to companies offering “purpose-built legal LLMs” that are nothing more than a wrapper around ChatGPT with specific GPT settings configured.


Does it matter? Yes! And not just for the purposes of quality or reliability. The service terms, data protection terms, privacy obligations, and even disclosure obligations vary between LLM providers. Different foundational LLM providers have different terms, and purpose-built LLMs may have further differing terms, even when provided by the same entity. Applications that are built on foundational LLMs, though, are subject to the terms of the foundational LLM. They don’t get to write new terms – and cannot override the LLM terms – when they are taking technical dependencies on third-party LLMs. Without knowing the structure of the AI system, you may not know where your data is going, how it is treated, or who has access to it.

So, how do you know which tools are mere GPT wrappers sending your data directly to a foundational LLM?


The Trust Factor

We’ve talked before about the need for in-house counsel to role-model responsible use of AI. Nowhere is that more important than in doing diligence on providers of AI-powered legal tech. We’re not talking about learning higher-level math in order to break down machine learning processes, just your average, everyday due diligence.

  • Is the vendor transparent about the LLMs that are used?

  • If they have developed their own LLM, are they transparent about how the LLM was trained and whether it is maintained according to responsible AI principles?

  • Even if they are not transparent about that, are they at least transparent about their own business operations?


Example:

Here’s an example I found while researching legal tech AI solutions. To protect the guilty, no names will be used (but I dare you to try this out on your AI vendor of choice). I learned of this company through a second-degree LinkedIn connection and was intrigued. I went to the website and got the standard marketing pitch of everything this AI assistant could do – including “helping with homework” and “contract drafting” (both paraphrased, again to protect the guilty). I wanted to learn more, so I started where I always start: the privacy notice.


First, the privacy notice always has the full corporate entity name listed #protip. Here, it wasn’t at the top but buried in the text. Not a red flag legally, but a good flag that the notice was written by AI (IYKYK). The notice ticked most of the boxes one would expect if compiling a privacy notice as a check-the-box exercise, without thought about actual operations. Another red flag? Sure, for data protection, scalability, and generalized legal risk, but also pretty dang typical for an early-stage startup. The notice stated that the company was headquartered in the UK and that all data was processed in the UK. It also stated that the product was built on OpenAI’s ChatGPT (again, not unusual – see above), yet failed to mention any data transfers necessary for such operations. Hmm…OpenAI does offer data residency in the EU for data at rest, so perhaps that’s fine.


Then there’s the kicker. I did my standard next-step diligence from my days as a baby lawyer: I looked up the company info. The company “headquartered” in the UK with UK data processing commitments is registered at the office of a corporate registration service provider. Registration service providers are companies whose sole business is to register and act as agent for a company – usually one with foreign ownership. So I checked the business registration, where the sole owner listed is identified as a foreign national located in a foreign country – not the UK. Big red flag. The likelihood of this company adhering to its privacy notice is now very slim in my mind. Trust is lost.


Trust is hard to earn and easy to lose.

Does this mean that the application is of poor quality? No – none of that exercise told me about the quality of the service. In fact, the reason I chose this particular company was that it was highly rated compared to others on its ability to perform specific legal-related tasks. The performance overview, though, never touched on whether the application protected privilege, complied with privacy regulations, or harvested user or content data for training.


Even if it did, though, I would now have a very hard time believing them. Can I trust them to be transparent and straightforward in their disclosures if I can’t trust them to review their privacy notice to ensure it matches their actual operations – especially when those operations are easily discoverable via public record? How do I trust that they are building the features and functionality necessary to provide security, protect privilege, or correct known errors and bias? It would be a hard lift for me, that’s for sure.


Legal tech and AI remain compelling. There is heavy pressure to adopt and implement AI systems within legal practice, on in-house counsel teams, and at companies without lawyers. No one loves a cost center, and the desire to replace it with robots is very high. But with your data, your customers’ data, and your reputation at stake (not to mention the regulatory and legal risks), AI adoption must be responsible. Mere performance-versus-price comparisons are insufficient diligence. Legal teams can lead the way – through policy and practice.


Not your typical startup

That’s why V4 Final is different. We are actively working to shake up the idea that an early-stage company is somehow required to “move fast and break things.” We may not move as fast as we (or our investors) want. We think of ourselves more like a race car that carries a passenger – your data and information – and we don’t want to go around the track without brakes. Yes, we will have limitations. We are also deeply dedicated to a growth mindset that challenges those limitations and pushes us to achieve more. It will challenge us and make us uncomfortable, but we won’t tell you we’re something we’re not. We value your trust too much.


There’s a lot of AI-powered legal tech out there. Do your diligence. If there’s something you want to know about us? Just ask.

 

 



