This article, by Jitendra Gupta, Head of AI & Data Science at Wolters Kluwer ELM Solutions, was originally published in Legal IT Professionals.
Generative AI can be a powerful tool for automating routine yet important tasks, allowing corporate attorneys to focus on higher-value work, like practicing law. However, concerns linger about the accuracy and reliability of generative AI output and about the technology’s ability to maintain data privacy and security. These concerns are being addressed head-on, with both government action and industry initiatives defining necessary guardrails for this rapidly evolving technology.
These concerns shouldn’t overshadow the fact that generative AI is a very useful solution for corporate legal departments (CLDs), especially those struggling to keep their attorneys focused on legal matters rather than on necessary but time-consuming administrative tasks. Still, CLDs should carefully vet prospective AI vendors to ensure that the content and recommendations they receive are accurate and trustworthy and that their data is protected.
Here are the core questions to ask potential vendors when you’re exploring generative AI options. How they answer can reveal the partner who aligns best with your organization’s expectations for trustworthy outputs, data security, and long-term efficacy.
How do you get your data?
Generative AI technologies rely on large language models (LLMs) to generate contracts, research findings, and more. LLMs are trained on very large datasets, often derived from publicly available data. The larger the dataset, the more information the LLM has to learn from; in theory, that leads to more accurate output.
There are some issues with this approach, however. First, creating output based on publicly available data has raised questions about copyright infringement, and it’s not yet clear whether companies using generative AI will be held liable. Second, you need to understand what happens to any of your data that becomes part of the model. Many generative AI solutions also send the data you submit out into the public domain so the AI can become smarter. That can raise data security and privacy concerns.
It’s best if the vendor you’re considering trains its generative AI models primarily on its own in-house content and data, or on your organization’s internal data. In this approach, your data is drawn exclusively from internal sources, such as law firm invoicing, contract language, and other types of benchmark information, and it is not shared externally unless it is completely anonymized and the sharing is disclosed to the CLD in advance.
However, a good AI model relies on many data sources to improve the trustworthiness of its output. The greater the number of sources, the more varied the data is likely to be, which increases the accuracy of the AI while decreasing the chances of bias. Most AI solutions will therefore incorporate outside data into their models. That data must come from trusted, known sources and should be vetted against, at minimum, your own standards.
How do you maintain your models?
Models must be continually analyzed, tested, and updated to ensure that they are performing as expected. Likewise, the data that feeds models must be constantly and diligently maintained. This includes cleansing, refining, and structuring the data to ensure that it is current, relevant, and accurate. It’s important that the partner you choose has the appropriate frameworks to perform these actions.
Your solution should also have feedback loops in place to ensure the accuracy of the model. AI is amazing technology, but it’s only useful if skilled human beings provide a level of expertise the machine cannot offer. Vendors with teams of data scientists building, maintaining, and perfecting their models are more likely to provide you with accurate results. Other teams, including legal and legal operations experts, should be part of the feedback loop as well, vetting and approving the results from the models and recommending improvements.
If models aren’t performing as expected, the vendor should have a process for quickly troubleshooting and rectifying the issue and optimizing the models’ performance. You’ll want a vendor that has set up touchpoints throughout the entire AI lifecycle—from data ingestion to model building to deployment to monitoring. This allows the vendor to recognize when something is not working properly and to determine why.
Lastly, it’s ideal if your vendor enlists third parties to audit its models regularly. This provides an extra layer of assurance that the models are unbiased and accurate.
How will you guarantee that my data will not be compromised?
A typical generative AI process involves drawing data from all publicly available sources—potentially including your own organization’s. However, this could compromise your company’s data privacy and security. Using a vendor that relies on its internal data sources to inform its generative AI capabilities significantly reduces the chance of your data being shared externally.
Vendors with an established history of developing AI solutions are also more likely to be safer bets when it comes to data governance. Many legal technology providers are only beginning to jump on the generative AI bandwagon. Those organizations may not have the customer base, expertise, or frameworks in place to build reliable, trusted, secure, and proven AI solutions.
Take care to vet the vendors you’re considering. Make sure they have the track record, skills, and assets to produce generative AI capabilities that are both secure and effective.
Generative AI is a powerful and complex technology that has created both excitement and concern in the legal industry and across the business world. By asking these three simple questions, you can alleviate many of the concerns you might have about purchasing or using a generative AI solution and identify the most trustworthy and effective technology for your organization.