The insurance life cycle includes millions of documents, many of which are still processed and entered into core systems manually. A growing number of technology offerings allow insurers to scan documents electronically, automating the extraction of both structured and unstructured data. While unstructured data has long been a more difficult issue for the insurance industry, more solution providers are addressing the challenge of turning unstructured text into structured data than ever before.
These types of intelligent text ingestion (ITI) solutions are built on top of artificial intelligence (AI) and machine learning capabilities. Optical character recognition (OCR) and intelligent character recognition (ICR) capabilities enable technology platforms to identify specific characters in a sample of unstructured text, but the broader base of AI and machine learning then helps the platform take action on the data that has been extracted.
How Text Ingestion Works
The basic goals of any ITI solution are simple: be able to scan a document, identify and name the data it contains, and then store that data in a structured form. Identifying and naming data typically relies on AI and machine learning technology; the solution compares what it has scanned with previously scanned data, then comes up with a definition. These solutions can differentiate, for example, between a policy number and a policyholder’s last name.
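As a minimal sketch of the "identify and name" step, the Python below labels extracted values and stores them under a field name. Commercial ITI solutions rely on trained models rather than hand-written patterns; the regular expressions, the policy-number format (two letters, a hyphen, six digits), and the field names here are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns. A real ITI product would learn these distinctions
# from previously scanned documents instead of hard-coding them.
FIELD_PATTERNS = {
    "policy_number": re.compile(r"[A-Z]{2}-\d{6}"),
    "last_name": re.compile(r"[A-Z][a-z]+(?:-[A-Z][a-z]+)?"),
}


def name_value(value: str) -> str:
    """Return the first field name whose pattern matches the whole value."""
    for field_name, pattern in FIELD_PATTERNS.items():
        if pattern.fullmatch(value.strip()):
            return field_name
    return "unknown"


def to_structured(values: list[str]) -> dict[str, str]:
    """Store each recognized value under its field name; skip the rest."""
    record = {}
    for value in values:
        field_name = name_value(value)
        if field_name != "unknown":
            record[field_name] = value.strip()
    return record
```

Calling to_structured on the scanned values "AB-123456" and "Smith" would file the first under policy_number and the second under last_name, which is the kind of distinction described above.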
The success rates of solutions vary, but vendors typically market their offerings based on average successful conversion rates and the number of days it takes to train a solution to handle a new type of document. Insurers looking into these types of platforms should consider running proof-of-concept efforts with multiple solutions to ensure that their specific artifacts can be accommodated.
As noted above, ITI solutions are built to handle structured data as well as unstructured data. But what does that mean? Structured text refers to a digital document that indexes each textual data element based on a known position on the document as well as a field name, definition, and length. Unstructured text is a digital document that doesn’t yet have a definition, indexing, or size for the elements it contains.
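One way to picture structured text is as a fixed-position record layout, where each element has a name, a position, a length, and a definition. The sketch below assumes such a layout; the field names, offsets, and lengths are hypothetical and do not come from any particular insurer or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str         # field name
    start: int        # zero-based position of the first character
    length: int       # number of characters reserved for the field
    description: str  # human-readable definition


# Hypothetical layout for a single-line policy record.
LAYOUT = [
    FieldSpec("policy_number", 0, 9, "Policy identifier"),
    FieldSpec("last_name", 9, 12, "Policyholder last name"),
    FieldSpec("state", 21, 2, "Two-letter state code"),
]


def parse_record(line: str, layout=LAYOUT) -> dict[str, str]:
    """Slice each field out of the line using its known position and length."""
    return {
        spec.name: line[spec.start:spec.start + spec.length].strip()
        for spec in layout
    }
```

Unstructured text has no equivalent of LAYOUT, which is why an ITI solution has to infer names and boundaries before it can store anything.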
There is a third category of documents, semi-structured text, that combines characteristics of unstructured and structured text. This might be an Excel spreadsheet, for instance, where the column headers are not structured, but once a header has been identified, the values in that column are easy for a solution to ingest.
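The semi-structured case can be sketched as a two-step process: first map each free-form column header to a known field name, then read the values beneath it. The example below uses CSV text as a stand-in for a spreadsheet and a small hand-written alias table as a stand-in for a trained model; the aliases and field names are hypothetical.

```python
import csv
import io
from typing import Optional

# Hypothetical header synonyms; an ITI solution would typically learn these.
HEADER_ALIASES = {
    "policy_number": {"policy number", "policy no", "policy #", "pol num"},
    "loss_amount": {"loss amount", "incurred", "total incurred"},
}


def canonical_header(header: str) -> Optional[str]:
    """Map a free-form column header to a known field name, if any."""
    cleaned = header.strip().lower().rstrip(".")
    for field_name, aliases in HEADER_ALIASES.items():
        if cleaned in aliases:
            return field_name
    return None


def ingest(csv_text: str) -> list[dict[str, str]]:
    """Keep only the columns whose headers were identified."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for header, value in raw.items():
            if header is None or value is None:
                continue  # ragged row: extra or missing cells
            field_name = canonical_header(header)
            if field_name is not None:
                row[field_name] = value.strip()
        rows.append(row)
    return rows
```

Columns with unrecognized headers are dropped here for simplicity; a production system would more likely flag them for review.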
ITI in Insurance
Many property/casualty and life lines of business rely on paper forms and other text-based materials for their claims and underwriting processes. Specialty lines insurers in particular use a broad variety of broker-submitted documents. Any insurance process that relies heavily on paper images or non-standard electronic files could benefit from investment in ITI capabilities.
The three major functional areas currently relying on this technology are underwriting, claims, and customer service. In underwriting, ITI can help ingest supplemental information, schedules, current policy information, ACORD forms, and loss runs from distributors and prospective insureds, feeding into robotic process automation and data analytics processes. In claims, it can assist with the automation of a document-heavy process, which includes items like medical bills, repair estimates, medical records, and legal notices. And in customer service, ITI can help insurers sift through and ingest data such as good student discounts, mailing address changes, payment information, and premium audit requests.
Insurers are already seeing benefits in these areas, including more timely routing, fewer errors and omissions, and increased process efficiency. Different solutions address different combinations of use cases, but these offerings can be divided into four categories: generalist tools, broad use insurance-specific tools, claims/underwriting-specific tools, and ITI advanced underwriting tools. The best-fit ITI solution for an insurer will depend upon that individual insurer’s goals and needs.
Generalist tools are those that can ingest text from any industry but may require additional training on insurance-specific use cases. Examples include tools from AWS, Microsoft, and Google. Broad use insurance-specific tools, on the other hand, are set up to automate manual document review processes, identify key data, and enter that data into core systems. These solutions are equipped with a business rules engine that carriers can use to apply rules of engagement for specific incoming documents.
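To make the idea of a business rules engine concrete, the sketch below routes an ingested document to a queue based on an ordered list of rules. It is not modeled on any vendor's product; the rule names, document types, and queue names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # returns True when the rule applies
    action: str                        # queue the document is routed to


# Hypothetical rules, evaluated in order; the first match wins.
RULES = [
    Rule("missing policy number", lambda d: not d.get("policy_number"), "manual_review"),
    Rule("loss run", lambda d: d.get("doc_type") == "loss_run", "underwriting_queue"),
    Rule("medical bill", lambda d: d.get("doc_type") == "medical_bill", "claims_queue"),
]


def route(document: dict, rules=RULES, default: str = "manual_review") -> str:
    """Return the action of the first rule whose condition matches."""
    for rule in rules:
        if rule.condition(document):
            return rule.action
    return default
```

Keeping the rules in a plain list means a carrier could reorder or extend them without touching the routing logic, which is the main appeal of this kind of design.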
Claims- and underwriting-specific tools are, as the name suggests, tailored to artifacts and forms used in the claims and underwriting processes, such as ACORD forms, and to tasks such as scanning for medical details that may be “hidden” in unstructured text. ITI advanced underwriting tools are typically designed for commercial insurance business submission and underwriting, which is known for being a complex, paper-intensive process. Machine learning and AI make up a larger part of these tools, helping to classify risks accurately and speed up underwriting.
ITI can be a useful tool for dealing with the myriad documents involved in the insurance life cycle, but carriers should define the specific use cases they want to address before launching a pilot. A broad approach can include too many variables to provide real success. Insurer IT executives hoping to learn more about developments in this space can read more in Novarica’s report, Intelligent Text Ingestion: Overview and Prominent Providers.