Curious about what keeps experts, CEOs and other decision-makers in the Intelligent Document Processing (IDP) space on their toes? Get food for thought on IDP-related topics from the industry’s leading minds.
In this opinion piece, Dan Lucarini, Senior Analyst at analyst firm Deep Analysis, delves into the history of IDP development over the past 50+ years and the impact of AI technology, culminating in generative AI, what Deep Analysis calls the 4th Wave.
Since the invention of omnifont OCR, document scanners, and personal computers in the early 1970s, software developers have been on a mission to teach computers how to do paperwork for us. What if our computers could replace the countless person-hours spent each day in offices around the globe reading documents, understanding their meaning, and working out what data is needed for the next step in a work process? What if they could also do the data entry?
This is the Holy Grail of knowledge worker productivity. It is also the goal of intelligent document processing (IDP), which has progressed slowly over the past 50 years. AI has always been at the center of this quest. Now, with generative AI, we have entered a new paradigm for IDP, which Deep Analysis calls the 4th Wave.
Why is this 4th Wave so important to IDP? For the first time, using GenAI, a computer can reliably classify documents and extract data without the need for human intervention, training samples, or prior knowledge. In AI terms this is known as zero-shot learning; in other words, the document is recognized with no prior encounter or training. IDP’s traditional categorization of documents into structured, semi-structured, and unstructured does not matter to GenAI. You can send it invoices, contracts, forms, emails, correspondence, or any other text.
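To make the zero-shot idea concrete, here is a minimal sketch of how a single request to a GenAI model might be assembled. It only builds the instruction text and does not call any model. The function name, the wording of the instruction, and the sample document are all assumptions made for illustration, not any vendor's actual API or prompt.

```python
def build_zero_shot_prompt(document_text, candidate_types, fields):
    """Assemble one instruction that contains no labeled example documents.

    Hypothetical helper: the wording below is an illustrative assumption.
    """
    type_list = ", ".join(candidate_types)
    field_list = ", ".join(fields)
    return (
        "Classify the document below as one of: " + type_list + ".\n"
        "Then extract these fields: " + field_list + ".\n"
        "Reply with JSON using the keys document_type and fields. "
        "Use null for any field that is not present.\n\n"
        "Document:\n" + document_text
    )


# Made-up sample document, used only to show the shape of the request.
prompt = build_zero_shot_prompt(
    "Invoice INV-0042 from ACME Corp. Total due: 1,250.00 USD.",
    ["invoice", "contract", "email"],
    ["invoice_number", "total"],
)
print(prompt)
```

The point of the sketch is what is absent: no training samples, no template, and no per-document-type model. The same function would be used unchanged for a contract or an email.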
A Brief History
To better assess the transformative impact of this 4th Wave, it’s helpful to understand the history of IDP development in the three prior waves.
I’ve made and sold IDP software since the early days. My first job in the industry was selling an early commercial “OCR machine,” a large drum-shaped scanner with built-in optical character recognition software developed by Ray Kurzweil, the father of modern OCR. A full page of perfectly printed text could be read by the computer in under five minutes. Correcting the errors took another five minutes. At the time, office managers called it “magical” and “a game changer.” Secretaries and clerks worried they would soon be replaced. Ironically, today we see the same reactions to generative AI.
Since the 1970s, OCR and related technologies have been increasingly paired with AI to make them more effective. What follows is a description of AI’s progressive impact over that period, culminating in generative AI, the 4th Wave.
1st Wave: OCR (1973)
AI legend Ray Kurzweil invented omnifont OCR software for computers to read books and letters for the blind, by training his “OCR machine” to recognize any character from any scanned image. While OCR had been around for years for very narrow use cases, Kurzweil made it universally workable for the coming office PC revolution. Thus began the business of intelligent document processing.
Notable 1st Wave companies: ABBYY, Kurzweil (Xerox), Caere (Nuance), Calera (Caere), Iris (Canon).
2nd Wave: Forms and templates (circa 1990)
This wave saw the emergence of forms and templates for extracting specific data from a page. The software worked on most structured and some semi-structured documents, but it could not handle variability in document layout or format and constantly required new template design and training. Straight-through processing was rare, achieved only occasionally on standard forms with zero variability and perfect image scans.
Notable 2nd Wave companies: Kofax, Datacap, Cardiff, Captiva, Wheb Systems (Captiva), Formware (Captiva), Parascript, TCG Process, Brainware (Hyland), ReadSoft (Kofax), ITESOFT, KnowledgeLake, Nanonets, OCR Labs (IDverse), OpenText, Planet AI.
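The brittleness of 2nd Wave templates can be illustrated with a toy sketch. It assumes a template is simply a map from field name to a fixed position on the page (row and column range), which is a simplification of how real products of that era worked. The template, function name, and sample pages are hypothetical.

```python
# Hypothetical fixed-zone template: field name -> (row, start column, end column).
INVOICE_TEMPLATE = {
    "invoice_number": (0, 9, 17),
    "total": (2, 7, 15),
}


def extract_with_template(page_lines, template):
    """Read each field from its fixed zone, with no awareness of layout changes."""
    fields = {}
    for name, (row, start, end) in template.items():
        line = page_lines[row] if row < len(page_lines) else ""
        fields[name] = line[start:end].strip()
    return fields


# The layout the template was designed for.
page_a = [
    "Invoice: INV-0042",
    "Date:    2024-01-15",
    "Total: 1,250.00",
]

# Same content, but a letterhead line pushes every row down by one.
page_b = ["ACME Corp"] + page_a

fields_a = extract_with_template(page_a, INVOICE_TEMPLATE)
fields_b = extract_with_template(page_b, INVOICE_TEMPLATE)
print(fields_a)
print(fields_b)
```

On the original layout both fields are read correctly. With one extra line at the top, the invoice number comes back empty and the total zone picks up part of the date, which is why each layout variation called for a new or adjusted template.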
3rd Wave: Machine learning and deep learning (circa 2010 – 2012)
General industry adoption of machine learning (ML), natural language processing (NLP), and deep learning algorithms began. Companies developed discriminative ML models for many document types. While this led to impressive improvements in classification, extraction accuracy, and straight-through processing, users still had to gather thousands of samples and employ subject matter experts to train models to acceptable levels of automation. Human-in-the-loop labeling and validation were essential.
Notable 3rd Wave companies: Instabase, Hyperscience, Rossum, Indico Data, Eigen Technologies, Ocrolus, Evolution AI, AYR, AntWorks, Alkymi, Automation Hero, Infrrd, Cortical.io, ReciTAL, Planet AI, Skilja, Ephesoft (Kofax), Veryfi, Klippa, Insiders, TC Labs.
4th Wave: Generative AI and LLMs (2023)
And here we are. GenAI and LLMs are not new; they have been around since Google’s 2017 introduction of transformers. However, it took OpenAI’s ChatGPT launch in November 2022 to focus the software world’s attention on the disruptive power of the technology.
Caveat
As we have repeatedly reported at Deep Analysis, GenAI alone is not an IDP solution. GenAI needs to be surrounded by a trusted data ecosystem. IDP companies are skillfully adding GenAI functionality where it makes sense, running it alongside their own discriminative machine learning models, which are fine-tuned for specific document types and do not hallucinate data.
Conclusion
GenAI is arguably the single most important and disruptive technology advancement in the long history of IDP. Deep Analysis research found that, as of November 2023, a staggering 75% of IDP companies already had GenAI functionality in a release or in development. In my 30 years of experience, I have not seen a modern technology adopted so quickly by so many companies.
Most of the 2nd and 3rd Wave companies have quickly adopted generative AI into their platforms. Startups are appearing with GenAI-first document automation software. Within 12 months, we predict over 90% of IDP products will have some GenAI functionality, and customers will have elevated expectations for AI that can read their documents out of the box with no training.
This excerpt is taken from the Deep Analysis IDP Market Report 2024-2027
About the Author
Dan Lucarini is a senior analyst at Deep Analysis, the leading research firm for unstructured data management technology. Prior to this, he was a successful startup entrepreneur and a product and sales management executive for industry leaders Kofax and OpenText. Dan lives in Cornwall, UK.