What is Document AI: Google's Artificial Intelligence to automate document management?

Receipts, delivery notes, invoices, emails, contracts... Companies accumulate documents of all kinds, and these documents contain a wealth of valuable information.

But information stored in this way loses part of its potential: in its current form, it cannot be processed digitally. For example, an invoicing programme cannot interpret an invoice stored as a PDF.

The reason is that the computers and software we use need "structured data" to function, whereas the information stored in invoices, contracts and emails is considered "unstructured data". And what is structured data? Basically, it is data that is organized in fields and has been labelled or categorized.
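
As an illustration of the difference, the following minimal Python sketch turns a line of free text into labelled fields. The sample text, the `InvoiceRecord` class and the regular expression are hypothetical, chosen only for this example; real invoices vary far more than a single pattern can handle.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceRecord:
    """Structured data: every value sits in a named, typed field."""
    invoice_number: str
    supplier: str
    total: float


# Unstructured data: the same facts, but only as free text (hypothetical sample).
RAW_TEXT = "Invoice INV-1042 from Acme Supplies. Total due: 1250.50 EUR."

# Illustrative pattern that only fits the sample sentence above.
PATTERN = re.compile(
    r"Invoice (?P<number>[A-Z]+-[0-9]+) from (?P<supplier>.+?)\. "
    r"Total due: (?P<total>[0-9]+\.[0-9]{2})"
)


def to_structured(text: str) -> Optional[InvoiceRecord]:
    """Return labelled fields if the text matches the pattern, else None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return InvoiceRecord(
        invoice_number=match.group("number"),
        supplier=match.group("supplier"),
        total=float(match.group("total")),
    )


record = to_structured(RAW_TEXT)
```

Once the values sit in fields such as `record.total`, an invoicing programme can sum, filter or validate them, which it cannot do with the raw sentence.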

Is my company digitalized?

Many companies see digitization as moving the majority of their documents from a physical format to the cloud. However, the change of format is not the decisive factor. What matters is extracting the data from the documents so that it can be processed. And this is where a serious problem arises.

Who enters all the supplier invoice data into the invoicing software? How can all the complaint emails that a retail customer service department receives be handled efficiently? Is there a way to reduce the time needed to process accident reports?

Until now, these tasks - and others like them - have required the work of one or more people within an organization. In other words, companies need thousands of hours per year to be able to take advantage of the data they already have. This implies a high cost in terms of human resources and time that most companies cannot afford.

Artificial Intelligence may be the answer to this problem. The key is to automate the extraction of this information during the digitization process, and to convert the unstructured data into organized fields that computer programmes can process.

What is Document AI?

Artificial Intelligence is increasingly present in our day-to-day lives: its applications are found on all kinds of devices. And if there is one company that is a pioneer in offering AI and Machine Learning solutions, it is Google. In fact, at the end of September 2020, it announced several new features in one of its most promising products: Document AI.

Document AI is a Google Cloud Platform Natural Language Processing (NLP) solution based on Optical Character Recognition (OCR) technology. This service can recognize text, images and characters in more than 200 languages.

With Document AI, unstructured data can be extracted from the image of a document: invoice, form, ticket, etc. The service interprets the information using Machine Learning and structures it in key:value format or in a table. In this way, it is already possible to use the data to automate business processes that traditionally require manual intervention. 
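
To make the key:value idea concrete, here is a minimal Python sketch of how a program might consume such output. The `ExtractedField` shape, the sample values and the 0.8 confidence threshold are assumptions made for illustration; they are not the actual Document AI response format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExtractedField:
    """One key:value pair plus a confidence score (hypothetical shape)."""
    key: str
    value: str
    confidence: float


def split_by_confidence(
    fields: List[ExtractedField], threshold: float = 0.8
) -> Tuple[Dict[str, str], List[str]]:
    """Keep confident pairs as a dict; list the keys that need human review."""
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for item in fields:
        if item.confidence >= threshold:
            accepted[item.key] = item.value
        else:
            needs_review.append(item.key)
    return accepted, needs_review


# Hypothetical output of an extraction step for one invoice.
sample = [
    ExtractedField("supplier_name", "Acme Supplies", 0.97),
    ExtractedField("invoice_date", "2020-09-30", 0.91),
    ExtractedField("total_amount", "1250.50", 0.62),
]

accepted, needs_review = split_by_confidence(sample)
```

Splitting on confidence is one possible design: confident values flow straight into the business process, while uncertain ones are left for a person to check.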

In addition to all the possibilities described above, Document AI makes it possible to:

  • Locate documents through the data they contain.
  • Identify patterns or trends.
  • Create statistics and generate knowledge. 

This not only reduces operating costs, but also increases the scalability of processes that require manual intervention (onboarding, customer service, etc.).

The potential of Document AI services is that they automate one or more parts of the digitization process. This process covers the extraction of information, its classification and its correct storage.

What other obstacles does it overcome?

Several circumstances add to the complexity of unstructured data processing and automation:

  1. The volume of information. The batches of documentation that companies need to digitize are often very large. For example, if a bank wanted to digitize and process the data of all its mortgage deeds, it would be an arduous task.
  2. Diversity. Some documents change over time, both in how the data is collected and in the content itself, for example because of regulatory changes. The same document can also be presented in different formats. This is the case with invoices, whose layout changes from one company to another. They all collect the same information, but present it differently, which makes it harder to locate the different data fields.
  3. Verification. Sometimes it is necessary to check with external sources in order to validate the information. For example, to check that the format of a document such as an ID card is correct. 
  4. Semantic search. Some documents contain information that does not have a specific format, such as an email. There are also others whose fields are very generic, such as a complaint form. In these cases, the key data are extracted from the context, making the process more difficult. 
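
The verification step in point 3 can sometimes start with a purely local format check before any external source is consulted. The Python sketch below assumes, as one example, the Spanish DNI, whose final letter is a checksum of the eight preceding digits; the function name is hypothetical and other ID documents follow different rules.

```python
import re

# Check letters for the Spanish DNI, indexed by (number mod 23).
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"
DNI_PATTERN = re.compile(r"([0-9]{8})([A-Z])")


def is_valid_dni_format(value: str) -> bool:
    """Check the shape (8 digits + letter) and the checksum letter only.

    This does not prove that the document exists; that still needs an
    external source.
    """
    match = DNI_PATTERN.fullmatch(value.strip().upper())
    if match is None:
        return False
    number, letter = match.groups()
    return DNI_LETTERS[int(number) % 23] == letter
```

A check like this catches typing and recognition errors cheaply, so that only plausible values are sent on for external validation.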

How does Document AI work? 

The solutions included in Document AI intervene in each of the phases into which the automation process can be divided.

Digitize documents

The information extraction process takes place at the same time as the scanning of the document. This is possible thanks to OCR techniques. Thus, when digitizing an invoice, it is possible to directly capture the data that appears on it.

Data extraction is also automatic for documents that are already digital. In addition, after the latest updates of Document AI, it can be applied to other formats such as images, voice messages, videos...

Semantic data classification

Document AI identifies patterns without the need for specific fields. As a result, it can classify unstructured information intelligently, just as a human would. For example, it can differentiate between an email that is a commercial order and one that is a complaint.
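
The order-versus-complaint example can be illustrated with a deliberately simple Python sketch. It uses keyword counting as a stand-in, not the Machine Learning models the service relies on; the keyword lists and labels are hypothetical.

```python
import re
from typing import Dict, Set

# Hypothetical keyword lists; a trained model would learn such signals from data.
KEYWORDS: Dict[str, Set[str]] = {
    "order": {"order", "purchase", "quantity", "units", "deliver"},
    "complaint": {"complaint", "refund", "damaged", "unacceptable", "late"},
}


def classify_email(text: str) -> str:
    """Return the label whose keywords appear most often, or 'unknown'."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        label: sum(word in vocabulary for word in words)
        for label, vocabulary in KEYWORDS.items()
    }
    best_label = max(scores, key=scores.get)
    top = scores[best_label]
    # No keyword hits, or a tie between labels, means we cannot decide.
    if top == 0 or list(scores.values()).count(top) > 1:
        return "unknown"
    return best_label
```

The sketch breaks as soon as a customer words a complaint without any of the listed keywords, which is exactly the gap that a model trained on real examples is meant to close.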

Information management and processing

Once the data has been extracted and correctly classified, it can be managed: any software can process it to achieve a specific goal. Document AI allows workflows to be set up so that action can be taken on the data. This requires that the specific Document AI solution is properly integrated with all the systems involved.

Let's look at an example: a person has a traffic accident and wants to report it to their insurance company. Thanks to Document AI and its integrations, this person can send a photo of the handwritten form via WhatsApp. The system then extracts the main data and sends it, already filtered, to the department in charge of handling it. With this automation, the customer saves time and the company simplifies its own process, making it more efficient.
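
The hand-off in this example can be sketched as a small routing step. Everything in the Python snippet below is hypothetical: the field names, the department names and the rule that a report with missing data goes to manual review. It only illustrates how already-extracted data might drive a workflow.

```python
from typing import Dict, List

# Fields the (hypothetical) claims department needs before it can act.
REQUIRED_FIELDS: List[str] = ["policy_number", "accident_date", "vehicle_plate"]


def route_accident_report(extracted: Dict[str, str]) -> Dict[str, object]:
    """Decide where an extracted accident report should be sent."""
    missing = [name for name in REQUIRED_FIELDS if not extracted.get(name)]
    if missing:
        # Incomplete extraction: a person completes the report by hand.
        return {"department": "manual_review", "missing": missing}
    # Forward only the fields the department needs (the "filtered" data).
    payload = {name: extracted[name] for name in REQUIRED_FIELDS}
    return {"department": "motor_claims", "payload": payload}
```

Keeping a manual-review path means the automation handles the routine cases while incomplete reports still reach a person.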

Document AI inside

Document AI comprises three main services:

  • General Document AI, with APIs related to document extraction. One example is the Vision API, which can interpret handwritten text.
  • Custom Document AI, based on AutoML and custom Machine Learning models. With it, a company can create and train its own models for its documents, forms and use cases.
  • Specialized Document AI, which consists of models pre-built by Google for a specific type of document (receipts, invoices...). 

Each of them, in turn, integrates more specific services: either models pre-trained by Google and available through APIs, or the option of training our own models through AutoML.

Likewise, Document AI's Machine Learning APIs and models are integrated with other Google services:

  • Dialogflow: to create virtual assistants that can interpret documents provided by users, such as accident reports, purchase receipts, invoices, etc.
  • NLP: to analyze the general sentiment expressed in a block of text or to identify entities such as people, organizations, locations... 

Making the most of Document AI 

In short, the possibilities offered by Document AI are manifold, but many companies find it difficult to take advantage of them. What's more, they are sometimes aware of their problems and of the existing products, but lack the perspective to find a solution that fits their circumstances.

That's why it is important to have a partner to help take these solutions to the next level. At Emergya, as a Google Cloud Premier Partner, we know all these tools and how to apply them to find specific and functional solutions. Every day we help our clients to take advantage of them, putting all the Google Cloud technology to work for their business.

We want to help you achieve your digital objectives. Let's talk!
