Amazon Comprehend for Intelligent Document Processing

Automate document processing and extract accurate insights

Why Amazon Comprehend for Intelligent Document Processing?

Companies spend significant time and effort manually or digitally pre-processing documents to make them usable for their applications. Documents have different formats, types, and layouts, making this a time-consuming, error-prone, and costly process. Teams might not have the machine learning (ML) expertise to automate intelligent document processing, but they want a simple and efficient solution that can scale with business requirements and provide accurate results.

Amazon Comprehend helps you automate document processing with no prior ML experience required. Use the classification and extraction capabilities to rapidly process a variety of document types and accurately extract insights to inform your business decisions. Access capabilities to detect and protect sensitive data and help meet compliance requirements. 

Benefits of Amazon Comprehend for Intelligent Document Processing

Quickly and accurately process and extract insights from a variety of document types.
Create custom models specific to your domain, industry, or business requirements, with classes and entities you define.
Automate your document processing pipeline and manage models at scale, with no ML experience required.
Discover and protect personally identifiable information in your documents to help meet privacy and compliance standards.

Features

Use a single API to process both plain text and semi-structured documents, whether digital or scanned. On-demand and batch processing are supported for document types such as PDF, DOCX, JPEG, TIFF, PNG, and UTF-8 plain text.

Build custom models that accurately recognize your domain-specific document categories (such as W-2s and auto and home insurance claims) and terminology (such as names, acronyms, product codes, and order types) for your use case or industry. Use dedicated ML models and endpoints that are built with your data and that only you can create and access.
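As a rough illustration of classifying a document with a custom model, the Python sketch below builds a request for the ClassifyDocument operation and picks the highest-scoring class from the response. It assumes the boto3 SDK and an already deployed custom classification endpoint. The helper names, the ".txt" rule for choosing between text and raw bytes, and the endpoint ARN are hypothetical and not part of the service.

```python
from pathlib import Path

# Assumption for this sketch: only ".txt" files are sent as text;
# every other file type is sent as raw bytes.
PLAIN_TEXT_SUFFIXES = {".txt"}


def build_classify_request(filename, content, endpoint_arn):
    """Build keyword arguments for a ClassifyDocument call.

    `content` is the raw file content as bytes. `endpoint_arn` identifies
    a custom classification endpoint that you have already created.
    """
    request = {"EndpointArn": endpoint_arn}
    if Path(filename).suffix.lower() in PLAIN_TEXT_SUFFIXES:
        request["Text"] = content.decode("utf-8")
    else:
        request["Bytes"] = content
    return request


def top_class(response):
    """Return (name, score) of the highest-scoring class, or None.

    Reads the "Classes" list of the response; a multi-label model
    reports its results under "Labels" instead, which this sketch ignores.
    """
    classes = response.get("Classes", [])
    if not classes:
        return None
    best = max(classes, key=lambda c: c["Score"])
    return best["Name"], best["Score"]


def classify_file(path, endpoint_arn, client=None):
    """Classify one file; a client can be injected for testing."""
    if client is None:
        import boto3  # imported lazily so the helpers above work without it

        client = boto3.client("comprehend")
    content = Path(path).read_bytes()
    request = build_classify_request(path, content, endpoint_arn)
    return top_class(client.classify_document(**request))
```

Calling classify_file("claim.pdf", endpoint_arn) with the ARN of your own endpoint would return a (class name, score) pair such as the document category the model considers most likely.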

Improve document processing outcomes with a combination of optical character recognition (OCR) and natural language processing (NLP). Use additional datasets at training time to increase accuracy of classification and entity recognition.

Quickly train, deploy, retrain, and manage your models with out-of-the-box model management capabilities. Access insights from your models with single-step inference.

Expand the reach of your use case by processing documents across the multiple languages that Amazon Comprehend supports, reducing the need for translations.

Use the Amazon Comprehend PII redaction capabilities to help automate the discovery and redaction of PII data in your documents at scale.
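A minimal Python sketch of redaction on a single piece of text follows. It assumes the boto3 SDK and the DetectPiiEntities operation, which reports each entity with a type, a confidence score, and character offsets. The helper names, the bracketed replacement format, and the 0.5 score threshold are illustrative choices, not service defaults. For redaction at scale, an asynchronous PII detection job is the more likely fit.

```python
def redact_pii(text, entities, min_score=0.5):
    """Replace each detected PII span with a "[TYPE]" placeholder.

    `entities` is a list of dicts with "Type", "Score", "BeginOffset",
    and "EndOffset" keys. Spans are replaced from the end of the text
    backwards so that earlier offsets stay valid. Overlapping spans are
    not handled in this sketch.
    """
    kept = [e for e in entities if e["Score"] >= min_score]
    for e in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[: e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text


def detect_and_redact(text, language_code="en", client=None, min_score=0.5):
    """Detect PII in `text` and return a redacted copy.

    A client can be injected for testing; otherwise a boto3 client
    is created with the default credentials and region.
    """
    if client is None:
        import boto3  # imported lazily so redact_pii works without it

        client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    return redact_pii(text, response["Entities"], min_score)
```

Raising min_score redacts less text but with higher confidence, while lowering it redacts more aggressively at the cost of more false positives.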

Use cases

Classify medical bills and claim forms, and extract critical information such as policies and medical codes, to provide accurate insights for claims processing.

Extract entities from income statements, identity verification documents, and other loan application documents for credit evaluation and underwriting.

Automate legal contract processing, classify and triage high-risk documents, and extract insights such as case numbers, trademarks, and clauses to inform negotiations. 

Classify and extract insights from bills, contracts, W-2 forms, bank statements, and invoices for tax provisioning and filing.

Resources

Documentation

Get started with Amazon Comprehend for intelligent document processing.