Azure Document Intelligence
Azure Document Intelligence (formerly known as Azure Form Recognizer) is a machine-learning-based service that extracts text (including handwriting), tables, and key-value pairs from scanned documents or images.
The current loader implementation built on Document Intelligence ingests content page by page and turns each page into a LangChain document.
Document Intelligence supports PDF, JPEG, PNG, BMP, and TIFF files.
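Because only these formats are accepted, it can be useful to check a file before sending it to the service. The following is a minimal sketch, not part of the loader: the helper name is_supported_file is hypothetical, and the mapping from the formats listed above to file extensions is an assumption.

```python
from pathlib import Path

# Assumed extension mapping for the formats listed above (PDF, JPEG, PNG, BMP, TIFF).
SUPPORTED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def is_supported_file(filename: str) -> bool:
    """Return True if the file extension matches one of the supported formats.

    This only inspects the file name; it does not open or validate the file.
    """
    return Path(filename).suffix.lower() in SUPPORTED_SUFFIXES
```

The check is case-insensitive, so "scan.PDF" and "scan.pdf" are treated the same way.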
Further documentation is available at https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/?view=doc-intel-3.1.0.
%pip install langchain azure-ai-formrecognizer -q
Example 1
The first example uses a local file, which is sent to Azure Document Intelligence for analysis.
First, an instance of a DocumentAnalysisClient is created with endpoint and key for the Azure service.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
document_analysis_client = DocumentAnalysisClient(
    endpoint="<service_endpoint>", credential=AzureKeyCredential("<service_key>")
)
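Rather than pasting the endpoint and key into the notebook, they can be read from environment variables. The following is a minimal sketch under stated assumptions: the variable names AZURE_DI_ENDPOINT and AZURE_DI_KEY and the helper read_service_settings are hypothetical and not defined by Azure or LangChain.

```python
import os
from typing import Mapping, Tuple

# Hypothetical variable names chosen for this sketch; use whatever your setup defines.
ENDPOINT_VAR = "AZURE_DI_ENDPOINT"
KEY_VAR = "AZURE_DI_KEY"


def read_service_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (endpoint, key) from the environment, failing early if either is missing."""
    missing = [name for name in (ENDPOINT_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[ENDPOINT_VAR], env[KEY_VAR]
```

The returned values can then be passed to the client shown above as endpoint=endpoint and credential=AzureKeyCredential(key).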
With the initialized document analysis client, we can proceed to create an instance of the DocumentIntelligenceLoader:
from langchain.document_loaders.pdf import DocumentIntelligenceLoader
loader = DocumentIntelligenceLoader(
    "<Local_filename>",
    client=document_analysis_client,
    model="<model_name>",  # e.g. prebuilt-document
)
documents = loader.load()
The output contains each page of the source document as a LangChain document:
documents
[Document(page_content='...', metadata={'source': '...', 'page': 1})]
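Since each page arrives as its own document, a common next step is to reassemble the full text in page order. The following is a minimal sketch that relies only on the page_content and metadata['page'] fields shown in the output above. The Document dataclass here is a stand-in so the snippet runs without LangChain installed, and pages_to_text is a hypothetical helper, not part of the loader.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class Document:
    """Stand-in for LangChain's Document, with the two fields shown in the output above."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def pages_to_text(documents: Iterable[Document], separator: str = "\n\n") -> str:
    """Join page-wise documents into one string, ordered by metadata['page']."""
    # Pages without a 'page' entry sort first (assumed default of 0).
    ordered = sorted(documents, key=lambda doc: doc.metadata.get("page", 0))
    return separator.join(doc.page_content for doc in ordered)
```

With the real loader, the list returned by loader.load() could be passed to pages_to_text directly, assuming its documents carry the same metadata keys.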