🔧 Extract Structured Data from Documents and Emails

Customer operations teams often face a mountain of unstructured information arriving by email: PDF and Word attachments, scanned images, or just raw text. Manually reviewing these inputs, classifying them, and entering the extracted data into ERPs, CRMs, or ticketing systems is time-consuming and error-prone.

With UpBrains AI Agent Builder, you can automate this process by building AI Agents that read, classify, and extract structured data from emails and their attachments, and automatically send that information to your internal tools. This article shows you how to build such an agent using drag-and-drop flows, leveraging the capabilities of UpBrains' Information Extraction and Document Classification tools.


🔄 Step-by-Step: Building a Data Extraction AI Agent

🌐 Step 1: Create a New AI Agent Flow

Start by creating a new AI Agent in your Agent Builder dashboard. This will serve as the container for your document-processing workflow.

ℹ️ Tip: See the Creating Your First AI Agent guide for instructions on creating AI agents, and the Common Tools and Actions guide for an overview of the AI blocks and steps available in agent flows.

✉️ Step 2: Start with an Inbox, Email or Document Store Trigger

Use the Trigger block to activate the agent automatically whenever a new email arrives in a connected inbox or a new item arrives in a connected tool such as Microsoft OneDrive, Google Drive, Box, or Amazon S3. You can also set up routing filters (e.g., subject line contains "invoice") to target specific types of incoming messages.
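To make the routing filter concrete, here is a minimal Python sketch of the logic behind a "subject line contains" rule. In Agent Builder the filter is configured in the Trigger block rather than written as code; the IncomingEmail class and its field names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncomingEmail:
    """Hypothetical shape of a trigger payload; real field names may differ."""
    subject: str
    body: str = ""
    attachment_names: List[str] = field(default_factory=list)


def matches_subject_filter(email: IncomingEmail, keyword: str) -> bool:
    """Return True when the subject contains the keyword, ignoring case."""
    return keyword.lower() in email.subject.lower()


emails = [
    IncomingEmail(subject="Invoice #1042 for March", attachment_names=["inv-1042.pdf"]),
    IncomingEmail(subject="Lunch on Friday?"),
]

# Only messages that pass the filter would start the agent flow.
invoice_emails = [e for e in emails if matches_subject_filter(e, "invoice")]
print([e.subject for e in invoice_emails])
```

The comparison is case-insensitive here by choice; whether the built-in filter behaves the same way should be checked in the Trigger block settings.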

🔍 Step 3: (Optional) Classify the Document

Drag in the Document Classification action to determine the type of document (e.g., purchase order, invoice, certificate of analysis). This is helpful when the same inbox receives multiple types of documents and you want the flow to adapt accordingly.

Input: Email body text and attachment files (PDF, DOCX, JPG, PNG)

Output: Document class (e.g., "Invoice")

You can use this classification output in conditional branches within your flow to route documents to the correct extraction logic.
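The branching described above can be sketched in Python as a simple lookup from document class to extractor. The extractor names and the "manual-review" fallback are assumptions made for illustration; in a real flow you would configure conditional branches visually and pick extractors from the list available in your account.

```python
# Hypothetical extractor names; replace with the ones used in your flow.
ROUTES = {
    "invoice": "invoice-extractor",
    "purchase order": "purchase-order-extractor",
    "certificate of analysis": "coa-custom-extractor",
}


def extractor_for(document_class: str) -> str:
    """Pick an extractor for a classification result, or fall back to manual review."""
    key = document_class.strip().lower()
    return ROUTES.get(key, "manual-review")


print(extractor_for("Invoice"))
print(extractor_for("Packing Slip"))
```

Keeping a fallback branch is a design choice worth copying into the flow itself, so that unrecognized documents are not silently sent to the wrong extractor.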

🔢 Step 4: Extract Structured Data

Use the Information Extraction action to extract fields such as invoice number, date, line items, total amounts, PO numbers, and more from the body of the email and its attachments.

This action can process a variety of content types including:

  • Plain email text

  • PDFs or images

  • Microsoft Word and Excel

You may use one of the following types of extractors:

  • Prebuilt extractors for documents such as invoices, purchase orders, ID documents, transportation documents and more.

  • Custom extractors that you create yourself.

💡 For a detailed guide on creating custom extractors, see the Create Custom Extractor article.

Once you have decided which extractor to use, add an "Information Extraction" action to your flow and configure it as follows (these settings are for Document Skills; the process is the same for text extractors used through Conversation Skills):

  • Choose the default connection (called UpBrains AI Connection for Document Skills)

  • Set the attachment URL from the trigger output. Extractors do not handle credential information, so if the link requires credentials (for example, Front App or Microsoft Outlook attachments), you may need to create a Signed URL first.

  • Choose the desired extractor (prebuilt or custom) as the value of Extractor.

  • Choose the file types that you need processed (PDF, images, etc.).

  • Specify pages if you would like to limit the number of pages processed (e.g., enter value 1 to process only the first page).

This step automatically handles multi-page files of up to 50 pages. It also supports AI-assisted correction of noisy inputs, image quality enhancement, rotation, etc.
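As a rough checklist for the settings above, the following Python sketch mirrors them in a small data class and flags obvious mistakes before a flow is run. It is not an UpBrains API: the class, the field names, and the validation rules are assumptions, including the choice to treat the pages value as a count capped at the 50-page limit mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_PAGES = 50  # multi-page limit stated in this article


@dataclass(frozen=True)
class ExtractionSettings:
    """Hypothetical mirror of the action's settings; not an UpBrains API."""
    connection: str
    attachment_url: str
    extractor: str
    file_types: Tuple[str, ...]
    pages: Optional[int] = None  # e.g. 1 to process only the first page


def validate(settings: ExtractionSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    if not settings.attachment_url.startswith(("http://", "https://")):
        problems.append("attachment_url must be an HTTP(S) link")
    if not settings.file_types:
        problems.append("choose at least one file type")
    if settings.pages is not None and not 1 <= settings.pages <= MAX_PAGES:
        problems.append(f"pages must be between 1 and {MAX_PAGES}")
    return problems


settings = ExtractionSettings(
    connection="UpBrains AI Connection for Document Skills",
    attachment_url="https://example.com/files/po-123.pdf",  # placeholder URL
    extractor="purchase-order",  # hypothetical extractor name
    file_types=("PDF",),
    pages=1,
)
print(validate(settings))
```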



Output: A structured JSON object containing the extracted fields. For the JSON format returned by extractors, see the Extractor Data Schema article.

🚀 Step 5: Send Extracted Data to External Tools

Drag in integration steps like HTTP or pre-built connectors to tools such as:

  • ERP systems (e.g., NetSuite, SAP)

  • CRM platforms (e.g., Salesforce)

  • Customer support or order management systems

Use the extracted JSON from the previous step as input for these connectors. If the target system expects different field names or a different structure, use a provided action such as Build JSON to map each extracted field to the correct target field. See the Common Tools and Actions guide for more details.
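The kind of field mapping that Build JSON performs can be sketched in Python as follows. All field names on both sides are hypothetical: the real extractor output is described in the Extractor Data Schema article, and the target names depend on your ERP or CRM.

```python
from typing import Any, Dict

# Hypothetical mapping: extracted field name -> target system field name.
FIELD_MAP = {
    "invoice_number": "externalId",
    "invoice_date": "tranDate",
    "total_amount": "total",
}


def build_payload(extracted: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Copy mapped fields, skipping any that the extractor did not return."""
    return {
        target: extracted[source]
        for source, target in field_map.items()
        if source in extracted
    }


extracted = {"invoice_number": "INV-1042", "total_amount": 1250.0, "vendor": "Acme"}
payload = build_payload(extracted, FIELD_MAP)
print(payload)
```

Fields missing from the extraction result are omitted rather than sent as empty values, which makes it easier for the receiving system to tell "not found" apart from "blank".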

📖 Example Use Case

A supply chain team wants to automate intake of emailed purchase orders:

  1. Email with PO arrives.

  2. Agent classifies the attachment as a "Purchase Order".

  3. Agent extracts fields like PO number, customer ID, and item list.

  4. Agent sends the data to the company’s ERP system to automatically create a new order.
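The four steps above can be sketched end to end in Python. The classify, extract, and create_order callables stand in for the Document Classification action, the Information Extraction action, and an ERP connector; the stub implementations, field names, and URLs below are invented for illustration.

```python
from typing import Any, Callable, Dict, List


def process_email(
    attachment_url: str,
    classify: Callable[[str], str],
    extract: Callable[[str], Dict[str, Any]],
    create_order: Callable[[Dict[str, Any]], None],
) -> str:
    """Classify the attachment, and create an order only for purchase orders."""
    if classify(attachment_url) != "Purchase Order":
        return "skipped"
    fields = extract(attachment_url)
    create_order(fields)
    return "order created"


# Stand-in implementations so the sketch runs without any external service.
created_orders: List[Dict[str, Any]] = []


def fake_classify(url: str) -> str:
    return "Purchase Order" if "po-" in url else "Other"


def fake_extract(url: str) -> Dict[str, Any]:
    return {"po_number": "PO-123", "customer_id": "C-9", "items": [{"sku": "A1", "qty": 2}]}


print(process_email("https://example.com/po-123.pdf", fake_classify, fake_extract, created_orders.append))
print(process_email("https://example.com/newsletter.pdf", fake_classify, fake_extract, created_orders.append))
```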

📊 Final Thoughts

With UpBrains' no-code Agent Builder, you can deploy intelligent agents that combine document understanding, classification, and structured data extraction in a few clicks. Whether you're processing invoices, RFQs, POs, or customs documents, these AI agents can seamlessly enter extracted data into your business systems—dramatically reducing manual effort and improving accuracy.

Ready to start building? Explore more actions in the Common Tools and Actions guide to customize your agent further.

