How to build an invoice extraction agent

Today

Building an Automated Invoice Processing System with n8n

In today's fast-paced business environment, handling invoices manually can be time-consuming and error-prone. That's why I decided to build an automated invoice processing agent using n8n, an open-source workflow automation tool. This system streamlines the entire invoice management process—from receipt to storage and analysis—all through a simple Telegram interface.

The Problem: Invoice Management Headaches

Managing invoices traditionally involves several manual steps:

For small businesses and freelancers, this process can eat up valuable time that could be better spent on core business activities.

The Solution: A Telegram-Based Invoice Agent

I created a fully automated workflow using n8n that accomplishes the following:

  1. Receives invoices through a Telegram bot
  2. Extracts key information using OCR technology
  3. Stores the data in a Google Sheet for easy tracking
  4. Archives the original invoice in Google Drive
  5. Provides a confirmation message with invoice details

The entire process happens within seconds of uploading an invoice, with no manual data entry required.

How It Works: The Technical Architecture

Let's break down the workflow components:

1. Telegram Interface

The system is triggered when a user sends a document to a dedicated Telegram bot. This provides a simple, mobile-friendly way to submit invoices from anywhere. The workflow begins with a "Telegram Trigger" node that listens for incoming files.

2. OCR Processing with Mistral AI

Once an invoice is received, the system:

This part of the workflow leverages Mistral's powerful OCR capabilities to convert the visual invoice into machine-readable text.

3. Intelligent Information Extraction

The raw OCR results are then processed through an "Information Extractor" node powered by OpenAI's GPT-4o model, which identifies and extracts:

This AI-powered approach can handle various invoice formats and layouts without requiring template-specific configurations.

4. Database Integration

The extracted information is automatically appended to a Google Sheet that serves as an invoice database. This creates a searchable, sortable record of all processed invoices with their key details.

5. Document Archiving

The original invoice document is saved to a dedicated "Invoices" folder in Google Drive with a standardized naming convention ("Invoice [Month-Day-Year]"). This ensures all original documents are properly archived and easily accessible when needed.

6. User Feedback

Finally, the system sends a confirmation message back to the user through Telegram, including:

This user-friendly response provides immediate confirmation that the invoice has been processed successfully.

The Complete n8n Workflow

The workflow consists of 13 interconnected nodes:

  1. Telegram Trigger - Initiates the workflow when a document is received
  2. Download File - Retrieves the file from Telegram
  3. Upload to Mistral - Sends the document to Mistral's OCR service
  4. Get Signed URL - Obtains a processing URL for the document
  5. Get OCR Results - Extracts text from the document
  6. Information Extractor - Identifies key invoice data points
  7. OpenAI Chat Model - Powers the information extraction
  8. Update Database - Adds the invoice data to Google Sheets
  9. Telegram - Prepares the original file for Google Drive
  10. Add Invoice Image to Drive - Archives the document
  11. Invoice Agent - Prepares the response message
  12. Window Buffer Memory - Maintains conversation context
  13. Reply - Sends the confirmation message to the user

Benefits and Results

This automated invoice processing system delivers several key benefits:

Time Savings

What used to take 5-10 minutes per invoice now happens automatically in seconds, freeing up valuable time for more important tasks.

Improved Accuracy

By eliminating manual data entry, the system reduces the risk of transcription errors and ensures consistent data formatting.

Better Organization

All invoices are systematically named, stored, and tracked, making it easy to find any document when needed.

Enhanced Visibility

The Google Sheet database provides at-a-glance visibility into outstanding invoices, payment due dates, and financial obligations.

Mobile Accessibility

The Telegram interface allows for processing invoices from anywhere using a smartphone, without needing a computer or scanner.

Future Enhancements

While the current system works well, I'm planning several enhancements:

  1. Payment tracking - Automatically marking invoices as paid when transactions are detected
  2. Due date reminders - Sending notifications when payment deadlines approach
  3. Expense categorization - Using AI to classify expenses for accounting purposes
  4. Multi-currency support - Converting foreign currencies to a base currency
  5. Vendor analysis - Identifying spending patterns with different suppliers

Build Your Own Invoice Agent

I'm sharing the complete workflow JSON so you can easily implement this system yourself. Here's how to get started:

Download and Import the Workflow

  1. Download the workflow JSON file:

    • You can download the complete workflow here
    • Alternatively, copy the JSON from the code block at the end of this article
  2. Import into n8n:

    • Log in to your n8n instance
    • Go to "Workflows" in the main menu
    • Click the "Import from File" button
    • Select the downloaded JSON file or paste the copied JSON
    • Click "Import" to add the workflow to your n8n instance
  3. Configure credentials:

    • The workflow requires the following credential connections:
      • Telegram Bot API credentials
      • Google Sheets OAuth
      • Google Drive OAuth
      • Mistral AI API key
      • OpenAI API key
    • Click on each node to configure the credentials for your accounts
  4. Set up your Google Sheet:

    • Create a new Google Sheet with columns for: Invoice Number, Date, Total Amount ($), Billing Address, Due Date, and Notes
    • Share the sheet with the Google account connected to n8n
    • Update the Google Sheets node with your new sheet ID
  5. Create a Google Drive folder:

    • Create a folder named "Invoices" in your Google Drive
    • Copy the folder ID from the URL
    • Update the Google Drive node with your folder ID
  6. Activate the workflow:

    • Once all credentials are configured, activate the workflow
    • Test it by sending an invoice to your Telegram bot

Resources and APIs

Here are the resources you'll need to set up your own invoice processing system:

  1. n8n.io - The workflow automation platform
  2. Telegram Bot API - For creating your own Telegram bot
  3. Mistral AI - For OCR processing
  4. OpenAI API - For intelligent information extraction
  5. Google Sheets API - For database functionality
  6. Google Drive API - For document storage

The best part is that most of these tools offer free tiers sufficient for personal or small business use, making this a cost-effective solution for invoice management.

Complete Workflow JSON

{
  "name": "Invoice Agent",
  "nodes": [
    {
      "parameters": {
        "updates": ["message"],
        "additionalFields": {}
      },
      "id": "578e9355-1e60-45bc-b1f1-20b83e8cd744",
      "name": "Telegram Trigger",
      "type": "n8n-nodes-base.telegramTrigger",
      "typeVersion": 1.1,
      "position": [-240, -320],
      "webhookId": "c6a9a358-6607-4e0e-8bb9-3d065f1716aa",
      "credentials": {}
    },
    {
      "parameters": {
        "resource": "file",
        "fileId": "={{ $json.message.document.file_id }}"
      },
      "id": "64e6b76c-2efb-4ae7-9bd0-1e6f85e6bfa3",
      "name": "Download File",
      "type": "n8n-nodes-base.telegram",
      "typeVersion": 1.2,
      "position": [-100, -320],
      "webhookId": "0cd2a44b-17c6-4911-bd4e-24bc3d3fa2fd",
      "credentials": {}
    },
    {
      "parameters": {
        "operation": "append",
        "documentId": {
          "__rl": true,
          "value": "15x2ZQ8dIBU8VJmoh7ZvQ-57ZzX5i2Gaj53pTzLB2fTk",
          "mode": "list",
          "cachedResultName": "Invoice Database",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/15x2ZQ8dIBU8VJmoh7ZvQ-57ZzX5i2Gaj53pTzLB2fTk/edit?usp=drivesdk"
        },
        "sheetName": {
          "__rl": true,
          "value": "gid=0",
          "mode": "list",
          "cachedResultName": "Sheet1",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/15FOh4i1wwy7zv-k4foZc8ubMkMLz_-BibPJKTZ-IziM/edit#gid=0"
        },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "Invoice Number": "={{ $json.output.invoiceNumber }}",
            "Date": "={{ $json.output.invoiceDate }}",
            "Total Amount ($)": "={{ $json.output.totalAmount }}",
            "Billing Address": "={{ $json.output.billingAddress }}",
            "Due Date": "={{ $json.output.dueDate }}",
            "Notes": "={{ $json.output?.notes ?? '' }}"
          },
          "matchingColumns": [],
          "schema": [
            {
              "id": "Invoice Number",
              "displayName": "Invoice Number",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Date",
              "displayName": "Date",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Total Amount ($)",
              "displayName": "Total Amount ($)",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Billing Address",
              "displayName": "Billing Address",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Due Date",
              "displayName": "Due Date",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Notes",
              "displayName": "Notes",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            }
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {
          "useAppend": true
        }
      },
      "id": "9152abce-0cbb-41f8-9024-a0847b6088fa",
      "name": "Update Database",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.5,
      "position": [-220, 80],
      "credentials": {}
    },
    {
      "parameters": {
        "name": "=Invoice [{{ $now.format('MMMM-dd-yyyy')}}]",
        "driveId": {
          "__rl": true,
          "value": "My Drive",
          "mode": "list",
          "cachedResultName": "My Drive",
          "cachedResultUrl": "https://drive.google.com/drive/my-drive"
        },
        "folderId": {
          "__rl": true,
          "value": "1qhiNfbHmSPz38I7zyVc9QZkoWNbFRg3y",
          "mode": "list",
          "cachedResultName": "Invoices",
          "cachedResultUrl": "https://drive.google.com/drive/folders/1qhiNfbHmSPz38I7zyVc9QZkoWNbFRg3y"
        },
        "options": {}
      },
      "id": "30495278-30ce-48cd-9587-b38c00ff1bc6",
      "name": "Add Invoice Image to Drive",
      "type": "n8n-nodes-base.googleDrive",
      "typeVersion": 3,
      "position": [60, 140],
      "credentials": {}
    },
    {
      "parameters": {
        "resource": "file",
        "fileId": "={{ $('Download File').item.json.result.file_id }}"
      },
      "id": "5f037117-7c95-48f2-a30f-169dbbe9ffa2",
      "name": "Telegram",
      "type": "n8n-nodes-base.telegram",
      "typeVersion": 1.2,
      "position": [-100, 80],
      "webhookId": "bb8b2f45-d2d8-4a30-89fd-0be24e684c07",
      "credentials": {}
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "89364ec6-9dcb-4d46-a2a1-56fd69349ae6",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "typeVersion": 1,
      "position": [-220, 440],
      "credentials": {}
    },
    {
      "parameters": {
        "sessionIdType": "customKey",
        "sessionKey": "={{ $('Telegram Trigger').item.json.message.chat.id }}"
      },
      "id": "8d161c1a-178b-4180-b1da-71bf99879ccf",
      "name": "Window Buffer Memory",
      "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
      "typeVersion": 1.3,
      "position": [-120, 520]
    },
    {
      "parameters": {
        "chatId": "={{ $('Telegram Trigger').item.json.message.chat.id }}",
        "text": "={{ $json.output }}",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "id": "2f7e0962-b92c-4a6c-90b6-1ccc0a083f16",
      "name": "Reply",
      "type": "n8n-nodes-base.telegram",
      "typeVersion": 1.2,
      "position": [60, 340],
      "webhookId": "53424144-133c-4f5e-86aa-814e345cb34c",
      "credentials": {}
    },
    {
      "parameters": {
        "promptType": "define",
        "text": "=[Invoice Information]\n{{ $('Get OCR Results').item.json.pages[0].markdown}} \n\n[Link to the Invoice Database]\nhttps://docs.google.com/spreadsheets/d/15x2ZQ8dIBU8VJmoh7ZvQ-57ZzX5i2Gaj53pTzLB2fTk/edit?gid=0#gid=0",
        "options": {
          "systemMessage": "=# System Role\nYou are an efficient invoice assistant that extracts key information from invoices and generates concise user responses.\n\n# Task Specification\nThank the user for submitting an invoice. Extract the total amount, due date, and a summary of the notes from the invoice. Inform the user that the original invoice has been added to the Google Drive, include the file name, and provide the link to the invoice database.\n\n# Specifics and Context\nStart off by thanking the user for submitting an invoice. Respond clearly and concisely to help users quickly understand their invoice details and access the database if needed.\n\n# Example\n## Input\n[Invoice Information]  \nInvoice Number: INV-12345  \nDate: 2024-12-01  \nTotal Amount: 950.00  \nBilling Address: 123 Business Lane  \nDue Date: 2024-12-15  \nNotes:  \n- Please make payment within the due date to avoid late fees.  \n- For questions about this invoice, contact billing@business.com or call (123) 456-7890.  \n- Payment methods accepted: Bank transfer, credit card, or PayPal.  \n- Late payments are subject to a 5% penalty fee  \n\n[File Name]  \nInvoice [December-05-2024]  \n\n[Link to the Invoice Database]  \nhttps://docs.google.com/spreadsheets/d/15FOh41iwvy7zv-k4foZc8ubMkMLz_BiBPXJTZ-IziM/edit?usp=sharing  \n\n## Output\nThanks for submitting!\n\nThe total amount of the invoice is **$950.00**, and the due date is **2024-12-15**.  \n\n**Notes:**  \n- Please make payment on time to avoid late fees.  \n- Contact billing@business.com or call (123) 456-7890 for any questions.  \n- Accepted payment methods: Bank transfer, credit card, and PayPal.  \n\nThe original invoice has been added to Google Drive with the file name **\"Invoice [December-05-2024]\"**.  \n\nYou can access the invoice database [here](https://docs.google.com/spreadsheets/d/15FOh41iwvy7zv-k4foZc8ubMkMLz_BiBPXJTZ-IziM/edit?usp=sharing).\n"
        }
      },
      "id": "0e93afd2-ea2d-46c5-9c11-663cc72c31b9",
      "name": "Invoice Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "typeVersion": 1.7,
      "position": [-240, 260]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://api.mistral.ai/v1/files",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendBody": true,
        "contentType": "multipart-form-data",
        "bodyParameters": {
          "parameters": [
            {
              "name": "purpose",
              "value": "ocr"
            },
            {
              "parameterType": "formBinaryData",
              "name": "file",
              "inputDataFieldName": "data"
            }
          ]
        },
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [40, -320],
      "id": "43371c37-0a2e-4382-9a69-b6c650cf0544",
      "name": "Upload to Mistral",
      "credentials": {}
    },
    {
      "parameters": {
        "url": "=https://api.mistral.ai/v1/files/{{$json.id}}/url",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [-220, -120],
      "id": "5eb5734d-6f62-4a3e-a775-1d8a4ad27513",
      "name": "Get Signed URL",
      "credentials": {}
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://api.mistral.ai/v1/ocr",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"model\": \"mistral-ocr-latest\",\n  \"document\": {\n    \"type\": \"document_url\",\n    \"document_url\": \"{{ $json.url }}\"\n  },\n  \"include_image_base64\": true\n}",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [-100, -120],
      "id": "b7b9295a-0335-4079-b128-cfe049d85c81",
      "name": "Get OCR Results",
      "credentials": {}
    },
    {
      "parameters": {
        "text": "={{ $json.pages[0].markdown }}",
        "attributes": {
          "attributes": [
            {
              "name": "invoiceNumber",
              "description": "Invoice Number"
            },
            {
              "name": "invoiceDate",
              "description": "invoice date"
            },
            {
              "name": "totalAmount",
              "description": "Total Amount"
            },
            {
              "name": "billingAddress",
              "description": "Billing Address"
            },
            {
              "name": "dueDate",
              "description": "Due Date"
            },
            {
              "name": "notes",
              "description": "Notes"
            }
          ]
        },
        "options": {}
      },
      "type": "@n8n/n8n-nodes-langchain.informationExtractor",
      "typeVersion": 1,
      "position": [40, -120],
      "id": "af755240-c142-4230-b120-0fdcecfa7e7e",
      "name": "Information Extractor"
    },
    {
      "parameters": {
        "model": {
          "__rl": true,
          "value": "gpt-4o",
          "mode": "list",
          "cachedResultName": "gpt-4o"
        },
        "options": {}
      },
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "typeVersion": 1.2,
      "position": [60, 20],
      "id": "c0d0af98-ec81-4844-aabc-805927dde670",
      "name": "OpenAI Chat Model1",
      "credentials": {}
    }
  ],
  "pinData": {},
  "connections": {
    "Telegram Trigger": {
      "main": [
        [
          {
            "node": "Download File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download File": {
      "main": [
        [
          {
            "node": "Upload to Mistral",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update Database": {
      "main": [
        [
          {
            "node": "Telegram",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Add Invoice Image to Drive": {
      "main": [
        [
          {
            "node": "Invoice Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Telegram": {
      "main": [
        [
          {
            "node": "Add Invoice Image to Drive",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Invoice Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Window Buffer Memory": {
      "ai_memory": [
        [
          {
            "node": "Invoice Agent",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "Invoice Agent": {
      "main": [
        [
          {
            "node": "Reply",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Upload to Mistral": {
      "main": [
        [
          {
            "node": "Get Signed URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get Signed URL": {
      "main": [
        [
          {
            "node": "Get OCR Results",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get OCR Results": {
      "main": [
        [
          {
            "node": "Information Extractor",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model1": {
      "ai_languageModel": [
        [
          {
            "node": "Information Extractor",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Information Extractor": {
      "main": [
        [
          {
            "node": "Update Database",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "fffb027a-5fc5-4fb4-a498-893c0e0dca3c",
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "REMOVED-INSTANCE-ID"
  },
  "id": "XuzdrH13IU63LxXi",
  "tags": []
}

Note: I've abbreviated the JSON for readability. Download the complete version from the link above.

Troubleshooting Common Issues

While setting up your own invoice agent, you might encounter a few common challenges:

API Rate Limits

OCR Accuracy Problems

Telegram File Size Limits

Google API Authentication Errors

Conclusion

Building this automated invoice processing system has transformed how I handle financial documentation. What was once a tedious, error-prone process is now a streamlined, efficient workflow that happens in the background. Through the combination of n8n's workflow capabilities, Telegram's accessibility, and AI-powered extraction, I've created a system that saves time, improves accuracy, and makes invoice management nearly effortless.

If you're drowning in invoice paperwork or simply want to modernize your financial documentation processes, consider building a similar automation. The initial setup might take a few hours, but the long-term time savings are well worth the investment.

I've made the entire workflow available for you to download and use, so you can get started right away without building everything from scratch. Just follow the setup instructions in the "Build Your Own Invoice Agent" section above.