Technical Architecture

Bennie Product Blueprint: Enterprise Architecture for Document Intelligence

A high-level overview of the systems, agents, and data infrastructure that power intelligent document generation for regulated industries.

01 / Architecture Overview

Layered Enterprise Architecture

Bennie is built as a five-layer system. Each layer has a single responsibility. Data flows top-down through deterministic orchestration, with specialized AI agents providing intelligent reasoning at every stage.

Presentation Layer: Web Application
Interactive workspace with real-time progress tracking and collaboration tools

Orchestration Layer: API & Workflow Engine
Deterministic workflow engine -- controls all state transitions, agent invocations, and data flow

Reasoning Layer: Specialized AI Agents
Purpose-built agents for planning, retrieval, generation, and validation -- each with scoped tool access

Service Layer: Data & Knowledge Services
Knowledge management, data processing, event routing, and curated domain knowledge

Infrastructure Layer: Cloud Platform
Enterprise-grade, fully managed cloud infrastructure with built-in compliance and scalability

02 / The Three-Loop Workflow

Plan. Data. Report.

Every workspace follows a deterministic three-loop lifecycle with human approval gates at every critical transition. The workflow is rigid in sequence but flexible within each loop.

Loop 1

Plan Generation

Upload task requirements and a template. Bennie analyzes both and produces a structured, section-level execution plan with data requirements, quality criteria, and deliverables.

  • Upload task description and output template
  • AI-driven structure extraction and section enrichment
  • Domain knowledge automatically woven into the plan
  • Iterate with natural language feedback via chat
Human Approval Gate: User reviews and approves the plan before proceeding

Loop 2

Data Collection & Ingestion

Upload source materials -- PDFs, spreadsheets, images, code files. Bennie classifies, enriches, and indexes everything, then checks data coverage against the plan.

  • Multi-file upload with automatic format detection
  • AI-driven metadata enrichment with human review
  • Dual ingestion: structured data and unstructured content indexed separately
  • Automated completeness assessment with gap analysis
Human Approval Gate: User reviews data coverage and decides when to generate

Loop 3

Report Generation & Validation

Bennie generates each section with full context retrieval, validates the complete report for cross-section consistency, and auto-corrects flagged issues before user review.

  • Parallel section generation with intelligent context assembly
  • Multi-source retrieval per section for comprehensive coverage
  • Holistic validation across all sections simultaneously
  • Automatic regeneration of flagged sections with fix guidance
  • Per-section versioning, inline editing, and document export
Human Approval Gate: User reviews, edits, and provides section-level feedback

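The three-loop lifecycle above can be read as a small state machine: the sequence of loops is fixed, feedback iterates inside a loop, and only an explicit human approval advances the workspace. The following is a minimal sketch under those assumptions. The names `Loop`, `Workspace`, `iterate`, and `approve` are hypothetical and are not taken from Bennie's implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Loop(Enum):
    PLAN = 1
    DATA = 2
    REPORT = 3
    DONE = 4


# Rigid sequence: each loop may only advance to the one that follows it.
NEXT_LOOP = {Loop.PLAN: Loop.DATA, Loop.DATA: Loop.REPORT, Loop.REPORT: Loop.DONE}


@dataclass
class Workspace:
    loop: Loop = Loop.PLAN
    history: list = field(default_factory=list)

    def iterate(self, feedback: str) -> None:
        """Flexible within a loop: feedback never changes the current loop."""
        self.history.append((self.loop.name, "feedback", feedback))

    def approve(self, approver: str) -> Loop:
        """Human approval gate: the only way to move to the next loop."""
        if self.loop is Loop.DONE:
            raise ValueError("workflow already complete")
        if not approver:
            raise PermissionError("a named human approver is required")
        self.history.append((self.loop.name, "approved", approver))
        self.loop = NEXT_LOOP[self.loop]
        return self.loop
```

Because `approve` is the single transition function, the history doubles as a record of who approved each gate.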
03 / Agent Architecture

Specialized AI Agents

Each agent has a single responsibility and scoped tool access. Agents never call other agents directly -- the orchestration layer coordinates all inter-agent workflows.
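One way to picture this rule is an orchestrator that owns both the agent registry and the tool registry, and hands each agent only the tools it is scoped to. This is an illustrative sketch, not Bennie's actual API: `Agent`, `Orchestrator`, `invoke`, and `run_pipeline` are hypothetical names, and the agents are plain functions.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Agent:
    name: str
    allowed_tools: frozenset
    # The handler receives a payload and its scoped tools, never another agent.
    handler: Callable[[Any, dict], Any]


class Orchestrator:
    def __init__(self, tools: dict):
        self._tools = tools
        self._agents: dict = {}
        self.log: list = []

    def register(self, agent: Agent) -> None:
        self._agents[agent.name] = agent

    def invoke(self, name: str, payload: Any) -> Any:
        agent = self._agents[name]
        # Scoped tool access: expose only the tools this agent is allowed to use.
        scoped = {k: v for k, v in self._tools.items() if k in agent.allowed_tools}
        self.log.append(name)
        return agent.handler(payload, scoped)

    def run_pipeline(self, names: list, payload: Any) -> Any:
        # Inter-agent coordination lives here: each output feeds the next agent.
        for name in names:
            payload = self.invoke(name, payload)
        return payload
```

An agent that reaches for a tool outside its scope fails with a `KeyError`, because that tool was never passed to it.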

Concierge

Chat Orchestrator

The sole orchestrator for all chat interactions. Reasons about user intent, dispatches worker agents, and synthesizes cohesive responses.

Plan Architect

Plan Generation

Generates structured execution plans from task requirements and templates. Extracts section hierarchy, determines content depth, and identifies data needs.

Plan Enricher

Plan Enrichment

Enriches plan sections with curated domain knowledge, generating detailed instructions, quality criteria, and deliverables for each section.

Data Preprocessor

File Classification

Enriches file metadata with intelligent classification: content descriptions, topic extraction, and section mapping for downstream processing.

Completeness Checker

Coverage Analysis

Compares the file catalog against plan requirements. Produces per-section coverage verdicts, identifies gaps, and recommends specific uploads.
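The comparison can be sketched as a set difference between what each plan section needs and what the file catalog provides. The data shapes here are assumptions made for illustration: plan requirements as topic sets per section, catalog entries as dictionaries with a `topics` list, and the verdict labels `covered`, `partial`, and `missing`.

```python
def check_coverage(plan_requirements: dict, file_catalog: list) -> dict:
    """Return a per-section verdict and the list of uncovered topics."""
    # Pool every topic that any uploaded file was classified under.
    available = set()
    for entry in file_catalog:
        available |= set(entry["topics"])

    report = {}
    for section, needed in plan_requirements.items():
        gaps = sorted(set(needed) - available)
        if not gaps:
            verdict = "covered"
        elif len(gaps) < len(needed):
            verdict = "partial"
        else:
            verdict = "missing"
        report[section] = {"verdict": verdict, "gaps": gaps}
    return report
```

The `gaps` list is what a recommendation for specific uploads could be built from.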

Context Engineer

Retrieval Specialist

Performs multi-source retrieval to assemble a rich context bundle with citation tracking for each section of the report.

Content Generator

Report Writer

Produces high-quality section content with inline citations. It works only from the context bundle pre-assembled by the Context Engineer (the retrieval specialist).

Output Validator

Quality Assurance

Performs holistic validation across all sections. Checks plan alignment, detects cross-section inconsistencies, and emits targeted fix guidance.

Pipeline Composer

Test Orchestration

Manages optional test execution workflows. Prepares test scripts, configures parameters, and re-ingests results into the data pipeline.

Core Architectural Principle

"Agents reason. The API orchestrates." -- No agent has direct dependencies on another agent. The orchestration layer controls all state transitions and inter-agent coordination.

04 / Data Pipeline

Multi-Source Retrieval Engine

Every piece of source data flows through a multi-stage ingestion pipeline. At retrieval time, the context engine draws from three independent sources to assemble precisely the right context for each section.

Stage 1

Upload & Classify

Files uploaded and registered in the file catalog with automatic format detection

Stage 2

AI-Driven Enrichment

Intelligent extraction of content descriptions, topics, section mappings, and retrieval hints

Stage 3

Dual-Path Ingestion

Structured data (tables, spreadsheets) stored for precise queries | Unstructured content (documents, images) indexed for semantic search
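The dual-path split can be sketched as a routing decision per file. This sketch assumes routing by file extension, which is a simplification: the text only says format detection is automatic, and the extension set and function names below are hypothetical.

```python
from pathlib import PurePath

# Assumption: CSV and Excel files take the structured path; everything else
# (documents, images, code files) is indexed as unstructured content.
STRUCTURED_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def ingestion_path(filename: str) -> str:
    suffix = PurePath(filename).suffix.lower()
    return "structured" if suffix in STRUCTURED_EXTENSIONS else "unstructured"


def route_files(filenames: list) -> dict:
    """Group file names by the ingestion path they would take."""
    routed = {"structured": [], "unstructured": []}
    for name in filenames:
        routed[ingestion_path(name)].append(name)
    return routed
```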

Stage 4

Context Assembly

Multi-source retrieval merges semantic search, structured queries, and curated domain knowledge per section

Three Retrieval Sources

Semantic Search

Hybrid vector and keyword search with semantic reranking across documents, images (via OCR), and narrative content.
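The text does not say how the vector and keyword rankings are combined. Reciprocal rank fusion (RRF) is one common way to merge two ranked lists, so the sketch below uses it purely as an illustration, not as a description of Bennie's search service. The constant `k = 60` is the value conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Merge several ranked lists of document ids into one ranking."""
    scores: dict = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more score the higher it ranks in each list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# fused == ["b", "a", "d", "c"]
```

Documents that appear in both lists ("a" and "b") rise above those found by only one retriever.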

Structured Query

Direct queries for tabular data from CSV and Excel files. Preserves full row/column structure for precise numerical and record-level retrieval.

Curated Knowledge

Live retrieval from the domain knowledge service. Industry-specific procedures, methodologies, templates, and evaluation frameworks.
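A context bundle that merges the three sources while keeping citations traceable might look like the sketch below. The types `Passage` and `ContextBundle`, the source labels, and the `(reference, text)` hit format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Passage:
    source: str     # "semantic", "structured", or "curated"
    reference: str  # where the passage came from, e.g. a file and page
    text: str


@dataclass
class ContextBundle:
    section_id: str
    passages: list = field(default_factory=list)

    def citations(self) -> dict:
        """Map citation markers like "[1]" to their source references."""
        return {f"[{i}]": p.reference for i, p in enumerate(self.passages, start=1)}


def assemble_context(section_id: str, semantic: list, structured: list, curated: list) -> ContextBundle:
    bundle = ContextBundle(section_id)
    seen = set()
    labelled = (("semantic", semantic), ("structured", structured), ("curated", curated))
    for source, hits in labelled:
        for reference, text in hits:
            if (reference, text) in seen:
                continue  # drop exact duplicates returned by more than one source
            seen.add((reference, text))
            bundle.passages.append(Passage(source, reference, text))
    return bundle
```

Each passage keeps its source label, so a writer downstream can cite `[n]` and the marker still resolves to a concrete reference.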

05 / Curated Knowledge Network

The Domain Intelligence Layer

The curated knowledge network is Bennie's deepest competitive moat: a continuously maintained repository of externalized domain expertise, built by seasoned industry professionals.

Remotely Hosted & Maintained

Curated knowledge lives in a dedicated service -- independently versioned and continuously enriched by subject matter experts. Always accessed live, never duplicated.

Stage-Scoped Integration

Each workflow stage receives precisely the right curated content: planning guidance during enrichment, domain context during generation, and validation checklists during quality checks.

IP-Protected Access

Curated knowledge is gated by invocation context. Only accessible during deterministic system flow steps -- never exposed through ad-hoc interactions.
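Stage scoping and context gating can be sketched together as a lookup keyed by the invoking workflow step: known steps receive their own kind of content, and anything else is refused. The step names, content kinds, and exception below are hypothetical labels chosen to mirror the descriptions above.

```python
# Assumed mapping from deterministic workflow step to the content it may read.
STAGE_CONTENT = {
    "plan_enrichment": "planning_guidance",
    "section_generation": "domain_context",
    "validation": "validation_checklists",
}


class KnowledgeAccessDenied(Exception):
    """Raised when curated knowledge is requested outside a system flow step."""


def fetch_curated(invocation_step: str, knowledge: dict) -> list:
    kind = STAGE_CONTENT.get(invocation_step)
    if kind is None:
        # Ad-hoc interactions (for example free-form chat) are not in the map.
        raise KnowledgeAccessDenied(f"step '{invocation_step}' may not read curated knowledge")
    return list(knowledge.get(kind, []))
```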

Vertical Extensibility

New domain packs are authored by subject matter experts and deployed seamlessly. Each vertical brings its own procedures, templates, checklists, and specialized tools.

Enhancement, Not Dependency

The system works without curated knowledge, although the output is less refined. When the service is unavailable, agents proceed with user-uploaded data and general reasoning.
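This fallback behavior can be sketched as a guarded call to the knowledge service. The function name, the shape of the returned dictionary, and the choice of `ConnectionError` and `TimeoutError` as the failure signals are assumptions for illustration.

```python
from typing import Callable


def build_context(section: str, uploaded: list, fetch_curated: Callable) -> dict:
    """Assemble context for a section, degrading gracefully if curated knowledge is down."""
    try:
        curated = fetch_curated(section)
        degraded = False
    except (ConnectionError, TimeoutError):
        # Service unavailable: continue with user-uploaded data only.
        curated = []
        degraded = True
    return {
        "section": section,
        "passages": list(uploaded) + list(curated),
        "degraded": degraded,
    }
```

Carrying the `degraded` flag forward lets later steps or the user see that a section was produced without curated input.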

Embedded Tool Access

Beyond reference content, curated knowledge can include access paths to remote services -- test servers, calculators, and specialized analytical tools.

06 / Validation Engine

Multi-Layer Quality Assurance

Bennie does not generate and hope for the best. Every report goes through automated validation and self-correction before the user sees the output.

Section Generation

Each section is generated with intelligent context retrieval and specialized writing. Sections progress through a tracked lifecycle with full visibility.

Holistic Validation

Once all sections are drafted, the validator reviews the complete report as a whole, checking plan alignment, cross-section coherence, and internal consistency.

Auto-Regeneration

Flagged sections are automatically regenerated with targeted fix guidance from the validator. The system self-corrects without manual intervention.

Quality Scoring

Validation results are persisted as structured reports. Each issue includes section reference, description, and specific fix guidance for full traceability.

Validation Design

Non-flagged sections are immediately released as they pass validation. Flagged sections are regenerated in the background -- the user sees progressive results, not a single all-or-nothing delivery.
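The release-and-regenerate behavior can be sketched as a loop over validation rounds. Several details here are assumptions: `validate` returns a mapping of flagged section ids to fix guidance, `regenerate` returns a new draft, the `max_rounds` bound is invented for the sketch, and a section that has been released is not re-flagged later.

```python
from typing import Callable


def validate_and_release(sections: dict, validate: Callable, regenerate: Callable, max_rounds: int = 2):
    """Return (released sections, unresolved issues) after bounded self-correction."""
    drafts = dict(sections)
    released: dict = {}
    issues = validate(drafts)  # holistic pass over the whole report
    rounds = 0
    while True:
        # Progressive release: anything not flagged becomes visible right away.
        for section_id, text in drafts.items():
            if section_id not in issues and section_id not in released:
                released[section_id] = text
        if not issues or rounds == max_rounds:
            return released, issues
        # Regenerate only the flagged sections, using the validator's guidance.
        for section_id, guidance in issues.items():
            drafts[section_id] = regenerate(section_id, drafts[section_id], guidance)
        rounds += 1
        issues = {s: g for s, g in validate(drafts).items() if s not in released}
```

Anything still flagged after the last round is returned as unresolved rather than silently released.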

07 / Enterprise Infrastructure

Cloud-Native, Built for Regulated Industries

Every infrastructure decision follows one principle: leverage fully managed cloud services for maximum reliability, compliance coverage, and operational simplicity.

Compute (Managed)

API and data services run on fully managed, auto-scaling container infrastructure with built-in observability. No cluster management overhead.

Document Store (NoSQL)

All workspace data, plans, sections, file catalogs, and workflow state stored in a managed document database, partitioned by workspace for isolation.

File Storage (Object Store)

Source documents, templates, and exports stored in cloud object storage with workspace-scoped isolation and access controls.

Search & Retrieval (AI Search)

Vector retrieval with semantic reranking, integrated OCR pipeline for image-embedded content, and hybrid keyword + vector search.

AI Foundation (Agent Platform)

All agents hosted on a managed AI platform with built-in governance. Foundation model agnostic -- swap or upgrade models without touching orchestration logic.
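Model agnosticism is often achieved by having orchestration code depend on a narrow interface rather than on a vendor SDK. The sketch below assumes a hypothetical `TextModel` protocol with a single `complete` method. The two stub classes stand in for different foundation models and do not call any real service.

```python
from typing import Protocol


class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...


def generate_section(model: TextModel, title: str, context: list) -> str:
    # Orchestration logic sees only the protocol, never a concrete model.
    prompt = f"Write the section '{title}' using only:\n" + "\n".join(context)
    return model.complete(prompt)


class StubModelA:
    def complete(self, prompt: str) -> str:
        return "A: " + prompt.splitlines()[0]


class StubModelB:
    def complete(self, prompt: str) -> str:
        return "B: " + prompt.splitlines()[0]
```

Swapping `StubModelA()` for `StubModelB()` changes the output but leaves `generate_section` untouched.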

Workspace Isolation (Security)

Every workspace is a self-contained unit -- data, state, and agent context are scoped by workspace. No cross-workspace data leakage by design.
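Scoping by workspace can be sketched with a store where the workspace id acts as a partition key on every read and write. The in-memory class below is a hypothetical stand-in for a managed document database, not a real client library.

```python
import copy
from typing import Optional


class WorkspaceStore:
    """In-memory stand-in for a document store partitioned by workspace id."""

    def __init__(self) -> None:
        self._partitions: dict = {}

    def put(self, workspace_id: str, doc_id: str, doc: dict) -> None:
        # Every write lands in exactly one workspace partition.
        self._partitions.setdefault(workspace_id, {})[doc_id] = copy.deepcopy(doc)

    def get(self, workspace_id: str, doc_id: str) -> Optional[dict]:
        # Reads never look outside the caller's partition.
        doc = self._partitions.get(workspace_id, {}).get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def list_ids(self, workspace_id: str) -> list:
        return sorted(self._partitions.get(workspace_id, {}))
```

Because no method accepts a document id without a workspace id, a query cannot span workspaces by accident.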

08 / Security & Compliance

Enterprise-Grade Security by Design

Built for regulated industries where data confidentiality, audit defensibility, and operational security are non-negotiable.

Data Isolation

Every project operates in a strictly isolated workspace. Document uploads, generated content, and workflow state are scoped to individual workspaces with no cross-project data leakage.

Encryption

All data encrypted at rest (AES-256) and in transit (TLS 1.2+). Enterprise-grade cloud infrastructure with SOC 2 Type II compliance. No customer data used for model training.

Audit Trail

Complete audit trail of every operation: plan creation, data ingestion, content generation, validation, and approval. Every claim traced to source with page-level citation provenance.
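An audit record with page-level provenance might be modeled as an immutable event that carries its citations. The field names and the `AuditTrail` class are hypothetical, and this in-memory list only illustrates the append-only shape, not durable storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Citation:
    file: str
    page: int


@dataclass(frozen=True)
class AuditEvent:
    workspace_id: str
    operation: str  # e.g. "plan_created", "section_generated", "approved"
    actor: str
    timestamp: str
    citations: tuple = ()


class AuditTrail:
    def __init__(self) -> None:
        self._events: list = []

    def record(self, workspace_id: str, operation: str, actor: str, citations=()) -> AuditEvent:
        # Events are frozen and only ever appended, never edited in place.
        stamp = datetime.now(timezone.utc).isoformat()
        event = AuditEvent(workspace_id, operation, actor, stamp, tuple(citations))
        self._events.append(event)
        return event

    def events(self, workspace_id: str) -> tuple:
        return tuple(e for e in self._events if e.workspace_id == workspace_id)
```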

Regulatory Alignment

Architecture designed to support SR 11-7, TRIM, SS1/23 and other regulatory frameworks. Human approval gates ensure regulatory defensibility at every decision point.

Access Control

Role-based access control, managed identity authentication, and network isolation. Workspace-level permissions ensure only authorized users access project data.
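Workspace-level role checks can be sketched as a lookup from (user, workspace) to a role, and from role to permitted actions. The role names, actions, and membership format are assumptions for illustration.

```python
# Hypothetical roles and the actions each one permits.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "edit"},
    "approver": {"read", "edit", "approve"},
}


def is_allowed(memberships: dict, user: str, workspace_id: str, action: str) -> bool:
    """Return True only if the user holds a role in this workspace that permits the action."""
    role = memberships.get((user, workspace_id))
    if role is None:
        return False  # no membership in this workspace means no access at all
    return action in ROLE_PERMISSIONS.get(role, set())
```

Keying memberships by workspace means a role granted in one project confers nothing in another.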

Reproducibility

Deterministic orchestration ensures every workflow is reproducible and auditable. Full version history maintained for plans, sections, and final deliverables.

See Bennie in Action

From raw data to validated, enterprise-grade reports -- in hours instead of weeks.
