A high-level overview of the systems, agents, and data infrastructure that power intelligent document generation for regulated industries.
Bennie is built as a five-layer system. Each layer has a single responsibility. Data flows top-down through deterministic orchestration, with specialized AI agents providing intelligent reasoning at every stage.
Every workspace follows a deterministic three-loop lifecycle with human approval gates at every critical transition. The workflow is rigid in sequence but flexible within each loop.
Upload task requirements and a template. Bennie analyzes both and produces a structured, section-level execution plan with data requirements, quality criteria, and deliverables.
Upload source materials -- PDFs, spreadsheets, images, code files. Bennie classifies, enriches, and indexes everything, then checks data coverage against the plan.
Bennie generates each section with full context retrieval, validates the complete report for cross-section consistency, and auto-corrects flagged issues before user review.
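The three-loop lifecycle with approval gates can be pictured as a small state machine. The sketch below is illustrative only: the loop names (`PLAN`, `DATA`, `REPORT`) and the `Workspace` class are assumptions made for this example, not Bennie's actual identifiers.

```python
from enum import Enum


class Loop(Enum):
    # Hypothetical names for the three loops described above.
    PLAN = 1
    DATA = 2
    REPORT = 3


class Workspace:
    """Hypothetical workspace that moves through the loops in a fixed order."""

    def __init__(self) -> None:
        self.loop = Loop.PLAN
        self.approved: set[Loop] = set()

    def approve(self, loop: Loop) -> None:
        # Human approval gate: only the loop currently in progress can be approved.
        if loop is not self.loop:
            raise ValueError(f"cannot approve {loop.name} while in {self.loop.name}")
        self.approved.add(loop)

    def advance(self) -> None:
        # The sequence is rigid: no transition without approval, no skipping.
        if self.loop not in self.approved:
            raise PermissionError(f"{self.loop.name} loop has not been approved")
        if self.loop is Loop.REPORT:
            raise ValueError("already in the final loop")
        self.loop = Loop(self.loop.value + 1)
```

Work inside a loop can be repeated freely in this model; only the transition between loops is gated.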
Each agent has a single responsibility and scoped tool access. Agents never call other agents directly -- the orchestration layer coordinates all inter-agent workflows.
The sole orchestrator for all chat interactions. Reasons about user intent, dispatches worker agents, and synthesizes cohesive responses.
Generates structured execution plans from task requirements and templates. Extracts section hierarchy, determines content depth, and identifies data needs.
Enriches plan sections with curated domain knowledge, generating detailed instructions, quality criteria, and deliverables for each section.
Enriches file metadata with intelligent classification: content descriptions, topic extraction, and section mapping for downstream processing.
Compares the file catalog against plan requirements. Produces per-section coverage verdicts, identifies gaps, and recommends specific uploads.
Performs multi-source retrieval to assemble a rich context bundle with citation tracking for each section of the report.
Produces section content with inline citations. It works only from the context bundle pre-assembled by the retrieval specialist and performs no retrieval of its own.
Performs holistic validation across all sections. Checks plan alignment, detects cross-section inconsistencies, and emits targeted fix guidance.
Manages optional test execution workflows. Prepares test scripts, configures parameters, and re-ingests results into the data pipeline.
"Agents reason. The API orchestrates." -- No agent has direct dependencies on another agent. The orchestration layer controls all state transitions and inter-agent coordination.
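One way to read "Agents reason. The API orchestrates." is that each agent is a function from state to a partial result, and only the orchestrator decides who runs next. The following is a minimal sketch under that assumption; `planner`, `enricher`, and `Orchestrator` are hypothetical names, and the agent bodies are placeholders for real model calls.

```python
from typing import Callable, Dict, List

# An agent takes the current state and returns only the fields it produced.
Agent = Callable[[dict], dict]


def planner(state: dict) -> dict:
    # Placeholder for the planning agent: one section per task.
    return {"plan": [f"Section for: {state['task']}"]}


def enricher(state: dict) -> dict:
    # Placeholder for the enrichment agent: annotates each planned section.
    return {"plan": [section + " (enriched)" for section in state["plan"]]}


class Orchestrator:
    """Owns all state transitions; agents never hold references to each other."""

    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents

    def run(self, steps: List[str], state: dict) -> dict:
        for name in steps:
            # Each agent receives a shallow copy of the state and returns a result.
            result = self.agents[name](dict(state))
            # Only the orchestrator merges results into the shared state.
            state = {**state, **result}
        return state
```

Because the step order lives in the orchestrator rather than in the agents, an agent can be swapped or reordered without touching the others.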
Every piece of source data flows through a multi-stage ingestion pipeline. At retrieval time, the context engine draws from three independent sources to assemble precisely the right context for each section.
Files are uploaded and registered in the file catalog with automatic format detection.
Content descriptions, topics, section mappings, and retrieval hints are extracted from each file.
Structured data (tables, spreadsheets) is stored for precise queries; unstructured content (documents, images) is indexed for semantic search.
Multi-source retrieval merges semantic search, structured queries, and curated domain knowledge per section.
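The split between structured and unstructured storage in the pipeline above could be driven by the detected file format. This sketch assumes detection by file extension and uses hypothetical destination names (`structured_store`, `semantic_index`); the real pipeline may classify files differently.

```python
from pathlib import Path

# Assumption: CSV and Excel extensions count as structured data.
STRUCTURED_SUFFIXES = {".csv", ".xlsx", ".xls"}


def route(filename: str) -> str:
    """Return the (hypothetical) destination for an uploaded file."""
    suffix = Path(filename).suffix.lower()
    if suffix in STRUCTURED_SUFFIXES:
        return "structured_store"  # kept queryable by row and column
    return "semantic_index"        # chunked and indexed for semantic search
```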
Hybrid vector and keyword search with semantic reranking across documents, images (via OCR), and narrative content.
Direct queries for tabular data from CSV and Excel files. Preserves full row/column structure for precise numerical and record-level retrieval.
Live retrieval from the domain knowledge service. Industry-specific procedures, methodologies, templates, and evaluation frameworks.
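Drawing on three independent sources per section might look like the sketch below. The `Snippet` type, the `assemble_context` function, and the shape of the retriever callables are assumptions for illustration; the point is that every snippet keeps its source and citation reference as it enters the bundle.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Snippet:
    source: str  # which retriever produced it, e.g. "semantic" or "structured"
    ref: str     # citation reference, e.g. a file name and page or row
    text: str


# A retriever maps a section title to (reference, text) pairs.
Retriever = Callable[[str], Iterable[Tuple[str, str]]]


def assemble_context(section: str, retrievers: Dict[str, Retriever]) -> List[Snippet]:
    """Query every retriever for one section and merge results into one bundle."""
    bundle: List[Snippet] = []
    for name, retrieve in retrievers.items():
        for ref, text in retrieve(section):
            bundle.append(Snippet(source=name, ref=ref, text=text))
    return bundle
```

Keeping `ref` on every snippet is what lets a downstream writer attach inline citations without performing any retrieval itself.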
Bennie's deepest competitive moat. A continuously maintained repository of externalized domain expertise built by seasoned industry professionals.
Curated knowledge lives in a dedicated service -- independently versioned and continuously enriched by subject matter experts. Always accessed live, never duplicated.
Each workflow stage receives precisely the right curated content: planning guidance during enrichment, domain context during generation, and validation checklists during quality checks.
Curated knowledge is gated by invocation context. Only accessible during deterministic system flow steps -- never exposed through ad-hoc interactions.
New domain packs are authored by subject matter experts and deployed seamlessly. Each vertical brings its own procedures, templates, checklists, and specialized tools.
The system still works without curated knowledge, though it produces less refined output. When the service is unavailable, agents proceed with user-uploaded data and general reasoning.
Beyond reference content, curated knowledge can include access paths to remote services -- test servers, calculators, and specialized analytical tools.
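Two of the properties above, gating by invocation context and graceful degradation, can be combined in a single access function. This is a sketch under stated assumptions: the step names in `SYSTEM_STEPS`, the `fetch_guidance` function, and the use of `ConnectionError` to signal an unavailable service are all hypothetical.

```python
from typing import Callable, Optional

# Assumption: these are the deterministic flow steps allowed to read curated knowledge.
SYSTEM_STEPS = {"enrichment", "generation", "quality_check"}


def fetch_guidance(step: str, fetch: Callable[[str], str]) -> Optional[str]:
    """Return curated guidance for a system step, or None if gated or unavailable."""
    if step not in SYSTEM_STEPS:
        # Ad-hoc interactions (for example, chat) never reach the knowledge service.
        return None
    try:
        return fetch(step)
    except ConnectionError:
        # Service unavailable: the caller proceeds with uploaded data only.
        return None
```

A caller treats `None` as "continue without curated content", so the same code path covers both a gated request and an outage.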
Bennie does not generate and hope for the best. Every report goes through automated validation and self-correction before the user sees the output.
Each section is generated with intelligent context retrieval and specialized writing. Sections progress through a tracked lifecycle with full visibility.
Once all sections are drafted, the validator reviews the complete report as a whole. It checks plan alignment, cross-section coherence, and internal consistency.
Flagged sections are automatically regenerated with targeted fix guidance from the validator. The system self-corrects without manual intervention.
Validation results are persisted as structured reports. Each issue includes section reference, description, and specific fix guidance for full traceability.
Sections that pass validation are released immediately. Flagged sections are regenerated in the background -- the user sees progressive results, not a single all-or-nothing delivery.
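The validate, release, and regenerate flow can be sketched as a generator that yields sections as they become ready. The `Issue` fields mirror the ones named above (section reference, description, fix guidance); the `deliver` function and the `regenerate` callable are hypothetical, and this simplified version regenerates sequentially rather than in the background.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple


@dataclass
class Issue:
    section: str       # section reference
    description: str   # what the validator found
    fix_guidance: str  # targeted instruction for regeneration


def deliver(
    sections: Dict[str, str],
    issues: List[Issue],
    regenerate: Callable[[str, List[str]], str],
) -> Iterator[Tuple[str, str]]:
    """Yield clean sections first, then regenerated versions of flagged ones."""
    # Group fix guidance by section so each flagged section is regenerated once.
    guidance: Dict[str, List[str]] = {}
    for issue in issues:
        guidance.setdefault(issue.section, []).append(issue.fix_guidance)

    # Sections without issues are released right away.
    for name, text in sections.items():
        if name not in guidance:
            yield name, text

    # Flagged sections follow once they have been regenerated.
    for name, fixes in guidance.items():
        yield name, regenerate(sections[name], fixes)
```

Whether regenerated sections pass through validation again is not specified here, so the sketch leaves that step out.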
Every infrastructure decision follows one principle: leverage fully managed cloud services for maximum reliability, compliance coverage, and operational simplicity.
API and data services run on fully managed, auto-scaling container infrastructure with built-in observability. No cluster management overhead.
All workspace data, plans, sections, file catalogs, and workflow state stored in a managed document database, partitioned by workspace for isolation.
Source documents, templates, and exports stored in cloud object storage with workspace-scoped isolation and access controls.
Vector retrieval with semantic reranking, integrated OCR pipeline for image-embedded content, and hybrid keyword + vector search.
All agents hosted on a managed AI platform with built-in governance. Foundation model agnostic -- swap or upgrade models without touching orchestration logic.
Every workspace is a self-contained unit -- data, state, and agent context are scoped by workspace. No cross-workspace data leakage by design.
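Scoping all data by workspace amounts to making the workspace identifier part of every read and write. The in-memory `WorkspaceStore` below is a hypothetical stand-in for a managed document database partitioned by workspace; it only illustrates why a lookup in one workspace cannot return another workspace's data.

```python
from typing import Any, Dict, Optional


class WorkspaceStore:
    """Toy store in which every operation requires a workspace ID."""

    def __init__(self) -> None:
        # One isolated partition per workspace ID.
        self._partitions: Dict[str, Dict[str, Any]] = {}

    def put(self, workspace_id: str, key: str, value: Any) -> None:
        self._partitions.setdefault(workspace_id, {})[key] = value

    def get(self, workspace_id: str, key: str) -> Optional[Any]:
        # Lookups never leave the caller's own partition.
        return self._partitions.get(workspace_id, {}).get(key)
```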
Built for regulated industries where data confidentiality, audit defensibility, and operational security are non-negotiable.
Every project operates in a strictly isolated workspace. Document uploads, generated content, and workflow state are scoped to individual workspaces with no cross-project data leakage.
All data encrypted at rest (AES-256) and in transit (TLS 1.2+). Enterprise-grade cloud infrastructure with SOC 2 Type II compliance. No customer data used for model training.
Complete audit trail of every operation: plan creation, data ingestion, content generation, validation, and approval. Every claim traced to source with page-level citation provenance.
Architecture designed to support SR 11-7, TRIM, SS1/23 and other regulatory frameworks. Human approval gates ensure regulatory defensibility at every decision point.
Role-based access control, managed identity authentication, and network isolation. Workspace-level permissions ensure only authorized users access project data.
Deterministic orchestration ensures every workflow is reproducible and auditable. Full version history maintained for plans, sections, and final deliverables.
From raw data to validated, enterprise-grade reports -- in hours instead of weeks.