Technology Architecture: Intelligine Private AI Operating System

PATENT GRANTED · AGENTIC ANONYMIZING PROXY

An agentic anonymizing proxy that modifies the customer query rather than redacting it.

Public large language models depend on contextual content to produce high-quality answers, and the conventional approach of redacting sensitive content from a prompt before transmission removes precisely the context that the public model relies on for accuracy. The Intelligine proxy operates as an autonomous agent that interprets the inbound customer query, substitutes sensitive entities with semantically plausible generic placeholders, transmits the modified prompt to the chosen public model, and then re-binds the public response to the original customer entities inside the customer firewall. As a result, the public model never receives any identifying information and the customer user never receives a degraded answer.

REQUEST LIFECYCLE · ANNOTATED

SHEET T-01 · REV 4.1

STEP 01
● INSIDE

The customer query is received by the proxy at the perimeter

The inbound prompt is received by the Intelligine proxy with the authenticated customer identity already attached, the tenant scope resolved against customer directory services, and the applicable routing policy evaluated against the user, the data sensitivity, and the destination model.

“Compare risk exposure for Westbridge Capital Bank against Coltrane Holdings for fiscal year 2024.”

STEP 02
● INSIDE

Named entity extraction and sensitivity classification

The proxy identifies the named entities, regulated terms, and confidential references that appear in the customer prompt, and each identified item is classified into a sensitivity tier and then matched against the substitution policy that the customer has configured for items at that tier.

Westbridge Capital Bank → ENTITY · TIER-1 · BANKING
Coltrane Holdings → ENTITY · TIER-2 · PRIVATE-FIRM
Fiscal year 2024 → DATE · GENERIC-OK
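
The classification step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Intelligine implementation: the lookup table stands in for a real named entity recognizer, and the names `KNOWN_ENTITIES`, `SUBSTITUTION_POLICY`, and `classify` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lookup table standing in for a real named entity recognizer.
KNOWN_ENTITIES = {
    "Westbridge Capital Bank": ("ENTITY", "TIER-1", "BANKING"),
    "Coltrane Holdings": ("ENTITY", "TIER-2", "PRIVATE-FIRM"),
}

# Hypothetical customer-configured substitution policy, keyed by sensitivity tier.
SUBSTITUTION_POLICY = {"TIER-1": "substitute", "TIER-2": "substitute"}


@dataclass(frozen=True)
class ClassifiedEntity:
    text: str
    kind: str
    tier: str
    category: str
    action: str


def classify(prompt: str) -> list[ClassifiedEntity]:
    """Return every known entity found in the prompt with its tier and policy action."""
    found = []
    for text, (kind, tier, category) in KNOWN_ENTITIES.items():
        if text in prompt:
            # Tiers with no configured policy fall back to pass-through.
            action = SUBSTITUTION_POLICY.get(tier, "pass-through")
            found.append(ClassifiedEntity(text, kind, tier, category, action))
    return found


prompt = (
    "Compare risk exposure for Westbridge Capital Bank "
    "against Coltrane Holdings for fiscal year 2024."
)
entities = classify(prompt)
for entity in entities:
    print(entity.text, "->", entity.tier, entity.category, entity.action)
```

Keeping the policy keyed by tier rather than by entity means a newly detected entity inherits the correct action as soon as it is assigned a tier.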

STEP 03
○ OUTBOUND

The query is rewritten in flight rather than redacted

Sensitive entities are substituted with semantically equivalent placeholder phrases that have been chosen to preserve answer quality on the public model side, while the original entity mapping is held inside the customer perimeter for use during the inbound response phase.

“Compare risk exposure for a tier-one commercial bank against a private holdings firm for a recent fiscal year.”
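
A minimal sketch of the rewrite step, assuming the placeholder phrases have already been chosen for each entity. The function name `rewrite` and the `PLACEHOLDERS` table are hypothetical; a production rewriter would need to handle casing, inflection, and overlapping spans rather than plain string replacement.

```python
def rewrite(prompt: str, placeholders: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Replace each sensitive phrase and return the rewritten prompt plus the reverse map."""
    reverse_map: dict[str, str] = {}
    # Longest phrases first, so a short entity never clobbers part of a longer one.
    for original in sorted(placeholders, key=len, reverse=True):
        placeholder = placeholders[original]
        if original in prompt:
            prompt = prompt.replace(original, placeholder)
            # The reverse map stays inside the perimeter for the inbound phase.
            reverse_map[placeholder] = original
    return prompt, reverse_map


# Hypothetical placeholder choices matching the example in the text.
PLACEHOLDERS = {
    "Westbridge Capital Bank": "a tier-one commercial bank",
    "Coltrane Holdings": "a private holdings firm",
    "fiscal year 2024": "a recent fiscal year",
}

original = (
    "Compare risk exposure for Westbridge Capital Bank "
    "against Coltrane Holdings for fiscal year 2024."
)
outbound, reverse_map = rewrite(original, PLACEHOLDERS)
print(outbound)
```

Only `outbound` would leave the perimeter in this sketch; `reverse_map` is the locally held mapping that the inbound phase relies on.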

STEP 04
○ OUTBOUND

The selected public model answers without receiving any identifying information

The selected public large language model produces a response to the rewritten query, and at no point during this round trip does the public model receive the actual customer entities, the customer tenant identifier, or the identity of the customer user that originated the request.

The public model returns a 1,400-token analysis covering comparative risk methodologies, capital ratios, and exposure profiles.

STEP 05
● INSIDE

The response is re-bound to the original customer entities inside the perimeter

The proxy applies the inverse of the substitution map that was held locally during the outbound phase, validates the resulting answer against the customer private knowledge base for factual consistency, and then returns the fully bound response to the customer user with citations attached to each substantive claim.

“Westbridge Capital Bank shows an 18.3 percent Tier-1 capital ratio against Coltrane Holdings at 12.1 percent as disclosed in fiscal year 2024…”
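
The inverse substitution can be sketched as below. This is an illustration only: the function name `rebind`, the reverse map, and the sample public response are all invented for the example, and the sketch assumes the public model echoes each placeholder phrase verbatim, which a real system could not rely on without additional matching logic. The knowledge base validation and citation steps are not shown.

```python
def rebind(response: str, reverse_map: dict[str, str]) -> str:
    """Apply the locally held reverse map to a public model response."""
    # Longest placeholders first, so overlapping phrases resolve predictably.
    for placeholder in sorted(reverse_map, key=len, reverse=True):
        response = response.replace(placeholder, reverse_map[placeholder])
    return response


# Hypothetical reverse map held inside the perimeter during the outbound phase.
reverse_map = {
    "the tier-one commercial bank": "Westbridge Capital Bank",
    "the private holdings firm": "Coltrane Holdings",
}

# Invented response text, used only to exercise the function.
public_response = (
    "Capital ratios for the tier-one commercial bank should be compared "
    "with those of the private holdings firm."
)
bound = rebind(public_response, reverse_map)
print(bound)
```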

STEP 06
● INSIDE

Token-level audit log written to the customer security information and event management platform

The original prompt, the modified outbound prompt, the public model that was selected for the request, the mapped entity table, the returned response, the round-trip latency, the per-request cost, and the policy decisions that influenced routing are all written to the audit log and streamed in real time to the customer security information and event management platform of choice.
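
One way to picture the audit record is as a single structured line per request. The sketch below is an assumption about shape only: the field names, the `AuditRecord` class, and the sample values are hypothetical, and the actual transport to the event management platform is not shown.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    # One field per item listed in the text; all names are illustrative.
    original_prompt: str
    outbound_prompt: str
    selected_model: str
    entity_map: dict
    response: str
    latency_ms: float
    cost_usd: float
    policy_decisions: list


def to_log_line(record: AuditRecord) -> str:
    """Serialize the record as one JSON line, with sorted keys for stable diffs."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative values only; none of these numbers come from a real request.
record = AuditRecord(
    original_prompt="Compare risk exposure for Westbridge Capital Bank ...",
    outbound_prompt="Compare risk exposure for a tier-one commercial bank ...",
    selected_model="example-public-model",
    entity_map={"a tier-one commercial bank": "Westbridge Capital Bank"},
    response="(bound response text)",
    latency_ms=62.0,
    cost_usd=0.012,
    policy_decisions=["TIER-1 entity: substitute"],
)
line = to_log_line(record)
print(line)
```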

ANSWER FAITHFULNESS

+34%

Improvement in answer faithfulness measured against a blunt redaction baseline on the standard internal evaluation suite.

PERSONALLY IDENTIFYING INFORMATION LEAKAGE

Sensitive entities never traverse the customer perimeter to a public model under any combination of routing decisions.

LATENCY OVERHEAD INTRODUCED

< 80 ms

Median round-trip latency added by the combined rewriting and entity re-binding pipeline at production load.

ORCHESTRATION · MULTI-MODEL ROUTING

A single orchestrator that selects between the customer private model and more than 40 public models on a per-request basis.

The customer private model sits at the center of the orchestration layer and handles every request for which it is the most appropriate model. For tasks on which a public model offers measurably better quality, lower cost, or more suitable latency, the orchestrator selects from a pool of more than 40 public large language models, and it records the rationale for each routing decision so that the customer audit team can review the choice after the fact. No engineering organization in the regulated enterprise market should be contractually locked to a single model in the current state of the field.

TASK DESCRIPTION	SELECTED MODEL	SELECTION RATIONALE	CONFIDENCE / FAITHFULNESS	COST PER REQUEST
Summarization of an internal credit memo	CUSTOMER PRIVATE MODEL	The content is commercially sensitive, the terminology is in-domain, and the local round-trip latency is significantly lower than any external alternative.	0.96	$0.001
Translation of a regulatory note from English into Japanese	GPT-4O	The selected public model produces the strongest multilingual faithfulness scores on this particular language pair across the customer evaluation suite.	0.94	$0.012
Long-context legal review across 240 pages	CLAUDE 3.5	The selected public model demonstrates the highest long-context recall scores on the customer legal review evaluation set.	0.91	$0.18
Bulk first-pass document classification at scale	LLAMA 3.1 70B	The selected public model is the lowest cost option that consistently meets the customer-defined quality threshold of 0.85 on the classification task.	0.87	$0.0004
Clinical summarization involving protected health information	CUSTOMER PRIVATE MODEL	Customer policy prohibits any egress of protected health information to external models, and the orchestrator therefore forces local routing for every request of this category.	0.98	$0.001
Code review against an internal customer repository	DEEPSEEK V3	The selected public model produces the highest score on the customer code review evaluation at approximately half the cost of the next best option.	0.90	$0.003
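
Two of the rationales in the table, the lowest cost option above a quality threshold and forced local routing when egress is prohibited, can be sketched as a small selection function. The candidate names and scores below are hypothetical and do not describe any real model; `select_model` is an illustrative name, not a documented interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    is_private: bool
    quality: float   # score on the customer evaluation suite, 0 to 1
    cost_usd: float  # cost per request


def select_model(candidates, quality_threshold, egress_prohibited):
    """Pick the cheapest candidate that satisfies policy and the quality threshold."""
    if egress_prohibited:
        # Policy forces local routing: only private models are eligible.
        pool = [c for c in candidates if c.is_private]
    else:
        pool = [c for c in candidates if c.quality >= quality_threshold]
    if not pool:
        raise ValueError("no candidate satisfies the routing policy")
    return min(pool, key=lambda c: c.cost_usd)


# Hypothetical candidates with invented scores and prices.
candidates = [
    Candidate("private", True, 0.90, 0.001),
    Candidate("public-a", False, 0.87, 0.0004),
    Candidate("public-b", False, 0.80, 0.0001),
]

print(select_model(candidates, 0.85, egress_prohibited=False).name)
print(select_model(candidates, 0.85, egress_prohibited=True).name)
```

In this sketch the cheapest model overall, `public-b`, is never chosen because it falls below the 0.85 threshold, which mirrors the classification row in the table.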

ROUTING SIGNALS

6 independent dimensions inform every routing decision.

The orchestrator considers per-request cost, expected round-trip latency, faithfulness against the customer evaluation suite, the sensitivity classification of the input data, the regulatory jurisdiction of the requesting user, and the real-time availability of each candidate model provider.
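
The six signals can be combined in many ways. The sketch below assumes one plausible arrangement, which is not confirmed by the text: sensitivity, jurisdiction, and availability act as hard gates, while cost, latency, and faithfulness feed a weighted score. The weights, class names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    cost_usd: float
    latency_ms: float
    faithfulness: float
    sensitivity_allowed: bool   # input sensitivity tier permits this model
    jurisdiction_allowed: bool  # requesting user's jurisdiction permits this model
    available: bool             # provider is currently reachable


# Hypothetical weights; a real deployment would tune these per customer.
WEIGHTS = {"faithfulness": 1.0, "cost": 10.0, "latency": 0.0001}


def score(s: Signals) -> Optional[float]:
    """Return a weighted score, or None when a hard gate excludes the model."""
    if not (s.sensitivity_allowed and s.jurisdiction_allowed and s.available):
        return None
    return (
        WEIGHTS["faithfulness"] * s.faithfulness
        - WEIGHTS["cost"] * s.cost_usd
        - WEIGHTS["latency"] * s.latency_ms
    )


def best(candidates: dict[str, Signals]) -> str:
    """Name of the highest-scoring eligible model; raises ValueError if none is eligible."""
    eligible = {n: v for n, v in ((n, score(s)) for n, s in candidates.items()) if v is not None}
    return max(eligible, key=eligible.get)


# Invented signal values for three hypothetical models.
candidates = {
    "model-a": Signals(0.012, 900, 0.94, True, True, True),
    "model-b": Signals(0.001, 200, 0.90, True, True, True),
    "model-c": Signals(0.0004, 150, 0.99, True, False, True),
}
print(best(candidates))
```

Treating the compliance signals as gates rather than weights means no amount of quality or cost advantage can route a request to a model that policy excludes.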

CONTINUOUS EVALUATION

Public models age, and the customer routing weights are updated to reflect that.

The customer evaluation suite runs every night against every public model that is wired into the orchestrator, and the routing weights are then updated automatically based on the latest measurements, so that the customer procurement team is not required to re-run a vendor selection process whenever a new model is released.
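
The text does not specify how the routing weights are updated after each nightly run. One common approach, shown here purely as an assumption, is an exponential moving average that blends the previous weight with the latest measurement so that a single noisy run does not swing routing. The function name, the smoothing factor, and the scores are hypothetical.

```python
def update_weights(weights: dict[str, float],
                   latest_scores: dict[str, float],
                   alpha: float = 0.3) -> dict[str, float]:
    """Blend each model's previous weight with its latest evaluation score."""
    updated = {}
    for model, latest in latest_scores.items():
        # A model seen for the first time starts at its first measured score.
        previous = weights.get(model, latest)
        updated[model] = (1 - alpha) * previous + alpha * latest
    return updated


# Invented numbers: model-a regressed overnight, model-b is newly wired in.
weights = {"model-a": 0.90}
latest_scores = {"model-a": 0.80, "model-b": 0.85}
weights = update_weights(weights, latest_scores)
print(weights)
```

A model that disappears from `latest_scores` is dropped from the returned weights in this sketch, which is one way to stop routing to a model that is no longer being evaluated.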

EXPLAINABILITY

Every routing choice is recorded against the request.

For every request the orchestrator records which model was selected, why that model was selected over the other candidates, and which alternative models were considered alongside their respective scores, with the full record exportable on demand to a regulator or to an internal audit team.
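
A routing record of this kind could be represented as a small structure that serializes to JSON for export. The field names, the `RoutingDecision` class, and the sample values are hypothetical and only illustrate the three items the text lists: the selected model, the rationale, and the scored alternatives.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RoutingDecision:
    request_id: str
    selected_model: str
    rationale: str
    # Alternative model name -> score at decision time.
    alternatives: dict = field(default_factory=dict)


def export(decision: RoutingDecision) -> str:
    """Render the decision as indented JSON suitable for an audit export."""
    return json.dumps(asdict(decision), indent=2, sort_keys=True)


# Invented example record.
decision = RoutingDecision(
    request_id="req-0001",
    selected_model="model-a",
    rationale="highest faithfulness score among eligible candidates",
    alternatives={"model-b": 0.81, "model-c": 0.77},
)
exported = export(decision)
print(exported)
```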

OWNED · MODEL WEIGHTS HANDED OVER TO THE CUSTOMER

The customer owns the trained model weights, not an application programming interface key, and not a software-as-a-service seat.

Within 30 calendar days of the engagement kick-off, Intelligine delivers a foundation model that has been trained on customer proprietary data. The training itself is executed on graphics processing units that sit inside the customer cloud account, and the resulting runtime is configured so that customer operations engineers can administer it without recourse to Intelligine support staff. There is no recurring software-as-a-service subscription, there is no application programming interface dependency back to Intelligine infrastructure, and the contract contains no provision under which the underlying service can be modified or withdrawn at vendor discretion.

THE TYPICAL ARTIFICIAL INTELLIGENCE VENDOR

The customer rents access to intelligence on a recurring basis while the vendor retains ownership of the underlying asset.

WEIGHTS: The model weights are owned by the vendor and are not visible to the customer at any point in the relationship.
DATA: Customer data is frequently retained by the vendor for indefinite periods and is sometimes used to inform subsequent vendor model training.
RUNTIME: The model runtime is delivered as a software-as-a-service offering and the customer operational uptime is therefore directly coupled to the vendor uptime.
PRICING: Commercial pricing is typically charged per token, per seat, or against a usage envelope that the customer cannot easily forecast in advance.
EXIT: A customer that ends the relationship retains no model assets that are independently usable outside the vendor environment.
EVOLUTION: The future direction of the platform is determined by the vendor product roadmap rather than by the customer technology strategy.

INTELLIGINE

The customer owns the asset, and Intelligine provides the documented operator runbook that accompanies it.

WEIGHTS: The trained model weights are the property of the customer and are formally delivered on the 30th calendar day of the engagement.
DATA: Customer data remains inside the customer virtual private cloud at all times and is never transmitted to Intelligine infrastructure.
RUNTIME: The model runtime is portable and operates inside any container infrastructure that the customer is already using internally.
PRICING: Commercial pricing is structured as an annual agreement with heavy, medium, and lightweight usage tiers that map to actual consumption patterns.
EXIT: A customer that ends the relationship retains the trained model weights, the deployment runtime, and the documented operating runbook.
EVOLUTION: The platform roadmap is co-engineered with customer architecture and is therefore aligned with the customer technology direction.

DELIVERY ARTIFACTS HANDED OVER ON DAY 30

The customer receives the trained model weights, the customer-specific tokenizer, the embedding model, the evaluation suite, the serving runtime, the monitoring stack, the operator runbook, and the documented escalation tree as a single integrated handover package.

EVOLUTION · INFRASTRUCTURE ROADMAP

Architected to age well across many model generations rather than to lock the customer into any single one.

Public large language model providers continue to release new model generations on a 6-month rolling cadence, and customer infrastructure should not be obliged to replatform on the same cadence in order to remain current. The Intelligine platform is therefore deliberately structured so that the orchestrator, the governance layer, the memory layer, and the anonymizing proxy are all expected to outlive any specific public model that is wired into them, and the customer private model can be re-trained or fully replaced on customer-driven cycles without requiring any rebuild of the surrounding infrastructure stack.

2023

Agentic anonymizing proxy

First granted patent covering agentic query modification at the customer perimeter.

2024

Multi-model orchestrator

Routing, evaluation, and explainability across more than 40 public large language models in production.

NOW · 2026

Customer-owned foundation model

Foundation model training executed against customer proprietary data, with the trained weights handed over to the customer within 30 calendar days.

2026 H2

Autonomous agents on customer rails

Long-running autonomous agents that operate against tool access bound by customer policy with full request traceability throughout the agent lifecycle.

2027+

Cross-tenant federated learning

Customers may opt in to share model gradients rather than underlying data, which compounds the value of every participating customer model while preserving full data isolation.

CHANGE CONTROL UNDER CUSTOMER GOVERNANCE

Every layer of the platform stack is version pinned to a specific release that the customer change control board has explicitly approved, every subsequent upgrade is reviewed by customer operations engineers before it is permitted to ship into the customer environment, and air-gapped customers receive cryptographically signed update bundles on physical installation media.

3 patented systems.
1 private AI operating system.
