An agentic anonymizing proxy that modifies the customer query rather than redacting it.
Public large language models depend on context to produce high-quality answers, and the conventional approach of redacting sensitive content from a prompt before transmission removes precisely the context that the public model relies on for accuracy. The Intelligine proxy operates as an autonomous agent that interprets the inbound customer query, substitutes sensitive entities with semantically plausible generic placeholders, transmits the modified prompt to the chosen public model, and then re-binds the public response to the original customer entities inside the customer firewall. As a result, the public model never receives any identifying information and the customer user never receives a degraded answer.
● INSIDE
The customer query is received by the proxy at the perimeter
The inbound prompt is received by the Intelligine proxy with the authenticated customer identity already attached, the tenant scope resolved against customer directory services, and the applicable routing policy evaluated against the user, the data sensitivity, and the destination model.
● INSIDE
Named entity extraction and sensitivity classification
The proxy identifies the named entities, regulated terms, and confidential references that appear in the customer prompt, and each identified item is classified into a sensitivity tier and then matched against the substitution policy that the customer has configured for items at that tier.
Coltrane Holdings → ENTITY · TIER-2 · PRIVATE-FIRM
Fiscal year 2024 → DATE · GENERIC-OK
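The classification step can be illustrated with a minimal sketch. The lookup table, the `ClassifiedItem` type, and the policy action names are assumptions made for illustration; a production extractor would presumably rely on a trained entity recognizer rather than a fixed dictionary.

```python
from dataclasses import dataclass

# Hypothetical per-tier substitution policy, as a customer might configure it.
SUBSTITUTION_POLICY = {"TIER-2": "substitute", "GENERIC-OK": "pass-through"}

# Hypothetical dictionary standing in for a real entity recognizer.
KNOWN_ITEMS = {
    "coltrane holdings": ("ENTITY", "TIER-2"),
    "fiscal year 2024": ("DATE", "GENERIC-OK"),
}


@dataclass(frozen=True)
class ClassifiedItem:
    text: str    # the phrase exactly as it appeared in the prompt
    kind: str    # e.g. ENTITY or DATE
    tier: str    # sensitivity tier
    action: str  # policy action configured for that tier


def classify(prompt: str) -> list[ClassifiedItem]:
    """Find known phrases (case-insensitive) and attach the tier policy."""
    lowered = prompt.lower()
    items = []
    for phrase, (kind, tier) in KNOWN_ITEMS.items():
        start = lowered.find(phrase)
        if start >= 0:
            original = prompt[start:start + len(phrase)]
            items.append(
                ClassifiedItem(original, kind, tier, SUBSTITUTION_POLICY[tier])
            )
    return items
```

Only items whose action is `substitute` would move on to the rewriting step; pass-through items such as a generic fiscal year stay in the prompt unchanged.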
○ OUTBOUND
The query is rewritten in flight rather than redacted
Sensitive entities are substituted with semantically equivalent placeholder phrases that have been chosen to preserve answer quality on the public model side, while the original entity mapping is held inside the customer perimeter for use during the inbound response phase.
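As a rough sketch of the outbound rewrite, the function below swaps each sensitive entity for a placeholder and returns the inverse map that stays inside the perimeter. The function name and the placeholder "Northgate Example Group" are invented for illustration, and plain string replacement stands in for whatever rewriting logic the proxy actually uses.

```python
def rewrite_outbound(
    prompt: str, substitutions: dict[str, str]
) -> tuple[str, dict[str, str]]:
    """Return the rewritten prompt and a placeholder -> original map."""
    rewritten = prompt
    inverse_map: dict[str, str] = {}
    # Replace longer entities first so a short name nested inside a longer
    # one does not break the longer match.
    for original in sorted(substitutions, key=len, reverse=True):
        placeholder = substitutions[original]
        if original in rewritten:
            rewritten = rewritten.replace(original, placeholder)
            inverse_map[placeholder] = original
    return rewritten, inverse_map


# Hypothetical usage: only the rewritten prompt would leave the perimeter.
outbound, held_locally = rewrite_outbound(
    "Assess the leverage of Coltrane Holdings for fiscal year 2024.",
    {"Coltrane Holdings": "Northgate Example Group"},
)
```

The inverse map is returned separately from the prompt so that the caller can keep it in local storage and send only the rewritten text onward.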
○ OUTBOUND
The selected public model answers without receiving any identifying information
The selected public large language model produces a response to the rewritten query, and at no point during this round trip does the public model receive the actual customer entities, the customer tenant identifier, or the identity of the customer user that originated the request.
● INSIDE
The response is re-bound to the original customer entities inside the perimeter
The proxy applies the inverse of the substitution map that was held locally during the outbound phase, validates the resulting answer against the customer private knowledge base for factual consistency, and then returns the fully bound response to the customer user with citations attached to each substantive claim.
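The re-binding step can be sketched as the inverse of a placeholder substitution. The function names and the sample map are assumptions, and the sketch covers only the inverse mapping; validation against the private knowledge base and citation attachment are not modeled here.

```python
def rebind(response: str, inverse_map: dict[str, str]) -> str:
    """Restore original entities by applying a placeholder -> original map."""
    restored = response
    # Longest placeholders first, so nested placeholders resolve correctly.
    for placeholder in sorted(inverse_map, key=len, reverse=True):
        restored = restored.replace(placeholder, inverse_map[placeholder])
    return restored


def unresolved_placeholders(text: str, inverse_map: dict[str, str]) -> list[str]:
    """List placeholders still present, as a sanity check after re-binding."""
    return [p for p in inverse_map if p in text]


# Hypothetical map held locally during the outbound phase.
local_map = {"Northgate Example Group": "Coltrane Holdings"}
answer = rebind("Northgate Example Group shows rising leverage.", local_map)
```

A non-empty result from `unresolved_placeholders` would indicate that the response still contains generic stand-ins and should not yet be returned to the user.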
● INSIDE
Token-level audit log written to customer security information and event management
The original prompt, the modified outbound prompt, the public model that was selected for the request, the mapped entity table, the returned response, the round-trip latency, the per-request cost, and the policy decisions that influenced routing are all written to the audit log and streamed in real time to the customer security information and event management platform of choice.
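One plausible shape for such an audit record is a flat structure serialized as a single JSON line per request. The field names below are assumptions derived from the items listed above, not a documented Intelligine schema, and the transport to the event management platform is left out.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AuditRecord:
    original_prompt: str
    outbound_prompt: str
    selected_model: str
    entity_map: dict[str, str]
    response: str
    latency_ms: float
    cost_usd: float
    policy_decisions: list[str] = field(default_factory=list)


def to_event_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line for a log stream."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical values for illustration only.
record = AuditRecord(
    original_prompt="Assess the leverage of Coltrane Holdings.",
    outbound_prompt="Assess the leverage of Northgate Example Group.",
    selected_model="example-public-model",
    entity_map={"Northgate Example Group": "Coltrane Holdings"},
    response="Coltrane Holdings shows rising leverage.",
    latency_ms=412.0,
    cost_usd=0.012,
    policy_decisions=["tier-2 entity substituted"],
)
line = to_event_line(record)
```

Because the record contains the original prompt and the entity map, a log of this kind would itself need to stay inside the customer perimeter.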
Improvement in answer faithfulness measured against a blunt redaction baseline on the standard internal evaluation suite.
Sensitive entities never traverse the customer perimeter to a public model under any combination of routing decisions.
Median round-trip latency added by the combined rewriting and entity re-binding pipeline at production load.
A single orchestrator that selects between the customer private model and more than 40 public models on a per-request basis.
The customer private model sits at the center of the orchestration layer and handles every request for which it is the most appropriate model, while the orchestrator selects from a pool of more than 40 public large language models for tasks on which a public model offers measurably better quality, lower cost, or more suitable latency. The orchestrator records the rationale for each routing decision so that the customer audit team can review the choice after the fact. No engineering organization in the regulated enterprise market should be contractually locked to a single model in the current state of the field.
| TASK DESCRIPTION | SELECTED MODEL | SELECTION RATIONALE | CONFIDENCE / FAITHFULNESS | COST PER REQUEST |
|---|---|---|---|---|
| Summarization of an internal credit memo | CUSTOMER PRIVATE MODEL | The content is commercially sensitive, the terminology is in-domain, and the local round-trip latency is significantly lower than any external alternative. | 0.96 | $0.001 |
| Translation of a regulatory note from English into Japanese | GPT-4O | The selected public model produces the strongest multilingual faithfulness scores on this particular language pair across the customer evaluation suite. | 0.94 | $0.012 |
| Long-context legal review across 240 pages | CLAUDE 3.5 | The selected public model demonstrates the highest long-context recall scores on the customer legal review evaluation set. | 0.91 | $0.18 |
| Bulk first-pass document classification at scale | LLAMA 3.1 70B | The selected public model is the lowest cost option that consistently meets the customer-defined quality threshold of 0.85 on the classification task. | 0.87 | $0.0004 |
| Clinical summarization involving protected health information | CUSTOMER PRIVATE MODEL | Customer policy prohibits any egress of protected health information to external models, and the orchestrator therefore forces local routing for every request of this category. | 0.98 | $0.001 |
| Code review against an internal customer repository | DEEPSEEK V3 | The selected public model produces the highest score on the customer code review evaluation at approximately half the cost of the next best option. | 0.90 | $0.003 |
Six independent dimensions inform every routing decision.
The orchestrator considers per-request cost, expected round-trip latency, faithfulness against the customer evaluation suite, the sensitivity classification of the input data, the regulatory jurisdiction of the requesting user, and the real-time availability of each candidate model provider.
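A minimal sketch of how the six dimensions might combine is shown below, assuming that sensitivity, jurisdiction, and availability act as hard filters while cost, latency, and faithfulness feed a weighted score. That split, the weights, the candidate names, and all numbers are assumptions for illustration, not the orchestrator's documented logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    is_private: bool
    cost_usd: float
    latency_ms: float
    faithfulness: float
    available: bool
    jurisdictions: frozenset[str]


def route(candidates, *, must_stay_local, jurisdiction,
          cost_weight=0.01, latency_weight=0.01):
    """Filter on hard constraints, then pick the best weighted score."""
    eligible = [
        c for c in candidates
        if c.available
        and jurisdiction in c.jurisdictions
        and (c.is_private or not must_stay_local)
    ]
    if not eligible:
        raise LookupError("no candidate satisfies the routing policy")
    # Normalize cost and latency so the penalties are comparable.
    max_cost = max(c.cost_usd for c in eligible) or 1.0
    max_latency = max(c.latency_ms for c in eligible) or 1.0

    def score(c):
        return (c.faithfulness
                - cost_weight * c.cost_usd / max_cost
                - latency_weight * c.latency_ms / max_latency)

    return max(eligible, key=score)


# Hypothetical candidate pool.
POOL = [
    Candidate("private-model", True, 0.001, 120.0, 0.90, True,
              frozenset({"US", "EU"})),
    Candidate("public-model-a", False, 0.012, 900.0, 0.94, True,
              frozenset({"US"})),
    Candidate("public-model-b", False, 0.0004, 400.0, 0.87, False,
              frozenset({"US", "EU"})),
]
```

Treating sensitivity as a filter rather than a score term mirrors the forced local routing shown for protected health information in the table: no score advantage can move such a request to a public model.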
Public models age, and the customer routing weights are updated to reflect that.
The customer evaluation suite runs every night against every public model that is wired into the orchestrator, and the routing weights are then updated automatically based on the latest measurements, so that the customer procurement team is not required to re-run a vendor selection process whenever a new model is released.
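The nightly update could be sketched as a smoothed blend of the previous weight and the latest evaluation score. Exponential smoothing is an assumption chosen for illustration; the source does not specify the update rule, and the model names and scores below are hypothetical.

```python
def update_routing_weights(
    previous: dict[str, float],
    latest_scores: dict[str, float],
    smoothing: float = 0.5,
) -> dict[str, float]:
    """Blend last night's weights with tonight's evaluation scores."""
    if not 0.0 <= smoothing <= 1.0:
        raise ValueError("smoothing must be between 0 and 1")
    updated = {}
    for model, score in latest_scores.items():
        # A newly added model has no history, so it starts at its first score.
        prior = previous.get(model, score)
        updated[model] = (1.0 - smoothing) * prior + smoothing * score
    # Models missing from the latest run are dropped from the result.
    return updated


# Hypothetical run: model-a regresses, model-b is newly wired in.
weights = update_routing_weights(
    {"model-a": 0.90, "model-retired": 0.80},
    {"model-a": 0.80, "model-b": 0.88},
)
```

Smoothing keeps a single noisy evaluation run from swinging routing abruptly, at the price of reacting more slowly to a genuine regression.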
Every routing choice is recorded against the request.
For every request the orchestrator records which model was selected, why that model was selected over the other candidates, and which alternative models were considered alongside their respective scores, with the full record exportable on demand to a regulator or to an internal audit team.
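A routing record of this kind might look like the following sketch, which ranks the candidate scores, names the winner, and keeps the alternatives for later export. The field names, the request identifier, and the scores are assumptions for illustration.

```python
import json


def build_decision_record(
    request_id: str, scores: dict[str, float], rationale: str
) -> dict:
    """Record the selected model plus every alternative and its score."""
    if not scores:
        raise ValueError("at least one candidate score is required")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    selected, selected_score = ranked[0]
    return {
        "request_id": request_id,
        "selected_model": selected,
        "selected_score": selected_score,
        "rationale": rationale,
        "alternatives": [
            {"model": name, "score": value} for name, value in ranked[1:]
        ],
    }


def export_records(records: list[dict]) -> str:
    """Serialize records as JSON for an auditor or regulator."""
    return json.dumps(records, indent=2)


# Hypothetical scores for a single request.
decision = build_decision_record(
    "req-0001",
    {"public-model-a": 0.91, "private-model": 0.96, "public-model-b": 0.88},
    "highest evaluation score among eligible candidates",
)
```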
The customer owns the trained model weights, not an application programming interface key, and not a software-as-a-service seat.
Within 30 calendar days of the engagement kick-off, Intelligine delivers a foundation model that has been trained on customer proprietary data. The training itself is executed on graphics processing units that sit inside the customer cloud account, and the resulting runtime is configured so that customer operations engineers can administer it without recourse to Intelligine support staff. There is no recurring software-as-a-service subscription, there is no application programming interface dependency back to Intelligine infrastructure, and the contract contains no provision under which the underlying service can be modified or withdrawn at vendor discretion.
The customer rents access to intelligence on a recurring basis while the vendor retains ownership of the underlying asset.
- WEIGHTS: The model weights are owned by the vendor and are not visible to the customer at any point in the relationship.
- DATA: Customer data is frequently retained by the vendor for indefinite periods and is sometimes used to inform subsequent vendor model training.
- RUNTIME: The model runtime is delivered as a software-as-a-service offering and the customer operational uptime is therefore directly coupled to the vendor uptime.
- PRICING: Commercial pricing is typically charged per token, per seat, or against a usage envelope that the customer cannot easily forecast in advance.
- EXIT: A customer that ends the relationship retains no model assets that are independently usable outside the vendor environment.
- EVOLUTION: The future direction of the platform is determined by the vendor product roadmap rather than by the customer technology strategy.
The customer owns the asset, and Intelligine provides the documented operator runbook that accompanies it.
- WEIGHTS: The trained model weights are the property of the customer and are formally delivered on the 30th calendar day of the engagement.
- DATA: Customer data remains inside the customer virtual private cloud at all times and is never transmitted to Intelligine infrastructure.
- RUNTIME: The model runtime is portable and operates inside any container infrastructure that the customer is already using internally.
- PRICING: Commercial pricing is structured as an annual agreement with heavy, medium, and lightweight usage tiers that map to actual consumption patterns.
- EXIT: A customer that ends the relationship retains the trained model weights, the deployment runtime, and the documented operating runbook.
- EVOLUTION: The platform roadmap is co-engineered with customer architecture and is therefore aligned with the customer technology direction.
The customer receives the trained model weights, the customer-specific tokenizer, the embedding model, the evaluation suite, the serving runtime, the monitoring stack, the operator runbook, and the documented escalation tree as a single integrated handover package.
Architected to age well across many model generations rather than to lock the customer into any single one.
The public large language model layer continues to release new generations of models on a 6-month rolling cadence, and customer infrastructure should not be obliged to replatform on the same cadence in order to remain current. The Intelligine platform is therefore deliberately structured so that the orchestrator, the governance layer, the memory layer, and the anonymizing proxy are all expected to outlive any specific public model that is wired into them, and the customer private model can be re-trained or fully replaced on customer-driven cycles without requiring any rebuild of the surrounding infrastructure stack.
Every layer of the platform stack is version pinned to a specific release that the customer change control board has explicitly approved, every subsequent upgrade is reviewed by customer operations engineers before it is permitted to ship into the customer environment, and air-gapped customers receive cryptographically signed update bundles on physical installation media.
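Version pinning against an approved manifest can be sketched as follows. The manifest layout and function name are assumptions, and the SHA-256 digest comparison is a simplified stand-in for real signature verification, which would additionally require an asymmetric signature check not shown here.

```python
import hashlib
import hmac


def bundle_is_approved(
    component: str,
    version: str,
    bundle_bytes: bytes,
    approved: dict[str, tuple[str, str]],
) -> bool:
    """Accept a bundle only if its version and digest match the pinned entry."""
    pinned = approved.get(component)
    if pinned is None:
        return False  # component was never approved by change control
    pinned_version, pinned_digest = pinned
    actual_digest = hashlib.sha256(bundle_bytes).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return version == pinned_version and hmac.compare_digest(
        actual_digest, pinned_digest
    )


# Hypothetical manifest: component -> (approved version, SHA-256 digest).
payload = b"example orchestrator bundle"
manifest = {"orchestrator": ("2.4.1", hashlib.sha256(payload).hexdigest())}
```

In this sketch an upgrade to a newer version is rejected until the manifest itself is updated, which corresponds to the change control approval described above.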