AI in regulated systems is often framed as a contradiction. It is not. The contradiction is trying to run AI with startup-style ambiguity inside an environment that depends on evidence, change control, and recoverable decisions.
Regulation does not usually kill the initiative. Weak operating discipline does. Teams fail when they treat the regulated environment as a late-stage review problem instead of an early-stage design constraint.
That is why regulated AI execution works best when the control model is designed before the system is scaled. The faster the system moves, the more important the evidence path becomes.
Regulation Changes the Shape of Execution
The NIST AI Risk Management Framework and the NIST Generative AI Profile both emphasize governance, traceability, and monitoring as operating requirements, not optional afterthoughts. In regulated environments, that principle becomes concrete quickly.
International Good Machine Learning Practice guidance and recent guidance on predetermined change control plans for machine-learning-enabled devices point in the same direction. If the system can change, the change path must be characterized before scale.
The Three Control Layers
Regulated AI execution usually needs three control layers working together. Each addresses a different failure mode, and skipping any one of them creates a gap that regulators and auditors will find.
- Evidence layer that records data assumptions, evaluation results, and decision rationale.
- Change layer that governs how models, prompts, rules, or workflows are modified over time.
- Operating layer that defines human review, escalation, monitoring, and rollback in production.
```mermaid
graph TD
    A1["Evidence Layer<br/>Audit trails, provenance,<br/>decision rationale"]
    B1["Change Layer<br/>Taxonomy, approval gates,<br/>validation triggers"]
    C1["Operating Layer<br/>Review, escalation,<br/>monitoring, rollback"]
    A1 --> B1
    B1 --> C1
    C1 -->|"Feeds back"| A1
    style A1 fill:#1a1a2e,stroke:#16c79a,color:#fff
    style B1 fill:#1a1a2e,stroke:#ffd700,color:#fff
    style C1 fill:#1a1a2e,stroke:#0f3460,color:#fff
```

Layer One: Evidence Before Scale
The first question in a regulated environment is not "can the model work?" It is "what evidence would make this system acceptable to operate?" That evidence often includes data provenance, evaluation boundaries, performance on known edge cases, and the reasoning behind the acceptance threshold.
Teams get into trouble when they try to reconstruct that evidence after the build has already accelerated. Evidence built after the fact is almost always thinner and harder to defend than evidence designed into the execution path from the start.
What Production-Grade Evidence Actually Looks Like
In practice, the evidence layer is not a documentation exercise. It is an engineering artifact. Every consequential action in the system needs a durable, queryable audit trail that captures who acted, what changed, why it changed, and what the system state was before and after.
Consider how this plays out in a healthcare AI system handling clinical reports. Every report carries a complete audit trail — who acted on it, when, and what changed at each stage. When a clinician modifies a previously finalized report, the system captures the rationale and the specifics of the change as permanent audit evidence. This is not optional decoration. It is the evidence that regulators review when they ask "who changed this, when, and why?"
The proposed 2025 healthcare security rule updates eliminate the distinction between "required" and "addressable" security controls, making comprehensive audit logging and encryption mandatory rather than optional. Systems built without this evidence infrastructure will need expensive retrofits.
The evidence layer is not a report you write after the build. It is a system behavior you design before the first line of production code runs.
Layer Two: Change Control Before Drift
AI systems drift in more ways than ordinary software. Models change. Prompts change. Retrieval corpora change. Rules change. Human review thresholds change. In a regulated setting, those are not minor implementation details. They are controlled changes with potential downstream impact.
That is why the change model must exist before the system becomes operationally important. Which changes are pre-approved, which changes need re-validation, and which changes must trigger rollback or hold are all execution questions, not just policy questions.
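The change model can be made executable rather than left as policy prose. The sketch below assumes a hypothetical taxonomy of change types; the specific categories and their gates would come from the team's own predetermined change control plan. The load-bearing detail is the default: an unrecognized change type is held, never silently allowed.

```python
from enum import Enum

class ChangeAction(Enum):
    PRE_APPROVED = "proceed"              # within the predetermined change plan
    REVALIDATE = "hold for re-validation" # needs evidence before release
    HOLD = "rollback and hold"            # blocked pending review

# Hypothetical taxonomy mapping change classes to their approval gate.
CHANGE_TAXONOMY = {
    "prompt_wording": ChangeAction.PRE_APPROVED,
    "retrieval_corpus_update": ChangeAction.REVALIDATE,
    "model_version": ChangeAction.REVALIDATE,
    "review_threshold": ChangeAction.REVALIDATE,
}

def gate(change_type: str) -> ChangeAction:
    """Route a proposed change through its gate.

    Unknown change types fail safe: they are treated as hold/rollback
    rather than defaulting to pre-approved.
    """
    return CHANGE_TAXONOMY.get(change_type, ChangeAction.HOLD)
```

Encoding the taxonomy this way means the pipeline itself enforces the change policy, instead of relying on reviewers to remember it.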
The Security Hardening Cycle
One of the most revealing patterns in regulated execution is systematic security hardening: a disciplined cycle of audit, triage, and remediation applied to the entire system surface. This is not ad hoc patching. It is a structured sweep where every finding is cataloged, severity-ranked, and remediated in priority order.
In a production healthcare system, a typical hardening cycle catalogs dozens of findings across authentication, data handling, network controls, and access management. The team works through them systematically, keeping the system stable while closing gaps. The difference between a first attempt and an experienced execution of this cycle is not incremental; it is categorical. Teams that have run hardening cycles before know which findings to prioritize and which apparent urgencies are actually low-risk.
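The triage step of such a cycle is mechanically simple but worth making explicit. A sketch, with invented finding IDs and a conventional severity ordering: every finding is cataloged, then remediated strictly in severity order, with ties kept in catalog order.

```python
from dataclasses import dataclass

# Conventional severity ranking; lower number = remediate first.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

@dataclass
class Finding:
    ident: str
    area: str      # e.g. authentication, data handling, network controls
    severity: str

def triage(findings: list[Finding]) -> list[Finding]:
    """Severity-ranked remediation queue; Python's sort is stable,
    so findings of equal severity keep their catalog order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])

# Example catalog from a (hypothetical) hardening sweep.
backlog = triage([
    Finding("F-12", "network controls", "low"),
    Finding("F-03", "authentication", "critical"),
    Finding("F-07", "data handling", "high"),
])
```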
Data Protection: Fail Safe by Default
A concrete example of change control discipline is the approach to sensitive data exposure. The instinctive first approach is to identify and strip known sensitive fields. But that approach fails silently when new data fields appear that the filter does not cover. In regulated systems handling protected health information, that silent failure is a compliance incident.
The stronger pattern — and the one that survives audit — defines what data is permitted to pass through, blocking everything else by default. When a new data field appears, it is held until explicitly classified and approved. This is a harder engineering problem, but it produces a system that fails safe rather than failing open. HHS guidance on de-identification of protected health information reinforces the same principle: the safe harbor method works from a defined set of identifiers, not from open-ended exclusion logic.
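The allowlist pattern can be sketched in a few lines. The field names are placeholders; the point is the shape of the logic: permitted fields are enumerated, and any field outside that set blocks the record rather than passing through.

```python
# Hypothetical allowlist of fields permitted to leave the system boundary.
ALLOWED_FIELDS = {"report_id", "status", "modality"}

class UnclassifiedFieldError(Exception):
    """Raised when a record carries a field not yet classified and approved."""

def filter_outbound(record: dict) -> dict:
    """Allowlist filter: fail safe, not open.

    A denylist version would silently pass any new field it had never
    seen; this version holds the record until the field is classified.
    """
    unknown = set(record) - ALLOWED_FIELDS
    if unknown:
        raise UnclassifiedFieldError(f"held for classification: {sorted(unknown)}")
    return record
```

Note the failure mode inversion: adding a new upstream field breaks loudly at the boundary instead of leaking quietly into downstream systems.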
Layer Three: Operating Controls in Production
A regulated AI system also needs clear production behavior. The operating design must specify who approves exceptions, who sees low-confidence outputs, what gets logged, how incidents are escalated, and how the system reverts when behavior drifts. Without that layer, "human oversight" becomes a vague phrase instead of an actual control.
The organizations that move cleanly here are not necessarily slower. They are just more explicit. They know where human review belongs, where automation can proceed, and what evidence must remain attached to each consequential decision.
Concrete Operating Controls
Operating controls in regulated systems go well beyond application-level logging. They include network-layer protections, authentication hardening, and isolation enforcement — each generating its own audit evidence.
- Authentication and session controls. Authentication hardening prevents credential-based attacks, session management limits persistence appropriately, and structured auth logging enables compliance teams to reconstruct access patterns on demand.
- Network and perimeter controls. Defense-in-depth with customized network protections, consistent access rules across environments, and automated infrastructure validation that catches misconfigurations before they reach production.
- Multi-tenant isolation. In systems serving multiple organizations, cross-organization access must be explicitly blocked at every layer — not handled by fallback logic that silently degrades to shared access. Tenant boundaries must be enforced as a system invariant, not as an application-level convention.
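The isolation invariant in the last bullet can be sketched directly. Names are illustrative; the essential design choice is that a tenant mismatch raises an error at the access layer, with no fallback path that degrades to shared access.

```python
class TenantIsolationError(Exception):
    """Cross-organization access attempt; a reportable event, not a retry case."""

def fetch_record(requesting_org: str, record: dict) -> dict:
    """Enforce the tenant boundary as a system invariant.

    There is deliberately no fallback branch here: a mismatch blocks
    outright rather than degrading to shared access.
    """
    if record["org_id"] != requesting_org:
        raise TenantIsolationError(
            f"{requesting_org} may not read records owned by {record['org_id']}"
        )
    return record
```

In a real system this check would live in the data-access layer (or be enforced by the database itself, e.g. row-level security), so that no application code path can reach another tenant's rows by accident.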
In regulated systems, speed comes from explicit controls, not from pretending controls do not exist.
Compliance Audit Trails at Every Integration Point
Regulated systems rarely operate in isolation. They connect to external services — authentication providers, clinical data systems, payment processors — and each integration point needs its own audit trail. Every external authentication flow, data access event, and token lifecycle transition must be logged comprehensively enough that compliance auditors can reconstruct any access event. The security team should be able to trace any interaction in minutes rather than days.
NIST SP 800-53 defines the AU (Audit and Accountability) control family specifically for this purpose: organizations must generate, protect, and retain audit records to the extent needed to enable monitoring, analysis, investigation, and reporting. In practice, this means every integration boundary becomes a logging boundary.
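One way to make every integration boundary a logging boundary is to wrap outbound calls so the audit entry is written whether the call succeeds or fails. A sketch with an in-memory sink standing in for a durable audit store; the decorator and function names are invented for illustration.

```python
import functools
import time

AUDIT_LOG: list[dict] = []  # stand-in for a durable, protected audit sink

def logged_integration(boundary: str):
    """Wrap a call that crosses an external integration boundary
    so that an audit entry is recorded on every outcome."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {"boundary": boundary, "call": fn.__name__, "ts": time.time()}
            try:
                result = fn(*args, **kwargs)
                entry["outcome"] = "success"
                return result
            except Exception as exc:
                entry["outcome"] = f"error: {exc}"
                raise
            finally:
                # The finally block guarantees the entry is written even on failure.
                AUDIT_LOG.append(entry)
        return wrapper
    return decorator

@logged_integration("auth_provider")
def exchange_token(code: str) -> str:
    # Placeholder for a real token exchange with an external provider.
    return f"token-for-{code}"
```

The `finally` clause is the compliance-relevant detail: failed external calls are exactly the events auditors most want to reconstruct, so the logging path must not depend on success.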
Where Teams Usually Go Wrong
Teams usually go wrong in one of two ways. Either they overreact and block meaningful progress because the control model was never broken into practical layers, or they underreact and treat governance like documentation work that can be cleaned up near launch. Both waste time, for different reasons.
The stronger pattern is to narrow the first use case, define the evidence and change path early, and let that smaller, controlled system prove the operating model. Organizations that concentrate authority in a single technical decision-maker who holds the full regulatory and technical picture simultaneously tend to move faster than those that distribute the work across committees. The bottleneck in regulated AI execution is rarely headcount — it is having someone who understands both the control requirements and the system architecture deeply enough to make decisions without round-trips.
Boundary Condition
Some environments are too early even for that. If the workflow itself is still vague, if the risk owner is unclear, or if the team cannot define what kind of human review is required, the immediate problem is not regulation. It is unresolved scope. In that case, scoping should happen before any serious execution plan is approved.
Likewise, if the system category implies formal regulatory pathways beyond the team's current competence, the right first move may be technical assessment and control design rather than direct build. Pattern recognition from prior implementations compresses what would otherwise be an expensive learning curve — and in regulated systems, learning curves have compliance consequences.
First Steps
- Name the first controlled use case. Keep it narrow enough that the evidence path — audit trails and change rationale — can be designed concretely for each transition in the workflow.
- Define the change taxonomy. Decide which model, prompt, data, and workflow changes are allowed, reviewed, or blocked. Establish the fail-safe-by-default principle for sensitive data handling from the start.
- Write the production control loop. Specify authentication controls, session management, integration logging, isolation enforcement, and rollback triggers before the system becomes operationally important.
Practical Solution Pattern
Execute regulated AI through explicit control layers built into the system from the start, not bolted on before launch. Design the evidence layer first — comprehensive audit trails with change rationale at every transition, and integration-point logging — then establish the change model with fail-safe data protection and structured hardening cycles. Finally, build operating controls that cover authentication, network perimeter, multi-tenant isolation, and session management as first-class system behaviors rather than afterthoughts.
This works because regulated environments reward legibility. The clearer the evidence path and change path are, the faster teams can move without creating governance debt that later blocks deployment. The measure of success is not the elegance of the compliance plan — it is whether the system reaches production with controls that survive the first audit. If the workflow is real but the architecture and control model still need pressure-testing, AI Technical Assessment is the stronger first move. If the target itself is not yet clear enough to control, Strategic Scoping Session should come first.
References
- National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST, 2023.
- National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework: Generative AI Profile. NIST, 2024.
- U.S. Food and Drug Administration, Health Canada, and MHRA. Good Machine Learning Practice for Medical Device Development: Guiding Principles. Regulatory Reference, 2021.
- U.S. Food and Drug Administration. Predetermined Change Control Plans for Machine Learning-Enabled Medical Devices. Regulatory Reference, 2025.
- U.S. Department of Health and Human Services. Healthcare Security Rule — Strengthening Cybersecurity of Electronic Protected Health Information. Federal Register, 2025.
- U.S. Department of Health and Human Services. Guidance Regarding Methods for De-identification of Protected Health Information. HHS, 2024.
- National Institute of Standards and Technology. Security and Privacy Controls for Information Systems and Organizations (SP 800-53 Rev. 5). NIST, 2024.