A healthcare AI platform needed to ingest ECG recordings from clinic exam rooms into a cloud processing system. The recordings arrived through three different paths — clinician web uploads, SMB file shares on clinic networks, and EHR integration triggers.

Clinical device ingestion operates under fundamentally different constraints than standard data pipelines: clinic networks use SMB file shares as the only programmatic interface, devices write files without notification mechanisms, network interruptions produce partial uploads, and the same recording may arrive through multiple paths. A configuration error that routed one organization's clinical data to another's storage would be both a compliance violation and a contractual breach.

When a single misconfiguration can leak clinical data across organizations, the ingestion pipeline is the compliance boundary — not just a data pipe.

The platform was already processing web uploads and EHR-triggered ingestion through separate code paths. Adding SMB as a third parallel system would have created divergent processing logic, inconsistent audit trails, and bugs that only appeared when the same recording arrived through multiple paths. ML LABS built the device-to-cloud pipeline that unified all three ingestion paths onto a single processing queue, enforced per-organization data isolation at the storage level, and replaced manual operational scripts with event-driven scheduling — all deployed across test, US production, and UK production environments.

Ingress Queue Unification

Web uploads and EHR integrations previously used separate processing paths. Building SMB ingestion as a third parallel system would have added one more code path in which processing bugs could hide, and it would have left no single point at which to deduplicate a recording that arrived through more than one path.

graph TD
    A["Web Upload"] --> D["Unified Ingress<br/>Queue"]
    B["SMB File Share"] --> D
    C["EHR Integration"] --> D
    D --> E["Serverless<br/>Processing"]
    E --> F["AI Model<br/>Inference"]
    F --> G["Clinical Results"]

    style A fill:#1a1a2e,stroke:#0f3460,color:#fff
    style B fill:#1a1a2e,stroke:#0f3460,color:#fff
    style C fill:#1a1a2e,stroke:#0f3460,color:#fff
    style D fill:#1a1a2e,stroke:#ffd700,color:#fff
    style E fill:#1a1a2e,stroke:#ffd700,color:#fff
    style F fill:#1a1a2e,stroke:#16c79a,color:#fff
    style G fill:#1a1a2e,stroke:#16c79a,color:#fff

Every path now produces a message in the same format on a single processing queue. The downstream processor handles all recordings the same way regardless of source, which eliminates path-specific bugs and lets deduplication work across all three paths. Every file that enters the system gets a log entry recording the ingestion source, timestamp, and processing outcome.
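The common message and cross-path deduplication described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual schema: the field names, the `Source` values, and the choice of a SHA-256 content hash as the deduplication key are all assumptions, and the in-memory set stands in for whatever durable store a real pipeline would use.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Source(str, Enum):
    # Hypothetical labels for the three ingestion paths.
    WEB = "web_upload"
    SMB = "smb_share"
    EHR = "ehr_integration"


@dataclass(frozen=True)
class IngressMessage:
    """One message shape for every path (assumed fields)."""
    org_id: str
    source: Source
    content_sha256: str
    received_at: str  # ISO 8601 timestamp in UTC

    @property
    def dedup_key(self) -> tuple[str, str]:
        # Scoped per organization so identical bytes uploaded by two
        # organizations are never treated as the same recording.
        return (self.org_id, self.content_sha256)


def build_message(org_id: str, source: Source, payload: bytes) -> IngressMessage:
    """Every ingestion path calls this, so the queue sees one format."""
    return IngressMessage(
        org_id=org_id,
        source=source,
        content_sha256=hashlib.sha256(payload).hexdigest(),
        received_at=datetime.now(timezone.utc).isoformat(),
    )


class Deduplicator:
    """In-memory stand-in for a durable record of keys already processed."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def is_new(self, msg: IngressMessage) -> bool:
        if msg.dedup_key in self._seen:
            return False
        self._seen.add(msg.dedup_key)
        return True
```

Because the key ignores the source, a recording uploaded through the web and later picked up from the SMB share maps to the same key, while the `source` field is still available for the audit log.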

Cross-Organization Isolation

Each organization's data is isolated at the infrastructure layer — storage, credentials, and processing configuration are all scoped per-org. Onboarding a new organization is an automated provisioning step, not a manual checklist, which eliminates the configuration errors that produce data isolation gaps. The pipeline also validates file completeness before processing, preventing corrupt records from reaching downstream systems.
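The article does not say how file completeness is validated. One common heuristic for files written to a share without any notification is to wait until the size and modification time stop changing. The sketch below assumes that approach; the function name `looks_complete` and its parameters are hypothetical.

```python
import time
from pathlib import Path


def looks_complete(path: Path, wait_seconds: float = 2.0, min_bytes: int = 1) -> bool:
    """Return True if the file is non-empty and unchanged across the wait.

    This is a heuristic: a writer that pauses for longer than
    wait_seconds can still be mistaken for a finished file.
    """
    try:
        before = path.stat()
        time.sleep(wait_seconds)
        after = path.stat()
    except FileNotFoundError:
        # The file disappeared or never existed; treat as not ready.
        return False
    return (
        after.st_size >= min_bytes
        and before.st_size == after.st_size
        and before.st_mtime_ns == after.st_mtime_ns
    )
```

A stability check like this only shows that writing has stopped. Where the recording format allows it, parsing the file or checking a declared length gives a stronger guarantee that the content is whole.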

Operational Automation

ML LABS replaced manual scripts with event-driven scheduling and automated failure alerting. Backups, sync monitoring, and deployments run on schedules without human initiation. Failures route to alarms rather than going unnoticed. Environment-specific configuration captures the regional differences between test, US production, and UK production, so changes can be validated in test before they reach clinical workflows.
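The idea that failures route to alarms rather than going unnoticed can be illustrated with a small wrapper around each scheduled task. This is a sketch under assumptions: `run_job`, `JobResult`, and the `alarm` callback are hypothetical names, and in a real deployment the scheduler and the alarm channel would be provided by the cloud platform.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("ops")


@dataclass
class JobResult:
    name: str
    ok: bool
    detail: str = ""


def run_job(
    name: str,
    job: Callable[[], None],
    alarm: Callable[[JobResult], None],
) -> JobResult:
    """Run one scheduled task and raise an alarm if it fails."""
    try:
        job()
        result = JobResult(name=name, ok=True)
    except Exception as exc:
        # Catch everything so no failure is silently dropped,
        # then hand it to the alarm channel.
        result = JobResult(name=name, ok=False, detail=f"{type(exc).__name__}: {exc}")
        alarm(result)
    # Success and failure are both logged so the run history is complete.
    logger.info("job=%s ok=%s %s", result.name, result.ok, result.detail)
    return result
```

The scheduler only needs to invoke `run_job`; nobody has to remember to check whether the backup or sync actually succeeded.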

When Complexity Exceeds Capacity

This architecture is tractable for a bounded number of clinic sites with consistent device types. It becomes categorically harder when the platform must support dozens of sites with heterogeneous device populations or when regulatory requirements span multiple jurisdictions with different data residency rules.

The gap between "we have a working pipeline" and "the pipeline runs unattended across all sites" is where most clinical AI initiatives stall.

First Steps

  1. Unify your ingress queue. Converge all ingestion paths onto a single queue with a common message format to eliminate path-specific bugs.
  2. Automate org provisioning. Storage, credentials, and config for a new organization should be one script. One missed step is an isolation gap.
  3. Replace manual scripts. Anything that depends on someone remembering to run it should be scheduled with automated alerting.
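Step 2 can be sketched as a single function that derives every per-organization resource name from the organization ID, so no resource is configured by hand. The naming scheme, the environment labels, and the helper names below are assumptions made for illustration; actually creating the storage and credentials is left to the cloud provider's tooling.

```python
import re
from dataclasses import dataclass

# Hypothetical environment labels matching the three deployments.
ENVIRONMENTS = {"test", "us-prod", "uk-prod"}

# Assumed ID rule: lowercase letters, digits, and hyphens, 3 to 32 characters.
_ORG_ID = re.compile(r"^[a-z0-9][a-z0-9-]{1,30}[a-z0-9]$")


@dataclass(frozen=True)
class OrgResources:
    org_id: str
    environment: str
    storage_prefix: str
    credential_name: str


def provision_org(org_id: str, environment: str) -> OrgResources:
    """Derive all org-scoped names in one place."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    if not _ORG_ID.fullmatch(org_id):
        raise ValueError(f"invalid org id: {org_id!r}")
    return OrgResources(
        org_id=org_id,
        environment=environment,
        storage_prefix=f"{environment}/orgs/{org_id}/",
        credential_name=f"{environment}-{org_id}-ingest",
    )


def storage_key(resources: OrgResources, filename: str) -> str:
    """Build a storage key that cannot leave the organization's prefix."""
    if filename in {"", ".", ".."} or "/" in filename or "\\" in filename:
        raise ValueError(f"unsafe filename: {filename!r}")
    return resources.storage_prefix + filename
```

Routing every write through `storage_key` means the organization's prefix is applied by code rather than by convention, which is the kind of missed step the list above warns about.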

Practical Solution Pattern

Build a unified device-to-cloud pipeline where every ingestion path produces an identical message on a single processing queue, with per-organization storage isolation enforced at provisioning time and event-driven automation replacing manual operational scripts.

This works because ingress unification eliminates the path-specific processing branches that diverge and fail in untested combinations, while automated provisioning removes the human configuration errors that produce data isolation gaps. If your organization needs to scope a device-to-cloud pipeline, a Strategic Scoping Session can map ingestion paths, isolation requirements, and automation gaps before the build begins.