How to Choose an Enterprise AI Architecture for Ecommerce at Scale

28/05/2025 18 minutes to read

Alex Bolshakova

Chief Strategist for Ecommerce Platforms

When designing an AI platform architecture for ecommerce, companies often focus on selecting specific technical components, but the more foundational choice is the implementation model. Typically, this comes down to three paths: ready-made AI products, enterprise AI platforms, or a custom AI core, each making different trade-offs between speed, control, and scalability.

Products work well for narrow use cases, platforms support multiple workflows, and custom cores are designed for complex operations spanning catalog, pricing, search, customer support, multiple brands, and international markets.

Selecting the right model is critical because deploying these capabilities is no longer just a future-looking experiment. McKinsey’s State of AI 2025 reports that 88% of organizations use AI in at least one business function. The challenge begins when AI has to operate inside a commerce environment: PIM, ERP, pricing, inventory, compliance requirements, and multi-market operations. That is where architectural decisions start showing up in latency, auditability, and cost.

Drawing on our experience at Expert Soft delivering high-load architectures and complex enterprise AI projects for multi-market retailers, we have seen firsthand how these foundational decisions play out under real-world pressure. This article breaks down the three core architecture models, the six criteria that drive the choice between them, and the four compounding risks that teams face as systems scale.

Quick Tips for Busy People

A control framework for core operations: enterprise AI architecture in ecommerce determines how models access data, handle business logic, enforce compliance, and integrate with existing commerce systems to safely transition AI from an isolated tool to a centralized corporate layer.
Five mandatory structural layers: a production-scale architecture requires five non-optional layers — data (for cleanup and ingestion), model (for logic and guardrails), orchestration (for API routing and A/B testing), ops (for cost and peak observability), and policy/governance (for global rules and market overrides).
Three distinct implementation profiles: companies choose between ready-made products, enterprise platforms, or a custom AI core.
A strategic choice driven by operational reality: choosing the right model requires scoring your business honestly across six critical criteria: future use case scope, required schema control, necessary integration depth, peak latency tolerances, regulatory audit needs, and 3-year cost curves.
Four compounding long-term risks: if architecture is omitted from early planning, systems experience severe degradation over a 24-to-36-month horizon through vendor lock-in, unmodeled cost growth, blind spots from weak observability, and high maintenance costs from duplicated regional logic.
A clear blueprint for selection: deploy a turnkey product if your use case is isolated and simple; adopt an enterprise platform if you need centralized governance across standard tools; build a custom AI core if AI must cross-cut your catalog, search, and pricing across multiple international brands.

Enterprise AI Architecture in Ecommerce: Brief Overview

Enterprise AI platform architecture in ecommerce is the foundational framework that governs how AI handles data, executes business logic, enforces policies, integrates with core commerce systems, scales under peak loads, and maintains observability. It’s the discipline that keeps all of those things consistent as new AI use cases arrive on top of an already complex platform.

This especially matters when AI becomes more than just a standalone tool in the ecommerce ecosystem. A standalone AI tool operates within a narrow scope, handling basic tasks like summarizing a review or ranking products. Because its responsibilities are limited, it requires minimal integration and functions independently of your broader ecommerce stack.

Integrating AI into core operations removes this simplicity: it requires unified data flows across SAP Commerce, PIM, and ERP systems, alongside consistent business policies across brands. To maintain operational integrity, the underlying logic must behave identically across catalog, pricing, search, and support workflows while ensuring every decision remains fully observable and auditable.

This transition from an isolated tool to a centralized layer with system-wide obligations is what turns AI implementation into an architectural decision.

Core Components of Enterprise AI Architecture

AI enterprise architecture in ecommerce is built on five layers: data, model, orchestration, ops, and policy and governance. Each layer handles a specific responsibility, and none of them is optional at production scale. Skip one, and the system will struggle the moment traffic spikes or compliance rules are tested, which in ecommerce usually happens during peak events when the financial stakes are highest.

Data layer

Responsible for getting the right data to the model at the right time and in a usable shape. It handles ingestion from internal sources (SAP Commerce, PIM, ERP, OMS) and external sources (producer data sheets, images and other assets, external databases), along with enrichment and validation to fill gaps and catch bad records, and freshness windows that define how recent each input must be before it is used in a decision. Without this layer, every other layer is operating on data of unknown quality.
Model layer

Responsible for the AI logic itself: ranking, recommendation, retrieval, classification, and LLM-based reasoning. It includes the models in use, the adapters that connect them to your data, and the guardrails that constrain their output, for example, blocked categories, prompt-injection defenses, and content filters. Because models change faster than the rest of the stack, this layer is designed so that a new model can be slotted in without changing the data, orchestration, or policy layers around it.
Orchestration layer

Responsible for deciding which model or workflow handles a given request and under what conditions. It includes the APIs that other systems call, the routing rules that send different traffic to different models, the feature flags and canaries that gate new logic, and the experimentation infrastructure (A/B tests, interleaving, offline evaluation) that measures whether a change is better. Without this layer, changes to AI logic require code deploys instead of configuration changes.
Ops layer

Responsible for keeping the system reliable, observable, and within budget under production load. It includes traditional observability infrastructure (logs, metrics, per-decision traces) alongside AI-specific metrics like response quality, groundedness, and hallucination rates. This is the layer that determines whether a problem can be diagnosed in minutes or stays unresolved for days.
Policy and governance layer

Responsible for enforcing rules that apply across markets, brands, and regulatory contexts: GDPR, regional disclosure requirements, consent flows, data residency, and brand-specific business rules. It defines global logic once and allows scoped overrides per market or brand, instead of cloning the entire AI workflow for each region. This is the layer that determines whether multi-market expansion stays manageable or accumulates duplicated logic that becomes a compliance risk.
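To make the data layer's freshness windows more concrete, here is a minimal Python sketch. The input types, the window lengths, and the `is_fresh` helper are illustrative assumptions, not values from a real deployment.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness windows per input type; real values depend on the business.
FRESHNESS_WINDOWS = {
    "price": timedelta(minutes=5),
    "inventory": timedelta(minutes=15),
    "product_description": timedelta(days=7),
}


def is_fresh(input_type: str, updated_at: datetime, now: datetime) -> bool:
    """Return True if the record is recent enough to be used in a decision."""
    window = FRESHNESS_WINDOWS.get(input_type)
    if window is None:
        # Unknown input types are rejected rather than silently trusted.
        return False
    return now - updated_at <= window


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_fresh("price", now - timedelta(minutes=2), now))    # True
print(is_fresh("inventory", now - timedelta(hours=1), now))  # False
```

Keeping the windows in one table means a stale input is rejected the same way for every use case that reads it.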

Every enterprise AI system requires these five foundational layers to function reliably under production load. However, the way a company chooses to build, assemble, and manage these layers depends on its operational scale, budget, and internal engineering capabilities. This brings us to the three primary implementation models available to ecommerce enterprises, each packaging these five layers with different trade-offs in speed, control, and complexity.

Which AI Model Fits Your Ecommerce Scale?

There is no universally best AI architecture model for ecommerce. The right choice depends on how much control the business needs over data, latency, integrations, experimentation, compliance, and long-term cost. The three models below correspond to three control profiles, each adding capability in the dimensions above.

Ready-made AI products

This is the product-first approach: ready-made SaaS tools for well-defined problems, like recommendation engines, on-site search, or personalization. They are the fastest way to prove value and the safest start when the use case is standardized.

The strengths are obvious: quick wins, no infrastructure to stand up, and vendor responsibility for the model under the hood. However, these benefits come at a structural cost: data schemas are dictated by the vendor, integrations remain shallow, and experimentation is limited to predefined native features. Operating cost is also worth mentioning: for high-volume systems, ready-made AI products tend to be more expensive. Additionally, because inference queues are shared across multi-tenant infrastructure, service-level objectives (SLOs) during peak traffic depend entirely on vendor capacity rather than internal performance tuning.

Ready-made products fit when AI solves one narrow task, the use case is not strategically differentiating, latency and compliance rules are relaxed, and a single brand or locale dominates the traffic. They start to crack the moment you need a second brand, a stricter SLO, or a custom attribute the vendor has never seen.

Enterprise AI platforms

The platform-first approach assembles AI capabilities on a broader environment, for example, IBM ICA, Copilot Studio, or a major hyperscaler’s AI platform, so internal tools can be built quickly under one set of governance rules. This is a common starting point for teams that need flexibility across multiple use cases but lack the resources or appetite to build a custom solution from scratch.

The compromise here is systemic dependency. You work inside the platform’s primitives, its connector library, and its roadmap. When a connector you need does not exist yet, you either build a mini-integration around the platform or wait. Performance behavior under peak load is often partly hidden behind shared infrastructure, and cost in enterprise AI platform architecture can move with tiers in ways you only fully understand after the first quarterly invoice.

Platforms are a strong fit when you need several AI workflows under one governed environment, connectors cover most of your systems, and platform-level monitoring is enough for your audit needs. They start to strain when you outgrow vendor primitives or need SLOs that the platform cannot contractually back.

Custom AI core

A custom AI core is what you build when AI has to behave like a part of your core commerce operations, not a side tool. You own the schemas, the orchestration, the policy layer, and the ops discipline. It is not the fastest start, but it is the approach that scales cleanly as complexity rises.

The strengths address the limitations of the other two models across four dimensions:

Schema control, as you own the data shapes.
Integration depth with direct adapters to SAP, ERP, PIM, and CRM.
Performance, including tuned caches and warm pools sized to your peak.
Governance with per-decision traces, model neutrality, and an explicit policy layer with global logic and local overrides.

The price is execution: you carry the build, the talent, the runbooks, and the cost governance yourself.

A custom core fits when AI supports multiple connected ecommerce journeys, strict latency and peak resilience matter, and auditability, consent, or data residency become contractual rather than optional. It overshoots when the use case is narrow, standardized, and not strategically differentiating. In those cases, a packaged product covers the requirement with less ownership cost, and a custom core adds maintenance overhead without proportional value.

AI models comparison

The table below maps each model against nine architectural dimensions that determine how the system will behave at scale.

Dimension	Ready-made AI products	Enterprise AI platforms	Custom AI core
Data control	Narrow, vendor-defined shapes	Prebuilt connectors and schemas; custom fields often forced into generic slots	You own schemas, denormalization, freshness windows; easiest to reuse
Model availability	One or two basic models only	A somewhat wider list, but typically without frontier models	Almost any model, often including versions fine-tuned on your own data
Integration depth	Shallow unless you buy extra modules	API coverage varies; webhook latency may pinch	Direct adapters; near-real-time ingestion where it matters
Latency at peak	Shared inference queues; CDN helps, but you do not tune it	Shared tenancy; burst behavior is partly opaque	Can run on dedicated hardware (e.g., for small language models)
Experimentation	Limited to the product’s UI layer	Bound by exposed knobs	Feature flags, canaries, offline eval sets; ship weekly
Multi-brand / multi-locale	One catalog flavor; locale expansion gets expensive	Policy duplication risk	Policy layer with global logic and local overrides
Observability	Black-box scoring; exports at best	Dashboards; limited traceability	Per-decision traces, replayable datasets, audit trails
Cost governance	Tier jumps at usage peaks	Coarse alerts, little route-level control	Route-level unit economics, budget guards in code
Vendor lock-in	Proprietary schemas; exit means re-implementing logic	API/DSL ties; export possible but costly	Lock-in moves to your code; manageable with clean APIs

Ready-made products optimize for deployment speed. Enterprise platforms balance speed and control within vendor primitives. A custom core trades upfront ownership cost for full control. The decision comes down to which architectural dimensions are non-negotiable over the 24-36 month roadmap.


How to Choose an Enterprise AI Model

To choose the right AI architecture model, evaluate the future operating conditions of AI, not just the first use case. The six criteria below are the ones that move the decision when teams score it honestly: use case scope, control over data and schemas, integration depth, latency and peak-load requirements, compliance and audit needs, and long-term cost behavior.

Use case scope

One narrow, standardized use case rarely justifies an architecture conversation. A ready-made product is usually enough, and your job is to keep the data exportable and resist the urge to over-engineer.

The moment you have several connected use cases, such as ranking, support, and product content, the calculation shifts. Either a platform centralizes orchestration, or a custom core lets all of them share the same data, the same policies, and the same integrations. Skipping that step produces duplicated integration code, divergent data models, and policy logic that each team interprets slightly differently.

Control over data and schemas

If vendor-defined data shapes match your reality, a product works. If you need custom attributes, golden records, freshness windows, or denormalized data reused across multiple use cases, vendor shapes start to fight you. That tension shows up as “temporary” mapping scripts that eventually become permanent.

The deeper question is whether you control the schema or the schema controls you. With a platform, you get more room, but still inside its conventions. With a custom core, the schemas are yours, which means you can shape them around how your business thinks about customers, products, and decisions.

Integration depth

Shallow integrations, such as read-only feeds and occasional webhooks, are well suited to a packaged product or platform, and many enterprise AI initiatives can operate this way during their pilot phase.

Deep integrations have different architectural requirements. High-frequency, near-real-time read/write flows, contract-tested APIs, synchronization across SAP, ERP, PIM, and CRM systems, and shared core services used across systems all point toward a custom-first architecture. While platform-native connectors can bridge some of these gaps, they rarely satisfy the demands of an SAP Commerce environment managing complex, multi-market data.

Ultimately, a custom-first model delivers the greatest value when deep integration is a fundamental capability rather than an auxiliary requirement.

Latency and peak-load requirements

Autosuggest, search, ranking, support flows, and pricing logic each carry their own SLO and should hold reliably under any load.

A useful rule of thumb: an AI-assisted path should hold P95 < 120 ms for autosuggest and P95 < 200 ms at peak for the rest, not just off-peak. Shared SaaS infrastructure rarely guarantees that, and even strong AI platforms struggle when peaks go 5× baseline or more. A custom core earns its keep here through tuned caches, precompute windows, and warm pools sized to your actual peak profile, not the vendor’s averaged one.
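The rule of thumb above can be checked mechanically against measured latencies. The sketch below is one possible way to do it in Python, using a nearest-rank percentile. The function names and the sample data are hypothetical; only the 120 ms and 200 ms targets come from the text.

```python
# SLO targets from the rule of thumb above, in milliseconds.
P95_TARGETS_MS = {"autosuggest": 120, "default": 200}


def p95(latencies_ms: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, giving a 1-based rank without float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_slo(path: str, latencies_ms: list) -> bool:
    """Compare the measured P95 with the target for the given request path."""
    target = P95_TARGETS_MS.get(path, P95_TARGETS_MS["default"])
    return p95(latencies_ms) < target


# Hypothetical peak-hour samples: most requests are fast, a few are slow.
peak_samples = [80.0] * 90 + [110.0] * 5 + [300.0] * 5
print(p95(peak_samples))                       # 110.0
print(meets_slo("autosuggest", peak_samples))  # True
```

The useful part is running the same check on peak-hour samples and off-peak samples separately, since an average over the whole day can hide a breach during the peak.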

Compliance and audit requirements

When standard, dashboard-level auditing is sufficient and corporate data residency policies align with a vendor’s existing geographic availability, packaged products or platform paths can adequately support the compliance framework. These solutions are fundamentally engineered to satisfy the median enterprise requirement.

However, if the operational environment demands rigorous decision lineage, localized per-market consent mechanisms, custom policy enforcement, or comprehensive traceability for every AI-driven action, systemic ownership must shift directly to your infrastructure. Regulatory bodies do not accept vendor assurances as a substitute for verifiable forensic evidence.

Establishing a self-managed policy layer represents the only viable architectural mechanism to execute global business logic with regional overrides cleanly, completely avoiding the high-maintenance risk of cloning the entire environment for each independent market.

Long-term cost behavior

Initial license or build cost is the smallest part of the picture. Over a 24–36 month horizon, costs are driven by a handful of factors: traffic growth, peak demand patterns, and the complexity of the business itself.

Localization requirements, catalog expansion, and the pace of experimentation all add to the equation. On the technology side, model pricing, data egress, observability tooling, vector storage, and the operational team required to support the system shape the long-term cost curve.

Products tend to come with tier jumps and localization surcharges. Platforms tend to come with premium guardrails, connector gaps, and egress fees. Custom cores pile on costs for observability, model lifecycle, and vector growth if you skipped them on day one. Honest cost modeling means projecting all three across three years and picking the curve you can defend, not just the launch month you can fund.
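As a rough illustration of how tier jumps show up in a multi-year projection, the Python sketch below projects monthly cost under a tiered price list and steady traffic growth. Every number in it (tiers, fees, starting volume, growth rate) is a hypothetical placeholder, not vendor pricing or client data.

```python
# All figures are hypothetical placeholders, not real vendor pricing.
# Each tier is (maximum monthly requests, monthly fee).
TIERS = [(1_000_000, 2_000), (5_000_000, 7_500), (20_000_000, 25_000)]


def tiered_monthly_cost(requests: int) -> int:
    """Return the fee of the first tier that covers the given volume."""
    for limit, fee in TIERS:
        if requests <= limit:
            return fee
    raise ValueError("volume exceeds the largest tier")


def project_costs(start_requests: int, monthly_growth: float, months: int = 36) -> list:
    """Project the monthly fee over the horizon, assuming compound traffic growth."""
    costs = []
    volume = float(start_requests)
    for _ in range(months):
        costs.append(tiered_monthly_cost(int(volume)))
        volume *= 1 + monthly_growth
    return costs


costs = project_costs(start_requests=800_000, monthly_growth=0.05)
print(costs[4], costs[5])  # 2000 7500: the fee jumps once volume crosses the first tier
print(sum(costs))          # total projected spend over 36 months
```

A similar projection for a platform or a custom core would swap the tier table for its own cost drivers, which makes the three curves comparable over the same horizon.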

Common AI Model Risks in Ecommerce

AI architectures rarely fail on day one. Instead, degradation occurs over a 24-to-36-month horizon as technical debt from early implementation shortcuts collides with production scale, complex multi-market policies, and operational budget constraints. The four risks below appear in all three deployment models, although they surface at different points in time. Mitigating them early is what separates a scalable, maintainable enterprise infrastructure from an inflexible system bound to legacy constraints.

Vendor lock-in

Lock-in exists in every model but manifests at different architectural layers. Products lock you into proprietary schemas, APIs, DSLs, and a preferred model stack. Platforms lock you into connectors and primitives. Custom cores lock you into your own code, which means every migration becomes your responsibility, not the vendor’s.

Pro tip:

enforce exportable data formats, isolate vendor-specific logic behind API facades, and document integration contracts outside individual team memory.
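One way to isolate vendor-specific logic behind a facade is sketched below in Python. The `Recommender` interface, the `VendorARecommender` adapter, and the fake client with its response shape are all hypothetical; they stand in for whatever SDK a real vendor provides.

```python
from typing import Protocol


class Recommender(Protocol):
    """Facade the rest of the codebase depends on; no vendor types leak through."""

    def recommend(self, customer_id: str, limit: int) -> list: ...


class VendorARecommender:
    """Adapter that maps a hypothetical vendor response to plain SKU strings."""

    def __init__(self, client) -> None:
        self._client = client

    def recommend(self, customer_id: str, limit: int) -> list:
        raw = self._client.fetch(user=customer_id, n=limit)
        return [item["sku"] for item in raw["items"]]


class FakeVendorClient:
    """Stand-in for a vendor SDK so the sketch runs without network access."""

    def fetch(self, user: str, n: int) -> dict:
        return {"items": [{"sku": f"SKU-{i}"} for i in range(n)]}


def render_carousel(recommender: Recommender, customer_id: str) -> list:
    """Calling code only knows the facade, so swapping vendors touches one adapter."""
    return recommender.recommend(customer_id, limit=3)


skus = render_carousel(VendorARecommender(FakeVendorClient()), "c-42")
print(skus)  # ['SKU-0', 'SKU-1', 'SKU-2']
```

With this shape, an exit from the vendor means writing a second adapter, not rewriting every caller.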

Hidden cost growth

The visible costs, including licenses, build, and infrastructure, are usually the easiest to estimate. The harder part is accounting for tier jumps at peak, overage fees, localization surcharges, and the operational burden that emerges over time. Custom is not free either: observability, model lifecycle, vector growth, and governance processes continue generating costs long after launch.

The right architecture decisions at the inference layer can keep cost growth under control even as AI usage scales. The example below shows how one of our clients cut LLM costs by tying summary regeneration to change thresholds rather than to raw activity.

We worked with a large beauty retailer to implement automated AI summaries on product pages that aggregated thousands of customer reviews. While generating the summaries was technically straightforward, doing so at scale introduced a cost challenge: regenerating a summary for every single new review would lead to unpredictable LLM token costs.

To solve this, we designed an event-driven architecture that treated summary generation as a selective process triggered by meaningful data changes rather than raw activity. In this setup, a dedicated AI microservice processed the reviews, and SAP Commerce Cloud consumed and rendered the static summaries on product pages. This decoupled design ensured that the AI layer could scale independently without adding processing overhead to the core ecommerce platform.


The core of the cost discipline sat within an auxiliary Azure-based decision layer. SAP Commerce Cloud exported review data as lightweight JSON payloads to an Azure hot folder, triggering an Azure Function. Instead of automatically calling the LLM, this function compared the incoming review count against the volume used for the previous summary. A new LLM generation was initiated only when the volume of new reviews crossed a predefined threshold. By handling this evaluation logic entirely within the Azure Function, we kept the commerce platform lean and insulated the business from constant LLM calls, tying expensive AI execution to meaningful data shifts.
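The decision logic described above can be sketched as a plain Python function. This is an illustration, not the client's code: the Azure Function wiring is omitted, the threshold value is a made-up placeholder, and treating the threshold as an absolute count and generating a first summary as soon as any review exists are both assumptions.

```python
from typing import Optional

# Hypothetical threshold; the real value and whether it was absolute or
# relative are not stated in the case description.
NEW_REVIEW_THRESHOLD = 25


def should_regenerate(
    current_count: int,
    count_at_last_summary: Optional[int],
    threshold: int = NEW_REVIEW_THRESHOLD,
) -> bool:
    """Decide whether enough new reviews arrived to justify an LLM call."""
    if count_at_last_summary is None:
        # Assumption: with no previous summary, generate once any review exists.
        return current_count > 0
    return current_count - count_at_last_summary >= threshold


print(should_regenerate(110, 100))  # False: only 10 new reviews
print(should_regenerate(130, 100))  # True: 30 new reviews cross the threshold
```

Because the check is a cheap integer comparison, it can run on every export without adding meaningful cost, while the LLM call runs only when the comparison passes.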

Pro tip:

model the cost curve across a 24-36 month horizon, tie expensive operations like LLM calls to meaningful change events, and budget for ops, governance, and vector growth on day one, not after launch.

Weak observability

Weak observability becomes visible the moment a model output is challenged, whether by an auditor, a customer, or a postmortem on a conversion drop. Without per-decision traces, replayable datasets, and audit trails, none of those questions can be answered with evidence rather than speculation.

Dashboards alone do not solve this. The architecture has to support reconstructing the exact inputs, model state, and policy decisions behind a single output, scoped to a specific customer, market, and timestamp, without manual log correlation across disparate, un-indexed log files.

Pro tip:

instrument per-decision traces, retain replayable input datasets, and structure logs around customer, market, and timestamp so any output can be reconstructed without forensic log-mining.
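A per-decision trace can be as simple as one structured log line per output. The Python sketch below shows one possible shape; the field names, version labels, and the in-memory list used as a log are illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DecisionTrace:
    """Everything needed to reconstruct one AI decision after the fact."""

    customer_id: str
    market: str
    timestamp: str       # ISO 8601, UTC
    model_version: str
    policy_version: str
    inputs: dict
    output: dict


def to_log_line(trace: DecisionTrace) -> str:
    """Serialize a trace as one JSON line with stable key order."""
    return json.dumps(asdict(trace), sort_keys=True)


def find_traces(log_lines: list, customer_id: str, market: str, timestamp: str) -> list:
    """Return the traces matching a customer, market, and timestamp."""
    traces = [json.loads(line) for line in log_lines]
    return [
        t for t in traces
        if t["customer_id"] == customer_id
        and t["market"] == market
        and t["timestamp"] == timestamp
    ]


# Hypothetical example values.
trace = DecisionTrace(
    customer_id="c-42",
    market="de",
    timestamp="2025-01-01T12:00:00Z",
    model_version="ranker-v3",
    policy_version="policy-v7",
    inputs={"query": "lipstick"},
    output={"skus": ["SKU-1", "SKU-2"]},
)
log = [to_log_line(trace)]
match = find_traces(log, "c-42", "de", "2025-01-01T12:00:00Z")
print(match[0]["model_version"])  # ranker-v3
```

In production the lines would go to an indexed store, but the key point is the same: the lookup keys are part of the record, so no cross-system correlation is needed.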

Duplicated regional logic

The most common shortcut in multi-market rollouts is cloning the entire AI workflow for each region. The number of copies scales linearly with the number of markets, while the maintenance and audit costs scale quadratically: each new market adds a copy that must stay synchronized with every other copy whenever a global rule changes.

The cleaner pattern is global logic plus local overrides: one policy layer that recognizes locale and market, applies the right data-handling and decision flow, and logs consent or compliance steps locally. The same pattern keeps audits manageable, because reviewers see one system with documented variations, not five systems pretending to be one.
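The global-plus-overrides pattern can be sketched in a few lines of Python. The policy keys, the markets, and the override values are hypothetical examples, and the merge shown here is deliberately shallow: an override replaces a global value outright.

```python
# Hypothetical global policy, defined once for all markets.
GLOBAL_POLICY = {
    "require_consent_for_personalization": True,
    "blocked_categories": ["tobacco"],
    "data_residency": "eu",
}

# Hypothetical scoped overrides; only the differences are stored per market.
MARKET_OVERRIDES = {
    "us": {"data_residency": "us"},
    "de": {"blocked_categories": ["tobacco", "alcohol"]},
}


def resolve_policy(market: str) -> dict:
    """Return the global policy with the market's overrides applied on top."""
    return {**GLOBAL_POLICY, **MARKET_OVERRIDES.get(market, {})}


print(resolve_policy("us")["data_residency"])      # us
print(resolve_policy("de")["blocked_categories"])  # ['tobacco', 'alcohol']
print(resolve_policy("fr") == GLOBAL_POLICY)       # True: no override, global rules apply
```

A change to a global rule is made in one place, and the override table doubles as the documented list of variations an auditor would review.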

Pro tip:

define global business and compliance logic in one policy layer with scoped overrides per market or brand, rather than cloning workflows for each region.

Which AI Model to Choose

At some point, most enterprise ecommerce teams stop running disconnected AI pilots and start asking how all of these initiatives should work together within a single, governed AI system. That is the moment architecture stops being a back-office concern and becomes a leadership decision.


Gartner predicts that more than 40% of agentic AI projects will be canceled by the end of 2027, largely due to escalating costs, unclear value, and weak risk controls. These causes point to the vulnerabilities that appear when enterprise architecture is excluded from early-stage strategic planning.

The three checklists below define the conditions under which each model is the right architectural choice. If your situation lines up with one checklist, that is your model. If it spans two, the right move is usually to start where you are and design for migration to the next step.

Choose ready-made AI products when:

AI solves one narrow, standardized use case.
The use case is not strategically differentiating.
Vendor logic and data structures are enough for your needs.
Integrations are shallow or mostly read-only.
Latency and compliance requirements are relaxed.
Fast validation matters more than long-term architectural control.

Choose an enterprise AI platform when:

You need several AI workflows or internal AI apps under one governed environment.
The platform’s connectors cover most of your required systems.
Workflow orchestration matters, but full ownership is not required.
Platform-level governance, monitoring, and access control are enough.
Performance requirements are moderate, and vendor SLOs are acceptable.
The team can work within the platform’s primitives, roadmap, and customization limits.

Choose a custom AI core when:

AI supports multiple connected ecommerce journeys.
AI logic needs to be reused across catalog, search, support, pricing, operations, or product content.
The business runs multiple brands, locales, policies, or compliance contexts.
Strict latency, peak-load resilience, or P95/P99 control matters.
Teams need frequent changes to prompts, ranking logic, rules, workflows, or experiments.
Auditability, consent, data residency, or per-decision traceability are required.
Long-term cost predictability matters more than the fastest possible launch.

The right model depends on the operating conditions AI has to work within over the next 24-36 months. Migrations between models cost more than getting it right the first time.

To Sum Up

Enterprise AI architecture for ecommerce is a control decision. Where you place control over data, business logic, integrations, compliance, performance, and cost governance determines how AI behaves as it moves into core operations. The choice of model therefore depends more on the operational control required over a 24-to-36-month horizon than on the first use case under consideration.

Ready-made products, enterprise AI platforms, and a custom AI core each present distinct balances of deployment speed, technical ownership, and risk exposure. The objective of this evaluation is to align the systemic framework with actual organizational scale requirements before suboptimal architectural choices manifest as unpredictable cost growth, compliance audit gaps, or roadmap dependencies bound to vendor constraints.

If you are evaluating these options within your existing commerce ecosystem, Expert Soft is available to provide technical insights. Our engineering team can assist in analyzing your infrastructure against high-volume performance and multi-market scaling requirements.

FAQ

What does an enterprise AI architect do?

An enterprise AI architect designs how AI fits into the rest of the business: how it gets data from systems like SAP Commerce, PIM, ERP, and CRM, how it enforces policies across brands and markets, how it meets SLOs at peak load, and how it stays observable and auditable. The role spans data, model, orchestration, ops, and governance layers, and it tends to operate at the intersection of engineering, compliance, and product.
What are the 3 types of enterprise AI architecture?

In an ecommerce context, the three working models are ready-made AI products (product-first), enterprise AI platforms (platform-first), and a custom AI core (custom-first). Product-first is fastest to deploy for narrow use cases. Platform-first centralizes governance and orchestration within vendor primitives. Custom-first gives you full control over data, integrations, latency, and policy at the cost of stronger ownership.
What are the core components of enterprise AI architecture?

Five layers usually matter in this architecture. The data layer ingests, enriches, and validates inputs from systems like PIM, ERP, and SAP Commerce. The model layer manages ranking, recommendation, and LLM adapters with integrated guardrails. The orchestration layer handles APIs, routing, and experimentation through feature flags. For system monitoring, the ops layer oversees observability, cost governance, and SLOs. The policy and governance layer manages global business rules while allowing local overrides for multi-market and multi-brand setups.
What are the specifics of AI architecture for ecommerce systems?

Ecommerce AI architecture has to deal with three pressures that most other industries do not face at the same time. Data is federated across PIM, ERP, OMS, CRM, and CMS, so AI cannot assume a single source of truth. Traffic is spiky and seasonal, so latency targets like P95 < 120 ms for autosuggest and P95 < 200 ms at peak have to hold during BFCM (Black Friday and Cyber Monday), not only on a low-traffic Tuesday. And multi-brand, multi-locale operations make policy and compliance a first-class architectural concern, not a configuration field.