What a forward-deployed engineering engagement looks like inside a $720M regional packaged-foods company. Synthetic client (Vermillion Foods, natural-ingredient brand with 6 plants and 7 DCs), real methodology, real artifacts. Eight agents shipped to production across SAP S/4HANA + Manhattan SCALE + o9. The team stopped chasing exceptions and started doing strategic sourcing.
What we walked into. Three weeks of discovery, eight systems mapped, the heatmap of where time was actually leaking.
Supplier quirks, regional weather rules, DC dock constraints. How undocumented ops conventions became agent-runnable rules.
What the Coordinator, Director Ops, and COO each saw on Monday morning after the agents went live.
How a single supplier disruption flows through five sub-agents in parallel, and what the resolution loop looks like.
Before/after metrics, accuracy curve, full audit trail, and what the team kept after we left.
Vermillion Foods had migrated SAP from ECC to S/4HANA the prior year, deployed Manhattan SCALE for warehouse ops, and rolled out o9 Solutions for planning. The systems were modern. The work wasn’t. Coordinators spent their week chasing supplier exceptions through Slack and email. A war room was convened for every Type-12 disruption (a supplier delay with a partial shortfall). The COO wanted on-time in-full (OTIF) delivery up to 97%, stockout rate down to 2%, and the team out of the war-room business and into strategic sourcing. Three weeks in, we had the picture below.
Heatmap from time-and-motion shadowing of 16 ops professionals across two weeks (coordinators, planners, buyers, warehouse leads, quality analysts). Exception chasing consumed nearly a third of the org’s capacity — most of it work that the systems should have absorbed automatically.
Week-three deliverable: which agent to build first and why. P1 + P2 are the foundation — exception routing and demand sensing are how the war room shuts down. Everything else flows from those.
The systems weren’t the constraint. SAP, Manhattan, and o9 were all modern, all integrated. The constraint was the human-in-the-loop logic that connected them. Every Type-12 supplier delay required a coordinator to: check the DC’s safety stock manually, calculate transfer cost vs lost-revenue manually, find a backup supplier’s capacity manually, draft a customer ETA update manually. The agent doesn’t replace any of those systems — it sits between them and absorbs the work that humans had to do because nobody had built the connective tissue. Five exception types covered 71% of weekly war-room volume. Solve those five, the war room mostly disappears.
Six plants, 7 DCs, 180 suppliers, 240 SKUs. Half of how Vermillion’s ops team actually runs production was in nobody’s SOP — it lived in the heads of the senior coordinators who had been there through every supplier transition, every dock-constraint workaround, every weather-driven inventory build.
Vermillion had a 47-page operations playbook in Confluence. The agent extracted 12 executable rules from it. Six of the most-cited shown below.
What lived in people’s heads. Captured through structured interviews with senior coordinators, planners, the procurement bench, the QA team, and the regional warehouse leads. Each rule has a named source so it can be revisited as the org changes.
Pacific Components always ships 2-3 days late on orders over $200k. Build buffer into timeline; don't rely on quoted lead times.
DC-7 (Atlanta) has loading dock constraints that require staggered inbound scheduling on Mondays — too many trucks book the 8am slot otherwise.
Q4 holiday season: double safety stock on top 50 SKUs starting October 1, not the policy date of November 1. Two years of stockouts taught us this.
When o9 forecast disagrees with sales team input by >20%, always use the higher number for top 10 accounts. Sales has direct customer signal we don’t.
DC-3 (Tampa) always carries 12% extra of hurricane-zone SKUs in summer months — not in any official document, but every coordinator knows.
Mark at procurement handles all Coastal Co-Pack exceptions because of timezone and relationship context. Don’t auto-route those.
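To make "agent-runnable" concrete, here is a minimal sketch of how rules like the six above might be encoded. The function names are hypothetical, and three details are assumptions the interviews did not pin down: the Pacific Components buffer uses the worst case of 3 days, forecast disagreement is measured relative to the o9 number, and "summer" means June through August.

```python
from datetime import date

def buffered_lead_time_days(supplier: str, order_value_usd: float, quoted_days: int) -> int:
    """Pacific Components runs 2-3 days late on orders over $200k; plan for 3."""
    if supplier == "Pacific Components" and order_value_usd > 200_000:
        return quoted_days + 3  # assumption: buffer for the worst case
    return quoted_days

def planning_forecast(o9_units: float, sales_units: float, is_top10_account: bool) -> float:
    """Use the higher number when o9 and sales disagree by >20% on a top-10 account."""
    if is_top10_account and o9_units > 0:
        gap = abs(sales_units - o9_units) / o9_units  # assumption: relative to o9
        if gap > 0.20:
            return max(o9_units, sales_units)
    return o9_units

def safety_stock_multiplier(dc: str, on: date, is_top50_sku: bool,
                            is_hurricane_zone_sku: bool) -> float:
    """Multiplier applied on top of the policy safety stock level."""
    multiplier = 1.0
    if is_top50_sku and on.month >= 10:
        multiplier *= 2.0   # Q4 build starts October 1, not November 1
    if dc == "DC-3" and is_hurricane_zone_sku and on.month in (6, 7, 8):
        multiplier *= 1.12  # Tampa carries 12% extra in summer months
    return multiplier

def can_auto_route(supplier: str) -> bool:
    """Coastal Co-Pack exceptions always go to a named person in procurement."""
    return supplier != "Coastal Co-Pack"
```

Keeping each rule as a small, separately named function is one way to preserve the link back to its named source, so a rule can be revisited or retired without touching the others.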
Every agent ships with an explicit confidence threshold. Below it, the agent escalates to a named human; it never fails silently. Quality-related decisions ship with stricter thresholds because of recall risk.
In CPG, a wrong agent decision can mean a stockout that loses a Walmart slot, a quality issue that becomes a recall, or a supplier dispute that breaks a relationship. The whole stack is built around the agent declaring uncertainty rather than guessing. Below threshold, work routes to a named human with the full reasoning trace attached. Quality and recall-risk patterns trigger the highest thresholds; routing decisions can run leaner.
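The escalation behaviour can be sketched roughly as follows. The threshold values, category names, and owner identifiers are illustrative assumptions; the only constraints taken from the engagement are that quality decisions carry the strictest threshold, that escalations go to a named human with the reasoning trace attached, and that nothing fails silently.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values: only the ordering (quality strictest, routing leanest)
# reflects the text above.
THRESHOLDS = {"quality": 0.98, "sourcing": 0.92, "routing": 0.85}

@dataclass
class Decision:
    category: str
    confidence: float
    recommendation: str
    reasoning_trace: list

@dataclass
class Outcome:
    action: str            # "proceed" or "escalate"
    owner: Optional[str]   # named human when escalated
    decision: Decision     # reasoning trace travels with the decision

def route(decision: Decision, escalation_owner: dict) -> Outcome:
    # Unknown categories fall back to the strictest threshold.
    threshold = THRESHOLDS.get(decision.category, max(THRESHOLDS.values()))
    if decision.confidence >= threshold:
        return Outcome("proceed", None, decision)
    owner = escalation_owner.get(decision.category)
    if owner is None:
        # No named human configured: fail loudly rather than silently.
        raise LookupError(f"no escalation owner for category {decision.category!r}")
    return Outcome("escalate", owner, decision)
```

Defaulting unknown categories to the strictest threshold is a deliberate choice in this sketch: a new decision type should have to earn a leaner threshold, not inherit one.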
Three perspectives on the same Monday morning, three weeks after deployment. The Coordinator saw a 4-item queue (down from 30+). The Director Ops saw a 220-person team operating at 300-person effective capacity. The COO saw the weekly ops cycle at Day 3 of 5, on track.
Rachel Torres, Senior Supply Chain Coordinator at DC-4, opened her queue Monday morning and saw four items requiring human judgment, down from 30+ before the agents shipped. Everything else was handled overnight across 7 DCs.
A supplier disruption arrives. The Main Operations Agent spawns five specialized sub-agents in parallel — each assembling different context, checking different thresholds, surfacing different decisions. The orchestrator synthesises a recommendation. The team gets the full plan; they decide whether to approve.
Assembled context: 340 units short, 12 customer orders affected, DC-4 safety stock at 15%, DC-7 has 420 surplus units, backup supplier Meridian Parts has capacity.
Classified as Type 12: Supplier Delay — Partial Shortfall. Severity: High. Matches repeatable pattern (1 of 31 automatable types).
Revenue at risk: $127k across 12 orders. 3 orders have delivery commitments within 48 hours. 2 key accounts affected. Alternative sourcing available.
Recommended: reroute 280 units from DC-7 surplus and expedite 60 from Meridian Parts, covering the full 340-unit shortfall. Cost: $4,200. Alternative: wait 3 days and risk $127k in revenue. Recommendation: approve the reroute.
Draft actions assembled: DC-7 → DC-4 transfer order, Meridian Parts expedite PO, Pacific Components delay notification, customer ETA updates. Awaiting approval.
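The fan-out and synthesis above can be sketched with `asyncio`. This is a simplified illustration, not the production orchestrator: the sub-agent names are our labels for the five outputs, each stub returns the figures from the walkthrough instead of querying SAP, Manhattan, or o9, and a real implementation would likely pass assembled context between sub-agents.

```python
import asyncio

# Stub sub-agents (hypothetical names) returning the walkthrough figures.
async def assemble_context(event: dict) -> dict:
    return {"units_short": 340, "orders_affected": 12, "dc7_surplus_units": 420}

async def classify(event: dict) -> dict:
    return {"type": 12, "label": "Supplier Delay, Partial Shortfall", "severity": "High"}

async def assess_impact(event: dict) -> dict:
    return {"revenue_at_risk_usd": 127_000, "orders_due_within_48h": 3}

async def plan_resolution(event: dict) -> dict:
    return {"reroute_units": 280, "expedite_units": 60, "cost_usd": 4_200}

async def draft_actions(event: dict) -> dict:
    return {"drafts": [
        "DC-7 to DC-4 transfer order",
        "Meridian Parts expedite PO",
        "Pacific Components delay notification",
        "customer ETA updates",
    ]}

async def handle_disruption(event: dict) -> dict:
    # Fan out: all five sub-agents run concurrently on the same event.
    context, classification, impact, plan, drafts = await asyncio.gather(
        assemble_context(event),
        classify(event),
        assess_impact(event),
        plan_resolution(event),
        draft_actions(event),
    )
    # Synthesise: sanity-check the plan before asking a human to approve it.
    covered = plan["reroute_units"] + plan["expedite_units"]
    return {
        "classification": classification,
        "covers_shortfall": covered >= context["units_short"],
        "cost_below_risk": plan["cost_usd"] < impact["revenue_at_risk_usd"],
        "drafts": drafts["drafts"],
        "status": "awaiting_approval",  # nothing executes without approval
    }
```

Calling `asyncio.run(handle_disruption({"supplier": "Pacific Components"}))` returns the synthesised plan with its status left at `awaiting_approval`, which mirrors the point that the team, not the orchestrator, makes the final call.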
When a human corrects the agent’s recommendation — say, choosing a partial-reroute split-source over a full reroute — the correction is captured. The next similar disruption pattern gets the corrected playbook automatically.
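One simple way to picture the correction loop, as a sketch under our own assumptions: playbooks are keyed by disruption pattern, each human correction is logged against that pattern, and the most recent correction wins on the next lookup. The class and method names are hypothetical, and a production version would also need review and rollback.

```python
from dataclasses import dataclass, field

@dataclass
class PlaybookStore:
    defaults: dict                                   # pattern -> default playbook
    corrections: dict = field(default_factory=dict)  # pattern -> list of (agent_plan, human_plan)

    def record_correction(self, pattern: str, agent_plan: str, human_plan: str) -> None:
        # Keep both plans so the audit trail shows what was changed and from what.
        self.corrections.setdefault(pattern, []).append((agent_plan, human_plan))

    def playbook_for(self, pattern: str) -> str:
        history = self.corrections.get(pattern)
        if history:
            return history[-1][1]  # latest human-approved plan wins
        return self.defaults[pattern]

store = PlaybookStore(defaults={"type-12": "full reroute"})
store.record_correction("type-12", "full reroute", "partial reroute, split-source")
```

After the correction is recorded, `store.playbook_for("type-12")` returns the split-source plan instead of the default.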
Three real ops changes from Q1 — supplier lead-time shifts, demand pattern adjustments, new supplier onboarding. Each detected automatically and absorbed without manual reconfiguration of routing rules.
Average lead time increased from 14 to 18 days. Safety stock calculations and PO timing rules auto-adjusted across affected SKUs.
POS data showed 22% above forecast for 3 consecutive weeks. Demand sensing weights recalibrated. Pre-build triggered at DC-3 and DC-5.
Backup supplier added to exception routing options. Lead time, quality standards, and capacity constraints integrated into resolution planning.
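Two of these detections reduce to simple checks, sketched here under stated assumptions. The trigger values (a 15% lift sustained for 3 weeks, a 2-day lead-time tolerance) are hypothetical defaults chosen for illustration; the engagement's actual thresholds are not given above.

```python
from statistics import mean
from typing import Optional, Sequence

def sustained_over_forecast(actuals: Sequence[float], forecasts: Sequence[float],
                            min_lift: float = 0.15, weeks: int = 3) -> bool:
    """True when POS ran at least `min_lift` above forecast in each of the last `weeks` weeks."""
    recent = list(zip(actuals, forecasts))[-weeks:]
    if len(recent) < weeks:
        return False
    return all(f > 0 and (a - f) / f >= min_lift for a, f in recent)

def lead_time_shift_days(baseline_days: float, recent_days: Sequence[float],
                         tolerance_days: float = 2.0) -> Optional[float]:
    """Return the new average lead time if it drifted past tolerance, else None."""
    observed = mean(recent_days)
    if abs(observed - baseline_days) > tolerance_days:
        return observed
    return None
```

With weekly POS at 22% above forecast for three straight weeks, `sustained_over_forecast` returns `True`; with recent receipts averaging 18 days against a 14-day baseline, `lead_time_shift_days` returns the new average so downstream safety stock and PO timing rules can be recomputed.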
Month 1 vs Month 4 at Vermillion Foods. The agents got smarter, disruption response got faster, the team shifted from war-room exception chasing to strategic sourcing.
The agents improve week over week as human corrections and SOP changes are absorbed automatically. Overall accuracy rose from 85% in week 1 to 96.8% by week 16.
Every agent action with timestamp, reasoning, confidence, and human approvals. Searchable. Filterable. Used for retailer compliance audits, supplier QBR documentation, and quarterly board reporting.
Vermillion owns the agents, the data, the rules, the methodology. We did the work; they keep everything.
Every workflow, every rule, every model. Deployed on their infrastructure, inside their VPC, within their security perimeter.
Processing happens in their environment. No supplier or operations data sent to external servers. Full compliance with their security and retailer-data policies.
Zero vendor lock-in. They keep everything if the engagement ends. The IP is in the methodology, not the output.
They own the building. We designed and built it. The blueprints, the structure, the systems. All theirs.
Range, not point estimate. CPG CFOs read these numbers carefully and they need to survive board scrutiny. Below is how the value gets created — each line tied to a specific agent and a specific measurable outcome.
Inventory carrying costs down 19%. Optimized safety stock levels across 7 DCs based on better demand sensing. $9.3M working capital released; conservative carry-cost computation at 7-11% yields the range.
Stockout rate on top-50 SKUs from 6.8% to 2.4%. Revenue not lost. Plus: zero Walmart OTIF penalty exposure this quarter (vs $48k Q1 prior year). Conservative — only counts top-50 SKUs.
14 coordinator FTE-equivalents previously running 35-50% on exception chasing have been redeployed to supplier development, strategic sourcing, and cross-DC optimization. No layoffs — capacity moved up the value chain.
Disruption response time 2-4 days → <6 hours. War room reduced from 90 min/12 people to 25 min/5 people. Proactive Type-12 handling saves expedited-freight costs and prevents retailer penalty exposure.
All baselines are pre-engagement (the prior fiscal quarter at Vermillion). Working capital release computed at 9% blended carry cost (warehouse + opportunity cost). Stockout avoidance uses contribution-margin-weighted lost-sales model on top-50 SKUs only (conservative; long-tail SKU losses excluded). Headcount redeployment uses fully-loaded comp ($85k median for senior coordinators benchmarked against 2025 ASCM data). Range exists because cohort sizes are still small (n=2 quarters of post-engagement data). We never bill more than the lower bound of created value.
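For readers who want to check the arithmetic, here is a rough sketch of how the inputs above combine. Two things are assumptions on our part: the carry-cost rates are treated as annual, and the redeployment value is taken as FTE count times fully-loaded comp times the share of time freed, which the methodology note does not spell out. The stockout-avoidance line is omitted because its lost-sales model inputs are not listed here.

```python
def pct_of(amount_usd: int, pct: int) -> int:
    """Integer percentage, to avoid float rounding noise in dollar figures."""
    return amount_usd * pct // 100

def carrying_cost_savings(released_usd: int) -> tuple:
    """(low, blended, high) annual carry-cost savings at 7%, 9%, and 11%."""
    return tuple(pct_of(released_usd, p) for p in (7, 9, 11))

def redeployed_capacity_value(fte: int, comp_usd: int,
                              low_pct: int = 35, high_pct: int = 50) -> tuple:
    """Assumed formula: FTEs x fully-loaded comp x share of time freed."""
    total_comp = fte * comp_usd
    return pct_of(total_comp, low_pct), pct_of(total_comp, high_pct)

def war_room_person_minutes(minutes: int, people: int) -> int:
    return minutes * people

carry = carrying_cost_savings(9_300_000)          # (651000, 837000, 1023000)
capacity = redeployed_capacity_value(14, 85_000)  # (416500, 595000)
war_room_saved = war_room_person_minutes(90, 12) - war_room_person_minutes(25, 5)  # 955
```

On these assumptions the $9.3M release is worth roughly $651k to $1.02M a year in carrying cost, with $837k at the 9% blended rate, and each war room session shrinks from 1,080 to 125 person-minutes.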
Especially in supply chain — where one quality issue can become a product recall and one missed retailer commitment can lose a slot — knowing the limits is the only way to deploy something that survives reality.
Quarter two: scope expansion to S&OP automation, retailer order-quality scoring, and co-packer capacity optimization. Quarter three: BOT (build-operate-transfer) optionality. The methodology is portable; the agents are theirs.