Enterprise Platform Build · Cloud-Native POS · 200-Outlet Migration · India's Largest Premium Ice Cream Chain

Naturals Ice Cream

A ground-up cloud-native platform replacing a legacy POS across 200 outlets — in a single day. Unified order management from every aggregator. Custom ERP, inventory, and reporting. All running on a multi-datacenter ARM64 cluster that costs a fraction of equivalent managed cloud infrastructure.

200+ Outlets
200K Orders / Day
99.9% Uptime SLA
1 Day Full Migration
6 Order Channels
<2 min Deploy Cycle

01 // The Challenge

Growth Constrained by the System, Not the Market

Naturals Ice Cream — India's largest premium ice cream chain — had outgrown its POS. The legacy system couldn't keep pace with the aggregator volume coming in from Swiggy, Zomato, Vendekin, Delight, and Urban Piper simultaneously. Menu changes required manual updates on each platform separately. Inventory was tracked in spreadsheets checked at end of shift. The ERP had no live connection to what was happening across 200 outlets in real time. Leadership had no unified view — just fragmented data from disconnected systems.

The brand had built a national reputation. The technology hadn't scaled with it. They needed a complete replacement — purpose-built, not adapted from a generic product — and they needed it to go live without disrupting a single day of outlet operations.

In plain terms

Imagine running 200 shops where each one is using a different notebook to track orders, a different phone to receive Swiggy and Zomato orders, and sending daily sales figures to head office by WhatsApp. That was the scale of the problem. Every hour, the gap between what the system could handle and what the business actually needed was getting wider.

02 // What Was Built

A Platform Built for Naturals — Not Adapted From Someone Else's

A purpose-built, cloud-native microservices platform covering the full order lifecycle: discovery, ordering, payment, kitchen dispatch, inventory deduction, reporting, and ERP sync — unified under a single API used by every client: POS terminal, kiosk, web, and mobile.

In plain terms

Think of it as a single brain for the entire Naturals operation. Whether an order comes from Swiggy, a customer walking into an outlet, or the kiosk at the counter — it all flows through the same system, gets processed the same way, and shows up in the same reports. One version of the truth, everywhere.

// System Components

| Service | Technical Role | What It Means for Operations |
| --- | --- | --- |
| Gateway | GraphQL API surface: single endpoint for all clients (POS, kiosk, web, mobile) | Every device talks to one system. POS terminal, kiosk, and the manager's phone all see the same data. |
| Auth | JWT, RBAC, multi-tenant session management with JTI token revocation | Staff see only what they need. Managers see their outlet. CXOs see everything. Access is controlled precisely. |
| Catalogue | Menu, products, pricing, taxes, discounts: master data layer for all channels | Change a price or add a seasonal flavour once. It reflects on Swiggy, Zomato, and the counter POS simultaneously. |
| Order | Full order lifecycle: placement, KOT generation, payment reconciliation, status tracking | Every order from every source goes through the same pipeline. No missed orders. No duplicate kitchen tickets. |
| Inventory | Double-entry stock accounting, recipe management, atomic gRPC stock reservation on order confirm | When an order is placed, stock is deducted automatically. When a flavour runs out, Swiggy and Zomato mark it unavailable without anyone touching a phone. |
| Integration | Aggregator + payment gateway adapters that normalise all external order formats into one internal schema | Swiggy, Zomato, Vendekin, Delight, Urban Piper each work differently internally. This layer hides all of that complexity from the rest of the system. |
| Order Events | Real-time push via SSE (Server-Sent Events), no polling loops | Kitchen screens update the instant a new order arrives. No refresh button. No delay. No orders sitting unseen in a queue. |
| Reporting | Async analytics on a dedicated read replica with TimescaleDB fact tables, materialized views, MinIO file export | Reports open in under a second. Export to Excel anytime. Custom reports built by operations teams, no engineering ticket required. |
| Audit | Append-only compliance log: every state change recorded with before/after values, user ID, timestamp | Full trail of who changed what and when. Invaluable during audits, disputes, or investigating a discrepancy. |

Services communicate over gRPC internally. All external clients see only GraphQL. Async jobs — report exports, stock updates, notification dispatch — run through Asynq on Redis. No Kafka. No RabbitMQ. Nothing additional to operate or maintain.
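The real-time push used by the Order Events service can be illustrated with a minimal sketch using only Go's standard library. The handler name `orderStream`, the `order` event name, and the JSON payload are hypothetical and are not taken from the production code; the channel stands in for whatever the real service subscribes to.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// formatSSE renders one Server-Sent Events frame: an event name, a single
// data line, and the blank line that terminates the frame.
func formatSSE(event, data string) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data)
}

// orderStream returns a handler that pushes every message received on
// updates to the connected client until the channel closes or the client
// disconnects. (Hypothetical name; illustrative only.)
func orderStream(updates <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return // client went away
			case msg, open := <-updates:
				if !open {
					return
				}
				fmt.Fprint(w, formatSSE("order", msg))
				flusher.Flush() // send the frame now instead of buffering
			}
		}
	}
}

func main() {
	// Simulate one new order and then end the stream.
	updates := make(chan string, 1)
	updates <- `{"order_id":"A-101","status":"NEW"}`
	close(updates)

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/events", nil)
	orderStream(updates)(rec, req)
	fmt.Print(rec.Body.String())
}
```

The server writes a frame only when something changes, so a kitchen screen holding the connection open needs no polling loop.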

03 // Order Management

Every Channel. One Queue.

Swiggy · Zomato · Vendekin · Delight · Urban Piper · In-Store / Counter · Kiosk

The Integration service normalises orders from every aggregator into a single internal schema before they reach the Order service. Each aggregator has its own quirks — Swiggy's flexible boolean types, Zomato's menu format, Vendekin's order acknowledgement protocol — handled in isolated adapters so that changes to one aggregator's API don't affect anything else in the system.
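As a rough sketch of the isolated-adapter idea, the Go example below puts each aggregator behind one small interface and absorbs a loosely typed boolean inside the adapter. Everything here is assumed for illustration: the `Order` fields, the `Adapter` interface, and the payload shape with its `is_prepaid` field are invented and do not describe any aggregator's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// FlexBool accepts true/false, 1/0, and their quoted string forms, so a
// loosely typed upstream field decodes into a plain boolean.
type FlexBool bool

func (b *FlexBool) UnmarshalJSON(raw []byte) error {
	switch strings.ToLower(strings.Trim(string(raw), `"`)) {
	case "true", "1", "yes":
		*b = true
	case "false", "0", "no":
		*b = false
	case "null":
		// JSON null leaves the value unchanged.
	default:
		return fmt.Errorf("flexbool: cannot interpret %s as a boolean", raw)
	}
	return nil
}

// Order is a hypothetical internal schema that every adapter must produce.
type Order struct {
	Channel    string
	ExternalID string
	TotalPaise int64
	Prepaid    bool
}

// Adapter converts one aggregator's payload into the internal schema.
type Adapter interface {
	Normalise(payload []byte) (Order, error)
}

// swiggyAdapter parses an invented payload shape, for illustration only.
type swiggyAdapter struct{}

func (swiggyAdapter) Normalise(payload []byte) (Order, error) {
	var in struct {
		OrderID    string   `json:"order_id"`
		TotalPaise int64    `json:"total_paise"`
		Prepaid    FlexBool `json:"is_prepaid"`
	}
	if err := json.Unmarshal(payload, &in); err != nil {
		return Order{}, fmt.Errorf("swiggy: %w", err)
	}
	return Order{
		Channel:    "swiggy",
		ExternalID: in.OrderID,
		TotalPaise: in.TotalPaise,
		Prepaid:    bool(in.Prepaid),
	}, nil
}

func main() {
	var a Adapter = swiggyAdapter{}
	payloads := []string{
		`{"order_id":"SW-1","total_paise":24900,"is_prepaid":"1"}`,
		`{"order_id":"SW-2","total_paise":18000,"is_prepaid":false}`,
	}
	for _, p := range payloads {
		order, err := a.Normalise([]byte(p))
		fmt.Println(order, err)
	}
}
```

Because the Order service only ever sees `Order` values, a format change on one aggregator's side is contained within that aggregator's adapter.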

Payment rails: Paytm EDC for card and UPI at the counter, Razorpay for online orders, plus cash and custom payment modes. Delivery dispatch: Shadowfax. Customer notifications: MSG91 for SMS, Firebase for mobile push.

The outlet operator sees one unified order queue on one screen, regardless of which platform the customer ordered from. There's no tab-switching between Swiggy tablet, Zomato tablet, and the POS. One screen. All orders.

In plain terms

Before, a Naturals outlet might have had three separate tablets — one for Swiggy, one for Zomato, one for their own system — plus their POS. Staff had to watch all of them simultaneously and manually update each one. Now there is one screen. One queue. The system handles everything else automatically in the background.

04 // The Migration

200 Outlets. One Day. Zero Downtime.

Migrating 200 live outlets simultaneously required zero-downtime cutover tooling and complete confidence in rollback. Most vendors would schedule this across three to six months — outlet by outlet, city by city — with extended parallel-run periods where both systems run together.

We compressed it to one day through a migration architecture that pre-loaded all outlet data, menus, inventory configurations, staff accounts, and pricing into the new system before the cutover window. Nothing was entered live on the day.

The deployment pipeline — deploy_via_watchdog.sh targeting Docker Swarm stack updates — enabled per-service rolling updates with automatic health-check gating. A failed health check stops the rollout before it reaches the next replica. The same mechanism supports single-service canary deploys in under two minutes.
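The gating behaviour can be sketched in a few lines of Go. This is a simplified model of the idea, not the contents of deploy_via_watchdog.sh or Docker Swarm's internals: the `Replica` type, the `RollingUpdate` function, and the node names are hypothetical, and restoring the failed replica to its previous version is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a hypothetical handle on one running copy of a service.
type Replica struct {
	Node    string
	Version string
}

// HealthCheck reports whether a replica is healthy after an update.
type HealthCheck func(r *Replica) bool

// ErrRolloutHalted marks a rollout that stopped on a failed health check.
var ErrRolloutHalted = errors.New("rollout halted: health check failed")

// RollingUpdate updates replicas one at a time. After each update it runs
// the health check and stops immediately on failure, so later replicas stay
// on their previous version. It returns how many replicas were updated and
// passed the check.
func RollingUpdate(replicas []*Replica, version string, healthy HealthCheck) (int, error) {
	for i, r := range replicas {
		previous := r.Version
		r.Version = version
		if !healthy(r) {
			r.Version = previous // assumption: restore the failed replica
			return i, fmt.Errorf("%w on node %s", ErrRolloutHalted, r.Node)
		}
	}
	return len(replicas), nil
}

func main() {
	replicas := []*Replica{
		{Node: "hel-1", Version: "v1"},
		{Node: "hel-2", Version: "v1"},
		{Node: "nbg-1", Version: "v1"},
	}
	// Pretend the new version fails its health check on the second node.
	failsOnSecond := func(r *Replica) bool { return r.Node != "hel-2" }

	updated, err := RollingUpdate(replicas, "v2", failsOnSecond)
	fmt.Println("updated:", updated, "error:", err)
	for _, r := range replicas {
		fmt.Println(r.Node, r.Version)
	}
}
```

In this run only the first replica moves to the new version. The second fails its check and the third is never touched, which is the property that keeps a bad release from spreading.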

Outlets came online in a controlled batch sequence within a single maintenance window. Staff arrived the next morning and worked on the new system. No outlet was left on the old system. No rollback was required.

In plain terms

Moving a restaurant's entire technology system is like changing the engine of a moving vehicle. You can't stop the business, but you need to replace every component underneath it. We pre-built everything — it was ready before the switch. Then on the day, we flipped all 200 outlets at once. If something had gone wrong with one piece, the system would have automatically stopped and held the rest until it was fixed.

Migration Day Timeline

Pre-Day: All outlet data, menus, staff accounts pre-loaded. Full dry run completed.
Night: Maintenance window opened. Legacy system frozen. Data snapshot taken.
Cutover: All 200 outlets activated in controlled batch sequence. Health checks verified per outlet.
Morning: Outlets open on new system. First live orders processed. Zero rollback events.

05 // Architecture

Multi-Datacenter Redundancy on a Startup Budget

Production-grade infrastructure doesn't have to mean hyperscaler bills. The entire Naturals platform runs on a 6-node ARM64 cluster across two European datacenters — infrastructure that costs a fraction of equivalent managed cloud, with no compromise on reliability.

// Traffic Flow Diagram

Cloudflare: global load balancer, DDoS protection, WAF, rate limiting (500K req/min)
↓
Hetzner LB, Helsinki: 3× cax31 ARM64 nodes · Hetzner LB, Nuremberg: 3× cax31 ARM64 nodes
↓
Docker Swarm: 6× Gateway replicas (one per node, zero-hop entry) · Microservices (Auth, Order, Inventory, Catalogue, Integration, Reporting, Audit, Events)
↓
PostgreSQL + TimescaleDB: Patroni + etcd leader election · Helsinki primary · Nuremberg standby
Redis Sentinel: Helsinki master · Nuremberg slave · auto-promotion on master loss

Tailscale VPN: all inter-node traffic encrypted · private 10.0.0.0/16 · data layer never publicly exposed

ARM64 Compute — 30–40% Cost Saving

All nodes run Hetzner cax31 ARM64 instances. ARM64 cuts infrastructure spend by roughly 30–40% versus equivalent x86 instances, with comparable performance on typical server workloads.

Total reserved capacity: ~14 vCPU / 16 GB with burst headroom to ~34 vCPU / 80 GB before any node is saturated. The entire production cluster — 6 compute nodes, 2 database nodes, 1 observability node — costs what a single managed Kubernetes node costs on AWS or GCP.

In plain terms

The servers running the Naturals platform use ARM, a more efficient chip architecture from the same family Apple uses in its M-series Macs. It does the same work for significantly less money. That saving compounds every month.

Docker Swarm — No Kubernetes Complexity

Docker Swarm was a deliberate choice over Kubernetes. It eliminates CRDs, Helm charts, and operator sprawl while delivering rolling updates, health checks, and placement constraints — everything needed for a reliable multi-service deployment.

The result is a platform that any competent engineer can operate and debug without deep Kubernetes expertise — reducing operational risk and the cost of ongoing maintenance.

In plain terms

We chose simpler infrastructure tools deliberately. Simpler tools break less. When they do break, they're faster to fix. And they cost less to maintain. Complexity is a liability in production systems at this scale.

Automatic Database Failover — Patroni + etcd

PostgreSQL runs under Patroni with etcd for distributed consensus. The leader (Helsinki) is continuously replicating to a hot standby (Nuremberg). If the primary fails, Patroni automatically promotes the standby — no manual intervention, no data loss.

In plain terms

If the main database server fails, the backup in another city automatically takes over within seconds. Outlets keep taking orders. No one calls to say the system is down.

Private Network — Tailscale VPN

All inter-node traffic is encrypted via Tailscale VPN. The database and Redis layers sit on a private 10.0.0.0/16 network and are never exposed to the public internet. Even if a compute node were compromised, it could not reach the data infrastructure directly.

In plain terms

The databases and sensitive systems are completely invisible to the internet. An attacker can't probe them because they literally can't see them. It's the equivalent of keeping your most valuable records in a vault with no public-facing door.

06 // Inventory

Double-Entry Stock Accounting

In plain terms

The same accounting principles your finance team uses for money — every rupee in, every rupee out, always balanced — applied to stock. You can trace every mango, every cup, every topping forward and backward. Nothing disappears without a record. When an outlet's stock doesn't match the system, you know exactly where the gap is and when it happened.

Inventory uses double-entry accounting — every stock movement is a debit/credit entry pair, making reconciliation auditable and reversible. There are no one-sided adjustments that can hide shrinkage, theft, or data entry errors.
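A minimal Go sketch of the double-entry idea follows. The account names (`supplier`, `outlet:47`, `sold`, `wastage`), the gram unit, and the `Ledger` type are illustrative assumptions rather than the platform's actual schema.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry is one side of a stock movement. Quantities are in grams (an
// illustrative unit): negative leaves the account, positive enters it.
type Entry struct {
	Account string
	Item    string
	Qty     int64
}

// Ledger is an append-only list of entries; nothing is edited in place.
type Ledger struct {
	entries []Entry
}

// Move records a transfer as a pair of entries: the same quantity leaves
// one account and enters another, so each item always sums to zero.
func (l *Ledger) Move(item, from, to string, qty int64) error {
	if qty <= 0 {
		return errors.New("quantity must be positive")
	}
	l.entries = append(l.entries,
		Entry{Account: from, Item: item, Qty: -qty},
		Entry{Account: to, Item: item, Qty: qty},
	)
	return nil
}

// Balance sums every entry for an item in one account.
func (l *Ledger) Balance(account, item string) int64 {
	var sum int64
	for _, e := range l.entries {
		if e.Account == account && e.Item == item {
			sum += e.Qty
		}
	}
	return sum
}

// Total sums an item across all accounts; a balanced ledger returns 0.
func (l *Ledger) Total(item string) int64 {
	var sum int64
	for _, e := range l.entries {
		if e.Item == item {
			sum += e.Qty
		}
	}
	return sum
}

func main() {
	l := &Ledger{}
	_ = l.Move("mango_pulp", "supplier", "outlet:47", 5000) // delivery
	_ = l.Move("mango_pulp", "outlet:47", "sold", 1200)     // orders
	_ = l.Move("mango_pulp", "outlet:47", "wastage", 300)   // spoilage
	_ = l.Move("mango_pulp", "sold", "outlet:47", 200)      // reversal

	fmt.Println("on hand:", l.Balance("outlet:47", "mango_pulp"))
	fmt.Println("ledger total:", l.Total("mango_pulp"))
}
```

A correction is simply another balanced movement in the opposite direction, so the history stays intact and any mismatch with a physical count can be traced to specific entries.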

The Recipe engine decomposes each menu item into raw ingredient quantities. When an order is confirmed, the Order service calls Inventory over gRPC to atomically reserve the exact stock required, at the ingredient level rather than the product level. If two orders arrive simultaneously for the last portion of a flavour, only one reservation succeeds, so nothing is double-sold.
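The all-or-nothing reservation can be sketched in Go as follows. This is an in-memory stand-in under stated assumptions: a mutex plays the role that a database transaction or row lock would play in a real service, the gRPC call is omitted, and the `Recipe` and `Stock` types and the ingredient names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Recipe maps an ingredient to the quantity one portion consumes.
type Recipe map[string]int64

// Stock guards on-hand ingredient quantities with a mutex so that a
// reservation either deducts every ingredient or none of them.
type Stock struct {
	mu     sync.Mutex
	onHand map[string]int64
}

// NewStock copies the initial quantities into a new Stock.
func NewStock(initial map[string]int64) *Stock {
	s := &Stock{onHand: make(map[string]int64, len(initial))}
	for ingredient, qty := range initial {
		s.onHand[ingredient] = qty
	}
	return s
}

// Reserve expands portions of a recipe into ingredient needs, checks that
// every ingredient is available, and only then deducts them.
func (s *Stock) Reserve(r Recipe, portions int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for ingredient, perPortion := range r {
		if s.onHand[ingredient] < perPortion*portions {
			return fmt.Errorf("insufficient %s", ingredient)
		}
	}
	for ingredient, perPortion := range r {
		s.onHand[ingredient] -= perPortion * portions
	}
	return nil
}

// reserveConcurrently lets several buyers race for one portion each and
// returns how many reservations succeeded.
func reserveConcurrently(s *Stock, r Recipe, buyers int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	wins := 0
	for i := 0; i < buyers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.Reserve(r, 1) == nil {
				mu.Lock()
				wins++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return wins
}

func main() {
	scoop := Recipe{"mango_pulp_g": 80, "cream_g": 40, "cup": 1}
	// Enough mango pulp for exactly one more portion.
	stock := NewStock(map[string]int64{"mango_pulp_g": 80, "cream_g": 400, "cup": 10})

	fmt.Println("successful reservations:", reserveConcurrently(stock, scoop, 2))
}
```

Checking every ingredient before deducting any of them is what prevents a half-applied reservation when one ingredient runs short.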

Aggregator stock sync pushes real-time availability back to Swiggy and Zomato automatically. When a flavour is sold out, it goes offline on the delivery platforms within seconds — before more orders come in for something you can't fulfil.

Low-stock alerts fire before an outlet runs out, not after. Operations staff get notified with enough time to act.

What this replaces

Manual end-of-day stock counting. Spreadsheet updates. WhatsApp messages to head office saying "we've run out of mango." Discovering stockouts after they've already caused order rejections.

What this delivers

Real-time stock visibility across all 200 outlets. Automatic delivery platform updates. Auditable trail of every stock movement. Fewer stockouts. Less waste. Less manual work.

07 // Reporting & ERP

Reports That Open in Under a Second

In plain terms

Most reporting systems make you wait while they calculate. Ours pre-calculates everything in the background so that when you open a report, the answer is already there. A finance person can pull a 200-outlet sales breakdown by payment mode for any date range — and it loads instantly. They can also build their own reports without asking engineering for help.

The Reporting service maintains a dedicated read replica with TimescaleDB fact tables pre-aggregated by day, item, category, and payment mode. Materialized views — mv_daily_sales, mv_item_performance, mv_payment_mode and others — make dashboard queries sub-second regardless of the date range or outlet count queried.

Custom SQL query builder: A no-code report builder with schema metadata, safe join validation, and automatic tenant-scope injection lets the operations team construct ad-hoc reports without engineering. The builder prevents invalid queries at the UI level before they ever reach the database.

Excel and CSV exports run asynchronously — the system generates the file, stores it in MinIO (an S3-compatible object store), and serves it back via a signed download URL. Large exports don't block the UI or slow the system.

ERP sync: Invoice generation, tax settlement exports, and payment reconciliation APIs are consumed directly by the finance team's existing tooling. The ERP gets clean, structured data — no manual export, no reformatting, no copy-paste between systems.

Reports available out of the box

Outlet-Level Sales · Channel Mix · Flavour Performance · Payment Mode Breakdown · Same-Store Growth · Peak-Hour Demand · Discount Leakage · Inventory Variance · City-Level Aggregation · ERP Tax Settlement · Payment Reconciliation · Custom Ad-Hoc Queries

08 // Compliance & Security

Built for Audit. Built for Enterprise.

In plain terms

If a government auditor walks in tomorrow asking for GST records, they're available instantly and complete. If an internal investigation needs to know who changed a price at outlet 47 on a specific date, the system can show exactly that — with the before and after values and the timestamp. Security is enforced at the database level, not just on the surface.

RBAC at the ORM Layer

Role-based access control is enforced at the data layer — not just middleware. An outlet manager's credentials cannot retrieve data from another outlet even if they construct a direct API request. Access scoping is structural, not permissive.
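One way to picture scoping at the data layer is a repository that takes the caller's identity on every read, so no request handler can skip the check. The Go sketch below is an assumption-laden illustration: the `Principal`, `Role`, and `SalesRepo` types are hypothetical, and an in-memory slice stands in for the ORM and database.

```go
package main

import "fmt"

// Role names are illustrative, not the platform's actual role set.
type Role string

const (
	RoleOutletManager Role = "outlet_manager"
	RoleCXO           Role = "cxo"
)

// Principal is the authenticated caller, as derived from a verified token.
type Principal struct {
	Role     Role
	OutletID int
}

// Sale is a simplified row of sales data.
type Sale struct {
	OutletID    int
	AmountPaise int64
}

// SalesRepo is the only path to sales rows, and every read requires a
// Principal, so the scope check cannot be bypassed by a crafted request.
type SalesRepo struct {
	rows []Sale
}

// ForOutlet returns the sales of one outlet if the caller may see it.
func (r SalesRepo) ForOutlet(p Principal, outletID int) ([]Sale, error) {
	if p.Role != RoleCXO && p.OutletID != outletID {
		return nil, fmt.Errorf("forbidden: outlet %d is outside the caller's scope", outletID)
	}
	var out []Sale
	for _, s := range r.rows {
		if s.OutletID == outletID {
			out = append(out, s)
		}
	}
	return out, nil
}

func main() {
	repo := SalesRepo{rows: []Sale{
		{OutletID: 47, AmountPaise: 24900},
		{OutletID: 48, AmountPaise: 18000},
	}}
	manager := Principal{Role: RoleOutletManager, OutletID: 47}
	cxo := Principal{Role: RoleCXO}

	own, err := repo.ForOutlet(manager, 47)
	fmt.Println(own, err)
	other, err := repo.ForOutlet(manager, 48)
	fmt.Println(other, err)
	all, err := repo.ForOutlet(cxo, 48)
	fmt.Println(all, err)
}
```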

Append-Only Audit Log

Every state change — order, price, stock, user, configuration — is captured with before/after values, the user ID who made the change, and an exact timestamp. The log is append-only: records cannot be edited or deleted, only added. It is the authoritative record for any dispute or investigation.

JWT with JTI Token Revocation

Authentication tokens include a unique identifier (JTI) that allows individual sessions to be revoked without invalidating every other session. If a staff member's device is lost, their access is terminated precisely — no forced logout of every other user in the system.
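A minimal sketch of per-token revocation in Go is shown below. It assumes the token's signature has already been verified and models only the `jti` and `exp` claims. The in-memory `Denylist` type is hypothetical; a shared store would be needed across service replicas, and the source does not say which store the platform uses.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Claims holds the two registered JWT claims this sketch relies on.
type Claims struct {
	JTI       string // unique token ID ("jti")
	ExpiresAt int64  // expiry in Unix seconds ("exp")
}

var (
	ErrRevoked = errors.New("token revoked")
	ErrExpired = errors.New("token expired")
)

// Denylist remembers revoked token IDs together with their expiry, so an
// entry can be dropped once the token would have expired anyway.
type Denylist struct {
	mu      sync.RWMutex
	revoked map[string]int64
}

func NewDenylist() *Denylist {
	return &Denylist{revoked: make(map[string]int64)}
}

// Revoke blocks a single token without touching any other session.
func (d *Denylist) Revoke(c Claims) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.revoked[c.JTI] = c.ExpiresAt
}

// Check rejects tokens that have expired or been revoked.
func (d *Denylist) Check(c Claims, now int64) error {
	if now >= c.ExpiresAt {
		return ErrExpired
	}
	d.mu.RLock()
	defer d.mu.RUnlock()
	if _, found := d.revoked[c.JTI]; found {
		return ErrRevoked
	}
	return nil
}

// Prune removes entries whose tokens have expired, keeping the list small.
func (d *Denylist) Prune(now int64) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for jti, exp := range d.revoked {
		if now >= exp {
			delete(d.revoked, jti)
		}
	}
}

func main() {
	now := time.Now().Unix()
	lostTablet := Claims{JTI: "jti-lost-tablet", ExpiresAt: now + 3600}
	otherStaff := Claims{JTI: "jti-other-staff", ExpiresAt: now + 3600}

	d := NewDenylist()
	d.Revoke(lostTablet)

	fmt.Println("lost tablet:", d.Check(lostTablet, now))
	fmt.Println("other staff:", d.Check(otherStaff, now))
}
```

Only the revoked token ID is rejected, so every other session carries on uninterrupted.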

VAPT Cleared — Web Application & Network Layer

Vulnerability Assessment and Penetration Testing (VAPT) reports are on file for both the web application layer and the network infrastructure; the platform has been independently tested and cleared. Passwords are stored as bcrypt hashes. All integration credentials (aggregator API keys, payment gateway secrets) are stored encrypted. The gateway is rate limited at 500K requests/minute, and Cloudflare DDoS protection and WAF sit in front of all traffic.

GST-Native — Not an Afterthought

GST compliance is built into the order and invoice lifecycle — not bolted on as an export. HSN-code mapping, GSTIN validation, tax calculation at the item level, and return-ready data exports are handled by the platform at the point of transaction. Audit readiness across all 200 outlets at any moment.

09 // Observability

See Everything. Know Before It Breaks.

In plain terms

The system monitors itself. If an outlet is having issues receiving orders, an alert fires before the outlet manager notices and calls support. If the database is getting slow, the on-call engineer is notified before any customer is affected. This is how a platform maintains 99.9% uptime — not by hoping nothing goes wrong, but by detecting problems in the first few seconds and having a human respond.

Every service emits three signals simultaneously:

Logs

Structured JSON logs from every service, shipped via Promtail to Loki. Searchable by outlet, order ID, user, error type, or any field — across the entire fleet of services in one query.

Metrics

Prometheus metrics are scraped into a persistent TSDB. Grafana dashboards cover order throughput by channel, service-level latency, database connection pool utilisation, Redis Sentinel health, and queue depth. Historical data is retained for trend analysis and capacity planning.

Traces

OpenTelemetry traces collected by Tempo. A single order request can be traced from the Swiggy webhook arrival through Integration → Order → Inventory → Reporting in one view. When something is slow, the trace shows exactly which service and which database call caused it.

Alertmanager routes on-call pages based on configured thresholds: error rate spikes, latency degradation, outlet connectivity failures, and Sentinel failover events. The team knows before the business does.

10 // Outcomes

The Numbers

| Metric | Result | What It Means |
| --- | --- | --- |
| Uptime SLA | 99.9% | Less than 9 hours of downtime per year across all 200 outlets |
| Daily order volume | 100K–200K | Handled without performance degradation at peak hours |
| Channels unified | 6+ | Swiggy, Zomato, Vendekin, Delight, Urban Piper, in-store, kiosk: one queue |
| Migration timeline | 1 Day | 200 outlets simultaneously. Industry standard would be 3–6 months. |
| Infrastructure model | 6-Node ARM64 | Across 2 datacenters. 30–40% cheaper than equivalent x86 managed cloud. |
| Deploy cycle | < 2 min | Canary rolling update with automatic health-check gating. No downtime. |
| Inventory reconciliation | Real-time | Stock deducted on order confirm. Aggregators updated on stockout. Manual counting eliminated. |
| Report load time | < 1 sec | Pre-aggregated materialized views. Any outlet, any date range, instant. |
| Security posture | VAPT Cleared | Web application and network layer independently tested and cleared. |

11 // Technology Stack

Go · PostgreSQL · TimescaleDB · Redis · Redis Sentinel · Asynq · gRPC · GraphQL · SSE · Docker Swarm · Patroni · etcd · Tailscale VPN · Hetzner cax31 ARM64 · Cloudflare · MinIO · Prometheus · Grafana · Loki · Promtail · Tempo · OpenTelemetry · Alertmanager · Swiggy API · Zomato API · Vendekin · Delight · Urban Piper · Paytm EDC · Razorpay · Shadowfax · MSG91 · Firebase · JWT + JTI · bcrypt · RBAC

What's Next

This for Your Operation.

We start with a free audit of your current stack — POS, aggregator integrations, reporting, and infrastructure. We show you exactly what's leaking revenue, what's at risk, and what a proper platform looks like for your scale.
