Defer the Database, Not the Design
The modern scaling landscape is overwhelming. Queues, event buses, microservices, serverless, Kubernetes. The list of "essential" technologies grows longer every year. It's very easy to get lost in the choices, and even easier to feel pressure to adopt whatever patterns are currently deemed "industry standard". Everyone else seems to be doing it, so it must be right.
Right?
The Stack Overflow Anomaly
Consider Stack Overflow. For over a decade, they debugged the world's software problems using a handful of servers and a surprisingly simple architecture. No microservices. No event sourcing. No exotic databases. Just a well-designed system built on proven, boring technology that scaled to tens of millions of users.
How did they achieve this? And more importantly, why does their approach feel like such an anomaly in today's architecture discussions?
The Danger Zone
The answer lies in understanding a critical but often overlooked danger zone: the space between over-optimisation and under-optimisation.
Over-optimise too early, and you'll drown in premature complexity, vendor lock-in, and crushing cognitive overhead. Your team will spend more time wrestling with distributed systems than building features.
Under-optimise, and you'll build fragile and ad-hoc architectures that buckle under the first real load, forcing expensive rewrites that could have been avoided.
There's a third trap worth naming: "career-driven development". We've all seen it ... the choice to adopt a hot new technology not because it solves a real problem, but because it looks good on LinkedIn. These decisions are costly, and they're more common than we'd like to admit.
A Different Path
This article offers a different path, built on one principle: defer the database, not the design. Model the problem first. Understand how it will be accessed. Then choose storage later, once the constraints are clear and the costs are justified.
Data Modelling: The Foundation
Data modelling is often treated as an "also-ran" in system design; something rushed through in favour of picking databases and frameworks. Yet it's arguably the most important architectural decision you'll make. Get the domain model right, and everything else becomes easier. Get it wrong, and you'll spend years fighting technical debt.
Storage-Agnostic Design
The key principle is deceptively simple: model the business problem first.
- Create types that represent meaningful domain concepts
- Think in terms of entities, relationships, and operations
- Don't think about storage yet
That decision comes later, once you understand how these models will actually be used.
Keep It Simple
This approach draws from Domain-Driven Design, but in simplified form:
- Clear types: Models should clearly refer to meaningful types within your system
- Extendable but not complex: They should grow without becoming over-engineered
- Business-aligned: Focus on clarity and business alignment, not database schemas
Consider a simple e-commerce system. Here's what storage-agnostic domain modelling looks like:
class Product:
    id: ProductId
    name: str
    price: Money
    inventory_count: int

class Order:
    id: OrderId
    customer: Customer
    items: List[OrderItem]
    total: Money
    status: OrderStatus  # PENDING, CONFIRMED, SHIPPED, DELIVERED

class Customer:
    id: CustomerId
    email: str
    shipping_addresses: List[Address]

class OrderItem:
    product: Product
    quantity: int
    price_at_purchase: Money  # Capture historical pricing
Notice what's missing: no database fields, no save() methods, no ORM annotations. Just clean domain concepts that model the business problem. These models don't know or care whether they'll live in PostgreSQL, MongoDB, or flat files. That's intentional.
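To make that concrete, here is a runnable sketch of the same idea, simplified for illustration: Money is reduced to integer minor units and OrderItem carries a bare product_id. Domain behaviour such as totalling an order lives on the model itself, with no persistence anywhere in sight:

```python
from dataclasses import dataclass
from typing import List

# Money simplified to integer minor units (pence/cents) for illustration.
Money = int

@dataclass(frozen=True)
class OrderItem:
    product_id: str
    quantity: int
    price_at_purchase: Money  # capture historical pricing

@dataclass(frozen=True)
class Order:
    id: str
    items: List[OrderItem]

    @property
    def total(self) -> Money:
        # Pure domain logic: no save(), no session, no connection string.
        return sum(i.quantity * i.price_at_purchase for i in self.items)

order = Order(id="ord-1", items=[
    OrderItem(product_id="p-1", quantity=2, price_at_purchase=499),
    OrderItem(product_id="p-2", quantity=1, price_at_purchase=1250),
])
```

The frozen dataclasses keep the models immutable, which pays off later when providers pass them across storage boundaries.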
Storage decisions come later, after we understand touchpoints.
Data Touchpoints: Understanding Access Patterns
A touchpoint is how your system actually uses the data. Not "what data exists" but rather "how is it accessed". This includes:
- Queries: What data do you read?
- Updates: What data do you write?
- Frequency: How often?
- Latency requirements: How fast must it be?
- Volume: How much data?
Understanding touchpoints is the bridge between domain modelling and technology choices.
Why Touchpoints Matter
Touchpoints reveal your system's actual constraints. Different touchpoints have different optimal technologies, and understanding these patterns prevents premature tech choices.
You're not picking technologies based on marketing materials or blog posts; you're picking them based on measured, understood access patterns.
Common Touchpoint Patterns
Transactional touchpoints involve complex updates with low latency and ACID requirements. Consider a LegalCase in a case management system: the access pattern is individual records with frequent small modifications. This implies a need for strong consistency and transaction support.
Analytical or reporting touchpoints are characterised by time-range queries, aggregations, and read-heavy workloads. An AuditReport querying AuditLogItem entries is typical. Scan many records, filter by date or other criteria. These touchpoints are optimised for reads and often benefit from denormalized data structures.
Graph or relational touchpoints involve traversal operations and relationship-heavy queries. A GraphQuery exploring interconnected data follows edges and performs multi-hop queries. These can be satisfied either by specialized graph databases or by clever relational modelling with recursive queries and smart indexing.
Beyond these three, other common patterns include high-throughput writes (constant data ingestion), time-series data (temporal queries and trends), and full-text search (text matching and ranking). Each has distinct access characteristics that inform technology choices.
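One lightweight way to put these patterns to work is to record each touchpoint as plain data during design, before any technology is chosen. A hypothetical sketch — the entities echo the examples above, but the figures are invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    TRANSACTIONAL = "transactional"
    ANALYTICAL = "analytical"
    GRAPH = "graph"
    SEARCH = "search"

@dataclass(frozen=True)
class Touchpoint:
    entity: str
    kind: Kind
    reads_per_sec: int
    writes_per_sec: int
    p99_latency_ms: int  # the latency budget this touchpoint must meet

# Illustrative inventory: one hot transactional record type,
# one write-heavy analytical log.
touchpoints = [
    Touchpoint("LegalCase", Kind.TRANSACTIONAL, 50, 20, 50),
    Touchpoint("AuditLogItem", Kind.ANALYTICAL, 5, 500, 2000),
]

# Questions like "which touchpoints need strong consistency?" become filters.
transactional = [t for t in touchpoints if t.kind is Kind.TRANSACTIONAL]
```

A table like this, kept honest with real measurements, is what you take into the technology discussion instead of vendor feature lists.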
Revisiting Our E-Commerce Example
Let's examine our earlier domain model through the lens of touchpoints:
Product Catalogue:
- Transactional touchpoint: Frequent updates to inventory_count, price changes
- Search touchpoint: Customers browsing by category, filtering by price range
- A single domain model, two very different access patterns
Order Processing:
- Purely transactional: ACID properties required
- Order creation and inventory deduction must be atomic
- Cannot tolerate race conditions during checkout
Customer Analytics:
- Analytical touchpoint: Reporting on order history, revenue by segment
- The same Order model that required strict transactional guarantees now serves aggregate queries across thousands of records
The key observation: One domain model (Order) can have multiple touchpoint types depending on how it's used. This is why premature technology choices are dangerous. You don't yet know all the ways your data will be accessed.
The Key Insight
Touchpoints determine constraints, not technologies.
- You can satisfy most touchpoints with multiple tech choices
- The right choice depends on scale, team expertise, and cost
- A single PostgreSQL instance can handle transactional, analytical, and even graph workloads ... until it can't
The data layer ensures that "until" doesn't become a crisis.
The Data Layer: Your Scaling Insurance
If you're unfamiliar with the concept of a data access layer, I've written about it in depth in How to Build a Data Access Layer.
The Provider Pattern (In Brief)
The core principles:
- Immutable domain models: No ActiveRecord patterns, no embedded queries
- Provider pattern: Well-defined interface accepting and returning domain types
- Dependency injection: Application code doesn't know about storage implementation
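A minimal sketch of the three principles together; OrderProvider and ConfirmOrderHandler are illustrative names for this article, not a prescribed API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

# 1. Immutable domain model: no embedded queries, no save().
@dataclass(frozen=True)
class Order:
    id: str
    status: str

# 2. Provider pattern: the interface accepts and returns domain types only.
class OrderProvider(ABC):
    @abstractmethod
    def get(self, order_id: str) -> Optional[Order]: ...

    @abstractmethod
    def put(self, order: Order) -> None: ...

# 3. Dependency injection: the handler sees the interface, never the storage.
class ConfirmOrderHandler:
    def __init__(self, orders: OrderProvider):
        self._orders = orders

    def handle(self, order_id: str) -> Order:
        order = self._orders.get(order_id)
        # Immutability means state changes produce a new value, not a mutation.
        confirmed = Order(id=order.id, status="CONFIRMED")
        self._orders.put(confirmed)
        return confirmed
```

Any concrete provider — SQL-backed, in-memory for tests, or something exotic later — plugs in without the handler changing.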
Why This Matters for Scaling
The data layer is your abstraction boundary:
- Application code depends on the interface, not the implementation
- You can swap storage technologies without touching application logic
- When that analytical touchpoint outgrows PostgreSQL, you can introduce an analytics database behind the data layer whilst the application remains blissfully unaware
This approach is sometimes formalised in a more theoretical way as "hexagonal architecture" or "ports and adapters". The underlying principle is the same, arrived at here from practical experience rather than architectural prescription.
Your Insurance Policy
Think of the data layer as insurance:
- You're not optimising for scale today
- You're creating optionality for tomorrow
- It's your insurance against lock-in; technological, architectural, and strategic
The Scaling Pathway
The core philosophy is simple: defer decisions, not preparation.
- Don't pick technologies for hypothetical scale
- Do create the structure that allows future evolution
- This distinction is everything
A Cautionary (and Expensive) Tale
In a previous company, we built a sophisticated record processing system on top of an ORM, without a well-defined data layer; a very common pattern. The data model consisted of "magic objects": ORM-specific query and filter clauses that leaked throughout the application, binding us tightly to a single-instance database.
At the time, I argued for retrofitting a proper data layer. As is common in business, there was always something more important to do, and the decision was endlessly deferred.
The product was also single-tenanted.
What Happened?
It grew. We added clients. Because the data model was inseparable from the ORM and its single-instance assumptions, migrating to a multi-tenanted storage solution simply was not viable. Instead, we duplicated the entire infrastructure stack for each new client.
We had extremely talented SREs who worked magic with Kubernetes, shuffling containers between hosts to manage the load. But we were running roughly ten containers per client: web server, workers, database primary, standby, read replica, Elasticsearch cluster. Multiplied across hundreds of clients.
The operational and development cost of maintaining this conservatively ran to millions of dollars more than a multi-tenanted architecture would have required.
Every year.
Had a clean data layer been in place from the start, the exit route would have existed. Two migration steps, each self-contained:
Step 1: Migrate storage to shared infrastructure. Each client's application still runs separately, but the database layer is consolidated. Ten containers per client drops to one or two. The majority of the operational cost is gone before a single line of application code is touched.
Step 2: Migrate the application to multi-tenanted. With the infrastructure pressure gone, consolidating the application itself becomes a focused project rather than an emergency rewrite.
The Lesson
The data layer was not a missing abstraction nicety. It was the only thing that would have kept the exit route open. Without it, every new client made the problem slightly worse and the fix slightly less affordable. By the time the cost was undeniable, it had become more expensive to fix than to absorb.
Deferred decisions are not free. They accrue interest, quietly, until the bill arrives.
So where do you start?
Start Simple
Default to boring, proven technology:
- SQLite for side projects and MVPs
- Single PostgreSQL instance for most startups
- Sharded SQL with customer affinity for B2B SaaS products
"Boring technology" is a feature, not a bug.
Why Simple Wins Early
- Lower cognitive overhead whilst iterating on the product
- Well-understood failure modes and debugging
- Easy to hire for with abundant documentation
- Optimisation is cheap when you're small
Adding an index or a read replica is a day's work, not a quarter-long migration project.
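As an illustration of how contained that day's work is, here is the entire "add an index" change sketched against SQLite from the standard library; the table and index names are assumptions for the sketch:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (id TEXT, name TEXT, country_code TEXT)")

# The slow query surfaces in the provider; the fix is one statement.
db.execute("CREATE INDEX idx_product_country ON product (country_code)")

# The planner can now satisfy the country filter via the index.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM product WHERE country_code = ?",
    ["GB"],
).fetchall()
```

Compare that with re-platforming onto a distributed store: same symptom, wildly different blast radius.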
The Evolution
The interface and application code are fixed from day one:
class ProductProvider(ABC):
    @abstractmethod
    def search(self, query: str, country_code: str) -> List[Product]:
        pass

class ProductSearchHandler:
    def __init__(self, products: ProductProvider):
        self._products = products

    def handle(self, query: str, country_code: str) -> List[Product]:
        return self._products.search(query, country_code)
Everything that follows is what lives behind that interface.
Stage 1 · boring monolith

┌─────┐     ┌──────────────┐     ┌────────────┐
│ App │────▶│ SQL provider │────▶│ PostgreSQL │
└─────┘     └──────────────┘     └────────────┘

Stage 2 · augmented

┌─────┐     ┌──────────────────┐     ┌───────────────┐
│ App │────▶│ Elastic provider │────▶│ Elasticsearch │
└─────┘     └──────────────────┘     └───────────────┘

Stage 3 · federated

┌─────┐     ┌───────────────────┐   US ──▶ Elasticsearch
│ App │────▶│ Regional provider │   EU ──▶ Elasticsearch
└─────┘     └───────────────────┘   GB ──▶ PostgreSQL
In practice, the interface will evolve as requirements grow; new methods, refined signatures. But that evolution is bounded: changes happen in one place, each provider can be tested in isolation, and the application code is never entangled with storage decisions.
Stage 1: The boring monolith. One database. Handles the vast majority of real-world products.
class SQLProductProvider(ProductProvider):
    def search(self, query: str, country_code: str) -> List[Product]:
        return self._db.query(
            "SELECT * FROM product WHERE name ILIKE %s AND country_code = %s;",
            [f"%{query}%", country_code],
        )

products = SQLProductProvider(db)
app = MyApplication(ProductSearchHandler(products))
Stage 2: Augmented. A search touchpoint outgrows PostgreSQL and requires specialised features. Elasticsearch is introduced behind the data layer. One new class; one line changes at startup. The application is unchanged:
class ElasticProductProvider(ProductProvider):
    def search(self, query: str, country_code: str) -> List[Product]:
        hits = self._es.search(index="products", query={"match": {"name": query}})
        return [self._to_product(hit) for hit in hits["hits"]["hits"]]

products = ElasticProductProvider(es)
app = MyApplication(ProductSearchHandler(products))
Stage 3: Federated at scale. Regional sharding, dedicated client infrastructure, a routing layer that directs queries to the right region or dedicated client backend. Some regions run shared PostgreSQL and Elasticsearch clusters; others have dedicated infrastructure for specific clients. A global PostgreSQL instance holds customer metadata and routing configuration. The application still calls handle() unchanged:
class RegionalProductProvider(ProductProvider):
    def __init__(
        self,
        providers: Dict[str, ProductProvider],
        default: ProductProvider,
    ):
        self._providers = providers
        self._default = default

    def search(self, query: str, country_code: str) -> List[Product]:
        provider = self._providers.get(country_code, self._default)
        return provider.search(query, country_code)

products = RegionalProductProvider(
    providers={
        "US": ElasticProductProvider(us_es),
        "EU": ElasticProductProvider(eu_es),
        "GB": SQLProductProvider(gb_db),  # dedicated client, still on SQL
    },
    default=ElasticProductProvider(global_es),
)
app = MyApplication(ProductSearchHandler(products))
When to Evolve
Watch for clear signals, not hypotheticals:
- Latency violations: Queries and operations consistently missing SLAs
- Cost curves: Database costs growing faster than revenue
- Query complexity: Application code contorting itself to work around storage limits
- Operational pain: Backups, replication, or scaling operations becoming frequent fire drills
The Evolution Process
When these signals appear, follow this process:
- Identify the bottleneck touchpoint: Which access pattern is breaking? Be specific.
- Evaluate alternatives: What technologies solve this constraint? Not "what's popular", but "what addresses this specific, measured problem".
- Implement behind the data layer: Swap storage without touching application code.
- Migrate incrementally: Run old and new systems in parallel. Achieve very low or even zero downtime.
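The parallel run can itself be expressed as just another provider. A hypothetical sketch, with Product simplified to a plain string: reads prefer the new store and fall back to the old one, so cutover (and rollback) is one line at startup rather than an application change:

```python
from abc import ABC, abstractmethod
from typing import List

class ProductProvider(ABC):
    """Same shape as the interface earlier; Product simplified to str."""
    @abstractmethod
    def search(self, query: str, country_code: str) -> List[str]: ...

class MigrationProductProvider(ProductProvider):
    """Parallel-run wrapper for incremental migration.

    Reads prefer the new store and fall back to the old one; dual-writes
    (not shown) keep the two stores converging during the window.
    """
    def __init__(self, new: ProductProvider, old: ProductProvider):
        self._new = new
        self._old = old

    def search(self, query: str, country_code: str) -> List[str]:
        results = self._new.search(query, country_code)
        # Empty result: the new store hasn't been backfilled for this
        # query yet, so serve from the legacy store instead.
        return results if results else self._old.search(query, country_code)
```

Because the wrapper honours the same interface, the application cannot tell whether it is mid-migration, which is precisely what keeps downtime near zero.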
Why This Pathway Wins
- Pay costs only when benefits are clear
- Not locked into early decisions
- Proven by experience: Stack Overflow, GitHub, and Stripe all scaled this way; incrementally, deliberately, and without Hail Mary rewrites
Summary
Model the domain first. Understand the access patterns. Then choose storage, once the constraints are real and the costs are justified.
The data layer is what keeps the exit routes open. Microservices, event buses, and specialised databases all have their place; adopting them early trades velocity for theoretical scale. A clean data layer means the architecture can evolve precisely, one touchpoint at a time, without a rewrite and without the pressure of doing it as an emergency.