Defer the Database, Not the Design
The modern scaling landscape is overwhelming. Queues, event buses, microservices, serverless, Kubernetes. The list of "essential" technologies grows longer every year. It's very easy to get lost in the choices, and even easier to feel pressure to adopt whatever patterns are currently deemed "industry standard". Everyone else seems to be doing it, so it must be right.
Right?
The Stack Overflow Anomaly
Consider Stack Overflow. For over a decade, they debugged the world's software problems using a handful of servers and a surprisingly simple architecture. No microservices. No event sourcing. No exotic databases. Just a well-designed system built on proven, boring technology that scaled to tens of millions of users.
How did they achieve this? And more importantly, why does their approach feel like such an anomaly in today's architecture discussions?
The Danger Zone
The answer lies in understanding a critical but often overlooked danger zone: the space between over-optimisation and under-optimisation.
Over-optimise too early, and you'll drown in premature complexity, vendor lock-in, and crushing cognitive overhead. Your team will spend more time wrestling with distributed systems than building features.
Under-optimise, and you'll build fragile and ad-hoc architectures that buckle under the first real load, forcing expensive rewrites that could have been avoided.
There's a third trap worth naming: "career-driven development". We've all seen it ... the choice to adopt a hot new technology not because it solves a real problem, but because it looks good on LinkedIn. These decisions are costly, and they're more common than we'd like to admit.
A Different Path
This article offers a different path, built on one principle: defer the database, not the design. Model the problem first. Understand how it will be accessed. Then choose storage later, once the constraints are clear and the costs are justified.
Data Modelling: The Foundation
Data modelling is often treated as an "also-ran" in system design; something rushed through in favour of picking databases and frameworks. Yet it's arguably the most important architectural decision you'll make. Get the domain model right, and everything else becomes easier. Get it wrong, and you'll spend years fighting technical debt.
Storage-Agnostic Design
The key principle is deceptively simple: model the business problem first.
- Create types that represent meaningful domain concepts
- Think in terms of entities, relationships, and operations
- Don't think about storage yet
That decision comes later, once you understand how these models will actually be used.
Keep It Simple
This approach draws from Domain-Driven Design, but in simplified form:
- Clear types: Models should clearly refer to meaningful types within your system
- Extendable but not complex: They should grow without becoming over-engineered
- Business-aligned: Focus on clarity and business alignment, not database schemas
Consider a simple e-commerce system. Here's what storage-agnostic domain modelling looks like:
class Product:
    id: ProductId
    name: str
    price: Money
    inventory_count: int

class Order:
    id: OrderId
    customer: Customer
    items: List[OrderItem]
    total: Money
    status: OrderStatus  # PENDING, CONFIRMED, SHIPPED, DELIVERED

class Customer:
    id: CustomerId
    email: str
    shipping_addresses: List[Address]

class OrderItem:
    product: Product
    quantity: int
    price_at_purchase: Money  # Capture historical pricing
Notice what's missing: no database fields, no save() methods, no ORM annotations. Just clean domain concepts that model the business problem. These models don't know or care whether they'll live in PostgreSQL, MongoDB, or flat files. That's intentional.
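To make that concrete, here is a runnable sketch of the same idea, simplified for illustration: Money is reduced to integer minor units and OrderItem carries a bare product_id. Domain behaviour such as totalling an order lives on the model itself, with no persistence anywhere in sight:

```python
from dataclasses import dataclass
from typing import List

# Money simplified to integer minor units (pence/cents) for illustration.
Money = int

@dataclass(frozen=True)
class OrderItem:
    product_id: str
    quantity: int
    price_at_purchase: Money  # capture historical pricing

@dataclass(frozen=True)
class Order:
    id: str
    items: List[OrderItem]

    @property
    def total(self) -> Money:
        # Pure domain logic: no save(), no session, no connection string.
        return sum(i.quantity * i.price_at_purchase for i in self.items)

order = Order(id="ord-1", items=[
    OrderItem(product_id="p-1", quantity=2, price_at_purchase=499),
    OrderItem(product_id="p-2", quantity=1, price_at_purchase=1250),
])
```

The frozen dataclasses keep the models immutable, which pays off later when providers pass them across storage boundaries.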
Storage decisions come later, after we understand touchpoints.
Data Touchpoints: Understanding Access Patterns
A touchpoint is how your system actually uses the data. Not "what data exists" but rather "how is it accessed". This includes:
- Queries: What data do you read?
- Updates: What data do you write?
- Frequency: How often?
- Latency requirements: How fast must it be?
- Volume: How much data?
Understanding touchpoints is the bridge between domain modelling and technology choices.
Why Touchpoints Matter
Touchpoints reveal your system's actual constraints. Different touchpoints have different optimal technologies, and understanding these patterns prevents premature tech choices.
You're not picking technologies based on marketing materials or blog posts; you're picking them based on measured, understood access patterns.
Common Touchpoint Patterns
Transactional touchpoints involve complex updates with low latency and ACID requirements. Consider a LegalCase in a case management system: the access pattern is individual records with frequent small modifications. This implies a need for strong consistency and transaction support.
Analytical or reporting touchpoints are characterised by time-range queries, aggregations, and read-heavy workloads. An AuditReport querying AuditLogItem entries is typical. Scan many records, filter by date or other criteria. These touchpoints are optimised for reads and often benefit from denormalized data structures.
Graph or relational touchpoints involve traversal operations and relationship-heavy queries. A GraphQuery exploring interconnected data follows edges and performs multi-hop queries. These can be satisfied either by specialized graph databases or by clever relational modelling with recursive queries and smart indexing.
Beyond these three, other common patterns include high-throughput writes (constant data ingestion), time-series data (temporal queries and trends), and full-text search (text matching and ranking). Each has distinct access characteristics that inform technology choices.
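One lightweight way to put these patterns to work is to record each touchpoint as plain data during design, before any technology is chosen. A hypothetical sketch — the entities echo the examples above, but the figures are invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    TRANSACTIONAL = "transactional"
    ANALYTICAL = "analytical"
    GRAPH = "graph"
    SEARCH = "search"

@dataclass(frozen=True)
class Touchpoint:
    entity: str
    kind: Kind
    reads_per_sec: int
    writes_per_sec: int
    p99_latency_ms: int  # the latency budget this touchpoint must meet

# Illustrative inventory: one hot transactional record type,
# one write-heavy analytical log.
touchpoints = [
    Touchpoint("LegalCase", Kind.TRANSACTIONAL, 50, 20, 50),
    Touchpoint("AuditLogItem", Kind.ANALYTICAL, 5, 500, 2000),
]

# Questions like "which touchpoints need strong consistency?" become filters.
transactional = [t for t in touchpoints if t.kind is Kind.TRANSACTIONAL]
```

A table like this, kept honest with real measurements, is what you take into the technology discussion instead of vendor feature lists.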
Revisiting Our E-Commerce Example
Let's examine our earlier domain model through the lens of touchpoints:
Product Catalogue:
- Transactional touchpoint: Frequent updates to inventory_count, price changes
- Search touchpoint: Customers browsing by category, filtering by price range
- A single domain model, two very different access patterns
Order Processing:
- Purely transactional: ACID properties required
- Order creation and inventory deduction must be atomic
- Cannot tolerate race conditions during checkout
Customer Analytics:
- Analytical touchpoint: Reporting on order history, revenue by segment
- The same Order model that required strict transactional guarantees now serves aggregate queries across thousands of records
The key observation: One domain model (Order) can have multiple touchpoint types depending on how it's used. This is why premature technology choices are dangerous. You don't yet know all the ways your data will be accessed.
The Key Insight
Touchpoints determine constraints, not technologies.
- You can satisfy most touchpoints with multiple tech choices
- The right choice depends on scale, team expertise, and cost
- A single PostgreSQL instance can handle transactional, analytical, and even graph workloads ... until it can't
The data layer ensures that "until" doesn't become a crisis.
The Data Layer: Your Scaling Insurance
If you're unfamiliar with the concept of a data access layer, I've written about it in depth in How to Build a Data Access Layer.
The Provider Pattern (In Brief)
The core principles:
- Immutable domain models: No ActiveRecord patterns, no embedded queries
- Provider pattern: Well-defined interface accepting and returning domain types
- Dependency injection: Application code doesn't know about storage implementation
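A minimal sketch of the three principles together; OrderProvider and ConfirmOrderHandler are illustrative names for this article, not a prescribed API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

# 1. Immutable domain model: no embedded queries, no save().
@dataclass(frozen=True)
class Order:
    id: str
    status: str

# 2. Provider pattern: the interface accepts and returns domain types only.
class OrderProvider(ABC):
    @abstractmethod
    def get(self, order_id: str) -> Optional[Order]: ...

    @abstractmethod
    def put(self, order: Order) -> None: ...

# 3. Dependency injection: the handler sees the interface, never the storage.
class ConfirmOrderHandler:
    def __init__(self, orders: OrderProvider):
        self._orders = orders

    def handle(self, order_id: str) -> Order:
        order = self._orders.get(order_id)
        # Immutability means state changes produce a new value, not a mutation.
        confirmed = Order(id=order.id, status="CONFIRMED")
        self._orders.put(confirmed)
        return confirmed
```

Any concrete provider — SQL-backed, in-memory for tests, or something exotic later — plugs in without the handler changing.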
Why This Matters for Scaling
The data layer is your abstraction boundary:
- Application code depends on the interface, not the implementation
- You can swap storage technologies without touching application logic
- When that analytical touchpoint outgrows PostgreSQL, you can introduce an analytics database behind the data layer whilst the application remains blissfully unaware
This approach is sometimes formalised in a more theoretical way as "hexagonal architecture" or "ports and adapters". The underlying principle is the same, arrived at here from practical experience rather than architectural prescription.
Your Insurance Policy
Think of the data layer as insurance:
- You're not optimising for scale today
- You're creating optionality for tomorrow
- It's your insurance against lock-in; technological, architectural, and strategic
The Scaling Pathway
The core philosophy is simple: defer decisions, not preparation.
- Don't pick technologies for hypothetical scale
- Do create the structure that allows future evolution
- This distinction is everything
A Cautionary (and Expensive) Tale
In a previous company, we built a sophisticated record processing system on top of an ORM, without a well-defined data layer; a very common pattern. The data model consisted of "magic objects": ORM-specific query and filter clauses that leaked throughout the application, binding us tightly to a single-instance database.
At the time, I argued for retrofitting a proper data layer. As is common in business, there was always something more important to do, and the decision was endlessly deferred.
The product was also single-tenanted.
What Happened?
It grew. We added clients. Because the data model was inseparable from the ORM and its single-instance assumptions, migrating to a multi-tenanted storage solution simply was not viable. Instead, we duplicated the entire infrastructure stack for each new client.
We had extremely talented SREs who worked magic with Kubernetes, shuffling containers between hosts to manage the load. But we were running roughly ten containers per client: web server, workers, database primary, standby, read replica, Elasticsearch cluster. Multiplied across hundreds of clients.
The operational and development cost of maintaining this conservatively ran to millions of dollars more than a multi-tenanted architecture would have required.
Every year.
Had a clean data layer been in place from the start, the exit route would have existed. Two migration steps, each self-contained:
Step 1: Migrate storage to shared infrastructure. Each client's application still runs separately, but the database layer is consolidated. Ten containers per client drops to one or two. The majority of the operational cost is gone before a single line of application code is touched.
Step 2: Migrate the application to multi-tenanted. With the infrastructure pressure gone, consolidating the application itself becomes a focused project rather than an emergency rewrite.
The Lesson
The data layer was not a missing abstraction nicety. It was the only thing that would have kept the exit route open. Without it, every new client made the problem slightly worse and the fix slightly less affordable. By the time the cost was undeniable, it had become more expensive to fix than to absorb.
Deferred decisions are not free. They accrue interest, quietly, until the bill arrives.
So where do you start?
Start Simple
Default to boring, proven technology:
- SQLite for side projects and MVPs
- Single PostgreSQL instance for most startups
- Sharded SQL with customer affinity for B2B SaaS products
"Boring technology" is a feature, not a bug.
Why Simple Wins Early
- Lower cognitive overhead whilst iterating on the product
- Well-understood failure modes and debugging
- Easy to hire for with abundant documentation
- Optimisation is cheap when you're small
Adding an index or a read replica is a day's work, not a quarter-long migration project.
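As an illustration of how contained that day's work is, here is the entire "add an index" change sketched against SQLite from the standard library; the table and index names are assumptions for the sketch:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (id TEXT, name TEXT, country_code TEXT)")

# The slow query surfaces in the provider; the fix is one statement.
db.execute("CREATE INDEX idx_product_country ON product (country_code)")

# The planner can now satisfy the country filter via the index.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM product WHERE country_code = ?",
    ["GB"],
).fetchall()
```

Compare that with re-platforming onto a distributed store: same symptom, wildly different blast radius.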
The Evolution
The interface and application code are fixed from day one:
class ProductProvider(ABC):
    @abstractmethod
    def search(self, query: str, country_code: str) -> List[Product]:
        pass

class ProductSearchHandler:
    def __init__(self, products: ProductProvider):
        self._products = products

    def handle(self, query: str, country_code: str) -> List[Product]:
        return self._products.search(query, country_code)
Everything that follows is what lives behind that interface.
Stage 1 · boring monolith

┌─────┐     ┌──────────────┐     ┌────────────┐
│ App │────▶│ SQL provider │────▶│ PostgreSQL │
└─────┘     └──────────────┘     └────────────┘

Stage 2 · augmented

┌─────┐     ┌──────────────────┐     ┌───────────────┐
│ App │────▶│ Elastic provider │────▶│ Elasticsearch │
└─────┘     └──────────────────┘     └───────────────┘

Stage 3 · federated

┌─────┐     ┌───────────────────┐   US ──▶ Elasticsearch
│ App │────▶│ Regional provider │   EU ──▶ Elasticsearch
└─────┘     └───────────────────┘   GB ──▶ PostgreSQL
In practice, the interface will evolve as requirements grow; new methods, refined signatures. But that evolution is bounded: changes happen in one place, each provider can be tested in isolation, and the application code is never entangled with storage decisions.
Stage 1: The boring monolith. One database. Handles the vast majority of real-world products.
class SQLProductProvider(ProductProvider):
    def search(self, query: str, country_code: str) -> List[Product]:
        return self._db.query(
            "SELECT * FROM product WHERE name ILIKE %s AND country_code = %s;",
            [f"%{query}%", country_code],
        )

products = SQLProductProvider(db)
app = MyApplication(ProductSearchHandler(products))
Stage 2: Augmented. A search touchpoint outgrows PostgreSQL and requires specialised features. Elasticsearch is introduced behind the data layer. One new class; one line changes at startup. The application is unchanged:
class ElasticProductProvider(ProductProvider):
    def search(self, query: str, country_code: str) -> List[Product]:
        hits = self._es.search(index="products", query={"match": {"name": query}})
        return [self._to_product(hit) for hit in hits["hits"]["hits"]]

products = ElasticProductProvider(es)
app = MyApplication(ProductSearchHandler(products))
Stage 3: Federated at scale. Regional sharding, dedicated client infrastructure, a routing layer that directs queries to the right region or dedicated client backend. Some regions run shared PostgreSQL and Elasticsearch clusters; others have dedicated infrastructure for specific clients. A global PostgreSQL instance holds customer metadata and routing configuration. The application still calls handle() unchanged:
class RegionalProductProvider(ProductProvider):
    def __init__(
        self,
        providers: Dict[str, ProductProvider],
        default: ProductProvider,
    ):
        self._providers = providers
        self._default = default

    def search(self, query: str, country_code: str) -> List[Product]:
        provider = self._providers.get(country_code, self._default)
        return provider.search(query, country_code)

products = RegionalProductProvider(
    providers={
        "US": ElasticProductProvider(us_es),
        "EU": ElasticProductProvider(eu_es),
        "GB": SQLProductProvider(gb_db),  # dedicated client, still on SQL
    },
    default=ElasticProductProvider(global_es),
)
app = MyApplication(ProductSearchHandler(products))
When to Evolve
Watch for clear signals, not hypotheticals:
- Latency violations: Queries and operations consistently missing SLAs
- Cost curves: Database costs growing faster than revenue
- Query complexity: Application code contorting itself to work around storage limits
- Operational pain: Backups, replication, or scaling operations becoming frequent fire drills
The Evolution Process
When these signals appear, follow this process:
- Identify the bottleneck touchpoint: Which access pattern is breaking? Be specific.
- Evaluate alternatives: What technologies solve this constraint? Not "what's popular", but "what addresses this specific, measured problem".
- Implement behind the data layer: Swap storage without touching application code.
- Migrate incrementally: Run old and new systems in parallel. Achieve very low or even zero downtime.
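The parallel run can itself be expressed as just another provider. A hypothetical sketch, with Product simplified to a plain string: reads prefer the new store and fall back to the old one, so cutover (and rollback) is one line at startup rather than an application change:

```python
from abc import ABC, abstractmethod
from typing import List

class ProductProvider(ABC):
    """Same shape as the interface earlier; Product simplified to str."""
    @abstractmethod
    def search(self, query: str, country_code: str) -> List[str]: ...

class MigrationProductProvider(ProductProvider):
    """Parallel-run wrapper for incremental migration.

    Reads prefer the new store and fall back to the old one; dual-writes
    (not shown) keep the two stores converging during the window.
    """
    def __init__(self, new: ProductProvider, old: ProductProvider):
        self._new = new
        self._old = old

    def search(self, query: str, country_code: str) -> List[str]:
        results = self._new.search(query, country_code)
        # Empty result: the new store hasn't been backfilled for this
        # query yet, so serve from the legacy store instead.
        return results if results else self._old.search(query, country_code)
```

Because the wrapper honours the same interface, the application cannot tell whether it is mid-migration, which is precisely what keeps downtime near zero.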
Why This Pathway Wins
- Pay costs only when benefits are clear
- Not locked into early decisions
- Proven by experience: Stack Overflow, GitHub, and Stripe all scaled this way; incrementally, deliberately, and without Hail Mary rewrites
Summary
Model the domain first. Understand the access patterns. Then choose storage, once the constraints are real and the costs are justified.
The data layer is what keeps the exit routes open. Microservices, event buses, and specialised databases all have their place; adopting them early trades velocity for theoretical scale. A clean data layer means the architecture can evolve precisely, one touchpoint at a time, without a rewrite and without the pressure of doing it as an emergency.