A Short Primer on Scalable Data Storage Solutions for ML Systems

Posted Nov 25, 2025 Updated Mar 12, 2026

By Sayan Biswas

24 min read

Provide a consolidated view of popular data storage technologies relevant in ML based Ecosystems.

Here is a consolidated view of the most popular data storage technologies used in modern system design. It’s grouped by their primary architectural category to make the comparison easier.

1. Relational Databases (RDBMS) - The General Purpose Workhorses

These are the default choice for structured data where data integrity is paramount.

Technology	PostgreSQL
Type	SQL (Relational)
Pattern	OLTP (Online Transaction Processing)
ACID	Yes (Strict ACID compliance)
Scalability	Vertical (Scale Up). Horizontal scaling is hard (requires sharding/read replicas), but tools like Citus exist.
CAP Theorem	CA (In practice, it prioritizes Consistency and Availability, usually single-master).
Primary Use Case	Core application DB, User profiles, Financial ledgers, Complex joins.
Latency	Low (ms).
Managed Service	Yes (AWS RDS, Google Cloud SQL, Azure Database).
Critical Info	The industry standard for general-purpose app development. Very extensible (PostGIS for geo, JSONB for NoSQL-like features).

2. Data Warehouses (OLAP) - The Analytics Engines

These are designed for scanning billions of rows to answer business questions. They are not for live application backends.

Technology	Google BigQuery	Amazon Redshift	Apache Hive
Type	SQL (Columnar)	SQL (Columnar)	SQL-like (HQL) on Hadoop
Pattern	OLAP (Online Analytical Processing)	OLAP	OLAP
ACID	Atomic updates supported, but not designed for transactional work.	Yes (Snapshot isolation).	ACID support is available (ACID v2) but limited compared to RDBMS.
Scalability	Horizontal (Massively Parallel Processing - MPP).	Horizontal (MPP).	Horizontal (Uses HDFS/Cloud Storage).
Use Case	Ad-hoc analysis, BI dashboards, ML training data generation.	Enterprise Data Warehouse, BI reporting.	Batch processing on huge datasets (Petabyte scale) over HDFS/S3.
Latency	Seconds to Minutes (High).	Seconds to Minutes (High).	Minutes to Hours (Very High).
Managed?	Serverless (Fully Managed).	Cluster-based (You manage nodes) or Serverless.	Often self-managed or via EMR/Dataproc.
Critical Info	Decouples compute and storage perfectly. You pay per query.	Tightly coupled compute/storage (traditionally), though RA3 nodes fix this.	Legacy tech. Being replaced by Spark SQL or Presto/Trino.

3. NoSQL: Key-Value & In-Memory - The Speed Demons

Used for caching, real-time features, and session management.

Technology	Redis	Aerospike
Type	NoSQL (Key-Value)	NoSQL (Key-Value)
Pattern	OLTP (Real-time)	OLTP (Real-time at Scale)
ACID	Atomic single-key ops. No multi-key transactions.	Strong Consistency available (ACID for single row).
Scalability	Horizontal (Redis Cluster) - AP or CP depending on config.	Horizontal (Linear scale). AP (Available/Partition Tolerant).
Use Case	Caching, Session Store, Leaderboards, Pub/Sub.	Real-time Feature Store (Fraud detection), Ad-Tech bidding.
Latency	Microseconds (In-memory).	Sub-millisecond (Hybrid RAM/SSD).
Managed?	Yes (AWS ElastiCache, Redis Enterprise).	Yes (Aerospike Cloud).
Critical Info	Single-threaded event loop. Data must usually fit in RAM.	Optimized for Flash/SSD. Can store TBs/PBs of data with RAM-like speed.

4. NoSQL: Wide-Column Stores - The Big Data Writers

Designed for massive write throughput and storing petabytes of data.

Technology	Apache Cassandra	Google Bigtable	Apache HBase
Type	NoSQL (Wide-Column)	NoSQL (Wide-Column)	NoSQL (Wide-Column)
Pattern	OLTP (Heavy Write)	OLTP (Heavy Write/Read)	OLTP (Heavy Write)
ACID	No (Tunable Consistency).	No (Single-row atomicity).	Strong Consistency (Single row).
Scalability	Horizontal. AP (Eventual Consistency).	Horizontal. CP (Strong Consistency).	Horizontal. CP (Strong Consistency).
Use Case	IoT sensor data, activity logs, messaging apps (Discord/Messenger).	Time-series data, financial data, user history (Gmail/Search/Maps).	Hadoop-native random access storage.
Latency	Low (Single digit ms).	Low (Single digit ms).	Low to Medium.
Managed?	Yes (AWS Keyspaces, Datastax).	Fully Managed (Serverless).	Self-managed or Cloud Bigtable.
Critical Info	“Masterless” architecture (no single point of failure). Great for writes.	Powered by Colossus (Google’s file system). Storage/Compute decoupled.	Built on top of HDFS. Requires ZooKeeper. High operational overhead.

5. NoSQL: Document & Search - The Flexible Ones

Used for unstructured data and complex text searching.

Technology	MongoDB	Elasticsearch
Type	NoSQL (Document - JSON)	NoSQL (Search Engine / Document)
Pattern	OLTP	OLAP (Text Analytics / Log Search)
ACID	Yes (Multi-document ACID supported since v4.0).	No (Near real-time, eventual consistency).
Scalability	Horizontal (Sharding). CP (Consistency/Partition Tolerance).	Horizontal. AP (Focuses on Availability).
Use Case	Content management, Catalogs, Rapid prototyping (Schema-less).	Full-text search, Log analytics (ELK Stack), Autocomplete.
Latency	Low (ms).	Low (ms) for search, Slower for updates.
Managed?	Yes (MongoDB Atlas).	Yes (Elastic Cloud, AWS OpenSearch).
Critical Info	Very developer-friendly. Flexible schema allows rapid iteration.	Based on Lucene. Not a primary datastore (danger of split-brain).

Summary of CAP Theorem Categorization

CP (Consistency + Partition Tolerance): MongoDB, HBase, Bigtable, Redis (if configured for strong consistency), Kafka.
- Trade-off: If the network partitions, the system may stop accepting writes to ensure data doesn’t diverge.
AP (Availability + Partition Tolerance): Cassandra, DynamoDB, Couchbase, Aerospike (in AP mode).
- Trade-off: If the network partitions, the system stays up, but you might read stale data (“Eventual Consistency”).
CA (Consistency + Availability): RDBMS (Postgres, MySQL).
- Trade-off: They cannot handle network partitions well. They usually run on a single node or a primary/replica setup where partition tolerance is sacrificed (the system goes down or becomes read-only).

Summary Table for Quick Reference

Name	Category	ACID	CAP	Best For
Postgres	RDBMS	Yes	CA	Relational data, Core Apps
BigQuery	Data Warehouse	No	-	Analytics on Petabytes
Redis	Key-Value	Atomic	CP	Caching, Sessions, Queues
Aerospike	Key-Value	Atomic	AP	Real-time Feature Store (Fraud)
Cassandra	Wide-Column	No	AP	Massive Write workloads (IoT)
Bigtable	Wide-Column	Atom	CP	High throughput K-V at scale
Elastic	Search	No	AP	Text Search, Log Analysis
Mongo	Document	Yes	CP	Flexible Schema, Catalogs

What’s the difference between ‘Columner’ and ‘Wide-Column’ data storage solutions?

That is an excellent question that clarifies a very common confusion. Although both terms contain the word “column,” columnar and wide-column storage systems are built for completely different purposes and have vastly different internal designs.

Here is a side-by-side comparison.

Columnar Storage (The OLAP/Analytics Engine)

Columnar storage systems are built to run analytical queries (OLAP) by prioritizing reading efficiency over everything else.

Core Idea: Storage by Attribute

How it Works: Data is stored on disk grouped by attribute (column), not by row.
- Example: A table with columns (User_ID, Name, Sale_Amount, Date). All Sale_Amount values are stored next to each other on the disk, and all Date values are stored next to each other.
Best For: Read-heavy, analytical queries that involve aggregating data over large numbers of rows.
Key Benefit (I/O Reduction): When you run a query like SELECT AVG(Sale_Amount), the database only reads the Sale_Amount column from the disk. It completely skips reading the User_ID, Name, and Date columns, which drastically reduces I/O and speeds up the query.
Data Types: Optimized for homogeneous, structured data (like SQL tables).
Examples: Google BigQuery, Amazon Redshift, Snowflake.

Wide-Column Stores (The Scalable OLTP Engine)

Wide-column stores are a NoSQL database design built to handle massive write throughput and petabyte-scale horizontal scaling. They are designed for operational workloads.

Core Idea: Flexible, Distributed Rows

How it Works: Data is stored as rows, but each row can have a massive number of columns (hence “wide”). Unlike a traditional RDBMS, columns can be created on the fly, and different rows in the same table can have different sets of columns. The columns are grouped into structures called “Column Families” or “Bins.”
Best For: Extremely high-throughput writes (like log ingestion, sensor data, or storing user history) and fast key-based lookups.
Key Benefit (Scalability & Flexibility): The database automatically partitions rows across thousands of machines based on the primary key, providing linear scalability for writes and reads. The dynamic schema allows rapid application development.
Data Types: Optimized for mutable, semi-structured data where the schema is expected to evolve.
Examples: Apache Cassandra, Google Bigtable, Apache HBase, Aerospike (conceptually shares this distributed, K-V model).

Side-by-Side Comparison

Feature	Columnar Storage	Wide-Column Stores
Primary Goal	Analytics (OLAP)	Scalable Transactions (OLTP/Real-Time)
Data Layout	Data is stored by column. (All `Sale_Amount` values are adjacent).	Data is stored by row (but rows are wide and flexible).
Optimization	Read I/O. Only read the columns you need.	Write Throughput and Horizontal Partitioning.
Schema	Fixed (Schema-on-Write).	Flexible/Dynamic (Schema-on-Read).
Typical Query	`SELECT SUM(Sales) GROUP BY Month`	`GET user_data WHERE user_id = 123`
Key Advantage	Massive cost and speed savings on analytical queries.	Linear scalability and predictable low latency for key lookups.
Examples	BigQuery, Redshift, Snowflake	Cassandra, Bigtable, ScyllaDB

Do Wide-column stores allow partitioning and indexing? If so, then explain how.

Wide-column stores absolutely allow both partitioning and indexing. In fact, these features are essential to their core function of providing massive horizontal scalability and fast retrieval.

The difference lies in how they implement them compared to a traditional SQL database.

Here is a breakdown of how partitioning and indexing work in a wide-column store (using Cassandra/Bigtable as primary examples).

1. Partitioning (The Key to Scale)

Partitioning is the primary mechanism for scaling wide-column stores horizontally. It dictates where data lives on the cluster.

How it Works

Partition Key (Required): When you define a table, you must designate one or more columns as the Partition Key (or Row Key in Bigtable).
Data Distribution: The database uses a hashing function on the Partition Key to determine which physical machine (node) in the cluster will store that data.
Data Grouping: All rows (records) that share the same Partition Key are guaranteed to be stored together on the same physical node.

The Impact

Fast Writes/Reads: Queries that target a specific partition key (e.g., SELECT * FROM users WHERE user_id = 'Alice') are extremely fast because the system knows exactly which single node to query, avoiding a cluster-wide search.
Scalability: If your cluster has 100 machines, partitioning ensures your 100 terabytes of data are evenly distributed across all 100 machines, allowing the database to scale linearly.
The Limitation: You must know the Partition Key to query efficiently. Trying to query without it results in an expensive “full cluster scan,” which is discouraged.

2. Indexing (The Key to Flexibility)

Indexing in wide-column stores is available, but it is implemented differently than in an RDBMS, and it often comes with significant performance trade-offs.

The Problem with Traditional RDBMS Indexes

In a traditional RDBMS, if you index the city column, the index is a separate structure that maps all city values to the exact physical location of the rows on disk. This is fast for lookups but complex to update.

The Wide-Column Approach: Secondary Indexes

Wide-column stores use Secondary Indexes to allow queries on non-primary-key columns.

Primary Access (The “Main Index”): The Partition Key acts as the primary, built-in index, optimized for that specific query.
Secondary Indexes: These are separate, secondary tables that the database creates and manages automatically.
- How it Works (Conceptual): If you create a secondary index on the status='active' column, the database effectively creates a new internal key-value table:
  - Key: 'status_active'
  - Value: [user_id_1, user_id_50, user_id_300, ...]
Trade-off (The “Cost”):
- Reads (Pros): They allow you to run queries like SELECT * FROM users WHERE city = 'New York', which would otherwise require a full cluster scan.
- Writes (Cons): Every time you update the city column on a user record, the database has to also update the massive, distributed Secondary Index table. This adds latency to your write operations, which defeats the purpose of using a high-throughput system like Cassandra.

Conclusion:

Both Partitioning and Indexing are used in wide-column stores. Partitioning is your mandatory, structural mechanism for scalability. Indexing is an optional feature (the secondary index) that provides query flexibility but adds significant overhead to your critical write path and must be used sparingly.

Show examples of some ‘good’ and ‘bad’ queries for storage systems like BigTable.

This is a fantastic way to understand the limitations and design philosophy of Bigtable. Unlike a relational database, where a simple SELECT can be slow due to poor indexing, Bigtable’s speed is entirely governed by how efficiently the system can locate and scan data using the Row Key.

Here is a list of three fast and three very slow queries on Bigtable.

Fast Queries (O(1) to O(logN) by Row Key)

The fast queries are all based on Row Key lookups or contiguous Row Key range scans. Bigtable is optimized to find data based on the key because the key maps directly to a specific server node in the cluster.

#	Query Goal	Row Key Structure	Why It’s Fast
1	Single Point Lookup (Finding one user’s profile)	`user_4321`	This is the fastest possible operation. Bigtable hashes the key and immediately sends the request to the one server that owns that row. It’s an O(1) lookup.
2	Time-Series Range Scan (Recent history for one device)	`device_XYZ#2025-11-19-14:00:00`	The user queries for `key_prefix='device_XYZ' AND time BETWEEN T1 AND T2`. Since all data for one device is stored contiguously on a few servers, Bigtable only scans a small, optimized range. This is often called a Row Key Prefix Scan.
3	Small Transaction Lookup (A tiny slice of the data)	`transaction_1234`	Similar to #1, this is a direct, targeted read. If your key design is good (e.g., you’ve structured the key as `user_ID#transaction_ID`), you still benefit from the fast primary lookup.

Slow Queries (Full Cluster Scan / Scatter-Gather)

The slow queries are those that force Bigtable to abandon the Row Key and scan large portions of the cluster to find the answer. This is the anti-pattern for Bigtable.

#	Query Goal	Reason for Slowness	Why It’s Slow (Full Scan)
1	Secondary Attribute Lookup (Finding all users in one country)	`SELECT * WHERE country_code = 'AUS'`	Bigtable has no secondary indexes. To answer this, it must read the `country_code` column of every single row on every single server in the cluster. This is the definition of an expensive, slow, Full Cluster Scan.
2	Complex Aggregation (Counting the total number of items)	`COUNT(*)` or `GROUP BY` on a non-key column.	Bigtable must read every single row in the cluster and then perform a cluster-wide aggregation. While faster than traditional databases, it is the slowest possible operation in Bigtable and defeats its design goals.
3	Poorly Designed Key Range Scan (Scanning a non-contiguous range)	`user_ID LIKE 'A%'` (If users A-Z are scattered)	If the users whose IDs start with ‘A’ are stored across 50 different servers (i.e., they are not contiguous on disk), the query requires 50 parallel network requests, making the operation slow and high-latency. Effective range scans require careful Row Key design.

The Core Principle

To use Bigtable correctly, you must ensure that all your read queries are satisfiable by the Row Key and its prefix. If you find yourself needing to query based on an attribute that is not part of the Row Key, you should use a different database (like Elasticsearch or BigQuery).

Explain CAP theory in a concise manner with examples.

Here is a simple explanation of the CAP Theorem, with examples for Kafka and Elasticsearch.

What is the CAP Theorem?

In simple terms, the CAP Theorem is a rule for distributed systems (like databases or applications spread across multiple computers). It states that any distributed system can only provide two of the following three guarantees at the same time:

C - Consistency:
- What it means: All clients see the same, most up-to-date data at the same time, no matter which node they connect to.
- Analogy: A bank account. If you withdraw money, your balance must be instantly and consistently updated everywhere. You can’t be shown an old, incorrect balance.
A - Availability:
- What it means: The system is always available to respond to requests. It never returns an error, even if some of its nodes are down or the data is slightly out-of-date.
- Analogy: A social media feed. It always loads, even if the “like” count you see is 30 seconds stale. It prefers to show you something rather than an error page.
P - Partition Tolerance:
- What it means: The system continues to operate even if the network breaks down between its nodes (a “network partition”). The nodes are still alive, but they just can’t talk to each other.

The Real Choice: C vs. A

In any real-world system, you must choose Partition Tolerance (P). Network failures will happen. The internet is unreliable, and servers will get disconnected.

Therefore, the CAP theorem isn’t really “pick two of three.” It’s:

“When a network partition (P) happens, you must choose between Consistency (C) and Availability (A).”

CP (Choose Consistency over Availability): When a partition happens, the system will become unavailable (it will stop responding or return errors) to avoid the risk of serving stale or incorrect data.
AP (Choose Availability over Consistency): When a partition happens, the system will stay available, but it might serve inconsistent (stale) data.

Example 1: Kafka (A CP System)

Kafka is a distributed streaming log. Its primary purpose is to be a durable, ordered, and truthful record of events.

How it works: Data is written to a topic partition. That partition has one “leader” node and several “follower” nodes that copy the data.
Consistency (C): For a write to be successful, the leader must write the data and then wait for a “quorum” of its in-sync followers (the ISR) to also copy it. Both writes and reads are served only by the leader.
Partition (P): What happens if the leader node is partitioned and can’t talk to its followers or the central coordinator (ZooKeeper/KRaft)?
The Choice: The cluster will elect a new leader from the followers that were in the ISR.
The Consequence: During the time the old leader is partitioned and a new leader is being elected (a process that can take seconds), that partition is unavailable. It will refuse to accept new writes.
Conclusion: Kafka is a CP system. It sacrifices availability (for a short time) to guarantee that the data log is perfectly consistent and no messages are ever lost or out of order. A bank would use a CP system.

Example 2: Elasticsearch (An AP System)

Elasticsearch is a distributed search and analytics engine. Its primary purpose is to be fast and always available for search queries.

How it works: Data is written to a “primary shard” (node) and then copied to “replica shards.”
Availability (A): A search query (a read request) can be answered by either the primary shard or any of the replica shards.
Partition (P): What happens if a node with a replica shard gets partitioned from the primary?
The Choice: The primary shard might receive new data, but the isolated replica doesn’t know about it. If a user’s read request hits that isolated replica, the replica will happily serve the stale data it has. It doesn’t know it’s inconsistent.
The Consequence: The system remains fully available for reads, but different users might see slightly different results (stale data) for a short time.
Conclusion: Elasticsearch is an AP system. It prioritizes availability (you can always search) at the cost of “eventual consistency.” A search engine or a product catalog, where being 10 seconds out-of-date is acceptable, would use an AP system.

Explain what are OLTP and OLAP systems.

That’s a core concept in data systems design. Understanding the difference between OLTP and OLAP defines what kind of database you should use for any given task.

Here is a detailed breakdown of OLTP and OLAP, including their characteristics and real-world examples.

What is OLTP? (Online Transaction Processing)

OLTP is about running the core business. It involves fast, frequent, short transactions that keep the lights on.

Core Characteristics

Primary Goal: Data integrity, high concurrency, and speed for small, atomic transactions.
Data Pattern: Read, Insert, Update, and Delete (CRUD) operations. Transactions are typically simple and affect only a few rows at a time.
Database Design: Optimized for row-based storage (faster to read/write entire records) and relies heavily on indexing (to locate a single record quickly).
Scale: Must handle high write volume and high concurrency (many users trying to update data simultaneously).
Integrity: Requires strict ACID compliance (Atomicity, Consistency, Isolation, Durability) to ensure no money is lost, and the data is never corrupted.
Data Volume: Typically holds the current, operational state (Terabytes, but focused on the speed of retrieval).

Real-World OLTP Examples

System	Transaction Example	Characteristic
Banking (ATM)	A deposit or withdrawal.	High concurrency, requires immediate write integrity.
E-commerce Checkout	Processing an order.	The transaction must either fully succeed or fully fail.
Website Login	Updating a user’s last login time.	Fast key-value lookup, small update.
User Feature Store	Updating a user’s `tx_count_last_hour` feature.	High-frequency read-modify-write operation.

Appropriate Databases: PostgreSQL, MySQL, Cassandra, Aerospike, Google Bigtable.

What is OLAP? (Online Analytical Processing)

OLAP is about running the business analysis. It involves complex queries on vast amounts of historical data to uncover trends and insights.

Core Characteristics

Primary Goal: Speed for complex queries, aggregation, and deep analysis.
Data Pattern: Read-only, large-scale, complex queries. Queries typically aggregate data over millions or billions of rows (e.g., SUM, AVG, GROUP BY).
Database Design: Optimized for columnar storage (faster to read only the columns needed) and Massively Parallel Processing (MPP). It uses partitioning and clustering instead of traditional indexes.
Scale: Optimized for massive read volume and high computational power.
Integrity: Typically follows BASE principles (Basically Available, Soft state, Eventually consistent) or is highly durable but not transactional.
Data Volume: Holds historical data, logs, and facts (Petabytes).

Real-World OLAP Examples

System	Query Example	Characteristic
Business Intelligence (BI)	What was the total revenue in Q3 for customers aged 30-40 in Europe?	Aggregates over a large time window and segment.
ML Training	Calculate the average transaction amount for all users over the past 3 years.	Full scan of historical logs to build feature vectors.
Reporting Dashboard	Show the latency trend of the API over the last 90 days.	High-volume reads of time-series data.
Financial Forecasting	Predict the sales volume for the next quarter by region.	Complex joins and large-scale computation.

Appropriate Databases: Google BigQuery, Amazon Redshift, Apache Hive, Snowflake.

Summary of Key Differences

Feature	OLTP (Transaction Processing)	OLAP (Analytical Processing)
Activity	Day-to-day operations (Updating a record).	Strategic analysis (Running a report).
Typical Query	`SELECT * WHERE user_id = 123`	`SELECT AVG(sales) GROUP BY country`
Metrics	Transactions per second (TPS).	Query latency for complex aggregates.
Data Structure	Row-based, indexed.	Columnar-based, partitioned.
Goal	Consistency and Speed of Writes/Updates.	Speed and efficiency of Reads/Scans.

Is there any solution to achieve massive scaling for an RDBMS system while still maintaining ACID compliance?

Yes, one such product in Google Spanner.

What is Spanner?

Google Cloud Spanner is Google’s globally distributed, horizontally scalable, strongly consistent, fully managed relational database.

In one line:

It combines the scalability of NoSQL with the consistency + SQL capabilities of a relational database.

It’s the backbone of many Google-scale systems (AdWords, Google Play, Gmail metadata, etc.) and is available as a fully managed service in GCP.

How does Spanner work? (the core concepts)

Spanner achieves something extremely rare: global distribution + horizontal scaling + strong consistency + SQL + ACID transactions. It uses a set of innovations to pull this off.

Let’s break it down simply.

1. Tables → Shards (Splits) → Distributed Across Nodes

Spanner automatically splits tables into “splits” based on primary key ranges:

You choose a primary key.
Spanner automatically shards data based on the key.
Each shard is stored on different nodes and can be replicated across regions.

This gives:

Horizontal scalability
Automatic load balancing
Automatic resharding when data grows

2. TrueTime: the magic ingredient

Spanner uses TrueTime, a globally synchronized clock API that provides bounded clock uncertainty.

TrueTime is powered by:

GPS receivers
Atomic clocks inside Google data centers

TrueTime gives timestamps with a guarantee:

t ∈ [earliest_possible_time, latest_possible_time]

This allows Spanner to:

Order transactions globally
Avoid inconsistency due to clock drift
Achieve external consistency (strongest consistency you can get)

TrueTime is why Spanner can do global ACID transactions, something most distributed databases can’t do.

3. Synchronous Replication Across Regions

Data is replicated using Paxos/Raft-like quorum consensus, not asynchronous replication.

For example (3 replicas):

Majority (2/3) must acknowledge writes.
If one replica fails, the system continues.

This provides:

No data loss
Strong consistency
Geographic availability

You can choose:

multi-region,
regional, or
multi-zone configurations

4. Strongly Consistent Transactions

Spanner supports:

Strongly consistent reads (linearizable)
ACID transactions
Serializable isolation across distributed shards

SQL support includes:

ANSI SQL
Joins
Secondary indexes
Mutations
Commit timestamps
Change streams (CDC)

This is full relational behavior but at horizontal scale.

5. Automatic scaling + storage separation

Compute and storage scale independently.

Spanner:

Automatically adds splits as data grows
Rebalances hot shards
Handles replica placement
Manages failovers automatically

This is crucial for YouTube-scale or multi-petabyte workloads.

6. Fully Managed

You don’t manage:

Replication
Sharding
Clock sync
Failover
Backups
Storage resizing

GCP handles all of that.

7. Use cases where Spanner shines

Global-scale OLTP
Financial systems needing strong consistency
Inventory, account balances, orders
Gaming backends
Multi-region user databases
Enterprise SaaS with global tenants
Google-scale metadata systems
High-QPS transactional workloads

The moment you need:

billions of rows
globally consistent transactions
horizontal scale
SQL + strong consistency Spanner is often the right answer.

8. When NOT to use Spanner

Spanner is not cheap. It is premium infrastructure.

Avoid Spanner when:

Dataset is small (<1 TB)
QPS is low
You don’t need multi-region or strong global consistency
You need simple lookups only (use Bigtable)
You need flexible schema/noSQL (use Firestore/Bigtable)

9. Comparison: Bigtable vs Spanner

Feature	Bigtable	Spanner
Type	Wide-column NoSQL	Distributed RDBMS
Schema	Semi-structured	Structured SQL
Transactions	Single row	Global, ACID
Joins	No	Yes
Consistency	Eventually/strong (per row)	Strong globally
Use cases	High-throughput lookups	Global OLTP
Scaling	Massive linear scale	Massive linear scale
Pricing	Cheaper	Premium

If you need:

key-value store → Bigtable
global SQL database → Spanner

TL;DR Answer

How does Spanner work? Using sharded tables, synchronous replication, Paxos, and TrueTime global clock to provide globally consistent, scalable SQL with ACID transactions.
Is it available in GCP? Yes, it is a fully managed GCP service known as Cloud Spanner.

Since Google Cloud Spanner is a distributed RDBMS, does it support all 3 properties in CAP theorem?

Yes, Google Cloud Spanner is a unique distributed database that challenges the traditional understanding of the CAP theorem.

While the CAP theorem states that a distributed system can only guarantee two out of the three properties (Consistency, Availability, and Partition Tolerance), Spanner effectively provides Consistency (C) and Availability (A) while running on a highly partitioned, global network.

How Spanner “Cheats” CAP

It doesn’t technically violate CAP, but it minimizes the trade-off to a point where it feels like it supports all three.

Consistency (Strong External Consistency): Spanner provides global, synchronous replication. It uses a unique timekeeping API called TrueTime, which relies on GPS and atomic clocks in Google’s data centers. This allows Spanner to assign globally consistent timestamps to transactions, ensuring strict serializability across the globe.
Availability (High Availability): Google claims 99.999% availability for Spanner. It achieves this through massive redundancy and a highly optimized network.
Partition Tolerance (The “Technical” Trade-off): In the strict CAP sense, Spanner is a CP system. If a major network partition occurs (e.g., a fiber cut isolating a continent), Spanner chooses Consistency over Availability. The isolated partition will stop accepting writes to maintain data integrity.
- However: Google’s private global network is so redundant that such partitions are exceptionally rare. Thus, in practice, users experience it as a CA system that also handles partitioning (P) seamlessly 99.999% of the time.

So, practically speaking, Spanner is often described as a CA database that is effectively P-tolerant due to Google’s infrastructure, though theoretically, it remains a CP system.

ML Platform

ML-Platform

This post is licensed under CC BY 4.0 by the author.

Provide a consolidated view of popular data storage technologies relevant in ML based Ecosystems.

1. Relational Databases (RDBMS) - The General Purpose Workhorses

2. Data Warehouses (OLAP) - The Analytics Engines

3. NoSQL: Key-Value & In-Memory - The Speed Demons

4. NoSQL: Wide-Column Stores - The Big Data Writers

5. NoSQL: Document & Search - The Flexible Ones

Summary of CAP Theorem Categorization

Summary Table for Quick Reference

What’s the difference between ‘Columner’ and ‘Wide-Column’ data storage solutions?

Columnar Storage (The OLAP/Analytics Engine)

Core Idea: Storage by Attribute

Wide-Column Stores (The Scalable OLTP Engine)

Core Idea: Flexible, Distributed Rows

Side-by-Side Comparison

Do Wide-column stores allow partitioning and indexing? If so, then explain how.

1. Partitioning (The Key to Scale)

How it Works

The Impact

2. Indexing (The Key to Flexibility)

The Problem with Traditional RDBMS Indexes

The Wide-Column Approach: Secondary Indexes

Show examples of some ‘good’ and ‘bad’ queries for storage systems like BigTable.

Fast Queries (O(1) to O(logN) by Row Key)

Slow Queries (Full Cluster Scan / Scatter-Gather)

The Core Principle

Explain CAP theory in a concise manner with examples.

What is the CAP Theorem?

The Real Choice: C vs. A

Example 1: Kafka (A CP System)

Example 2: Elasticsearch (An AP System)

Explain what are OLTP and OLAP systems.

What is OLTP? (Online Transaction Processing)

Core Characteristics

Real-World OLTP Examples

What is OLAP? (Online Analytical Processing)

Core Characteristics

Real-World OLAP Examples

Summary of Key Differences

Is there any solution to achieve massive scaling for an RDBMS system while still maintaining ACID compliance?

What is Spanner?

How does Spanner work? (the core concepts)

1. Tables → Shards (Splits) → Distributed Across Nodes

2. TrueTime: the magic ingredient

3. Synchronous Replication Across Regions

4. Strongly Consistent Transactions

5. Automatic scaling + storage separation

6. Fully Managed

7. Use cases where Spanner shines

8. When NOT to use Spanner

9. Comparison: Bigtable vs Spanner

TL;DR Answer

Since Google Cloud Spanner is a distributed RDBMS, does it support all 3 properties in CAP theorem?

How Spanner “Cheats” CAP

Trending Tags