Day 4 of the 30-Day System Design Roadmap
"Just use MongoDB!" a junior developer once told me during a system design review for a banking application. "It's web scale!"
I smiled, knowing we were about to have an important conversation. Three months later, after countless headaches with inconsistent account balances and a near-disaster with lost transactions, that same developer came to my desk and said, "I think I understand why you were hesitant."
Here's the thing: choosing a database isn't about what's trendy or what your favorite tech influencer recommends. It's not about MongoDB vs PostgreSQL vs Redis in some abstract battle. It's about understanding your data's personality, your application's needs, and the trade-offs you're willing to make.
I've spent the last decade designing database architectures for everything from small startups to systems handling billions of transactions. And you know what? The database choice can make or break your application—not because one is inherently "better," but because each one is designed for specific problems.
Today, we're going to explore database design patterns with both theory and practical examples. By the end, you'll know how to choose the right data store for your specific situation.
Let me paint you a picture. You're building a social media platform—the next big thing. You sit down to design your data layer and realize you need to handle:
- User profiles with structured information (email, name, birthdate)
- Social connections (who follows whom) that you'll want to query as a graph
- A high-volume stream of posts, likes, and comments that powers each user's feed
- Payment and transaction records that must stay consistent
Your first instinct might be to pick one database and force everything into it. I've seen teams do this countless times. They choose PostgreSQL and struggle with graph queries for social connections. Or they choose MongoDB and fight to ensure data consistency for financial transactions.
Here's the reality that took me years to accept: Modern applications don't use just one database. Netflix uses multiple types. So does Amazon. So does pretty much every company operating at scale.
The SQL versus NoSQL debate is often framed wrong. People treat it like choosing sides in a sports game. But it's more like choosing between a sedan and a pickup truck—both are vehicles, but they're designed for different jobs.
SQL databases—PostgreSQL, MySQL, Oracle—are like that reliable friend who always keeps their promises. They're built around ACID properties and excel when you need absolute certainty about your data.
I once worked on an e-commerce platform processing millions of dollars daily. When a customer places an order, we need to:
1. Check if the product is in stock
2. Reduce the inventory count
3. Create an order record
4. Charge the customer's payment method
Imagine if step 3 succeeded but step 4 failed. You'd have an order in the system but no payment. This is where SQL databases shine—transactions ensure either everything succeeds or nothing does.
Here's how it looks in practice:
-- E-commerce order processing with ACID guarantees (PostgreSQL / PL/pgSQL)
DO $$
DECLARE
    new_order_id INTEGER;
BEGIN
    -- Step 1: Check and reduce inventory
    UPDATE inventory
    SET quantity = quantity - 1
    WHERE product_id = 'PROD-123'
      AND quantity > 0;

    -- If no rows were affected, the product is out of stock
    IF NOT FOUND THEN
        RAISE EXCEPTION 'Product out of stock';
    END IF;

    -- Step 2: Create the order
    INSERT INTO orders (user_id, product_id, amount, status)
    VALUES (42, 'PROD-123', 99.99, 'pending')
    RETURNING order_id INTO new_order_id;

    -- Step 3: Process payment
    INSERT INTO payments (order_id, amount, status)
    VALUES (new_order_id, 99.99, 'processing');

    -- The block runs as a single transaction:
    -- everything succeeds together, or an exception rolls it all back
END $$;

SQL databases are your go-to when:
- You're dealing with money (banking, payments, e-commerce)
- Your data is highly relational and you rely on joins
- You need strong consistency and ACID transactions
NoSQL databases trade some of SQL's guarantees for massive scalability and flexibility. Let me tell you about a real problem I faced.
I consulted for a social media startup using PostgreSQL for their user feed. Everything worked fine until they hit 100,000 users. Then performance collapsed. The problem? To build each user's feed, they were running complex joins across five tables:
-- This query became a nightmare at scale
SELECT
p.post_id,
p.content,
p.created_at,
u.user_id,
u.name,
u.avatar_url,
COUNT(DISTINCT l.like_id) as like_count,
COUNT(DISTINCT c.comment_id) as comment_count,
EXISTS(
SELECT 1 FROM likes
WHERE post_id = p.post_id AND user_id = @current_user_id
) as user_has_liked
FROM posts p
INNER JOIN users u ON p.author_id = u.user_id
INNER JOIN followers f ON f.followed_id = p.author_id
LEFT JOIN likes l ON l.post_id = p.post_id
LEFT JOIN comments c ON c.post_id = p.post_id
WHERE f.follower_id = @current_user_id
GROUP BY p.post_id, u.user_id
ORDER BY p.created_at DESC
LIMIT 50;

Each feed request joined five tables, plus a correlated subquery. At scale, this was a disaster.
We migrated to MongoDB with a denormalized approach:
// Single document contains everything needed for display
{
"_id": "feed_user_12345",
"user_id": 12345,
"posts": [
{
"post_id": "post_789",
"content": "Just deployed to production on Friday! 🚀",
"created_at": ISODate("2025-01-15T14:30:00Z"),
"author": {
"user_id": "user_456",
"name": "Jane Doe",
"avatar_url": "https://cdn.example.com/jane.jpg"
},
"stats": {
"likes": 42,
"comments": 7
},
"user_has_liked": false
}
// ... more posts
],
"last_updated": ISODate("2025-01-15T14:35:00Z")
}
// Fetching a feed becomes one simple query
db.feeds.findOne({ user_id: 12345 })

Feed load times dropped from 2 seconds to 50 milliseconds. Yes, we had to update more places when data changed. But the trade-off was worth it for the read performance.
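To make that trade-off concrete, here is a minimal sketch of what "updating more places" looks like, assuming PyMongo and the denormalized feeds collection above (the field values are hypothetical). When an author changes their avatar, every embedded copy of it has to be rewritten:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client.social

# Write amplification: one profile change fans out to every cached feed
# document that embeds this author.
db.feeds.update_many(
    {"posts.author.user_id": "user_456"},
    {"$set": {"posts.$[p].author.avatar_url": "https://cdn.example.com/jane_v2.jpg"}},
    array_filters=[{"p.author.user_id": "user_456"}]
)

One slow, multi-document write in exchange for millions of fast single-document reads.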
NoSQL databases are your go-to when:
- You need to scale horizontally (add more servers)
- Your schema is flexible or evolves quickly
- You can accept eventual consistency in exchange for availability and speed
Here's what most tutorials won't tell you: both approaches have painful limitations.
SQL databases struggle with horizontal scaling. When you outgrow a single server, sharding is complex and you lose the ability to join across shards.
NoSQL databases struggle with consistency. When data is spread across servers, you might not see your own write immediately. Transactions across multiple documents are limited or non-existent.
The key is understanding which limitations you can live with.
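As a concrete illustration of the sharding limitation, here's a minimal sketch assuming a hypothetical SHARDS connection map keyed by a hash of user_id. Once users live on different servers, a query that joins two of them can no longer be a single SQL statement:

import hashlib

# Hypothetical shard map: user rows are split across three PostgreSQL servers
SHARDS = {
    0: "postgresql://shard0.internal/app",
    1: "postgresql://shard1.internal/app",
    2: "postgresql://shard2.internal/app",
}

def shard_for(user_id: int) -> str:
    """Route a user to a shard by hashing their id."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Users 42 and 1337 may land on different servers, so
# "SELECT ... FROM users u JOIN follows f ..." across them is no longer one
# query: the application has to fetch from each shard and merge the results.
print(shard_for(42), shard_for(1337))

NoSQL systems push that merge-in-the-application work onto you from day one; sharded SQL just delays the same pain.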
ACID stands for Atomicity, Consistency, Isolation, and Durability. These aren't just buzzwords—they're fundamental guarantees that can save you from catastrophic failures.
Atomicity means operations either all succeed or all fail. No partial success.
# Without atomicity - DISASTER waiting to happen
def transfer_money_bad(from_account, to_account, amount):
    balance = get_balance(from_account)
    if balance >= amount:
        deduct(from_account, amount)  # What if this succeeds...
        add(to_account, amount)       # But this fails? Money disappears!

# With atomicity - SAFE
def transfer_money_good(from_account, to_account, amount):
    with database.transaction():
        balance = get_balance(from_account)
        if balance >= amount:
            deduct(from_account, amount)
            add(to_account, amount)
        # Both succeed or both fail - no middle ground

I learned this the hard way consulting for a startup with a race condition in their wallet system. Money could be deducted but not credited due to network timeouts. They lost $50K before discovering the bug. Atomicity would have prevented this entirely.
Consistency means your database enforces rules, even when your application code has bugs.
-- Database-level constraints ensure data integrity
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL, -- No duplicate emails, ever
age INTEGER CHECK (age >= 18), -- Must be adult
balance DECIMAL(10,2) CHECK (balance >= 0), -- Can't go negative
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id), -- Must be valid user
amount DECIMAL(10,2) NOT NULL CHECK (amount > 0)
);

Think about an inventory system. You have a rule: quantity can never go below zero. Without database-level consistency, two customers could buy the last item simultaneously:
# Race condition without proper constraints
# Customer A: Check quantity = 1 ✓
# Customer B: Check quantity = 1 ✓
# Customer A: Create order ✓
# Customer B: Create order ✓
# Result: Two orders, one item!

# With proper isolation and constraints
def purchase_item(user_id, product_id):
    with db.transaction(isolation_level='SERIALIZABLE'):
        product = db.execute(
            "SELECT * FROM inventory WHERE product_id = %s FOR UPDATE",
            (product_id,)
        )
        if product.quantity < 1:
            raise OutOfStockError()
        db.execute(
            "UPDATE inventory SET quantity = quantity - 1 "
            "WHERE product_id = %s AND quantity > 0",
            (product_id,)
        )
        db.execute(
            "INSERT INTO orders (user_id, product_id) VALUES (%s, %s)",
            (user_id, product_id)
        )

Isolation is about what happens when multiple operations occur simultaneously. Databases offer different isolation levels, and choosing the right one is a balancing act:
Read Committed (default for most databases): You only see data that's been committed. Good for most applications.
Repeatable Read: Same query returns same results throughout a transaction. Great for reports.
Serializable: Complete isolation, as if transactions ran one at a time. Maximum safety, minimum concurrency.
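One practical consequence of Serializable worth knowing: the database may abort a transaction that conflicts with a concurrent one, and callers are expected to retry. A minimal sketch, assuming psycopg2 against PostgreSQL; do_work and the connection string are placeholders:

import psycopg2
from psycopg2 import errors
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

def run_serializable(do_work, retries=3):
    """Run do_work(conn) under SERIALIZABLE isolation, retrying on conflicts."""
    conn = psycopg2.connect("dbname=bookings")
    conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)
    try:
        for attempt in range(retries):
            try:
                do_work(conn)
                conn.commit()
                return
            except errors.SerializationFailure:
                conn.rollback()  # conflicted with a concurrent transaction: retry
        raise RuntimeError("Transaction kept conflicting; gave up after retries")
    finally:
        conn.close()

The booking example below pairs Repeatable Read with an explicit row lock instead; under heavy contention it can still hit the same kind of serialization error, so a retry wrapper like this is useful there too.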
# Example: Booking system with proper isolation
from sqlalchemy import create_engine, text

# Configure the isolation level based on your needs
engine = create_engine(
    'postgresql://localhost/bookings',
    isolation_level='REPEATABLE READ'  # Consistent reads within a transaction
)

def book_seat(user_id, seat_id):
    with engine.begin() as conn:
        # Lock the row for update
        seat = conn.execute(
            text("SELECT * FROM seats WHERE seat_id = :seat_id FOR UPDATE"),
            {"seat_id": seat_id}
        ).fetchone()
        if not seat.available:
            raise SeatAlreadyBookedError()
        conn.execute(
            text("UPDATE seats SET available = false, booked_by = :user_id "
                 "WHERE seat_id = :seat_id"),
            {"user_id": user_id, "seat_id": seat_id}
        )

Durability means that once a transaction commits, it survives crashes, power failures, everything. This is why database write operations are slower than pure memory operations—durability has a cost, but for critical data, it's worth it.
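The cost comes from forcing each commit onto disk before acknowledging it. As a hedged, PostgreSQL-specific illustration (using psycopg2; the database name is a placeholder), synchronous_commit is the setting that trades durability for commit latency:

import psycopg2

conn = psycopg2.connect("dbname=payments")
cur = conn.cursor()

# Default: the commit waits for the WAL record to reach disk, so an
# acknowledged write survives a crash (full durability).
cur.execute("SET synchronous_commit = on")

# Relaxed: commits return before the flush. Throughput improves, but the
# most recent transactions can be lost if the server crashes -- acceptable
# for metrics or logs, not for account balances.
cur.execute("SET synchronous_commit = off")
conn.commit()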
The CAP theorem states you can only have 2 out of 3 in a distributed system:
- Consistency: All nodes see the same data at the same time
- Availability: Every request gets a response, even when some nodes are down
- Partition tolerance: The system keeps working when the network between nodes breaks
Imagine two database servers in different data centers. The network connection breaks (this happens regularly in real systems). You face a choice:
Option A: Stop accepting writes until connection is restored. This maintains consistency but sacrifices availability.
Option B: Continue accepting writes on both servers. This maintains availability but sacrifices consistency (servers now have different data).
There's no third option. You must choose.
CP systems choose consistency over availability. MongoDB with majority write concern, HBase, and Consul are CP systems.
# MongoDB configured for strong consistency
from pymongo import MongoClient, WriteConcern

client = MongoClient('mongodb://localhost:27017/')
db = client.banking

# Write succeeds only if a majority of replicas confirm.
# If the replica set can't reach a majority, the write fails (less available),
# but all replicas agree on the data (consistent).
accounts = db.accounts.with_options(
    write_concern=WriteConcern(w='majority', j=True)
)
result = accounts.insert_one({
    'account_id': 'ACC-123',
    'balance': 1000.00,
    'currency': 'USD'
})

When to choose CP: Financial transactions, inventory systems, any scenario where wrong data is worse than slow data.
AP systems choose availability over consistency. Cassandra, DynamoDB, and Riak lean this way by default, though their consistency levels are tunable.
# Cassandra configured for high availability
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['node1', 'node2', 'node3'])
session = cluster.connect('social_media')

# Write succeeds even if some nodes are down (available)
# But might be temporarily inconsistent across nodes
statement = SimpleStatement(
    "INSERT INTO user_sessions (user_id, session_data) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ONE  # Only one node needs to acknowledge
)
session.execute(statement, (user_id, session_data))

When to choose AP: Social media feeds, shopping carts, user sessions—scenarios where availability trumps consistency.
At a previous company, we didn't choose one philosophy. We analyzed each data type:
# User accounts: PostgreSQL (CP)
class UserService:
    def update_balance(self, user_id, amount):
        # Strong consistency required for money
        with postgres_db.transaction():
            return postgres_db.execute(
                "UPDATE accounts SET balance = balance + %s "
                "WHERE user_id = %s",
                (amount, user_id)
            )

# User sessions: Redis (AP)
class SessionService:
    def create_session(self, user_id, data):
        # High availability, eventual consistency OK
        return redis_cluster.setex(
            f"session:{user_id}",
            3600,  # 1 hour
            json.dumps(data)
        )

# Shopping cart: Hybrid approach
class CartService:
    def add_item(self, user_id, item):
        # Optimistic update to local cache (AP)
        cached = redis_client.get(f"cart:{user_id}")
        cart = json.loads(cached) if cached else []
        cart.append(item)
        redis_client.set(f"cart:{user_id}", json.dumps(cart))
        # Async sync to PostgreSQL for durability (CP)
        task_queue.enqueue(sync_cart_to_db, user_id, cart)

Let's explore different database types with practical examples.
Relational databases (PostgreSQL, MySQL) are perfect for structured data with complex relationships.
-- E-commerce schema showing the power of relationships
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock INTEGER NOT NULL CHECK (stock >= 0)
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(user_id),
total DECIMAL(10,2) NOT NULL,
status VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE order_items (
order_id INTEGER REFERENCES orders(order_id),
product_id INTEGER REFERENCES products(product_id),
quantity INTEGER NOT NULL,
price DECIMAL(10,2) NOT NULL,
PRIMARY KEY (order_id, product_id)
);
-- Complex queries are natural and efficient
SELECT
u.email,
COUNT(DISTINCT o.order_id) as total_orders,
SUM(o.total) as total_spent,
AVG(o.total) as avg_order_value
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.created_at >= NOW() - INTERVAL '30 days'
GROUP BY u.email
HAVING COUNT(DISTINCT o.order_id) > 5
ORDER BY total_spent DESC;

When to use: Financial systems, CRM, ERP, any application with complex relationships.
Document databases like MongoDB are perfect for hierarchical data and flexible schemas.
// Blog post in MongoDB - naturally nested structure
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"title": "Database Design Patterns",
"slug": "database-design-patterns",
"author": {
"user_id": 12345,
"name": "Hrithik Dhakrey",
"avatar": "https://cdn.example.com/avatar.jpg"
},
"content": "Today we're learning about databases...",
"tags": ["databases", "system-design", "backend"],
"published_at": ISODate("2025-01-15T10:00:00Z"),
"stats": {
"views": 15420,
"likes": 342,
"shares": 67
},
"comments": [
{
"comment_id": "cmt_001",
"user": {
"user_id": 67890,
"name": "Jane Developer"
},
"text": "Great article!",
"created_at": ISODate("2025-01-15T11:30:00Z")
}
]
}
// One query gets everything - no joins needed
const post = await db.posts.findOne({ slug: "database-design-patterns" });
// Easy to evolve - add new fields without migration
await db.posts.updateOne(
{ _id: post._id },
{
$set: {
video_url: "https://youtube.com/...",
reading_time: 15
}
}
);

When to use: Content management, product catalogs, user profiles with varying fields.
Key-value stores like Redis are perfect for caching and simple, fast lookups.
import redis
import json

redis_client = redis.Redis(host='localhost', port=6379)

# Session management
def create_session(user_id, session_data):
    session_id = generate_session_id()
    redis_client.setex(
        f"session:{session_id}",
        3600,  # 1 hour expiry
        json.dumps(session_data)
    )
    return session_id

# Rate limiting
def check_rate_limit(user_id, max_requests=100, window=3600):
    key = f"rate_limit:{user_id}"
    current = redis_client.incr(key)
    if current == 1:
        redis_client.expire(key, window)
    return current <= max_requests

# Caching expensive queries
def get_user_profile(user_id):
    cache_key = f"user:{user_id}"
    # Try cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
    # Cache miss - fetch from database
    user = postgres_db.query("SELECT * FROM users WHERE id = %s", user_id)
    # Cache for 5 minutes
    redis_client.setex(cache_key, 300, json.dumps(user))
    return user

# Leaderboard with sorted sets
def update_leaderboard(user_id, score):
    redis_client.zadd("game_leaderboard", {user_id: score})

def get_top_players(limit=10):
    return redis_client.zrevrange("game_leaderboard", 0, limit - 1, withscores=True)

When to use: Caching, sessions, real-time features, rate limiting.
Graph databases like Neo4j are perfect when relationships are as important as the data itself.
// Social network in Neo4j
// Create users and relationships
CREATE (alice:User {user_id: 1, name: 'Alice'})
CREATE (bob:User {user_id: 2, name: 'Bob'})
CREATE (carol:User {user_id: 3, name: 'Carol'})
CREATE (dave:User {user_id: 4, name: 'Dave'})
CREATE (alice)-[:FOLLOWS]->(bob)
CREATE (alice)-[:FOLLOWS]->(carol)
CREATE (bob)-[:FOLLOWS]->(carol)
CREATE (bob)-[:FOLLOWS]->(dave)
CREATE (carol)-[:FOLLOWS]->(dave)

// Find users who follow the same people Alice follows (try this efficiently in SQL!)
MATCH (alice:User {name: 'Alice'})-[:FOLLOWS]->(mutual)<-[:FOLLOWS]-(friend)
WHERE alice <> friend
RETURN DISTINCT friend.name

// Friend recommendations (friends of friends you don't follow yet)
MATCH (alice:User {name: 'Alice'})-[:FOLLOWS*2..2]->(recommendation)
WHERE NOT (alice)-[:FOLLOWS]->(recommendation)
AND alice <> recommendation
RETURN recommendation.name, COUNT(*) as connection_strength
ORDER BY connection_strength DESC
LIMIT 5

When to use: Social networks, recommendation engines, fraud detection, knowledge graphs.
Time-series databases like InfluxDB are perfect for timestamped data and analytics.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="token", org="myorg")
write_api = client.write_api(write_options=SYNCHRONOUS)

# Recording application metrics
def track_api_request(endpoint, response_time, status_code):
    point = Point("api_metrics") \
        .tag("endpoint", endpoint) \
        .tag("status_code", str(status_code)) \
        .field("response_time_ms", response_time) \
        .field("value", 1)
    write_api.write(bucket="monitoring", record=point)

# Recording IoT sensor data
def record_sensor_data(sensor_id, temperature, humidity):
    point = Point("sensor_readings") \
        .tag("sensor_id", sensor_id) \
        .tag("location", "warehouse_a") \
        .field("temperature", temperature) \
        .field("humidity", humidity)
    write_api.write(bucket="iot", record=point)

# Querying time-series data
query = '''
from(bucket: "monitoring")
  |> range(start: -24h)
  |> filter(fn: (r) => r["_measurement"] == "api_metrics")
  |> filter(fn: (r) => r["endpoint"] == "/api/orders")
  |> aggregateWindow(every: 1h, fn: mean)
  |> yield(name: "mean_response_time")
'''
result = client.query_api().query(query)

When to use: Application monitoring, IoT data, financial tick data, analytics.
Let's design a real system. Here's how I'd architect a social media platform using multiple databases.
"""
Database Strategy for Social Media Platform:
1. PostgreSQL - User accounts and authentication (needs ACID)
2. MongoDB - Posts and comments (flexible schema, high write volume)
3. Neo4j - Social graph (optimized for relationship queries)
4. Redis - Feed cache and sessions (speed critical)
5. InfluxDB - Analytics and metrics (time-series data)
"""
# 1. User Service (PostgreSQL)
class UserService:
    def create_user(self, email, password, name):
        """
        User accounts need ACID guarantees.
        Email must be unique. Password must be hashed.
        All or nothing.
        """
        with postgres_db.transaction():
            user = postgres_db.execute(
                "INSERT INTO users (email, password_hash, created_at) "
                "VALUES (%s, %s, NOW()) RETURNING user_id",
                (email, hash_password(password))
            )
            postgres_db.execute(
                "INSERT INTO user_profiles (user_id, name) VALUES (%s, %s)",
                (user.user_id, name)
            )
            return user.user_id

    def get_user(self, user_id):
        """
        Relational query across the account tables - SQL's strength
        """
        return postgres_db.query(
            "SELECT u.user_id, u.email, u.created_at, "
            "p.name, p.bio, p.avatar_url "
            "FROM users u "
            "JOIN user_profiles p ON u.user_id = p.user_id "
            "WHERE u.user_id = %s",
            (user_id,)
        )
# 2. Post Service (MongoDB)
class PostService:
    def create_post(self, user_id, content, media_urls=None):
        """
        Posts have a flexible schema - some have images, some videos, etc.
        MongoDB's document model fits naturally.
        """
        post = {
            'user_id': user_id,
            'content': content,
            'media_urls': media_urls or [],
            'created_at': datetime.now(),
            'stats': {'likes': 0, 'comments': 0, 'shares': 0},
            'comments': []  # Embedded for fast reads
        }
        result = mongo_db.posts.insert_one(post)
        # Async: update follower feeds
        task_queue.enqueue(update_feeds, user_id, str(result.inserted_id))
        return str(result.inserted_id)

    def add_comment(self, post_id, user_id, text):
        """
        Embedded comments - one query gets a post with all its comments
        """
        mongo_db.posts.update_one(
            {'_id': ObjectId(post_id)},
            {
                '$push': {
                    'comments': {
                        'comment_id': generate_id(),
                        'user_id': user_id,
                        'text': text,
                        'created_at': datetime.now()
                    }
                },
                '$inc': {'stats.comments': 1}
            }
        )
# 3. Social Graph Service (Neo4j)
class SocialGraphService:
    def follow_user(self, follower_id, followee_id):
        """
        Neo4j excels at relationship operations
        """
        query = """
        MATCH (follower:User {user_id: $follower_id})
        MATCH (followee:User {user_id: $followee_id})
        CREATE (follower)-[r:FOLLOWS {created_at: datetime()}]->(followee)
        RETURN r
        """
        neo4j_db.run(query, follower_id=follower_id, followee_id=followee_id)

    def get_friend_suggestions(self, user_id, limit=10):
        """
        Friends of friends - natural in a graph DB, a nightmare in SQL
        """
        query = """
        MATCH (user:User {user_id: $user_id})-[:FOLLOWS*2..2]->(suggestion)
        WHERE NOT (user)-[:FOLLOWS]->(suggestion)
        AND user <> suggestion
        WITH suggestion, COUNT(*) as strength
        ORDER BY strength DESC
        LIMIT $limit
        RETURN suggestion.user_id as user_id, strength
        """
        return neo4j_db.run(query, user_id=user_id, limit=limit)
# 4. Feed Service (Redis)
class FeedService:
    def get_feed(self, user_id, page=1, per_page=20):
        """
        Feeds must be blazing fast - Redis cache to the rescue
        """
        cache_key = f"feed:{user_id}:page:{page}"
        # Try cache first
        cached = redis_client.get(cache_key)
        if cached:
            return json.loads(cached)
        # Cache miss - build feed from MongoDB
        following = social_graph.get_following(user_id)
        posts = mongo_db.posts.find(
            {'user_id': {'$in': following}}
        ).sort('created_at', -1).skip((page - 1) * per_page).limit(per_page)
        feed = list(posts)
        # Cache for 5 minutes
        redis_client.setex(cache_key, 300, json.dumps(feed, default=str))
        return feed

    def invalidate_feed(self, user_id):
        """
        When a user posts, invalidate their followers' cached feed pages
        """
        followers = social_graph.get_followers(user_id)
        for follower_id in followers:
            # DEL doesn't accept wildcards, so find matching keys explicitly
            for key in redis_client.scan_iter(f"feed:{follower_id}:page:*"):
                redis_client.delete(key)
# 5. Analytics Service (InfluxDB)
class AnalyticsService:
    def track_post_view(self, post_id, user_id):
        """
        Millions of events per day - a time-series DB handles it
        """
        point = Point("post_views") \
            .tag("post_id", str(post_id)) \
            .tag("user_id", str(user_id)) \
            .field("value", 1)
        influx_write_api.write(bucket="analytics", record=point)

    def get_trending_posts(self, hours=1):
        """
        Time-based aggregation - InfluxDB's specialty
        """
        query = f'''
        from(bucket: "analytics")
          |> range(start: -{hours}h)
          |> filter(fn: (r) => r["_measurement"] == "post_views")
          |> group(columns: ["post_id"])
          |> count()
          |> sort(columns: ["_value"], desc: true)
          |> limit(n: 20)
        '''
        return influx_query_api.query(query)

Each database does what it's optimized for:
- PostgreSQL ensures user account integrity with ACID transactions
- MongoDB absorbs high-volume, flexible post and comment documents
- Neo4j answers follow and recommendation queries over the social graph
- Redis keeps feed reads and sessions fast
- InfluxDB handles millions of analytics events and time-based aggregations
Could we use just PostgreSQL? Technically yes. But we'd struggle with feed performance, graph queries would be complex, and analytics would slow down operational tables.
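To see why "graph queries would be complex," here is a hedged sketch of the same friend-suggestion query done in plain PostgreSQL, assuming a follows(follower_id, followed_id) table that mirrors the Neo4j model and psycopg2-style parameters. It works, but every extra hop adds another self-join you now have to index and tune yourself:

# Hypothetical comparison: the Neo4j friend-suggestion query expressed in SQL
FRIEND_SUGGESTIONS_SQL = """
SELECT f2.followed_id AS suggestion, COUNT(*) AS strength
FROM follows f1
JOIN follows f2 ON f1.followed_id = f2.follower_id   -- two hops out
LEFT JOIN follows already
       ON already.follower_id = f1.follower_id
      AND already.followed_id = f2.followed_id
WHERE f1.follower_id = %(user_id)s
  AND f2.followed_id <> f1.follower_id                -- not yourself
  AND already.follower_id IS NULL                     -- not already followed
GROUP BY f2.followed_id
ORDER BY strength DESC
LIMIT 10;
"""

def get_friend_suggestions_sql(conn, user_id):
    """Same result as the Neo4j version, at the cost of a join per hop."""
    with conn.cursor() as cur:
        cur.execute(FRIEND_SUGGESTIONS_SQL, {"user_id": user_id})
        return cur.fetchall()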
Here's the framework I use:
Step 1: Understand Your Data
- Structured or flexible?
- How interconnected are the entities?
- What are the read and write patterns?

Step 2: Define Consistency Requirements
- Can you tolerate eventual consistency?
- Which operations genuinely need ACID transactions?

Step 3: Consider Scale
- How much data?
- How fast will it grow, and will you need to scale horizontally?

Step 4: Evaluate Your Team
- What's your team familiar with?
- Can you operate it well: backups, monitoring, recovery?

Step 5: Start Simple, Scale Pragmatically
- Don't optimize prematurely
- Add specialized stores only when a clear bottleneck appears
Choosing based on hype: Netflix's needs aren't your startup's needs.
Premature optimization: PostgreSQL can handle millions of users if designed well.
Ignoring operations: A database is only as good as your ability to back it up and recover from failures.
Not planning for growth: Design your application with clear service boundaries so you can migrate later if needed.
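One illustrative way to draw that boundary in code (this is a sketch, not the original system): hide the storage choice behind a small interface, so swapping PostgreSQL for Redis or MongoDB later touches one class instead of the whole codebase. The FeedStore protocol and table names below are hypothetical.

from typing import Protocol

class FeedStore(Protocol):
    """Storage-agnostic interface the rest of the app depends on."""
    def get_feed(self, user_id: int, limit: int = 20) -> list[dict]: ...
    def save_post(self, user_id: int, post: dict) -> None: ...

class PostgresFeedStore:
    def __init__(self, conn):
        self.conn = conn

    def get_feed(self, user_id: int, limit: int = 20) -> list[dict]:
        with self.conn.cursor() as cur:
            cur.execute(
                "SELECT content, created_at FROM posts "
                "WHERE author_id IN (SELECT followed_id FROM follows WHERE follower_id = %s) "
                "ORDER BY created_at DESC LIMIT %s",
                (user_id, limit),
            )
            return [{"content": c, "created_at": t} for c, t in cur.fetchall()]

    def save_post(self, user_id: int, post: dict) -> None:
        with self.conn.cursor() as cur:
            cur.execute(
                "INSERT INTO posts (author_id, content) VALUES (%s, %s)",
                (user_id, post["content"]),
            )

# Later, a MongoFeedStore or RedisFeedStore can implement the same protocol
# and be swapped in behind the boundary without touching the callers.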
1. No perfect database exists - every choice involves trade-offs
2. Polyglot persistence is normal - use multiple database types
3. Understand ACID and CAP - these have real implications
4. Start simple, scale pragmatically - don't optimize prematurely
5. Operations matter - consider backups, monitoring, recovery
Tomorrow we're diving into Caching Strategies (Day 5). You'll learn how to serve millions of requests without constantly hitting your database, understand cache invalidation patterns, and design multi-layer caching strategies.
Understanding databases is crucial, but understanding how to avoid hitting them unnecessarily is just as important.
---
Practice Exercise: Design database architecture for a ride-sharing app:
- Driver and rider profiles
- Real-time driver locations
- Trip history and receipts
- Payments and driver payouts
- Ratings and reviews
For each data type, choose a database and justify why. What consistency level does it need? What's the read/write pattern? Share your design in the comments!