DynamoDB Data Modeling: Blog Platform Example (case study)
By Oleksandr Andrushchenko — Published on — Modified on
This case study describes a production blog platform backed entirely by Amazon DynamoDB. The system serves typical blog workloads such as “latest posts”, “most popular posts”, “posts by author”, and “posts by tag”, while also supporting administrative queries like user moderation and contact messages.
Requirements
Functional Requirements
- Serve latest published posts in chronological order for homepage and feeds
- Serve most popular posts sorted by engagement metrics (views, likes, ratings)
- Retrieve individual posts using SEO-friendly slugs
- List posts by a specific author, filtered by publication status
- Enable tag-based discovery and autocomplete for content creation
- Support user authentication and profile retrieval by email or username
- Allow moderators to query users by status and engagement rating
- Store and retrieve contact messages efficiently for administrative purposes
Non-Functional Requirements
- Provide low-latency responses for all read queries (under 100 ms)
- Scale seamlessly to handle sudden traffic spikes, including viral posts
- Support on-demand billing and avoid capacity planning complexity
- Maintain strong consistency for critical authentication queries
- Ensure eventual consistency is acceptable for ranking and feed updates
- Minimize operational overhead and schema complexity while supporting high throughput
- Enable horizontal scalability without table redesign
- Allow indexing of multiple access patterns without duplicating data unnecessarily
Single-Table Design Overview
DynamoDB single-table design optimizes for access patterns rather than normalization. Multiple entity types (users, posts, tags, messages) are stored in one table and distinguished by composite keys (pk and sk) using prefixes like USER# or POST#. This enables fast, predictable queries without joins or scans, since related data is grouped under the same partition key.
Relationships are modeled through key design instead of foreign keys, allowing all related items (e.g., post metadata, tags, counters) to be retrieved in a single query. Carefully chosen key conventions ensure different access patterns coexist efficiently while keeping data logically organized.
| Entity | PK / SK Pattern | Attributes & GSIs | Explanation |
|---|---|---|---|
| User | pk = USER#, sk = META | user_email_pk (USERS_BY_EMAIL), username (USERS_BY_USERNAME), user_status_pk + rating_sk (USERS_BY_STATUS_RATING), created_at_sk (USERS_BY_STATUS_CREATED_AT) | Stores user metadata, authentication info, profile, and counters. Supports lookups by email or username, ranking by popularity, and latest active users. |
| Provider User | pk = PROVIDER_USER##, sk = META | user_id | Maps external OAuth or identity provider users to internal Users. Used during authentication and provider account linking. |
| Post | pk = POST#, sk = META | post_status_pk + created_at_sk (POSTS_BY_STATUS_CREATED_AT), post_status_pk + rating_sk (POSTS_BY_STATUS_RATING), post_user_status_pk + created_at_sk (POSTS_BY_USER_STATUS_CREATED_AT), post_slug (POSTS_BY_SLUG) | Stores post content, metadata, status, and ranking. Supports fetching by slug, latest posts, popular posts, and posts by author. |
| Post Slug | pk = POST_SLUG#, sk = META | post_id | Maps SEO-friendly slugs to post IDs for single-post lookups. |
| Post Tag | pk = POST_TAG#, sk = META | tag_type_pk + rating_sk (TAGS_BY_TYPE_RATING), tag_type_pk + tag_name_sk (TAGS_BY_TYPE_NAME) | Stores individual tags with popularity ranking. Supports autocomplete and trending tag queries. |
| Post Tag Combo | pk = POST_TAG_COMBO##..., sk = POST# | N/A | Tracks combinations of tags per post for multi-tag filtering and faceted searches. |
| User Impression | pk = USER#, sk = REL#USER# | N/A | Tracks interactions between users, e.g., whether one user has viewed another user’s profile or content. |
| Post Impression | pk = POST#, sk = IMP#USER# | N/A | Tracks whether a specific user has seen a post; supports engagement metrics and analytics. |
| Contact Message | pk = CONTACT_MESSAGE#, sk = timestamp | N/A | Stores messages submitted through contact forms for administrative review. |
Key Observations:
- Single-table design encodes all entities via PK/SK prefixes, allowing multiple logical entities to coexist in one table.
- GSIs provide alternate query paths for access patterns like email/username lookups, trending posts, tag autocomplete, and author-specific feeds.
- Transactional writes ensure consistency between posts, users, tags, and counters during updates like publishing a post or updating post status.
- Impression entities track user-post and user-user interactions without creating separate relational tables.
Global Secondary Indexes as Query APIs
Each Global Secondary Index (GSI) represents a first-class query API. Instead of adding indexes later, the schema defines GSIs upfront based on concrete access patterns. For example, the POSTS_BY_STATUS_CREATED_AT index directly supports the “latest posts” page.
GSIs reuse existing attributes but reinterpret them as alternative keys. This enables the same item to appear in multiple sorted views without duplication. A published post appears simultaneously in “latest posts”, “popular posts”, and “posts by author” feeds.
| GSI Name | PK / RK | Entity | Purpose / Access Pattern |
|---|---|---|---|
| USERS_BY_EMAIL | PK: user_email_pk | User | Lookup users by email for authentication and login flows. |
| USERS_BY_USERNAME | PK: username | User | Fetch user profiles via username (SEO-friendly) for public pages or profile views. |
| USERS_BY_STATUS_RATING | PK: user_status_pk RK: rating_sk |
User | Retrieve popular or high-ranking users filtered by status; supports leaderboards and moderation dashboards. |
| USERS_BY_STATUS_CREATED_AT | PK: user_status_pk RK: created_at_sk |
User | List latest active or registered users; used in discovery, admin views, or onboarding analytics. |
| POSTS_BY_STATUS_CREATED_AT | PK: post_status_pk RK: created_at_sk |
Post | Fetch latest posts by status (e.g., published) for homepage or feeds sorted by creation time. |
| POSTS_BY_STATUS_RATING | PK: post_status_pk RK: rating_sk |
Post | Retrieve popular posts by engagement metrics, filtered by status; used for trending or top posts. |
| POSTS_BY_USER_STATUS_CREATED_AT | PK: post_user_status_pk RK: created_at_sk |
Post | List posts of a specific author, filtered by status, sorted by creation date; used for author feeds or dashboards. |
| POSTS_BY_SLUG | PK: post_slug | Post / Post Slug | Retrieve a post by its SEO-friendly slug for direct post pages. |
| TAGS_BY_TYPE_RATING | PK: tag_type_pk RK: rating_sk |
Post Tag | Fetch popular tags sorted by engagement metrics; used for trending tags on the UI. |
| TAGS_BY_TYPE_NAME | PK: tag_type_pk RK: tag_name_sk |
Post Tag | Support autocomplete or search of tags by prefix; useful for content creation and filtering. |
Implementation examples
New post creation:
import uuid, boto3, time
table = boto3.resource("dynamodb").Table("blog-dynamodb-table")
now = int(time.time() * 1000)
post_id = str(uuid.uuid4())
slug = "my-first-post"
table.put_item(Item={
"pk": f"POST#{post_id}",
"sk": "META",
"post_slug": slug,
"user_id": "USER#123",
"title": "My First Post",
"content": "Hello world!",
"created_at_sk": now,
"post_status_pk": "POST#STATUS#UNPUBLISHED",
"post_user_status_pk": "POST#USER#123#STATUS#UNPUBLISHED"
})
table.put_item(Item={"pk": f"POST_SLUG#{slug}", "sk": "META", "post_id": post_id})
Latest published posts querying:
import boto3
table = boto3.resource("dynamodb").Table("blog-dynamodb-table")
response = table.query(
IndexName="POSTS_BY_STATUS_CREATED_AT",
KeyConditionExpression="post_status_pk = :status",
ExpressionAttributeValues={
":status": "POST#STATUS#published"
},
ScanIndexForward=False, # descending order (latest first)
Limit=10
)
latest_posts = response["Items"]
Cloud Formation resource:
DynamoDBTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: !Sub "${AWS::StackName}-${Stage}"
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: pk
AttributeType: S
- AttributeName: sk
AttributeType: S
- AttributeName: user_email_pk
AttributeType: S
- AttributeName: username
AttributeType: S
- AttributeName: post_status_pk
AttributeType: S
- AttributeName: post_slug
AttributeType: S
- AttributeName: tag_type_pk
AttributeType: S
- AttributeName: rating_sk
AttributeType: N
- AttributeName: tag_name_sk
AttributeType: S
- AttributeName: user_status_pk
AttributeType: S
- AttributeName: post_user_status_pk
AttributeType: S
- AttributeName: created_at
AttributeType: N
KeySchema:
- AttributeName: pk # e.g. "PROVIDER_USER#cognito#1", "USER#1", "POST#1", "POST_SLUG#slug1", "POST_TAG#tag1", "POST_TAG_COMBO#tag1#tag2", "CONTACT_MESSAGE#1"
KeyType: HASH
- AttributeName: sk # e.g. "META", "1759086095241", "POST#1"
KeyType: RANGE
GlobalSecondaryIndexes:
- IndexName: USERS_BY_EMAIL # get_user_by_user_token (get user by email)
KeySchema:
- AttributeName: user_email_pk # e.g. "me@example.com"
KeyType: HASH
Projection:
ProjectionType: ALL
- IndexName: USERS_BY_USERNAME # get_user_by_slug (get user by username)
KeySchema:
- AttributeName: username # e.g. "user-123"
KeyType: HASH
Projection:
ProjectionType: ALL
- IndexName: USERS_BY_STATUS_RATING # get_popular_active_users
KeySchema:
- AttributeName: user_status_pk # e.g. "USER#active"
KeyType: HASH
- AttributeName: rating_sk # e.g. 10000001759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: POSTS_BY_STATUS_CREATED_AT_2 # get_latest_posts
KeySchema:
- AttributeName: post_status_pk # e.g. "POST#published"
KeyType: HASH
- AttributeName: created_at # e.g. 1759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: POSTS_BY_SLUG # get_post_by_slugs
KeySchema:
- AttributeName: post_slug # e.g. "post-title-about-something"
KeyType: HASH
Projection:
ProjectionType: ALL
- IndexName: TAGS_BY_TYPE_RATING # get_popular_post_tags
KeySchema:
- AttributeName: tag_type_pk # e.g. "POST_TAG"
KeyType: HASH
- AttributeName: rating_sk # e.g. 10000001759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: POSTS_BY_STATUS_RATING # get_popular_posts
KeySchema:
- AttributeName: post_status_pk # e.g. "POST#published"
KeyType: HASH
- AttributeName: rating_sk # e.g. 10000001759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: TAGS_BY_TYPE_NAME # get_post_tags_by_prefix
KeySchema:
- AttributeName: tag_type_pk # e.g. "POST_TAG"
KeyType: HASH
- AttributeName: tag_name_sk # e.g. "tag1"
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: USERS_BY_STATUS_CREATED_AT_2 # get_latest_active_users
KeySchema:
- AttributeName: user_status_pk # e.g. "USER#active"
KeyType: HASH
- AttributeName: created_at # e.g. 1759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
- IndexName: POSTS_BY_USER_STATUS_CREATED_AT_2 # get_latest_posts_by_user
KeySchema:
- AttributeName: post_user_status_pk # e.g. "POST#1#published"
KeyType: HASH
- AttributeName: created_at # e.g. 1759086095241
KeyType: RANGE
Projection:
ProjectionType: ALL
Full blog implementation can be found on my GitHub.
Operational Trade-offs
This design trades schema simplicity for predictable performance. Adding a new access pattern often requires a new GSI, which must be planned carefully due to DynamoDB limits. A real-world example is adding a “posts by category” view, which would require another indexed attribute.
- Pros: predictable latency, horizontal scalability, no joins
- Cons: upfront modeling effort, limited ad-hoc queries
- Operational focus on access pattern discipline
Key Takeaways
This case study demonstrates how a blog platform can scale using a single DynamoDB table with carefully designed GSIs. The schema directly encodes business queries, eliminating the need for scans or complex joins. A realistic outcome is serving millions of reads per day with consistent single-digit millisecond latency.
The approach emphasizes planning, explicit access patterns, and acceptance of denormalization. When applied correctly, it enables high-throughput content platforms with minimal operational overhead.
Comments (0)