DynamoDB Data Modeling: Blog Platform Example (case study)

By Oleksandr Andrushchenko — Published on — Modified on

Blog on DynamoDB
Blog on DynamoDB

This case study describes a production blog platform backed entirely by Amazon DynamoDB. The system serves typical blog workloads such as “latest posts”, “most popular posts”, “posts by author”, and “posts by tag”, while also supporting administrative queries like user moderation and contact messages.

Requirements

Functional Requirements

  • Serve latest published posts in chronological order for homepage and feeds
  • Serve most popular posts sorted by engagement metrics (views, likes, ratings)
  • Retrieve individual posts using SEO-friendly slugs
  • List posts by a specific author, filtered by publication status
  • Enable tag-based discovery and autocomplete for content creation
  • Support user authentication and profile retrieval by email or username
  • Allow moderators to query users by status and engagement rating
  • Store and retrieve contact messages efficiently for administrative purposes

Non-Functional Requirements

  • Provide low-latency responses for all read queries (under 100 ms)
  • Scale seamlessly to handle sudden traffic spikes, including viral posts
  • Support on-demand billing and avoid capacity planning complexity
  • Maintain strong consistency for critical authentication queries
  • Ensure eventual consistency is acceptable for ranking and feed updates
  • Minimize operational overhead and schema complexity while supporting high throughput
  • Enable horizontal scalability without table redesign
  • Allow indexing of multiple access patterns without duplicating data unnecessarily

Single-Table Design Overview

DynamoDB single-table design optimizes for access patterns rather than normalization. Multiple entity types (users, posts, tags, messages) are stored in one table and distinguished by composite keys (pk and sk) using prefixes like USER# or POST#. This enables fast, predictable queries without joins or scans, since related data is grouped under the same partition key.

Relationships are modeled through key design instead of foreign keys, allowing all related items (e.g., post metadata, tags, counters) to be retrieved in a single query. Carefully chosen key conventions ensure different access patterns coexist efficiently while keeping data logically organized.

Entity PK / SK Pattern Attributes & GSIs Explanation
User pk = USER#, sk = META user_email_pk (USERS_BY_EMAIL), username (USERS_BY_USERNAME), user_status_pk + rating_sk (USERS_BY_STATUS_RATING), created_at_sk (USERS_BY_STATUS_CREATED_AT) Stores user metadata, authentication info, profile, and counters. Supports lookups by email or username, ranking by popularity, and latest active users.
Provider User pk = PROVIDER_USER##, sk = META user_id Maps external OAuth or identity provider users to internal Users. Used during authentication and provider account linking.
Post pk = POST#, sk = META post_status_pk + created_at_sk (POSTS_BY_STATUS_CREATED_AT), post_status_pk + rating_sk (POSTS_BY_STATUS_RATING), post_user_status_pk + created_at_sk (POSTS_BY_USER_STATUS_CREATED_AT), post_slug (POSTS_BY_SLUG) Stores post content, metadata, status, and ranking. Supports fetching by slug, latest posts, popular posts, and posts by author.
Post Slug pk = POST_SLUG#, sk = META post_id Maps SEO-friendly slugs to post IDs for single-post lookups.
Post Tag pk = POST_TAG#, sk = META tag_type_pk + rating_sk (TAGS_BY_TYPE_RATING), tag_type_pk + tag_name_sk (TAGS_BY_TYPE_NAME) Stores individual tags with popularity ranking. Supports autocomplete and trending tag queries.
Post Tag Combo pk = POST_TAG_COMBO##..., sk = POST# N/A Tracks combinations of tags per post for multi-tag filtering and faceted searches.
User Impression pk = USER#, sk = REL#USER# N/A Tracks interactions between users, e.g., whether one user has viewed another user’s profile or content.
Post Impression pk = POST#, sk = IMP#USER# N/A Tracks whether a specific user has seen a post; supports engagement metrics and analytics.
Contact Message pk = CONTACT_MESSAGE#, sk = timestamp N/A Stores messages submitted through contact forms for administrative review.

Key Observations:

  • Single-table design encodes all entities via PK/SK prefixes, allowing multiple logical entities to coexist in one table.
  • GSIs provide alternate query paths for access patterns like email/username lookups, trending posts, tag autocomplete, and author-specific feeds.
  • Transactional writes ensure consistency between posts, users, tags, and counters during updates like publishing a post or updating post status.
  • Impression entities track user-post and user-user interactions without creating separate relational tables.

Global Secondary Indexes as Query APIs

Each Global Secondary Index (GSI) represents a first-class query API. Instead of adding indexes later, the schema defines GSIs upfront based on concrete access patterns. For example, the POSTS_BY_STATUS_CREATED_AT index directly supports the “latest posts” page.

Blog platform DynamoDB Schema
Blog platform DynamoDB Schema

GSIs reuse existing attributes but reinterpret them as alternative keys. This enables the same item to appear in multiple sorted views without duplication. A published post appears simultaneously in “latest posts”, “popular posts”, and “posts by author” feeds.

GSI Name PK / RK Entity Purpose / Access Pattern
USERS_BY_EMAIL PK: user_email_pk User Lookup users by email for authentication and login flows.
USERS_BY_USERNAME PK: username User Fetch user profiles via username (SEO-friendly) for public pages or profile views.
USERS_BY_STATUS_RATING PK: user_status_pk
RK: rating_sk
User Retrieve popular or high-ranking users filtered by status; supports leaderboards and moderation dashboards.
USERS_BY_STATUS_CREATED_AT PK: user_status_pk
RK: created_at_sk
User List latest active or registered users; used in discovery, admin views, or onboarding analytics.
POSTS_BY_STATUS_CREATED_AT PK: post_status_pk
RK: created_at_sk
Post Fetch latest posts by status (e.g., published) for homepage or feeds sorted by creation time.
POSTS_BY_STATUS_RATING PK: post_status_pk
RK: rating_sk
Post Retrieve popular posts by engagement metrics, filtered by status; used for trending or top posts.
POSTS_BY_USER_STATUS_CREATED_AT PK: post_user_status_pk
RK: created_at_sk
Post List posts of a specific author, filtered by status, sorted by creation date; used for author feeds or dashboards.
POSTS_BY_SLUG PK: post_slug Post / Post Slug Retrieve a post by its SEO-friendly slug for direct post pages.
TAGS_BY_TYPE_RATING PK: tag_type_pk
RK: rating_sk
Post Tag Fetch popular tags sorted by engagement metrics; used for trending tags on the UI.
TAGS_BY_TYPE_NAME PK: tag_type_pk
RK: tag_name_sk
Post Tag Support autocomplete or search of tags by prefix; useful for content creation and filtering.

Implementation examples

New post creation:

import uuid, boto3, time

table = boto3.resource("dynamodb").Table("blog-dynamodb-table")
now = int(time.time() * 1000)
post_id = str(uuid.uuid4())
slug = "my-first-post"

table.put_item(Item={
    "pk": f"POST#{post_id}",
    "sk": "META",
    "post_slug": slug,
    "user_id": "USER#123",
    "title": "My First Post",
    "content": "Hello world!",
    "created_at_sk": now,
    "post_status_pk": "POST#STATUS#UNPUBLISHED",
    "post_user_status_pk": "POST#USER#123#STATUS#UNPUBLISHED"
})

table.put_item(Item={"pk": f"POST_SLUG#{slug}", "sk": "META", "post_id": post_id})

Latest published posts querying:

import boto3

table = boto3.resource("dynamodb").Table("blog-dynamodb-table")

response = table.query(
    IndexName="POSTS_BY_STATUS_CREATED_AT",
    KeyConditionExpression="post_status_pk = :status",
    ExpressionAttributeValues={
        ":status": "POST#STATUS#published"
    },
    ScanIndexForward=False,  # descending order (latest first)
    Limit=10
)

latest_posts = response["Items"]

Cloud Formation resource:

DynamoDBTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: !Sub "${AWS::StackName}-${Stage}"
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: pk
        AttributeType: S
      - AttributeName: sk
        AttributeType: S
      - AttributeName: user_email_pk
        AttributeType: S
      - AttributeName: username
        AttributeType: S
      - AttributeName: post_status_pk
        AttributeType: S
      - AttributeName: post_slug
        AttributeType: S
      - AttributeName: tag_type_pk
        AttributeType: S
      - AttributeName: rating_sk
        AttributeType: N
      - AttributeName: tag_name_sk
        AttributeType: S
      - AttributeName: user_status_pk
        AttributeType: S
      - AttributeName: post_user_status_pk
        AttributeType: S
      - AttributeName: created_at
        AttributeType: N
    KeySchema:
      - AttributeName: pk # e.g. "PROVIDER_USER#cognito#1", "USER#1", "POST#1", "POST_SLUG#slug1", "POST_TAG#tag1", "POST_TAG_COMBO#tag1#tag2", "CONTACT_MESSAGE#1"
        KeyType: HASH
      - AttributeName: sk # e.g. "META", "1759086095241", "POST#1"
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: USERS_BY_EMAIL # get_user_by_user_token (get user by email)
        KeySchema:
          - AttributeName: user_email_pk # e.g. "me@example.com"
            KeyType: HASH
        Projection:
          ProjectionType: ALL
      - IndexName: USERS_BY_USERNAME # get_user_by_slug (get user by username)
        KeySchema:
          - AttributeName: username # e.g. "user-123"
            KeyType: HASH
        Projection:
          ProjectionType: ALL
      - IndexName: USERS_BY_STATUS_RATING # get_popular_active_users
        KeySchema:
          - AttributeName: user_status_pk # e.g. "USER#active"
            KeyType: HASH
          - AttributeName: rating_sk # e.g. 10000001759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: POSTS_BY_STATUS_CREATED_AT_2 # get_latest_posts
        KeySchema:
          - AttributeName: post_status_pk # e.g. "POST#published"
            KeyType: HASH
          - AttributeName: created_at # e.g. 1759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: POSTS_BY_SLUG # get_post_by_slugs
        KeySchema:
          - AttributeName: post_slug # e.g. "post-title-about-something"
            KeyType: HASH
        Projection:
          ProjectionType: ALL
      - IndexName: TAGS_BY_TYPE_RATING # get_popular_post_tags
        KeySchema:
          - AttributeName: tag_type_pk # e.g. "POST_TAG"
            KeyType: HASH
          - AttributeName: rating_sk # e.g. 10000001759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: POSTS_BY_STATUS_RATING # get_popular_posts
        KeySchema:
          - AttributeName: post_status_pk # e.g. "POST#published"
            KeyType: HASH
          - AttributeName: rating_sk # e.g. 10000001759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: TAGS_BY_TYPE_NAME # get_post_tags_by_prefix
        KeySchema:
          - AttributeName: tag_type_pk # e.g. "POST_TAG"
            KeyType: HASH
          - AttributeName: tag_name_sk # e.g. "tag1"
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: USERS_BY_STATUS_CREATED_AT_2 # get_latest_active_users
        KeySchema:
          - AttributeName: user_status_pk # e.g. "USER#active"
            KeyType: HASH
          - AttributeName: created_at # e.g. 1759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
      - IndexName: POSTS_BY_USER_STATUS_CREATED_AT_2 # get_latest_posts_by_user
        KeySchema:
          - AttributeName: post_user_status_pk # e.g. "POST#1#published"
            KeyType: HASH
          - AttributeName: created_at # e.g. 1759086095241
            KeyType: RANGE
        Projection:
          ProjectionType: ALL

Full blog implementation can be found on my GitHub.

Operational Trade-offs

This design trades schema simplicity for predictable performance. Adding a new access pattern often requires a new GSI, which must be planned carefully due to DynamoDB limits. A real-world example is adding a “posts by category” view, which would require another indexed attribute.

  • Pros: predictable latency, horizontal scalability, no joins
  • Cons: upfront modeling effort, limited ad-hoc queries
  • Operational focus on access pattern discipline

Key Takeaways

This case study demonstrates how a blog platform can scale using a single DynamoDB table with carefully designed GSIs. The schema directly encodes business queries, eliminating the need for scans or complex joins. A realistic outcome is serving millions of reads per day with consistent single-digit millisecond latency.

The approach emphasizes planning, explicit access patterns, and acceptance of denormalization. When applied correctly, it enables high-throughput content platforms with minimal operational overhead.

Comments (0)