
Overview

This post is about a system I made so people can share comments on my “blog”/personal site. At its core it’s just a way to post and view comments, with a few basic features on top. Everything was built so that I can sustain it on my own for a long time.

What I mean by small

Small doesn’t mean “no security” or “no moderation.” Small means keeping everything within my ability to maintain it.

Here are the development constraints I set for myself:

  1. One thread per blog post, created from my blog’s RSS feed.
  2. Replies are regular comments with a parentCommentId and a depth.
  3. Markdown is allowed, raw HTML is not.
  4. Mutations require Origin allowlist + CSRF token + auth.

If you’re building something similar, this is the difference between “I can maintain this” and “I can’t touch this without opening ten tabs.”

Architecture: flows before tables

I’m going to describe the system the way it behaves, because the database schema makes more sense once you can picture the request paths.

1) Viewer loads a post

  • The blog frontend asks the comments service to “resolve” the post slug into a threadId.
  • Then it lists comments for that thread.
  • If the viewer is signed in, the list includes whether they personally liked each comment.
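
As a sketch, that read path looks roughly like this. The endpoint paths, the siteKey value, and the response fields are illustrative assumptions; only the resolve-then-list flow comes from the description above.

// Sketch of the read path. URL shapes and field names are assumptions;
// the flow (resolve the slug, then list the thread) is the real one.
async function loadComments(slug: string) {
  const base = 'https://comments.example.com'; // hypothetical service origin

  // 1. Resolve the post slug into a threadId.
  const { threadId } = await fetch(
    `${base}/v1/threads/resolve?siteKey=blog&resourceType=post&resourceId=${encodeURIComponent(slug)}`
  ).then((r) => r.json());

  // 2. List comments; include credentials so a signed-in viewer
  //    gets their own like state back with each comment.
  return fetch(`${base}/v1/threads/${threadId}/comments`, {
    credentials: 'include',
  }).then((r) => r.json());
}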

2) Viewer signs in

  • The frontend sends you to /auth/github/start.
  • That sets up OAuth PKCE state in the database.
  • GitHub redirects back to /auth/github/callback.
  • The service upserts a User row, creates a Session row, and sets a session cookie.

3) Viewer writes anything

  1. Check Origin allowlist (and require Origin in production).
  2. Check CSRF token (cookie + header).
  3. Check auth session.
  4. Apply rate limits.
  5. Do the write.
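
Stitched into a route handler, that ordering looks roughly like the sketch below. The helper names (getSessionUser, rateLimit, createComment) are assumptions; mutationAllowed is shown later in this post.

import { NextRequest, NextResponse } from 'next/server';

// Sketch of the write path ordering; helpers are assumed names.
export async function POST(request: NextRequest) {
  const gate = await mutationAllowed(request); // 1 + 2: Origin allowlist + CSRF
  if (!gate.ok) return NextResponse.json({ error: gate.code }, { status: 403 });

  const user = await getSessionUser(request); // 3: auth session
  if (!user) return NextResponse.json({ error: 'UNAUTHORIZED' }, { status: 401 });

  if (!rateLimit(user.id)) { // 4: rate limits
    return NextResponse.json({ error: 'RATE_LIMITED' }, { status: 429 });
  }

  const comment = await createComment(user, await request.json()); // 5: the write
  return NextResponse.json(comment, { status: 201 });
}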

Data model

Tables:

  • User — GitHub identity + moderation flags (isAdmin, isBanned, reasons)
  • Session — server-side session with expiresAt, revokedAt, lastUsedAt
  • Thread — unique (siteKey, resourceType, resourceId) so one post maps cleanly
  • Comment — parent pointer, depth, markdown body, HTML body, edit/delete timestamps
  • CommentReaction — unique (commentId, userId, reaction) (reaction currently only like)
  • OAuthState — PKCE state (short lived), stores codeVerifier + returnTo
  • PrebannedUser — “don’t ever let this login/id in” list
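
To make the shapes concrete, here are the two central tables written out as TypeScript types. This is illustration, not the real schema; any field not mentioned above is an assumption.

// Illustrative shapes only; fields not named in the list above are guesses.
type Thread = {
  id: string;
  siteKey: string;
  resourceType: 'post'; // currently the only type
  resourceId: string;   // the slug; (siteKey, resourceType, resourceId) is unique
};

type Comment = {
  id: string;
  threadId: string;
  parentCommentId: string | null; // replies are regular comments
  depth: number;
  bodyMarkdown: string;
  bodyHtml: string;       // pre-rendered, sanitized HTML
  editedAt: Date | null;
  deletedAt: Date | null; // soft delete, see next section
  deletedBy: string | null;
};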

Soft delete fields

Instead of instantly hard-deleting, a comment has:

  • deletedAt
  • deletedBy

I then have a cron job that permanently deletes soft-deleted comments after 72 hours.
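
A minimal sketch of that purge job, assuming a Prisma-style client called db (the post doesn’t name the ORM, so that part is a guess):

// Sketch of the purge job; `db` is an assumed Prisma-style client.
export async function purgeDeletedComments() {
  const cutoff = new Date(Date.now() - 72 * 60 * 60 * 1000);
  await db.comment.deleteMany({
    // `lt` never matches NULL, so live comments are untouched.
    where: { deletedAt: { lt: cutoff } },
  });
}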

Thread resolution (RSS-gated, on purpose)

The resolve endpoint takes a tuple:

  • siteKey (basically “which site are we talking about?”)
  • resourceType (currently only post)
  • resourceId (the slug)

and returns a threadId. The upsert is normal. The guardrail is not.

The guardrail: only allow real posts

Instead of letting any resourceId create a thread, the service fetches the site’s RSS feed, extracts the slugs of real posts, and only permits those. That prevents:

  • someone creating threads for random slugs forever
  • your database becoming a graveyard of junk rows
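
A sketch of the slug extraction, assuming the feed is ordinary RSS with absolute URLs in <link> elements. The regex is just for illustration; a real implementation should use an XML parser.

// Sketch only: parse the feed, keep the last path segment of each item URL.
export async function allowedSlugs(feedUrl: string): Promise<Set<string>> {
  const xml = await fetch(feedUrl).then((r) => r.text());
  const slugs = new Set<string>();
  for (const [, link] of xml.matchAll(/<link>([^<]+)<\/link>/g)) {
    try {
      const slug = new URL(link).pathname.split('/').filter(Boolean).pop();
      if (slug) slugs.add(slug);
    } catch {
      // Skip malformed links rather than failing the whole resolve.
    }
  }
  return slugs;
}

Resolve then refuses to upsert a thread unless the incoming resourceId is in that set.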

Why PKCE

PKCE lets you avoid storing a client secret in the browser while still getting a strong proof that the callback belongs to the flow you started. So it looks like:

  • /auth/github/start generates:
    • state (random)
    • codeVerifier
    • codeChallenge
  • it stores { state, codeVerifier, returnTo, expiresAt } in OAuthState
  • it redirects to GitHub with { state, codeChallenge }
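
The generation step is small. Here’s a sketch using Node’s crypto; the function name is mine:

import { createHash, randomBytes } from 'node:crypto';

// Sketch of the PKCE setup in /auth/github/start.
export function createPkcePair() {
  const state = randomBytes(32).toString('base64url');
  const codeVerifier = randomBytes(32).toString('base64url');
  // S256: the challenge is the SHA-256 of the verifier, base64url-encoded.
  const codeChallenge = createHash('sha256').update(codeVerifier).digest('base64url');
  return { state, codeVerifier, codeChallenge };
}

The verifier stays server-side in OAuthState; only state and codeChallenge ever go to GitHub.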

The returnTo problem

Any OAuth flow has a “where do we go after login?” question. So the returnTo URL is validated against an allowlist of known origins:

  • the blog origins
  • plus the service origin itself
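
The check itself is a plain origin comparison. A sketch, with placeholder allowlist entries:

// Sketch; the actual allowlist entries are placeholders.
const ALLOWED_RETURN_ORIGINS = new Set([
  'https://blog.example.com',     // the blog
  'https://comments.example.com', // the service itself
]);

export function safeReturnTo(returnTo: string | null, fallback: string): string {
  if (!returnTo) return fallback;
  try {
    const url = new URL(returnTo);
    return ALLOWED_RETURN_ORIGINS.has(url.origin) ? url.toString() : fallback;
  } catch {
    return fallback; // relative or malformed: refuse rather than guess
  }
}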

1) Sessions

Sessions here are not JWTs. They’re server-side rows in Postgres. The cookie is basically: lh_comments_session=<uuid>

Each authenticated request does the following (with a lastUsedAt update in the background):

  1. read cookie
  2. find session row
  3. verify not revoked
  4. verify not expired
  5. return user
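
As code, roughly (again assuming a Prisma-style db client):

import type { NextRequest } from 'next/server';

// Sketch of the session lookup; `db` is an assumed Prisma-style client.
export async function getSessionUser(request: NextRequest) {
  const id = request.cookies.get('lh_comments_session')?.value; // 1. read cookie
  if (!id) return null;

  const session = await db.session.findUnique({
    where: { id },
    include: { user: true }, // 2. find session row
  });
  if (!session) return null;
  if (session.revokedAt) return null;              // 3. not revoked
  if (session.expiresAt < new Date()) return null; // 4. not expired

  // Background lastUsedAt update; deliberately not awaited.
  void db.session.update({ where: { id }, data: { lastUsedAt: new Date() } });

  return session.user; // 5. return user
}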

Why server-side sessions?

  • easy revocation (log out, revoke)
  • easy ban enforcement (don’t trust old tokens)
  • less complexity around token invalidation

JWTs are fine. I just didn’t want to carry that complexity for a blog comments service.

2) Mutation gating

Even with CORS configured, the service still enforces Origin checks on writes. In production-like mode, it requires an Origin header. If something non-browser hits your endpoints, you still want the server to be the authority.

import type { NextRequest } from 'next/server';

// Pseudocode shaped like the real route guard
export async function mutationAllowed(request: NextRequest) {
  const origin = request.headers.get('origin');
  if (env.NODE_ENV === 'production') {
    // Production: an Origin header is mandatory and must be allowlisted.
    if (!origin) return { ok: false, code: 'MUTATION_ORIGIN_REQUIRED' };
    if (!isAllowedOrigin(origin)) return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' };
  } else {
    // Dev: Origin is optional, but if present it still has to match.
    if (origin && !isAllowedOrigin(origin)) {
      return { ok: false, code: 'MUTATION_ORIGIN_NOT_ALLOWED' };
    }
  }
  // CSRF check happens here too (next section)
  return { ok: true };
}

3) CSRF: the mechanism

This service uses a CSRF cookie plus a request header:

  • cookie: csrf_token=<random>
  • header: X-CSRF-Token: <same random>

The server checks:

  1. cookie token exists
  2. header token exists
  3. same length (fails early)
  4. constant-time equality (prevents timing leaks)

If any of that fails, the mutation is blocked.

import crypto from 'node:crypto';
import { cookies } from 'next/headers';
import type { NextRequest } from 'next/server';

const CSRF_COOKIE = 'csrf_token';

export async function verifyCsrf(request: NextRequest) {
  const cookieToken = (await cookies()).get(CSRF_COOKIE)?.value;
  const headerToken = request.headers.get('x-csrf-token');
  if (!cookieToken || !headerToken) return false;
  // Length check first: timingSafeEqual throws on unequal-length buffers.
  if (cookieToken.length !== headerToken.length) return false;
  // Constant-time compare to avoid leaking info via timing
  return crypto.timingSafeEqual(
    Buffer.from(cookieToken),
    Buffer.from(headerToken)
  );
}

How the client gets the token

The /v1/me endpoint returns:

  • user (or null)
  • csrfToken

So the client pattern is:

  1. call /v1/me on load
  2. keep csrfToken in memory
  3. attach it to every write in X-CSRF-Token
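
In client code that pattern looks something like this. Only /v1/me comes from the text above; the other endpoint path and the field names are assumptions.

// Sketch of the client side; BASE and /v1/comments are hypothetical.
const BASE = 'https://comments.example.com';
let csrfToken: string | null = null;

export async function bootstrap() {
  // 1. Call /v1/me on load; credentials so the session cookie is sent.
  const me = await fetch(`${BASE}/v1/me`, { credentials: 'include' })
    .then((r) => r.json());
  csrfToken = me.csrfToken; // 2. keep it in memory
  return me.user; // user or null
}

export async function postComment(threadId: string, bodyMarkdown: string) {
  if (!csrfToken) await bootstrap();
  // 3. attach the token to every write
  return fetch(`${BASE}/v1/comments`, {
    method: 'POST',
    credentials: 'include',
    headers: { 'Content-Type': 'application/json', 'X-CSRF-Token': csrfToken! },
    body: JSON.stringify({ threadId, bodyMarkdown }),
  });
}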

Why this fails in real life

The failures are usually mundane:

  • Your frontend forgot to include the header on one request.
  • You’re testing in a new origin and forgot to add it to the allowlist.
  • Cookies aren’t being set because you’re mixing http and https.
  • You tried to mutate without calling /v1/me first.

GET /comments list latency (p50)

[Chart: GET /comments list latency, p50]

Most of the improvement here is “stop doing extra work.” Stuff like:

  • returning bodyHtml instead of re-rendering
  • grouping likes in one query
  • keeping response shapes consistent so clients don’t do follow-up calls
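
The like grouping is the classic N+1 fix: one grouped query for the whole page of comments instead of a count per comment. A sketch, using raw SQL through an assumed Prisma-style client; the table and column names are guesses:

// Sketch: one grouped query per page instead of a count per comment.
// `db.$queryRaw` and the table/column names are assumptions.
async function likeCountsFor(commentIds: string[]): Promise<Map<string, number>> {
  const rows = await db.$queryRaw<{ commentId: string; likes: number }[]>`
    SELECT "commentId", COUNT(*)::int AS likes
    FROM "CommentReaction"
    WHERE "reaction" = 'like' AND "commentId" = ANY(${commentIds})
    GROUP BY "commentId"
  `;
  return new Map(rows.map((r) => [r.commentId, r.likes]));
}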

GET /comments list latency (p95)

[Chart: GET /comments list latency, p95]

The p95 tells on you.

It’s usually:

  • cold starts
  • a slow DB connection setup
  • a “this one thread has a ton of comments and you forgot a limit” moment

POST /comment create latency (median-ish)

[Chart: POST /comment create latency]

Creating comments costs a little more because it includes markdown render + sanitize, plus checks. But honestly, I’m okay with that. If writes are a bit heavier, it usually means reads can stay cheap, and on a blog that trade is basically always correct.

Error rate (all endpoints)

[Chart: error rate across all endpoints]

The bump on Day 4 is the kind of thing I’d expect from either:

  • RSS fetch failing (thread resolve refuses to create threads)
  • an origin allowlist mismatch after a domain change
  • an accidentally missing credentials: 'include' on one client fetch

Problems I actually hit (mostly browser-shaped)

This is the part where I admit the “hard” problems weren’t SQL.

Cookies + multiple origins

You end up asking questions like:

  • Is the service on HTTPS?
  • Is the blog on HTTPS?
  • What does the browser think the site boundary is?
  • Are you sending credentials on fetch?

If something breaks, it often looks like “random 401s.” But it’s really “cookie didn’t get sent.”

CSRF token ordering

If the frontend tries to post before fetching /v1/me (and therefore before receiving the CSRF cookie/token), your write fails. Which is correct. But it feels like a bug until you remember you’re protecting yourself from cross-site writes.

Origin allowlist drift

This one is extremely normal:

  • you add a new domain or preview deploy
  • you forget to update the allowlist
  • suddenly writes fail

Rate limits in multi-instance environments

The current rate limiter uses an in-memory map.

That’s fine for a single instance.

If you ever run multiple instances, you want a shared store (Redis, etc.).
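
For reference, the single-instance version really is only a few lines. A sketch of a fixed-window limiter over a Map; the limit and window numbers are placeholders:

// Sketch of a fixed-window, in-memory limiter; numbers are placeholders.
const hits = new Map<string, { count: number; resetAt: number }>();

export function rateLimit(key: string, limit = 10, windowMs = 60_000): boolean {
  const now = Date.now();
  const entry = hits.get(key);
  if (!entry || entry.resetAt <= now) {
    hits.set(key, { count: 1, resetAt: now + windowMs }); // new window
    return true;
  }
  if (entry.count >= limit) return false; // over budget in this window
  entry.count += 1;
  return true;
}

The map lives in process memory, which is exactly why it stops being accurate the moment a second instance exists.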

Closing

Everything here was kept intentionally small:

  • OAuth for identity
  • server sessions for control
  • Origin allowlist for request boundaries
  • CSRF for cookie-based write protection
  • markdown + sanitize for UGC safety
  • moderation for reality

It’s basically me trying to build something I won’t hate maintaining. And if you’ve ever maintained a “simple” system that wasn’t actually simple, you know exactly what I mean.