Toby Allen

Migrating Blog Analytics Storage from Vercel Blob to Firestore

· Vercel, Firebase

Earlier this week I added server-side page view tracking to this blog using Next.js 16's proxy layer and Vercel Blob. The design stored each page view as a tiny object in Vercel Blob — one file per visit, with all the analytics data encoded into the pathname so the stats page could aggregate from list() calls alone, without fetching any blob content. It was clean and required no database.

It also ran through the Vercel Blob free tier surprisingly quickly.

Vercel Blob pricing is based on operations — reads, writes, and list calls — not on storage volume. The objects were one byte each, so storage was negligible. But every page view was a PUT operation, and every stats page load ran fourteen list calls. For a blog with any real traffic, that adds up fast. After roughly 1,500 page views the operation count was already pushing against free tier limits.

In this post I will detail how I migrated the analytics storage to Firebase Firestore using daily summary documents — and the Firestore behaviour that caused the first migration to silently store the wrong data entirely.

The steps we will cover in this post are as follows.

  1. Why Firestore daily summary documents are the right model
  2. Writing the summary document with FieldValue.increment()
  3. The set() dot-notation gotcha and how to avoid it
  4. Migrating the existing Blob data to Firestore
  5. Reading the stats with parallel document gets

Why Daily Summary Documents

The original Blob approach had a cost structure that scaled with total page views: one blob PUT per view, fourteen list calls per stats page load. Even with the pathname-as-data trick that avoided per-blob content fetches, the list operations themselves cost money.

The better model for a counter is a summary document. Instead of one record per event, you maintain one document per day and increment the relevant counters in-place on each page view. Firebase Firestore's FieldValue.increment() operation does exactly this server-side and atomically, so concurrent writes are not a problem.

The cost profile then looks like this:

  • Writes: still one per page view, same as before
  • Reads: one document per day of stats window — fourteen reads for fourteen days, regardless of traffic

For 1,500 page views over two days, the original Blob implementation used roughly 1,500 PUT operations and several hundred list calls. The Firestore equivalent uses one document write per page view (incrementing that day's counters) and one document read per day in the stats window on each stats page load. Firestore's free tier allows 20,000 writes and 50,000 reads per day, far more than a personal blog at this scale needs.
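To make that arithmetic concrete, here is a rough sketch of a daily usage estimate checked against the free tier figures quoted above. The function name, its parameters, and the assumption of one document read per day in the window for each stats page load are illustrative; actual billing depends on Firestore's current pricing rules.

```typescript
// Free tier figures as quoted in this post; check current pricing before relying on them.
const FREE_WRITES_PER_DAY = 20_000
const FREE_READS_PER_DAY = 50_000

interface DailyUsage {
  writes: number
  reads: number
  withinFreeTier: boolean
}

// Hypothetical estimator: one set() per page view, and one document get per
// day in the stats window for every stats page load.
function firestoreDailyUsage(
  viewsPerDay: number,
  statsLoadsPerDay: number,
  windowDays = 14,
): DailyUsage {
  const writes = viewsPerDay
  const reads = statsLoadsPerDay * windowDays
  return {
    writes,
    reads,
    withinFreeTier: writes <= FREE_WRITES_PER_DAY && reads <= FREE_READS_PER_DAY,
  }
}

// Roughly this blog's traffic: 750 views a day, with 20 stats page loads assumed.
const usage = firestoreDailyUsage(750, 20)
```

With these inputs the estimate is 750 writes and 280 reads per day, well inside both limits.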

Writing the Summary Document

Each page view updates a document in the analytics collection, keyed by date:

import { FieldValue } from 'firebase-admin/firestore'
import { getDb } from './firebase'

// encodePath() comes from the original implementation; isBot() and
// deviceType() are assumed to be defined elsewhere in this module.

export async function logPageView(
  path: string,
  country: string | null,
  ua: string | null,
): Promise<void> {
  // Skip tracking when Firebase is not configured or the visitor is a bot.
  if (!process.env.FIREBASE_PROJECT_ID) return
  if (ua && isBot(ua)) return

  const day = new Date().toISOString().slice(0, 10) // UTC date, YYYY-MM-DD
  const cc = (country ?? 'XX').toUpperCase().slice(0, 2) // 'XX' when unknown
  const device = deviceType(ua ?? '')
  const pageKey = encodePath(path)

  // One document per day, with flat prefixed counter fields.
  const db = getDb()
  await db.collection('analytics').doc(day).set(
    {
      date: day,
      total: FieldValue.increment(1),
      [`page_${pageKey}`]: FieldValue.increment(1),
      [`country_${cc}`]: FieldValue.increment(1),
      [`device_${device}`]: FieldValue.increment(1),
    },
    { merge: true },
  )
}

set() with merge: true creates the document if it does not exist and merges the provided fields if it does — so the first page view of a day creates the document, and every subsequent one increments the existing counters. FieldValue.increment(n) is safe under concurrent writes because Firestore resolves it server-side rather than as a read-modify-write cycle in the client.

The encodePath() function is the same one from the original implementation: it converts /articles/my-post to articles~my-post and the homepage / to the special token _root_, producing field names that are safe to use in Firestore.
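Since encodePath() is not reproduced in this post, here is a minimal sketch of what such a pair of helpers could look like. Only the two example mappings above come from the original implementation; the handling of trailing slashes and the decodePath() body are assumptions.

```typescript
// Hypothetical reconstruction of the path helpers. Only the mappings
// '/articles/my-post' -> 'articles~my-post' and '/' -> '_root_' are from the post.
const ROOT_TOKEN = '_root_'

function encodePath(path: string): string {
  // Drop leading and trailing slashes, then swap the remaining ones for '~'.
  const trimmed = path.replace(/^\/+|\/+$/g, '')
  if (trimmed === '') return ROOT_TOKEN
  return trimmed.split('/').join('~')
}

function decodePath(key: string): string {
  if (key === ROOT_TOKEN) return '/'
  return '/' + key.split('~').join('/')
}
```

One limitation of this sketch: a path that already contains a ~ character will not round-trip, so it assumes URL slugs never use one.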

The set() Dot-Notation Gotcha

The first version of the summary document used nested field names with dot notation — pages.articles~my-post, countries.AU, devices.d — following the pattern from the Firestore update() documentation.

// What I tried first — does NOT work with set()
await db.collection('analytics').doc(day).set(
  {
    [`pages.${pageKey}`]: FieldValue.increment(1),
    [`countries.${cc}`]: FieldValue.increment(1),
  },
  { merge: true },
)

The stats page returned "No data" for every section. The documents had been written, the total counter was correct, but the page and country breakdowns were empty.

The reason: set() treats dotted keys as literal field names. A key of pages.articles~my-post creates a field literally named pages.articles~my-post — with the dot as part of the name — rather than a field articles~my-post nested inside a map field pages. Only update() interprets dotted keys as field paths.

This is documented behaviour, but it is easy to miss if you have come from using update() and assume both methods handle field paths the same way. The documents Firestore stored looked healthy from the console — you could see the fields — but they were structured as a flat list of dot-separated field names rather than nested maps, so the reader code found nothing at data.pages.
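The difference between the two interpretations can be illustrated without touching Firestore. The sketch below is a plain TypeScript helper (expandFieldPaths is a hypothetical name, not part of any Firebase SDK) that expands dotted keys into nested maps, which is roughly how update() reads them. set() keeps the keys exactly as given.

```typescript
type Nested = { [key: string]: number | Nested }

// Hypothetical helper: expands dotted keys into nested maps, mimicking how
// update() interprets a key such as 'pages.articles~my-post'.
function expandFieldPaths(flat: Record<string, number>): Nested {
  const out: Nested = {}
  for (const [key, value] of Object.entries(flat)) {
    const segments = key.split('.')
    const leaf = segments.pop()! // split() always yields at least one segment
    let node = out
    for (const segment of segments) {
      // Create the intermediate map if it is missing (or not a map).
      if (typeof node[segment] !== 'object') node[segment] = {}
      node = node[segment] as Nested
    }
    node[leaf] = value
  }
  return out
}

// What set() effectively stored: top-level fields whose names contain a dot.
const written: Record<string, number> = {
  'pages.articles~my-post': 3,
  'countries.AU': 3,
}

// What the reader code expected to find: maps named 'pages' and 'countries'.
const expected = expandFieldPaths(written)
```

Here written has no pages key at all, which is why a lookup such as data.pages came back undefined, while expected.pages is a map with one entry.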

The fix is to avoid nested maps entirely and use flat prefixed field names instead. page_articles~my-post, country_AU, device_d are all top-level fields that set({ merge: true }) handles correctly. Reading them back is straightforward:

for (const [field, count] of Object.entries(data)) {
  if (field.startsWith('page_')) {
    const path = decodePath(field.slice(5))
    allPages[path] = (allPages[path] ?? 0) + (count as number)
  } else if (field.startsWith('country_')) {
    allCountries[field.slice(8)] = (allCountries[field.slice(8)] ?? 0) + (count as number)
  } else if (field.startsWith('device_')) {
    const dev = field.slice(7) as keyof typeof devices
    if (dev in devices) devices[dev] += count as number
  }
}

No nested map traversal, no data.pages lookup that returns undefined.
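For readers who want something they can run, here is a self-contained sketch of that loop wrapped in a function. The Totals shape, the aggregateDays name, the device codes m and t, and the inlined decodeKey helper are assumptions; only the prefixes and the d device code appear in the code above.

```typescript
// Assumed result shape; the post only shows the loop body.
interface Totals {
  total: number
  pages: Record<string, number>
  countries: Record<string, number>
  devices: { d: number; m: number; t: number }
}

// Simplified inverse of encodePath(), inlined to keep the sketch standalone.
function decodeKey(key: string): string {
  return key === '_root_' ? '/' : '/' + key.split('~').join('/')
}

function aggregateDays(docs: Array<Record<string, unknown>>): Totals {
  const totals: Totals = {
    total: 0,
    pages: {},
    countries: {},
    devices: { d: 0, m: 0, t: 0 },
  }
  for (const data of docs) {
    for (const [field, value] of Object.entries(data)) {
      // Skip non-counter fields such as the 'date' string.
      if (typeof value !== 'number') continue
      if (field === 'total') {
        totals.total += value
      } else if (field.startsWith('page_')) {
        const path = decodeKey(field.slice('page_'.length))
        totals.pages[path] = (totals.pages[path] ?? 0) + value
      } else if (field.startsWith('country_')) {
        const cc = field.slice('country_'.length)
        totals.countries[cc] = (totals.countries[cc] ?? 0) + value
      } else if (field.startsWith('device_')) {
        const dev = field.slice('device_'.length)
        // Ignore device codes this sketch does not know about.
        if (dev === 'd' || dev === 'm' || dev === 't') totals.devices[dev] += value
      }
    }
  }
  return totals
}
```

Slicing by the prefix length rather than a literal number keeps the offsets correct if a prefix is ever renamed.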

Migrating the Existing Blob Data

The Blob implementation had collected 1,570 page views across two days before the switch. Rather than discard those, I wrote a one-time migration API route that:

  1. Called the Vercel Blob HTTP API directly to list all objects under the analytics/ prefix
  2. Parsed each object's pathname to extract date, encoded path, country, and device code
  3. Aggregated the counts into day-keyed buckets in memory
  4. Wrote each day's bucket to Firestore as a single set() call — overwriting any previous bad run
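Steps 2 and 3 can be sketched as two pure functions. The pathname layout used here is invented for illustration, because the real layout from the original Blob implementation is not shown in this post; the function and type names are likewise hypothetical.

```typescript
// ASSUMPTION: this pathname layout is made up for the sketch:
//   analytics/<YYYY-MM-DD>/<encodedPath>/<country>/<device>/<id>
interface ParsedView {
  day: string
  pageKey: string
  country: string
  device: string
}

function parsePathname(pathname: string): ParsedView | null {
  const [prefix, day, pageKey, country, device] = pathname.split('/')
  if (prefix !== 'analytics' || !day || !pageKey || !country || !device) return null
  if (!/^\d{4}-\d{2}-\d{2}$/.test(day)) return null
  return { day, pageKey, country, device }
}

type DayBucket = Record<string, number>

// Aggregates pathnames into one bucket of flat prefixed counters per day.
function bucketByDay(pathnames: string[]): Map<string, DayBucket> {
  const buckets = new Map<string, DayBucket>()
  const bump = (bucket: DayBucket, field: string): void => {
    bucket[field] = (bucket[field] ?? 0) + 1
  }
  for (const pathname of pathnames) {
    const view = parsePathname(pathname)
    if (!view) continue // ignore objects that do not match the layout
    const bucket = buckets.get(view.day) ?? {}
    buckets.set(view.day, bucket)
    bump(bucket, 'total')
    bump(bucket, `page_${view.pageKey}`)
    bump(bucket, `country_${view.country}`)
    bump(bucket, `device_${view.device}`)
  }
  return buckets
}
```

Each bucket, together with its date field, can then be written with a plain set() and no merge option, so a rerun replaces whatever an earlier attempt stored.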

The route was protected with a MIGRATE_SECRET environment variable and deleted from the codebase as soon as the migration was confirmed successful. The whole process (writing the route, deploying it, calling it, verifying that the data appeared correctly in the stats page, and deleting the route) took one deployment cycle.

One note on the Blob HTTP API: @vercel/blob wraps a straightforward REST endpoint at https://blob.vercel-storage.com. You can call it directly with fetch() and an Authorization: Bearer header, which means the migration route did not need the npm package reinstalled just to read the existing data.

Reading the Stats

Reading back the daily summaries is a parallel batch of document gets — one per day in the stats window:

const db = getDb()
const snaps = await Promise.all(
  dates.map(date => db.collection('analytics').doc(date).get()),
)
 
for (const snap of snaps) {
  if (!snap.exists) continue
  const data = snap.data()!
  // accumulate from flat prefixed fields...
}

Because the fourteen document gets are issued concurrently rather than one after another, they complete in roughly the time of a single request. The stats page latency is now bounded by roughly one round trip to Firestore rather than by the number of page views in the window.
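The snippet above takes the dates array for granted. A minimal sketch of how it could be built follows, assuming document IDs are UTC dates in YYYY-MM-DD form as written by logPageView(); the lastNDays name and its signature are hypothetical.

```typescript
// Hypothetical helper: returns the last `days` UTC dates, oldest first,
// formatted as YYYY-MM-DD to match the daily document IDs.
function lastNDays(days: number, now: Date = new Date()): string[] {
  const MS_PER_DAY = 24 * 60 * 60 * 1000
  const dates: string[] = []
  for (let i = days - 1; i >= 0; i--) {
    // UTC has no daylight saving shifts, so whole-day offsets are safe here.
    const d = new Date(now.getTime() - i * MS_PER_DAY)
    dates.push(d.toISOString().slice(0, 10))
  }
  return dates
}

// The fourteen-day stats window ending today.
const dates = lastNDays(14)
```

Passing now explicitly makes the helper easy to test and keeps the window stable within a single request.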

Conclusion

The original design was clever but wrong for the billing model. Pathname-as-data is a useful technique when you have a managed object store with cheap list operations and need fast aggregation — but Vercel Blob's operation-based pricing makes it expensive for high-frequency small writes. For counters and time-series aggregations, a database with server-side atomic increment is the right tool.