Technical Decisions
This page documents the key technical choices made during the design and implementation of Meet Transcriber, and the reasoning behind each.
Decision Log
1. Puppeteer over Google Calendar API
Decision: Poll Google Calendar by scraping meet.google.com/landing via Puppeteer instead of using the Google Calendar REST API.
Reasoning:
- The Google Calendar API requires OAuth 2.0 app registration, API key management, and Google Workspace admin approval for certain scopes
- Using a headless browser works with any standard consumer Gmail account: no developer console, no OAuth app, no API key
- The Meet landing page already aggregates upcoming meetings in a single view, making it an effective polling target
- Browser-based auth (cookie persistence) is simpler to maintain than OAuth token refresh flows
Trade-off: The scraping approach is more fragile than an API — Google UI changes can break selectors. Mitigated by using click-based extraction and HTML pattern matching rather than rigid CSS selectors.
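The HTML pattern matching mentioned above can be illustrated with a minimal sketch. It assumes that Meet links appear somewhere in the page markup as full `https://meet.google.com/...` URLs and that meeting codes follow a three-four-three lowercase letter pattern. The function name `extractMeetCodes` is hypothetical and is not taken from the codebase.

```javascript
// Assumption: meeting codes look like "abc-defg-hij" (3-4-3 lowercase letters).
const MEET_LINK = /https:\/\/meet\.google\.com\/([a-z]{3}-[a-z]{4}-[a-z]{3})/g;

// Hypothetical helper: return the unique meeting codes found anywhere in the HTML.
// It matches on the URL pattern only, so it does not depend on CSS class names.
function extractMeetCodes(html) {
  const codes = new Set();
  for (const match of html.matchAll(MEET_LINK)) {
    codes.add(match[1]); // capture group 1 is the meeting code
  }
  return [...codes];
}

// Possible usage with Puppeteer (not executed here):
//   const html = await page.content();
//   const codes = extractMeetCodes(html);
```

Because the match targets the URL shape rather than element structure, a layout change that keeps the links intact should not break it. A change to the link format itself still would.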
2. PulseAudio over Direct Audio Capture
Decision: Use a PulseAudio virtual null sink + loopback module to capture audio, rather than recording from a microphone input.
Reasoning:
- The bot runs on a headless server with no physical audio hardware
- Recording system audio output (Chrome's audio stream) requires a virtual device that captures what the application plays, not what a microphone hears
- PulseAudio's null sink + loopback combination routes Chrome's audio output through a virtual device that FFmpeg can record
- This captures all meeting participants' audio cleanly, without ambient noise
Trade-off: PulseAudio must be running and the virtual devices must be set up before Chrome starts. Handled by setupPulseAudio() with pre-checks and an FFmpeg retry on race condition.
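As a rough illustration of the virtual-device idea, the sketch below builds the argument lists for `pactl` and `ffmpeg`. It is not the body of `setupPulseAudio()`. The sink name `meet_sink`, the mono 16 kHz output settings, and both function names are assumptions, and the loopback module wiring is left out because its source and sink depend on the deployment.

```javascript
// Hypothetical helper: arguments for `pactl` that create a virtual null sink.
// Chrome would play into this sink instead of a physical sound card.
function nullSinkArgs(sinkName) {
  return ['load-module', 'module-null-sink', `sink_name=${sinkName}`];
}

// Hypothetical helper: arguments for `ffmpeg` that record the sink's monitor
// source. PulseAudio exposes "<sink>.monitor" for every sink it creates.
function ffmpegRecordArgs(sinkName, outputPath) {
  return [
    '-y',                          // overwrite the output file if it exists
    '-f', 'pulse',                 // read from PulseAudio
    '-i', `${sinkName}.monitor`,   // capture what is played into the sink
    '-ac', '1',                    // mono (assumed setting)
    '-ar', '16000',                // 16 kHz sample rate (assumed setting)
    outputPath,
  ];
}
```

These arrays could be passed to `child_process.spawn('pactl', ...)` and `child_process.spawn('ffmpeg', ...)`. The ordering constraint from the trade-off still applies: the sink must exist before Chrome and FFmpeg start.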
3. SQLite for State Persistence
Decision: Use SQLite (via better-sqlite3) for storing bot state, meeting records, and session data.
Reasoning:
- Single-server deployment — no need for a networked database
- SQLite has zero external dependencies and zero operational overhead
- Sufficient for the expected data volume (a few meetings per day)
- Synchronous API (better-sqlite3) simplifies the codebase — no async/await complexity for DB calls
- Data survives process restarts and can be inspected with standard SQLite tools
Trade-off: Not suitable for multi-server deployments. Acceptable for this single-instance use case.
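To show what the synchronous style looks like in practice, here is a minimal sketch of two storage helpers. The `meetings` table, its columns, and the function names are hypothetical, since the actual schema is not described on this page. The helpers take a `db` argument that is assumed to be a better-sqlite3 `Database` instance.

```javascript
// Hypothetical schema: table and column names are assumptions for illustration.
const SCHEMA = `
  CREATE TABLE IF NOT EXISTS meetings (
    id        TEXT PRIMARY KEY,
    title     TEXT NOT NULL,
    starts_at TEXT NOT NULL,
    status    TEXT NOT NULL
  )`;

// Create the table if it does not exist yet. `db.exec` runs synchronously.
function initSchema(db) {
  db.exec(SCHEMA);
}

// Insert a meeting, or update it if the id is already stored.
// Named parameters (@id, ...) are bound from the object's keys.
function upsertMeeting(db, meeting) {
  db.prepare(
    `INSERT INTO meetings (id, title, starts_at, status)
     VALUES (@id, @title, @startsAt, @status)
     ON CONFLICT(id) DO UPDATE SET
       title = excluded.title,
       starts_at = excluded.starts_at,
       status = excluded.status`
  ).run(meeting);
}

// Look up one meeting by id; returns undefined when no row matches.
function getMeeting(db, id) {
  return db.prepare('SELECT * FROM meetings WHERE id = ?').get(id);
}

// Possible usage (requires the better-sqlite3 package; not executed here):
//   const Database = require('better-sqlite3');
//   const db = new Database('bot.db');
//   initSchema(db);
//   upsertMeeting(db, { id: 'abc-defg-hij', title: 'Standup',
//                       startsAt: '2024-01-01T09:00:00Z', status: 'scheduled' });
```

None of these calls returns a promise, which is the simplification the reasoning above refers to.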
4. Whisper API over Local Whisper
Decision: Send audio to the OpenAI Whisper API rather than running Whisper locally on the DigitalOcean server.
Reasoning:
- The DigitalOcean server (basic droplet) has no GPU, so local Whisper inference would be extremely slow
- The Whisper API delivers high-accuracy transcription in seconds
- Simpler deployment: no model download, no GPU drivers, no memory management
- API cost is low for infrequent meeting transcriptions
Trade-off: Requires an OpenAI API key and internet connectivity. Acceptable since the server already has internet access and the bot already requires external services (Google, Telegram).
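For reference, a transcription request could be assembled roughly as follows. This is a sketch that assumes Node 18 or later (for the global `FormData` and `Blob`), a WAV recording, and the `whisper-1` model. The function name `buildTranscriptionRequest` is hypothetical, and the bot's actual client code may differ.

```javascript
const TRANSCRIPTION_URL = 'https://api.openai.com/v1/audio/transcriptions';

// Hypothetical helper: build the URL and fetch options for one transcription call.
// `audioBytes` is a Buffer or Uint8Array holding the recorded audio.
function buildTranscriptionRequest(audioBytes, filename, apiKey) {
  const form = new FormData();
  // The API expects the audio as a multipart "file" field plus a "model" field.
  form.append('file', new Blob([audioBytes], { type: 'audio/wav' }), filename);
  form.append('model', 'whisper-1');
  return {
    url: TRANSCRIPTION_URL,
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form, // fetch sets the multipart Content-Type header itself
    },
  };
}

// Possible usage (needs network access and a real key; not executed here):
//   const { url, options } = buildTranscriptionRequest(bytes, 'meeting.wav', apiKey);
//   const response = await fetch(url, options);
//   const { text } = await response.json();
```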
5. Telegram for Notifications
Decision: Use Telegram Bot API for all user-facing notifications, commands, and transcript delivery.
Reasoning:
- OpenClaw (the assistant platform) already has Telegram integration — no additional infrastructure needed
- Telegram bots support inline media (screenshots), long messages, and bidirectional commands out of the box
- Reliable delivery with no email spam filters or push notification token management
- The /login, /status, /join, and /iseeevents commands provide a natural interactive control channel
Trade-off: Requires the user to have a Telegram account and the bot to be configured with a valid token and chat ID.
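Transcript delivery can be sketched as below. The Bot API caps a single text message at 4096 characters, so a long transcript has to be sent as several messages. The helper names are hypothetical, and the split is deliberately naive: it cuts at fixed offsets, so it can break a word or an emoji in the middle.

```javascript
const TELEGRAM_MAX_CHARS = 4096; // Bot API limit for one text message

// Hypothetical helper: cut text into pieces that each fit in one message.
function splitForTelegram(text, limit = TELEGRAM_MAX_CHARS) {
  const chunks = [];
  for (let i = 0; i < text.length; i += limit) {
    chunks.push(text.slice(i, i + limit));
  }
  return chunks;
}

// Hypothetical helper: build one sendMessage request per chunk.
function buildSendMessageRequests(token, chatId, text) {
  return splitForTelegram(text).map((chunk) => ({
    url: `https://api.telegram.org/bot${token}/sendMessage`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ chat_id: chatId, text: chunk }),
    },
  }));
}
```

Each request could then be sent in order with `fetch(url, options)`. Splitting on paragraph boundaries would read better and is a natural refinement.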
6. Browser-based Calendar Polling over API-based Polling
Decision: Use the browser (Puppeteer + authenticated session) to poll the calendar, rather than an API-based approach.
Reasoning:
- No Google API credentials required: polling works with the same Google session used for joining meetings
- The bot only needs one authenticated browser session for both calendar polling and meeting join
- Reduces the attack surface: no API tokens to manage or rotate
- The Meet landing page provides a unified view of upcoming meetings without requiring calendar API scopes
Trade-off: Google's Meet UI may show a limited time window of upcoming events. Polling frequency (every 5 minutes) mitigates this.
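With a 5-minute poll, the same event is likely to be visible on several consecutive polls before it starts. One way to act on each event only once is to track the ids already seen, as in the sketch below. `createPollTracker`, `scrapeUpcomingEvents`, `scheduleJoin`, and the `id` field on events are all hypothetical names and are not taken from the codebase.

```javascript
const POLL_INTERVAL_MS = 5 * 60 * 1000; // every 5 minutes

// Hypothetical helper: returns a function that filters a poll result down to
// the events that have not been reported by any earlier poll.
function createPollTracker() {
  const seen = new Set();
  return function newEvents(events) {
    const fresh = [];
    for (const event of events) {
      if (!seen.has(event.id)) {
        seen.add(event.id); // remember it so later polls skip it
        fresh.push(event);
      }
    }
    return fresh;
  };
}

// Possible usage (scrapeUpcomingEvents and scheduleJoin are placeholders):
//   const newEvents = createPollTracker();
//   setInterval(async () => {
//     const events = await scrapeUpcomingEvents(page);
//     for (const event of newEvents(events)) scheduleJoin(event);
//   }, POLL_INTERVAL_MS);
```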