shipped spoolcast dev log 6 — why ai makes things up about its own work
today's spoolcast video is about a quieter agent failure than getting facts wrong: the agent isn't lying, it invents answers about its own work because it has nothing real to look at. the hook is a 3:30 am moment from the last episode's production: desperate, asking the agent how much was left, and getting three different answers in a row.
most of the session was about audio. switched spoolcast's tts pipeline from one-mp3-per-beat with silent gaps to chunk-level ssml synthesis (one mp3 per chunk, with <break> tags between beats), so google's prosody flows across what were beat boundaries instead of restarting cold every sentence. caught a real bug along the way: speakingrate=1.1 was being baked in by google AND playbackrate=1.1 was being applied in remotion. the two rates multiply, so the effective speedup was 1.1 × 1.1 = 1.21x. that was the source of the inhuman feel.
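a rough sketch of the two ideas above, not spoolcast's actual code: joining a chunk's beats into one ssml document with <break> tags, and checking what two stacked rate settings multiply out to. the function names, the 400ms pause, and the sample beats are all assumptions made up for illustration.

```python
from xml.sax.saxutils import escape


def build_chunk_ssml(beats, break_ms=400):
    """Join the beats of one chunk into a single SSML document.

    break_ms is a hypothetical default; the real pause length is not
    stated in the log.
    """
    pause = f'<break time="{break_ms}ms"/>'
    # Escape &, <, > so script text cannot break the SSML markup.
    body = pause.join(escape(beat.strip()) for beat in beats)
    return f"<speak>{body}</speak>"


def effective_speed(synthesis_rate, playback_rate):
    """Rates applied at synthesis and again at playback multiply."""
    return synthesis_rate * playback_rate


ssml = build_chunk_ssml(["first beat.", "second beat."])
print(ssml)
# <speak>first beat.<break time="400ms"/>second beat.</speak>

print(round(effective_speed(1.1, 1.1), 2))
# 1.21
```

the point of the second function is only that speed should be set in one place: leave either side at 1.0 and the effective rate is whatever the other side says.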
the friction was "ai" pronunciation. tried five ssml and phonetic approaches, and each one drifted in its own way, including one ("ay-eye") that came out sounding like a pirate. the fix that landed was rewriting the script: "ai" / "the ai" / "my ai" → "agent" / "the agent" / "my agent" throughout. now baked into video_output_rules.md §7 as a permanent rule: when chirp3-hd doesn't read an acronym cleanly, rewrite the script before reaching for ssml.
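the script rewrite is simple enough to automate. a minimal sketch, assuming the script is plain text and that a whole-word match is enough; rewrite_ai is a hypothetical helper, not something in the spoolcast repo, and it doesn't try to fix sentence-initial capitalization.

```python
import re

# Match "ai" only as a whole word, in any casing, so words like
# "said" or "wait" are left alone.
_AI_WORD = re.compile(r"\bai\b", re.IGNORECASE)


def rewrite_ai(script):
    """Replace the standalone word "ai" with "agent" before synthesis."""
    return _AI_WORD.sub("agent", script)


print(rewrite_ai("my ai said the AI was done, so i asked the ai to wait"))
# my agent said the agent was done, so i asked the agent to wait
```

because "the ai" and "my ai" only differ in the word before "ai", one whole-word rule covers all three cases from the log.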