LanWhisper
Voice typing that uses the faster computer already on your network.
Six days from first prompt to a finished, shipped product on Mac and Windows.
The Challenge
Speech-to-text is good now. The options for getting it are not.
Cloud services work well, but they send your audio somewhere else and charge forever. Local options assume you’re on a fast machine and don’t mind a terminal. There is a third path. LanWhisper transcribes as well as a cloud service while keeping everything on your own network, and it refuses to expose avoidable complexity.
One installer per platform. Automatic network discovery of faster instances. No terminal, no IP addresses, no ports, no account.
Process
I started with a 15-line prompt: menu-bar app, hold a hotkey, POST the audio to a hardcoded Whisper endpoint, paste the result at the cursor. That first version worked within an hour and proved the basic idea. It was also pretty bad.
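For a sense of how little that prototype had to do, here is a minimal sketch of the "POST the audio" step using only Python's standard library. It is not the original prototype code. The endpoint URL, the form field name "file", and the filename are all assumptions, since Whisper servers differ in what they expect.

```python
import urllib.request
import uuid

# Hypothetical hardcoded endpoint; the real URL and path are not given here.
WHISPER_URL = "http://192.168.1.50:9000/transcribe"


def build_transcription_request(audio: bytes, url: str = WHISPER_URL) -> urllib.request.Request:
    """Wrap recorded audio in a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    # Assumption: the server reads the upload from a form field named "file".
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="clip.wav"\r\n'
        "Content-Type: audio/wav\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url,
        data=head + audio + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(audio: bytes, url: str = WHISPER_URL, timeout: float = 30.0) -> str:
    """Send the audio and return the raw response body as text."""
    request = build_transcription_request(audio, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Everything a user would later have to configure by hand is visible in this sketch: the address, the port, the path, and the upload format.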
There was no .app bundle. The menu bar icon was a microphone emoji. Configuring the hotkey required the user to understand pynput’s string format. Setting the server required the user to know what URL format the Whisper server wanted.
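To make the hotkey problem concrete: pynput writes hotkeys as strings such as <ctrl>+<alt>+h, with modifiers and named keys in angle brackets. Below is a minimal sketch of the translation a settings screen could do so the user never sees that format. The helper name and the alias table are hypothetical, and this is not LanWhisper's code.

```python
# Friendly modifier names mapped to pynput's angle-bracket tokens.
# The alias list is an assumption chosen for illustration.
_MODIFIERS = {
    "ctrl": "<ctrl>",
    "control": "<ctrl>",
    "alt": "<alt>",
    "option": "<alt>",
    "shift": "<shift>",
    "cmd": "<cmd>",
    "command": "<cmd>",
    "win": "<cmd>",
}


def to_pynput_hotkey(label: str) -> str:
    """Convert a label like 'Ctrl+Alt+H' into pynput's '<ctrl>+<alt>+h' form."""
    parts = [part.strip().lower() for part in label.split("+")]
    if not all(parts):
        raise ValueError(f"empty key in hotkey label: {label!r}")
    tokens = []
    for part in parts:
        if part in _MODIFIERS:
            tokens.append(_MODIFIERS[part])
        elif len(part) == 1:
            # Single characters are written bare.
            tokens.append(part)
        else:
            # Other named keys (space, f5, ...) go in angle brackets.
            tokens.append(f"<{part}>")
    return "+".join(tokens)
```

For example, to_pynput_hotkey("Cmd + Shift + Space") returns "<cmd>+<shift>+<space>", a string in the form that pynput's hotkey parser accepts.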
So the real work started. I worked out the first-run flow on paper before I wrote any code for the new version.
A Figma prototype can’t catch what matters most here: how the app behaves in real use, including when the network misbehaves. You have to use it. So I handed the raw sketch to Claude Code and iterated in working software.
Type, proportion, and UI copy all got worked out the same way the code did: I let the agent produce a first draft, then iterated on it by hand.
Once the SwiftUI macOS version was solid, I ported it to Windows. Same Go daemon, new native client in C# and WinUI 3. Going all-Go with a cross-platform tray toolkit would have saved a codebase, but the whole product is supposed to feel like a normal Mac or Windows app. That naturalness is why you’d pick LanWhisper over a terminal-flavored alternative.
Distribution
An app that doesn’t install cleanly isn’t really an app.
Mac: notarized DMG with a branded background. Windows: signed installer via Azure Trusted Signing.
The landing page
The landing page is plain HTML, CSS, and a sprinkling of vanilla JavaScript.
The centerpiece is the animation at the top of this page: a pill of voice bars pops up, fades, and transcribed text appears at the caret, looping through three apps (Messages, Terminal, Notes). CSS does the visuals and keyframes; a small script sequences the phases. A simple autoplay loop beats an explainer video for understanding the basic premise at a glance.
The hard parts were the little ones. The caret had to sit at the same vertical position before and after the transcription landed. The voice pill had to clearly communicate speaking, not loading. And the pill’s exit had to feel as considered as its entrance.
None of that is technically impressive. It’s just a lot of small decisions.
Being the everything-er
I conceived the product, designed how it should work, and built it. Getting from the first prompt to signed, publicly downloadable installers on both platforms took about six days. When engineering and design are done by the same person in the same workflow, building is faster and more fun.