<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Ajay Walia</title><link>https://curiousbit.netlify.app/</link><description>Digital workplace, artificial intelligence, cloud, security, automation, and enterprise technology notes by Ajay Walia.</description><language>en-au</language><managingEditor>Ajay Walia</managingEditor><webMaster>Ajay Walia</webMaster><copyright>Copyright 2026 Ajay Walia</copyright><lastBuildDate>Sun, 21 Jun 2026 05:46:10 +0000</lastBuildDate><atom:link href="https://curiousbit.netlify.app/tags/build-log/index.xml" rel="self" type="application/rss+xml"/><image><url>https://curiousbit.netlify.app/images/og-default.png</url><title>Ajay Walia</title><link>https://curiousbit.netlify.app/</link></image><item><title>I Built My Own Video Downloader — No Ads, No Watermarks, Three Platforms</title><link>https://curiousbit.netlify.app/i-built-my-own-video-downloader-no-ads-no-watermarks-three-platforms/</link><guid isPermaLink="true">https://curiousbit.netlify.app/i-built-my-own-video-downloader-no-ads-no-watermarks-three-platforms/</guid><pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate><dc:creator>Ajay Walia</dc:creator><description>&lt;p&gt;Every video download site I tried felt like navigating a minefield — five ad clicks to reach a button that triggers another redirect. So I stopped using them and built a clean local tool that handles TikTok, Instagram, and X in one paste.&lt;/p&gt;</description><content:encoded><![CDATA[<img src="https://curiousbit.netlify.app/images/video-downloader-banner.jpg" alt="Build-Log" style="max-width:100%;height:auto;margin-bottom:1.5em;"/><p>Every video download site I tried felt like navigating a minefield — five ad clicks to reach a button that triggers another redirect. 
So I stopped using them and built a clean local tool that handles TikTok, Instagram, and X in one paste.</p><p><strong>No watermarks. No ads. No accounts. Just a URL.</strong></p><p><em>~6 min read · Node.js + React · conceptual deep-dive</em></p><style>
@import url('https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap');
.vd-article {
--vd-surface: #111827;
--vd-surface2: #1a2235;
--vd-border: #1f2d45;
--vd-text: #e2e8f0;
--vd-muted: #8b9ab3;
--vd-green: #00c853;
--vd-green-glow: rgba(0,200,83,0.15);
--vd-cyan: #22d3ee;
--vd-cyan-glow: rgba(34,211,238,0.12);
--vd-orange: #f59e0b;
color: var(--vd-text);
font-size: 1.05rem;
line-height: 1.85;
margin: 2.5rem 0;
}
.vd-article * { box-sizing: border-box; }
.vd-article h2 {
font-family: 'Space Grotesk', sans-serif;
font-size: clamp(1.5rem, 1.2rem + 1vw, 2.1rem);
font-weight: 700;
letter-spacing: -0.02em;
color: var(--vd-text);
margin: 2.6rem 0 0.9rem;
padding-bottom: 0.5rem;
border-bottom: 1px solid var(--vd-border);
}
.vd-article h3 {
font-family: 'Space Grotesk', sans-serif;
font-size: clamp(1.15rem, 1.05rem + 0.3vw, 1.45rem);
font-weight: 700;
color: var(--vd-green);
margin: 2rem 0 0.6rem;
}
.vd-article p { margin: 0 0 1.2rem; color: var(--vd-text); }
.vd-article strong { color: #fff; font-weight: 600; }
.vd-article a { color: var(--vd-green); text-decoration: underline; text-underline-offset: 3px; }
.vd-article code {
background: var(--vd-surface2);
padding: 0.1rem 0.4rem;
border-radius: 4px;
font-family: 'JetBrains Mono', monospace;
font-size: 0.88rem;
}
.vd-callout {
background: var(--vd-surface2);
border-left: 3px solid var(--vd-green);
border-radius: 0 10px 10px 0;
padding: 1.1rem 1.4rem;
margin: 1.6rem 0;
font-size: 1rem;
line-height: 1.7;
}
.vd-callout.warning { border-color: var(--vd-orange); background: rgba(245,158,11,0.06); }
.vd-callout.info { border-color: var(--vd-cyan); background: rgba(34,211,238,0.06); }
.vd-steps { display: flex; flex-direction: column; gap: 1rem; margin: 1.5rem 0 2rem; }
.vd-step-card {
background: var(--vd-surface);
border: 1px solid var(--vd-border);
border-radius: 12px;
padding: 1.2rem 1.4rem;
display: flex;
gap: 1.1rem;
align-items: flex-start;
}
.vd-step-num {
font-family: 'Space Grotesk', sans-serif;
font-size: 1.3rem; font-weight: 800;
color: var(--vd-green);
min-width: 2rem;
line-height: 1.3;
}
.vd-step-body h4 {
font-family: 'Space Grotesk', sans-serif;
font-weight: 700; font-size: 1rem;
margin: 0 0 0.3rem;
color: #fff;
}
.vd-step-body p { margin: 0; font-size: 0.92rem; color: var(--vd-muted); line-height: 1.6; }
.vd-platform-row { display: flex; gap: 0.75rem; flex-wrap: wrap; margin: 1.2rem 0 1.8rem; }
.vd-platform-badge {
display: flex; align-items: center; gap: 0.5rem;
background: var(--vd-surface);
border: 1px solid var(--vd-border);
border-radius: 10px;
padding: 0.6rem 1rem;
font-size: 0.88rem; font-weight: 600;
color: var(--vd-text);
}
.vd-platform-badge .vd-dot { width: 8px; height: 8px; border-radius: 50%; }
.vd-dot-tiktok { background: #ff0050; }
.vd-dot-insta { background: #e1306c; }
.vd-dot-x { background: #1d9bf0; }
.vd-post-img {
width: 100%; border-radius: 12px;
border: 1px solid var(--vd-border);
margin: 1.5rem 0 0.4rem;
display: block;
box-shadow: 0 4px 24px rgba(0,0,0,0.4);
}
.vd-img-caption {
text-align: center;
font-size: 0.8rem;
color: var(--vd-muted);
margin: 0 0 2rem;
font-style: italic;
}
.vd-flow {
background: var(--vd-surface);
border: 1px solid var(--vd-border);
border-radius: 14px;
padding: 1.8rem 1.4rem;
margin: 1.8rem 0;
display: flex; flex-direction: column; align-items: center;
}
.vd-flow-node {
background: var(--vd-surface2);
border: 1px solid var(--vd-border);
border-radius: 10px;
padding: 0.65rem 1.4rem;
font-size: 0.9rem;
font-weight: 600;
color: var(--vd-text);
text-align: center;
width: 100%;
max-width: 420px;
}
.vd-flow-node.green { border-color: rgba(0,200,83,0.4); background: var(--vd-green-glow); color: var(--vd-green); }
.vd-flow-node.orange { border-color: rgba(245,158,11,0.4); background: rgba(245,158,11,0.07); color: var(--vd-orange); }
.vd-flow-node.cyan { border-color: rgba(34,211,238,0.4); background: var(--vd-cyan-glow); color: var(--vd-cyan); }
.vd-flow-arrow { color: var(--vd-muted); font-size: 1.1rem; line-height: 1; padding: 0.35rem 0; }
.vd-results { display: grid; grid-template-columns: 1fr 1fr; gap: 1rem; margin: 1.5rem 0 2rem; }
@media (max-width: 600px) { .vd-results { grid-template-columns: 1fr; } }
.vd-result-card {
background: var(--vd-surface);
border: 1px solid var(--vd-border);
border-radius: 12px;
padding: 1.2rem;
}
.vd-result-card .vd-rc-icon { font-size: 1.6rem; margin-bottom: 0.5rem; }
.vd-result-card h4 {
font-family: 'Space Grotesk', sans-serif;
font-size: 0.95rem; font-weight: 700;
color: #fff; margin: 0 0 0.3rem;
}
.vd-result-card p { font-size: 0.86rem; color: var(--vd-muted); margin: 0; line-height: 1.55; }
.vd-cta {
background: var(--vd-surface);
border: 1px solid rgba(0,200,83,0.25);
border-radius: 14px;
padding: 1.6rem 1.8rem;
margin: 2.5rem 0 0;
}
.vd-cta p { font-size: 0.95rem; color: var(--vd-text); margin: 0; line-height: 1.7; }
.vd-cta strong { color: var(--vd-green); }
.vd-divider { border: none; border-top: 1px solid var(--vd-border); margin: 2.2rem 0; }</style><div class="vd-article"><h2>The Problem With Every Downloader Site</h2><p>You find a TikTok video you want to keep. Maybe it's a tutorial, a recipe, a clip you want to share somewhere offline. You Google "download TikTok without watermark" and click the first result. What follows is a ritual:</p><div class="vd-callout warning">
Pop-up #1 appears. You close it. A second tab opens. You close that. You find the actual download button — but it's fake and triggers another ad. The real button is somewhere underneath a consent banner. You finally click download. It starts… and gives you a watermarked file anyway.</div><p>This happens on nearly every popular downloader site. They're monetised almost entirely through advertising, and the UX is designed to maximise your exposure to that advertising — not to help you download a video. The actual download logic underneath all that noise is usually a single API call.</p><p>So I asked myself: how hard would it actually be to build a clean version of this for personal use?</p><h2>What I Actually Wanted</h2><p>Three things, nothing more:</p><div class="vd-steps"><div class="vd-step-card"><div class="vd-step-num">1</div><div class="vd-step-body"><h4>No watermark, always</h4><p>The tool should try its hardest to get a watermark-free file. If it can't, it tells you — it doesn't quietly give you the watermarked version pretending it's clean.</p></div></div><div class="vd-step-card"><div class="vd-step-num">2</div><div class="vd-step-body"><h4>Support the platforms I actually use</h4><p>TikTok, Instagram, and X (Twitter). Paste any URL from any of these three and it should just work.</p></div></div><div class="vd-step-card"><div class="vd-step-num">3</div><div class="vd-step-body"><h4>Zero friction interface</h4><p>One input, one button. No accounts, no CAPTCHAs, no ads. It runs locally so there's nothing to sign up for.</p></div></div></div><h2>How It Works — The Conceptual Picture</h2><p>The tool is split into two halves: a lightweight React frontend and a Node.js backend. The frontend is just the URL input box. 
All the interesting logic lives in the backend.</p><p>Here are the five things that happen the moment you paste a URL and hit Download:</p><h3>Step 1 — Figure Out Which Platform You're On</h3><p>The very first thing the backend does is inspect the URL and decide: is this TikTok, Instagram, or X? Each platform has its own URL patterns — including short links, mobile URLs, and regional variants — and the tool checks all of them.</p><div class="vd-platform-row"><div class="vd-platform-badge"><span class="vd-dot vd-dot-tiktok"></span> tiktok.com · vm.tiktok.com · vt.tiktok.com</div><div class="vd-platform-badge"><span class="vd-dot vd-dot-insta"></span> instagram.com · www.instagram.com</div><div class="vd-platform-badge"><span class="vd-dot vd-dot-x"></span> x.com · twitter.com · mobile.twitter.com</div></div><p>If the URL doesn't match any of these, the backend rejects it immediately with a clear error before wasting any time trying to fetch something it can't handle. No silent failures.</p><img src="/images/video-downloader-arch.jpg" alt="Anime engineer pointing at a holographic flowchart of the download architecture" class="vd-post-img"><p class="vd-img-caption">The backend runs through a short chain: detect → resolve → try providers → cache → serve</p><h3>Step 2 — Resolve Short Links</h3><p>TikTok in particular loves to generate short share links like <code>vm.tiktok.com/AbcXyz</code>. These redirect to the full video URL, but the download providers need the real URL to work with. So the backend follows up to five redirects to resolve the final destination before doing anything else.</p><h3>Step 3 — The Provider Chain</h3><p>This is the core of the tool. 
The backend doesn't rely on a single source for the download link — it tries multiple providers in order, and only moves to the next one if the previous one failed.</p><div class="vd-flow"><div class="vd-flow-node cyan">Validated &amp; resolved URL</div><div class="vd-flow-arrow">↓</div><div class="vd-flow-node"><strong>Provider 1:</strong> TikWM API — fast, HD, usually no-watermark</div><div class="vd-flow-arrow">↓ if failed</div><div class="vd-flow-node"><strong>Provider 2:</strong> yt-dlp — catches what TikWM misses, supports all 3 platforms</div><div class="vd-flow-arrow">↓ if both fail on no-watermark</div><div class="vd-flow-node orange">Last resort: watermarked fallback (with a clear warning shown)</div><div class="vd-flow-arrow">↓</div><div class="vd-flow-node green">✓ Download link returned to frontend</div></div><p>The first provider, TikWM, is a public API that's fast and usually returns an HD, watermark-free file. But it occasionally struggles with newer videos or private content. That's when <strong>yt-dlp</strong> steps in — a powerful open-source tool that knows how to extract media from hundreds of platforms, and is updated constantly as platforms change their serving behaviour.</p><div class="vd-callout info"><strong>Why yt-dlp as a fallback and not primary?</strong> TikWM is faster and returns a clean pre-parsed result. yt-dlp is more capable but adds latency since it runs as a local process and parses raw platform data. Using TikWM first keeps the happy path quick.</div><h3>Step 4 — The Proxy Download</h3><p>Here's a detail that actually matters for reliability: the tool doesn't give your browser a direct link to TikTok's CDN or Instagram's servers. Instead, it registers a short-lived <strong>secure token</strong> that points back to the Node backend. When you click download, your browser hits the backend's own <code>/api/file</code> endpoint, which streams the video directly to you.</p><p>Why does this matter? 
Direct CDN links from social platforms often include authentication tokens or short expiry times. They also sometimes block downloads when accessed directly from a browser outside the platform. Running the stream through the backend sidesteps both issues — and means the download starts with a clean filename instead of a jumble of CDN parameters.</p><h3>Step 5 — Caching</h3><p>Once a video URL has been resolved and a download link extracted, the result is cached in memory for 30 minutes. If you (or someone else on the same local instance) paste the same video URL again within that window, the backend returns the cached result instantly — no API calls, no yt-dlp process, just the stored answer.</p><h2>The Result</h2><p>What this adds up to in practice:</p><div class="vd-results"><div class="vd-result-card"><div class="vd-rc-icon">🎯</div><h4>No-watermark first, always</h4><p>The tool tries every avenue for a clean file before falling back. The fallback is clearly labelled.</p></div><div class="vd-result-card"><div class="vd-rc-icon">⚡</div><h4>Fast on repeat URLs</h4><p>Same video twice within 30 minutes? Instant response from the in-memory cache.</p></div><div class="vd-result-card"><div class="vd-rc-icon">🎵</div><h4>Audio extraction too</h4><p>The tool also surfaces the audio-only track when available — useful for saving music from TikTok.</p></div><div class="vd-result-card"><div class="vd-rc-icon">🔒</div><h4>No external accounts</h4><p>Runs entirely locally. Nothing to log into, nothing phoning home, no API keys required.</p></div></div><img src="/images/video-downloader-result.jpg" alt="Anime developer smiling at a clean minimal download interface" class="vd-post-img"><p class="vd-img-caption">One URL input, three download buttons, zero pop-ups. 
That's the whole interface.</p><h2>What's Next</h2><p>The tool is currently running locally but I'm planning to deploy it on <strong>curiousbit.netlify.app</strong> so anyone who wants a clean download experience can use it without having to run Node themselves. The architecture is already production-ready — it's just a matter of pointing it at a hosting environment and wiring up the environment variables.</p><p>A few things I'd like to add before making it fully public: rate limiting per IP (to avoid abuse), a simple download history in the UI, and potentially Instagram Stories support which currently needs a different extraction path.</p><div class="vd-callout">
The code is structured as a monorepo — backend and frontend live together, share URL validation logic, and build to a single deployable package. If you want to run it yourself locally, it's a single <code>npm install &amp;&amp; npm run dev</code> away.</div><hr class="vd-divider"><div class="vd-cta"><p><strong>Over to you:</strong> Have you ever got fed up enough with a broken web experience that you built your own alternative? I'd love to hear what you made — or whether you'd actually use a clean, ad-free downloader like this if it were publicly hosted. Drop your thoughts below or find me on LinkedIn.</p></div></div>
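<p>The detect, provider-chain, and cache steps described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the tool's Node.js source: the names <code>detect_platform</code>, <code>DownloadResolver</code>, and <code>PLATFORM_HOSTS</code> are hypothetical, providers are modelled as plain callables that either return a link or raise, and short-link resolution and the watermarked fallback are left out.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import time
from urllib.parse import urlparse

# Hostnames taken from the Step 1 badges; the mapping itself is an assumption.
PLATFORM_HOSTS = {
    "tiktok": {"tiktok.com", "vm.tiktok.com", "vt.tiktok.com"},
    "instagram": {"instagram.com"},
    "x": {"x.com", "twitter.com", "mobile.twitter.com"},
}

CACHE_TTL_SECONDS = 30 * 60  # the 30-minute window from Step 5


def detect_platform(url):
    """Return the platform name for a URL, or raise ValueError if unsupported."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    for platform, hosts in PLATFORM_HOSTS.items():
        if host in hosts:
            return platform
    raise ValueError(f"Unsupported URL: {url}")


class DownloadResolver:
    """Try providers in order and cache the first successful result."""

    def __init__(self, providers, clock=time.monotonic):
        self.providers = providers  # list of (name, callable) pairs
        self.clock = clock
        self.cache = {}  # url -> (expiry_time, result)

    def resolve(self, url):
        detect_platform(url)  # reject unknown hosts before any provider runs
        cached = self.cache.get(url)
        if cached and cached[0] > self.clock():
            return cached[1]
        errors = []
        for name, provider in self.providers:
            try:
                result = {"provider": name, "link": provider(url)}
            except Exception as exc:  # a failed provider means "try the next one"
                errors.append(f"{name}: {exc}")
                continue
            self.cache[url] = (self.clock() + CACHE_TTL_SECONDS, result)
            return result
        raise RuntimeError("All providers failed: " + "; ".join(errors))


def flaky_provider(url):
    # Stand-in for a provider that is down or cannot handle this video.
    raise RuntimeError("timeout")


def steady_provider(url):
    # Stand-in for a provider that succeeds; the link is a placeholder.
    return "https://cdn.example/video.mp4"


if __name__ == "__main__":
    resolver = DownloadResolver([("first", flaky_provider), ("second", steady_provider)])
    print(resolver.resolve("https://vm.tiktok.com/AbcXyz"))
</code></pre>
<p>Keeping each provider behind the same call signature means the ordering (fast API first, slower local process second) lives in one list, and the cache sits in front of all of them, so a repeat URL inside the window never reaches a provider.</p>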
]]></content:encoded><media:content url="https://curiousbit.netlify.app/images/video-downloader-banner.jpg" medium="image"><media:title type="plain">Build-Log</media:title></media:content><category>automation</category><category>build-log</category><category>nodejs</category><category>Knowledge Base</category></item><item><title>Camera Roll to Caption — Python Pipeline, Vision Model for Photo Tags</title><link>https://curiousbit.netlify.app/camera-roll-to-caption-python-pipeline-vision-model-for-photo-tags/</link><guid isPermaLink="true">https://curiousbit.netlify.app/camera-roll-to-caption-python-pipeline-vision-model-for-photo-tags/</guid><pubDate>Sat, 02 May 2026 00:00:00 +0000</pubDate><dc:creator>Ajay Walia</dc:creator><description>&lt;p&gt;Vision models, language models, and most other generative systems are confident-but-wrong some non-trivial fraction of the time. The instinct is to fix that with better prompts, bigger models, or smarter agents. The cheaper move is usually to add a small structured review seam — a thirty-second checkpoint where a human can glance, correct, and move on.&lt;/p&gt;</description><content:encoded><![CDATA[<img src="https://curiousbit.netlify.app/images/ctp/bottlebrush.jpg" alt="Build-Log" style="max-width:100%;height:auto;margin-bottom:1.5em;"/><p>Vision models, language models, and most other generative systems are confident-but-wrong some non-trivial fraction of the time. The instinct is to fix that with better prompts, bigger models, or smarter agents. The cheaper move is usually to add a small structured review seam — a thirty-second checkpoint where a human can glance, correct, and move on.</p><p>This post is the case study for one such seam, dropped into a build I needed for myself. Of 35 garden photos handed to a vision model, <strong>74% came back with correct first-pass labels</strong>. After thirty seconds editing a CSV, <strong>97% were acceptable to publish</strong>. Total API cost: <strong>$0.18</strong>. 
Total inference time: <strong>~74 seconds</strong> at 2.1 sec/photo on <code>gpt-4o-mini</code>. The CSV was the highest-leverage code in the project — and it isn&rsquo;t really code.</p><p>Here&rsquo;s the story.</p><h2 id="the-annoyance">The annoyance</h2><p>It was a Saturday afternoon in early March. I&rsquo;d come back from a walk around the garden with thirty-five photos on my iPhone — bottlebrush in full red, honeysuckle dripping with rain, a lilly-pilly cluster doing its outrageous pink thing, and at least one inexplicable shot of an old railway station I&rsquo;d passed on the way home.</p><p>I wanted to post a handful of them with consistent little hashtag labels — <code>#bottlebrush</code>, <code>#honeysuckle</code>, <code>#flower</code> — burned into the corner like a quiet caption. Not a watermark, not a filter, just a small readable pill that says &ldquo;this is what you&rsquo;re looking at.&rdquo;</p><p>What I didn&rsquo;t want was to open each HEIC in Preview, draw a text box, fiddle with the font, export, repeat thirty-five times. So I did the only reasonable thing: I wrote a small Python tool that does it for me.</p><p><img src="/images/ctp/bottlebrush.jpg" alt="Bottlebrush hero — red Australian bottlebrush flower with a #bottlebrush hashtag pill in the bottom-right corner"/></p><h2 id="the-shape-of-the-pipeline">The shape of the pipeline</h2><pre tabindex="0"><code>┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│  Folder  │──▶│  Vision  │──▶│   CSV    │──▶│  Apply   │──▶│  Tagged  │
│  photos  │   │ provider │   │  review  │   │  + pill  │   │  output  │
└──────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘
                                   ↑
                         human-in-the-loop seam
--mode propose:  folder ─▶ vision provider ─▶ CSV
--mode apply:    CSV ─▶ render + pill ─▶ tagged output</code></pre><p>The minimal product was easy to describe. Point the script at a folder. For each image — HEIC, JPG, PNG, whatever the iPhone or my camera roll throws at it — open it, figure out what&rsquo;s in it, draw a small rounded hashtag pill into the bottom-right corner, save the result to a <code>tagged_output/</code> subfolder. No watermark across the centre of the image, no filter or colour grade, no destructive edit to the original, and no making me choose the label by hand when a vision model can have a decent first guess.</p><p>That last point is where the design got interesting.</p><h2 id="the-seam">The seam</h2><p>You could write this as a single command: walk the folder, ask the model, render the tag, done. I tried that first. The first-pass run produced a folder of beautifully tagged images, about a quarter of which were wrong in some quietly maddening way — a daisy called <code>#flower</code>, a fern called <code>#leaves</code>, the railway station called, charitably, <code>#station</code>.</p><p>So the script runs in two passes.</p><p><code>--mode propose</code> opens each image, hands it to the vision model, and writes a CSV with five columns:</p><pre tabindex="0"><code>image_path, label, score, suggested_tag, final_tag</code></pre><p><code>final_tag</code> is initialised to <code>suggested_tag</code>, but the whole point of the column is that you can edit it. Open the CSV, glance down the list, fix anything obvious — <code>flower</code> becomes <code>morning_glory</code>, <code>leaves</code> becomes <code>bamboo</code> — save, close. On this batch, <strong>9 of 35 rows needed editing</strong> (a daisy, the railway station, two ferns, the bamboo, and four generic-flower fallbacks). A thirty-second pass.</p><p><code>--mode apply</code> then reads the CSV row by row and renders the tag using whatever&rsquo;s in <code>final_tag</code>. The CSV is the human-in-the-loop seam. 
It is much cheaper than re-running inference, and it catches the cases where the model was right about the genus but wrong about the species, or just wrong.</p><h2 id="three-providers-one-interface">Three providers, one interface</h2><p>I didn&rsquo;t want to commit to one vision model — the price/quality trade-offs are too lively right now. The script supports three providers behind one interface, picked via <code>--provider local|openai|xai</code>.</p><p><strong>Local CLIP.</strong> HuggingFace&rsquo;s <code>openai/clip-vit-large-patch14</code> against a fixed candidate list. Free, offline, <strong>~0.4 sec/photo on an M3 Pro</strong>. The cost is breadth: anything outside the candidate list collapses to the nearest match. CLIP doesn&rsquo;t know what a bottlebrush is unless I tell it the word.</p><p><strong>OpenAI.</strong> <code>gpt-4o-mini</code> by default, with an opt-in <code>--high-accuracy</code> flag that retries low-confidence cases (under 0.72) on <code>gpt-4o</code>. <strong>~2.1 sec/photo, ~$0.18 for the 35-photo batch.</strong> The labels are open-ended, which is how <code>bottlebrush</code>, <code>honeysuckle</code>, <code>fern</code>, and <code>berries</code> ended up in the CSV rather than <code>flower</code>, <code>flower</code>, <code>leaves</code>, <code>fruit</code>. <strong>22% of the batch tripped the retry threshold</strong> and went to <code>gpt-4o</code>.</p><p><img src="/images/ctp/berries.jpg" alt="Hot-pink lilly-pilly berries tagged #berries — an example of gpt-4o-mini producing a specific label rather than the generic &ldquo;fruit&rdquo;"/></p><p><strong>xAI Grok.</strong> Same OpenAI-compatible client, pointed at <code>api.x.ai</code> with <code>grok-2-vision-latest</code>. 
Useful if you&rsquo;re already on the x.ai stack or want a different model family&rsquo;s vote.</p><p>The mental model: local CLIP for batch-of-a-hundred-photos-on-a-flight, OpenAI as the daily driver, and the high-accuracy retry for exactly the case where the model says &ldquo;flower&rdquo; with 0.55 confidence and I want it to look harder before I have to.</p><p>The blue morning glory below is what generic labels look like in practice — still a decent fallback, just unspecific. The model wasn&rsquo;t wrong; it just wasn&rsquo;t curious.</p><p><img src="/images/ctp/morning-glory.jpg" alt="Blue morning glory tagged #flower — an example of the model falling back to a generic label even with the specific species clearly visible"/><h2 id="two-small-touches">Two small touches</h2><p>Two design choices are the difference between &ldquo;the script works&rdquo; and &ldquo;the output looks intentional.&rdquo;</p><p><strong>Style-aware contrast.</strong> The pill needs to be readable on both a bright sky and dark foliage. The script crops the bottom-right region of the image, measures the mean luminance using the standard Rec. 
709 weights, and flips the colour scheme depending on whether that value falls below a threshold of 140:</p><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">ImageStat</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">style_aware_colors</span><span class="p">(</span><span class="n">img</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Sample the bottom-right corner, where the pill is drawn.</span>
</span></span><span class="line"><span class="cl">    <span class="n">w</span><span class="p">,</span> <span class="n">h</span> <span class="o">=</span> <span class="n">img</span><span class="o">.</span><span class="n">size</span>
</span></span><span class="line"><span class="cl">    <span class="n">crop</span> <span class="o">=</span> <span class="n">img</span><span class="o">.</span><span class="n">crop</span><span class="p">((</span><span class="nb">int</span><span class="p">(</span><span class="n">w</span> <span class="o">*</span> <span class="mf">0.68</span><span class="p">),</span> <span class="nb">int</span><span class="p">(</span><span class="n">h</span> <span class="o">*</span> <span class="mf">0.80</span><span class="p">),</span> <span class="n">w</span><span class="p">,</span> <span class="n">h</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">r</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span> <span class="n">b</span> <span class="o">=</span> <span class="n">ImageStat</span><span class="o">.</span><span class="n">Stat</span><span class="p">(</span><span class="n">crop</span><span class="o">.</span><span class="n">convert</span><span class="p">(</span><span class="s2">"RGB"</span><span class="p">))</span><span class="o">.</span><span class="n">mean</span><span class="p">[:</span><span class="mi">3</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Rec. 709 weights applied to the mean channel values (0-255 scale).</span>
</span></span><span class="line"><span class="cl">    <span class="n">luminance</span> <span class="o">=</span> <span class="mf">0.2126</span> <span class="o">*</span> <span class="n">r</span> <span class="o">+</span> <span class="mf">0.7152</span> <span class="o">*</span> <span class="n">g</span> <span class="o">+</span> <span class="mf">0.0722</span> <span class="o">*</span> <span class="n">b</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">luminance</span> <span class="o">&lt;</span> <span class="mi">140</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="p">(</span><span class="mi">255</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">245</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">95</span><span class="p">)</span>  <span class="c1"># white text, dark pill</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">245</span><span class="p">),</span> <span class="p">(</span><span class="mi">255</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">95</span><span class="p">)</span>  <span class="c1"># black text, light pill</span>
</span></span></code></pre></div><p>Eight lines of PIL. In this batch every photo sampled dark — gardens are mostly green and shadow in the corner — so every output got the dark pill. The bright-pill branch is still there, waiting for a photo with sky or a light wall in the corner.</p><p><strong>Save with fallback.</strong> HEIC writes occasionally fail for reasons that aren&rsquo;t worth diagnosing in a personal tool. 
The save function tries the original format first; if PIL throws, it quietly drops to JPEG with the same filename stem. Eight more lines. On this batch, <strong>3 of 35 fell back to JPEG</strong>. Without the fallback those three would have been a stack trace and a half-finished folder. With it, thirty-five of thirty-five made it through.</p><h2 id="what-id-add-next">What I&rsquo;d add next</h2><p>Multi-tag support, so a photo can be <code>#lorikeet #bottlebrush</code> when the bird showed up in the bottlebrush. EXIF preservation through the round-trip — right now PIL strips most of the metadata, which I don&rsquo;t love. A tiny review UI to replace the CSV step, either a Tkinter window or a one-page localhost app. Smarter candidate lists for the local provider, scoped by season or geography — Sydney summer has a different vocabulary than European spring.</p><p>None of these are urgent enough to displace &ldquo;the script already does what I wanted.&rdquo;</p><h2 id="closing-observations">Closing observations</h2><p>Three lessons that generalise beyond this script.</p><p><strong>Human-in-the-loop is cheap and underrated.</strong> The CSV seam between propose and apply takes thirty seconds per batch and saves me from confidently wrong outputs. For any task where a model is confident-but-wrong some non-trivial fraction of the time — RAG, codegen, moderation, enterprise copilots, agentic workflows — a structured review step pays for itself almost immediately. The CSV doesn&rsquo;t have to be elegant. It has to exist.</p><p><strong>Pluggable providers are worth the small abstraction tax even on personal tools.</strong> I went from local CLIP to <code>gpt-4o-mini</code> to Grok in the space of one afternoon without rewriting the rendering code. The interface is <code>(client, model, image) → (label, score)</code> and that&rsquo;s it. 
Once you&rsquo;ve paid that cost, you can keep up with a fast-moving model market essentially for free.</p><p><strong>Small touches decide whether a script feels finished.</strong> Luminance-aware contrast and a save-format fallback don&rsquo;t change what the tool does; they change how the output reads.</p><p>The model wasn&rsquo;t the product. The seam was.</p><hr><p>A short reel of the tagged photos in the wild: <a href="https://www.instagram.com/reel/DVhMhwIE9YKZ-4xtOHeud3rY1IO2x_3OeGzr9M0/">Instagram reel</a>.</p>
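<p>The propose/apply seam described above can be sketched with nothing but the standard library. This is a minimal illustration under stated assumptions, not the script itself: <code>propose</code>, <code>apply_tags</code>, <code>classify</code>, and <code>render</code> are hypothetical names, and the vision provider and the pill renderer are passed in as plain callables so the CSV handling stands on its own.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import csv
from pathlib import Path

# The five columns named in the post.
COLUMNS = ["image_path", "label", "score", "suggested_tag", "final_tag"]
# Assumed set of extensions; the post mentions HEIC, JPG, and PNG.
IMAGE_SUFFIXES = {".heic", ".jpg", ".jpeg", ".png"}


def propose(folder, classify, csv_path):
    """Write one CSV row per image; final_tag starts as a copy of suggested_tag."""
    images = sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        for path in images:
            label, score = classify(path)  # provider call: image -> (label, score)
            tag = label.strip().lower().replace(" ", "_")
            writer.writerow({
                "image_path": str(path),
                "label": label,
                "score": f"{score:.2f}",
                "suggested_tag": tag,
                "final_tag": tag,
            })


def apply_tags(csv_path, render):
    """Render each row using whatever the reviewer left in final_tag."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # An emptied final_tag falls back to the model's suggestion.
            tag = row["final_tag"].strip() or row["suggested_tag"]
            render(Path(row["image_path"]), "#" + tag)
</code></pre>
<p>Because the only contract between the two passes is the CSV file, the review step can be a spreadsheet, a text editor, or a later UI without either function changing.</p>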
]]></content:encoded><media:content url="https://curiousbit.netlify.app/images/ctp/bottlebrush.jpg" medium="image"><media:title type="plain">Build-Log</media:title></media:content><category>automation</category><category>artificial-intelligence</category><category>build-log</category><category>engineering</category><category>Knowledge Base</category></item></channel></rss>