3 posts
Mistral OCR 4 and Baidu's Unlimited OCR both hit Hacker News today. The useful takeaway for developers is that OCR is no longer just text extraction. It is becoming a runtime decision for document agents.
Baidu releases Unlimited OCR, an open-source vision-language model that parses 100+ page documents in a single pass without memory blowup. Here's what developers need to know.
How to ship Claude's vision API in production: OCR, charts, UI audits, real cost numbers, TypeScript SDK code, and the gotchas that bite at 100k images a month.
