Unicode Explorer using binary search over fetch() HTTP range requests

Overview

Simon Willison built a Unicode character lookup tool that demonstrates how HTTP range requests enable efficient binary search over a large remote file without downloading the entire dataset. The tool searches 76.6MB of Unicode metadata by fetching only the small byte ranges needed for each binary search step.
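The range-request binary search described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the browser fetch() code the tool itself uses, and it assumes a hypothetical file layout that the source does not confirm: fixed-width 32-byte ASCII records, sorted by codepoint, each starting with a 6-digit hex codepoint. The names RECORD_SIZE, make_memory_reader, make_http_reader and find_codepoint are made up for this example.

```python
from typing import Callable, Optional, Tuple

RECORD_SIZE = 32  # ASSUMPTION: fixed record width in bytes (hypothetical layout)

# A range reader takes inclusive byte offsets, like "Range: bytes=start-end".
RangeReader = Callable[[int, int], bytes]


def make_memory_reader(data: bytes) -> RangeReader:
    """Stand-in for a remote file: serves inclusive byte ranges from memory."""
    def read_range(start: int, end: int) -> bytes:
        return data[start:end + 1]
    return read_range


def make_http_reader(url: str) -> RangeReader:
    """Range reader backed by real HTTP range requests (not exercised below)."""
    import urllib.request

    def read_range(start: int, end: int) -> bytes:
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            if resp.status != 206:  # 206 Partial Content: the range was honored
                raise RuntimeError(f"Range header ignored (status {resp.status})")
            return resp.read()
    return read_range


def find_codepoint(
    read_range: RangeReader, total_size: int, target: int
) -> Tuple[Optional[str], int]:
    """Binary search over records; returns (record text or None, reads made)."""
    lo, hi = 0, total_size // RECORD_SIZE - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        start = mid * RECORD_SIZE
        # Fetch exactly one record instead of the whole file.
        record = read_range(start, start + RECORD_SIZE - 1).decode("ascii")
        steps += 1
        codepoint = int(record[:6], 16)
        if codepoint == target:
            return record.rstrip(), steps
        if codepoint < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps


# Tiny in-memory sample: (codepoint, general category), sorted by codepoint.
SAMPLE = [(0x41, "Lu"), (0xF8, "Ll"), (0x3A9, "Lu"), (0x1F99C, "So")]
data = b"".join(
    (f"{cp:06X} {cat}".ljust(RECORD_SIZE - 1) + "\n").encode("ascii")
    for cp, cat in SAMPLE
)
reader = make_memory_reader(data)
record, steps = find_codepoint(reader, len(data), 0x1F99C)
print(record, steps)
```

Passing the reader in as a function keeps the search logic independent of the transport, so the same loop can run against the in-memory sample here or against a real URL through make_http_reader. Each step reads one record, so the number of requests grows with the logarithm of the record count rather than with the file size.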

The Breakdown

Binary search over HTTP range requests - fetches only specific byte ranges from a remote 76.6MB Unicode file instead of downloading the entire dataset, completing searches in ~17 steps with under 4KB transferred
HTTP compression compatibility issues - range requests don’t combine well with compressed responses, because byte offsets in a compressed stream no longer correspond to offsets in the original file; CDNs like Cloudflare automatically disable compression when a Range header is present
AI-assisted development workflow - used Claude to brainstorm use cases for binary search, generate specifications, and implement the working code through an asynchronous research process
Unicode codepoint lookup mechanism - searches sorted Unicode metadata by character input (like ‘ø’) or hex codepoint (like ‘1F99C’) to return character information including category and Unicode block
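The two input forms mentioned in the last item, a literal character or a hex codepoint, both have to be reduced to a single integer before any search can run. Below is a small Python sketch of one way to do that. The function name parse_query, the optional "U+" prefix, and the rule that any single-character input is treated as a literal character are assumptions made for illustration, not details taken from the tool.

```python
def parse_query(query: str) -> int:
    """Return the codepoint for a single character or a hex codepoint string."""
    text = query.strip()
    # ASSUMPTION: one character always means "look up this character",
    # so "A" is U+0041 and not the hex value 0xA.
    if len(text) == 1:
        return ord(text)
    # ASSUMPTION: accept an optional "U+" prefix such as "U+1F99C".
    if text.upper().startswith("U+"):
        text = text[2:]
    codepoint = int(text, 16)  # raises ValueError for non-hex input
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError(f"codepoint out of Unicode range: {query!r}")
    return codepoint


print(hex(parse_query("ø")))
print(hex(parse_query("1F99C")))
```

This sketch ignores inputs made of several codepoints, such as an emoji followed by a variation selector; those would reach the hex branch and raise ValueError.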