Local VLMs Revolutionize Document Redaction with Bounding Box OCR for PII Protection
A new open-source application uses Qwen 3 VL and hybrid OCR techniques to detect and redact personally identifiable information in documents entirely on local hardware. The system, developed by independent researcher Sean Pedrick, combines visual-language models with traditional OCR to improve accuracy on handwritten and degraded text.

An open-source application has demonstrated that consumer-grade GPUs can run accurate document redaction workflows locally using visual-language models (VLMs). Developed by independent researcher Sean Pedrick and detailed in a public blog post, the Document Redaction App integrates Qwen 3 VL 8B Instruct, a compact multimodal model, to detect and redact personally identifiable information (PII) with bounding box localization, a critical requirement for legal and compliance workflows.
Unlike conventional OCR systems that output only text, redaction requires spatial awareness: knowing exactly where a name, Social Security number, or signature appears on a page so that it can be obscured without altering surrounding content. Pedrick’s system addresses this with two distinct approaches: a pure VLM method that extracts text and bounding boxes directly from scanned documents, and a hybrid method that pairs PaddleOCR’s high-precision line-level detection with Qwen 3 VL’s contextual intelligence to resolve ambiguous or degraded text, particularly in handwritten notes.
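One way to picture the hybrid method is as confidence-based routing followed by box estimation. The sketch below is illustrative only and is not taken from the Document Redaction App: the OcrLine type, the vlm_read callback, the 0.8 threshold, and the uniform-character-width assumption are all hypothetical, and no real PaddleOCR or Qwen 3 VL call is made.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


@dataclass
class OcrLine:
    text: str
    box: Box
    confidence: float  # 0.0 to 1.0, as an OCR engine might report it


# Simple US Social Security number pattern, used here as the only PII rule.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def resolve_lines(
    lines: List[OcrLine],
    vlm_read: Callable[[Box], str],
    threshold: float = 0.8,  # assumed cut-off, not a value from the article
) -> List[OcrLine]:
    """Keep confident OCR lines; send the rest to a VLM for a second read."""
    resolved = []
    for line in lines:
        if line.confidence < threshold:
            # Hypothetical callback: re-reads the region inside line.box.
            line = OcrLine(vlm_read(line.box), line.box, line.confidence)
        resolved.append(line)
    return resolved


def redaction_boxes(lines: List[OcrLine]) -> List[Box]:
    """Estimate a box for each SSN-like match from its character offsets.

    Assumes characters are roughly evenly spaced across the line box.
    """
    boxes = []
    for line in lines:
        x0, y0, x1, y1 = line.box
        width_per_char = (x1 - x0) / max(len(line.text), 1)
        for match in SSN_PATTERN.finditer(line.text):
            left = round(x0 + match.start() * width_per_char)
            right = round(x0 + match.end() * width_per_char)
            boxes.append((left, y0, right, y1))
    return boxes


def fake_vlm(box: Box) -> str:
    """Stand-in for a real model call; returns a fixed corrected reading."""
    return "SSN: 123-45-6789"


lines = [
    OcrLine("Name: Jane Doe", (10, 10, 150, 30), 0.97),
    OcrLine("SSN: l23-4S-67B9", (10, 40, 170, 60), 0.41),  # garbled OCR read
]
resolved = resolve_lines(lines, fake_vlm)
boxes = redaction_boxes(resolved)
print(boxes)
```

In this toy run only the second line falls below the threshold, so only that region is re-read, and the resulting box covers the number rather than the "SSN:" label. A real pipeline would more likely use word-level boxes than a character-width estimate.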
According to the developer’s testing, the hybrid approach outperforms pure VLM inference on challenging documents such as faded handwritten forms, smudged signatures, and low-contrast scanned receipts. In one example, a handwritten medical form with overlapping ink and poor contrast was processed with 92% accuracy in identifying PII fields using the hybrid method, compared to 78% with Qwen 3 VL alone. The system also successfully detects non-textual PII, including faces and handwritten signatures, using the VLM’s visual understanding capabilities—a feature rarely implemented in traditional redaction tools.
The application is fully open-source and available on GitHub, with a free, no-installation demo hosted on Hugging Face Spaces where users can upload documents and see redaction results in real time. This accessibility is a departure from enterprise-grade redaction platforms, which often require cloud processing, expensive licensing, and data transmission outside secure environments. Because the application can run entirely locally on GPUs with 24 GB of VRAM or less, law firms, healthcare providers, and government agencies can comply with GDPR, HIPAA, and other privacy regulations without exposing sensitive documents to third-party servers.
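A rough calculation shows why an 8B-parameter model is plausible on a card of this size. The sketch below assumes a nominal 8 billion parameters and counts model weights only; it ignores the KV cache, activations, image-token overhead, and any OCR model loaded alongside, so actual usage will be higher. The precision options listed are common choices, not settings confirmed for this application.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 8e9  # assumed round figure for an "8B" model

# Bytes per parameter for a few common storage precisions.
for label, bytes_per_param in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    gib = weight_memory_gib(NOMINAL_PARAMS, bytes_per_param)
    print(f"{label}: {gib:.1f} GiB of weights")
```

At 16-bit precision the weights come to roughly 15 GiB under these assumptions, which leaves some headroom on a 24 GB card; quantized weights would leave more.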
Pedrick’s work builds on emerging research in semantic document layout analysis, such as the SCAN framework recently published on arXiv, which demonstrates how multimodal models can interpret document structure through both textual and visual cues. While SCAN focuses on retrieval-augmented generation for legal and archival documents, Pedrick’s implementation applies similar principles to the practical, high-stakes domain of data redaction. His use of mid-sized LLMs like Gemma 27B to identify custom entities—such as internal project codes or employee IDs—further extends the system’s utility beyond standard PII categories.
Despite promising results, the system is not yet perfect. Handwritten cursive, overlapping text, and heavily corrupted scans still pose challenges, with error rates hovering around 8–12% in edge cases. Pedrick anticipates that the upcoming Qwen 3.5 VL models, expected to offer improved resolution and reasoning, will significantly close this gap. He has already begun retraining his pipeline with early access versions of the new model and plans to publish updated benchmarks in the coming weeks.
This innovation signals a broader shift in AI-powered compliance: away from cloud-dependent, black-box systems toward transparent, locally executed tools that let organizations protect privacy without sacrificing control. As regulatory scrutiny intensifies globally, solutions like Pedrick’s may become the new standard, not because they are perfect, but because they are auditable, private, and accessible to institutions of all sizes.


