How we handle complex documents

What Does World-Class OCR Look Like at Wordbase?

At Wordbase, we process scanned documents and images with an approach built for two things: reading the text accurately and understanding how the document is laid out.

Optical Character Recognition (OCR) is a technology that enables software to read text from images, scanned documents, and more.

However, OCR is not a one-size-fits-all solution. OCR technologies vary widely in accuracy and speed:

  • Basic OCR: Inexpensive and fast, but prone to misreading characters and words. Those errors are especially costly in legal documents.

  • Advanced OCR: More expensive and slower, and it may still fail to understand document layouts.

Our Commitment to Accuracy

In the legal field, accuracy is paramount. To ensure we provide the best possible solution, we rigorously tested over half a dozen OCR technologies. Our goal was to find the one that delivers the highest level of accuracy and reliability for complex legal documents.

Through that testing, we found that many solutions either misread text or struggled with the layouts and conventions specific to legal documents.

Real-World Example

Consider this real-world deposition transcript, a common document for attorneys.

Deposition Transcript Example

This document contains four distinct pages within one image.

Conventional OCR would treat it as a single continuous page, reading straight across the image and merging sentences from different pages into incomprehensible text.
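To make the failure concrete, here is a small Python sketch. It is an illustration only, not Wordbase's implementation: the `Line` type, the coordinates, the sample text, and the assumption that the four pages sit in a 2x2 grid read left to right, top to bottom are all hypothetical. It compares reading the image as one page with assigning each line to a page first.

```python
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    x: float  # left edge of the line, in pixels (hypothetical values)
    y: float  # top edge of the line, in pixels (hypothetical values)


def naive_order(lines):
    """Treat the image as one page: sort top to bottom, then left to right."""
    return [l.text for l in sorted(lines, key=lambda l: (l.y, l.x))]


def page_aware_order(lines, width, height, cols=2, rows=2):
    """Assign each line to a cell of a rows x cols grid, then read cell by cell."""
    def cell(l):
        col = min(int(l.x // (width / cols)), cols - 1)
        row = min(int(l.y // (height / rows)), rows - 1)
        return row * cols + col

    return [l.text for l in sorted(lines, key=lambda l: (cell(l), l.y, l.x))]


# A made-up 200 x 200 pixel image holding four mini-pages, two lines each.
lines = [
    Line("Q. Name?", 10, 10), Line("A. Smith.", 10, 30),       # page 1, top left
    Line("Q. Age?", 110, 10), Line("A. Forty.", 110, 30),      # page 2, top right
    Line("Q. City?", 10, 110), Line("A. Austin.", 10, 130),    # page 3, bottom left
    Line("Q. Job?", 110, 110), Line("A. Nurse.", 110, 130),    # page 4, bottom right
]

naive = naive_order(lines)
fixed = page_aware_order(lines, width=200, height=200)

print(" ".join(naive))  # questions and answers from different pages interleave
print(" ".join(fixed))  # each question is followed by its own answer
```

In the naive reading, the question from page 1 is followed by the question from page 2, so answers end up attached to the wrong questions. A fixed grid is only a stand-in here; real documents need the page boundaries to be detected rather than assumed.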

OCR Challenges

Conventional OCR struggles with:

  • Different document layouts
  • Multiple languages
  • Blurry images
  • Handwriting

What We Do

At Wordbase, we use one of the best vision models in the world. It reads a document the way a person would, taking in the layout as well as the characters.

It can:

  • Recognize and separate multiple pages within a single image
  • Detect the correct reading order
  • Identify elements like redactions
  • Process handwriting: whether you upload handwritten notes or scanned documents, Wordbase reads them accurately

We are excited to share more about what we are building. We know it will be something you are proud to use.