HTR Viewer

We conducted a small study on optical character recognition (OCR) with multimodal vision-language models (VLMs), based on 77 handwritten German transcriptions of Wikipedia pages. This application visualizes the OCR results. From top to bottom, each entry shows metadata about the handwriting sample, the image of the handwritten text, the ground truth (the reference transcription), the model's prediction, and an error evaluation using Character Error Rate (CER) and Word Error Rate (WER). At the bottom of the application, on the right-hand side, you will find links to the publicly available data and to a blog post that explains the approach in more detail. Scroll down to see the individual results.

Metadata (Title and URL):


        
        
        

    

Handwritten Text:

Image

Ground Truth:


    

Prediction:


    

Evaluation:

Character Error Rate: %

Word Error Rate: %
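
For readers unfamiliar with the two metrics, the following is a minimal sketch of how CER and WER are commonly defined: the Levenshtein (edit) distance between prediction and ground truth, divided by the length of the ground truth, counted in characters for CER and in whitespace-separated words for WER. This is an assumption about the usual definition, not a description of how this application computes its values; the function names `edit_distance`, `cer`, and `wer` are hypothetical, and no text normalization (case folding, punctuation removal) is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty sequence costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, prediction):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, prediction) / len(reference)


def wer(reference, prediction):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, prediction.split()) / len(ref_words)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the study data.
    ground_truth = "Das ist ein Test"
    predicted = "Das ist ein Tost"
    print(f"Character Error Rate: {cer(ground_truth, predicted):.2%}")
    print(f"Word Error Rate: {wer(ground_truth, predicted):.2%}")
```

Because both rates divide by the length of the reference, a prediction with many inserted characters or words can yield a value above 100%.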