OCR Rate Computation in Mass Digitization Programs

Geneviève Cron

Displaying similar documents to “OCR Rate Computation in Mass Digitization Programs”

A software tool for searching in binary text images

Nikolay Kirov (2008)

Review of the National Center for Digitization

Scientific Word, version 1.0. A software review.

Semen Köksal (1993)

Journal of Applied Mathematics and Stochastic Analysis

Optical Character Recognition of Historical Texts: End-User Focused Research for Slovenian Books and Newspapers from the 18th and 19th Century

Ines Jerele, Tomaž Erjavec, Daša Pokorn, Alenka Kavčič-Čolić (2012)

Review of the National Center for Digitization

MathType version 3.0. A software review.

David W. Clay (1993)

Journal of Applied Mathematics and Stochastic Analysis

Word Image Matching In Bulgarian Historical Documents

Andrey Andreev, Nikolay Kirov (2006)

Review of the National Center for Digitization

Noun Sense Disambiguation using Co-Occurrence Relation in Machine Translation

Changil Choe, Hyonil Kim (2012)

Serdica Journal of Computing

Word Sense Disambiguation, the process of identifying the meaning of a word in a sentence when the word has multiple meanings, is a critical problem of machine translation. It is generally very difficult to select the correct meaning of a word in a sentence, especially when the syntactical difference between the source and target language is big, e.g., English-Korean machine translation. To achieve a high level of accuracy of noun sense selection in machine translation, we introduced...

Hausdorff Distances for Searching in Binary Text Images

Andrey Andreev, Nikolay Kirov (2009)

Serdica Journal of Computing

This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science. Hausdorff distance (HD) seems the most efficient instrument for measuring how far two compact non-empty subsets of a metric space are from each other. This paper considers the possibilities provided by HD and some of its modifications used recently by many authors for resemblance between binary text images. Summarizing part of the existing word image matching...

Correcting spelling errors by modelling their causes

Sebastian Deorowicz, Marcin Ciura (2005)

International Journal of Applied Mathematics and Computer Science

This paper accounts for a new technique of correcting isolated words in typed texts. A language-dependent set of string substitutions reflects the surface form of errors that result from vocabulary incompetence, misspellings, or mistypings. Candidate corrections are formed by applying the substitutions to text words absent from the computer lexicon. A minimal acyclic deterministic finite automaton storing the lexicon allows quick rejection of nonsense corrections, while costs associated...

An isolated word recognition system based on a low-complexity parametrization procedure.

Héctor Rulot Segovia, Enrique Vidal Ruiz, Francisco Casacuberta Nolla (1984)

Qüestiió

An Isolated Word Recognition System is presented in this paper which uses a parametrization scheme based on the two-level clipped signal Autocorrelation Function. The system prototype runs on a 64 kby. partition of a general-purpose minicomputer with quite small specific hardware requirements and, for moderate sized dictionaries (≤ 40 words), gives 95-98% recognition rates with response times better than two times real time. The system uses classical Dynamic Programming word-matching,...

The text encoding initiative

Matthew J. Driscoll (2004)

Review of the National Center for Digitization

Some variants of Hausdorff distance for word matching

Andrey Andreev, Nikolay Kirov (2008)

Review of the National Center for Digitization
