Hausdorff Distances for Searching in Binary Text Images

Andreev, Andrey; Kirov, Nikolay

Serdica Journal of Computing (2009)

  • Volume: 3, Issue: 1, page 23-46
  • ISSN: 1312-6555

Abstract

This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science. Hausdorff distance (HD) seems the most efficient instrument for measuring how far two compact non-empty subsets of a metric space are from each other. This paper considers the possibilities provided by HD and some of its modifications, used recently by many authors, for measuring resemblance between binary text images. Summarizing part of the existing word image matching methods that rely on HD, we investigate a new, similar parameterized method which contains almost all of them as particular cases. Numerical experiments for searching words in binary text images are carried out with 333 pages of old Bulgarian typewritten text, 200 printed pages of Bulgarian Chrestomathy from year 1884, and 200 handwritten pages of Slavonic manuscript from year 1574. They outline how the parameters must be set in order to use the advantages of the proposed method for the purposes of word matching in scanned document images.
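
For orientation, the classical Hausdorff distance mentioned in the abstract can be sketched for two small binary images as follows. This is only a minimal illustration of the standard definition, not the parameterized method proposed in the paper. The function names and the two 2x3 bitmaps are hypothetical and chosen for the example, and the Euclidean metric on pixel coordinates is an assumption.

```python
from math import hypot


def foreground_pixels(image):
    """Return the (row, col) coordinates of all non-zero pixels."""
    return [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`.

    Both point lists must be non-empty, as in the definition of HD.
    """
    return max(min(hypot(ar - br, ac - bc) for br, bc in b) for ar, ac in a)


def hausdorff(a, b):
    """Classical (symmetric) Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two hypothetical binary "word images" that differ in one foreground pixel.
word_a = [[1, 1, 0],
          [0, 1, 0]]
word_b = [[1, 1, 0],
          [0, 0, 1]]

distance = hausdorff(foreground_pixels(word_a), foreground_pixels(word_b))
print(distance)
```

This brute-force version compares every pair of foreground pixels, so it is only practical for small images. Because the maximum is taken over all points, a single stray pixel can dominate the result, which is one common motivation for modified Hausdorff distances.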

How to cite

Andreev, Andrey, and Kirov, Nikolay. "Hausdorff Distances for Searching in Binary Text Images." Serdica Journal of Computing 3.1 (2009): 23-46. <http://eudml.org/doc/11442>.

@article{Andreev2009,
abstract = {This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science. Hausdorff distance (HD) seems the most efficient instrument for measuring how far two compact non-empty subsets of a metric space are from each other. This paper considers the possibilities provided by HD and some of its modifications, used recently by many authors, for measuring resemblance between binary text images. Summarizing part of the existing word image matching methods that rely on HD, we investigate a new, similar parameterized method which contains almost all of them as particular cases. Numerical experiments for searching words in binary text images are carried out with 333 pages of old Bulgarian typewritten text, 200 printed pages of Bulgarian Chrestomathy from year 1884, and 200 handwritten pages of Slavonic manuscript from year 1574. They outline how the parameters must be set in order to use the advantages of the proposed method for the purposes of word matching in scanned document images.},
author = {Andreev, Andrey and Kirov, Nikolay},
journal = {Serdica Journal of Computing},
keywords = {Hausdorff Distance; Binary Text Image; Word Matching; word image matching methods},
language = {eng},
number = {1},
pages = {23--46},
publisher = {Institute of Mathematics and Informatics Bulgarian Academy of Sciences},
title = {Hausdorff Distances for Searching in Binary Text Images},
url = {http://eudml.org/doc/11442},
volume = {3},
year = {2009},
}

TY - JOUR
AU - Andreev, Andrey
AU - Kirov, Nikolay
TI - Hausdorff Distances for Searching in Binary Text Images
JO - Serdica Journal of Computing
PY - 2009
PB - Institute of Mathematics and Informatics Bulgarian Academy of Sciences
VL - 3
IS - 1
SP - 23
EP - 46
AB - This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science. Hausdorff distance (HD) seems the most efficient instrument for measuring how far two compact non-empty subsets of a metric space are from each other. This paper considers the possibilities provided by HD and some of its modifications, used recently by many authors, for measuring resemblance between binary text images. Summarizing part of the existing word image matching methods that rely on HD, we investigate a new, similar parameterized method which contains almost all of them as particular cases. Numerical experiments for searching words in binary text images are carried out with 333 pages of old Bulgarian typewritten text, 200 printed pages of Bulgarian Chrestomathy from year 1884, and 200 handwritten pages of Slavonic manuscript from year 1574. They outline how the parameters must be set in order to use the advantages of the proposed method for the purposes of word matching in scanned document images.
LA - eng
KW - Hausdorff Distance; Binary Text Image; Word Matching; word image matching methods
UR - http://eudml.org/doc/11442
ER -
