An Online Repository of Mathematical Samples

Baker, Josef B.; Sexton, Alan P.; Sorge, Volker

  • Towards a Digital Mathematics Library. Grand Bend, Ontario, Canada, July 8-9th, 2009. Publisher: Masaryk University Press (Brno), pages 49-57

Abstract

With a growing community of researchers working on the recognition, parsing and digital exploitation of mathematical formulae, a need has arisen for a set of samples or benchmarks which can be used to compare, evaluate and help to develop different implementations and algorithms. The benchmark set would have to cover a wide range of mathematics, contain enough information to be able to search for specific samples and be accessible to the whole community. In this paper, we propose an on-line system and repository where researchers may upload samples of mathematics in various formats such as scanned images, images directly rendered from born-digital documents, or born-digital document extracts. The system will support community tagging of these samples with attributes about their syntactic structure, semantic origin, image quality and source. Each sample in the database may then be searched for by any of its associated attributes, and users could download sets of sorted or random formulae to meet their own requirements. Associated with the system will be freely downloadable tools to assist in extracting and clipping mathematical samples from various kinds of documents to prepare them for uploading. Additionally, the system will allow users to annotate each sample with their own files, in LaTeX, MathML, OpenMath and other formats. The intention here is that these annotation files will correspond either to the recognition results of the users’ own systems on the samples, or manually constructed results. We believe that this facility will help to build a community verified ground truth set, available to anyone accessing the system.

How to cite

Baker, Josef B., Sexton, Alan P., and Sorge, Volker. "An Online Repository of Mathematical Samples." Towards a Digital Mathematics Library. Grand Bend, Ontario, Canada, July 8-9th, 2009. Brno: Masaryk University Press, 2009. 49-57. <http://eudml.org/doc/221836>.

@inProceedings{Baker2009,
abstract = {With a growing community of researchers working on the recognition, parsing and digital exploitation of mathematical formulae, a need has arisen for a set of samples or benchmarks which can be used to compare, evaluate and help to develop different implementations and algorithms. The benchmark set would have to cover a wide range of mathematics, contain enough information to be able to search for specific samples and be accessible to the whole community. In this paper, we propose an on-line system and repository where researchers may upload samples of mathematics in various formats such as scanned images, images directly rendered from born-digital documents, or born-digital document extracts. The system will support community tagging of these samples with attributes about their syntactic structure, semantic origin, image quality and source. Each sample in the database may then be searched for by any of its associated attributes, and users could download sets of sorted or random formulae to meet their own requirements. Associated with the system will be freely downloadable tools to assist in extracting and clipping mathematical samples from various kinds of documents to prepare them for uploading. Additionally, the system will allow users to annotate each sample with their own files, in LaTeX, MathML, OpenMath and other formats. The intention here is that these annotation files will correspond either to the recognition results of the users’ own systems on the samples, or manually constructed results. We believe that this facility will help to build a community verified ground truth set, available to anyone accessing the system.},
author = {Baker, Josef B. and Sexton, Alan P. and Sorge, Volker},
booktitle = {Towards a Digital Mathematics Library. Grand Bend, Ontario, Canada, July 8-9th, 2009},
keywords = {MathML; PDF},
location = {Brno},
pages = {49--57},
publisher = {Masaryk University Press},
title = {An Online Repository of Mathematical Samples},
url = {http://eudml.org/doc/221836},
year = {2009},
}

TY - CLSWK
AU - Baker, Josef B.
AU - Sexton, Alan P.
AU - Sorge, Volker
TI - An Online Repository of Mathematical Samples
T2 - Towards a Digital Mathematics Library. Grand Bend, Ontario, Canada, July 8-9th, 2009
PY - 2009
CY - Brno
PB - Masaryk University Press
SP - 49
EP - 57
AB - With a growing community of researchers working on the recognition, parsing and digital exploitation of mathematical formulae, a need has arisen for a set of samples or benchmarks which can be used to compare, evaluate and help to develop different implementations and algorithms. The benchmark set would have to cover a wide range of mathematics, contain enough information to be able to search for specific samples and be accessible to the whole community. In this paper, we propose an on-line system and repository where researchers may upload samples of mathematics in various formats such as scanned images, images directly rendered from born-digital documents, or born-digital document extracts. The system will support community tagging of these samples with attributes about their syntactic structure, semantic origin, image quality and source. Each sample in the database may then be searched for by any of its associated attributes, and users could download sets of sorted or random formulae to meet their own requirements. Associated with the system will be freely downloadable tools to assist in extracting and clipping mathematical samples from various kinds of documents to prepare them for uploading. Additionally, the system will allow users to annotate each sample with their own files, in LaTeX, MathML, OpenMath and other formats. The intention here is that these annotation files will correspond either to the recognition results of the users’ own systems on the samples, or manually constructed results. We believe that this facility will help to build a community verified ground truth set, available to anyone accessing the system.
KW - MathML
KW - PDF
UR - http://eudml.org/doc/221836
ER -

