Extracting Precise Data on the Mathematical Content of PDF Documents

Baker, Josef B.; Sexton, Alan P.; Sorge, Volker (2008)

Towards Digital Mathematics Library. Birmingham, United Kingdom, July 27th, 2008

As more and more scientific documents become available in PDF format, their automatic analysis becomes increasingly important. We present a procedure that extracts mathematical symbols from PDF documents by examining both the original PDF file and a rasterized version. This provides more precise information than is available either directly from the PDF file or by traditional character recognition techniques. The data can then be used to improve mathematical parsing methods that transform the mathematics...

An Online Repository of Mathematical Samples

Baker, Josef B.; Sexton, Alan P.; Sorge, Volker (2009)

Towards a Digital Mathematics Library. Grand Bend, Ontario, Canada, July 8-9th, 2009

With a growing community of researchers working on the recognition, parsing and digital exploitation of mathematical formulae, a need has arisen for a set of samples or benchmarks which can be used to compare, evaluate and help to develop different implementations and algorithms. The benchmark set would have to cover a wide range of mathematics, contain enough information to be able to search for specific samples and be accessible to the whole community. In this paper, we propose an on-line system...

Towards Reverse Engineering of PDF Documents

Baker, Josef B.; Sexton, Alan P.; Sorge, Volker (2011)

Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011

We present a progress report on our ongoing project of reverse engineering scientific PDF documents. The aim is to obtain mathematical markup that can be used as source for regenerating a document that resembles the original as closely as possible. This source can then be a basis for further document processing. Our current tool uses specialised PDF extraction together with image analysis to produce near-perfect input for parsing mathematical formulae. Applying a linear grammar and specific drivers...
