Displaying similar documents to “Mathematical Formulae Recognition and Logical Structure Analysis of Mathematical Papers”

Web Interface and Collection for Mathematical Retrieval : WebMIaS and MREC

Líška, Martin; Sojka, Petr; Růžička, Michal; Mravec, Petr

We demonstrate searching for mathematical expressions in technical digital libraries on the MREC collection of 439,423 real scientific documents containing more than 158 million mathematical formulae. Our solution, the WebMIaS system, allows the retrieval of mathematical expressions written in TeX or MathML. TeX queries are converted on the fly into tree representations of Presentation MathML, which is used for indexing. WebMIaS allows complex queries composed of plain text and mathematical formulae,...

Extracting Precise Data on the Mathematical Content of PDF Documents

Baker, Josef B.; Sexton, Alan P.; Sorge, Volker

As more and more scientific documents become available in PDF format, their automatic analysis becomes increasingly important. We present a procedure that extracts mathematical symbols from PDF documents by examining both the original PDF file and a rasterized version. This provides more precise information than is available either directly from the PDF file or by traditional character recognition techniques. The data can then be used to improve mathematical parsing methods that transform...