The paper describes the background, the expected functionalities, and the architecture design goals of the European Digital Mathematics Library (EuDML), an infrastructure system intended to integrate the mathematical content available online throughout Europe, allowing for both extensive and specialized mathematics resource discovery. The three-year project to build the EuDML, partially funded by the European Commission, started in February 2010.
The World Wide Web has become the main source of mathematical knowledge. Currently available full-text search engines can be used on the mathematical documents published there, but they are deficient in almost all cases. By applying axioms and equivalence transformations, and by using different notation, each formula can be expressed in numerous ways. Most of these documents do not contain semantic information; therefore, precise mathematical interpretation is impossible. On the other hand, semantic information can help to give more precise information....
As more and more scientific documents become available in PDF format, their automatic analysis becomes increasingly important. We present a procedure that extracts mathematical symbols from PDF documents by examining both the original PDF file and a rasterized version. This provides more precise information than is available either directly from the PDF file or by traditional character recognition techniques. The data can then be used to improve mathematical parsing methods that transform the mathematics...