We demonstrate searching of mathematical expressions in technical digital libraries on a MREC collection of 439,423 real scientific documents with more than 158 million mathematical formulae. Our solution—the WebMIaS system—allows the retrieval of mathematical expressions written in TeX or MathML. TeX queries are converted on-the-fly into tree representations of Presentation MathML, which is used for indexing. WebMIaS allows complex queries composed of plain text and mathematical formulae, using...
In this work-in-progress report we propose a workflow for metadata extraction from articles in a digital form. We decompose the problem into clearly defined sub-tasks and outline possible implementations of the sub-tasks. We report the progress of implementation and tests, and state future work.