Web Interface and Collection for Mathematical Retrieval : WebMIaS and MREC

Líška, Martin; Sojka, Petr; Růžička, Michal; Mravec, Petr

  • Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011, Publisher: Masaryk University Press (Brno, Czech Republic), pages 77-84

Abstract

We demonstrate searching of mathematical expressions in technical digital libraries on the MREC collection of 439,423 real scientific documents with more than 158 million mathematical formulae. Our solution, the WebMIaS system, allows the retrieval of mathematical expressions written in TeX or MathML. TeX queries are converted on the fly into tree representations of Presentation MathML, which is used for indexing. WebMIaS allows complex queries composed of plain text and mathematical formulae, using MIaS (Math Indexer and Searcher), a math-aware search engine based on the state-of-the-art system Lucene. MIaS implements proximity math indexing with a subformulae similarity search.
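
To make the idea of subformulae search over Presentation MathML trees more concrete, here is a minimal Python sketch. It is not the MIaS implementation: the function names `canonical` and `subformulae`, the string key format, and the depth-based `decay` weighting are all assumptions chosen for illustration. It only shows how every subtree of a formula could be turned into an indexable term, so that a query can match part of a larger expression.

```python
import xml.etree.ElementTree as ET


def canonical(node):
    """Serialize a MathML subtree into a compact string key (hypothetical format)."""
    tag = node.tag.split('}')[-1]  # drop an XML namespace prefix if present
    if len(node) == 0:
        return f"{tag}({(node.text or '').strip()})"
    return f"{tag}[" + ",".join(canonical(child) for child in node) + "]"


def subformulae(mathml, decay=0.5):
    """Return (key, weight) pairs for every subtree, in document order.

    The weight shrinks with depth; this weighting is an assumption for
    illustration, not the scheme described in the paper.
    """
    root = ET.fromstring(mathml)
    terms = []

    def walk(node, depth):
        terms.append((canonical(node), decay ** depth))
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return terms


query = "<math><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></math>"
terms = subformulae(query)
```

In a setup like this, the `(key, weight)` pairs for each formula would be handed to a full-text engine as weighted terms, so a query for a small expression can still retrieve documents where it appears inside a larger formula.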

How to cite


Líška, Martin, et al. "Web Interface and Collection for Mathematical Retrieval : WebMIaS and MREC." Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011. Brno, Czech Republic: Masaryk University Press, 2011. 77-84. <http://eudml.org/doc/220598>.

@inProceedings{Líška2011,
abstract = {We demonstrate searching of mathematical expressions in technical digital libraries on a MREC collection of 439,423 real scientific documents with more than 158 million mathematical formulae. Our solution—the WebMIaS system—allows the retrieval of mathematical expressions written in TeX or MathML. TeX queries are converted on-the-fly into tree representations of Presentation MathML, which is used for indexing. WebMIaS allows complex queries composed of plain text and mathematical formulae, using MIaS (Math Indexer and Searcher), a math aware search engine based on the state-of-the-art system Lucene. MIaS implements proximity math indexing with a subformulae similarity search.},
author = {Líška, Martin and Sojka, Petr and Růžička, Michal and Mravec, Petr},
booktitle = {Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011},
keywords = {math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; WebMIaS; MIaS; Tralics; TEX; UMCL; Lucene},
location = {Brno, Czech Republic},
pages = {77-84},
publisher = {Masaryk University Press},
title = {Web Interface and Collection for Mathematical Retrieval : WebMIaS and MREC},
url = {http://eudml.org/doc/220598},
year = {2011},
}

TY - CLSWK
AU - Líška, Martin
AU - Sojka, Petr
AU - Růžička, Michal
AU - Mravec, Petr
TI - Web Interface and Collection for Mathematical Retrieval : WebMIaS and MREC
T2 - Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011
PY - 2011
CY - Brno, Czech Republic
PB - Masaryk University Press
SP - 77
EP - 84
AB - We demonstrate searching of mathematical expressions in technical digital libraries on a MREC collection of 439,423 real scientific documents with more than 158 million mathematical formulae. Our solution—the WebMIaS system—allows the retrieval of mathematical expressions written in TeX or MathML. TeX queries are converted on-the-fly into tree representations of Presentation MathML, which is used for indexing. WebMIaS allows complex queries composed of plain text and mathematical formulae, using MIaS (Math Indexer and Searcher), a math aware search engine based on the state-of-the-art system Lucene. MIaS implements proximity math indexing with a subformulae similarity search.
KW - math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; WebMIaS; MIaS; Tralics; TEX; UMCL; Lucene
UR - http://eudml.org/doc/220598
ER -

