Using Discourse Context to Interpret Object-Denoting Mathematical Expressions

Wolska, Magdalena; Grigore, Mihai; Kohlhase, Michael

  • Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011, Publisher: Masaryk University Press (Brno, Czech Republic), pages 85-101

Abstract

We present a method for determining the context-dependent denotation of simple object-denoting mathematical expressions in mathematical documents. Our approach relies on estimating the similarity between the linguistic context within which the given expression occurs and a set of terms from a flat domain taxonomy of mathematical concepts; one of 7 head concepts dominating a set of terms with highest similarity score to the symbol’s context is assigned as the symbol’s interpretation. The taxonomy we used was constructed semi-automatically by combining structural and lexical information from the Cambridge Mathematics Thesaurus and the Mathematics Subject Classification. The context information taken into account in the statistical similarity calculation includes lexical features of the discourse immediately adjacent to the given expression as well as global discourse. In particular, as part of the latter we include the lexical context of structurally similar expressions throughout the document and that of the symbol’s declaration statement if one can be found in the document. Our approach has been evaluated on a gold standard manually annotated by experts, achieving 66% precision.
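The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the taxonomy contents, the function names, the `global_weight` parameter, and the use of token-overlap cosine similarity in place of the paper's statistical similarity measure are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical flat taxonomy: head concept -> terms it dominates.
# The paper uses 7 head concepts; three invented ones are shown here.
TAXONOMY = {
    "function": ["continuous function", "mapping", "differentiable function"],
    "set": ["open set", "subset", "finite set"],
    "number": ["prime number", "integer", "real number"],
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def interpret(local_context, global_context=(), global_weight=0.5):
    """Return (best head concept, score per head concept).

    local_context: words adjacent to the expression.
    global_context: words from elsewhere in the document, for example the
    context of structurally similar expressions or a declaration statement.
    global_weight: assumed down-weighting of global words (not from the paper).
    """
    context = Counter()
    for word in local_context:
        context[word.lower()] += 1.0
    for word in global_context:
        context[word.lower()] += global_weight

    # Score each head concept by its best-matching dominated term.
    scores = {
        head: max(cosine(context, Counter(term.split())) for term in terms)
        for head, terms in TAXONOMY.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


best, scores = interpret("let f be a continuous mapping".split(), ["function"])
print(best, scores)
```

In this toy run the local words "continuous" and "mapping" and the global word "function" overlap only with terms under the hypothetical "function" head concept, so that concept is selected. A realistic system would replace the cosine stand-in with a corpus-based similarity measure.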

How to cite


Wolska, Magdalena, Grigore, Mihai, and Kohlhase, Michael. "Using Discourse Context to Interpret Object-Denoting Mathematical Expressions." Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011. Brno, Czech Republic: Masaryk University Press, 2011. 85-101. <http://eudml.org/doc/221783>.

@inProceedings{Wolska2011,
abstract = {We present a method for determining the context-dependent denotation of simple object-denoting mathematical expressions in mathematical documents. Our approach relies on estimating the similarity between the linguistic context within which the given expression occurs and a set of terms from a flat domain taxonomy of mathematical concepts; one of 7 head concepts dominating a set of terms with highest similarity score to the symbol’s context is assigned as the symbol’s interpretation. The taxonomy we used was constructed semi-automatically by combining structural and lexical information from the Cambridge Mathematics Thesaurus and the Mathematics Subject Classification. The context information taken into account in the statistical similarity calculation includes lexical features of the discourse immediately adjacent to the given expression as well as global discourse. In particular, as part of the latter we include the lexical context of structurally similar expressions throughout the document and that of the symbol’s declaration statement if one can be found in the document. Our approach has been evaluated on a gold standard manually annotated by experts, achieving 66% precision.},
author = {Wolska, Magdalena and Grigore, Mihai and Kohlhase, Michael},
booktitle = {Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011},
location = {Brno, Czech Republic},
pages = {85--101},
publisher = {Masaryk University Press},
title = {Using Discourse Context to Interpret Object-Denoting Mathematical Expressions},
url = {http://eudml.org/doc/221783},
year = {2011},
}

TY - CLSWK
AU - Wolska, Magdalena
AU - Grigore, Mihai
AU - Kohlhase, Michael
TI - Using Discourse Context to Interpret Object-Denoting Mathematical Expressions
T2 - Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011
PY - 2011
CY - Brno, Czech Republic
PB - Masaryk University Press
SP - 85
EP - 101
AB - We present a method for determining the context-dependent denotation of simple object-denoting mathematical expressions in mathematical documents. Our approach relies on estimating the similarity between the linguistic context within which the given expression occurs and a set of terms from a flat domain taxonomy of mathematical concepts; one of 7 head concepts dominating a set of terms with highest similarity score to the symbol’s context is assigned as the symbol’s interpretation. The taxonomy we used was constructed semi-automatically by combining structural and lexical information from the Cambridge Mathematics Thesaurus and the Mathematics Subject Classification. The context information taken into account in the statistical similarity calculation includes lexical features of the discourse immediately adjacent to the given expression as well as global discourse. In particular, as part of the latter we include the lexical context of structurally similar expressions throughout the document and that of the symbol’s declaration statement if one can be found in the document. Our approach has been evaluated on a gold standard manually annotated by experts, achieving 66% precision.
UR - http://eudml.org/doc/221783
ER -

