Foreword ICTCS 2010 special issue
Experience in setting up a workflow that turns scanned images of mathematical papers into a fully fledged mathematical library is described, using the Czech Digital Mathematics Library (DML-CZ) project as an example. An overview of the whole process is given, with a description of all main production steps. DML-CZ has recently been launched to the public with more than 100,000 digitized pages.
This paper presents an alternative interface for browsing in the Czech Digital Mathematics Library (DML-CZ) using our Visual Browser web browsing tool. Using dynamic visualization, we have created a tool for browsing the library graphically. Visualization can help users orient themselves in complex data and at the same time reveal sometimes unexpected relationships among units; it at least speeds up browsing. This work follows the metadata processing undertaken on DML-CZ and visualizes all reasonable...
High-Energy Physics (HEP) has a long tradition in pioneering infrastructures for scholarly communication, and four leading laboratories are now rolling out the next-generation digital library for the field: INSPIRE. This is an evolution of the extraordinarily successful, 40-year-old SPIRES database. Based on the Invenio software, INSPIRE already provides seamless access to almost 1 million records, which will be expanded to cover multimedia, data, software, and wikis. Services offered include citation...
Entering mathematical queries, in general, can be a demanding task. Mathematical notation is two-dimensional and cannot be easily typed with a standard QWERTY keyboard. Handwriting appears to be the most intuitive and promising method to express mathematical queries. Recognition technology for handwritten mathematical notation has never been applied in math search. The objective of this research is to design and implement an automated symbol-recognition error compensation system for the handwritten-based...
At an exclusively online university such as the UOC, the need to communicate mathematics on the web is pressing. In an environment that does not allow for face-to-face communication, things implicitly communicated when using a blackboard, such as the canonical verbalization or handwriting of formulae, are lost, which becomes a big obstacle. Also, the editorial process for the creation of learning/teaching resources is suited for a generalist approach and, consequently, needs such as...
Earlier work has examined the frequency of symbol and expression use in mathematical documents for various purposes including mathematical handwriting recognition and forming the most natural output from computer algebra systems. This work has found, unsurprisingly, that the particulars of symbol and expression use vary from area to area and, in particular, between different top-level subjects of the 2000 Mathematics Subject Classification. If the area of mathematics is known in advance, then an area-specific...
We present a summary of our work in progress related to mathematical formulae recognition. Our approach is based on the structural construction paradigm and two-dimensional grammars. It is a general framework and can be successfully used in the analysis of images containing objects exhibiting rich structural relations. In contrast to most other known approaches, the method does not treat symbol segmentation and structural analysis as two separate processes. This allows the system to solve...
In most cases, current on-line journals in mathematics are supplied as PDF files with print images of the papers in front and hidden OCR'ed text behind, which provides a keyword search facility. The embedded hidden text usually does not include good information about the mathematical formulae in the papers. For the future development of DML, it is therefore desirable to include, in the digitised journals, more structured information about the content of mathematical papers, e.g....
For preparing and validating metadata for the Digital Mathematics Library DML-CZ, a new tool, the Metadata Editor, has been developed. This paper outlines the procedures for the linguistic and geographical localization of its components. Also mentioned are such aspects as the dynamic generation of editing forms based on the XML Schema, the validation procedures, and support for semiautomatic quality assurance procedures.
The YADDA framework facilitates information exchange between digital document repositories. YaddaWeb, its web-based interface, provides browse and search functionalities. Content providers use the DeskLight application to add or modify metadata and content. Internally, YADDA contains flexible repository aggregation mechanisms, multiple hierarchy support and full-text indexing capabilities. The YADDA framework is an excellent solution for the Open Access paradigm of content exchange. Migration of the Mathematical...
This paper describes several innovative PDF document enhancements and tools that can be used when building a digital library. The main result presented in this paper is a PDF re-compression tool called pdfJbIm, developed using the jbig2enc encoder. This re-compression tool reduces the size of the original bitonal PDFs by one third on average. Some modifications to the jbig2enc encoder that increase the compression ratio even further are also described here. Together with another...