The Complexity of Cutting Complexes.
After an extensive study of the metadata policy of each of its content partners, the EuDML project evaluated many different strategies and existing schemas that could store every detail faithfully, and yet reserve room for the enhancements foreseen in the project’s work plan. The framework provided by the so-called NLM Journal Archiving and Interchange Tag Suite was selected as the best readily available approximation of our needs. Some modifications of it have been endorsed by the project, defining...
The exchange of preprints and journals plays an important role in communicating new research ideas and results in many academic fields. Distribution of preprints and journal articles as electronic files via the Internet has become a primary method alongside paper publication. In the paperless era, electronic preprints and articles should be certified in terms of existence proof and tamper resistance, because they can easily be modified by their site administrator. We developed a secure preprint and...
The workshop’s objectives were to formulate the strategy and goals of a global mathematical digital library and to summarize the current successes and failures of ongoing technologies and related projects. There is already some experience with building smaller DMLs and with building big thematic scientific digital libraries. Why are there already big full-text digital libraries in some domains, such as PubMed in the biomedical domain, but none in others? We pose these and other questions, and try to find...
The DML workshop’s objectives were to formulate the strategy and goals of a global mathematical digital library and to summarize the current successes and failures of ongoing technologies and related projects. There is already experience with building regional DMLs and with building big thematic scientific digital libraries. The EuDML project has reached its halfway point. While there are already big full-text digital libraries in some domains like PubMed Central in the biomedical domain, Inspire in high-energy...
In this paper we propose a flexible, modular framework for author name disambiguation. Our solution consists of a core, which orchestrates the disambiguation process, and replaceable modules that perform concrete tasks. The approach is suitable for distributed computing; in particular, it maps well to the MapReduce framework. We describe each component in detail and discuss possible alternatives. Finally, we propose procedures for calibration and evaluation of the described system.
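The core-plus-modules idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the modules, so the split into a blocking key (the map step), a pairwise similarity function, and a clustering routine (the reduce step) is an assumption, as are all names used here (`Contribution`, `block_key`, `shared_coauthors`, `cluster`, `disambiguate`) and the threshold value. Names are assumed to be non-empty.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Contribution:
    """One author mention on one paper (hypothetical record layout)."""
    doc_id: str
    name: str
    coauthors: FrozenSet[str]


def block_key(c: Contribution) -> str:
    """Map step (assumed): group mentions by surname and first initial."""
    parts = c.name.lower().split()
    return f"{parts[-1]}_{parts[0][0]}"


def shared_coauthors(a: Contribution, b: Contribution) -> float:
    """Replaceable similarity module (assumed): 1.0 if any coauthor is shared."""
    return 1.0 if a.coauthors & b.coauthors else 0.0


def cluster(
    items: List[Contribution],
    similarity: Callable[[Contribution, Contribution], float],
    threshold: float,
) -> List[List[Contribution]]:
    """Reduce step (assumed): single-link clustering with union-find."""
    parent = list(range(len(items)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similarity(items[i], items[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, item in enumerate(items):
        groups[find(i)].append(item)
    return list(groups.values())


def disambiguate(
    contributions: Iterable[Contribution],
    key: Callable[[Contribution], str] = block_key,
    similarity: Callable[[Contribution, Contribution], float] = shared_coauthors,
    threshold: float = 0.5,
) -> List[List[Contribution]]:
    """Core: orchestrates the pluggable modules; each block is independent."""
    blocks = defaultdict(list)
    for c in contributions:
        blocks[key(c)].append(c)
    clusters: List[List[Contribution]] = []
    for members in blocks.values():
        clusters.extend(cluster(members, similarity, threshold))
    return clusters
```

Because each block is processed independently of the others, the per-block clustering could in principle be handed to separate reducers, and any of the three modules can be swapped by passing a different function to `disambiguate`.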
We present a progress report on our ongoing project of reverse engineering scientific PDF documents. The aim is to obtain mathematical markup that can be used as source for regenerating a document that resembles the original as closely as possible. This source can then be a basis for further document processing. Our current tool uses specialised PDF extraction together with image analysis to produce near-perfect input for parsing mathematical formulae. Applying a linear grammar and specific drivers...