Fast Information Retrieval in the Open Grid Service Architecture

Berka, Tobias; Vajteršic, Marian

Serdica Journal of Computing (2011)

  • Volume: 5, Issue: 3, page 207-236
  • ISSN: 1312-6555

Abstract

This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010. In research, grid computing is an established way of providing computer resources for information retrieval. However, e-science grids also contain, process and produce documents - thereby acting as digital libraries and requiring means for information discovery. In this paper, we discuss how distributed information retrieval can be integrated into the Open Grid Service Architecture (OGSA) to efficiently provide image retrieval for e-science grids. We identify two fundamental ways of performing information retrieval on the grid - as a batch job or as a distributed activity - and argue the case for the latter for reasons of efficiency. We give an analysis of the theoretic communication and computation complexity and demonstrate that bandwidth limitations provide a decisive argument to support our case. We describe further design decisions for our system architecture and give a brief comparison with other designs reported in literature. Lastly, we describe how the statelessness and isolation of web services impede data-intensive, distributed, cross-site activities in OGSA grids, and how to escape them.

How to cite


Berka, Tobias, and Vajteršic, Marian. "Fast Information Retrieval in the Open Grid Service Architecture." Serdica Journal of Computing 5.3 (2011): 207-236. <http://eudml.org/doc/219613>.

@article{Berka2011,
abstract = {This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010. In research, grid computing is an established way of providing computer resources for information retrieval. However, e-science grids also contain, process and produce documents - thereby acting as digital libraries and requiring means for information discovery. In this paper, we discuss how distributed information retrieval can be integrated into the Open Grid Service Architecture (OGSA) to efficiently provide image retrieval for e-science grids. We identify two fundamental ways of performing information retrieval on the grid - as a batch job or as a distributed activity - and argue the case for the latter for reasons of efficiency. We give an analysis of the theoretic communication and computation complexity and demonstrate that bandwidth limitations provide a decisive argument to support our case. We describe further design decisions for our system architecture and give a brief comparison with other designs reported in literature. Lastly, we describe how the statelessness and isolation of web services impede data-intensive, distributed, cross-site activities in OGSA grids, and how to escape them.},
author = {Berka, Tobias and Vajteršic, Marian},
journal = {Serdica Journal of Computing},
keywords = {Grid Computing; Information Retrieval; Web Services},
language = {eng},
number = {3},
pages = {207--236},
publisher = {Institute of Mathematics and Informatics Bulgarian Academy of Sciences},
title = {Fast Information Retrieval in the Open Grid Service Architecture},
url = {http://eudml.org/doc/219613},
volume = {5},
year = {2011},
}

TY - JOUR
AU - Berka, Tobias
AU - Vajteršic, Marian
TI - Fast Information Retrieval in the Open Grid Service Architecture
JO - Serdica Journal of Computing
PY - 2011
PB - Institute of Mathematics and Informatics Bulgarian Academy of Sciences
VL - 5
IS - 3
SP - 207
EP - 236
AB - This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010. In research, grid computing is an established way of providing computer resources for information retrieval. However, e-science grids also contain, process and produce documents - thereby acting as digital libraries and requiring means for information discovery. In this paper, we discuss how distributed information retrieval can be integrated into the Open Grid Service Architecture (OGSA) to efficiently provide image retrieval for e-science grids. We identify two fundamental ways of performing information retrieval on the grid - as a batch job or as a distributed activity - and argue the case for the latter for reasons of efficiency. We give an analysis of the theoretic communication and computation complexity and demonstrate that bandwidth limitations provide a decisive argument to support our case. We describe further design decisions for our system architecture and give a brief comparison with other designs reported in literature. Lastly, we describe how the statelessness and isolation of web services impede data-intensive, distributed, cross-site activities in OGSA grids, and how to escape them.
LA - eng
KW - Grid Computing; Information Retrieval; Web Services
UR - http://eudml.org/doc/219613
ER -
