Flexible representation and querying of heterogeneous structured documents

Gloria Bordogna; Gabriella Pasi

Kybernetika (2000)

  • Volume: 36, Issue: 6, page [617]-633
  • ISSN: 0023-5954

Abstract

In this paper we present a fuzzy model for representing documents having a hierarchical structure and possibly containing multimedia information. We consider an archive containing documents with distinct (heterogeneous) logical structures. We also propose a flexible query language for expressing soft selection conditions on the structured documents. The documents’ content is organized into thematic (topical) sections where the index terms play a distinct role. The proposed document representation is adaptive to the user, who can indicate the preferred sections of documents, i.e. those which they estimate to bear the most interesting information, and can linguistically quantify the number of sections which determine the global potential interest of the documents. Linguistic quantifiers in the query specify the approximate number of the sections in which the query terms should appear.

How to cite

Bordogna, Gloria, and Pasi, Gabriella. "Flexible representation and querying of heterogeneous structured documents." Kybernetika 36.6 (2000): [617]-633. <http://eudml.org/doc/33507>.

@article{Bordogna2000,
abstract = {In this paper we present a fuzzy model for representing documents having a hierarchical structure and possibly containing multimedia information. We consider an archive containing documents with distinct (heterogeneous) logical structures. We also propose a flexible query language for expressing soft selection conditions on the structured documents. The documents’ content is organized into thematic (topical) sections where the index terms play a distinct role. The proposed document representation is adaptive to the user, who can indicate the preferred sections of documents, i.e. those which they estimate to bear the most interesting information, and can linguistically quantify the number of sections which determine the global potential interest of the documents. Linguistic quantifiers in the query specify the approximate number of the sections in which the query terms should appear.},
author = {Bordogna, Gloria and Pasi, Gabriella},
journal = {Kybernetika},
keywords = {query language; heterogeneously structured document},
language = {eng},
number = {6},
pages = {[617]-633},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Flexible representation and querying of heterogeneous structured documents},
url = {http://eudml.org/doc/33507},
volume = {36},
year = {2000},
}

TY - JOUR
AU - Bordogna, Gloria
AU - Pasi, Gabriella
TI - Flexible representation and querying of heterogeneous structured documents
JO - Kybernetika
PY - 2000
PB - Institute of Information Theory and Automation AS CR
VL - 36
IS - 6
SP - [617]
EP - 633
AB - In this paper we present a fuzzy model for representing documents having a hierarchical structure and possibly containing multimedia information. We consider an archive containing documents with distinct (heterogeneous) logical structures. We also propose a flexible query language for expressing soft selection conditions on the structured documents. The documents’ content is organized into thematic (topical) sections where the index terms play a distinct role. The proposed document representation is adaptive to the user, who can indicate the preferred sections of documents, i.e. those which they estimate to bear the most interesting information, and can linguistically quantify the number of sections which determine the global potential interest of the documents. Linguistic quantifiers in the query specify the approximate number of the sections in which the query terms should appear.
LA - eng
KW - query language; heterogeneously structured document
UR - http://eudml.org/doc/33507
ER -
