Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system

Alexandra Carpen-Amarie; Alexandru Costan; Jing Cai; Gabriel Antoniu; Luc Bougé

International Journal of Applied Mathematics and Computer Science (2011)

  • Volume: 21, Issue: 2, pages 229-242
  • ISSN: 1641-876X

Abstract

Introspection is the prerequisite of autonomic behavior, the first step towards performance improvement and resource usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More precisely, in the context of data-intensive applications, a specific introspection layer is required to collect data about the usage of storage resources, data access patterns, etc. This paper discusses the requirements for an introspection layer in a data management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. We illustrate the autonomic behavior of BlobSeer with a self-configuration component aiming to provide storage elasticity by dynamically scaling the number of data providers. Then we propose a preliminary approach for enabling self-protection for the BlobSeer system, through a malicious client detection component. The introspective architecture has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and behavior of the system.

How to cite

Alexandra Carpen-Amarie, et al. "Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system." International Journal of Applied Mathematics and Computer Science 21.2 (2011): 229-242. <http://eudml.org/doc/208043>.

@article{AlexandraCarpen2011,
author = {Alexandra Carpen-Amarie and Alexandru Costan and Jing Cai and Gabriel Antoniu and Luc Bougé},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {distributed system; storage management; large-scale system; monitoring; introspection},
language = {eng},
number = {2},
pages = {229-242},
title = {Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system},
url = {http://eudml.org/doc/208043},
volume = {21},
year = {2011},
}

TY - JOUR
AU - Alexandra Carpen-Amarie
AU - Alexandru Costan
AU - Jing Cai
AU - Gabriel Antoniu
AU - Luc Bougé
TI - Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system
JO - International Journal of Applied Mathematics and Computer Science
PY - 2011
VL - 21
IS - 2
SP - 229
EP - 242
LA - eng
KW - distributed system; storage management; large-scale system; monitoring; introspection
UR - http://eudml.org/doc/208043
ER -

References

  1. Albrecht, J., Oppenheimer, D., Vahdat, A. and Patterson, D.A. (2005). Design and implementation tradeoffs for wide-area resource discovery, Proceedings of the 14th IEEE Symposium on High Performance Distributed Computing, Research Triangle Park, NC, USA, pp. 113-124. 
  2. ALICE (2010). The MonALISA Repository for ALICE, http://pcalimonitor.cern.ch/map.jsp. 
  3. Andreozzi, S., De Bortoli, N., Fantinel, S., Ghiselli, A., Rubini, G.L., Tortone, G. and Vistoli, M.C. (2005). GridICE: A monitoring service for grid systems, Future Generation Computer Systems 21(4): 559-571. 
  4. Bolze, R., Cappello, F., Caron, E., Daydé, M.J., Desprez, F., Jeannot, E., Jégou, Y., Lanteri, S., Leduc, J., Melab, N., Mornet, G., Namyst, R., Primet, P., Quétier, B., Richard, O., Talbi, E. and Touche, I. (2006). Grid'5000: A large scale and highly reconfigurable experimental grid testbed, International Journal of High Performance Computing Applications 20(4): 481-494. 
  5. Cardosa, M. and Chandra, A. (2008). Resource bundles: Using aggregation for statistical wide-area resource discovery and allocation, 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), Beijing, China, pp. 760-768. 
  6. Carpen-Amarie, A., Cai, J., Costan, A., Antoniu, G. and Bougé, L. (2010). Bringing introspection into the BlobSeer data-management system using the MonALISA distributed monitoring framework, 1st International Workshop on Autonomic Distributed Systems (ADiS 2010), Cracow, Poland, pp. 508-513. Zbl 1272.68122
  7. Cooke, A., Gray, A., Nutt, W., Magowan, J., Oevers, M., Taylor, P., Cordenonsi, R., Byrom, R., Cornwall, L., Djaoui, A., Field, L., Fisher, S., Hicks, S., Leake, J., Middleton, R., Wilson, A., Zhu, X., Podhorszki, N., Coghlan, B., Kenny, S., O'Callaghan, D. and Ryan, J. (2004). The relational grid monitoring architecture: Mediating information about the grid, Journal of Grid Computing 2(4): 323-339. Zbl 1081.68514
  8. Cowell, R.G., Dawid, A.P., Lauritzen, S.L. and Spiegelhalter, D.J. (1999). Probabilistic Networks and Expert Systems, Springer-Verlag, New York, NY. Zbl 0937.68121
  9. Ding, J., Krämer, B.J., Bai, Y. and Chen, H. (2004). Probabilistic inference for network management, in M.M. Freire, P. Chemouil, P. Lorenz and A. Gravey (Eds.), Universal Multiservice Networks, Lecture Notes in Computer Science, Vol. 3262, Springer, Berlin/Heidelberg, pp. 498-507. 
  10. GGF (2010). The Global Grid Forum, http://www.ggf.org/. 
  11. Gunter, D., Tierney, B., Crowley, B., Holding, M. and Lee, J. (2000). Netlogger: A toolkit for distributed system performance analysis, MASCOTS '00: Proceedings of the 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, San Francisco, CA, USA, pp. 267-273. 
  12. Gurguis, S. and Zeid, A. (2005). Towards autonomic web services: Achieving self-healing using web services, DEAS05: Proceedings of Design and Evolution of Autonomic Application Software Conference, St. Louis, MO, USA, pp. 1-5. 
  13. Hood, C. and Ji, C. (1997). Automated proactive anomaly detection, Proceedings of the IEEE International Conference of Network Management (IM97), San Diego, CA, USA, pp. 688-699. 
  14. Jain, A., Chang, E.Y. and Wang, Y.-F. (2004). Adaptive stream resource management using Kalman filters, SIGMOD '04: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, pp. 11-22. 
  15. Jain, N., Kit, D., Mahajan, P., Yalagandula, P., Dahlin, M. and Zhang, Y. (2007). STAR: Self-tuning aggregation for scalable monitoring, VLDB '07: Proceedings of the 33rd International Conference on Very Large Data Bases, Vienna, Austria, pp. 962-973. 
  16. Kephart, J.O. and Chess, D.M. (2003). The vision of autonomic computing, Computer 36(1): 41-50. 
  17. Legrand, I., Newman, H., Voicu, R., Cirstoiu, C., Grigoras, C., Dobre, C., Muraru, A., Costan, A., Dediu, M. and Stratan, C. (2009). MonALISA: An agent based, dynamic service system to monitor, control and optimize distributed systems, Computer Physics Communications 180(12): 2472-2498. Zbl 1197.68085
  18. Liang, J., Gu, X. and Nahrstedt, K. (2007). Self-configuring information management for large-scale service overlays, INFOCOM 2007: 26th IEEE International Conference on Computer Communications/Joint Conference of the IEEE Computer and Communications Societies, Anchorage, AK, USA, pp. 472-480. 
  19. Massie, M., Chun, B. and Culler, D. (2004). The Ganglia distributed monitoring system: Design, implementation, and experience, Parallel Computing 30(7): 817-840. 
  20. Nicolae, B., Antoniu, G. and Bougé, L. (2009). Enabling high data throughput in desktop grids through decentralized data and metadata management: The BlobSeer approach, Proceedings of the 15th International Euro-Par Conference, Delft, The Netherlands, pp. 404-416. 
  21. Nicolae, B., Antoniu, G., Bougé, L., Moise, D. and Carpen-Amarie, A. (2010). BlobSeer: Next generation data management for large scale infrastructures, Journal of Parallel and Distributed Computing 71(2): 168-184. 
  22. Parashar, M. and Hariri, S. (2005). Autonomic computing: An overview, in J.-P. Banâtre, P. Fradet, J.-L. Giavitto and O. Michel (Eds.), Unconventional Programming Paradigms, Lecture Notes in Computer Science, Vol. 3566, Springer, Berlin/Heidelberg, pp. 247-259. 
  23. Santos, Jr., E. and Young, J.D. (1999). Probabilistic temporal networks: A unified framework for reasoning with time and uncertainty, International Journal of Approximate Reasoning 20(3): 263-291. Zbl 0931.68125
  24. Steinder, M. and Sethi, A.S. (2004). Probabilistic fault localization in communication systems using belief networks, IEEE/ACM Transactions on Networking 12(5): 809-822. Zbl 1068.68028
  25. Tierney, B., Aydt, R. and Gunter, D. (2002). A grid monitoring architecture, Grid Working Draft GWD-PERF-16-3 http://www.gridforum.org/. 
  26. Van Renesse, R., Birman, K.P. and Vogels, W. (2003). Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining, ACM Transactions on Computer Systems 21(2): 164-206. 
  27. Vuran, M.C. and Akyildiz, I.F. (2006). Spatial correlation-based collaborative medium access control in wireless sensor networks, IEEE/ACM Transactions on Networking 14(2): 316-329. 
  28. Zanikolas, S. and Sakellariou, R. (2005). A taxonomy of grid monitoring systems, Future Generation Computer Systems 21(1): 163-188. 
