Towards a Flexible Author Name Disambiguation Framework

Bolikowski, Łukasz; Dendek, Piotr Jan

  • Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011. Publisher: Masaryk University Press (Brno, Czech Republic), pages 27-37

Abstract

In this paper we propose a flexible, modular framework for author name disambiguation. Our solution consists of a core that orchestrates the disambiguation process and replaceable modules that perform concrete tasks. The approach is suitable for distributed computing; in particular, it maps well to the MapReduce framework. We describe each component in detail and discuss possible alternatives. Finally, we propose procedures for calibrating and evaluating the described system.
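
The abstract and keywords name the main ingredients: problem decomposition, replaceable scoring functions, single-linkage clustering, and a MapReduce-style layout. The sketch below is one possible reading of how these pieces could fit together, not the authors' implementation. The record layout, the blocking key (surname plus first initial), the co-author scoring rule, and the threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical record layout: (contribution_id, author_name, coauthors).
RECORDS = [
    ("c1", "J. Smith", {"A. Kowalski"}),
    ("c2", "John Smith", {"A. Kowalski", "B. Nowak"}),
    ("c3", "J. Smith", {"C. Lee"}),
    ("c4", "P. Dendek", {"L. Bolikowski"}),
]

def blocking_key(name):
    """Map step (assumed rule): lowercase surname plus first initial."""
    parts = name.replace(".", "").split()
    return f"{parts[-1].lower()}_{parts[0][0].lower()}"

def coauthor_score(a, b):
    """Replaceable scoring module (assumed rule):
    +1 per shared co-author, -1 if there are none."""
    shared = len(a[2] & b[2])
    return float(shared) if shared else -1.0

def single_linkage(block, scorers, threshold=0.0):
    """Reduce step: merge every pair whose summed score exceeds the threshold."""
    parent = {r[0]: r[0] for r in block}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(block, 2):
        if sum(s(a, b) for s in scorers) > threshold:
            parent[find(a[0])] = find(b[0])

    clusters = defaultdict(set)
    for r in block:
        clusters[find(r[0])].add(r[0])
    return sorted(sorted(c) for c in clusters.values())

def disambiguate(records, scorers):
    """Core: decompose into blocks, then cluster each block independently."""
    blocks = defaultdict(list)
    for r in records:
        blocks[blocking_key(r[1])].append(r)
    return {key: single_linkage(block, scorers) for key, block in blocks.items()}

result = disambiguate(RECORDS, [coauthor_score])
print(result)
```

Because each block is clustered independently, the grouping step plays the role of a map phase and the per-block clustering that of a reduce phase. Swapping in a different scoring function only requires changing the list passed to `disambiguate`.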

How to cite

Bolikowski, Łukasz, and Dendek, Piotr Jan. "Towards a Flexible Author Name Disambiguation Framework." Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011. Brno, Czech Republic: Masaryk University Press, 2011. 27-37. <http://eudml.org/doc/220965>.

@inProceedings{Bolikowski2011,
abstract = {In this paper we propose a flexible, modular framework for author name disambiguation. Our solution consists of the core which orchestrates the disambiguation process, and replaceable modules performing concrete tasks. The approach is suitable for distributed computing, in particular it maps well to the MapReduce framework. We describe each component in detail and discuss possible alternatives. Finally, we propose procedures for calibration and evaluation of the described system.},
author = {Bolikowski, Łukasz and Dendek, Piotr Jan},
booktitle = {Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011},
keywords = {name disambiguation; problem decomposition; scoring functions; single-linkage clustering; MapReduce framework; machine learning},
location = {Brno, Czech Republic},
pages = {27-37},
publisher = {Masaryk University Press},
title = {Towards a Flexible Author Name Disambiguation Framework},
url = {http://eudml.org/doc/220965},
year = {2011},
}

TY - CLSWK
AU - Bolikowski, Łukasz
AU - Dendek, Piotr Jan
TI - Towards a Flexible Author Name Disambiguation Framework
T2 - Towards a Digital Mathematics Library. Bertinoro, Italy, July 20-21st, 2011
PY - 2011
CY - Brno, Czech Republic
PB - Masaryk University Press
SP - 27
EP - 37
AB - In this paper we propose a flexible, modular framework for author name disambiguation. Our solution consists of the core which orchestrates the disambiguation process, and replaceable modules performing concrete tasks. The approach is suitable for distributed computing, in particular it maps well to the MapReduce framework. We describe each component in detail and discuss possible alternatives. Finally, we propose procedures for calibration and evaluation of the described system.
KW - name disambiguation; problem decomposition; scoring functions; single-linkage clustering; MapReduce framework; machine learning
UR - http://eudml.org/doc/220965
ER -

