# Graphics processing units in acceleration of bandwidth selection for kernel density estimation

Witold Andrzejewski; Artur Gramacki; Jarosław Gramacki

International Journal of Applied Mathematics and Computer Science (2013)

- Volume: 23, Issue: 4, page 869-885
- ISSN: 1641-876X

## Access Full Article

top## Abstract

top## How to cite

topWitold Andrzejewski, Artur Gramacki, and Jarosław Gramacki. "Graphics processing units in acceleration of bandwidth selection for kernel density estimation." International Journal of Applied Mathematics and Computer Science 23.4 (2013): 869-885. <http://eudml.org/doc/262420>.

@article{WitoldAndrzejewski2013,

abstract = {The Probability Density Function (PDF) is a key concept in statistics. Constructing the most adequate PDF from the observed data is still an important and interesting scientific problem, especially for large datasets. PDFs are often estimated using nonparametric data-driven methods. One of the most popular nonparametric method is the Kernel Density Estimator (KDE). However, a very serious drawback of using KDEs is the large number of calculations required to compute them, especially to find the optimal bandwidth parameter. In this paper we investigate the possibility of utilizing Graphics Processing Units (GPUs) to accelerate the finding of the bandwidth. The contribution of this paper is threefold: (a) we propose algorithmic optimization to one of bandwidth finding algorithms, (b) we propose efficient GPU versions of three bandwidth finding algorithms and (c) we experimentally compare three of our GPU implementations with the ones which utilize only CPUs. Our experiments show orders of magnitude improvements over CPU implementations of classical algorithms.},

author = {Witold Andrzejewski, Artur Gramacki, Jarosław Gramacki},

journal = {International Journal of Applied Mathematics and Computer Science},

keywords = {bandwidth selection; graphics processing unit; probability density function; nonparametric estimation; kernel estimation},

language = {eng},

number = {4},

pages = {869-885},

title = {Graphics processing units in acceleration of bandwidth selection for kernel density estimation},

url = {http://eudml.org/doc/262420},

volume = {23},

year = {2013},

}

TY - JOUR

AU - Witold Andrzejewski

AU - Artur Gramacki

AU - Jarosław Gramacki

TI - Graphics processing units in acceleration of bandwidth selection for kernel density estimation

JO - International Journal of Applied Mathematics and Computer Science

PY - 2013

VL - 23

IS - 4

SP - 869

EP - 885

AB - The Probability Density Function (PDF) is a key concept in statistics. Constructing the most adequate PDF from the observed data is still an important and interesting scientific problem, especially for large datasets. PDFs are often estimated using nonparametric data-driven methods. One of the most popular nonparametric method is the Kernel Density Estimator (KDE). However, a very serious drawback of using KDEs is the large number of calculations required to compute them, especially to find the optimal bandwidth parameter. In this paper we investigate the possibility of utilizing Graphics Processing Units (GPUs) to accelerate the finding of the bandwidth. The contribution of this paper is threefold: (a) we propose algorithmic optimization to one of bandwidth finding algorithms, (b) we propose efficient GPU versions of three bandwidth finding algorithms and (c) we experimentally compare three of our GPU implementations with the ones which utilize only CPUs. Our experiments show orders of magnitude improvements over CPU implementations of classical algorithms.

LA - eng

KW - bandwidth selection; graphics processing unit; probability density function; nonparametric estimation; kernel estimation

UR - http://eudml.org/doc/262420

ER -

## References

top- Andrzejewski, W., Gramacki, A. and Gramacki, J. (2013). Density estimations for approximate query processing on SIMD architectures, Technical Report RA 03/13, Poznań University of Technology, Poznań. Zbl1284.93221
- Blohsfeld, B., Korus, D. and Seeger, B. (1999). A comparison of selectivity estimators for range queries on metric attributes, Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, Philadelphia, PA, USA, pp. 239-250.
- Bochkanov, S. and Bystritsky, V. (2013). ALGLIB, http://www.alglib.net.
- Chapman, B., Jost, G. and van der Pas, R. (2007). Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation), MIT Press, Cambridge, MA.
- Duong, T. (2004). Bandwidth Selectors for Multivariate Kernel Density Estimation, Ph.D. thesis, University of Western Australia, Perth. Zbl1060.62042
- Farooqui, N., Kerr, A., Diamos, G., Yalamanchili, S. and Schwan, K. (2011). A framework for dynamically instrumenting GPU compute applications within GPU Ocelot, Proceedings of the 4th Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4, Newport Beach, CA, USA, pp. 9:1-9:9.
- Gramacki, A., Gramacki, J. and Andrzejewski, W. (2010). Probability density functions for calculating approximate aggregates, Foundations of Computing and Decision Sciences 35(4): 223-240. Zbl1284.93221
- Greengard, L. and Strain, J. (1991). The fast Gauss transform, SIAM Journal on Scientific and Statistical Computing 12(1): 79-94. Zbl0721.65089
- Harris, M. (2007). Optimizing parallel reduction in CUDA, http://developer.download.nvidia.com/assets/cuda/files/reduction.pdf.
- Hendriks, H. and Kim, P. (2003). Consistent and efficient density estimation, in V. Kumar, M.L. Gavrilova, C.J.K. Tan and P. L'Ecuyer (Eds.), Proceedings of the 2003 International Conference on Computational Science and Its Applications, ICCSA 2003: Part I, Lecture Notes in Computer Science, Vol. 2667, Springer-Verlag, New York, NY, Berlin/Heidelberg, pp. 388-397.
- Johnson, N., Kotz, S. and Balakrishnan, N. (1994). Continuous Univariate Distributions, Volume 1, Probability and Statistics, John Wiley & Sons, Inc, New York, NY. Zbl0811.62001
- Johnson, N., Kotz, S. and Balakrishnan, N. (1995). Continuous Univariate Distributions, Volume 2, Probability and Statistics, John Wiley & Sons, Inc, New York, NY. Zbl0821.62001
- Kulczycki, P. (2005). Kernel Estimators in Systems Analysis, Wydawnictwa Naukowo-Techniczne, Warsaw, (in Polish).
- Kulczycki, P. (2008). Kernel estimators in industrial applications, in B. Prasad (Ed.), Studies in Fuzziness and Soft Computing. Soft Computing Applications in Industry, Springer-Verlag, Berlin, pp. 69-91.
- Kulczycki, P. and Charytanowicz, M. (2010). A complete gradient clustering algorithm formed with kernel estimators, International Journal of Applied Mathematics and Computer Science 20(1): 123-134, DOI: 10.2478/v10006-010-0009-3. Zbl1300.62043
- Li, Q. and Racine, J. (2007). Nonparametric Econometrics: Theory and Practice, Princeton University Press, Princeton, NJ. Zbl1183.62200
- Łukasik, S. (2007). Parallel computing of kernel density estimates with MPI, in Y. Shi, G.D. van Albada, J. Dongarra and P.M.A. Sloot (Eds.), Computational Science-ICCS 2007, Lecture Notes in Computer Science, Vol. 4489, Springer, Berlin/Heidelberg, pp. 726-734.
- NVIDIA Corporation (2012). NVIDIA CUDA programming guide, http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf.
- NVIDIA Corporation (2013). NVIDIA's next generation CUDA compute architecture: Kepler GK110, http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110Architecture-Whitepaper.pdf.
- Michailidis, P.D. and Margaritis, K.G. (2013). Accelerating kernel density estimation on the GPU using the CUDA framework, Applied Mathematical Sciences 7(30): 1447-1476.
- Nelder, J.A. and Mead, R. (1965). A simplex method for function minimization, The Computer Journal 7(4): 308-313. Zbl0229.65053
- Raykar, V. and Duraiswami, R. (2006). Very fast optimal bandwidth selection for univariate kernel density estimation, Technical Report CS-TR-4774/UMIACS-TR2005-73, University of Maryland, College Park, MD.
- Raykar, V., Duraiswami, R. and Zhao, L. (2010). Fast computation of kernel estimators, Journal of Computational and Graphical Statistics 19(1): 205-220.
- Sawerwain, M. (2012). GPU-based parallel algorithms for transformations of quantum states expressed as vectors and density matrices, in R. Wyrzykowski, J. Dongarra, K. Karczewski and J. Waśniewski (Eds.), Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science, Vol. 7203, Springer-Verlag, New York, NY/Berlin/Heidelberg, pp. 215-224.
- Sheather, S. (2004). Density estimation, Statistical Science 19(4): 588-597. Zbl1100.62558
- Silverman, B. (1986). Density Estimation for Statistics and Data Analysis, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, London. Zbl0617.62042
- Silverman, B.W. (1982). Algorithm AS 176: Kernel density estimation using the fast Fourier transform, Journal of the Royal Statistical Society: Series C (Applied Statistics) 31(1): 93-99. Zbl0483.62032
- Simonoff, J. (1996). Smoothing Methods in Statistics, Springer Series in Statistics, Springer-Verlag, New York, NY/Berlin/Heidelberg. Zbl0859.62035
- Wand, M. and Jones, M. (1995). Kernel Smoothing, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Chapman&Hall, London. Zbl0854.62043
- Xavier, C. and Iyengar, S. (1998). Introduction to Parallel Algorithms, Wiley Series on Parallel and Distributed Computing, Wiley. Zbl0948.68220
- Yang, C., Duraiswami, R. and Gumerov, N. (2003). Improved fast Gauss transform, Technical Report CS-TR-4495, University of Maryland, College Park, MD.

## NotesEmbed ?

topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.