E-statistics (energy statistics)

Research and software related to E-statistics

E-statistics (energy statistics) are a class of tests and statistics based on Euclidean distances between observations. Applications include tests of multivariate normality, multivariate distance components and k-sample tests for equal distributions, hierarchical clustering by e-distances, multivariate independence tests, distance correlation, and goodness-of-fit tests.
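To illustrate what "based on Euclidean distances" means in practice, here is a minimal Python sketch of the two-sample energy distance, 2E|X-Y| - E|X-X'| - E|Y-Y'|, estimated by averaging pairwise distances (the V-statistic form). It is not the code of the R package energy; the function names mean_distance, energy_distance, and energy_statistic are hypothetical and chosen only for this example, and the n*m/(n+m) scaling is the usual two-sample scaling, assumed here rather than taken from this page.

```python
import math
from itertools import product


def mean_distance(a, b):
    """Average Euclidean distance over all pairs (one point from a, one from b)."""
    total = sum(math.dist(p, q) for p, q in product(a, b))
    return total / (len(a) * len(b))


def energy_distance(x, y):
    """Sample energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic form)."""
    return 2.0 * mean_distance(x, y) - mean_distance(x, x) - mean_distance(y, y)


def energy_statistic(x, y):
    """Energy distance scaled by n*m/(n+m) (assumed two-sample scaling)."""
    n, m = len(x), len(y)
    return n * m / (n + m) * energy_distance(x, y)


# Two small bivariate samples; each observation is a tuple of coordinates.
x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]

print(energy_distance(x, y))   # positive: the samples are well separated
print(energy_distance(x, x))   # zero: a sample compared with itself
```

A significance test would typically compare energy_statistic(x, y) with its values under random reassignment of the pooled observations to the two groups; that permutation step is omitted here to keep the sketch short.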

Gabor J. Szekely, National Science Foundation
Maria L. Rizzo, Bowling Green State University, email: email

R software: Energy statistics are implemented in the contributed package energy for R.

References

  1. Songzi Li and Maria L. Rizzo (2017). K-groups: A Generalization of K-means Clustering, arXiv e-prints, 1711.04359.
  2. G. J. Szekely and M. L. Rizzo (2017). The Energy of Data, Annual Review of Statistics and Its Application. Extended Review. 4:447-479, doi: 10.1146/annurev-statistics-060116-054026
  3. M. L. Rizzo and J. T. Haman (2016). Expected distances and goodness-of-fit for the asymmetric Laplace distribution, Statistics & Probability Letters, Volume 117, pp. 158-164, ISSN 0167-7152, http://dx.doi.org/10.1016/j.spl.2016.05.006.
  4. M. L. Rizzo and G. J. Szekely (2016). Energy Distance, WIREs Computational Statistics, Wiley, Volume 8, Issue 1, 27-38. Available online Dec. 2015, doi: 10.1002/wics.1375.
  5. G. J. Szekely and M. L. Rizzo (2014). Partial distance correlation with methods for dissimilarities, Annals of Statistics, 42/6, 2382-2412.
  6. G. J. Szekely and M. L. Rizzo (2013). Energy statistics: statistics based on distances. Journal of Statistical Planning and Inference, Volume 143, Issue 8, August 2013, pp. 1249-1272.
  7. G. J. Szekely and M. L. Rizzo (2013). The distance correlation t-test of independence in high dimension. Journal of Multivariate Analysis, Volume 117, pp. 193-213.
  8. G. J. Szekely and M. L. Rizzo (2012). On the uniqueness of distance covariance. Statistics & Probability Letters, Volume 82, Issue 12, 2278-2282.
  9. Maria L. Rizzo and Gabor J. Szekely (2010). DISCO Analysis: A Nonparametric Extension of Analysis of Variance, Annals of Applied Statistics, Vol. 4, No. 2, 1034-1055.
  10. Gabor J. Szekely and Maria L. Rizzo (2009). Brownian Distance Covariance,
    Annals of Applied Statistics, Vol. 3, No. 4, 1236-1265. doi:10.1214/09-AOAS312
  11. Gabor J. Szekely and Maria L. Rizzo (2009). Rejoinder: Brownian Distance Covariance, Annals of Applied Statistics, Vol. 3, No. 4, 1303-1308. doi:10.1214/09-AOAS312REJ
  12. Maria L. Rizzo (2009). New Goodness-of-Fit Tests for Pareto Distributions, ASTIN Bulletin: Journal of the International Association of Actuaries, 39/2, 691-715.
  13. G. J. Szekely, M. L. Rizzo, and N. K. Bakirov (2007). Measuring and Testing Independence by Correlation of Distances, Annals of Statistics, Vol. 35, No. 6, pp. 2769-2794. http://dx.doi.org/10.1214/009053607000000505.
  14. Bakirov, N. K., Rizzo, M. L., and Szekely, G. J. (2006). A Multivariate Nonparametric Test of Independence, Journal of Multivariate Analysis, Volume 97, Issue 8, September 2006, pp. 1742-1756. http://dx.doi.org/10.1016/j.jmva.2005.10.005.
  15. Szekely, G. J. and Rizzo, M. L. (2005) Hierarchical Clustering via Joint Between-Within Distances: Extending Ward's Minimum Variance Method,
    Journal of Classification, 22(2) 151-183. http://dx.doi.org/10.1007/s00357-005-0012-9.
  16. Szekely, G. J. and Rizzo, M. L. (2005) A New Test for Multivariate Normality,
    Journal of Multivariate Analysis, 93/1, 58-80. http://dx.doi.org/10.1016/j.jmva.2003.12.002.
  17. Szekely, G. J. and Rizzo, M. L. (2004b) Mean Distance Test of Poisson Distribution,
    Statistics and Probability Letters, 67/3, 241-247 http://dx.doi.org/10.1016/j.spl.2004.01.005.
  18. Rizzo, M. L. (2003) Hierarchical Clustering Based on a Generalized Measure of Homogeneity,
    2003 Proceedings of the Joint Statistical Meetings, American Statistical Association, Section for Physical and Engineering Sciences [CD-ROM], Alexandria, VA: American Statistical Association.
  19. Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal Distributions in High Dimension, InterStat, Nov. (5).
  20. M. L. Rizzo (2005) Minimum Energy Clustering, Proceedings of Interface/Classification Society of North America, Joint Annual Meeting, 2005.
  21. Rizzo, M. L. (2002a). A Test of Homogeneity for Two Multivariate Populations,
    2002 Proceedings of the American Statistical Association, Physical and Engineering Sciences Section [CD-ROM], Alexandria, VA: American Statistical Association.
  22. Rizzo, M. L. (2002b). A New Rotation Invariant Goodness-of-Fit Test, Ph.D. dissertation, Bowling Green State University.
  23. Szekely, G. J. (2002) E-statistics: the Energy of Statistical Samples, Technical Report No. 02-16, Bowling Green State University, Department of Mathematics and Statistics, October 2002.
  24. Szekely, G. J. (2000) E-statistics: Energy of Statistical Samples, Bowling Green State University, Department of Mathematics and Statistics Technical Report No. 03-05.
  25. Szekely, G. J. (1989) Potential and Kinetic Energy in Statistics, Lecture Notes, Budapest Institute of Technology (Technical University).

Software for R: energy


R is a free software environment for statistical computing and graphics, available at the Comprehensive R Archive Network (CRAN).
The energy package is distributed under the GNU General Public License, version 2 or later. See COPYING for the license.

Questions or comments on software: Maria Rizzo, email address above



Current version energy_1.7.2 released 2017-09-14.

Development version of energy on GitHub
