E-statistics (energy statistics)
Research and software related to E-statistics
E-statistics (energy statistics) are a class of tests and statistics
based on Euclidean distances between sample elements. Applications include
testing multivariate normality, distance components (DISCO) and the
k-sample test for equal distributions, hierarchical clustering by e-distances,
multivariate independence tests, distance correlation, and goodness-of-fit tests.
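As a rough illustration of what "based on Euclidean distances" means, the sketch below computes a two-sample energy distance of the form 2E|X-Y| - E|X-X'| - E|Y-Y'| from averages of pairwise distances. It is a minimal plain-Python sketch, not the energy package's implementation; the function names `mean_distance` and `energy_distance` and the toy data are assumptions made for this example.

```python
import math

def mean_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y), x in xs, y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def energy_distance(xs, ys):
    """Sample energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    Each mean runs over all pairs, including i == j (V-statistic form).
    """
    return (2.0 * mean_distance(xs, ys)
            - mean_distance(xs, xs)
            - mean_distance(ys, ys))

# Hypothetical toy data: two small samples of points in the plane.
x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]

print(energy_distance(x, x))  # identical samples give zero
print(energy_distance(x, y))  # well-separated samples give a positive value
```

In a test for equal distributions, a statistic of this kind would typically be compared against a permutation distribution; that step is omitted here.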
Gabor J. Szekely, National Science Foundation
Maria L. Rizzo,
Bowling Green State University, email:
R software: Energy statistics are implemented in the contributed
package energy for R.
- G. J. Szekely and M. L. Rizzo (2013).
Energy statistics: statistics based on distances.
Journal of Statistical Planning and Inference
Volume 143, Issue 8, August 2013, pp. 1249-1272.
- G. J. Szekely and M. L. Rizzo (2013).
The distance correlation t-test of independence in high dimension.
Journal of Multivariate Analysis, Volume 117, pp. 193-213.
- G. J. Szekely and M. L. Rizzo (2012).
On the uniqueness of distance covariance.
Statistics & Probability Letters, Volume 82, Issue 12, 2278-2282.
- Maria L. Rizzo and Gabor J. Szekely (2010).
DISCO Analysis: A Nonparametric Extension of Analysis of Variance,
Annals of Applied Statistics Vol. 4, No. 2, 1034-1055.
- Gabor J. Szekely and Maria L. Rizzo (2009). Brownian Distance Covariance,
Annals of Applied Statistics,
Vol. 3, No. 4, 1236-1265.
- Gabor J. Szekely and Maria L. Rizzo (2009). Rejoinder: Brownian Distance
Covariance, Annals of Applied Statistics, Vol. 3, No. 4, 1303-1308.
- Maria L. Rizzo (2009). New Goodness-of-Fit Tests for Pareto Distributions,
ASTIN Bulletin: Journal of the International Association of Actuaries,
39/2, 691-715.
- G. J. Szekely, M. L. Rizzo, and N. K. Bakirov (2007).
Measuring and Testing Independence by Correlation of Distances, Annals of Statistics,
Vol. 35 No. 6, pp. 2769-2794.
- Bakirov, N. K., Rizzo, M. L., and Szekely, G. J. (2006).
A Multivariate Nonparametric Test of Independence, Journal of Multivariate Analysis,
Volume 97, Issue 8, September 2006, pp. 1742-1756.
- Szekely, G. J. and Rizzo, M. L. (2005) Hierarchical Clustering
via Joint Between-Within Distances: Extending Ward's Minimum Variance Method,
Journal of Classification, 22(2) 151-183.
- Szekely, G. J. and Rizzo, M. L. (2005) A New Test for
Journal of Multivariate Analysis,
- Szekely, G. J. and Rizzo, M. L. (2004b) Mean Distance Test of Poisson Distribution,
Statistics and Probability Letters, 67/3, 241-247
- Rizzo, M. L. (2003) Hierarchical Clustering Based on a Generalized
Measure of Homogeneity,
2003 Proceedings of the Joint Statistical Meetings, American Statistical
Association, Section for Physical and Engineering Sciences [CD-ROM],
Alexandria, VA: American Statistical Association.
- Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal
Distributions in High Dimension, InterStat, Nov. (5).
- M. L. Rizzo (2005) Minimum Energy Clustering
Proceedings of Interface/Classification Society of North America,
Joint Annual Meeting, 2005.
- Rizzo, M. L. (2002a). A Test of Homogeneity for Two Multivariate Populations,
2002 Proceedings of the American Statistical Association, Physical and Engineering
Sciences Section [CD-ROM], Alexandria, VA: American Statistical Association.
- Rizzo, M. L. (2002b). A New Rotation Invariant Goodness-of-Fit Test,
Ph.D. dissertation, Bowling Green State University.
- Szekely, G. J. (2000) E-statistics: Energy of
Statistical Samples, Bowling Green State University, Department of
Mathematics and Statistics Technical Report No. 03-05.
- Szekely, G. J. (1989) Potential and Kinetic Energy in Statistics,
Lecture Notes, Budapest Institute of Technology (Technical University).
R is a free software environment
for statistical computing and graphics, available at the
Comprehensive R Archive Network (CRAN).
This software is distributed under the GNU General
Public License Version 2, or later. See
COPYING for the license.
Questions or comments on software: Maria Rizzo, email address above
energy_1.6.0 released 2013-05-12.
Summary of recent changes in energy package
- Distance correlation t-test for high dimension implemented (introduced in Szekely and Rizzo, 2013, JMVA)
- In eqdist.e and eqdist.etest, method="disco"
was replaced by two options: "discoB" (between sample
components) and "discoF" (disco F ratio).
- In distance components: Added disco.between and internal functions
that compute the disco between-sample component and test.
disco (DIStance COmponents) function and test added in
energy (version 1.2-0, 27-Sept-2010)
disco provides a nonparametric approach to analysis
of structured data, using distance components rather than variance components.
The statistic is related to, but not equivalent to, the ksample statistic.
A disco method has been added to the eqdist.etest function and the corresponding
eqdist.e function.
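The idea of distance components can be sketched numerically: a total dispersion computed from all pairwise distances splits into a between-sample and a within-sample component, in analogy with the ANOVA sum-of-squares decomposition. The code below is a minimal plain-Python sketch using Euclidean distance with exponent 1; it is not the energy package's disco code, and the names `g` and `disco_components` and the toy groups are assumptions made for this example.

```python
import math

def g(a, b):
    """Mean pairwise Euclidean distance between samples a and b."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def disco_components(samples):
    """Return (total, between, within) distance components for a list of samples."""
    pooled = [p for s in samples for p in s]
    n_total = len(pooled)
    # Total dispersion of the pooled sample.
    total = 0.5 * n_total * g(pooled, pooled)
    # Within-sample dispersion, summed over the groups.
    within = sum(0.5 * len(s) * g(s, s) for s in samples)
    # Between-sample component: weighted sum of pairwise energy distances.
    between = 0.0
    for j in range(len(samples)):
        for k in range(j + 1, len(samples)):
            a, b = samples[j], samples[k]
            d = 2.0 * g(a, b) - g(a, a) - g(b, b)
            between += len(a) * len(b) / (2.0 * n_total) * d
    return total, between, within

# Hypothetical toy data: three groups of points in the plane.
groups = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(5.0, 5.0), (6.0, 5.0)],
    [(0.0, 9.0), (1.0, 9.0), (1.0, 8.0), (0.0, 8.0)],
]
total, between, within = disco_components(groups)
print(total, between + within)  # the two values agree up to rounding
```

A test built on this decomposition would compare the between component (or an F-type ratio of between to within) against a permutation distribution, which is left out of the sketch.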
Distance correlation and distance covariance:
The dcov package has been merged into the energy package as of version 1.1-0,
available on CRAN 07-Apr-2008.
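For readers who want to see how a sample distance correlation can be computed, here is a minimal plain-Python sketch based on double-centered pairwise distance matrices. It illustrates the general technique only and is not the code in the energy package; the names `centered_distances`, `dcov2`, and `dcor`, and the toy data, are assumptions made for this example. Observations are passed as tuples, so univariate data use one-element tuples.

```python
import math

def centered_distances(points):
    """Double-centered matrix of pairwise Euclidean distances."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row = [sum(r) / n for r in d]   # row means (equal to column means by symmetry)
    grand = sum(row) / n            # grand mean
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def dcov2(x, y):
    """Squared sample distance covariance of paired samples x and y."""
    a, b = centered_distances(x), centered_distances(y)
    n = len(x)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2

def dcor(x, y):
    """Sample distance correlation; returns 0.0 if either sample is constant."""
    denom = math.sqrt(dcov2(x, x) * dcov2(y, y))
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(dcov2(x, y), 0.0) / denom)

# Hypothetical toy data: y is an exact linear function of x.
x = [(float(i),) for i in range(6)]
y = [(2.0 * i + 1.0,) for i in range(6)]
print(dcor(x, y))
```

The quadratic-time double loop is chosen for readability; it is adequate for small samples but not tuned for large ones.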
Some functions in energy have been translated to Matlab.