Curved exponential families of distributions play an important role in the theory of statistics. For inference in a curved exponential family, we cannot in general find a test statistic as good, powerful and efficient as in a full exponential family. But there is a ray of hope due to Bradley Efron. He has shown in his paper, “Defining the Curvature of a Statistical Problem (with Applications to Second Order Efficiency)” [1], that families with small curvature enjoy the good statistical properties of exponential families.
In a curved exponential family, the variance of the maximum likelihood estimator (MLE) exceeds the Cramér-Rao lower bound, and the MLE is in general not a sufficient statistic. Nonetheless, a curved exponential family with small curvature enjoys good statistical properties. The first step is therefore to find the values of the parameters of the distribution for which the curvature is small. Once such values are found, it can be suggested that, for those parameter values, the test statistic for a given hypothesis is equivalent to that of the corresponding exponential family. Some inference procedures of this kind are available, in fragments, for continuous distributions, but for discrete distributions we do not find any procedure. Motivated by my (Sanchayita Sadhu) project work (submitted during my postgraduate study), this paper seeks some techniques of inference for the binomial distribution, a discrete distribution.
Curved Exponential Family:-
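Let X = (X1, X2, …, Xd) have a distribution $P_\theta$, $\theta \in \Theta \subseteq \mathbb{R}^q$. Suppose $P_\theta$ has a density (p.m.f.) of the form
$$f(x \mid \theta) = k(\theta)\,\exp\Big\{\sum_{i=1}^{k} \eta_i(\theta)\,T_i(x)\Big\}\,h(x),$$
where k > q. Then the family $\{P_\theta,\ \theta \in \Theta\}$ is called a curved exponential family.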
Some Examples of Curved Exponential Family:
A set of independently and identically distributed random variables following N(θ, θ²), where θ is the unknown parameter of the distribution [4].
A random variable X following Gamma(α, 1/α), where α is the unknown parameter of the distribution.
Two random variables X and Y such that
$$X \sim \mathrm{Bin}(n, p), \qquad Y \sim \mathrm{Bin}(m, p^2),$$
where p is the unknown parameter involved in both distributions [4].
A random variable X = (X1, X2) having a bivariate normal distribution with zero means, standard deviations equal to 1 and correlation parameter ρ, where ρ is the unknown parameter [4].
Suppose that, given Zi = zi, i = 1, 2, …, n, the Xi are independent Poi(λzi) variables and Z1, Z2, …, Zn have some joint p.m.f. p(z1, z2, …, zn). It is implicitly assumed that each Zi > 0 with probability 1. Then the joint p.m.f. of (X1, X2, …, Xn, Z1, Z2, …, Zn) is
$$f(x_1,\dots,x_n,z_1,\dots,z_n \mid \lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda z_i}(\lambda z_i)^{x_i}}{x_i!}\; p(z_1,\dots,z_n), \qquad x_i \in N_0,\ z_i \in N_1,$$
where N0 = set of non-negative integers and N1 = set of positive integers [4].
Equicorrelation multivariate normal distribution: suppose (X1, X2, …, Xn) are jointly multivariate normal with general means μi, variances all 1 and a common pairwise correlation ρ. This is an example of a curved exponential family [4].
Applications of curved exponential family:-
First of all, some real-life data whose random variables belong to a curved exponential family are discussed. It is hoped that the analysis done in this paper can be applied to such data to draw inferences.
Some practical examples of curved exponential families are discussed here.
• Mixed Ancestral Graph (MAG):-
The family of distributions represented by a linear MAG M over a set of k variables is a locally parameterized curved exponential family of dimension equal to k(k+1)/2 minus the number of pairs of variables in M that are not adjacent to each other (Theorem 4, “Parameterizing and Scoring Mixed Ancestral Graphs” by Thomas Richardson & Peter Spirtes [2]).
• Lazega’s Lawyer dataset:-
Lazega’s lawyer dataset [5] is another example in which the random variables of the dataset come from a curved family.
• Use in Social Networking:-
David R. Hunter, in his paper “Curved Exponential Family Models for Social Networks” [3], has discussed the usefulness of curved exponential family models in generalizing exponential random graph models (ERGMs).
Concept of Statistical Curvature:-
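The concept of mathematical curvature extends to curved lines in Euclidean k-space $E^k$, say $\mathcal{L} = \{\eta_\theta,\ \theta \in \Theta\}$, where Θ is an interval of the real line. For each θ, $\eta_\theta$ is a vector in $E^k$ whose componentwise derivatives with respect to θ are denoted by
$$\dot\eta_\theta = \frac{\partial}{\partial\theta}\,\eta_\theta \quad \& \quad \ddot\eta_\theta = \frac{\partial^2}{\partial\theta^2}\,\eta_\theta.$$
These derivatives are assumed to exist continuously in a neighborhood of a value of θ where it is wished to define the curvature. Let us also suppose that a k×k symmetric non-negative definite matrix $\Sigma_\theta$ is defined continuously in θ.
Let $M_\theta$ be a 2×2 matrix, with entries denoted
ν20(θ), ν11(θ), ν02(θ),
defined by
$$M_\theta = \begin{pmatrix} \nu_{20}(\theta) & \nu_{11}(\theta) \\ \nu_{11}(\theta) & \nu_{02}(\theta) \end{pmatrix} = \begin{pmatrix} \dot\eta_\theta'\Sigma_\theta\dot\eta_\theta & \dot\eta_\theta'\Sigma_\theta\ddot\eta_\theta \\ \ddot\eta_\theta'\Sigma_\theta\dot\eta_\theta & \ddot\eta_\theta'\Sigma_\theta\ddot\eta_\theta \end{pmatrix} \tag{1}$$
and let
$$\gamma_\theta = \left(\frac{|M_\theta|}{\nu_{20}^{3}(\theta)}\right)^{1/2}. \tag{2}$$
Then $\gamma_\theta$ is “the curvature of $\mathcal{L}$ at θ with respect to the inner product $\Sigma_\theta$”.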
Statistical Curvature:-
$\gamma_\theta$ (given in (2)), the statistical curvature of $\mathcal{F}$ at θ, is the geometric curvature of $\mathcal{L} = \{\eta_\theta : \theta \in \Theta\}$ at θ with respect to the covariance inner product $\Sigma_\theta$, as defined in (1) and (2).
Here $\mathcal{F}$ stands for the family of densities $\{f_\theta(x) : \theta \in \Theta\}$, our curved exponential family.
Procedure for Finding Curvature of Various Distributions from Curved Exponential Family:-
First, the p.m.f./p.d.f. of the random variables is arranged in the known form of a curved exponential family,
i.e. in the form
$$f(x \mid \theta) = k(\theta)\,\exp\Big\{\sum_{i=1}^{k} \eta_i(\theta)\,T_i(x)\Big\}\,h(x).$$
Then, after identifying the ηi(θ)’s and Ti(x)’s, the covariance matrix $\Sigma_\theta$ of (T1, …, Tk)
is constructed; $\nu_{20}(\theta)$, $\nu_{11}(\theta)$ and $\nu_{02}(\theta)$ are computed and the matrix $M_\theta$ of (1) is constructed. $|M_\theta| / \nu_{20}^{3}(\theta)$ is then computed. If there are some negative values in this result, the absolute values are taken. Finally $\gamma_\theta = \big(|M_\theta| / \nu_{20}^{3}(\theta)\big)^{1/2}$, i.e. the statistical curvature, is calculated for various values of θ.
Using this method, the values of the statistical curvature corresponding to the possible values of the parameters are found. From these values one can readily conclude for which parameter values the curvature is small, and for these values the given curved exponential family can be said to enjoy the good statistical properties of exponential families.
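The procedure above can be sketched in a few lines of Python. This is only an illustration of definitions (1) and (2), not the R program used for the paper; the function name `statistical_curvature` and its argument names are hypothetical, and the caller is assumed to supply the derivative vectors and the covariance matrix evaluated at one value of θ.

```python
import math

def statistical_curvature(eta_dot, eta_ddot, sigma):
    """Return gamma = sqrt(|M| / nu20^3) from (1) and (2) at a single theta."""
    k = len(eta_dot)

    def form(u, v):
        # Bilinear form u' * sigma * v
        return sum(u[i] * sigma[i][j] * v[j] for i in range(k) for j in range(k))

    nu20 = form(eta_dot, eta_dot)
    nu11 = form(eta_dot, eta_ddot)
    nu02 = form(eta_ddot, eta_ddot)
    det_m = nu20 * nu02 - nu11 ** 2
    # The absolute value mirrors the step in the procedure that guards
    # against small negative values produced by rounding
    return math.sqrt(abs(det_m) / nu20 ** 3)

# Sanity check: the curve eta = (cos t, sin t) with the identity inner
# product is a unit circle, whose geometric curvature is 1
t = 0.3
circle = statistical_curvature(
    [-math.sin(t), math.cos(t)],
    [-math.cos(t), -math.sin(t)],
    [[1.0, 0.0], [0.0, 1.0]],
)
print(circle)
```

When the natural parameter is linear in θ (a full exponential family), the second derivative vanishes and the function returns 0, which matches the interpretation of curvature as a departure from exponentiality.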
The Area of Work:-
The detailed analysis to find the parameter values for which the curvature is small is carried out on the following distributions, which are particular examples of curved exponential families.
The examples are given below:
A set of independently and identically distributed random variables following N(θ, θ²), where θ is the unknown parameter of the distribution.
A random variable X following Gamma(α, 1/α), where α is the unknown parameter of the distribution.
Two random variables X and Y such that
$$X \sim \mathrm{Bin}(n, p), \qquad Y \sim \mathrm{Bin}(m, p^2),$$
where p is the unknown parameter involved in both distributions.
A random variable X = (X1, X2) having a bivariate normal distribution with zero means, standard deviations equal to 1 and correlation parameter ρ, where ρ is the unknown parameter.
Curvature of some curved distributions:-
1. On Normal Distribution:-
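Let X1, X2, …, Xn be iid N(θ, θ²).
Here
$$f(x \mid \theta) = \prod_{i=1}^{n} \frac{1}{\theta\sqrt{2\pi}}\exp\Big\{-\frac{(x_i-\theta)^2}{2\theta^2}\Big\}$$
$$= (2\pi\theta^2)^{-n/2}\, e^{-n/2}\exp\Big\{-\frac{1}{2\theta^2}\sum_{i=1}^{n} x_i^2 + \frac{1}{\theta}\sum_{i=1}^{n} x_i\Big\}$$
$$= k(\theta)\exp\big(\eta_1(\theta)\,T_1 + \eta_2(\theta)\,T_2\big)\,h(x)$$
Here,
$\eta_1(\theta) = -\dfrac{1}{2\theta^2}$, $\eta_2(\theta) = \dfrac{1}{\theta}$,
$T_1 = \sum_{i=1}^{n} x_i^2$, $T_2 = \sum_{i=1}^{n} x_i$.
Now, the work is to find V(T1), V(T2) and Cov(T1, T2).
It is known that the moment generating function of X is
$$M_X(t) = \exp\big(\theta t + \tfrac{1}{2}\theta^2 t^2\big) \tag{1.0}$$
Now,
$$V(T_2) = V\Big(\sum_{i=1}^{n} X_i\Big) = n\theta^2.$$
Therefore from (1.0) it can be computed that
E(X²) = 2θ², E(X³) = 4θ³, E(X⁴) = 10θ⁴
∴ V(Xi²) = E(Xi⁴) − [E(Xi²)]² = 6θ⁴
∴ V(T1) = 6nθ⁴.
Now,
Cov(Xi², Xi) = E(Xi³) − E(Xi²) E(Xi) = 2θ³
∴ Cov(T1, T2) = 2nθ³.
Hence the dispersion matrix (variance-covariance matrix) is given by
$$\Sigma_\theta = n\begin{pmatrix} 6\theta^4 & 2\theta^3 \\ 2\theta^3 & \theta^2 \end{pmatrix}.$$
Also,
$$\eta_\theta = \Big(-\frac{1}{2\theta^2},\ \frac{1}{\theta}\Big)'$$
$$\therefore\ \dot\eta_\theta = \Big(\frac{1}{\theta^3},\ -\frac{1}{\theta^2}\Big)' \quad \& \quad \ddot\eta_\theta = \Big(-\frac{3}{\theta^4},\ \frac{2}{\theta^3}\Big)'$$
$$\therefore\ \nu_{20}(\theta) = \dot\eta_\theta'\Sigma_\theta\dot\eta_\theta = n\Big(\frac{6}{\theta^2} - \frac{4}{\theta^2} + \frac{1}{\theta^2}\Big)$$
$$= \frac{3n}{\theta^2} \tag{1.1}$$
Similarly,
$$\nu_{11}(\theta) = \dot\eta_\theta'\Sigma_\theta\ddot\eta_\theta = -\frac{10n}{\theta^3}. \tag{1.2}$$
$$\ddot\eta_\theta'\Sigma_\theta\dot\eta_\theta = -\frac{10n}{\theta^3} \tag{1.3}$$
$$\nu_{02}(\theta) = \ddot\eta_\theta'\Sigma_\theta\ddot\eta_\theta = \frac{34n}{\theta^4} \tag{1.4}$$
So, Mθ is given by
$$M_\theta = n\begin{pmatrix} 3/\theta^2 & -10/\theta^3 \\ -10/\theta^3 & 34/\theta^4 \end{pmatrix} \tag{1.5}$$
$$\therefore\ |M_\theta| = \frac{2n^2}{\theta^6} \tag{1.6}$$
$$\text{and}\quad \nu_{20}^{3}(\theta) = \frac{27n^3}{\theta^6} \tag{1.7}$$
$$\therefore\ \text{Curvature} = \gamma_\theta = \sqrt{\frac{2}{27n}} \tag{1.8}$$
So the curvature does not depend on θ: for all possible values of θ the value of the curvature is small (√(2/27) ≈ 0.27 for a single observation, decreasing as n grows).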
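As a cross-check of the algebra for the N(θ, θ²) family, the following sketch evaluates the entries of Mθ in exact rational arithmetic. It assumes the natural parameter η = (−1/(2θ²), 1/θ) obtained by expanding the normal density, with T = (Σxi², Σxi) and the moments V(Xi²) = 6θ⁴, Cov(Xi², Xi) = 2θ³, V(Xi) = θ² quoted in the text; the function name `normal_curvature_squared` is hypothetical.

```python
from fractions import Fraction

def normal_curvature_squared(theta, n):
    """Exact gamma^2 = |M| / nu20^3 for n iid N(theta, theta^2) observations."""
    theta = Fraction(theta)
    # Covariance matrix of (sum x_i^2, sum x_i)
    s11, s12, s22 = 6 * n * theta**4, 2 * n * theta**3, n * theta**2
    # First and second derivatives of eta = (-1/(2 theta^2), 1/theta)
    d1, d2 = 1 / theta**3, -1 / theta**2
    e1, e2 = -3 / theta**4, 2 / theta**3
    nu20 = s11 * d1 * d1 + 2 * s12 * d1 * d2 + s22 * d2 * d2
    nu11 = s11 * d1 * e1 + s12 * (d1 * e2 + d2 * e1) + s22 * d2 * e2
    nu02 = s11 * e1 * e1 + 2 * s12 * e1 * e2 + s22 * e2 * e2
    return (nu20 * nu02 - nu11**2) / nu20**3

# The same value is returned for every theta, and it shrinks like 1/n
for theta in (Fraction(1, 3), 1, 7):
    print(theta, normal_curvature_squared(theta, n=1))
```

Under these assumptions the squared curvature simplifies to 2/(27n), so the family behaves more and more like a full exponential family as the sample size grows, whatever the value of θ.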
2. On Gamma Distribution:-
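Let X be a random variable which follows Gamma(α, 1/α),
i.e. X ~ G(α, 1/α).
Then the p.d.f. of X, with 1/α entering as the rate parameter, is given by
$$f(x \mid \alpha) = \frac{1}{\alpha^{\alpha}\,\Gamma(\alpha)}\; x^{\alpha-1}\, e^{-x/\alpha}, \qquad x > 0 \tag{2.1}$$
$$\therefore\ f(x \mid \alpha) = \frac{1}{\alpha^{\alpha}\,\Gamma(\alpha)}\exp\Big\{(\alpha-1)\log x - \frac{x}{\alpha}\Big\}$$
$$= k(\alpha)\exp\big\{\eta_1(\alpha)\,T_1 + \eta_2(\alpha)\,T_2\big\}\,h(x)$$
$$= k(\alpha)\exp\Big\{(\alpha-1)\log x - \frac{x}{\alpha}\Big\} \tag{2.2}$$
Here η1(α) = α − 1 and η2(α) = −1/α,
T1 = log x and T2 = x.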
Now, to find the required variances and covariance, a simulation procedure is followed. After computing the required expressions, the values of the curvature are found using an R program.
The curvature of this Gamma distribution is shown in the following figure:
Figure 1
So, from Figure 1, it may be concluded that for α = 0 to 46 the value of the curvature is small.
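As an alternative to simulation, the moments of (log X, X) are available in closed form, and the sketch below uses them. It rests on assumptions that are not stated in the paper: that 1/α is the rate (so the scale is α), giving V(log X) = ψ′(α) (the trigamma function), V(X) = α³ and Cov(log X, X) = α, with η = (α − 1, −1/α). The helper names `trigamma` and `gamma_curvature` are hypothetical, and the output is not claimed to reproduce the simulated values behind Figure 1.

```python
import math

def trigamma(x):
    """psi'(x) for x > 0 via psi'(x) = psi'(x + 1) + 1/x^2 and an asymptotic series."""
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    total += inv + inv2 / 2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 / 30)))
    return total

def gamma_curvature(alpha):
    """Curvature of the Gamma(alpha, rate 1/alpha) family with T = (log x, x)."""
    s11, s12, s22 = trigamma(alpha), alpha, alpha**3
    # Derivatives of eta = (alpha - 1, -1/alpha)
    d1, d2 = 1.0, 1.0 / alpha**2
    e1, e2 = 0.0, -2.0 / alpha**3
    nu20 = s11 * d1 * d1 + 2 * s12 * d1 * d2 + s22 * d2 * d2
    nu11 = s11 * d1 * e1 + s12 * (d1 * e2 + d2 * e1) + s22 * d2 * e2
    nu02 = s11 * e1 * e1 + 2 * s12 * e1 * e2 + s22 * e2 * e2
    return math.sqrt(abs(nu20 * nu02 - nu11**2) / nu20**3)

for alpha in (0.5, 1.0, 5.0, 20.0, 46.0):
    print(alpha, gamma_curvature(alpha))
```

Working from exact moments avoids Monte Carlo noise in the variance of log X, which can matter when the determinant of Mθ is a small difference of two larger numbers.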
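3. On Binomial Distribution:-
Let X follow Binomial(n, p) and Y follow Binomial(m, p²),
i.e. X ~ Bin(n, p), Y ~ Bin(m, p²).
Then, X and Y being independent, the joint p.m.f. of (X, Y) is
$$f(x, y \mid p) = \binom{n}{x} p^{x}(1-p)^{n-x}\binom{m}{y} p^{2y}(1-p^2)^{m-y}$$
$$= \binom{n}{x}\binom{m}{y}(1-p)^{n}(1-p^2)^{m}\exp\Big\{x\log\frac{p}{1-p} + y\log\frac{p^2}{1-p^2}\Big\}$$
$$\therefore\ \eta_1(p) = \log\frac{p}{1-p}, \qquad \eta_2(p) = \log\frac{p^2}{1-p^2} \quad \text{and}$$
T1(x,y) = x , T2(x,y) = y.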
With this information an R program is used to find the values of the curvature.
The curvature of the mentioned distribution is given in the following figure:
Figure 2
It is known that a test statistic exists for a single binomial distribution. Here, however, we have the joint distribution of two independent random variables which follow Bin(n, p) and Bin(m, p²) respectively. So, even for the values of p for which the curvature is small, no test statistic that is valid in the corresponding exponential family can be referred to.
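The curvature of this binomial pair can be evaluated directly, since T1 = x and T2 = y are independent with V(X) = np(1 − p) and V(Y) = mp²(1 − p²). The sketch below is a Python illustration rather than the R program used for Figure 2; the function name `binomial_pair_curvature` is hypothetical and the derivatives of η1 and η2 were worked out by hand for this example.

```python
import math

def binomial_pair_curvature(p, n, m):
    """Curvature for independent X ~ Bin(n, p), Y ~ Bin(m, p^2), with T = (x, y)."""
    q = p * p
    var_x = n * p * (1 - p)
    var_y = m * q * (1 - q)
    # eta1 = log(p / (1 - p)), eta2 = log(p^2 / (1 - p^2))
    d1 = 1 / (p * (1 - p))
    d2 = 2 / (p * (1 - q))
    e1 = (2 * p - 1) / (p * (1 - p)) ** 2
    e2 = 2 * (3 * q - 1) / (p * (1 - q)) ** 2
    # Sigma is diagonal because X and Y are independent
    nu20 = var_x * d1 * d1 + var_y * d2 * d2
    nu11 = var_x * d1 * e1 + var_y * d2 * e2
    nu02 = var_x * e1 * e1 + var_y * e2 * e2
    return math.sqrt(abs(nu20 * nu02 - nu11**2) / nu20**3)

# Curvature over a grid of p for n = 5, m = 3
for p in (0.1, 0.3, 0.5, 0.6, 0.7, 0.8, 0.85, 0.95):
    print(p, binomial_pair_curvature(p, n=5, m=3))
```

Setting m = 0 removes Y and leaves a one-parameter binomial family, for which the function returns a value that is zero up to rounding; this is a convenient check that the derivative formulas are consistent.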
Inference of the Above Binomial Distributions:
• Test Procedure:
To overcome this problem, this paper seeks a test procedure for drawing an inference about the null hypothesis. The procedure used here is the likelihood ratio test (LRT).
From Figure 2 it is seen that the curvature is small for p = 0.6 to 0.85.
This paper wants to test
H0 : p = 0.8 vs. H1 : p ≠ 0.8
To find the LRT, the maximum likelihood estimator of p is found first. The likelihood equation is
$$\frac{x+2y}{p} - \frac{n-x}{1-p} - \frac{2p\,(m-y)}{1-p^2} = 0. \tag{3.0}$$
Solving (3.0), taking n = 5, m = 3 [using R], gives the MLE $\hat p$.
Therefore, for the LRT,
$$\lambda = \frac{f(x, y \mid p = 0.8)}{f(x, y \mid \hat p)}.$$
Using R, the values of λ are found.
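The maximum likelihood step can also be done without a numerical solver. Multiplying the score equation by p(1 − p²) gives the quadratic (n + 2m)p² + (n − x)p − (x + 2y) = 0, whose positive root is the MLE; this closed form is a derivation added for illustration and is not the R routine used in the paper. The sketch below, with hypothetical function names `mle_p`, `log_lik` and `lrt`, then forms the standard likelihood ratio Λ = L(p0)/L(p̂), which always lies in (0, 1]. It is not claimed to reproduce the λ column of Table 3.

```python
import math

def mle_p(x, y, n=5, m=3):
    """Positive root of (n + 2m) p^2 + (n - x) p - (x + 2y) = 0."""
    a, b, c = n + 2 * m, n - x, x + 2 * y
    return (-b + math.sqrt(b * b + 4 * a * c)) / (2 * a)

def _xlog(k, v):
    # Convention 0 * log(0) = 0, needed at the boundary estimates p = 0 and p = 1
    return 0.0 if k == 0 else k * math.log(v)

def log_lik(p, x, y, n=5, m=3):
    """Log-likelihood without the binomial coefficients, which cancel in the ratio."""
    return _xlog(x + 2 * y, p) + _xlog(n - x, 1 - p) + _xlog(m - y, 1 - p * p)

def lrt(x, y, p0=0.8, n=5, m=3):
    """Likelihood ratio L(p0) / L(p_hat) for H0: p = p0."""
    p_hat = mle_p(x, y, n, m)
    return math.exp(log_lik(p0, x, y, n, m) - log_lik(p_hat, x, y, n, m))

for x in range(6):
    for y in range(4):
        print((x, y), round(mle_p(x, y), 7), round(lrt(x, y), 7))
```

Small values of Λ speak against H0, so a rejection region can be built by ordering the 24 sample points by Λ and accumulating their probabilities under p = 0.8 until the chosen level is reached.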
Let us consider the following table, from which a decision can be made:
Table 3
Ob. No. | (x, y) | λ | Sorted λ (≡ q) | Corresponding (x, y) | Probability sum (≡ s)
1 | (0,0) | 460.0262373 | 0.2450758 | (5,3) | 0.08589935
2 | (0,1) | 171.7394474 | 0.6127021 | (4,3) | 0.10737418
3 | (0,2) | 64.1146861 | 0.6564672 | (5,2) | 0.15569256
4 | (0,3) | 23.9356364 | 1.5317867 | (3,3) | 0.16106127
5 | (1,0) | 184.0067301 | 1.6412016 | (4,2) | 0.17314087
6 | (1,1) | 68.6943735 | 1.7584320 | (5,1) | 0.20031996
7 | (1,2) | 25.6453497 | 3.8295451 | (2,3) | 0.20166214
8 | (1,3) | 9.5740587 | 4.1030881 | (3,2) | 0.20468204
9 | (2,0) | 73.6011862 | 4.3961701 | (4,1) | 0.21147681
10 | (2,1) | 27.4771872 | 4.7101868 | (5,0) | 0.22676505
11 | (2,2) | 10.2579300 | 9.5740587 | (1,3) | 0.22710059
12 | (2,3) | 3.8295451 | 10.2579300 | (2,2) | 0.22785556
13 | (3,0) | 29.4398721 | 10.9906500 | (3,1) | 0.22955426
14 | (3,1) | 10.9906500 | 11.7757079 | (4,0) | 0.23337632
15 | (3,2) | 4.1030881 | 23.9356364 | (0,3) | 0.23346020
16 | (3,3) | 1.5317867 | 25.6453497 | (1,2) | 0.23364895
17 | (4,0) | 11.7757079 | 27.4771872 | (2,1) | 0.23407362
18 | (4,1) | 4.3961701 | 29.4398721 | (3,0) | 0.23502914
19 | (4,2) | 1.6412016 | 64.1146861 | (0,2) | 0.23507632
20 | (4,3) | 0.6127021 | 68.6943735 | (1,1) | 0.23518249
21 | (5,0) | 4.7101868 | 73.6011862 | (2,0) | 0.23542137
22 | (5,1) | 1.7584320 | 171.7394474 | (0,1) | 0.23544791
23 | (5,2) | 0.6564672 | 184.0067301 | (1,0) | 0.23550763
24 | (5,3) | 0.2450758 | 460.0262373 | (0,0) | 0.23552256
With the help of Table 3, the decision reached is as follows:
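Now the Bayesian approach is applied to this distribution. Using the Bayesian method, the Bayes estimate of the parameter p is found. Then the Bayes risk and the risk under the original distribution are computed.
For this purpose, let us consider the prior
$$\pi(p) = \frac{p^{a-1}(1-p)^{b-1}}{B(a, b)}, \qquad 0 < p < 1,$$
and let f(x, y | p) be the joint p.m.f. of (X, Y).
∴ The posterior distribution of p is
$$\pi(p \mid x, y) = \frac{p^{x+2y+a-1}(1-p)^{n+m-x-y+b-1}(1+p)^{m-y}}{\int_0^1 p^{x+2y+a-1}(1-p)^{n+m-x-y+b-1}(1+p)^{m-y}\,dp} \tag{3.1}$$
Now substituting the binomial expansion of $(1+p)^{m-y}$ in the integral in (3.1), it can be written as
$$\int_0^1 p^{x+2y+a-1}(1-p)^{n+m-x-y+b-1}(1+p)^{m-y}\,dp = \sum_{j=0}^{m-y}\binom{m-y}{j} B(x+2y+a+j,\ n+m-x-y+b) \tag{3.2}$$
From (3.1) and (3.2),
$$\pi(p \mid x, y) = \frac{p^{x+2y+a-1}(1-p)^{n+m-x-y+b-1}(1+p)^{m-y}}{\sum_{j=0}^{m-y}\binom{m-y}{j} B(x+2y+a+j,\ n+m-x-y+b)}$$
Using this posterior distribution, the value of p that maximizes it (the ML estimate in the Bayesian method, i.e. the posterior mode) can be obtained.
The following steps find the Bayes estimate from the posterior distribution.
Under squared error loss (SEL), the Bayes estimate with respect to π(p) is the posterior mean
$$E(p \mid x, y) = \int_0^1 p\,\pi(p \mid x, y)\,dp.$$
So, under squared error loss the Bayes estimate of p with respect to the prior π(p) is
$$\hat p_B = \frac{\sum_{j=0}^{m-y}\binom{m-y}{j} B(x+2y+a+j+1,\ n+m-x-y+b)}{\sum_{j=0}^{m-y}\binom{m-y}{j} B(x+2y+a+j,\ n+m-x-y+b)} \tag{3.3}$$
= C (say).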
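The Bayes estimate can be evaluated with a few Beta functions. The sketch below assumes a Beta(a, b) prior, so that the posterior is proportional to p^(x+2y+a−1)(1−p)^(n+m−x−y+b−1)(1+p)^(m−y), and expands (1+p)^(m−y) binomially; the function names are hypothetical. The posterior mode is obtained from the quadratic that results from setting the derivative of the log-posterior to zero, which is a derivation added here for illustration.

```python
import math

def beta_fn(s, t):
    """Beta function B(s, t) computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(s) + math.lgamma(t) - math.lgamma(s + t))

def bayes_estimate(x, y, n=5, m=3, a=2, b=3):
    """Posterior mean of p under squared error loss."""
    s = x + 2 * y + a          # exponent of p, plus one
    t = n + m - x - y + b      # exponent of (1 - p), plus one
    k = m - y                  # exponent of (1 + p)
    num = sum(math.comb(k, j) * beta_fn(s + j + 1, t) for j in range(k + 1))
    den = sum(math.comb(k, j) * beta_fn(s + j, t) for j in range(k + 1))
    return num / den

def posterior_mode(x, y, n=5, m=3, a=2, b=3):
    """Positive root of (s + t + k - 2) p^2 + (t - 1 - k) p - (s - 1) = 0."""
    s, t, k = x + 2 * y + a, n + m - x - y + b, m - y
    qa, qb, qc = s + t + k - 2, t - 1 - k, s - 1
    return (-qb + math.sqrt(qb * qb + 4 * qa * qc)) / (2 * qa)

for x, y in [(0, 0), (0, 1), (1, 3), (5, 3)]:
    print((x, y), round(posterior_mode(x, y), 7), round(bayes_estimate(x, y), 7))
```

With n = 5, m = 3, a = 2 and b = 3, the values printed for these sample points can be compared with the corresponding rows of the table of estimates reported in the paper.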
Maximum Likelihood Estimation:
This estimation is carried out, and a comparison is given in the following table.
The ML estimates of p in the classical method and in the Bayesian method (i.e. the maximizers of the likelihood and of the posterior density), together with the Bayes estimates of p with respect to the prior π(p), are given in the following table (taking x = 0 to 5, y = 0 to 3, a = 2, b = 3):
Values of (x, y) | Accepted value of p in classical method [ML estimate] | Accepted value of p in Bayesian method, i.e. in the posterior distribution [ML estimate] | Bayes estimate of p
(0,0) | NA (0) | 0.1159392 | 0.1796218
(0,1) | 0.2559191 | 0.2761045 | 0.3071429
(0,2) | 0.4171725 | 0.3977991 | 0.4107143
(0,3) | 0.5454764 | 0.5 | 0.5000000
(1,0) | 0.1702842 | 0.2201958 | 0.2629310
(1,1) | 0.3711419 | 0.3615899 | 0.3798077
(1,2) | 0.5164583 | 0.4745179 | 0.4772727
(1,3) | 0.6363635 | 0.5714321 | 0.5625000
(2,0) | 0.3112872 | 0.3175872 | 0.3433014
(2,1) | 0.4818915 | 0.4451788 | 0.4513889
(2,2) | 0.6146623 | 0.5507349 | 0.5434783
(2,3) | 0.7272721 | 0.6428568 | 0.6250000
(3,0) | 0.4391662 | 0.4104344 | 0.4213710
(3,1) | 0.5893767 | 0.5272325 | 0.5220588
(3,2) | 0.7119901 | 0.6264452 | 0.6093750
(3,3) | 0.8181791 | 0.7142792 | 0.6875000
(4,0) | 0.5592660 | 0.5000000 | 0.4975962
(4,1) | 0.6944718 | 0.6080283 | 0.5919540
(4,2) | 0.8085302 | 0.7017674 | 0.6750000
(4,3) | 0.9090926 | 0.785716 | 0.7500000
(5,0) | 0.6742002 | 0.5870874 | 0.5723140
(5,1) | 0.7977114 | 0.6878677 | 0.6611842
(5,2) | 0.9045345 | 0.7767435 | 0.7403846
(5,3) | 0.0000000 | 0.8571339 | 0.8125000
(3.4)
Hence the Bayes risk is given by
=0.0007318181 with absolute error < 1e-04 (using R software)
Now the risk under the joint distribution of X and Y is to be found. For this purpose
$E_{f(x,y\mid p)}\big(p - E_{f(x,y\mid p)}(p)\big)^2$ is to be computed.
Hence the risk under the joint distribution of X and Y can be found in the following way.
The integration can be carried out as in the earlier computations [e.g. the procedure followed to find (3.4)], using various values of the di’s.
Finally, using R software,
= 0.002524987
4. On Bivariate Normal:-
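Let us suppose X = (X1, X2) has a bivariate normal distribution with zero means, standard deviations equal to 1 and correlation parameter ρ.
Then the density of X is
$$f(x \mid \rho) = \frac{1}{2\pi\sqrt{1-\rho^2}}\exp\Big\{-\frac{x_1^2 + x_2^2 - 2\rho\,x_1 x_2}{2(1-\rho^2)}\Big\} \tag{4.1}$$
Therefore
$$\eta_1(\rho) = -\frac{1}{2(1-\rho^2)}, \quad \eta_2(\rho) = \frac{\rho}{1-\rho^2}, \quad T_1 = x_1^2 + x_2^2, \quad T_2 = x_1 x_2.$$
To compute the statistical curvature of this distribution, the moment generating function of the bivariate normal distribution and the corresponding calculations are considered first.
The moment generating function of X ~ N(0, 0, 1, 1, ρ) is given by
$$M(t_1, t_2) = \exp\Big\{\tfrac{1}{2}\big(t_1^2 + 2\rho\,t_1 t_2 + t_2^2\big)\Big\} = \sum_{i}\sum_{j}\mu_{ij}\frac{t_1^{i}\,t_2^{j}}{i!\,j!} \tag{4.2}$$
Equating the coefficients on both sides of (4.2), the bivariate moments are
μ11 = coefficient of t1t2 = ρ
μ20 = coefficient of t1²/2! = 1
μ02 = coefficient of t2²/2! = 1
μ40 = coefficient of t1⁴/4! = 3
μ04 = coefficient of t2⁴/4! = 3
μ22 = coefficient of t1²t2²/(2! 2!) = 1 + 2ρ²
Also, μij = 0 if i + j is odd.
Now V(T1) = V(x1² + x2²)
= V(x1²) + V(x2²) + 2 Cov(x1², x2²)
Now,
V(x1²) = μ40 − μ20² = 2,
V(x2²) = 2
& Cov(x1², x2²) = μ22 − μ20 μ02 = 2ρ²
∴ V(T1) = 2 + 2 + 4ρ² = 4(1 + ρ²) (4.3)
V(T2) = V(x1x2) = μ22 − μ11² = 1 + ρ² (4.4)
And after calculation,
Cov(T1, T2) = 4ρ (4.5)
$$\therefore\ \Sigma_\rho = \begin{pmatrix} 4(1+\rho^2) & 4\rho \\ 4\rho & 1+\rho^2 \end{pmatrix} \tag{4.6}$$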
With this information an R program is used to find the values of the curvature.
The curvature of the mentioned distribution is given in the following figure:
Figure 3
From the above figure it is observed that when ρ lies in (−1, −0.5) or (0.5, 1), this curved family performs just like the corresponding exponential family. However, a good test statistic for the bivariate normal distribution, which belongs to the exponential family, is available when ρ0 (the value under the null hypothesis) is equal to 0. So, in this case a good statistic cannot be found.
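The curvature of the bivariate normal correlation family can be computed from the covariance entries V(T1) = 4(1 + ρ²), V(T2) = 1 + ρ² and Cov(T1, T2) = 4ρ. The sketch below assumes the natural parameter η = (−1/(2(1 − ρ²)), ρ/(1 − ρ²)) paired with T = (x1² + x2², x1x2); the function name `bvn_curvature` is hypothetical, and the derivatives of η were worked out by hand for this illustration rather than taken from the R program behind Figure 3.

```python
import math

def bvn_curvature(rho):
    """Curvature of the bivariate normal family indexed by the correlation rho."""
    u = 1.0 - rho * rho
    # Covariance matrix of (T1, T2) = (x1^2 + x2^2, x1 * x2)
    s11, s12, s22 = 4 * (1 + rho * rho), 4 * rho, 1 + rho * rho
    # First derivatives of eta = (-1 / (2u), rho / u) with respect to rho
    d1 = -rho / u**2
    d2 = (1 + rho * rho) / u**2
    # Second derivatives
    e1 = -1 / u**2 - 4 * rho * rho / u**3
    e2 = 2 * rho / u**2 + 4 * rho * (1 + rho * rho) / u**3
    nu20 = s11 * d1 * d1 + 2 * s12 * d1 * d2 + s22 * d2 * d2
    nu11 = s11 * d1 * e1 + s12 * (d1 * e2 + d2 * e1) + s22 * d2 * e2
    nu02 = s11 * e1 * e1 + 2 * s12 * e1 * e2 + s22 * e2 * e2
    return math.sqrt(abs(nu20 * nu02 - nu11**2) / nu20**3)

for rho in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(rho, bvn_curvature(rho))
```

Numerically, the output agrees with the expression 2(1 − ρ²)/(1 + ρ²)^(3/2) at the points checked, so the curvature is largest at ρ = 0 and falls toward zero as |ρ| approaches 1; this closed form is offered as a check on the code, not as a result quoted from the paper.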
Remark:
It is hoped that this can be done for other discrete curved families as well, and there is plenty of scope for further research.
References
[1] Defining the Curvature of a Statistical Problem (with Applications to Second Order Efficiency). Bradley Efron. The Annals of Statistics, Vol. 3, No. 6 (Nov. 1975), pp. 1189-1242.
[2] Parameterizing and Scoring Mixed Ancestral Graphs. Thomas Richardson & Peter Spirtes. Technical Report No. CMU-PHIL-102, August 1999.
[3] Curved Exponential Family Models for Social Networks. David R. Hunter, Penn State University, March 2, 2006.