Abstract:
In this paper, a time series $\{X(t, \omega), t \in T\}$, where $X$ is a random variable (r.v.) on a probability space $(\Omega, C, P)$, is explained. Properties of stability are illustrated with supporting real-life examples, and conclusions are drawn by hypothesis testing. Temperature data for 33 years from five districts of the Marathwada region of Maharastra State were analyzed. A preliminary discussion of the properties of time series precedes the application to the regional district-wise temperature data.
Keywords: Time series, regression equation, auto-covariance, auto-correlation.
1.INTRODUCTION:
Our aim here is to illustrate a few properties of stationary time series with supporting real-life examples. The concepts of auto-covariance and auto-correlation, which are easily introduced, are shown to be useful. In this article we use temperature data for 1970 to 2002 at five locations in the Marathwada region to illustrate most of the theoretically established properties.
2.BASIC CONCEPTS:
Basic definitions and a few properties of stationary time series are given in this section.
Definition 2.1: A time series: Let $(\Omega, C, P)$ be a probability space and let $T$ be an index set. A real-valued time series is a real-valued function $X(t, \omega)$ defined on $T \times \Omega$ such that, for each fixed $t \in T$, $X(t, \omega)$ is a random variable on $(\Omega, C, P)$. The function $X(t, \omega)$ is written as $X_t(\omega)$ or $X_t$, and a time series is considered as a collection $\{X_t : t \in T\}$ of random variables [8].
Definition 2.3: Stationary time series: A process whose probability structure does not change with time is called stationary. Broadly speaking, a time series is said to be stationary if there is no systematic change in the mean (i.e. no trend) and no systematic change in the variance.
Definition 2.4: Strictly stationary time series: A time series is called strictly stationary if its joint distribution functions satisfy

$$F_{X_{t_1}, X_{t_2}, \ldots, X_{t_n}}(x_{t_1}, x_{t_2}, \ldots, x_{t_n}) = F_{X_{t_1+h}, X_{t_2+h}, \ldots, X_{t_n+h}}(x_{t_1}, x_{t_2}, \ldots, x_{t_n}) \qquad (1)$$

where the equality must hold for all possible sets of indices $t_i$ and $t_i + h$ in the index set. Further, the joint distribution depends only on the distance $h$ between the elements in the index set and not on their actual values.
Theorem 2.1: If $\{X_t : t \in T\}$ is strictly stationary with $E\{|X_t|\} < \alpha$ and $E\{|X_t - \mu|\} < \beta$, then

$$E(X_t) = E(X_{t+h}) \quad \text{for all } t, h,$$

and

$$E[(X_{t_1} - \mu)(X_{t_2} - \mu)] = E[(X_{t_1+h} - \mu)(X_{t_2+h} - \mu)] \quad \text{for all } t_1, t_2, h. \qquad (2)$$

Proof: The proof follows from Definition 2.4.
In usual cases, equation (2) is used to establish that a time series is stationary, i.e. that there is no trend.
Definition 2.5: Weakly stationary time series: A time series is called weakly stationary if
1. The expected value of $X_t$ is a constant for all $t$.
2. The covariance matrix of $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$ is the same as the covariance matrix of $(X_{t_1+h}, X_{t_2+h}, \ldots, X_{t_n+h})$.
In the covariance matrix of $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$, the diagonal terms are the covariances cov$(X_{t_i}, X_{t_i})$, which are essentially variances, and the off-diagonal terms are covariances of the form cov$(X_{t_i}, X_{t_j})$. Hence the definitions that follow assume importance. Since these quantities involve elements from the same set $\{X_{t_i}\}$, the variances and covariances are called auto-variances and auto-covariances.
Definition 2.6: Auto-covariance function: The covariance between $X_t$ and $X_{t+h}$, separated by $h$ time units, is called the auto-covariance at lag $h$ and is denoted by $\gamma(h)$:

$$\gamma(h) = \operatorname{cov}(X_t, X_{t+h}) = E\{(X_t - \mu)(X_{t+h} - \mu)\} \qquad (3)$$

The function $\gamma(h)$ is called the auto-covariance function.
Definition 2.7: The auto-correlation function: The correlation between observations separated by $h$ time units is called the auto-correlation at lag $h$. It is given by

$$\rho(h) = \frac{E\{(X_t - \mu)(X_{t+h} - \mu)\}}{[E\{(X_t - \mu)^2\}\, E\{(X_{t+h} - \mu)^2\}]^{1/2}} = \frac{\gamma(h)}{[E\{(X_t - \mu)^2\}\, E\{(X_{t+h} - \mu)^2\}]^{1/2}} \qquad (4)$$

where $\mu$ is the mean.
Remark 2.1: For a vector stationary time series the variance at time $t + h$ is the same as that at time $t$. Thus, the auto-correlation at lag $h$ is

$$\rho(h) = \frac{\gamma(h)}{\gamma(0)} \qquad (5)$$

Remark 2.2: For $h = 0$, we get $\rho(0) = 1$.
For the application, attempts have been made to establish that the temperature in certain districts of Marathwada satisfies equations (1) and (5).
Theorem 2.2: The covariance of a real-valued stationary time series is an even function of $h$, i.e. $\gamma(h) = \gamma(-h)$.
Proof: We assume, without loss of generality, that $E\{X_t\} = 0$. Since the series is stationary, $E\{X_t X_{t+h}\} = \gamma(h)$ for all $t$ and $t + h$ contained in the index set. Therefore, if we set $t_0 = t_1 - h$,

$$\gamma(h) = E\{X_{t_0} X_{t_0+h}\} = E\{X_{t_1-h} X_{t_1}\} = \gamma(-h) \qquad (6)$$

which proves the theorem.
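Definitions 2.6 and 2.7, together with equation (5) and Theorem 2.2, can be illustrated numerically. The following is a minimal sketch, not the computation used in this paper: it assumes the common sample estimator with divisor $n$ and the overall sample mean in place of $\mu$, and the temperature values are hypothetical. The function names are chosen here for illustration only.

```python
from typing import Sequence


def autocovariance(x: Sequence[float], h: int) -> float:
    """Sample auto-covariance at lag h (divisor n, overall mean as estimate of mu)."""
    n = len(x)
    h = abs(h)  # gamma(h) = gamma(-h), so only the size of the lag matters
    if h >= n:
        raise ValueError("lag must be smaller than the series length")
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n


def autocorrelation(x: Sequence[float], h: int) -> float:
    """Sample auto-correlation rho(h) = gamma(h) / gamma(0), as in equation (5)."""
    return autocovariance(x, h) / autocovariance(x, 0)


# Hypothetical annual temperatures (deg C), for illustration only.
temps = [31.2, 30.8, 31.5, 31.0, 30.9, 31.4, 31.1, 30.7, 31.3, 31.0]
rho = [autocorrelation(temps, h) for h in range(4)]
```

By construction the first entry of `rho` equals 1, in agreement with Remark 2.2. The evenness of Theorem 2.2 is imposed through `abs(h)` rather than demonstrated.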
Theorem 2.3: Let the $X_t$ be independently and identically distributed with $E(X_t) = \mu$ and $\operatorname{var}(X_t) = \sigma^2$. Then

$$\gamma(t, k) = E[(X_t - \mu)(X_k - \mu)] = \begin{cases} \sigma^2, & t = k \\ 0, & t \neq k \end{cases}$$

This process is stationary in the strict sense.
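Theorem 2.3 can be checked approximately by simulation. The sketch below is only an illustration under stated assumptions: the observations are taken to be Gaussian, and the values of the mean, the standard deviation and the number of replications are hypothetical choices.

```python
import random


def sample_cross_moment(draws, t, k, mu):
    """Average of (X_t - mu)(X_k - mu) over independent replications."""
    return sum((row[t] - mu) * (row[k] - mu) for row in draws) / len(draws)


random.seed(0)  # fixed seed so the sketch is reproducible
mu, sigma = 30.0, 2.0  # hypothetical mean and standard deviation

# 20000 independent replications of an i.i.d. Gaussian series of length 3.
draws = [[random.gauss(mu, sigma) for _ in range(3)] for _ in range(20000)]

same = sample_cross_moment(draws, 0, 0, mu)  # t = k: estimate of sigma**2
diff = sample_cross_moment(draws, 0, 2, mu)  # t != k: estimate of 0
```

With this many replications `same` should lie close to $\sigma^2 = 4$ and `diff` close to 0, up to sampling error.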
3. TESTING PROCEDURE
3.1: Inference concerning slope ($\beta_1$):
For testing $H_0: \beta_1 = 0$ vs $H_1: \beta_1 > 0$ at the $\alpha = 0.05$ level, the t distribution with $n - 2$ degrees of freedom was used:

$$t_{n-2} = \frac{b_1}{s_{b_1}}$$

where $b_1$ is the slope of the regression line, $s_{b_1} = s_e / s_t$ and $s_e = [SSE / (n - 2)]^{1/2}$. The sum of squares due to error (SSE), or residual sum of squares, is

$$SSE = s_x^2 - \frac{s_{tx}^2}{s_t^2},$$

with

$$s_{tx} = \sum (t_i - \bar{t})(X_{t_i} - \bar{X}), \quad s_t^2 = \sum (t_i - \bar{t})^2, \quad s_x^2 = \sum (X_{t_i} - \bar{X})^2.$$
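The test statistic of Section 3.1 can be sketched in a few lines. This is a minimal illustration, not the MS-Excel procedure used for the reported tables. It assumes the usual least-squares slope $b_1 = s_{tx} / s_t^2$ and the residual sum of squares $SSE = s_x^2 - s_{tx}^2 / s_t^2$; the function name and the data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def slope_t_statistic(t: Sequence[float], x: Sequence[float]) -> Tuple[float, float, int]:
    """Return (b1, t statistic, degrees of freedom) for H0: beta1 = 0."""
    n = len(t)
    if n != len(x) or n < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)  # s_t^2 in the text
    s_xx = sum((xi - x_bar) ** 2 for xi in x)  # s_x^2 in the text
    s_tx = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
    b1 = s_tx / s_tt                    # least-squares slope
    sse = s_xx - s_tx ** 2 / s_tt       # residual sum of squares
    s_e = math.sqrt(sse / (n - 2))      # residual standard error
    s_b1 = s_e / math.sqrt(s_tt)        # standard error of the slope
    return b1, b1 / s_b1, n - 2


# Hypothetical data, for illustration only.
years = [1, 2, 3, 4, 5]
values = [1.0, 3.0, 2.0, 5.0, 4.0]
b1, t_stat, df = slope_t_statistic(years, values)
```

The returned statistic is then compared with the tabulated one-sided critical value of the t distribution at the given degrees of freedom. A perfectly linear series gives SSE = 0 and is not handled by this sketch.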
3.2: Example of time series: Temperature data for the Marathwada region were collected from five districts, namely Aurangabad, Parbhani, Beed, Osmanabad and Nanded. The data were taken from the Socio Economic Review and District Statistical Abstract, Directorate of Economics and Statistics, Government of Maharastra, Bombay; the Maharastra Quarterly Bulletin of Economics and Statistics, Directorate of Economics and Statistics, Government of Maharastra, Bombay; and the Hand Book of Basic Statistics of Maharastra State [2, 3, 4]. Hence we have a five-dimensional time series $t_i$, $i = 1, 2, 3, 4, 5$, corresponding to the districts Aurangabad, Parbhani, Beed, Osmanabad and Nanded respectively. Table 4.1A shows the descriptive statistics, and Tables 4.1B and 4.2C show the linear trend analysis. None of the linear trends was found to be significant except that for Osmanabad district.
Over the years many scientists have analyzed rainfall, temperature, humidity, agricultural area, production and productivity for regions of Maharastra State [1, 5, 7, 10, 11]. Most of them have treated the time series for each of the revenue districts as independent time series and examined the stability or non-stability of each series. Most of the time non-stability has been concluded, and possibly for that reason no different treatment was ever considered. In this investigation we first treat the series as individual series. The method of testing the intercept ($\beta_0 = 0$) and the regression coefficient ($\beta_1 = 0$) follows Hooda R.P. [9], and that of testing the correlation coefficient follows Bhattacharya G.K. and Johanson R.A. [6]. We set up the null hypothesis $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$ for $\alpha = 0.05$, with the test statistic

$$t_{n-2} = \frac{b_1 \sqrt{S_{xx}}}{s_e}, \quad \text{where } s_e = \sqrt{SSE / (n - 2)}.$$

Here $S_{xx}$ denotes the sum of squared deviations of the independent variable. For each district, the test of $H_0$ is not significant for the values of $t$ at both 31 and 18 d.f.
The regression analysis tool provided in MS-Excel was used to compute the estimates of $\beta_0$ and $\beta_1$, the corresponding standard errors (SE) and the t-values for the coefficients in the regression models. Results are reported in Table 4.1B and Table 4.2C. Elementary statistical analysis is reported in Table 4.1A. The CV values show that there is hardly any scatter of values around the mean, suggesting that the series have no trend.
Table 4.1B shows the results of applying to the data the model

$$X_t = \beta_0 + \beta_1 t + \varepsilon,$$

where $X_t$ is the annual temperature series, $t$ is the time variable (years), and $\varepsilon$ is a random error term, normally distributed with mean 0 and variance $\sigma^2$. Temperature $X_t$ (in °C) is the dependent variable and time $t$ (in years) is the independent variable. The results indicate that $H_0: \beta_1 = 0$ is not rejected, and hence that $X_t$ has no trend, for four districts; the exception is Osmanabad district.
Values of the auto-covariance computed for various values of $h$ are given in Table 4.2A. Temperature values for the different districts were input to the software as a matrix. Defining
A = $y_1, y_2, \ldots, y_{n-h}$
B = $y_{h+1}, y_{h+2}, \ldots, y_n$
the quantity $\gamma(h) = \operatorname{cov}(A, B)$ was computed for various values of $h$. Since each time series consisted of 33 values, $h$ was limited so that at least 10 values were included in each computation. The relation between $\gamma(h)$ and $h$ was examined using the model (Table 4.2C)

$$\gamma(h) = \beta_0 + \beta_1 h + \varepsilon.$$

The testing shows that neither hypothesis, $\beta_0 = 0$ or $\beta_1 = 0$, is rejected. Table 4.2C was obtained by regressing the values of $\gamma(h)$ on $h$, using the "Data Analysis Tools" provided in MS Excel, with Table 4.2A as the input. In other words, the $\gamma(h)$ are all zero except for Osmanabad district. In the temperature series of Osmanabad district a trend was found, showing that $X_t$ and $X_{t+h}$ are dependent in that series. Hence in Osmanabad district the temperature series $X_t$ is not stationary; it presents a trend.
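The lagged computation described above can be sketched as follows. This is an illustration under stated assumptions rather than the spreadsheet procedure actually used: the covariance divisor is taken as the number of pairs minus one (the source does not say which divisor was applied), the lags start at 0, and the 33 temperature values are hypothetical. All function names are chosen here for illustration.

```python
from typing import Dict, Sequence, Tuple


def lagged_covariance(y: Sequence[float], h: int) -> float:
    """Sample covariance between A = y_1..y_{n-h} and B = y_{1+h}..y_n."""
    a = y[: len(y) - h]
    b = y[h:]
    m = len(a)
    a_bar = sum(a) / m
    b_bar = sum(b) / m
    # Assumption: divisor m - 1 (sample covariance).
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (m - 1)


def autocovariance_table(y: Sequence[float], min_pairs: int = 10) -> Dict[int, float]:
    """gamma(h) for every lag h that leaves at least min_pairs paired values."""
    max_lag = len(y) - min_pairs
    return {h: lagged_covariance(y, h) for h in range(max_lag + 1)}


def fit_line(h_values: Sequence[float], gammas: Sequence[float]) -> Tuple[float, float]:
    """Least-squares intercept and slope of gamma(h) = b0 + b1 * h."""
    n = len(h_values)
    h_bar = sum(h_values) / n
    g_bar = sum(gammas) / n
    s_hh = sum((h - h_bar) ** 2 for h in h_values)
    s_hg = sum((h - h_bar) * (g - g_bar) for h, g in zip(h_values, gammas))
    slope = s_hg / s_hh
    return g_bar - slope * h_bar, slope


# Hypothetical 33-year series (deg C), for illustration only.
temps = [31.0 + 0.1 * ((7 * i) % 5) for i in range(33)]
table = autocovariance_table(temps)  # lags 0 to 23, each with at least 10 pairs
lags = sorted(table)
b0, b1 = fit_line(lags, [table[h] for h in lags])
```

The fitted `b0` and `b1` would then be tested against zero with the t statistic of Section 3.1.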
4.Conclusion
It was observed that the t values are not significant for four districts, the exception being Osmanabad district; i.e. it is concluded that $X_t$ does not depend on $t$ for these four districts [5]. Similarly, $\gamma_{ij}(h)$ does not depend on $h$, which is to be read as "no linear relation" rather than "no relation". The testing shows that the hypothesis $\beta_1 = 0$ is not rejected for either $t$ or $h$ for four districts, Osmanabad being the exception.
Generally, annual temperature over a long period in any region is expected to be a stationary time series. The series for Osmanabad district does not conform to this expectation, i.e. a trend is found in the temperature series of Osmanabad district.
4.1: Analysis: Temperature
The same strategy of first analyzing the individual time series as scalar series and then treating the vector series as the regional time series has been adopted here for maximum temperature.
4.2. Temperature time series treated as scalar time series
Table 4.1 contains the results for the scalar series approach.
The model considered was:

$$X_i(t) = (\beta_0)_i + (\beta_1)_i t + \varepsilon_i(t), \quad i = 1, 2, \ldots, 5 \qquad (7)$$

where $X_i$ is the annual temperature series, $t$ is the time variable, $\beta_0$ is the intercept, $\beta_1$ is the slope and $\varepsilon_i$ is the random error. Temperature $X_i$ is the dependent variable and time $t$ (in years) is the independent variable.