Forecasting the Volatility of Coffee Arabica and Crude Oil Prices with High-Frequency Data

Modeling, analyzing, and forecasting volatility has been the subject of extensive research among academics, practitioners and portfolio managers. This paper estimates a variety of GARCH models using weekly closing prices of Brent crude oil (in USD/barrel) and of Arabica coffee (in USD/pound), and compares the forecasting performance of these models against high-frequency intra-day data, which allow a more precise measurement of realized volatility. Weekly price data are used to model volatility explicitly, and high-frequency intra-day data are used to assess forecasting performance. The analysis indicates that, in our empirical setting, the most accurate volatility forecasting models are GARCH(1,1) for Arabica coffee returns and GARCH(1,2) for crude oil returns, both with Student's t distributed innovation terms. We recommend that future researchers studying the forecasting performance of GARCH models pay particular attention to the measurement of realized volatility and employ high-frequency data whenever feasible.


INTRODUCTION

Background of the study
The recent global financial crisis has increased the susceptibility of Commodity Dependent Developing Countries (CDDCs) to excessive price volatility in commodity markets. Moreover, structural weaknesses in these countries render their economies more vulnerable to increased commodity market turbulence than developed countries, given their comparatively lower income and high dependence on commodity exports. The World Bank estimates that 119 million additional people have been pushed into hunger as a result of the 2008 food price crisis (World Bank, 2009).
Modeling, analyzing, and forecasting volatility has been the subject of extensive research among academics, practitioners and portfolio managers (Baume et al., 2013). Volatility forecasts are used in risk management, derivative pricing and hedging, portfolio selection and policy making. Similarly, the analysis of volatility spillovers between commodity and asset prices has profound implications for risk management and portfolio optimization by governments and investors alike (Salisu and Oloko, 2015). In view of the current bearish behavior of oil prices in international markets, the volatility of oil and coffee prices is arguably of special interest (Gil-Alana and Yaya, 2014).
In the last two decades there has been an explosion in volatility research. The "Dynamic Volatility Era" began with the introduction of the Autoregressive Conditional Heteroskedastic (ARCH) model (Engle, 2012). The ARCH(p) model expresses the conditional variance as a p-th order weighted average of past squared disturbances and is thus able to describe volatility clustering in financial series. Following this, an enormous body of research has focused on extending and generalizing the ARCH model, mainly by providing alternative functional forms for the conditional variance. Some of the most important contributors to the dynamic volatility literature have been Engle, Bollerslev, Nelson and Ding. Bollerslev (1986) proposed the Generalized ARCH (GARCH) model as a more parsimonious way of modeling volatility dynamics.
The most significant contribution of this extensive volatility research is that now we have a much clearer and better understanding of the probabilistic features of speculative returns data. This is of crucial importance in the search for an overarching framework, which will be able to first capture the empirical regularities adequately and ultimately give us reliable volatility forecasts. There are a number of extensive surveys of the literature including for instance Bollerslev (1990), Chou and Kroner (1992), Engle and Nelson (1994), and Ding and Engle (2001). While realized volatility models often demonstrate excellent forecasting performance, there is still much debate concerning optimal approaches. However, recent results reported by Hansen and Lunde (2005) have suggested that, in the context of exchange rate returns, nothing can beat a GARCH (1, 1) model.
Naturally, different papers in the literature focus on the price volatility of different commodities. The focus of this work is on the volatility of Brent crude oil and Arabica coffee prices in the commodity futures markets, that is, markets where one can buy specific quantities of a commodity at a specified price with delivery set at a specified time in the future. The choice of these commodities is motivated by the fact that they are the most actively traded commodities in the world and are also major import and export commodities of Ethiopia. Existing work appears to focus mostly on in-sample modeling of commodity price volatility, giving less attention to model forecasting performance. This work seeks to bridge this gap in the literature.

Statement of the problem
A number of recent papers have investigated the volatility of energy and commodity prices, mainly because of increased interest in understanding the drivers and dynamics of volatility in these highly volatile times. Alternative estimation approaches are used in the literature, but thus far there is a dearth of work undertaking a thorough GARCH analysis, a statistical framework that is particularly suited to modeling asset price volatility. Moreover, existing work appears to focus mostly on in-sample modeling of commodity price volatility, giving less attention to model forecasting performance. Accordingly, this work seeks to contribute to the literature by undertaking a systematic analysis of the volatility forecasting performance of different GARCH models.
Most of the existing research on volatility spillovers employs statistical models to estimate realized volatilities, which often turn out to be poor approximations of true volatilities. An attractive alternative to model-based statistical volatility is to compute realized volatility from high-frequency intra-day or 'tick' data. Realized volatility has been found to be more accurate than model-based volatilities in predicting latent volatility (Christensen and Prabhala, 1998). Thus, the results provided by previous studies can be potentially misleading in that they may have underestimated or overestimated the extent of the true volatility.
This paper avoids this potential pitfall by using realized volatility based on 30-minute intra-day data. We argue that doing so represents an important contribution to the literature on volatility measurement within commodity markets.

Objective of the study

General Objective
The general objective of this study is to model the volatility of the Arabica coffee and crude oil prices using GARCH type models and perform volatility forecasting using realized variance.

Specific Objectives
The specific objectives of this study are:
1. To fit a univariate GARCH model to the price series of each commodity separately.
2. To forecast the volatility of the two commodities.
3. To compare the volatility forecasts evaluated against squared weekly returns and against high-frequency (30-minute interval) data.

Stationarity
The concept of stationarity is fundamental in time series analysis. A time series $\{r_t\}$ is said to be strictly stationary if the joint distribution of $(r_{t_1}, \ldots, r_{t_j})$ is identical to that of $(r_{t_1+t}, \ldots, r_{t_j+t})$ for all $t$, where $j$ is an arbitrary positive integer and $(t_1, \ldots, t_j)$ is a collection of $j$ positive integers. In other words, strict stationarity requires that the joint distribution of $(r_{t_1}, \ldots, r_{t_j})$ is invariant under time shift. Strict stationarity imposes a very strong condition and is hard to test empirically. Often the concept of weak stationarity is used. A time series $\{r_t\}$ is weakly stationary if both the mean of $r_t$ and the covariance between $r_t$ and $r_{t-k}$ are time invariant, where $k$ is an arbitrary integer. More specifically, $\{r_t\}$ is weakly stationary if (a) $E(r_t) = \mu$, which is a constant, and (b) $\mathrm{Cov}(r_t, r_{t-k}) = \gamma_k$, which only depends on $k$. In practice, suppose that we have observed $T$ data points $\{r_t \mid t = 1, \ldots, T\}$. Weak stationarity implies that the time plot of the data would show that the values fluctuate with constant variation around a fixed level. In application, weak stationarity enables one to make inferences concerning future observations (e.g. forecasting). The covariance $\gamma_k = \mathrm{Cov}(r_t, r_{t-k})$ is called the lag-$k$ autocovariance of $r_t$. It has two important properties: (a) $\gamma_0 = \mathrm{Var}(r_t)$, and (b) $\gamma_{-k} = \gamma_k$. In the finance literature, it is necessary to test the weak stationarity of an asset return series. There are three methods of testing stationarity: graphical analysis, unit root test due to Dickey and Fuller (1979)

test.
A. Graphical Analysis of the Series
Before pursuing formal tests, it is always advisable to plot the time series under study. Such a plot gives an initial clue about the likely nature of the time series. For instance, if a line graph of a time series shows an upward trend, this suggests that the mean of the data has been changing, which may be a clue that the series is not stationary. Such an intuitive feel is the starting point of more formal tests of stationarity.
B. Unit Root Test
A test for stationarity that has become widely popular over the past several years is the unit root test. The Augmented Dickey-Fuller (ADF) test is discussed below. To illustrate the use of the Dickey-Fuller test, consider first an autoregressive process of order one, AR(1):

$$y_t = c + \phi y_{t-1} + \varepsilon_t \qquad (1)$$

where $c$ and $\phi$ are parameters and $\varepsilon_t$ is assumed to be white noise. The series $y_t$ is stationary if $-1 < \phi < 1$. If $\phi = 1$, $y_t$ is a non-stationary series (a random walk with drift); if the process is started at some point, the variance of $y_t$ increases steadily with time and goes to infinity. If the absolute value of $\phi$ is greater than one, the series is explosive. Therefore, the hypothesis of a stationary series can be evaluated by testing whether the absolute value of $\phi$ is strictly less than one. The augmented Dickey-Fuller (ADF) test takes the unit root as the null hypothesis, $H_0: \phi = 1$. Since explosive series do not make much economic sense, this null hypothesis is tested against the one-sided alternative $H_1: \phi < 1$. The test is carried out by estimating an equation with $y_{t-1}$ subtracted from both sides:

$$\Delta y_t = c + \delta y_{t-1} + \varepsilon_t \qquad (2)$$

where $\delta = \phi - 1$, $\Delta$ is the first difference operator, and the null and alternative hypotheses are $H_0: \delta = 0$, $H_1: \delta < 0$.
While it may appear that the test can be carried out by performing a t-test on the estimated $\delta$, the t-statistic under the null hypothesis of a unit root does not have the conventional t-distribution. Dickey and Fuller (1979) showed that the distribution under the null hypothesis is nonstandard, and simulated the critical values for selected sample sizes. MacKinnon (1991) later implemented a much larger set of simulations than those tabulated by Dickey and Fuller. In addition, MacKinnon estimates the response surface using the simulation results, permitting the calculation of Dickey-Fuller critical values for any sample size and for any number of right-hand variables. The simple unit root test described above is valid only if the series is an AR(1) process. If the series is correlated at higher order lags, the assumption of white noise disturbances is violated. The ADF test makes a parametric correction for higher-order correlation by assuming that the series follows an AR(p) process and adjusting the test methodology:

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \qquad (3)$$

The ADF approach controls for higher-order correlation by adding lagged difference terms of the dependent variable to the right-hand side of the regression:

$$\Delta y_t = c + \delta y_{t-1} + \psi_1 \Delta y_{t-1} + \cdots + \psi_{p-1} \Delta y_{t-p+1} + \varepsilon_t$$

where $\delta = \sum_{i=1}^{p} \phi_i - 1$, $\Delta$ is the first difference operator, and the null and alternative hypotheses are $H_0: \delta = 0$, $H_1: \delta < 0$.
An important result obtained by Fuller is that the asymptotic distribution of the t-statistic on $\delta$ is independent of the number of lagged first differences included in the ADF regression. Moreover, while the parametric assumption that $y_t$ follows an autoregressive (AR) process may seem restrictive, Said and Dickey (1984) demonstrate that the ADF test remains valid even when the series has a moving average (MA) component, provided that enough lagged difference terms are added to the regression.
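The Dickey-Fuller regression described above can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming the simplest case with a constant and no lagged difference terms; the function name `dickey_fuller_stat` and the simulated series are hypothetical and are not taken from the study. The statistic must be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for the constant-only case in large samples), not with the usual t-table.

```python
import random


def dickey_fuller_stat(y):
    """t-statistic on the lagged level in: diff(y)_t = c + delta * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    delta = sxy / sxx                            # OLS slope
    c = dy_bar - delta * x_bar                   # OLS intercept
    rss = sum((di - c - delta * xi) ** 2 for xi, di in zip(x, dy))
    se = (rss / (n - 2) / sxx) ** 0.5            # standard error of the slope
    return delta / se


# Illustrative data only: white noise (stationary) and its cumulative sum (unit root).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

print("white noise:", dickey_fuller_stat(noise))
print("random walk:", dickey_fuller_stat(walk))
```

For the white noise series the statistic is strongly negative, so the unit root null is rejected; for the random walk it is typically much closer to zero. An augmented version would add lagged differences as extra regressors.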

Tests for autocorrelation
A weakly stationary series is not serially correlated (or is a white noise process) if the autocorrelation function (ACF) is 0 for all $k > 0$. The ACF at lag $k$, denoted by $\rho_k$, is defined as:

$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\mathrm{Cov}(r_t, r_{t-k})}{\mathrm{Var}(r_t)}$$

Since both covariance and variance are measured in the same units, $\rho_k$ is a unitless, or pure, number. It lies between -1 and 1. If we plot $\rho_k$ against $k$, the graph we obtain is known as the population correlogram.
In practice it is only possible to compute the sample autocorrelation function (SACF), $\hat{\rho}_k$. To compute it, one must first compute the sample covariance at lag $k$, $\hat{\gamma}_k$, and the sample variance, $\hat{\gamma}_0$:

$$\hat{\gamma}_k = \frac{1}{n} \sum_{t=k+1}^{n} (r_t - \bar{r})(r_{t-k} - \bar{r}), \qquad \hat{\gamma}_0 = \frac{1}{n} \sum_{t=1}^{n} (r_t - \bar{r})^2$$

where $n$ is the sample size and $\bar{r}$ is the sample mean. Therefore, the SACF at lag $k$ is

$$\hat{\rho}_k = \frac{\hat{\gamma}_k}{\hat{\gamma}_0}$$

A plot of $\hat{\rho}_k$ against $k$ is known as the sample correlogram.
Financial applications often require testing jointly that several autocorrelations of $r_t$ are zero. The Portmanteau statistic (Q-statistic) is defined as

$$Q = n \sum_{k=1}^{m} \hat{\rho}_k^2$$

where $n$ is the sample size and $m$ is the lag length. The null hypothesis for this test is that all the $\rho_k$ up to lag $m$ are simultaneously zero. In large samples, the Q statistic is approximately chi-square distributed with $m$ degrees of freedom. In application, if the computed Q exceeds the critical value from the chi-square distribution at the chosen level of significance, the null hypothesis is rejected. Ljung and Box (1978) modify the above Q-statistic into the LB statistic (Ljung-Box Q-statistic), which is defined as:

$$LB = n(n+2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{n-k}$$

Although in large samples both the Q and LB statistics follow the chi-square distribution with $m$ degrees of freedom, the LB statistic has been found to perform better in small samples than the Q statistic.
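As a small worked illustration of the sample autocorrelation and the Ljung-Box statistic, the sketch below computes both in plain Python. The function names `sample_acf` and `ljung_box` are hypothetical, and the five-point series is made up so that the arithmetic can be checked by hand.

```python
def sample_acf(x, k):
    """Sample autocorrelation at lag k: gamma_hat_k / gamma_hat_0."""
    n = len(x)
    mean = sum(x) / n
    gamma0 = sum((v - mean) ** 2 for v in x) / n
    gammak = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n
    return gammak / gamma0


def ljung_box(x, m):
    """Ljung-Box statistic using the first m sample autocorrelations."""
    n = len(x)
    return n * (n + 2) * sum(sample_acf(x, k) ** 2 / (n - k) for k in range(1, m + 1))


series = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy data, not from the study
print(sample_acf(series, 1))          # lag-1 autocorrelation, 0.4 up to rounding
print(ljung_box(series, 1))           # 5 * 7 * 0.4**2 / 4, i.e. 1.4 up to rounding
```

The resulting statistic would then be compared with a chi-square critical value with m degrees of freedom.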

The univariate GARCH model
First, let the return $r_t$ be expressed as the change in logarithmic price over a certain period:

$$r_t = \log\left(\frac{p_t}{p_{t-1}}\right)$$

Consider the following financial return model:

$$r_t = \mu_t + \varepsilon_t$$

where $r_t$ is the return series at time $t$, $\mu_t$ is the average return at time $t$, and $\varepsilon_t$ is the innovation term. The autoregressive AR(p), moving average MA(q) and autoregressive moving average ARMA(p, q) models are applicable when the innovation term $\varepsilon_t$ maintains a constant variance (homoscedasticity). If the error term $\varepsilon_t$ is conditionally heteroskedastic, then ARCH/GARCH models are applicable to model the conditional variance of the innovation term. In the simplest case the conditional variance $\sigma_t^2$ depends only on the previous squared innovation:

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2$$

Since the model involves just one lag of $\varepsilon_t$, i.e. $\varepsilon_{t-1}$, it is known as the ARCH(1) model. The generalized ARCH (hereafter, GARCH) model is a generalization of the ARCH model in that it includes lagged variances in the conditional variance equation. GARCH models have the advantage of capturing long lags in the shocks with fewer parameters than ARCH models. The GARCH(1,1) model is defined as:

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$$

The GARCH model with $p$ lagged variance terms and $q$ lagged squared error terms is denoted GARCH(q, p) and can be written as:

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$

where the volatility term $\sigma_{t-j}^2$ denotes the variance lagged $j$ periods and the term $\varepsilon_{t-i}^2$ is the squared error for period $t-i$. We can write the above equation more compactly:

$$\sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\sigma_t^2$$

where $\alpha(L) = \sum_{i=1}^{q} \alpha_i L^i$ and $\beta(L) = \sum_{j=1}^{p} \beta_j L^j$ are polynomials in the lag operator $L$. To ensure that the conditional variance is well defined in a GARCH(q, p) model, all the coefficients in the corresponding linear ARCH(∞) representation should be positive. Rewriting the GARCH(q, p) model as an ARCH(∞):

$$\sigma_t^2 = \omega^* + \sum_{k=1}^{\infty} \lambda_k \varepsilon_{t-k}^2, \qquad \omega^* = \frac{\omega}{1 - \beta(1)}$$

so that $\sigma_t^2 \geq 0$ if $\omega^* \geq 0$ and all $\lambda_k \geq 0$. The non-negativity of $\omega^*$ and $\lambda_k$ is also a necessary condition for the non-negativity of $\sigma_t^2$ (Rossi, 2004). The $\alpha$ coefficients of the lagged squared errors are interpreted as how fast the model reacts to, for example, market events, while the $\beta$ coefficients of the lagged conditional variance determine the degree of persistence in the volatility.
A large value of $\beta$ indicates that the conditional variance decays slowly and that the volatility is persistent. On the other hand, if the $\alpha$ value is high relative to the $\beta$ value, then the volatility is more extreme. The sum of the $\alpha_i$ and $\beta_j$, $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j$, should be less than one for the above GARCH(q, p) process to be stationary. Nevertheless, the volatility of one region also has an impact on the volatility of other regions. Making estimations that include this contagion effect requires an extended model.
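To make the GARCH(1,1) recursion and the stationarity requirement concrete, here is a minimal sketch in plain Python. It assumes the parameters are already known (no estimation is performed) and starts the recursion at the unconditional variance, which is one common convention rather than the authors' stated choice; the function name `garch11_variance` and the numbers are hypothetical.

```python
def garch11_variance(eps, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for a given innovation series."""
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0")
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below one for stationarity")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for e in eps[:-1]:
        # sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}
        sigma2.append(omega + alpha * e ** 2 + beta * sigma2[-1])
    return sigma2


# Toy innovations, not data from the study.
path = garch11_variance([1.0, 2.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
print(path)
```

With these toy values the large second innovation raises the third conditional variance, and a larger beta would make that increase die out more slowly.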

The model order selection: AIC and BIC
An important step before making the estimations is to determine the order of the models. Theoretically one can do this using the autocorrelation function, but in practice this may be difficult. A more formal way is to use an information criterion and choose the order that minimizes the criterion value. Two common criteria are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The formulas for these are

$$\mathrm{AIC} = -2 \times \mathrm{LLF} + 2M$$

$$\mathrm{BIC} = -2 \times \mathrm{LLF} + M \times \ln(N)$$

where $N$ is the sample size, $M$ is the number of parameters, and LLF is the value of the log-likelihood function. The reason for using both criteria is that the BIC is consistent but inefficient, whereas the AIC is the opposite: efficient but not consistent. Neither criterion is superior to the other, so an overall assessment is needed based on the results of both. The best model is the one with the smallest information criterion values (Brooks, 2008).
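The selection rule can be illustrated with a short sketch. The log-likelihood values and parameter counts below are invented for illustration and are not results from this study; the function names `aic` and `bic` are likewise hypothetical.

```python
import math


def aic(llf, n_params):
    """Akaike Information Criterion from a log-likelihood value."""
    return -2.0 * llf + 2.0 * n_params


def bic(llf, n_params, n_obs):
    """Bayesian Information Criterion from a log-likelihood value."""
    return -2.0 * llf + n_params * math.log(n_obs)


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {"GARCH(1,1)": (-1500.0, 4), "GARCH(1,2)": (-1499.5, 5)}
n_obs = 572  # assumed in-sample size, for illustration only

best_aic = min(candidates, key=lambda name: aic(*candidates[name]))
best_bic = min(candidates, key=lambda name: bic(*candidates[name], n_obs))
print(best_aic, best_bic)
```

In this made-up case the extra parameter improves the log-likelihood by only 0.5, which is not enough to offset either penalty, so both criteria prefer the smaller model.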

Determining the conditional distribution
When fitting a GARCH model to financial data, the conditional distribution of the returns has to be specified. Studies such as Bollerslev (1987) illustrate that returns are not normally distributed: they have excess kurtosis and fatter tails than the normal distribution. The Student's t distribution captures the observed kurtosis in empirical returns more adequately than the normal distribution and is therefore more suitable (Reider, 2009). Three assumptions about the conditional distribution of the error term are commonly employed when working with GARCH models: the normal (Gaussian) distribution, Student's t distribution, and the generalized error distribution (GED).

The presence of ARCH effects is examined with a Lagrange multiplier (LM) test, which regresses the squared residuals on $q$ of their own lags:

$$\hat{\varepsilon}_t^2 = a_0 + a_1 \hat{\varepsilon}_{t-1}^2 + \cdots + a_q \hat{\varepsilon}_{t-q}^2 + u_t \qquad (16)$$

with null hypothesis $H_0: a_1 = \cdots = a_q = 0$. If $H_0$ is true, then homoscedasticity characterizes the variance, indicating no volatility clustering. The test statistic is

$$LM = T R^2 \qquad (17)$$

where $T$ is the number of observations and $R^2$ is computed from regression (16) using the estimated residuals. If $T$ is large, LM is chi-squared distributed with $q$ degrees of freedom.
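The LM statistic of this kind of test can be sketched in plain Python as follows. This is an illustrative implementation of the standard ARCH LM idea (regress squared residuals on a constant and q of their own lags, then multiply the number of usable observations by the R-squared), not the authors' code; the helper names `solve` and `arch_lm` and the simulated ARCH(1) series are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def arch_lm(resid, q):
    """LM statistic: (usable observations) * R^2 from squared residuals on q lags."""
    e2 = [e * e for e in resid]
    y = e2[q:]
    rows = [[1.0] + [e2[t - j] for j in range(1, q + 1)] for t in range(q, len(e2))]
    k = q + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)                       # OLS via the normal equations
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return len(y) * (1.0 - ss_res / ss_tot)


# Simulated ARCH(1) innovations with made-up parameters (0.2 and 0.5).
random.seed(1)
arch_eps, prev = [], 0.0
for _ in range(2000):
    prev = random.gauss(0.0, 1.0) * (0.2 + 0.5 * prev ** 2) ** 0.5
    arch_eps.append(prev)

print(arch_lm(arch_eps, 1))  # compare with the chi-square(1) 5% critical value, 3.84
```

A value well above the chi-square critical value leads to rejection of the null hypothesis of no ARCH effects.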

Defining a proxy
In evaluating volatility forecasts the usual proxy for 'true' volatility is 'ex post' squared returns, or the squared errors. However, as noted by Andersen and Bollerslev (1998) and Andersen et al. (1999), although the use of squared returns or squared errors is justifiable as an unbiased estimate of the volatility process, it provides a very noisy measure due to a large idiosyncratic term. Specifically, the returns innovation can be written as $\varepsilon_t = \sigma_t z_t$, with $z_t$ an independent zero-mean, unit-variance stochastic process and $\sigma_t$ the volatility process.
If the model for $\sigma_t^2$ is correctly specified, then the conditional expectation is $E(\varepsilon_t^2 \mid I_{t-1}) = \sigma_t^2 E(z_t^2 \mid I_{t-1}) = \sigma_t^2$, and the squared error is an unbiased estimate of the volatility process; however, it still contains the noisy idiosyncratic term $z_t^2$. This typically resulted in poor measured forecasting performance, which instigated a discussion of the practical relevance of volatility models. However, Andersen and Bollerslev (1998) showed that the 'poor' performance could be explained by the fact that the squared return is a noisy proxy for the conditional variance. By substituting the realized variance for the squared return, Andersen and Bollerslev (1998) showed that volatility models perform quite well. Hansen and Lunde (2003) provide another important argument for using the realized variance rather than the squared return. They show that substituting the squared returns for the conditional variance can severely distort the comparison, in the sense that the empirical ranking of models may be inconsistent for the true (population) ranking. So an evaluation that is based on squared returns may select an inferior model as the 'best' with a probability that converges to one as the sample size increases. Realized volatility is the sum of squared intra-day returns:

$$RV_t = \sum_{i=1}^{N} r_{t,i}^2$$

where $r_{t,i}$ is the $i$-th intra-day return in period $t$ and $N$ is the number of intra-day intervals. In principle, as $N$ grows large, i.e. approaching continuous-time sampling, the measure approaches the true integrated volatility of the underlying continuous-time process and is theoretically free from measurement error. Further, this measure allows a market participant to essentially treat volatility as an observed variable and to estimate it directly.
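The realized volatility measure described here, the sum of squared intra-day returns, can be sketched as below. The function name `realized_variance` is hypothetical and the price list is a toy example; in the setting of this paper the input would be the 30-minute prices falling within one week.

```python
import math


def realized_variance(prices):
    """Sum of squared log returns computed from consecutive intra-period prices."""
    returns = [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]
    return sum(r * r for r in returns)


# Toy prices chosen so that the two log returns are +1 and -1.
print(realized_variance([1.0, math.e, 1.0]))  # approximately 2.0
```

Note that the net return over this toy period is zero, so the squared period return would report no volatility at all, whereas the realized variance picks up the movement within the period.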

Forecast evaluation

The MZ regression
A popular way to evaluate volatility models out-of-sample is in terms of the $R^2$ from a Mincer-Zarnowitz (MZ) regression, that is, squared returns are regressed on the model forecasts of $\sigma_t^2$ and a constant:

$$\sigma_t^2 = \alpha + \beta h_t + v_t$$

Here $\sigma_t^2$ is the ex-post volatility (e.g. realized volatility) at time $t$, $h_t$ is the estimated (in-sample) or forecasted (out-of-sample) volatility at time $t$, and $v_t$ is independent and identically distributed, $v_t \sim N(0, 1)$. $\alpha$ and $\beta$ are parameters to be estimated.
If the model for the conditional variance is well specified, we should have $\alpha = 0$, $\beta = 1$. Owing to specific features of financial data series, the value of $R^2$ is usually low (even less than 5%) (Anderson et al., 2009).
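A Mincer-Zarnowitz regression is an ordinary least squares fit of the volatility proxy on a constant and the forecast, so it can be sketched without any library. The function name `mincer_zarnowitz` and the numbers below are hypothetical; the toy proxy is constructed as an exact linear function of the forecast so the result is easy to verify.

```python
def mincer_zarnowitz(proxy, forecast):
    """Return (alpha, beta, r_squared) from regressing proxy on a constant and forecast."""
    n = len(proxy)
    h_bar = sum(forecast) / n
    s_bar = sum(proxy) / n
    shh = sum((h - h_bar) ** 2 for h in forecast)
    shs = sum((h - h_bar) * (s - s_bar) for h, s in zip(forecast, proxy))
    beta = shs / shh
    alpha = s_bar - beta * h_bar
    ss_res = sum((s - alpha - beta * h) ** 2 for h, s in zip(forecast, proxy))
    ss_tot = sum((s - s_bar) ** 2 for s in proxy)
    return alpha, beta, 1.0 - ss_res / ss_tot


forecast = [1.0, 2.0, 3.0, 4.0]            # toy forecasts h_t
proxy = [2.0 * h + 1.0 for h in forecast]  # toy proxy, deliberately biased
print(mincer_zarnowitz(proxy, forecast))   # alpha = 1, beta = 2, R^2 = 1
```

In this toy case the fit is perfect but alpha and beta differ from 0 and 1, which shows why the coefficient estimates and the R-squared have to be read together.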

Mean Absolute Error (MAE)
Another way of determining the goodness of the estimations and forecasts is to calculate the MAE. The approach measures how close the estimated conditional variances are to their corresponding realized values. The formula is:

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| \sigma_t^2 - h_t \right|$$

where the proxy is used as $\sigma_t^2$, the estimated conditional variance is used as $h_t$, and $n$ is the number of estimates or forecasts. Comparing the MAE across the estimated models gives an indication of which model makes the best estimations.

Root Mean Square Error (RMSE)
The third measure is the Root Mean Square Error (RMSE), which is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( \sigma_t^2 - h_t \right)^2}$$

Using these measures, the estimated models can be compared, and by applying the same measures to both estimations and forecasts, one can determine whether the model that gives the best estimations also makes the best forecasts.
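Both error measures are straightforward to compute once a proxy series and a forecast series are available. The sketch below uses hypothetical function names (`mae`, `rmse`) and made-up numbers.

```python
import math


def mae(proxy, forecast):
    """Mean absolute error between the volatility proxy and the forecasts."""
    return sum(abs(s - h) for s, h in zip(proxy, forecast)) / len(proxy)


def rmse(proxy, forecast):
    """Root mean square error between the volatility proxy and the forecasts."""
    return math.sqrt(sum((s - h) ** 2 for s, h in zip(proxy, forecast)) / len(proxy))


proxy = [1.0, 2.0, 3.0]      # toy realized values
forecast = [1.0, 3.0, 5.0]   # toy forecasts
print(mae(proxy, forecast))   # (0 + 1 + 2) / 3 = 1.0
print(rmse(proxy, forecast))  # sqrt((0 + 1 + 4) / 3)
```

Because the RMSE squares the errors before averaging, it penalizes the single large miss more heavily than the MAE does.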

Data
The data considered in this paper are the weekly time series of Brent crude oil and Arabica coffee futures closing prices, given in US dollars per barrel and US dollars per pound, respectively. Both series span the first week of January 2005 to the last week of October 2016 (616 observations in total) and were extracted from the Bloomberg database. The full sample is split into two parts: in-sample data, used to estimate the model parameters, and out-of-sample data, used to make forecasts. The in-sample period spans the first week of January 2005 to the last week of December 2015, and the out-of-sample period spans the first week of January 2016 to the last week of October 2016. For the out-of-sample period, we have also extracted 30-minute intra-day data for the purpose of measuring realized volatility.

Test of stationarity and features of log return series
Our data consist of two commodity prices: crude oil and Arabica coffee. Fig. 3.1 shows the trend charts of the two commodity prices over the study period. The two prices follow the same trend and direction during the entire study period: both rose from 2005 to 2008 and fell from 2009 to 2010 in a similar pattern. Another impression is that both level series are non-stationary. To visualize the return series for these two markets, we depict their time series plots in Fig. 3.2. The weekly return series display volatility persistence, indicating that large changes tend to be followed by large changes of either sign and small changes tend to be followed by small changes, and the differenced series suggests

Time series plot for differenced series, Arabica coffee and crude oil
The level series plot in Fig. 3.1 suggests that the level prices are non-stationary. Thus, the first logical step is to check the stationarity of these prices using a unit root test. In this test the null hypothesis of a unit root is rejected if the t-statistic is less than the critical value. Table 3.1 summarizes the DF unit root test results for the two commodity level prices. As can be seen in the table, the t-statistics are greater than the critical values at the 1%, 5% and 10% significance levels. Hence, the null hypothesis of a unit root cannot be rejected; that is, each series has a unit root. If a time series is non-stationary, it is necessary to look for possible transformations that might bring stationarity. In practice, econometricians usually transform financial prices into returns, because return series are often found to be stationary, which makes analysis possible. The log return series is obtained by:

$$r_t = 52 \times \log\left(\frac{p_t}{p_{t-1}}\right)$$

where $r_t$ is the log return of the real price multiplied by 52 (a scaling factor that annualizes the weekly returns, since each year contains 52 weeks), $p_t$ is the real price series, and log is the natural logarithm. Table 3.2 summarizes the DF unit root test of the log return series for each of the commodities. The table shows that all the t-statistics are less than the critical values at all levels of significance. This indicates that the null hypothesis of a unit root is rejected in both cases. Hence, the log return series are stationary for both commodities, as also suggested by the time series plots.
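The return transformation described in this section can be sketched as follows. The function name `scaled_log_returns` and the three prices are hypothetical; the factor of 52 follows the weekly scaling stated in the text.

```python
import math


def scaled_log_returns(prices, periods_per_year=52):
    """Log returns between consecutive prices, multiplied by a scaling factor."""
    return [periods_per_year * math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]


weekly_prices = [100.0, 110.0, 99.0]  # toy weekly closing prices
print(scaled_log_returns(weekly_prices))
```

A series of n prices yields n - 1 returns, so the first observation is lost in the transformation.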

Volatility Modeling
To build a volatility model for the log return series, the first step is to specify the mean equation. Once the mean equation is specified, we test for ARCH effects using its residuals. If ARCH effects are statistically significant, it is necessary to specify a volatility model and estimate the mean and volatility equations jointly. The conditional mean specification is, in general, arbitrary for GARCH models of the conditional volatility. Various modifications to the conditional mean in GARCH models are possible (see, for example, Asai and McAleer (2003)).

Specification of Mean Equation
Based on two statistics, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), we have chosen the provisional mean equation used to estimate the mean and variance parameters jointly. In most applications, low-order ARMA models, say ARMA(1,1), ARMA(1,2), ARMA(2,1) and ARMA(2,2), are used. Table 3.3 gives the alternative ARMA models together with their corresponding AIC and BIC. According to Table 4.3, the best in-sample results are achieved by ARMA(2,2) for both commodity log return series. Therefore, ARMA(2,2), having the minimum AIC and BIC, is the best conditional mean equation in both cases. Hence, the following ARMA(2,2) model is used as the mean equation of the log return series when modeling the conditional variance:

$$r_t = \mu + \phi_1 r_{t-1} + \phi_2 r_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} \qquad (22)$$

where $r_t$ is the return at time $t$ of either Arabica coffee or crude oil, following an ARMA(2,2) process, and $\varepsilon_t$ is an innovation term.

Test for ARCH Effects
Consider the return model chosen above, Equation (22). If there is no volatility clustering in the returns, the random disturbance term $\varepsilon_t$ of the fitted model should be a white noise process. The standardized residual plot from Equation (22) gives an initial insight into the heteroskedastic characteristics of the error term. Figure 3.3 shows the residual plots of the two commodity return series generated from the mean equation. For both series there are prolonged periods of low volatility and prolonged periods of high volatility. For example, the Arabica coffee return has a long period of low volatility from the first week of 2005 to the end of 2008 and a long period of high volatility from the first week of 2014 to the last week of 2016. The crude oil return exhibits a longer period of low volatility than the coffee series, from the first week of 2005 to the first week of 2009. In other words, periods of high volatility tend to be followed by periods of high volatility and periods of low volatility tend to be followed by periods of low volatility, which is known as volatility clustering. This suggests that the residuals or error terms are conditionally heteroskedastic and can be represented by GARCH models.

Figure 3.3: Residual plots of the Arabica coffee and crude oil return series from the mean
Based on the residuals from the mean equation chosen above, it is possible to test for the existence of ARCH effect which will allow continuing the analysis using GARCH model. Table 3.4 shows the results of ARCH LM test for the two commodity returns. The last column of the table includes the p-values that indicate rejection of the null hypothesis that "there is no ARCH effect" up to the fourth lag at 5% level of significance. The results indicate that the two commodities price log return series are volatile and need to be modeled using GARCH models.

GARCH Model Identification
To estimate and evaluate the forecasts of the competing GARCH models, different p and q values for the standard symmetric GARCH model are compared using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), in order to choose the best model based on the in-sample data. Once the appropriate specification is chosen, the alternative GARCH models are estimated and tested for this specific p and q, and finally one GARCH model is chosen based on forecasting performance. For this GARCH(p, q) model, different error distributions are evaluated, namely the normal distribution, Student's t distribution and the generalized error distribution (GED). Finally, the parameters of the chosen GARCH model are presented. Table 3.5 summarizes these two statistics computed from different GARCH models. Note that the AIC and BIC of the GARCH models are obtained by estimating the mean return and variance equations simultaneously. The row ranks indicate the ranking of the various GARCH models based on the two statistics, so the lower the rank sum, the better the model. The best models for the two commodity series are GARCH(1,1) for Arabica coffee and GARCH(1,2) for crude oil. That is, the conditional variance processes are modeled by

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$$

for Arabica coffee and

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2$$

for crude oil.

In Table 3.6 and Table 3.7 we present different forecast accuracy measures. We compare two realized volatility measures: the squared weekly return and high-frequency (30-minute intra-day) data. We apply the Mincer-Zarnowitz regression to our estimates. If the estimated parameters in this regression are as expected, that is, $\alpha$ close to zero, $\beta$ close to one and $R^2$ not lower than 30% from Equation (3.35), the differences between the compared estimates are not large and both are good estimates (Mincer, 1969).
The results in Table 4.7 imply that the realized volatility measure from high-frequency data is a better measure of unobserved volatility than the commonly used squared returns (R2 values are the lowest). This is also confirmed by the other forecast error measures, based on 44 one-step-ahead forecasts (2016w1-2016w44). Lower RMSE and MAE values indicate a better-performing model. With regard to the distribution of the error term, Student's t distribution combined with high-frequency data performs best for both series.

Estimations of the univariate GARCH
The estimated coefficients for the two commodities are shown in Table 3.8. The estimation assumes that the error term follows Student's t distribution with eight degrees of freedom. From the table, it can be seen that some of the coefficients for both return series are not significant at the 5% level of significance; for instance, the constant term is not significant in either case. Non-negativity rules: since we are modeling a variance, the estimated variance must be non-negative, but there is no easy rule that ensures non-negativity for the general GARCH(p, q) process. Nelson and Cao (1992), however, give easily checkable conditions for the GARCH(1,q) model and (somewhat) manageable conditions for the GARCH(2,q) model. The two models considered here are:

GARCH(1,1): $$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$$

GARCH(1,2): $$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2$$

For both models to be non-negative, the following conditions should be satisfied: $\omega > 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$, $\beta_1 + \beta_2 < 1$, and $\beta_2 \geq 0$. The parameter estimates of the ARCH and GARCH terms in the table satisfy the non-negativity rules described above for both commodity return series.

Discussion
The aim of this study has been to model and forecast the volatility dynamics of the weekly closing prices of Brent crude oil and coffee Arabica futures using GARCH models. From the preliminary analysis over the time period considered, both price series show an increasing trend. To determine whether the series are stationary, the Augmented Dickey-Fuller (ADF) test was carried out. Raw commodity price data are often nonstationary, which was also the case in this study. For both level series, the tests indicate the existence of a unit root, i.e. the series are I(1). The first log difference of each time series was found to be stationary (Table 3.3).
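The first log difference referred to above is the weekly log return. A minimal sketch of the transformation follows; the function name `log_returns` and the price values are hypothetical. The unit root test itself is not reproduced here; in practice it would typically be run with an existing implementation such as `adfuller` in statsmodels, applied to the price levels and then to the returns.

```python
import math


def log_returns(prices):
    """First log difference of a price series: r_t = ln(P_t / P_{t-1})."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Illustrative (made-up) weekly closing prices
weekly_prices = [100.0, 110.0, 99.0, 104.5]
weekly_returns = log_returns(weekly_prices)
```

The returned series is one observation shorter than the price series, and its sum equals the log of the ratio of the last price to the first.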
As a reference point for the analysis, standard univariate GARCH models were fitted with an ARMA(2,2) mean equation, and it was found that the variance equations for Arabica coffee and crude oil returns were GARCH(1,1) and GARCH(1,2), respectively.
For both commodity returns, the ARCH term coefficients were significant at the 5% level of significance. This result implies the presence of volatility clustering, i.e. large changes tend to be followed by large changes and small changes by small changes. In the case of Arabica coffee, our result is in line with the findings of Hansen and Lunde (2003), who conclude that a GARCH(1,1) model with high-frequency data is sufficient to capture the volatility of exchange rate returns. In the case of crude oil, however, our finding differs.
This study suggests that using high-frequency data results in more accurate forecast values in both the standard univariate GARCH and multivariate GARCH models. This result agrees with the inference of McMillan and Speight (2010) in the context of exchange rate returns. They consider a daily GARCH(1,1) model and 5-minute intra-day returns. In addition to the usual 'ex post' squared returns measure, they also construct realized volatility and two versions of bias-corrected realized volatility. To evaluate these forecasts, they use the Mincer-Zarnowitz regression test of predictive power, comparing forecasting performance on the basis of the t values obtained. In our analysis the procedure is similar, but we use a weekly return series and a 30-minute intra-day return series, and we extend their conclusion.
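To make the contrast between the two volatility proxies concrete, the sketch below computes a realized variance for one week as the sum of squared intra-day log returns and compares it with the squared weekly return. This is the standard textbook definition and is an assumption here, since the exact construction used in the study (for example, the treatment of overnight gaps) is not stated in this section. The function names and prices are hypothetical.

```python
import math


def realized_variance(intraday_prices):
    """Sum of squared log returns over one period's intra-day prices."""
    if len(intraday_prices) < 2:
        raise ValueError("need at least two intra-day prices")
    returns = [
        math.log(curr / prev)
        for prev, curr in zip(intraday_prices, intraday_prices[1:])
    ]
    return sum(r * r for r in returns)


def squared_period_return(intraday_prices):
    """Squared log return from the first to the last price of the period."""
    return math.log(intraday_prices[-1] / intraday_prices[0]) ** 2


# Illustrative (made-up) 30-minute prices within one week:
# the price moves a lot during the week but ends where it started
prices = [50.0, 51.0, 49.5, 50.5, 50.0]
rv = realized_variance(prices)
sq = squared_period_return(prices)
```

In this made-up example the squared weekly return is zero because the price ends where it began, while the realized variance still registers the movement within the week, which is the intuition behind preferring the high-frequency measure.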

CONCLUSION AND RECOMMENDATIONS
This study estimates a variety of GARCH models using weekly closing prices (in USD/barrel) of Brent crude oil and weekly closing prices (in USD/pound) of coffee Arabica, and compares the forecasting performance of these models using high-frequency intra-day data, which allow for a more precise measurement of realized volatility. The analysis points to the conclusion that GARCH(1,1) for Arabica coffee returns and GARCH(1,2) for crude oil returns, each with Student's t distributed innovation terms, are the most accurate volatility forecasting models in the context of our empirical setting.
We recommend and encourage future researchers studying the forecasting performance of MGARCH models to pay particular attention to the measurement of realized volatility, and employ high-frequency data whenever feasible.
We also recommend that public policy makers interested in foreseeing the price volatility of these two major commodities in the context of the Ethiopian economy consider using the information documented in this study as input in their deliberations given that it is based on some robust econometric work and highly appropriate data.
The scope of the analysis in this study has been limited to the volatility of the two commodities. To overcome this limitation and provide a more nuanced analysis, future researchers might profitably incorporate stock, currency and bond market volatilities into the analysis. The potential complexity of such a research agenda notwithstanding, the results are likely to be rewarding in light of the deeper integration between global financial and commodity markets in recent years, a phenomenon that has come to be known as the financialization of commodities.
Author conflict of interest: The authors declare that there is no conflict of interest regarding the publication of this paper.
Funding: This work has not received any specific grant. The authors are employees of Debre Markos University.