Frequency Analysis of Extreme Events and Developing Intensity Duration Frequency Curves: The Case of Jimma Town, Ethiopia

Frequency analysis of rainfall data makes it possible to estimate future extreme events and to develop intensity-duration-frequency (IDF) curves. The objective of this study was to analyze the frequency of extreme precipitation events and to develop IDF curves. Thirty years of daily rainfall data (1986-2015) from the National Meteorological Agency (NMA) were used. Graphical and statistical methods were used to develop the IDF curves. The results indicate that the Generalized Extreme Value (GEV) distribution was the best fit to the observed annual maximum daily rainfall at the 95% confidence level. In the computed sub-daily rainfall data, the highest value, 83.86 mm, was obtained for a duration of 24 hr and a 100-year return period, whereas the lowest value, 13.34 mm, was obtained for a duration of 10 min and a 2-year return period. The correlation between the IDF curves developed from the observed data and the computed IDF curves was strong (R² = 0.99).

study area. Accordingly, the objective of this study was to analyze the frequency of extreme precipitation events and to develop IDF curves for Jimma town.

2. Methodology
2.1 Description of the Study Area
Jimma town is located in Oromia Regional State at an elevation of 1780 m a.s.l. It has a mean temperature of 18.9 °C and an average annual rainfall of about 1557 mm.

Data
The daily rainfall data of Jimma and neighboring stations were collected from the National Meteorological Agency in Addis Ababa, Ethiopia. The data cover 30 years of records from Jimma and from nearby stations around the town. To develop the IDF curves, the daily rainfall data were converted to sub-daily data. Among the stations, only Jimma station was selected for at-site frequency analysis. The other three did not meet the WMO criteria because more than 10% of their data were missing.

Method
In this study, the frequency of extreme rainfall in the study area was investigated using extreme value methods and IDF curves. Methods from the literature related to this research topic were applied.

The missing rainfall data filling
Missing data were filled mainly by taking the average value for the given month over the 30-year period, so that every month had a maximum monthly rainfall value from which the annual maximum daily data could be drawn. The station whose data are missing is called the interpolation station, and the gauging stations whose data are used to calculate the missing values are called index stations. If the normal annual precipitation of the index stations lies within ±10% of the normal annual precipitation of the interpolation station, the arithmetic mean method (Eq. 1) is applied to determine the missing precipitation record; otherwise the normal ratio method (Eq. 2) is used.
$$P_x = \frac{1}{n}\sum_{i=1}^{n} P_i \qquad (1)$$
Where: 'n' is the number of nearby stations, 'Pi' is the precipitation at the i-th station and 'Px' is the missing precipitation.

$$P_x = \frac{1}{n}\sum_{i=1}^{n} \frac{N_x}{N_i}\,P_i \qquad (2)$$
Where: Px is the missing precipitation for any storm at the interpolation station 'x', Pi is the precipitation for the same period and the same storm at the i-th station of a group of index stations, Nx is the normal annual precipitation at station 'x' and Ni is the normal annual precipitation at the i-th station.
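The choice between the two filling methods can be sketched in Python as follows. This is a minimal illustration, not the procedure actually run in the study: the function name, argument names and the sample numbers in the usage note are hypothetical, and the ±10% rule is applied to every index station.

```python
def fill_missing_precipitation(index_precip, index_normals, target_normal,
                               tolerance=0.10):
    """Estimate the missing precipitation Px at the interpolation station.

    index_precip  : precipitation Pi at each index station for the same period
    index_normals : normal annual precipitation Ni of each index station
    target_normal : normal annual precipitation Nx of the interpolation station
    Returns (estimate, method_name).
    """
    if not index_precip or len(index_precip) != len(index_normals):
        raise ValueError("index_precip and index_normals must be non-empty "
                         "and of equal length")
    n = len(index_precip)
    # Arithmetic mean applies only if every Ni lies within +/-10% of Nx.
    within_band = all(abs(n_i - target_normal) <= tolerance * target_normal
                      for n_i in index_normals)
    if within_band:
        return sum(index_precip) / n, "arithmetic mean"
    # Otherwise weight each index station by the ratio Nx / Ni.
    estimate = sum(target_normal / n_i * p_i
                   for p_i, n_i in zip(index_precip, index_normals)) / n
    return estimate, "normal ratio"
```

For example, with hypothetical index-station normals of 1500, 1550 and 1600 mm and a target normal of 1557 mm, all stations fall inside the ±10% band, so the arithmetic mean is returned.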

Analyzing rainfall data
Evaluating extreme rainfall data is important in hydrological studies. The statistical behavior of any hydrological series can be described on the basis of parameters such as the mean, variance, standard deviation, coefficient of variation and coefficient of skewness (Sharma & Kumar, 2016). The following equations were applied to characterize the rainfall series of the station.

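$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (3)$$
Where: $\bar{x}$ is the mean, $x_i$ is the variable and N is the total number of observations.
$$C_v = \frac{\sigma}{\bar{x}} \times 100$$
Where: $C_v$ is the coefficient of variation, which measures the variability of any hydrologic series, and $\sigma$ is the standard deviation. $C_v$ is used to classify the degree of variability of rainfall events as less, moderate or high. When $C_v$ < 20% the series is less variable, from 20% to 30% it is moderately variable, and when $C_v$ > 30% it is highly variable. Areas with $C_v$ > 30% are said to experience extreme events.
$$C_s = \frac{N\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^3}{(N-1)(N-2)\,\sigma^3}$$
Where: $C_s$ is the coefficient of skewness. The coefficient of skewness is used to verify the degree of asymmetry of a distribution around the mean (Mohamed et al., 2016).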
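These descriptive statistics and the variability classes can be computed with a short Python sketch. It assumes the sample (N - 1) standard deviation and the usual bias-corrected sample skewness; the function names are illustrative and the paper does not state which estimators were used.

```python
import math

def describe_series(values):
    """Return mean, sample standard deviation, Cv (%) and skewness Cs."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed for skewness")
    mean = sum(values) / n
    # Sample standard deviation with the (n - 1) denominator (assumption).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cv_percent = 100.0 * std / mean
    # Bias-corrected sample coefficient of skewness (assumption).
    skew = (n * sum((x - mean) ** 3 for x in values)
            / ((n - 1) * (n - 2) * std ** 3))
    return {"mean": mean, "std": std, "cv_percent": cv_percent, "skewness": skew}

def classify_variability(cv_percent):
    """Classify rainfall variability using the 20% and 30% thresholds."""
    if cv_percent < 20.0:
        return "less variable"
    if cv_percent <= 30.0:
        return "moderately variable"
    return "highly variable"
```

A coefficient of variation of 19%, for instance, falls in the "less variable" class.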

Fitting the Probability Distribution Function
The analysis of rainfall data to compute the expected rainfall of a given frequency is commonly done using appropriate probability distribution functions (Zalina et al., 2002). Several distribution functions are used to assess the distribution and extreme events of rainfall data, such as the Normal, Log-normal, Gumbel's Extreme Value and Log Pearson Type III distributions (Alam et al., 2018). Among these, the Gumbel and Log Pearson Type III distributions are the two most common and were applied in this study.

I. Gumbel
The generalized extreme value distribution and its particular case, the Gumbel extreme value distribution, are widely applied for extreme value analysis (Pinheiro & Ferrari, 2015).
The GEV cumulative distribution function can be written as
$$F(x) = \exp\left\{-\left[1 - k\,\frac{x-u}{\alpha}\right]^{1/k}\right\} \qquad (7)$$
where u is the location parameter, $\alpha$ is the scale parameter and k is the shape parameter. According to Eq. (7), k = 0 gives the EVI (Gumbel) distribution, k < 0 gives the EVII distribution and k > 0 gives the EVIII distribution (Shrestha et al., 2017). The Gumbel method was used to estimate rainfall for the 2, 5, 10, 25, 50 and 100-year return periods for any duration. The Gumbel distribution models the distribution of the maximum (or the minimum) of a number of samples of various distributions, so it can represent the distribution of the maximum rainfall or river flow in a particular year. It is widely applied in IDF analysis because of its suitability for modeling extreme events.
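Where, for the Gumbel distribution: $\alpha = \frac{\sqrt{6}}{\pi}\,\sigma$, $\sigma$ is the standard deviation and $u = \bar{x} - 0.5772\,\alpha$.

II. Log Pearson Type III
The LPT III probability method is used to estimate rainfall intensity for different durations and return periods, which helps to produce the IDF curves. The LPT III distribution works with the logarithms of the observed values, so the statistical parameters are computed from the log-transformed data. In this distribution, the frequency factor $K_T$ depends on the return period T (Rasel & Islam, 2015).
$$\log x_T = \overline{\log x} + K_T\,\sigma_{\log x} \qquad (14)$$
Where: $x_T$ is the rainfall for return period T, and $\overline{\log x}$ and $\sigma_{\log x}$ are the mean and standard deviation of the log-transformed data.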
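A Log Pearson Type III quantile can be sketched in Python as below. The paper does not say how the frequency factor was obtained (tables are common), so this sketch swaps in the Wilson-Hilferty approximation, which is an assumption, as are the function names. It uses base-10 logarithms and the sample skewness of the log-transformed data.

```python
import math
from statistics import NormalDist

def lp3_frequency_factor(skew, return_period):
    """Approximate the LPT III frequency factor K_T (Wilson-Hilferty)."""
    # Standard normal variate with non-exceedance probability 1 - 1/T.
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-6:
        return z  # zero skew reduces to the log-normal case
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k ** 2) ** 3 - 1.0)

def lp3_quantile(values, return_period):
    """Rainfall x_T from log10(x_T) = mean(log x) + K_T * std(log x)."""
    logs = [math.log10(x) for x in values]
    n = len(logs)
    mean = sum(logs) / n
    std = math.sqrt(sum((y - mean) ** 2 for y in logs) / (n - 1))
    skew = (n * sum((y - mean) ** 3 for y in logs)
            / ((n - 1) * (n - 2) * std ** 3))
    return 10.0 ** (mean + lp3_frequency_factor(skew, return_period) * std)
```

The approximation is generally considered adequate for moderate skew; for strongly skewed series, tabulated frequency factors would be the safer choice.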
III. Testing the goodness of fit of data
Goodness-of-fit (GOF) tests show how well the selected distribution fits the given data. The three most commonly used GOF tests, the Anderson-Darling, the Kolmogorov-Smirnov and the Chi-Squared tests, were applied at the α = 0.05 level of significance to select the best-fit distribution (Mohamed et al., 2016). In each test, a statistic unique to that method is calculated for every candidate distribution, and the distributions are ranked by the value of that statistic. Among the three tests, the Chi-Squared (C-S) test is the best known. It compares how well the theoretical distribution fits the empirical distribution. The C-S test statistic is given by:
$$\chi^2 = \sum_{i=1}^{m} \frac{\left(O_i - E_i\right)^2}{E_i}$$
Where: $O_i$ is the observed frequency and $E_i$ is the expected frequency in class i, and m is the number of classes.
The GOF tests were used to test the following hypotheses at the α level of significance when selecting the best-fit probability distribution.
H0: The maximum daily rainfall data fit the specified distribution.
HA: The maximum daily rainfall data do not fit the specified distribution.
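As an illustration of the Chi-Squared statistic, the following Python sketch bins a sample, computes the expected count in each class from a candidate cumulative distribution function, and sums (O - E)² / E over the classes. The binning scheme, the function names and the use of a Gumbel CDF as the candidate are assumptions made for the example; the study itself used dedicated fitting software.

```python
import math

def gumbel_cdf(x, location, scale):
    """Gumbel (EVI) cumulative distribution function."""
    return math.exp(-math.exp(-(x - location) / scale))

def chi_squared_statistic(values, cdf, bin_edges):
    """Sum of (O - E)^2 / E over classes defined by interior bin edges.

    cdf       : callable returning the non-exceedance probability F(x)
    bin_edges : interior class boundaries; open-ended classes are added
                below the first edge and above the last edge.
    """
    n = len(values)
    edges = [-math.inf] + sorted(bin_edges) + [math.inf]
    statistic = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        observed = sum(1 for x in values if lower < x <= upper)
        p_lower = 0.0 if lower == -math.inf else cdf(lower)
        p_upper = 1.0 if upper == math.inf else cdf(upper)
        expected = n * (p_upper - p_lower)  # must be > 0 for every class
        statistic += (observed - expected) ** 2 / expected
    return statistic
```

The statistic is then compared with the tabulated chi-squared critical value for the chosen α and the appropriate degrees of freedom; H0 is not rejected when the statistic is smaller than the critical value.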

Estimation of Short Duration Rainfall
Daily rainfall data for the study area covered a period of 30 years. The rainfall data obtained from the gauging station were 24-hr depths, whereas the design and analysis of drainage structures require rainfall intensity-duration relationships for shorter periods. The Ethiopian Roads Authority (ERA, 2013) Drainage Design Manual suggests the following equation for calculating rainfall at shorter durations. Sub-daily precipitation can be calculated for any duration (10 min, 15 min, 30 min, 1 hr, 2 hr, 3 hr, 6 hr, 12 hr and 24 hr) (Kotei et al., 2013). Thus, the sub-daily rainfall data were calculated according to the following equation:

$$R_t = R_{24}\,\frac{t}{24}\left(\frac{b+24}{b+t}\right)^{n} \qquad (17)$$
Where: $R_t$ is the required precipitation depth for a duration of t hours in mm, $R_{24}$ is the daily precipitation in mm and t is the duration in hours for which the precipitation depth is required. The results were tabulated using b = 0.3 and n = 0.92, as suggested by the ERA (2013) manual. The rainfall depth is converted to intensity by dividing it by the duration.
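The disaggregation of a 24-hr depth into shorter durations can be sketched in Python. The sketch assumes the ERA rainfall-ratio form R_t / R_24 = (t / 24) · ((b + 24) / (b + t))^n with b = 0.3 and n = 0.92; the function name and the duration list variable are illustrative.

```python
# Durations used in the study, expressed in hours.
DURATIONS_HR = [10 / 60, 15 / 60, 30 / 60, 1, 2, 3, 6, 12, 24]

def sub_daily_depth(daily_depth_mm, duration_hr, b=0.3, n=0.92):
    """Rainfall depth (mm) for a duration t <= 24 hr from the 24-hr depth."""
    if not 0 < duration_hr <= 24:
        raise ValueError("duration must be in the interval (0, 24] hours")
    # Assumed ERA ratio of the t-hour depth to the 24-hour depth.
    ratio = (duration_hr / 24.0) * ((b + 24.0) / (b + duration_hr)) ** n
    return daily_depth_mm * ratio

def disaggregate(daily_depth_mm, durations_hr=DURATIONS_HR):
    """Map each duration (hr) to its estimated depth (mm)."""
    return {t: sub_daily_depth(daily_depth_mm, t) for t in durations_hr}
```

With these coefficients the 10-min depth is roughly a quarter of the 24-hr depth, and the 24-hr entry reproduces the daily depth itself.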

Estimating the design precipitation using Frequency Factor
Gumbel's frequency factor method is used to calculate extreme values (Rasel & Islam, 2015). The extreme rainfall value (XT) is calculated as follows. The mean of the annual maximum series is
$$\bar{P} = \frac{1}{n}\sum_{i=1}^{n} P_i$$
Where: $P_i$ is the individual extreme value of rainfall and n is the number of events or years of record. The standard deviation is computed using the following relation:
$$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(P_i-\bar{P}\right)^2}$$
The design precipitation is then
$$X_T = \bar{P} + K_T\,\sigma$$
Where: $X_T$ is the precipitation for each duration with a specified return period T, $\bar{P}$ is the average of the maximum precipitation corresponding to a specific duration, $\sigma$ is the standard deviation and $K_T$ is Gumbel's frequency factor for return period T (AlHassoun, 2011).
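The frequency factor calculation can be illustrated with a short Python sketch. It assumes the commonly used closed form K_T = -(√6 / π) · [0.5772 + ln(ln(T / (T - 1)))], which does not correct for sample size; the paper does not state which form was used, and the function names are illustrative.

```python
import math

def gumbel_frequency_factor(return_period):
    """Gumbel frequency factor K_T for a return period T > 1 (years)."""
    if return_period <= 1:
        raise ValueError("return period must be greater than 1 year")
    t = float(return_period)
    return -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(t / (t - 1.0))))

def design_precipitation(mean_mm, std_mm, return_period):
    """Design precipitation X_T = mean + K_T * standard deviation."""
    return mean_mm + gumbel_frequency_factor(return_period) * std_mm
```

Under this form K_T is slightly negative for T = 2 (about -0.164) and about 3.137 for T = 100, so the 2-year estimate lies just below the mean of the series.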

Developing IDF curve
The average rainfall intensity, $I_T$ (in mm/hr), for return period T is then obtained from:
$$I_T = \frac{X_T}{t_d}$$
Where: $t_d$ is the duration in hours. The frequency of rainfall is usually defined by reference to the annual maximum series, which consists of the largest value observed in each year (Rasel & Islam, 2015).
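The conversion from depth to intensity is a single division, sketched here in Python with the two extreme depths reported in this study as a check. The function names are illustrative.

```python
def rainfall_intensity(depth_mm, duration_hr):
    """Average intensity (mm/hr) of a rainfall depth over a duration."""
    if duration_hr <= 0:
        raise ValueError("duration must be positive")
    return depth_mm / duration_hr

def intensity_table(depths_by_duration):
    """Convert {duration_hr: depth_mm} into {duration_hr: intensity_mm_per_hr}."""
    return {t: rainfall_intensity(d, t) for t, d in depths_by_duration.items()}
```

Applied to the reported values, 13.34 mm in 10 min corresponds to about 80 mm/hr, while 83.86 mm in 24 hr corresponds to about 3.5 mm/hr, which is why short-duration storms dominate the upper end of an IDF curve.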

Empirical Equation for Estimation IDF curve
An IDF relationship is a mathematical relationship between the rainfall intensity (I), the duration (td) and the return period (Tr). In this study, the empirical formula in Eq. (21) is used (Cardoso, Bertol, Soccol, & Sampaio, 2014); it was found to provide the best correlated and most consistent relationships for design storm predictions.
$$I_T = \frac{a}{\left(t_d + b\right)^{c}} \qquad (21)$$
Where: $I_T$ is the rainfall intensity for a storm of duration $t_d$ (in hr). The constants a, b and c are empirical parameters that depend on the location, shape and scale of the area; they are obtained from the area characteristics and precipitation data using logarithmic relationships (Aysar, 2016).
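One way to estimate the three parameters through a logarithmic relationship is sketched below in Python. It assumes the form I = a / (t_d + b)^c, so that ln I = ln a - c · ln(t_d + b) is linear for a fixed b; the sketch tries a grid of b values and keeps the one with the smallest squared error in log space. This is a stand-in for illustration, not the algorithm used by the fitting software in the study, and the function name and grid are hypothetical.

```python
import math

def fit_idf_parameters(durations_hr, intensities, b_grid=None):
    """Fit a, b, c in I = a / (t + b)**c by grid search on b plus
    ordinary least squares on ln(I) versus ln(t + b)."""
    if b_grid is None:
        b_grid = [i / 100 for i in range(0, 201)]  # 0.00 to 2.00 hr
    log_i = [math.log(i) for i in intensities]
    n = len(log_i)
    mean_y = sum(log_i) / n
    best = None
    for b in b_grid:
        xs = [math.log(t + b) for t in durations_hr]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, log_i))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(xs, log_i))
        if best is None or sse < best[0]:
            # a = exp(intercept); c is the negative of the slope
            best = (sse, math.exp(intercept), b, -slope)
    _, a, b, c = best
    return a, b, c
```

The fit is run once per return period, which yields one (a, b, c) triple per curve.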

Evaluation of model
Model simulations can be evaluated using the coefficient of determination (R²). R² is the square of the Pearson product-moment correlation coefficient and describes the proportion of the total variance in the observed data that can be explained by the model. The closer the value of R² is to 1, the better the agreement between the simulated and the measured values. R² is used to check the accuracy of the model output and is given by (Nguyena et al., 2016):
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(O_i-\bar{O}\right)\left(P_i-\bar{P}\right)\right]^2}{\sum_{i=1}^{n}\left(O_i-\bar{O}\right)^2\,\sum_{i=1}^{n}\left(P_i-\bar{P}\right)^2}$$
Where: n is the number of compared values, $O_i$ is the observed data, $\bar{O}$ is the observed mean, $P_i$ is the simulated data and $\bar{P}$ is the simulated mean. Since 0 ≤ R² ≤ 1, it follows that -1 ≤ R ≤ 1. R is commonly called the correlation coefficient and measures the degree of linear association between the observed and predicted values.
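The evaluation measure can be computed directly from paired observed and simulated values, as in this Python sketch of the squared Pearson correlation. The function name is illustrative.

```python
def coefficient_of_determination(observed, simulated):
    """R^2 as the squared Pearson correlation of observed vs simulated."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(simulated) / n
    # Cross-product and sums of squares about the respective means.
    cross = sum((o - mean_o) * (p - mean_p)
                for o, p in zip(observed, simulated))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in simulated)
    return cross ** 2 / (ss_o * ss_p)
```

Because this measure only captures linear association, a simulated series that is a constant multiple of the observations still scores 1, so it is usually read alongside a direct comparison of magnitudes.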

Rainfall data and analysis
Rainfall frequency analysis estimates the magnitude of precipitation falling at a given point or over a given area for a specified duration and return period. The precipitation data used for frequency analysis have to be arranged as an annual maximum series, or converted to this form from continuous records of daily rainfall. Analysis of the maximum rainfall over consecutive days for different return periods is a basic tool for the safe and economic planning and design of small dams, bridges, culverts, irrigation and drainage works, etc. (Bhakar et al., 2008). The monthly rainfall of Jimma town is most prominent from May to September. There is also a significant amount of rainfall during April, May and October. The annual maximum rainfall in Jimma is above 150 mm. This shows that the area receives a large and consistent amount of rainfall annually. The nearby stations were used to interpolate the 3% of data missing at Jimma station. The arithmetic mean method (station-average method) was applied to fill the missing data because the normal annual precipitation of the index stations lies within ±10% of the normal maximum annual precipitation of Jimma station.
Journal of Natural Sciences Research, www.iiste.org, ISSN 2224-3186 (Paper), ISSN 2225-0921 (Online), Vol.12, No.21, 2021
Figure 6: Annual average maximum rainfall of stations around Jimma town
Table 2: Annual maximum 24-hr rainfall (mm) at Jimma, 1986-2015
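An annual maximum series of the kind tabulated here can be extracted from a continuous daily record along the following lines. This is a Python sketch with a hypothetical input format, a sequence of (date, rainfall_mm) pairs in which None marks a missing day; it is not the procedure used to build Table 2.

```python
from datetime import date

def annual_maximum_series(daily_records):
    """Return {year: maximum daily rainfall in mm} from daily records.

    daily_records : iterable of (datetime.date, rainfall_mm or None)
    Days with None (missing) rainfall are skipped.
    """
    maxima = {}
    for day, rainfall_mm in daily_records:
        if rainfall_mm is None:
            continue
        year = day.year
        if year not in maxima or rainfall_mm > maxima[year]:
            maxima[year] = rainfall_mm
    # Return the series in chronological order.
    return dict(sorted(maxima.items()))
```

A year with many missing days can understate its true maximum, which is one reason stations with a high share of missing data are excluded before this step.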
The data obtained from NMA were daily rainfall data. For the purpose of this study, only the maximum daily rainfall of each year was selected. Table 2 shows the maximum daily rainfall for each of the 30 years. From these annual maximum daily rainfall data, the mean, variance, standard deviation, coefficient of variation, skewness and kurtosis were computed (Table 3).
Table 3: Descriptive statistics of maximum daily rainfall data of Jimma Town
According to Table 3, the coefficient of variation of the rainfall is 19%, which indicates that the area has less variable rainfall. The kurtosis coefficient of the station is negative, indicating that the distribution is flatter than the normal distribution. The distribution of the rainfall is right (positively) skewed, indicating that low rainfall values occur frequently whereas high rainfall values occur rarely.
3.2 Best Fitting Probability Distribution Function
A probability distribution function is used to evaluate the probabilities or return periods of a certain rainfall amount at a given location. Such data can then be used to assess flood discharges of a given return period through modeling, and are also applied in flood alleviation and forecasting schemes (Soro et al., 2016). As can be observed from the graph below, the reduced variate has a direct relationship with the recurrence interval for all the maximum rainfall measurements.

Figure 7: Frequency analysis result
For Jimma town, the data series of the station was also tested against different probability distributions to select the best distribution function. As shown in Figure 3, the Gumbel probability distribution was found to be an appropriate probability distribution function that describes the data of the station well. The aim of frequency analysis is to relate the magnitude of extreme events to their frequency of occurrence by using probability distributions (Alam et al., 2018). Overall, the generalized extreme value function was selected as the best probability density function for the annual maximum daily rainfall data of Jimma town.
Figure 8: Probability distribution function fitted to the annual maximum daily rainfall data
Identifying best fit distribution functions using EasyFit software
The results show that the generalized extreme value distribution was selected as the best probability density function for the station data.
Table 4: Annual maximum daily rainfall data fitting: the location, shape and scale parameters of the selected distribution
Accurate estimation of extreme rainfall could help to alleviate the damage caused by extreme storms and to achieve more efficient design and management of water infrastructure systems (Nguyena et al., 2016). The graph above shows the probability density function of the annual maximum daily rainfall data.
Table 5: Goodness of fit
Table 4 shows the six frequency distributions applied in this study; it suggests that the best frequency distribution for the peak daily rainfall was GEV. Based on the hypotheses stated above and the GOF tests, the computed p-values for Jimma station were greater than the significance level (α = 0.05), so the null hypothesis (H0) cannot be rejected. Therefore, the maximum daily rainfall data fit the Generalized Extreme Value distribution.

Computed Extreme Rainfall Quantiles (XT)
Rainfall frequency analysis plays an important role in hydrology and in the economic evaluation of water resources projects. It helps to estimate return periods and their corresponding event magnitudes, and to create reasonable design criteria (Hassan & Ping, 2012). For Ethiopia, Gumbel's extreme value distribution has been suggested for fitting annual extreme rainfall data (Gebremedhin, 2017). According to the results obtained from the evaluated distribution functions, however, GEV described the given data set better. It is also a well-known distribution function that is used only for extreme events (peak rainfalls) (Waghaye et al., 2015). Most hydrological studies require short-duration (sub-daily) rainfall data, but in developing countries like Ethiopia such data are often not available. Therefore, frequency analysis was used to compute sub-daily rainfall for different durations (10 min, 15 min, 30 min, 1 hr, 2 hr, 3 hr, 6 hr, 12 hr and 24 hr) and return periods (2, 5, 10, 25, 50 and 100 years).
Table 6: Rainfall estimates for different return periods and durations using the GEV distribution for Jimma Town
As shown in Table 5, for each return period the extreme rainfall values increase with increasing rainfall duration. The highest rainfall value (83.86 mm) was obtained for a duration of 24 hr and a 100-year return period, whereas the lowest value (13.34 mm) was obtained for a duration of 10 min and a 2-year return period. For a constant rainfall duration, the calculated rainfall depths increase with increasing return period.

The rainfall Intensity Duration Frequency (IDF) curve of observed data
Frequency analysis relates the magnitude of extreme events to their frequency of occurrence through the use of probability distributions. Intensity is an important characteristic of rainfall because, other things being equal, one rainstorm of high intensity can cause more flooding and soil erosion than several storms of low intensity (Kotei et al., 2013). The graph below shows that the average rainfall intensity decreases as the duration increases and increases with the return period.
Figure 9: IDF curve with logarithmic scale and normal scale
Generally, intense storms last for very short durations (Shinde et al., 2017). According to the above figure, the maximum average intensity of the storm decreases as the duration increases. For a given duration, a storm of higher intensity has a longer return period, that is, it is less likely to occur than a storm of lower intensity.

Future IDF curve of the area by using local parameters
3.5.1 Estimation of IDF Parameters
The IDF parameters (a, b and c) were estimated using the IDF curve fit tool of the MIDUSS software, Version 2.25, for durations of 10 min, 15 min, 30 min, 1 hr, 2 hr, 3 hr, 6 hr, 12 hr and 24 hr and return periods of 2, 5, 10, 25, 50 and 100 years. The results are summarized in Table 7.
Table 7: Values of IDF parameters for different return periods
Table 7 shows the general trend of the constants. The "a" coefficient increases with increasing return period and rainfall duration for all of the stations. The "b" constant is almost the same in all stations. In contrast, the "c" exponent shows no consistent trend: it either increases or decreases with increasing recurrence interval for all stations.

Test for Model Performance
The observed and computed rainfall intensities at various frequencies and durations were compared for the station using the coefficient of determination. The comparison shows a strong correlation (R² = 0.9884) between the observed and computed (predicted) values. Hence, the intensity values computed using the IDF parameters adequately describe the observed data, and the parameter estimation model performs well.
3.5.3 Future IDF curve for Jimma town
Figure 10: IDF curves with logarithmic scale and normal scale for Jimma
According to the above figure, for a given duration of rainfall, the intensity (I) increases with increasing return period. The figure provides information about the intensity-duration-frequency relationship in the study area for return periods of 2 to 100 years. Based on the IDF curves, the relationship has a decaying form in which longer rainfall durations are accompanied by lower rainfall intensities for all return periods.

4. Conclusion
Analysis of rainfall and determination of the annual maximum daily rainfall can improve the management and effective use of water resources. This study applied frequency analysis to assess extreme rainfall events in the area. The results show that the Generalized Extreme Value (GEV) distribution was the most suitable for describing the annual maximum daily rainfall in Jimma town. The computed intensities were highly correlated with the observed ones, and the results agree well with previous studies in other parts of the country. The constructed IDF curves can be used as a baseline for estimating extreme events in the design and development of soil and water conservation activities, urban agriculture and water supply projects in the area. Finally, to reduce uncertainty before applying the IDF curves from this study, it is recommended to increase the number of stations and to verify the results with non-stationary data.