Modelling Default Risk of Borrowers: Evidence from Online Peer to Peer Lending Platforms in Australia

Peer-to-peer (P2P) lending has the capacity to transform the mass banking industry worldwide, but credit risk modelling remains the core challenge for P2P platforms. The general objective of this study is to analyse the credit default risk of borrowers on a P2P online lending platform based in Australia. The specific objectives are to identify the information applicants provide when requesting a loan facility, and to use loan data published by RateSetter.com to predict the likelihood of credit risk on the platform. We employ a binary logistic regression model to assess the likelihood of loan default. Based on the mathematical approach and the nature of the dependent variable, we group the variables into categorical, numerical-continuous and binary types. The dependent variable is dichotomous, and the real-life dataset was retrieved from a popular and competitive online lending platform based in Australia, covering 2014-2017. We find that early repayment, no-mortgage tenant status, car, debt consolidation, investment, major events, professional services, 3-year loan duration, 4-year loan duration, interest rate and income have a significant influence on borrowers’ likelihood of default. Our empirical coefficients suggest an 83.4% likelihood of default for borrowers with these characteristics, and we therefore recommend a critical examination of the information borrowers present to the platform. This paper fulfills the need to examine the credit information provided by loan applicants. It also endeavors to predict borrowers’ default risk and to identify the reasons contributing to credit default risk in online lending.
Keywords: Credit Risk, Peer-to-Peer Online Lending, Binary Logistic Regression
DOI: 10.7176/RJFA/10-2-01


Introduction
Over the past decade, social lending, also known as peer-to-peer (P2P) lending, has become a striking field of interest among researchers and financial experts. Online P2P lending is a financial innovation that combines the internet and the private lending marketplace to provide unsecured loans to borrowers. According to Bachmann et al. (2011) and Tao, Dong et al. (2017), P2P lending is the provision of unsecured loans from lenders to borrowers through an online marketplace, without the involvement of financial intermediaries. Although new and still evolving, it is designed to supplement lending by mainstream financial institutions by closing the borrowing deficit among Small Scale Medium Enterprises (SSMEs) as well as individual borrowers.
Social lending has the potential to transform the mass banking model worldwide. Because the marketplace is organized on the internet, online P2P lending can connect every potential lender and borrower. The large majority of borrowers, who have low net worth and borrow comparatively modest amounts, now have the chance to obtain finance via P2P marketplace platforms. P2P lending also encourages intra-community, regional and global lending and borrowing. The sustainability of online lending, however, depends on the ability of service providers to devise risk management strategies that address the information asymmetry between lenders and borrowers, since the trustworthiness of each party is unknown to the other.
Imperfect information between borrowers and lenders causes an adverse selection problem and can misguide lenders into funding higher-risk sub-prime borrowers (Akerlof 1970, Shen, Krumme et al. 2010). In an attempt to limit this phenomenon, online platforms have adopted "unconventional means" of credit rating to judge the creditworthiness of loan applicants. Borrowers' transaction and credit information available on Lending Club, Yue-Bao, Renrendai.com and Alibaba, to mention a few, is mostly used as the yardstick to determine whether a borrower is "good" or "bad". In the U.S. and other Western countries there exists a standard credit rating system, in which historical financial information about the borrower is provided directly by specialized and independent credit rating agencies such as FICO (Tao, Dong et al. 2017, Malekipirbazari and Aksakalli 2015, Lin, Li et al. 2017). P2P platforms limit market losses due to non-performing loans (NPL) by maintaining dedicated funds. The Zopa Safeguard Trust (ZST), for instance, was founded by Zopa and provides a total sum of £9.6m ($10.56m) to mitigate credit risk and NPL. Moreover, peer-to-peer marketplaces allow lenders and loan applicants to participate in pools of loans offered on the platform, which further limits the kind of market risk that occurred in 2007-2008. As most internet finance platforms are still emerging and improving their pricing and risk management models, addressing credit risk is often complicated. Credit facilities provided by these marketplace platforms are unsecured and are neither securitized nor closely scrutinized, making credit risk assessment the central concern for borrowers and lenders. Recent studies focus on the auditing mechanisms of peer-to-peer lending platforms and on assessing the borrower characteristics that affect the credit default rate, using information provided by borrowers.
Against this background, we model the default risk of borrowers using a dataset from a P2P lending platform in Australia. The study employs a binary logistic regression model to evaluate borrowers' credit information (both soft and hard financial information) and other contributing factors, in order to shed light on a standard credit risk assessment method for lending platforms and lenders. We look critically at the individual variables and how they generate repayment problems, which is a major departure from the works reviewed in the literature. From the theoretical perspective, the study adds to the existing literature on credit risk screening problems in P2P lending and broadens researchers' knowledge of ongoing credit risk analysis. From the practical perspective, the empirical findings serve as a yardstick for lenders in making informed and optimal investment decisions. They also help in understanding the characteristics of P2P lending platforms that boost the credit scores of borrowers, and benefit lenders who take advantage of lending booms in online loans. The rest of this paper is organized as follows.
In Section 2, we present the related empirical literature and institutional framework of P2P lending. Section 3 describes the methodology and presents the empirical results for measuring the default risk of borrowers. Finally, Section 4 discusses the implication of the empirical result and concludes the whole article.

Literature Review
Fundamentally, credit may be defined as any loan facility lent to a potential borrower, or any monetary instrument (for example, a fixed-coupon bond) that involves pre-determined fixed payments over a given time period. According to Anita (2008), credit risk is the possible loss of valuable assets caused by a probable deterioration in the creditworthiness of a counterparty or its inability to meet contractual obligations (Manab, Theng et al. 2015). Cantor (2001) identified credit risk as the dominant risk for financial institutions, since the core activities of banks are lending and deposit-taking. Securities firms may also face credit risk through their involvement in derivatives markets, borrowing or lending securities, and making fringe loans to customers. Thus, credit risk depends on the inability of borrowers to generate sufficient cash flows through operations, earnings or asset sales to meet future interest and principal payments on outstanding debt.
The proliferation of P2P lending platforms supported by the internet in recent years has stirred the interest of researchers. A number of studies focus on the advantages and economic impact of the evolving online P2P lending market. Culkin, Murzacheva et al. (2016) identified the emergence of online P2P lending platforms as a direct alternative source of funds that facilitates the growth and development of SSMEs. Fraser, Bhaumik et al. (2015) suggested that P2P lending bridges the financial gap created by the traditional banking system and facilitates information flow among players on the platform. They were, however, unclear about the role of service providers in the context of regulation. An existing body of literature focuses on the determinants of funding success, the interest rate and credit risk, particularly the default rate of borrowers. Mild, Waitz et al. (2015) and Guo, Zhou et al. (2016) contributed to the ongoing debate on online lending by proposing credit risk assessment models for online borrowers. A study by Klafft (2008) on the Prosper online platform shows that borrowers' credit ratings and verified bank account information significantly influence the probability of successful funding. Empirically, the study found credit rating and debt-to-income ratio to be the fundamental factors determining the interest rates charged on loans, whereas verified bank account information and home ownership had an insignificant impact on interest rates. Irrespective of the credit rating category, Klafft (2008) suggested three guiding rules that mitigate the risk of default: (1) investors must only lend to borrowers without any delinquent accounts; (2) in addition to Rule 1, the borrower must have a debt-to-income (DTI) ratio of less than 20%; and (3) in addition to Rule 2, the borrower must have no credit inquiries during the last 6 months.
Empirical studies by Lin, Prabhala et al. (2013), Duarte, Siegel et al. (2012), Emekter, Tu et al. (2015), Navarro-Galera, Rayo-Cantón et al. (2015), and Chen, Li et al. (2017), however, argued strongly against these findings of Klafft (2008). They used new datasets from renowned lending platforms such as Renrendai.com in China, LendingClub in the U.S.A. and Zopa, to mention a few, and found that unpredictable socio-economic and financial factors have a strong effect on the likelihood of loan funding.
In addition, a substantial number of recent articles contribute to the ongoing analysis of credit risk in finance. Tao, Dong et al. (2017) analysed data from Renrendai.com and found that borrowers who earn higher incomes and own a car (vehicle) are more likely to be funded. Their findings also suggest that such borrowers pay lower interest rates and are likely to default. Lin, Li et al. (2017) empirically investigated borrowers' default risk in P2P lending in China using logistic regression and found that gender, age, marital status, educational level, working years, company size, loan amount, monthly payment, debt-to-income ratio and delinquency history contribute significantly to modelling loan default risk among borrowers. Ma and Wang (2016), by contrast, examined the factors that determine credit risk by looking at the entire P2P platform, the borrower and regulatory policy, using Interpretative Structural Modelling (ISM). Their empirical results reveal a direct relationship between the auditing mechanisms of P2P platforms and credit risk. They further found that borrowers' moral level, job stability and the policy environment also affect credit risk.
Although many studies have analysed the credit default risk of borrowers on P2P lending platforms, many of them fail to critically examine borrowers' information and how it generates repayment difficulties. This study adds to the existing literature by providing evidence from RateSetter.com, a typical P2P online platform based in Australia.
1.1.1 RateSetter P2P
RateSetter was established in the United Kingdom in 2010 and launched in Australia in 2014, with the goal of addressing the investment gap between cash and shares by making better returns accessible to everyone in a simple and modern way supported by information technology. Over £2,782,454,205 has been provided by investors, who have earned £111 million in interest. The platform has a team of 200 people based in London, Leicester and across the U.K. Its equity investors include the major investment fund managers Woodford Investment Management and Artemis. Lenders can choose to invest from a minimum of £10 up to £1,000,000s. The rolling market offers flexibility, while the 1-year and 5-year markets offer long-term growth and income opportunities. RateSetter was the first P2P online platform to launch a provision fund that safeguards against NPL. The platform has therefore maintained a 100% track record since 2010: investors receive their expected returns and no lender has lost a penny.

RateSetter Borrower Default and Recovery Team
A number of debt recovery procedures are in place to recover defaulted loans. With respect to late repayment, RateSetter may take any of the following steps: serving the borrower with a repayment notice; re-issuing a direct debit instrument to the borrower's bank account; having the RateSetter debt collection team contact the borrower to discuss their circumstances and ascertain whether they are experiencing hardship; developing alternative repayment arrangements that match repayment amounts to expected cash flows, or sending default reminders to directors and third-party guarantors to repay the amount due; and, if appropriate, reporting non-payment to credit bureaus and thereafter appointing an external collection agency to vigorously pursue payment. Where the borrower defaults, RateSetter pursues legal processes, including court action where necessary. Where the loan facility is secured, RateSetter appoints a specialized collection team to exercise the security interest and to repossess and sell the relevant property.

Methodology
2.1 The mathematical model
In modelling loan default prediction, this paper adopts a binary response denoted by \(Y\), such that \(E(Y \mid X = x)\) represents the expected value of \(Y\) for a particular realization \(x\) of the independent variable \(X\). The function \(g(X)\) specifies the probability of loan default, as expressed in Eqn 4. The right-hand side of Eqn 4 represents the reducible error term that is inherent in \(Y\). Hence, the learning problem is to find a good estimate \(g^{*}(x)\) that minimizes the reducible error term in the model. Given this mathematical model, binary logistic regression is used to analyse the likelihood of loan default, because a general (linear) regression model is not suited to a dichotomous dependent variable. Our dependent variable is categorized into non-default and default borrowers, which makes this a classification problem of the kind typically handled by logistic regression. In binary logistic regression, the explanatory variables determine the probability of default or non-default. We therefore assume \(z\) to be an unobservable continuous number that depicts the likelihood of default. All other things being equal, the value of \(z\) is positively related to the default probability. To transform this continuous number into a probability, we adopt the binary logit model, which takes the form of Eqn 5:

\[ p = \frac{1}{1 + e^{-z}} \qquad (5) \]

In Eqn 6 below, \(x_k\) and \(z\) are linearly related:

\[ z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k \qquad (6) \]

where \(p\) represents the probability of default, \(x\) the explanatory variables, \(\beta_k\) the parameter estimates and \(k\) the number of explanatory variables (Hosmer and Lemeshow 2000).
2.1.1 Data Description
This section describes and summarizes the data used in the study. Our dataset was retrieved from the RateSetter website, www.ratesetter.com.au, the Australian site of a popular Peer-to-Peer platform founded in the United Kingdom (U.K.). The dataset runs from 2014:M10 to 2017:M12.
Approximately 15,776 borrowers were assigned loans amounting to $208,078,046 ($208.1m) within the four-year period. The dataset comprises loan transactions that were either issued or had reached maturity, and thus contains financial information about the empirical creditworthiness of the applicants. Loan attributes were then assigned to each borrower according to the credit information provided (soft information, hard information and loan characteristics). During data preprocessing and screening, we excluded the attributes "borrower I.D" and "contract date", since they have no effect on the results of this study.
To identify the creditworthiness of each borrower empirically, we treated the 'loan status' values '<30 days', '>30 days', 'in-default' and 'Hardship' as "Bad or Default Borrower", whereas 'fully paid' and 'On-schedule' were classified as "Non-Default or Good Borrower". Based on the available data, the features used in our predictive model comprised the borrower's age, employment history, annual income, purpose of the loan facility, loan amount, interest rate, gender, early repayment by the borrower and the borrower's status (whether the borrower defaulted or not, otherwise known as a bad or good borrower). Among these variables, employment history, loan purpose, gender, early repayment, housing status and borrower's residence are categorical. Specifically, gender, repayment status and early repayment were coded as binary variables; borrower's residence was divided into Australian Capital Territory (ACT), New South Wales (NSW), Northern Territory (NT), Queensland (QLD), South Australia (SA), Tasmania (TAS), Victoria (VIC) and Western Australia (WA); and housing status comprises Own a home (mortgage), Own a home (no mortgage), Tenant (mortgage) and Tenant (no mortgage). This categorization increased the reliability and stability of the model while reducing its computational complexity.
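The coding scheme above, together with the logit transformation described in Section 2.1, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the choice of "Tenant (mortgage)" as the reference level and the zero intercept in the usage lines are assumptions made purely for illustration. Only the status labels, the housing categories and the -0.769 coefficient are taken from the text.

```python
import math

# Loan-status values grouped as in the text (labels copied verbatim).
DEFAULT_STATUSES = {"<30 days", ">30 days", "in-default", "Hardship"}
NON_DEFAULT_STATUSES = {"fully paid", "On-schedule"}

# Housing-status categories listed in the text.
HOUSING_LEVELS = [
    "Own a home (mortgage)",
    "Own a home (no mortgage)",
    "Tenant (mortgage)",
    "Tenant (no mortgage)",
]


def code_default(loan_status):
    """Map a loan-status label to the binary response: 1 = default, 0 = non-default."""
    if loan_status in DEFAULT_STATUSES:
        return 1
    if loan_status in NON_DEFAULT_STATUSES:
        return 0
    raise ValueError(f"unknown loan status: {loan_status!r}")


def dummy_code(value, levels, reference):
    """Dummy-code a categorical value, dropping the reference level.

    The reference level is an assumption of this sketch; the paper does not
    state which category was used as the baseline for every variable.
    """
    if value not in levels or reference not in levels:
        raise ValueError("value and reference must both be known levels")
    return [1 if value == level else 0 for level in levels if level != reference]


def default_probability(intercept, coefficients, features):
    """Logit model: z is linear in the features and p = 1 / (1 + exp(-z))."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Usage with hypothetical inputs: zero intercept, housing dummies only.
x = dummy_code("Tenant (no mortgage)", HOUSING_LEVELS, reference="Tenant (mortgage)")
p = default_probability(0.0, [0.0, 0.0, -0.769], x)
print(code_default("Hardship"), x, round(p, 3))
```

Dropping one reference level per categorical variable avoids perfect collinearity among the dummies, which is why each coefficient is read relative to a baseline category.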

Data Presentation and Analysis
3.1. Preliminary (Econometric Approach)
Multicollinearity arises when the correlations between independent variables (IVs) are high, which makes it impossible to isolate their individual effects on the dependent variable (DV). High multicollinearity inflates the variance of the OLS estimator, as measured by the variance inflation factor (VIF), and reduces the significance of the estimates. To avoid data redundancy, skewing and the occurrence of Type II errors, we assessed multicollinearity by examining the VIF and tolerance (T). The T and VIF values are shown in Table 1, with VIF ranging from 1.009 to 1.682. In the OLS results, the collinearity statistics were greater than 0.2 for T (T > 0.200) and less than 2.5 for VIF (VIF < 2.500), indicating that multicollinearity is low and will not bias the results. In our dataset, the pairwise correlations between variables range from -0.283 to 0.259, much lower than the 0.8 critical threshold for a serious multicollinearity problem suggested by Gujarati (2009). We included eleven (11) independent variables in the model for precision. The standard deviation, skewness, standard errors, kurtosis and mean are presented in Table 2, and Table 3 shows the correlation matrix. The correlation between the variables was relatively low (less than 0.7). The lowest and highest income bands are below $5,000 and above $150,000 per annum respectively. The majority of RateSetter P2P applicants are located in QLD, NSW and VIC: 4,172 borrowers (26.4% of all borrowers) live in QLD, 4,147 (26.1%) live in NSW and 3,999 (25.3%) reside in VIC. The remaining five states and territories together account for 3,458 loan applicants (22.2% of borrowers), fewer than any one of the three states mentioned above. About 82.7% of borrowers are full-time employees, 6.1% are part-time workers and 7.4% are self-employed. A further 0.1% of borrowers are house-persons, and 2.0% are either retired or engaged in self-supporting economic ventures other than those mentioned above.
In terms of default history, 15,458 applicants have no default history and only 314 borrowers had defaulted in the past. The minimum loan duration is 6 months and the maximum is 7 years. Borrowers prefer loan durations of 1 year, 2 years, 3 years and 5 years to 6 months, 9 months, 1.5 years, 4 years and 7 years.
3.1.1 Results
According to Table 4, which reports the regression results, 7 out of 11 regressors contribute to loan default, as shown by their statistically significant t-statistics. The Hosmer and Lemeshow \(\chi^2\) test statistic (11.913) indicates that the model is significant in explaining the phenomenon. The Cox & Snell and Nagelkerke \(R^2\) values are 0.027 and 0.153 respectively. The standard errors (S.E. < 0.2) of the estimated coefficients in Table 3 further suggest that our regressors are free from multicollinearity. The estimated coefficients in our model are significant at the 1%, 5% and 10% levels, excluding the variables that are insignificant. Our results have several implications. Gender, borrower's state (where the borrower lives), employment history, age of the borrower and loan amount do not significantly affect the likelihood of loan default. Housing status as a whole contributes significantly to borrower default, and the No-mortgage Tenant category in particular is significant in modelling the default rate. Its coefficient of -0.769 means that being a No-mortgage Tenant lowers the logit of loan default by 0.769 units relative to a mortgage tenant, which multiplies the odds of default by 0.463.
The estimated coefficients show that loan purpose is positively related to default risk. The default risk for loans acquired for investment and for purchasing a car/vehicle is higher than for the reference category. For the loan purposes (1) debt consolidation, (2) major events, (3) investment and (4) professional development/services, the log odds of default increase by 0.637, 1.588, 0.812 and 0.794 units respectively.
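Because logistic regression coefficients are expressed in log-odds units, they are easier to interpret once exponentiated into odds ratios. The short Python sketch below is illustrative only: the dictionary and function names are hypothetical, and only the coefficient values are taken from the results reported above.

```python
import math

# Log-odds (logit) coefficients quoted in the Results text.
LOGIT_COEFFICIENTS = {
    "No-mortgage tenant": -0.769,
    "Debt consolidation": 0.637,
    "Major events": 1.588,
    "Investment": 0.812,
    "Professional development/services": 0.794,
}


def odds_ratio(beta):
    """Multiplicative change in the odds of default for a one-unit change in the regressor."""
    return math.exp(beta)


def percent_change_in_odds(beta):
    """The same effect expressed as a percentage change in the odds."""
    return (math.exp(beta) - 1.0) * 100.0


for name, beta in LOGIT_COEFFICIENTS.items():
    print(
        f"{name:35s} beta={beta:+.3f}  "
        f"odds ratio={odds_ratio(beta):.3f}  "
        f"change in odds={percent_change_in_odds(beta):+.1f}%"
    )
```

For the No-mortgage Tenant coefficient, exp(-0.769) is approximately 0.463, which matches the odds figure quoted in the text. A positive coefficient gives an odds ratio above 1 and a negative coefficient gives one below 1.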
The interest rate charged to a borrower is likewise significantly related to the default rate. As the interest rate increases by 1%, the default rate reduces to 32.7%. The borrower's repayment behaviour is also associated with default risk: a borrower who delays repayment by an additional month is 6.096 times more likely to default than borrowers who pay on time. A borrower who has defaulted in the past is 0.998 times as likely to default, all things being equal. Finally, income is generally significant in predicting default risk. A borrower whose salary is between $50k and $100k is 2.262 times more likely to default than the reference group, whereas a borrower whose salary is $100k-$150k is 2.896 times more likely to default. The results also show that a 1% increase in borrowers' income is expected to raise the default rate to 3.319 times the original. Ceteris paribus, there is a 0.834 likelihood that a borrower with the aforementioned characteristics will default. These characteristics strongly affect borrowers' default rate and must therefore be duly checked by lenders and the platform's financial experts.
3.1.3 Discussion
Credit risk is an important concern for P2P lending platforms. To broaden knowledge and understanding of default risk, otherwise known as the credit risk evaluation model, we employed binary logistic regression in modelling P2P funding success. The study computes the likelihood of choosing an alternative as a function of the attributes of all the available alternatives, as discussed in Section 2. First, contrary to our findings, related studies reveal that women in general exhibit common character traits in financial decision making that make them less likely to default (Powell and Ansic 1997, Chen, Li et al. 2017). Borrowers who are tenants without a mortgage are less likely to default than mortgage tenants, as noted in Section 3.
As the cost of rent rises, borrowers spend a greater proportion of the borrowed amount on it, which increases the likelihood of overdue repayment in future. The estimated coefficients, however, show that a mortgage influences the likelihood of successful funding and repayment: an increase in mortgage decreases the probability of loan default. Intuitively, the higher the cost of rent, the higher the borrower's default rate. Applicants on P2P platforms in Australia may be well informed about rent control measures. Moreover, borrowers who own their homes, as well as tenants with a mortgage, do not contribute significantly to default and are hence classified as good borrowers. The borrower's intention to repay the borrowed amount on schedule increases the logit of the estimate of repayment by 6.096 units.
Our research provides empirical evidence that some of the loan purpose variables contribute significantly to the default risk of borrowers. The estimated coefficients in Table 4 show that professional services, major events, vehicle acquisition, debt consolidation and major purchases are prime causes of credit risk among borrowers. Our results are thus consistent with Herrero-Lopez (2009), who identified loan decision variables as mediating between borrower characteristics and the likelihood of successful funding. Given the variations in application process, interest rate, loan amount and term length across loan products, each option presents a unique set of pros and cons. Peer-to-peer loans offer the benefits of expedited application processing, smaller loan amounts and shorter terms, but borrowers pay for these conveniences in the form of higher interest rates. Borrowers might also falsify the financial identity they provide to the platform, which may result in bad borrowers being misclassified as good and vice versa. Although borrowers' residential state is statistically insignificant in the model, the platform may find it difficult to monitor and track the economic activities of these geographically scattered clients. The exclusion of third parties, such as experienced traditional banks, from further verifying borrowers' financial information is likely to have repercussions for borrowers' default risk. To mitigate these risks, RateSetter is reported to make extensive use of the U.K.'s premier credit reference agencies and business credit rating agencies, such as Graydon and Dun & Bradstreet, to assess borrowers' financial information. Additionally, the industry uses sophisticated identification mechanisms such as CIFAS and other tools for combating financial fraud.

Summary and Conclusion
Economic and business opportunities depend on affordable and reliable information technology (I.T.). Internet speed and access have improved enormously worldwide in recent years, which has facilitated the creation of larger, faster and, above all, geographically diverse online lending marketplaces. The advent of online P2P marketplaces has helped to address the intermediation problems created by financial institutions. These platforms nevertheless demand much time and effort and are riskier than brick-and-mortar lending. Our study focuses on P2P lending in which individual lenders make unsecured loans to meet the financial requests of potential borrowers. Establishing the true trustworthiness and degree of creditworthiness of a potential borrower is paramount for the healthy functioning of social lending markets. In determining the risk score of an individual borrower, features such as past financial history, borrower's age, employment history, annual income, purpose of the loan facility, loan amount, interest rate, gender and various other financial features are used. Borrowers who are recently unemployed, are experiencing bankruptcy or have had loan repayment difficulties are more likely to be offered unattractive interest rates than borrowers with more positive credit reports. The imperfect information that exists in the fin-tech market compounds the information gathering problem. Moreover, relying exclusively on online verification might pave the way for debtors to undertake unproductive ventures after acquiring the credit facility. We therefore suggest a double-sword strategy: the adoption of both online and offline information gathering to avoid information deficiencies and bottlenecks in modelling credit default risk.
Despite the encouraging empirical findings, the study has a number of limitations. First, it analysed a dataset from the RateSetter.com website over a limited time range, which prevents us from exploring the structural dynamics among borrowers, lenders and lending platforms. Second, other micro- and macroeconomic variables such as the exchange rate, disasters (e.g. fire outbreak, death), government regulations, banking regulations and taxes, and fraud may affect our empirical estimates. For further studies, we therefore propose applying the binary logistic regression model to data on borrower characteristics from other online lending platforms over a longer time range.