De Econometrist takes a statistical view of the world.
Historically, the use of models was relatively rare, but nowadays they are far more common. The trend towards data-driven decision making has led to an ever-increasing impact of models. Implementing a model proceeds in steps: first, the model is developed and calibrated on the training dataset; after this, the model is implemented and put to use. While the model is in use, it should be tested on a regular basis. If testing shows significantly deteriorating model performance, the sequence of actions is repeated. This process is called the model lifecycle; the figure below shows the different steps of the cycle.
Figure 1 Model Life Cycle
The model lifecycle is the process that contains all the elements required to keep a model up and running, and performing as desired. It starts with the development of a model and ends when the model no longer performs adequately and is decommissioned. To assess whether models need to be redeveloped or recalibrated, they need to be tested.
Model testing takes place at two stages in the model lifecycle. The first round of testing is performed after the (re)development or (re)calibration of the model and is used to determine whether the action has led to the desired level of performance. The second round is performed while the model is in use, to assess whether the model still performs adequately. Model testing is therefore an important aspect of the model lifecycle, as it is the main factor in the decision whether a model remains in use.
In theory, model testing is about the following two topics:
To do so, the following model testing requirements are identified:
Doing this in practice is, however, not as straightforward as these requirements may suggest, as models can be:
Testing therefore tries to take all the different facets of models, model performance and the model environment into account, and assesses whether these are still in line with the requirements. This makes model testing an extensive process that can be time-consuming and costly. It is further hampered by the fact that model performance can never be verified as positive; only the absence of indicators of non-performance can be measured. Still, it remains an important facet, because of the cost of non-performing models. Take for example a model that assigns clients to a risk category. If the model is not able to discriminate between good and bad clients, the assigned risk category could effectively be random. This would harm revenue in two ways:
This is just one of many possible examples of what could be wrong with a model. The same model could, for instance, still adequately predict the amount of money that will be lost given that a client defaults, yet the overall result would still be inadequate.
Banks and financial institutions were among the first companies to start modelling many aspects of their business. The area in which this was most prevalent was credit risk modelling.
Credit risk models aim to predict, as the name suggests, the credit risk associated with a credit event of a specific client. In more practical terms, they try to predict the expected loss on a client at a certain point in time. In credit risk modelling, the definition of when a credit event counts as a loss is of paramount importance, where a loss means the failure of payments, bankruptcy or default. What remains to be answered is how to determine when a loan falls into one of these categories. Take for example a client that is 5 days past due: this may already indicate trouble with repaying the loan, but it may just as well be a technical issue. This illustrates the importance of properly setting the threshold for the past-due period. The standard assumption is that a loan is considered non-performing when a client is more than 90 days (3 months) past due. When there is an indicator of non-performance, the bank should recognize the loss and create provisions (sometimes called reserves or allowances). To this end, the bank estimates the expected loss (EL). The expected loss is based on three parameters: the value of the loan (i.e. the exposure at default, EAD), the probability that the loan will default (i.e. the probability of default, PD) and the estimate of the part of the loan that will be lost in case a default occurs (i.e. the loss given default, LGD). To sum up, the expected loss is calculated as follows:
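Written out with the three parameters defined above, the standard formula is:

```latex
\mathrm{EL} = \mathrm{PD} \times \mathrm{EAD} \times \mathrm{LGD}
```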
These parameters are models themselves, each with unique drivers or other models behind them. A model within a model is called a sub-model; for this article, the decision was made not to go beyond the main drivers of EL.
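As a minimal illustration (with hypothetical parameter values, not taken from any real portfolio), the expected loss follows directly from the three parameters:

```python
def expected_loss(pd: float, ead: float, lgd: float) -> float:
    """Expected loss as the product of probability of default,
    exposure at default and loss given default."""
    return pd * ead * lgd

# Hypothetical loan: 2% default probability, 200,000 exposure,
# 40% of the exposure lost in case of default.
el = expected_loss(pd=0.02, ead=200_000, lgd=0.40)
print(el)  # 1600.0
```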
The probability of default (PD) is the likelihood that a borrower will not be able to make repayments over a specific timeframe. It depends on multiple factors, such as borrower characteristics as well as the economic environment. PDs are estimated using statistical techniques and historical data. Generally, the higher the default probability, the higher the interest rate the lender will charge the borrower: creditors typically want a higher interest rate to compensate for bearing higher default risk.
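One common statistical technique for estimating PDs from historical data is logistic regression on borrower characteristics. A minimal sketch, using entirely synthetic data and made-up features (the real drivers and their coefficients would come from the bank's own history), could look like:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic history: two made-up borrower features
# (say, debt-to-income ratio and years employed) plus a default flag.
n = 1000
X = rng.normal(size=(n, 2))
# In this toy data, a higher first feature raises the default odds.
p_true = 1 / (1 + np.exp(-(-2.0 + 1.5 * X[:, 0])))
y = rng.binomial(1, p_true)

model = LogisticRegression().fit(X, y)

# Estimated PD for a new borrower with the given feature values.
new_borrower = np.array([[0.5, -0.2]])
pd_estimate = model.predict_proba(new_borrower)[0, 1]
print(round(pd_estimate, 3))
```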
The exposure at default (EAD) is the total value of exposure at the time of a borrower’s default on a loan. Banks often calculate an EAD value for each loan and then use these figures to determine their overall default risk. EAD is a dynamic number that changes as a borrower repays the lender. Banks must disclose their risk exposure. A bank will base this figure on data and internal analysis, such as borrower characteristics and product type.
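Because EAD declines as the borrower repays, it is naturally a schedule rather than a single number. A simple sketch, assuming linear amortisation (equal principal repayments) purely for illustration:

```python
def ead_schedule(principal: float, n_periods: int) -> list[float]:
    """Outstanding exposure at the start of each period under
    linear amortisation (equal principal repayments)."""
    repayment = principal / n_periods
    return [principal - repayment * t for t in range(n_periods)]

# Hypothetical 12-period loan of 120,000.
schedule = ead_schedule(principal=120_000, n_periods=12)
print(schedule[0], schedule[6])  # 120000.0 60000.0
```

Real EAD models are richer than this: they also account for product type and for undrawn credit lines that a borrower may draw on shortly before default.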
The loss given default (LGD) is the share of the asset that is lost when a borrower defaults on a loan. The method used most frequently to predict the LGD is to look at the uncovered part of the loan, since this is the part that is lost upon default. One of the main input factors for this estimation is the Loan-To-Value, which is the ratio of a loan to the value of the asset.
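A rough sketch of this idea, assuming (for illustration only) that exactly the collateral value is recovered on default, so the uncovered part of the loan is the loss:

```python
def lgd_estimate(loan: float, collateral_value: float) -> float:
    """Loss given default as the uncovered share of the loan,
    assuming the collateral is fully recovered at its current value."""
    uncovered = max(loan - collateral_value, 0.0)
    return uncovered / loan

# Loan-to-value above 100%: a 250,000 loan against collateral worth 200,000.
lgd = lgd_estimate(loan=250_000, collateral_value=200_000)
print(lgd)  # 0.2
```

In practice, recovery costs, time-to-recovery discounting and fluctuating collateral values make LGD estimation considerably more involved.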
Expected loss is covered by revenues (interest rates, fees) and by loan loss provisions (based on the level of expected impairment). Expected loss is an important estimate, as it directly impacts the profit & loss statement. The expected loss corresponds to the mean value of the credit loss distribution. Hence, it is only an average value, which can easily be exceeded. Therefore, we define the unexpected loss as the difference between a high quantile and the expected loss (see Figure 2).
Figure 2 Expected vs Unexpected Loss
Regulations such as the Basel agreements demand that banks should hold enough capital to fully cover the unexpected loss, which is called the regulatory capital or reserve.
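The relation between expected and unexpected loss can be illustrated with a small Monte Carlo sketch over a synthetic portfolio (all parameters made up): the unexpected loss is the distance between a high quantile of the simulated loss distribution and its mean, as in Figure 2.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical portfolio: 1,000 loans with identical parameters.
n_loans, pd_, ead, lgd = 1000, 0.02, 100_000.0, 0.4

# Simulate portfolio losses over many scenarios; in each scenario,
# every loan defaults independently with probability pd_.
n_scenarios = 50_000
defaults = rng.binomial(n_loans, pd_, size=n_scenarios)
losses = defaults * ead * lgd

expected_loss = losses.mean()
quantile_999 = np.quantile(losses, 0.999)  # a high quantile (99.9%)
unexpected_loss = quantile_999 - expected_loss
print(expected_loss, unexpected_loss)
```

The 99.9% level is the confidence level used in the Basel internal-ratings-based approach, but here it is simply an illustrative choice; the independence assumption between loans is also a strong simplification.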
There are several issues that need to be accounted for when modelling credit risk:
The expected loss (and thus the unexpected loss) on a loan or credit fluctuates over time. Loans are paid back over time and have a declining outstanding amount to be repaid. It is therefore important that these models accurately predict the expected loss; the following model lifecycle components are there to ensure this.
To assess whether a credit risk model still performs adequately, the model needs to be tested. To do so, various kinds of tests can be performed. These tests can be attributed to the following categories:
This can be represented graphically, as seen in Figure 3, together with some test examples per category. Where model input, output and portfolio testing are of a quantitative nature, the tests performed on the model environment are usually more qualitative.
Figure 3 Facets of Model Testing
This figure makes it painfully clear that data quality plays an important role in model testing. Hence, it is usually treated as an additional category of model testing.
When model testing has been performed and the results are in, a decision needs to be made on what happens next. There are four possible scenarios:
If the model is non-performing and a small remediation is no longer sufficient, a re-calibration or re-development is required.
Within the model lifecycle framework, the re-calibration of a model is defined as: the updating of the fixed modelling parameters to obtain the optimal results given the current historical dataset.
In other words, it uses a dataset that incorporates more recent data than the previous calibration/development to re-assess all model parameters and set them to the values that would have yielded the best possible model output.
A re-development takes this a step further by assessing whether the current set of parameters is still optimal. It considers parameters that are currently not in the model and might drop parameters that currently are in the set. This is a far more extensive procedure and as such, recalibration is the preferred option when possible.
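The difference can be sketched with a hypothetical scikit-learn model on synthetic data: re-calibration refits the coefficients of the existing driver set on the updated dataset, while re-development also reconsiders which candidate drivers enter the model at all.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Updated historical dataset with three candidate drivers.
n = 500
X = rng.normal(size=(n, 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 2]))))

current_drivers = [0, 1]  # drivers used by the current model

# Re-calibration: same drivers, coefficients refitted on recent data.
recalibrated = LogisticRegression().fit(X[:, current_drivers], y)

# Re-development: reconsider the full candidate set, here via
# L1 regularisation, which can shrink irrelevant drivers to zero.
redeveloped = LogisticRegression(penalty="l1", solver="liblinear").fit(X, y)

print(recalibrated.coef_.shape, redeveloped.coef_.shape)
```

The specific model class and selection mechanism are assumptions for the sketch; the point is only that re-development re-opens the choice of drivers, which is why it is the more costly exercise.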