Perspectives on Credit Risk Model Development from Recent CCAR Validations
At FI Consulting, we develop, validate and implement a wide variety of credit risk models for GSEs, banks, credit unions, and Federal agencies across residential and commercial real estate, small business and consumer portfolios. In terms of model validation, clients often look to us for surge support to temporarily increase their capacity during crunch periods leading up to supervisory reviews and other audits. Stress testing, particularly CCAR, has represented a significant amount of our validation work.
We perform model reviews that look at all aspects of the modeling environment including data, input assumptions, model theory, mechanics, model use, policies, regulations, controls and documentation. In most instances, we work within our clients’ own model risk frameworks and use their specific templates, tools and reporting formats while bringing our own knowledge of leading practices to bear.
After a recent wave of model validation work, the FI team convened for our usual lessons-learned review, and the discussion produced a list of findings that recurred across many projects. The good news is that most of these are easy to address, and doing so will help you avoid unwanted validation issues. Take a look and consider whether you have them covered as you develop your next model!
A model is only as good as its data…
- Documentation of a repeatable process to assemble and clean datasets is essential. Make it easy for the validation team to access or reconstruct the data set.
- The timing of events such as default or prepayment is often recognized inconsistently. For example, does a default occur on the date of the most recent missed payment, or when the obligation is 90 days past due? It’s best to refer to policy to confirm treatment and align with business practices.
- Records with duplicate IDs should not be blindly deleted. Sometimes duplicates exist for a reason, such as an obligation being renewed under the same ID, or the same obligor appearing as two distinct entities in a database. Don’t throw out important information.
- Not all data is equal. When modeling, the integrity of the target variable far outweighs the integrity of the predictor variables, so it should be addressed first.
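As a minimal sketch of the duplicate-ID point above, the snippet below separates exact copies (usually safe to collapse) from records that share an ID but differ in content (which deserve a closer look before anything is dropped). The function name, the field names `loan_id`, `orig_date` and `balance`, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def classify_duplicate_ids(records, id_key="loan_id"):
    """Split duplicated IDs into exact copies and same-ID records that differ."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[id_key]].append(rec)

    exact_copies, needs_review = {}, {}
    for rec_id, recs in groups.items():
        if len(recs) < 2:
            continue  # ID appears once, nothing to check
        # Records count as exact copies only if every field matches.
        distinct = {tuple(sorted(r.items())) for r in recs}
        if len(distinct) == 1:
            exact_copies[rec_id] = recs
        else:
            needs_review[rec_id] = recs
    return exact_copies, needs_review


# Hypothetical sample data
records = [
    {"loan_id": "A1", "orig_date": "2012-01-15", "balance": 100000},
    {"loan_id": "A1", "orig_date": "2015-01-15", "balance": 80000},  # renewal, same ID
    {"loan_id": "B2", "orig_date": "2013-06-01", "balance": 50000},
    {"loan_id": "B2", "orig_date": "2013-06-01", "balance": 50000},  # exact copy
    {"loan_id": "C3", "orig_date": "2014-03-10", "balance": 25000},
]

exact_copies, needs_review = classify_duplicate_ids(records)
print("Exact copies:", sorted(exact_copies))
print("Needs review:", sorted(needs_review))
```

Nothing is deleted here. The output is simply a review list that can be documented alongside the data-cleaning steps.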
Regarding missing values…
- Datasets often treat missing values inconsistently, using NaN, blanks, and zeros interchangeably to represent them. Conversely, validators should be skeptical of datasets that contain no missing values; this usually suggests that an undocumented imputation process (for example, filling in missing values with averages) has already been applied. Apply consistent treatment to missing values and describe it in the documentation.
- Average value imputation and last value imputation should be avoided, as they significantly bias regression results. Missing value dummy variables are a better alternative, though not without problems. Estimating coefficients for missing value dummies is useful for checking whether missingness itself has a significant relationship with the target variable.
- Maximum likelihood or multiple imputation is the preferred method of handling missing values for generating stable and interpretable model coefficients. These methods work under the assumption that the data is missing at random (MAR); see Paul Allison’s “Handling Missing Data by Maximum Likelihood” for a detailed discussion of the topic.
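To make the consistent-treatment and missing value dummy points concrete, here is a small sketch under some assumptions. It treats `None`, NaN, blank strings, and any caller-supplied sentinel values (such as a zero in a field where zero is not a valid value) as missing, then adds a 0/1 indicator column. The helper names, the `fico` field, and the constant fill value are hypothetical; the fill value is arbitrary and only makes sense when the indicator enters the model alongside it.

```python
import math


def is_missing(value, sentinels=()):
    """Return True if value is one of the encodings we agree to treat as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return value in sentinels  # field-specific codes, e.g. 0 for a credit score


def add_missing_indicator(rows, field, sentinels=(), fill_value=0.0):
    """Return copies of rows with a `<field>_missing` dummy and a constant fill."""
    out = []
    for row in rows:
        new_row = dict(row)  # leave the raw data untouched
        missing = is_missing(row.get(field), sentinels)
        new_row[f"{field}_missing"] = 1 if missing else 0
        if missing:
            new_row[field] = fill_value
        out.append(new_row)
    return out


# Hypothetical raw extract with four different encodings of "missing"
rows = [
    {"fico": 720},
    {"fico": float("nan")},
    {"fico": ""},
    {"fico": 0},      # zero is not a valid score, so treat it as a sentinel
    {"fico": None},
    {"fico": 680},
]

cleaned = add_missing_indicator(rows, "fico", sentinels=(0,))
n_missing = sum(r["fico_missing"] for r in cleaned)
print(f"{n_missing} of {len(cleaned)} records flagged as missing")
```

Whichever encodings are treated as missing, the list belongs in the model documentation so the validation team can reproduce the same counts.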
Even a statistically grounded model contains qualitative judgments…
- Commonly we find extensive documentation on how a list of candidate variables was statistically tested, but little to no documentation on how the candidate variables themselves were selected.
- Rarely does the selection of final model variables strictly adhere to the statistical tests, and qualitative judgments are almost always used when selecting final models from candidates. These qualitative judgments must be documented, and the qualitative input should preferably come from experts in the line of business rather than the model development team.
- Statistical tests also contain some qualitative judgments, for example, why was forward selection or backward selection used? If a classification algorithm was used, what are the assumptions and limitations of that algorithm? Why were certain p-value and correlation thresholds chosen?
- In sum, when a quantitative process is not adhered to, a qualitative decision must be documented.
- Many banks perform exhaustive single factor analysis, but omit testing of variable interactions. Variable interactions are often quite significant, and more intuitive than lagged terms in many cases.
- A common finding in validations is omitted variable bias: variables with strong predictive power were not included in the final model, or were never even tested.
- Standard modeling assumptions such as linearity, normality, and independence of observations must be discussed in the documentation.
- Is the model development data still relevant to the portfolio? Did the geography of the portfolio change? Were certain portfolios wound down in the aftermath of the financial crisis? It is important to monitor changes in the composition of the portfolio and to check the population stability.
- Low event portfolios are a challenge, as a model that predicts that events never take place will be very accurate! Closer inspection of these models is necessary; measures such as the Kolmogorov-Smirnov statistic or target shuffling are remedies for this.
- For benchmarking, ideally you want to keep the data and change the model; however, this is often difficult in practice. Benchmark models often rely on model factors that weren’t collected for your dataset, so you may be forced to benchmark to a different dataset. It’s possible to develop a meaningful benchmark in this case through propensity score matching (PSM). PSM can allow you to select and test a subset of loans that match your bank’s unique portfolio characteristics.
- Sensitivity analysis is important for statistically-based models, but it is critical for cash-flow type models where there can be multiple model frameworks sitting on top of each other that produce non-linear results. Ensuring that the model performs well in severe stress will be an important part of the model documentation.
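One common way to check population stability, as suggested above, is the population stability index (PSI), which compares the binned distribution of a variable in the development sample with the current portfolio. The sketch below is a plain-Python illustration under stated assumptions: the function name, the bin edges, the small floor used to avoid taking the log of zero, and the sample scores are all hypothetical. What PSI level counts as a material shift is a policy choice and is not set here.

```python
import math


def population_stability_index(expected, actual, bin_edges, floor=1e-6):
    """PSI = sum over bins of (actual_share - expected_share) * ln(actual_share / expected_share)."""

    def shares(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Bin index = number of (ascending) edges at or below the value.
            idx = sum(1 for edge in bin_edges if v >= edge)
            counts[idx] += 1
        total = len(values)
        # Floor empty bins so the log is defined; shares may then sum to slightly over 1.
        return [max(c / total, floor) for c in counts]

    e = shares(expected)
    a = shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical credit scores: development sample vs. current portfolio
dev_scores = [620, 640, 660, 680, 700, 720, 740, 760]
current_scores = [580, 600, 620, 640, 660, 680, 700, 720]
edges = [650, 700]  # three bins: below 650, 650 to 699, 700 and above

psi_same = population_stability_index(dev_scores, dev_scores, edges)
psi_shift = population_stability_index(dev_scores, current_scores, edges)
print(f"PSI vs. itself: {psi_same:.4f}, PSI vs. current: {psi_shift:.4f}")
```

A PSI of zero means the binned distributions are identical; larger values indicate that the portfolio has drifted away from the data the model was built on.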
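The low event portfolio point above can be illustrated with a toy example: a model that never predicts default looks highly accurate, yet its Kolmogorov-Smirnov (KS) statistic shows that it does not separate defaulters from non-defaulters at all. This is only a sketch; the function name, the 2-in-100 default rate, and the scores are hypothetical, and the KS statistic here is computed as the maximum gap between the cumulative score distributions of events and non-events.

```python
def ks_statistic(scores, labels):
    """Max gap between cumulative score distributions of events (1) and non-events (0)."""
    n_event = sum(labels)
    n_non = len(labels) - n_event
    if n_event == 0 or n_non == 0:
        raise ValueError("need at least one event and one non-event")

    pairs = sorted(zip(scores, labels))
    cum_event = cum_non = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume all observations tied at this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                cum_event += 1
            else:
                cum_non += 1
            i += 1
        ks = max(ks, abs(cum_event / n_event - cum_non / n_non))
    return ks


# Hypothetical portfolio: 100 loans, 2 defaults
labels = [1, 1] + [0] * 98

# "Never default" model: every loan gets the same score of zero
naive_scores = [0.0] * 100
naive_preds = [1 if s >= 0.5 else 0 for s in naive_scores]
accuracy = sum(1 for p, y in zip(naive_preds, labels) if p == y) / len(labels)

# A model that ranks the two defaulters above everyone else
ranked_scores = [0.9, 0.8] + [0.1] * 98

print(f"Naive accuracy: {accuracy:.2f}, naive KS: {ks_statistic(naive_scores, labels):.2f}")
print(f"Ranking model KS: {ks_statistic(ranked_scores, labels):.2f}")
```

Accuracy rewards the naive model for the portfolio's low event rate, while KS only rewards rank-ordering, which is why it is the more informative measure here.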