Category: Data

Accuracy of Explanations of Machine Learning Models for Credit Decisions

Read Paper

This white paper creates a framework for using synthetic data sets to assess the accuracy of interpretability techniques as applied to machine learning models in finance. The authors controlled actual feature importance using a synthetic data set and then compared the outputs of two popular interpretability techniques to determine which was better at identifying relevant features, finding variation in results.

Andrés Alonso and José Manuel Carbó, Banco de España

Reducing the Black-White Homeownership Gap through Underwriting Innovations

Read Report

This study provides an update on mortgage market developments in the use of cash-flow information from bank accounts and of utility, telecommunications, and rental payment history. The report highlights issues concerning data collection, standardization, and consumer protection regulation when using non-traditional financial data sources, as well as the impact of pricing, servicing, and regulation in determining whether the use of such data sources enhances racial equity.

Jung Hyun Choi et al., Urban Institute

Government and Private Household Debt Relief During COVID-19

Read Study

This study finds that nearly 30% of total debt relief in response to the COVID-19 pandemic was provided by the private sector, with the balance provided pursuant to government mandates focusing on mortgage and student loans. Households with lower incomes and lower creditworthiness were more likely to obtain forbearance relief, as were households who live in areas with higher Black or Hispanic populations, high infection rates, and more severe economic deterioration. The authors caution that the winding down of forbearance measures and subsequent structuring of debt repayments may have a significant impact on household debt distress and the aggregate economy given the amount of accumulated postponed repayments.

Susan F. Cherry et al., National Bureau of Economic Research Working Paper No. 28357

A Proposal for Identifying and Managing Bias in Artificial Intelligence

Read Publication

This publication considers common types of biases in AI systems that can lead to public distrust in applications across all sectors of the economy and proposes a three-stage framework for reducing such biases. The National Institute of Standards and Technology intentionally focuses on how AI systems are designed, developed, and used and the societal context in which these systems operate rather than specific solutions for bias. As a result, its framework proposes to enable users of AI systems to identify and mitigate bias more effectively through engagement across diverse disciplines and stakeholders, including those most directly affected by biased models. This proposal represents a step by NIST towards the development of standards for trustworthy and responsible AI. NIST is accepting comments on this framework until August 5, 2021.

Reva Schwartz et al., Draft NIST Special Publication 1270

Moving beyond “algorithmic bias is a data problem”

Read Paper

This paper explores how model design choices can cause or exacerbate algorithmic biases, notwithstanding the common view that data predominantly cause bias problems in machine learning systems. The author cites two important factors that constrain our ability to curb bias solely through working on the quality or scope of training data: inherent messiness in real world data and limits on accurately anticipating features in a model that can cause bias. Model designers should therefore consider how their choices about the length of model training or the use of differential privacy techniques can affect model accuracy for groups underrepresented in the data.

Sara Hooker, Patterns

AI in Financial Services

Read Report

This report examines broad implications of using AI in financial services. While recognizing the potentially significant benefits of AI for the financial system, the report argues that four types of challenges increase the importance of model transparency: data quality issues; model opacity; increased complexity in technology supply chains; and the scale of AI systems’ effects. The report suggests that model transparency has two distinct components: system transparency, where stakeholders have access to information about an AI system’s logic; and process transparency, where stakeholders have information about an AI system’s design, development, and deployment.

Florian Ostmann and Cosmina Dorobantu, The Alan Turing Institute

Six Facts You Should Know about Current Mortgage Forbearances

Read Blog

This source collects recent trends in short-term forbearances in the mortgage market but also notes areas in which additional data and consumer outreach are urgently needed. In particular, it highlights that about 530,000 homeowners who became delinquent after the pandemic did not take advantage of forbearance, despite being eligible to ask for relief under federal legislation. An additional 205,000 homeowners obtained an initial forbearance that expired in June or July but did not seek to extend it and have since become delinquent.

Jung Hyun Choi and Daniel Pang, Urban Institute

Financial Inclusion and Alternative Credit Scoring: Role of Big Data and Machine Learning in Fintech

Read Paper

This research paper analyzed loan-level data from a large Indian fintech firm to test whether unstructured digital data can substitute for traditional credit bureau scores. The researchers found that evaluating creditworthiness based on social and mobile footprints can potentially expand credit access. Variables found to significantly improve default prediction and outperform credit bureau scores include the number and types of apps installed, metrics of the applicant’s social connectivity, and measures of borrowers’ “deep social footprints” derived from call logs.

Sumit Agarwal, Shashwat Alok, Pulak Ghosh, and Sudip Gupta

Quants Sound the Alarm as Everyone Chases Same Alternative Data

Read Article

This article explores how Wall Street is mining geolocational data, payments information, social media posts, and various other sources of data in an effort to better understand emerging trends from the pandemic. While such data can be useful in various ways, the article also cautions about the risk of unrepresentative data.

Justina Lee, Bloomberg

How Did COVID-19 and Stabilization Policies Affect Spending and Employment? A New Real-Time Economic Tracker Based on Private Sector Data

Read Paper

Using real-time, anonymized data from private companies, this paper focuses on the ripple effects of a sharp decrease in spending by high-income households on both small businesses and low-income workers.

Raj Chetty et al., National Bureau of Economic Research Working Paper No. 27431
