46: Explainable Machine Learning to Assess the Value of Sustainable Housing

Elizaveta Logosha Co-Author
Groupe Vivialys
 
Frederic Bertrand Co-Author
University of Technology of Troyes
 
Myriam Maumy-Bertrand First Author
Universite De Technologie De Troyes
 
Myriam Maumy-Bertrand Presenting Author
Universite De Technologie De Troyes
 
Tuesday, Aug 5: 10:30 AM - 12:20 PM
2748 
Contributed Posters 
Music City Center 
Our objective is to estimate the green value of housing by focusing on energy performance labels in order to understand how housing prices evolve when energy performance improves.

Instead of fitting a hedonic model, a particular kind of linear model used in previous work, we fit random forests or XGBoost models.

Unlike linear models, whose coefficients directly reveal the relative importance of the variables, these more complex models require alternative methods to quantify the impact of the input variables. Shapley values are often used for this purpose with random forests and XGBoost models, which do not provide explicit coefficients. Their calculation attributes a fair share of each prediction to every feature by taking into account all possible combinations of variables.
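To make "all possible combinations of variables" concrete, the following minimal sketch computes exact Shapley values by enumerating every coalition of features. The toy price function, its coefficients, the feature meanings and the baseline are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value, n_features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when it joins coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy price model (illustrative coefficients, not from the study):
# feature 0 = floor area, feature 1 = energy label, feature 2 = location.
def toy_price(x):
    return 10 * x[0] + 20 * x[1] + 8 * x[0] * x[1] + 5 * x[2]


DWELLING = (1.0, 1.0, 1.0)   # the dwelling being explained (assumed values)
BASELINE = (0.0, 0.0, 0.0)   # reference dwelling (assumed values)


def coalition_value(coalition):
    """Prediction when only the features in `coalition` take the dwelling's values."""
    x = [DWELLING[j] if j in coalition else BASELINE[j] for j in range(3)]
    return toy_price(x)


shapley = exact_shapley(coalition_value, 3)
print(shapley)
```

In this toy case the interaction term between floor area and energy label is split equally between the two features, and the three values sum to the gap between the dwelling's predicted price and the baseline prediction.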

However, with non-linear and complex models such as random forests and XGBoost, the exact calculation of Shapley values becomes computationally prohibitive.
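As a rough illustration of this cost, the sketch below counts the marginal contributions needed by a brute-force, model-agnostic enumeration: each of the n features is evaluated against every subset of the remaining n - 1 features. The function name is hypothetical, and the count ignores any structure of the model that a specialised algorithm could exploit.

```python
def brute_force_evaluations(n_features):
    """Marginal contributions needed by a brute-force Shapley enumeration."""
    # Each feature is paired with every subset of the other n - 1 features.
    return n_features * 2 ** (n_features - 1)


for n in (5, 10, 20, 30):
    print(n, brute_force_evaluations(n))
```

The count roughly doubles with each additional feature, and every marginal contribution requires its own model evaluations.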

We therefore used more efficient approximation methods, such as SHAP, KernelSHAP and FastSHAP, to interpret the models' predictions, and from these we derived an estimate of the "green value" of a dwelling.
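The general idea of approximating Shapley values can be sketched with a simple Monte Carlo estimator based on random feature orderings. This is not KernelSHAP or FastSHAP, which rely on weighted regression and a learned explainer respectively; it is a plainer technique shown only to illustrate trading exactness for speed. The toy price function, its coefficients and the baseline are hypothetical.

```python
import random


def sampled_shapley(value, n_features, n_permutations, seed=0):
    """Monte Carlo Shapley estimate from random feature orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n_features
    players = list(range(n_features))
    for _ in range(n_permutations):
        rng.shuffle(players)
        coalition = set()
        previous = value(coalition)
        for i in players:
            # Add feature i to the coalition and record its marginal contribution
            coalition.add(i)
            current = value(coalition)
            phi[i] += current - previous
            previous = current
    return [total / n_permutations for total in phi]


# Hypothetical toy price model (illustrative coefficients, not from the study):
# feature 0 = floor area, feature 1 = energy label, feature 2 = location.
def toy_price(x):
    return 10 * x[0] + 20 * x[1] + 8 * x[0] * x[1] + 5 * x[2]


DWELLING = (1.0, 1.0, 1.0)   # the dwelling being explained (assumed values)
BASELINE = (0.0, 0.0, 0.0)   # reference dwelling (assumed values)


def coalition_value(coalition):
    """Prediction when only the features in `coalition` take the dwelling's values."""
    x = [DWELLING[j] if j in coalition else BASELINE[j] for j in range(3)]
    return toy_price(x)


estimates = sampled_shapley(coalition_value, 3, n_permutations=2000)
print(estimates)
```

Each sampled ordering costs one model evaluation per feature, so the number of orderings controls the balance between accuracy and run time. Under this scheme the estimate attached to the energy label feature plays the role of a "green value" contribution for the toy dwelling.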

Keywords

Machine Learning

Shapley Values

Green Effect

Hedonic Regression

Random Forests

XGBoost 

Main Sponsor

Section on Statistical Learning and Data Science