Using K-Means Cluster Analysis and Decision Trees to Highlight Significant Factors Leading to Homelessness - CIE San Diego

Homelessness has been a persistent social concern in the United States. Political and economic events since the 1960s have driven increases in poverty that, by some accounts, had surpassed 1928 Depression-era levels by 1991. This paper examines how the emerging field of behavioral economics can use machine learning and data science methods to explore preventative responses to homelessness. The study applies machine learning and data mining strategies, specifically K-means cluster analysis followed by decision trees, to understand how environmental factors and the resulting behaviors can contribute to the experience of homelessness. Preventing the first homeless event is especially important: studies show that a person who has experienced homelessness once is 2.6 times more likely to have another homeless episode. The findings show that a person who is at risk of being unable to pay utility bills while also facing challenges with two or more of the other social determinants of health is statistically significantly more likely to have a first homeless event. Additionally, men over 50 who are not in the workforce, have a health hardship, and experience two or more other social determinants of health hardships at the same time have a high, statistically significant probability of experiencing homelessness for the first time.
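To make the two named methods concrete, the sketch below shows one common way K-means clustering and a decision-tree style split can be combined. It is not the study's pipeline: the data are randomly generated, the indicator names in `FEATURES` are hypothetical, and the helpers `make_synthetic_callers`, `kmeans`, `gini`, and `best_binary_split` are illustrative names written for this example. The code groups callers with plain K-means (Lloyd's algorithm), then picks the single yes/no indicator that best separates the resulting clusters by Gini impurity, which is the criterion a decision tree commonly uses at each node.

```python
import random

# Hypothetical binary hardship indicators (1.0 = hardship reported).
# These names are illustrative assumptions, not the study's variables.
FEATURES = ["utilities", "housing", "food", "health", "employment"]


def make_synthetic_callers(n=200, seed=0):
    """Generate made-up callers: alternating low and high hardship rates."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        rate = 0.15 if i % 2 == 0 else 0.75
        rows.append([1.0 if rng.random() < rate else 0.0 for _ in FEATURES])
    return rows


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k=2, iterations=20, seed=0):
    """Lloyd's algorithm: assign rows to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    # Start from k distinct observed rows so no two centroids coincide.
    unique_rows = sorted(set(tuple(r) for r in rows))
    centroids = [list(r) for r in rng.sample(unique_rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def gini(labels):
    """Gini impurity of a list of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_binary_split(rows, labels):
    """Return the indicator whose yes/no split has the lowest weighted Gini."""
    best = None
    for j, name in enumerate(FEATURES):
        yes = [lab for row, lab in zip(rows, labels) if row[j] == 1.0]
        no = [lab for row, lab in zip(rows, labels) if row[j] == 0.0]
        weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or weighted < best[1]:
            best = (name, weighted)
    return best


rows = make_synthetic_callers()
labels, centroids = kmeans(rows, k=2)
feature, impurity = best_binary_split(rows, labels)
print("cluster sizes:", [labels.count(c) for c in range(2)])
print("most separating indicator:", feature, "weighted Gini:", round(impurity, 3))
```

Because every synthetic indicator is drawn at the same rate within a group, the indicator this sketch selects carries no meaning. With real records, the same two steps would point to the hardships that most sharply distinguish one group of callers from another.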

This study seeks to expand the understanding of homelessness beyond the ethnographic accounts of the 1990s by examining call center data that 2-1-1 San Diego collects and stores as individuals call in to obtain access to social services. 2-1-1 San Diego is a non-profit information and referral hub, reached through an easy-to-remember three-digit dialing code, and it acts as the community’s backbone organization for a larger collective impact movement [18,19]. Its mission is to serve as a nexus that brings community organizations together to help people efficiently access appropriate social services, and to provide vital data and trend information for proactive community planning. Recognizing the value of shared data and measurement approaches consistent with the collective impact epistemology [18,19], 2-1-1 San Diego launched the Community Information Exchange (CIE) in 2018. CIE is a collective impact data-sharing hub that tracks key socioeconomic, demographic, and social data gathered from those calling the 2-1-1 San Diego call center. Organizations across San Diego County have used CIE’s cloud-based data warehouse to share information for individual care coordination, and have used real-time data for community-wide coordination in times of crisis. This collection of regional data about social service clients, their needs, and the available resources will allow an examination of patterns in environmental factors and behavioral choices that may occur before a client becomes homeless.
