If data is indeed the “new oil,” then real estate is lingering in the era before the invention of the automobile. Before Ford Motors boomed, gasoline had no universal application; it was a cheap and relatively useless byproduct of oil refining—used in lamps and stain removal—that was often dumped into rivers. That is, until the Model T was born. While other industries consistently use data to drive decisions, real estate is still struggling to figure out how to collectively harness this invaluable commodity for the benefit of the entire industry. Just as the oil business was awash in gasoline in the early 1900s, real estate needs to find its own application for the massive trove of data it sits upon. The industry needs its own Model T, in the form of high-quality, standardized data that drives smart investments, rather than metaphorically dumping this precious asset down the river.
The real estate industry’s sluggishness in capitalizing on the power of data is due in no small part to several sizable hurdles in its path: a lack of quality public data, the absence of a standardized data format, and a death grip on proprietary data by many large real estate firms.
To fully examine the trends and challenges of widely incorporating data usage in commercial real estate, we brought together a panel of experts—Dhinaker Dhandi, Vice President of Product at Altus Analytics, Josh Fraser, CEO of Estated, and L.D. Salmanson, CEO of Cherre—at our recent Metatrends event in Los Angeles.
Why hasn’t data been more effectively utilized in CRE?
The blame for not applying data effectively in real estate doesn’t lie with building owners, property managers, or landlords; it’s an inherent problem underlying the entire industry. While there is abundant interest in compiling data from the $50 trillion real estate asset class, as noted by Fraser, the first challenge stems from obtaining good public data. There are over 3,100 counties in the United States, varying widely in size and technological sophistication. This means that when requesting building data from counties, the response can range from an Excel spreadsheet—sometimes with valuable data, but often loaded with numbers of dubious quality—to an obsolete CD-ROM or, in the most extreme cases, a dreaded pile of yellowing paper documents.
The lack of standardization
Salmanson agrees that the poor quality of data from public agencies greatly exacerbates the data problem real estate is facing. He adds that high-quality, standardized data is the holy grail the industry needs to pursue in order to reach “data-driven decisions from data-driven results that lead to a higher alpha for everyone in the industry.” Currently, Salmanson noted, despite an abundance of information on 140 million properties in the United States, it is hard for investors to trust the data coming in, let alone understand it, due to a lack of homogeneity. He uses the example of half- and quarter-bathrooms in residential real estate, which, he notes, has come much further in standardization than the commercial sector. In most cases, a half-bathroom is a toilet and a sink, whereas a quarter-bathroom is just a toilet.
The problem arises, however, when housing information comes in from countless sources and has to be standardized into one field: if a house is listed as 2.5 bathrooms, how does an investor know whether it has one full bathroom and three half-bathrooms, or two full bathrooms and one half-bathroom? This is the inherent problem facing standardization: the source data from several different MLS systems doesn’t have the granularity needed to ensure that information is reliable and easily understood. In commercial real estate, Salmanson notes that OSCRE is moving to fix this problem, but a level of cooperation needs to be established before it can be solved.
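To make the granularity problem concrete, here is a minimal, purely illustrative sketch in Python. The field names and listings are hypothetical rather than drawn from any actual MLS schema; the point is simply that once granular counts are flattened into a single decimal, the original detail cannot be recovered downstream.

```python
# Hypothetical example: why a single "2.5 baths" field is lossy.
def total_baths(full: int, half: int) -> float:
    """Collapse granular counts into the single decimal field many feeds use."""
    return full + 0.5 * half

# Two quite different homes...
listing_a = {"full_baths": 1, "half_baths": 3}   # one full bathroom, three half-bathrooms
listing_b = {"full_baths": 2, "half_baths": 1}   # two full bathrooms, one half-bathroom

# ...become indistinguishable once the source system flattens them:
print(total_baths(**listing_a))  # 2.5
print(total_baths(**listing_b))  # 2.5

# Going the other way is ambiguous: 2.5 cannot be decomposed uniquely,
# which is why a standard needs to carry the granular fields, not just the total.
```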
Another concern is that the adoption of more robust data analysis will bring a desire to incorporate real-time data into real estate investment. Both Salmanson and Fraser agree that commercial real estate may not immediately reap the rewards of real-time data; with such disparate data sources and so little standardization, implementing real-time feeds could actually result in much poorer data quality. They both agree that standardization should come first, and quality real-time data access will follow.
Proprietary data as an advantage
The push for standardization also runs up against the business model of the larger industry actors that actually have access to good data. These companies rightly understand their advantage and hold onto their data tightly to “weaponize” it for their own gain. While software companies like VTS are pushing the industry toward a standard data model like the one driven by OSCRE, the competitive advantage that the large investment companies hold is driving them away from joining that push. Dhandi—whose company Altus Analytics moved away from the standardized OSCRE model—explained that while shared data benefits the entire industry, certain clients have proprietary platforms they want to keep using without sharing data, in order to preserve their competitive advantage.
The holders of quality data have an advantage over their competitors with less data, meaning they can make wise investments to drive profits and gain market share. Although data standardization is clearly in the industry’s best long-term interests, the holders of quality data are stuck in a quandary: to share or not to share. As Dhandi notes, sharing their proprietary data is “either the best idea in the world or the worst. It’s the best idea in the world because you’re the source of the data, people trust you, and come to you. But it’s the worst idea in the world because you’ve given away your competitive advantage.” With this in mind, he notes that the most important next step in building a robust data standard is forging effective partnerships with the key players in the industry to push for standardization.
While OSCRE and the Real Estate Standards Organization (RESO) are making progress on standardization, Fraser argues that there is a long way to go, largely because proprietary information is the primary asset that many of these long-established companies are protecting. “Trying to force a lot of these businesses that have been around for 10, 20, 50 years, to adopt technology standards is not their top priority. So, where it goes from there and how much adoption we can truly have is questionable in the next few years.”
What does the future hold?
While the silver lining on real estate’s data cloud could use a major polishing, there is hope. Fraser explained that although quality public data is difficult to obtain, many counties in the U.S. are open to standardizing and digitizing; the problem is scarce resources and motivation. Enter technology: both Dhandi and Salmanson agreed that county tax assessors will begin to adopt tech to share their public real estate data in the next ten years, which will greatly help the data standardization process move forward. No longer will investors have to trudge through years of data in obsolete formats to drive their decisions. “We’re learning every single day how hard this is,” Fraser states. “But there is a lot of data out there that is incredibly valuable and a lot of very smart people working on it. Give it five to ten years and it will be a significantly better situation.”
As another means of confronting poor-quality data, both Fraser’s and Salmanson’s companies are turning to machine learning. AI has allowed Estated to map virtually any CRE data set in twenty minutes, freeing up valuable resources to build QA checks into the process of reviewing data. For Cherre, AI has allowed non-technical people to import data, which frees up the engineering team’s time to build out new data schemas as they arrive from new clients.
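Neither company detailed its pipeline on stage, but the general technique is easy to sketch: match each incoming column name against a canonical schema and route low-confidence matches to a human QA check. The canonical field list, sample column names, and threshold below are assumptions for illustration, not the actual Estated or Cherre implementations.

```python
# A minimal sketch of automated field mapping, NOT the actual Estated or Cherre
# pipelines. Canonical fields, sample columns, and the threshold are assumptions.
from difflib import SequenceMatcher

CANONICAL_FIELDS = ["parcel_id", "full_baths", "half_baths", "square_feet", "year_built"]

def best_match(source_column: str, threshold: float = 0.6):
    """Map an incoming column name to the closest canonical field, or return None for review."""
    scored = [
        (SequenceMatcher(None, source_column.lower(), field).ratio(), field)
        for field in CANONICAL_FIELDS
    ]
    score, field = max(scored)
    return field if score >= threshold else None  # None -> route to a human QA check

# An incoming county or client file with its own naming conventions:
incoming = ["ParcelID", "FullBathCount", "HalfBathCount", "SquareFeet", "YrBuilt", "MysteryCol7"]
mapping = {col: best_match(col) for col in incoming}
print(mapping)
# Columns that score below the threshold (e.g. "MysteryCol7") stay unmapped, so
# engineers spend their time on genuinely new schemas rather than routine imports.
```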
Despite the progress being made, all three panelists agreed that there is a long way to go until data can be standardized, aggregated, and consumed in one uniform way. In the short term, not much will change, but over ten years they expect to see the ground-shifting changes that will benefit the real estate industry as a whole.
Fraser sees a fundamental shift in the residential real estate market by 2030 with the growth of iBuying. Zillow has invested heavily in this arena and is aggregating a lot of data to ensure that it makes smart buying decisions on these houses, purchases that, with iBuying, are nearly instantaneous. If the next ten years do indeed see iBuying grow to 60-70% of all home sales as predicted, the quality, standardization, and accuracy of data will have to improve industry-wide, bolstered by heavyweights like Zillow throwing their backing into the process. While this change may begin with residential real estate—which is typically where changes in the industry start—Fraser believes it will translate over to the commercial sector in due time. Salmanson, however, was skeptical that iBuying would be as impactful a model as Fraser believes. “The jury’s out if the model’s been tested. It’s been tested in an upcycle, but can it survive a down cycle?”
Until iBuying is proven in a down cycle, Salmanson believes the productization of real estate is what will drive the market forward. Homebuyers, especially younger buyers, will want a standard level of housing that they understand and can trust, such as a gold, silver, and bronze rating system that clarifies how each home is classified. He believes this productization will, over the next ten years, drive the development of better data, helping buyers become more comfortable with buying real estate because they clearly understand what level of investment they are undertaking and what they get in return.
Unleashing data’s full potency in real estate will propel massive, data-driven, intelligent investments: a seismic shift like the one the oil industry experienced when the automobile was invented and mass-produced. Until the day standardization arrives, data will still be used to drive smart investments in the industry; it may just have to be manually entered into a spreadsheet from a binder of printed public documents, by the light of an oil-burning lamp.