This article was initially available on our podcast.

What do real estate, iBuyers, and RCM analytics have in common? I recently heard a podcast on NPR about the problems with iBuyers. They were trying to identify the root cause of why Zillow had pulled out of the iBuyer market.

What’s the algorithm like?

For those unfamiliar with iBuyers, these are data and software companies like Zillow and Opendoor that have built algorithms to figure out what a property is worth. I’m sure if you’ve been on Zillow or anything like it, you’ve seen the Zestimate, or whatever it is that they call it, where they’re trying to identify what they think the property is worth.

They have this algorithm that says, “Hey, here’s what we think the property’s worth.” If they believe they can get the property for less money, say they can buy a house for $250,000 and then sell it for $300,000, they step into the market, buy the home, potentially do some light rehab, and then resell it at a profit, at least in theory. They’ve mostly been losing money, but that’s the concept.

Explore the fundamental issues

They were talking about why there is a fundamental problem in that business. Essentially, Zillow is trying to buy and sell houses around sort of an average price (at least this is what they explained; I don’t think this is entirely accurate). Sellers with significant problems with their homes would love to sell their house for the average price.

Take the example I gave earlier. The average price is $300,000. Some sellers have a ton of problems, so their house is only worth $200,000. Of course, they’d love to sell it to Zillow for $250,000. But sellers with great houses, ones that are worth $350,000, aren’t going to want to sell to Zillow for $250,000. They’re just not going to do that. They’re not going to sell for a price built around the average.

Understand the data problem

This concept is actually what’s known as “adverse selection” in economics. The idea is that you should be wary of somebody who is eager to sell, because there’s likely an asymmetry of information: the seller knows things about the house that the buyer doesn’t. If somebody wants to sell at your price, it’s probably because there’s something wrong with the house.
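To make the adverse selection idea concrete, here is a minimal sketch in Python using the dollar figures from the example above. The list of homes, the `accepts` rule, and the variable names are all assumptions made up for illustration; this is not how Zillow or any iBuyer actually prices homes. It only shows that when one offer goes out to everyone, the sellers who say yes are the ones whose homes are worth less than the offer.

```python
# Hypothetical homes that look identical in a listing. Their true values
# differ because of hidden problems (roof, foundation, mold) that the
# buyer's data does not capture. These numbers are illustrative only.
true_values = [200_000, 250_000, 300_000, 350_000, 400_000]

resale_price = 300_000  # average price the buyer expects to resell at
offer = 250_000         # single offer made to every seller


def accepts(true_value, offer):
    """Assumed seller behavior: accept only if the offer beats the home's true worth."""
    return offer > true_value


# Only sellers who know their home is worth less than the offer say yes.
purchased = [value for value in true_values if accepts(value, offer)]

average_all = sum(true_values) / len(true_values)
average_purchased = sum(purchased) / len(purchased)

# Margin the buyer expects on paper versus what the purchased homes are worth.
expected_margin = resale_price - offer
actual_margin = average_purchased - offer

print("Average value of all homes:", average_all)
print("Homes actually purchased:", purchased)
print("Expected margin per home:", expected_margin)
print("Actual margin per home:", actual_margin)
```

In this toy setup the average home really is worth the resale price, yet the only home the buyer ends up with is the one with problems. The algorithm’s average was right; the homes it was allowed to buy were not average.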

They didn’t get to the root of the problem, although they did mention the asymmetry of information. They didn’t say adverse selection, but that’s a fun concept. If you ever want to dive into those things, that’s great to study. They talked about the algorithms failing and tech companies failing, but it isn’t an algorithm problem.

The root cause is a data problem. Their algorithms look at things like the neighborhood the house is in and the buying patterns people are engaging in: Do they want to buy here or there? Do they want bigger houses or smaller houses? They also look at the number of bedrooms, the square footage, the number of bathrooms, and the garage.

Disclosures are everything

They may even have some information from the listing, when somebody posts the property for rent or for sale, that covers things like, “Was the bathroom recently rehabbed?” They may have tools that can process natural language, pull out some of that information, and figure out, “Ah, okay!” Everything in their algorithm comes from the information you see in a Zillow listing, in the “Details” section. It may also be in the “About” section, or the “Explanation,” or whatever it’s called.

The problem is that the things that tank the value in real estate are not there. A roof that failed isn’t in the data. Foundation problems in a basement are not in the listing. Further, mold problems requiring remediation are not in there. In addition, sellers don’t disclose those kinds of things. They certainly don’t lead with that information. 

Maybe, at some point, if they’re selling their own house and they know about a problem, they’re legally required to list it on a separate paper form, but that’s not in the listing. That’s not publicly available information, and it’s not easy to find.

Determine what isn’t being shared

If all that data were available, like maybe there was a 3D walkthrough or something like that, and you could see every little corner, you might train an algorithm that recognizes something like mold in a picture. But sellers don’t share that information. They don’t share those photos, and there’s no smell sensor inside the house to capture that information and feed it to their algorithms.

The real estate industry is super old-school. Healthcare is old-school, but real estate is worse. It’s astonishing. I used to think that healthcare was decades behind the rest of the world in terms of technology, but real estate is worse. I’ve got contractors who don’t even have email. It’s spectacular. It’s awe-inspiring. You’ve got to pay them with a paper check. I don’t have checks. Sorry! I’ve got to send you money electronically. I don’t do checks. Even if a contractor had gone in and found a problem like the mold, foundation, or roof issues we talked about, that data is not stored anywhere, much less connected to anything else.

How does this relate to medical revenue cycle management and analytics? Getting access to data is the key. The sophistication of the algorithms matters less when we don’t have the data or access to it.


In theory, we shouldn’t have the same problems in healthcare because so much of the data is already digitized. It’s in an electronic medical record. It’s in the billing system. It might be in some offline systems as well. It could also sit with outside sources such as banks, depending on what you are trying to do. This might require data warehousing to bring together information from many different systems. If you want to make clinical decisions, it might also require interoperability to get clinical data from providers, even those that aren’t the primary rendering provider.

That’s less of an issue for financial data. However, it comes into play in a different subject, which is, “How do we get metadata, meaning aggregations of large amounts of data across many providers, to figure things out?” But that’s a whole other conversation. For the financial data, pretty much all of the data exists. You just have to grab it all.
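As a rough illustration of what “grabbing it all” can look like, the sketch below pulls three hypothetical sources into one in-memory SQLite database: claims from a billing system, denial codes keyed in from paper EOBs, and follow-up notes kept in a spreadsheet. Every table name, column, and row here is an assumption invented for the example, not a real schema, and a production warehouse would look quite different. The point is only that once the sources sit side by side, gaps become visible.

```python
import sqlite3

# One central store standing in for a data warehouse (in-memory for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE billing_claims (claim_id TEXT PRIMARY KEY, billed REAL, paid REAL);
    CREATE TABLE eob_entries (claim_id TEXT, denial_code TEXT);
    CREATE TABLE spreadsheet_notes (claim_id TEXT, note TEXT);
""")

# Hypothetical rows from the billing system.
conn.executemany("INSERT INTO billing_claims VALUES (?, ?, ?)", [
    ("C1", 120.0, 120.0),
    ("C2", 200.0, 0.0),
    ("C3", 80.0, 0.0),
])
# Hypothetical denial code entered from a paper EOB (illustrative value).
conn.executemany("INSERT INTO eob_entries VALUES (?, ?)", [("C2", "CO-16")])
# Hypothetical note that used to live only in a spreadsheet.
conn.executemany("INSERT INTO spreadsheet_notes VALUES (?, ?)", [("C3", "appeal sent")])

# Join everything on the claim so each claim shows up once with all its data.
rows = conn.execute("""
    SELECT b.claim_id, b.billed, b.paid, e.denial_code, s.note
    FROM billing_claims AS b
    LEFT JOIN eob_entries AS e ON e.claim_id = b.claim_id
    LEFT JOIN spreadsheet_notes AS s ON s.claim_id = b.claim_id
    ORDER BY b.claim_id
""").fetchall()

# Unpaid claims with no denial code on file are the missing-data problem.
unpaid_without_code = [row[0] for row in rows if row[2] == 0 and row[3] is None]

for row in rows:
    print(row)
print("Unpaid claims missing a denial code:", unpaid_without_code)
```

Here claim C3 went unpaid but has no denial code recorded anywhere, which is the healthcare version of the roof problem that never made it into the listing.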

In conclusion

The net of this is: get all of your data. Make sure you have all the data in one place before you start to use it or to improve what you’re doing with it, and make sure you’re getting it from places you have not been getting it from before. Digitize everything, including what comes in on paper EOBs. Make sure all that information is getting entered, including things like denial codes. Take offline records that are stored in something like Excel spreadsheets and get them into a centralized system. Then you can avoid Zillow’s problem, which lost them hundreds of millions of dollars. Get all of your data, get it all in one place, and you can do amazing things.