This article was initially available on our podcast.

I saw an interesting article recently concerning artificial intelligence and machine learning in revenue cycle management. It was pointing to pre-authorizations and pre-certifications as an excellent area for using this technology.

Learning to compare results

What they suggested was that if a provider or a revenue cycle management company is looking for a company to provide this service, or wants to test a vendor, it should give the vendor a dated dataset of charges (pick, let’s say, 2017 or something like that), have the vendor analyze it and determine what would have happened, and then compare that to the actual results. This could be applied to any form of machine learning, whether the goal is to prevent denials or to address the requirement for pre-authorization.

I love the idea of using data to test vendors. I’m a huge proponent of this. We’ve been advocating this for many years in many different areas. I think that increasingly, this will be the way things go: a provider, a revenue cycle management company, or a similar organization that is looking to evaluate something would provide a dataset. The vendor would then use that dataset to produce a prediction or an estimate and give numbers back to the buying organization.
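As a rough illustration of what "giving numbers back" could look like, here is a minimal Python sketch that compares a vendor's denial predictions against what actually happened on the same claims. The `Claim` fields, the `score_vendor` function, and the sample records are all hypothetical, invented for this example; a real evaluation would use your own claim data and whatever outcome you care about (denials, pre-authorization requirements, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claim_id: str
    predicted_denial: bool  # what the vendor said would happen (hypothetical field)
    actual_denial: bool     # what actually happened in your records


def score_vendor(claims):
    """Compare vendor predictions to actual outcomes on the same claims."""
    tp = sum(c.predicted_denial and c.actual_denial for c in claims)
    fp = sum(c.predicted_denial and not c.actual_denial for c in claims)
    fn = sum(not c.predicted_denial and c.actual_denial for c in claims)
    tn = sum(not c.predicted_denial and not c.actual_denial for c in claims)
    total = len(claims)
    return {
        # Share of all claims the vendor called correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of the claims the vendor flagged, how many were really denied.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the claims that were really denied, how many the vendor flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up sample data, for illustration only.
sample = [
    Claim("A1", True, True),
    Claim("A2", True, False),
    Claim("A3", False, True),
    Claim("A4", False, False),
    Claim("A5", True, True),
]
results = score_vendor(sample)
print(results)
```

Reporting precision and recall alongside accuracy matters here because denials are usually a minority of claims, so a vendor that predicts "no denial" for everything could still look accurate.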

Things evolve

There are a couple of challenges with what this author suggested. One of them is that things change over time. Take 2017 versus 2022: payer policies change, and so do other rules. If the machine learning algorithm is optimized for current conditions, in other words, if its training set is 2021-2022 data, it might not perform very well on a 2017 dataset, because anything that has changed in that time might produce an erroneous prediction. That’s the first problem.

It is unlikely that somebody has a machine learning algorithm that was already in existence at that time, or one trained on 2017 data, or some historical product they used back then that they could roll out of the archives and run. And even if they did, it is unlikely that its predictions would be representative of what you would get today. There are many problems with that.

What about algorithms?

First of all, they probably don’t have the algorithm from that timeframe, and the current one isn’t going to work well on that data. Even if the company was in existence at that time, it is extremely unlikely that it had already developed a mature algorithm for the 2017 time period. So that’s a big problem.

There’s an even bigger problem, and it sits with the purchaser: the billing entity, the provider, the billing company, whoever is providing the dataset that is supposed to be used to test the vendor and see what it would predict and how all of that performs. The problem is that you probably don’t have accurate data from that timeframe, because what you would want to provide is effectively unadulterated data. And data gets adulterated over time. It gets changed. It gets overwritten.

What about claims?

You would need the original charge information as first submitted for all of those claims for, let’s say, 2017. Not the resubmitted claims, not the appeals, not the updated information. That would require an extraordinary level of data warehousing, with dated change records kept for every claim, so you could go back and say, “Ah, for this claim, we want it circa January 17, 2017.” Not what it looked like on January 23, 2017, or in September of that year, or anytime after that.
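To make the idea of dated change records concrete, here is a minimal Python sketch of an append-only claim history with an "as of" lookup. It is only an illustration of the concept, not a description of how any particular billing system works; the `ClaimHistory` class, its methods, the claim ID, and the sample codes are all assumptions made up for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ClaimHistory:
    """Append-only history of one claim (hypothetical structure)."""
    claim_id: str
    _dates: list = field(default_factory=list)
    _versions: list = field(default_factory=list)

    def record(self, effective: date, fields: dict) -> None:
        """Append a new snapshot; earlier snapshots are never overwritten."""
        if self._dates and effective < self._dates[-1]:
            raise ValueError("snapshots must be recorded in date order")
        self._dates.append(effective)
        self._versions.append(dict(fields))

    def as_of(self, when: date):
        """Return a copy of the snapshot in effect on `when`, or None if none existed yet."""
        idx = bisect_right(self._dates, when)
        return dict(self._versions[idx - 1]) if idx else None


# Illustrative data only: the claim as first submitted, then after a CPT change.
history = ClaimHistory("C-1001")
history.record(date(2017, 1, 17), {"cpt": "99213", "dx": "J06.9"})
history.record(date(2017, 1, 23), {"cpt": "99214", "dx": "J06.9"})

original = history.as_of(date(2017, 1, 17))  # the claim as originally submitted
later = history.as_of(date(2017, 9, 1))      # the claim after the correction
before = history.as_of(date(2017, 1, 1))     # nothing had been submitted yet
print(original, later, before)
```

The key design choice is that `record` only ever appends. A system that updates the claim record in place, which in our experience is the common case, loses the January 17 version the moment the January 23 change is saved.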

Suppose a claim was denied for whatever reason, and afterward a diagnosis code was added or changed, a CPT code was changed, or some other demographic information was provided. In that case, there’s almost no system out there that can roll back and tell you what the claim looked like before you made all those modifications to try to get it paid. Nobody’s going to have that data to provide.

Is the task impossible?

I’m not even sure I understand what this person was suggesting, because it seems like an impossible dataset to provide. Maybe I’m missing something. Maybe I don’t understand, because our experience has been that these systems overwrite all that data almost all the time. There’s not going to be a shortcut around that.

Even in systems that keep multiple claims, where a new claim is created rather than the original being resubmitted, it will be hard to track them and figure out which ones to provide as a dataset. In theory, that’s possible. But I guarantee that some of the data fields are still going to change. Even if you create a separate claim when something like a CPT code changes, there’s effectively no way to provide fully original data.

Final thought

It’s a great idea, and one day, I think everybody will be able to provide this. But you need sophisticated data warehousing to do it, because you have to track all those changes so that you don’t overwrite them. Everybody should aspire to that. I think it is unrealistic to do it retroactively back to 2017.