This article was initially available on our podcast. Click here to listen.

What is malignant transformation? In fact, at first, we start with, “What is transformation?”

In the past, we’ve used the term ETL, which is an acronym standing for Extraction, Transformation, and Loading. This is a critical part of cleaning up data, normalizing data, making it practical so that you can do analysis or even basic reporting. Whether you’re using data warehousing, or whether you’re using an analytics application or all kinds of other things, you need ETL, you need transformation, you need cleaning up data and normalizing data. 

What about ETL?

Even if you’re doing stuff in Excel, you’re probably doing ETL, not knowing it. When you’re manipulating things in Excel to try to clean it up and make basic reports before you do the presentation, before you chart out the graphs, all the work you do before that is that ETL stuff: Getting it out of the system and then the T is the transformation.

What’s a malignant transformation? I know everybody who’s done anything in Excel has done malignant transformation. I certainly know that I have. I’ve done much destructive transformation. We’ll get to that in a second and then like why everybody does dangerous shift.

Let’s say you’ve got a report that kicks out from the billing system, and it’s got 104 columns. Nobody needs 104 columns. Whatever you’re doing is probably sufficiently narrow that you need like 13 of those columns. 

So many columns

It doesn’t even matter whether you’re just trying to chart charges trending over time, or payments trending over time, or anything fundamental. Even some of the more complex things don’t need that many columns. So you delete a bunch of those columns. You delete 86 out of the 104 to make your life easier.

Part of that is just because. God, how many times were you scrolling across around 306 columns or 104 columns? For instance, you were trying to figure out, “Where’s that column that I wanted?” or “What about that field?” You’re constantly navigating around, and that’s a pain in the butt. Even more than that, your system bogs down really fast.

Varying reports

Let’s say, for example, you want to do monthly reports for the executive team, for a physician, whatever it might be. You’re not doing some manipulation of the data. Maybe, you do some V-lookups to attach something to this from a different tab. You do some essential “if-then” kind of stuff to transform or add some information: if it’s this type of patient record, then we’re going to sign this value, or we’re going to say we should have collected it at the front desk or whatever you want to do. Sometimes calculations. You’re going to concatenate things, not complex, not even to the pivot table kind of level but those kinds of things.

What happens very quickly is, when you do those types of calculations, you’re doing several columns of calculations, 2-3-4-5 columns of calculations across 50,000 records, 100,000 records, half a million records. Your Excel is going to crash fast with all these calculations. What we invariably end up doing is we start putting in even bogged down until it crashes, which is the worst of all. In minor crashes, you know you have to get out of it and like to shut down. But it’s like when it just hangs there forever, and you’re trying to figure out like, “Ah, is it crashed? Is it not? Should I wait?” You wait, get a coffee, come back, “Crap! It’s still hung up. What should I do? Do I wait longer?”

Shortcut to destruction and transformation

Once you do that a few times, you get frustrated enough. You start finding the shortcuts in what we call a destructive transformation. You delete a bunch of those input columns, or you delete columns that aren’t being used at all. On the other hand, you delete columns after you’ve used them for calculation, or you delete a whole bunch of records that you don’t need. Further, you have half a million records, but you only want to look at the year 2021. You don’t need 2017, 2018, 2015, 2020, whatever those kinds of things, so you delete most of those records. Or you delete a bunch of columns, or you do both.

Let’s say, for example. You’re trying to do denials analysis. Well, you don’t need payment records. You don’t need contractual allowance records. So you delete all of those kinds of things. 80-90% of the documents you have may not be necessary to do the type of calculation that you’re doing.

You may do something a little bit more sophisticated. Let’s say, for example, after you do five columns of calculations, or even you do a couple of columns, and you know your system is about to crash, save. Save often. Then, you stop and copy and paste the values from those calculations and replace the calculation columns so that you don’t have the system constantly doing all these calculations.

Modifying the original data

These things are known as destructive transformation because you’re making changes (transformation) to the data. Some of those things are all very positive, but you’re destroying some of the original data. For instance, some of it you’re just deleting. Some of it you’re overwriting. In addition, that’s what’s known as destructive transformation. Also, there’s a good reason to do this. We have to because otherwise, your Excel will just kind of go bonkers all the time. 

Again, we’ll talk about solutions to this a little later, but this is known as destructive transformation. There are many problems embedded in a destructive transformation that will lead you down many difficult paths. Some of them are very frustrating. Some of them you may not even anticipate or see. We just kind of deal with those and accept them, but I think there are ways around this. What we should all be striving for is what’s known as non-destructive transformation.

Final thought

In a subsequent podcast coming up shortly, we’ll explain what non-destructive transformation is. To illustrate, how can you do it, but more than that, even why you should do it, like what kinds of problems you are currently running into currently? Also, some of which are seen, some of which are not. Further, some of which will save you an enormous amount of time where you’re trying to do repeated types of analysis. In addition, where you have to do these things every week, every month. If you can figure out how to do non-destructive transformation, it will save you an enormous amount of work and make your life so much easier. We’ll do that in an upcoming podcast. It makes life easier.