Data is not a substitute for strategy

When I was in grad school, my data science class assigned us an incredibly complex optimization problem: recommend the best places to locate a series of factories given expected product demand, availability of suppliers, distance to customers, wage and materials costs, etc. It was too complicated to solve on a normal PC with off-the-shelf software, so the other students simply gave up on treating it as a data optimization problem and instead made recommendations based on strategic considerations.

Me? No, I used brute force: I rewrote the software to run on the more powerful and expensive campus mainframe, applying every trick I knew until I found the “correct” answer. It was tough, and at the time I was quite proud of my computer skills, thinking I had somehow bested my fellow students.

But when I saw the other answers, I realized how silly I was to think that data could beat strategy. Sure, with sufficient computing power I was able to identify a mathematically provable solution given that day’s data. But who cares? Data keeps changing. It would be months, maybe years, before some of those factories were operating, by which time all my data assumptions would be irrelevant. Good strategic thinking, on the other hand, doesn’t depend on fluctuations in the data.

I’m reminded of that lesson by this post from Aaron Carroll (at The Incidental Economist blog), responding to Mark Cuban’s advice that everyone get their blood tested quarterly. Data, says Carroll, is not the problem. If you think that more data is always better, you will likely miss the forest for the trees. Or as the post says:

“Ordering a lab test is like picking your nose in public. If you find something, you better know what you’re going to do with it.”