During the 2000s, the United States experienced an unprecedented decline in manufacturing employment. Roughly six million jobs were lost.
Most economists looked at the economic data and concluded that all these jobs disappeared due to improvements in productivity. But when Susan Houseman, an economist with the Upjohn Institute for Employment Research, looked at that same data, she came to a very different conclusion.
Houseman found that
productivity does not explain the big decline in manufacturing jobs at all — there has been little productivity improvement in this sector. Rather, nearly all the improvement in productivity came from the economy’s
electronics sector, a relatively small part of the US economy. Further, all the productivity growth in the electronics sector was driven by product design and R&D, not by automation (making things faster and cheaper). Besides, most electronics manufacturing has moved to Asia anyway.
Houseman’s research shows the importance of looking into the data before drawing conclusions from it.
Common Errors When Looking at Data
Are you measuring what matters most?
As businesspeople, we routinely review data without giving it much thought. Audited financial statements are a good example. Accountants, independent of the company, take financial statements prepared by the company and opine on their accuracy. In forming these opinions, they review the internal control system that produced the financial statements, test the control system to confirm it is working, and test the underlying data for accuracy.
All fine and good, provided we are measuring what matters most to begin with.
For example, one of my clients buys raw materials by the ton, runs that material through a simple process, bags it, and sells those bags by the cubic foot. So, should manufacturing performance be measured by tons of raw material consumed or cubic feet of final product sold? It turns out the best approach is to measure both the weight going in and the cubic feet sold, because product loss during manufacturing is difficult and expensive to measure directly.
When I was in the rental car industry, we looked at measures per car (per unit of time) and something called Daily Billed Revenue Days (DBR Days).
The problem was that those two measures could lead to different conclusions, even when evaluating the same issue. The differentiator was utilization (i.e., how often the vehicle was on rent during the period). A high rent rate per day meant high dollar average per DBR Day. But, if the vehicle was not rented that much, there was low revenue per car. Each approach had its own inherent limitations, something that needed to be understood when looking at the data.
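A toy calculation (all numbers invented) shows how the two measures can rank the same fleet differently: the high-rate car wins on revenue per DBR Day, while the well-utilized car wins on revenue per car.

```python
# Invented fleet: (name, daily rate in $, days on rent this month)
fleet = [
    ("Car A", 80.0, 10),   # high rate, low utilization
    ("Car B", 40.0, 28),   # low rate, high utilization
]

DAYS_IN_MONTH = 30
results = {}

for name, rate, days_rented in fleet:
    revenue = rate * days_rented
    per_dbr_day = revenue / days_rented      # $ per Daily Billed Revenue Day
    per_car_day = revenue / DAYS_IN_MONTH    # $ per car per calendar day
    utilization = days_rented / DAYS_IN_MONTH
    results[name] = (per_dbr_day, per_car_day)
    print(f"{name}: ${per_dbr_day:.2f}/DBR Day, "
          f"${per_car_day:.2f}/car-day, {utilization:.0%} utilized")
```

Car A looks better per DBR Day ($80.00 vs. $40.00), but Car B earns more per car-day ($37.33 vs. $26.67) because it sits idle far less.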
Are you considering how the measurement is taken?
For my client that buys tons and sells bags, for example, it measures the weight of what goes into the process for one minute, every hour. That’s a 1.7% sample (1/60). But it’s known that the weight of the material going into the process varies, due to the physics of how the raw material is stored. For this client, a better way of measuring is needed.
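A quick simulation (numbers invented, but the mechanism matches the description above) shows how a one-minute-per-hour sample can mislead when the feed rate varies systematically within the hour. Here I assume the bin is refilled at the top of each hour, so the sampled minute is always the heaviest:

```python
import statistics

# Hypothetical feed rates in kg/min over one 8-hour shift (numbers invented).
# Assume the bin is refilled at the top of each hour and the feed rate
# falls as the bin empties, so minute 0 is always the heaviest minute.
MINUTES_PER_HOUR = 60
HOURS = 8

feed = []
for h in range(HOURS):
    for m in range(MINUTES_PER_HOUR):
        feed.append(100.0 - 0.5 * m)   # 100 kg/min tapering to 70.5 kg/min

true_total = sum(feed)

# The client's method: weigh one minute per hour (minute 0) and
# extrapolate that rate across the other 59 minutes.
sampled = [feed[h * MINUTES_PER_HOUR] for h in range(HOURS)]
estimated_total = statistics.mean(sampled) * MINUTES_PER_HOUR * HOURS

print(f"true total:      {true_total:,.0f} kg")
print(f"estimated total: {estimated_total:,.0f} kg")
print(f"overstatement:   {estimated_total / true_total - 1:.1%}")
```

With these assumed numbers, the sampled minutes overstate throughput by about 17% — not because the scale is wrong, but because of when the sample is taken.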
In the car rental example, some in that industry assume in their calculations that if a car is on rent for
part of the day, it is rented for the entire day. Others calculate to tenths of a day.
Are you looking at the forest, but missing the trees?
As in the productivity example earlier,
it is important to disaggregate the data. The financial statements may show a profit, but individual customers or products might still be big losers. Or, maybe just a handful of customers or products drive all the profitability.
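A minimal sketch (all figures invented) of what disaggregation can reveal: the aggregate shows a healthy profit while one customer is a large loser.

```python
# Invented customer-level P&L: the total looks fine, but breaking it
# out shows one customer generates the profit and another drains it.
profit_by_customer = {
    "Customer A": 900_000,
    "Customer B": 50_000,
    "Customer C": -650_000,   # big loser hidden inside the total
}

total = sum(profit_by_customer.values())
print(f"Total profit: ${total:,}")   # looks healthy in aggregate

# Disaggregate: sort from worst to best to surface the losers first.
for name, p in sorted(profit_by_customer.items(), key=lambda kv: kv[1]):
    print(f"  {name}: ${p:,}")
```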
A different slant on disaggregation is to look at how the parts combine. So, the business may seem to be profitable, but if inventory is going up, some fixed costs are going into inventory. Then, when the product ships, those costs come home to roost. If the price doesn’t cover it, there is a problem. If you are a retailer (or sell to one), your price after discounts, returns, mark downs and so forth, might put you in a hole.
A problem may also occur when one assumes that combining individual numbers yields a meaningful aggregate. A classic example involves adding up the profits of all public companies and examining trends in that aggregate to divine the relative health of business and the economy in general. But public companies are only a fraction of the entire economy, so the aggregate of their profits is a lousy measure of business profitability. Instead, economists look at the net income of corporations in the National Income and Product Accounts.
Are the numbers rigged or even completely fake?
LIBOR, the London Interbank Offered Rate that banks charge when they lend to each other, is self-reported by the Treasury departments of large banks. As such, it has the potential for being fudged. (Verifying the veracity of self-reported financials is one of the reasons big companies have internal audit departments.)
When I ran the profitability measurement system for State Street Bank, somehow the Treasury department always beat the self-reported LIBOR rate published in the
Wall Street Journal.
But all bank Treasury departments have an incentive to understate the rate if they are a net lender, because that rate is used to judge their performance! Similarly, during the crash, banks deliberately underreported the rates they were paying because they didn’t want to spook customers and markets (a high rate can be a sign of distress).
Today, LIBOR is essentially fake, as is the London Interbank Market. That’s because since the crash, European banks
usually borrow from the ECB, the European version of the Fed, and rarely from each other. There is no market anymore behind LIBOR — it’s a fake rate in a fake market. And yet, despite all that, LIBOR is still used to price loans, analyze interest rates and more.
Thoughts and Suggestions
Look for data that is different than expected.
While it could mean your model is flawed, it may also indicate a problem with the data itself, or that the data is not what you think it is.
Put controls on the data.
Businesses do this for financial data; it should be done for all important data.
For example, when I was in the corrugated box business, machine and labor productivity were very important. One of the controls on the productivity data was to make sure hours worked per machine, job, etc., totaled the hours reported by the workers for payroll purposes. That was done for every shift, every day.
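A sketch of that control in code (the worker names, data structures, and tolerance are my own assumptions, not the original system): hours charged to machines and jobs for a shift should tie out to the hours those same workers reported for payroll.

```python
# Hours charged per worker across jobs for one shift (invented data).
machine_hours = {
    "worker_1": [4.0, 4.0],
    "worker_2": [3.5, 4.5],
}
# Hours each worker reported for payroll purposes (invented data).
payroll_hours = {
    "worker_1": 8.0,
    "worker_2": 8.0,
}

def reconcile(machine_hours, payroll_hours, tolerance=0.1):
    """Return workers whose charged hours don't tie out to payroll."""
    exceptions = {}
    for worker, reported in payroll_hours.items():
        charged = sum(machine_hours.get(worker, []))
        if abs(charged - reported) > tolerance:
            exceptions[worker] = (charged, reported)
    return exceptions

print(reconcile(machine_hours, payroll_hours))  # empty dict: data ties out
```

Running a check like this every shift, every day, catches mischarged or missing hours before they distort the productivity numbers.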
Disaggregate important data from time to time.
As in the productivity example above, the story the data seems to be telling may be very different when disaggregated.
When external data is important, invest in really understanding it.
When I worked for Kraft Foods, every product group was assigned someone from the market research department. The market research people were experts in the purchased market research data (special studies that were done, etc.). Also, because they were independent of the product group brand managers, product management couldn’t spin the data to senior management.
When looking at numbers and the conclusions drawn from them, the sample is often too small to support reliable predictions about a much larger population. In my manufacturing client example above, measuring only one minute in 60, or 1.7%, can lead to problems when extrapolating to the remaining 98.3%!
Forking paths is a related problem. It occurs when so much screening is done prior to the data analysis that the conclusions drawn depend on conditions that cannot realistically be generalized.
Every well-run company relies on data to make informed decisions. Keep in mind, however, that
information is only as good as the assumptions, techniques and circumstances through which it is generated.
Don’t just look at data, look into it!