The importance of statistics in running a business
Article posted on 11/18/2003
There are lies, damn lies, and statistics. It's true. Statistics can be used to pass off the most erroneous things as seemingly factual. Those of us who base our business plans on statistics have to be very careful not to let statistics mislead us. Over the years, I have made statistical analysis a key part of my core competency. I track all manner of statistics and have learned how to mush them together to form reasonably accurate conclusions.
Here are 4 tables.
| May-01 | DLs |
| WindOS Soft | 911 |
| MSWindowsXP | 794 |
| Nascent | 703 |
| LunaUIXP V2 | 570 |
| Gemin | 567 |
| Jul-02 | DLs |
| Midnite | 1480 |
| Serene II | 1407 |
| Industrial Disease | 815 |
| Kismet | 466 |
| Win2k Extended | 403 |
| Feb-03 | DLs |
| XP Prime | 2795 |
| C-N Red | 2027 |
| Woodworks | 1055 |
| Axion_CCOlor | 687 |
| Vgreen | 568 |
| Nov-03 | DLs |
| Longorn Slate | 2121 |
| Atlantek 2 | 1036 |
| Flooter WB | 867 |
| NHA Thy | 796 |
| Shadow 2.0 | 665 |
So what are these tables, you're thinking? These are the top 5 WindowBlinds skin downloads for a particular day in May 2001, July 2002, February 2003, and November 2003. I want to know if the overall user base of WindowBlinds is increasing or decreasing.
As anyone who does statistical analysis can tell you, if you are trying to determine the number of users, you cannot go by the top downloads. Ideally, you would use the mean number of downloads. Unfortunately, that would take too long, so I have found an imperfect mechanism for getting a "rough" user base guesstimate.
Throwing out #1 and #2 from each table allows us to eliminate skins that might have been highlighted somewhere, been linked by someone, or had other extenuating circumstances. Instead, places 3 through 5 added together make our rough estimate, but only as the start of our statistical analysis journey:
| Date | DLs (places 3-5) |
| May-01 | 1840 |
| Jul-02 | 1684 |
| Feb-03 | 2310 |
| Nov-03 | 2328 |
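The trimming step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling actually used; the names `top5`, `trimmed_total`, and `rough_estimates` are made up for the example, and the numbers are copied from the four tables.

```python
# Top-5 download counts for each sampled day, copied from the tables.
# The variable and function names here are illustrative assumptions.
top5 = {
    "May-01": [911, 794, 703, 570, 567],
    "Jul-02": [1480, 1407, 815, 466, 403],
    "Feb-03": [2795, 2027, 1055, 687, 568],
    "Nov-03": [2121, 1036, 867, 796, 665],
}


def trimmed_total(counts, skip=2):
    """Drop the `skip` largest counts and sum whatever is left."""
    ranked = sorted(counts, reverse=True)
    return sum(ranked[skip:])


# Places 3 through 5 added together for each day.
rough_estimates = {day: trimmed_total(counts) for day, counts in top5.items()}

for day, total in rough_estimates.items():
    print(f"{day}: {total}")
```

Sorting before slicing means the two largest counts get dropped even if the list is not already in rank order.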
Better, but there is still a problem. Look at the months. Web site activity varies greatly by time of year. The peak months are usually October through March. The dead time is June and July. So we have to bias the results a bit:
| Date | DLs | Fudge Factor | Modified Result |
| May-01 | 1840 | 1 | 1840 |
| Jul-02 | 1684 | 1.1 | 1852 |
| Feb-03 | 2310 | 0.9 | 2079 |
| Nov-03 | 2328 | 0.9 | 2095 |
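As a rough sketch, the seasonal adjustment is just a per-date multiplication followed by rounding. The fudge factors below are the ones from the table; the dictionary names are assumptions made for this example.

```python
# Sums of places 3 through 5 and the seasonal fudge factors, both taken
# from the table. The names are illustrative, not from any real tool.
rough_estimates = {"May-01": 1840, "Jul-02": 1684, "Feb-03": 2310, "Nov-03": 2328}
fudge_factors = {"May-01": 1.0, "Jul-02": 1.1, "Feb-03": 0.9, "Nov-03": 0.9}

# Scale slow months up and peak months down, rounding to whole downloads.
adjusted = {
    day: round(rough_estimates[day] * fudge_factors[day])
    for day in rough_estimates
}

for day, value in adjusted.items():
    print(f"{day}: {value}")
```

The factors themselves are a judgment call about seasonality, so the result is only as good as that guess.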
Now we have a better idea. If you saw the original results, the tempting thing to do would be to say, "Hey look, November has the highest stats ever, we're kicking ass!" But that would be a delusion, and a dangerous one. November is a peak traffic month, just like February, whereas the earlier stats were from the spring and summer months. After biasing them, the results don't look quite as impressive, but they do show consistent growth. From July 2002 to November 2003, the adjusted figure rose from 1852 to 2095, so the WindowBlinds population has increased by around 13%.
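To see how much the seasonal bias matters, here is a small sketch comparing growth from July 2002 to November 2003 with and without the adjustment. The helper `percent_change` is a hypothetical name for this example; the inputs are the table values.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, expressed as a percentage."""
    return (new - old) / old * 100.0


# Jul-02 versus Nov-03, before and after applying the fudge factors.
raw_growth = percent_change(1684, 2328)
adjusted_growth = percent_change(1852, 2095)

print(f"raw: {raw_growth:.1f}%  adjusted: {adjusted_growth:.1f}%")
```

The unadjusted numbers suggest growth of about 38%, while the adjusted ones come out at about 13%, which is the gap between the tempting reading and the more cautious one.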
Of course, even here there are other issues, such as the day-to-day availability of skins. It turns out that, on average, 1.5 new visual styles are added each day to the world of WindowBlinds, but the quality can vary from week to week. You make do with what you've got.
I've tracked most of the other programs, competitors and unrelated programs alike, to look for trends. This technique definitely works well as a basic guide for popularity. Which gets back to my point -- if you want to be an entrepreneur, you have to wear many hats. I attribute much of our success to knowing the market and making projections based on accurate statistical models. Those who ignore the statistics or don't know how to use them accurately are likely to end up bankrupt in the long term.