The independent news organization of Duke University

Mind the squeaky wheels

barely functional

A recent Chronicle article reported on the results of a Duke study of a New Zealand community, which found that about 80 percent of public services were consumed by the neediest 20 percent of the population. This relationship was tied back to a neat mathematical result called the “Pareto Principle.” The Pareto distribution, and the related Zipf distribution, often arise from systems where habits or patterns are self-reinforcing, such as criminal convictions or health issues. I felt the article showed far too little excitement about this, and I seek to rectify that now.

American linguist George Kingsley Zipf is credited with describing perhaps the eeriest mathematical fact of the universe, today known as Zipf’s law. Put simply, in any collection of things, the nth most popular thing is about 1/n times as popular as the most popular thing. It is best understood through examples. Open the last piece of literature you read or the last document you wrote that’s at least a few hundred words long, and count the number of occurrences of each word. What you’ll likely find is that the most commonly used word is used roughly twice as much as the second most commonly used word, roughly thrice as much as the third most used word, and so on. What’s impressive about this is that it seems to hold true in basically every text—the Gutenberg corpus, the entirety of Wikipedia, the King James Bible or this week’s Chronicle articles—and not just in English, but any language, including ancient ones nobody can translate.
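For readers who would rather not count by hand, the word-counting experiment above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rank_frequency` and the simple rule for what counts as a word (runs of letters and apostrophes, case ignored) are assumptions of mine, not part of any standard method.

```python
import re
from collections import Counter

def rank_frequency(text, top=5):
    """Return (rank, word, count, zipf_prediction) for the most common words.

    zipf_prediction is the count Zipf's law would suggest for that rank:
    the count of the most common word divided by the rank.
    """
    # Assumption: a "word" is any run of letters or apostrophes, case ignored.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top)
    if not ranked:
        return []
    top_count = ranked[0][1]
    return [(rank, word, count, top_count / rank)
            for rank, (word, count) in enumerate(ranked, start=1)]

if __name__ == "__main__":
    sample = "a a a a a a b b b c c d"  # toy text built to fit the 1/n rule
    for row in rank_frequency(sample):
        print(row)
```

Paste a few hundred words of real text into `sample` and compare the third and fourth values in each row; for short texts the match is usually rough rather than exact.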

It may seem that Zipf’s law is little more than a boring party trick, but without looking too far it becomes downright mysterious in its ability to predict the frequencies of many events, not just word usage. The population of the largest city in a country is usually about twice as large as that of the second-largest and three times as large as the third-largest. Measure the amount of time a family spends in each room of their home, and Zipf’s distribution appears again. The numbers of casualties from accidents and wars are also known to follow Zipf’s law. It’s even been used to find evidence of fraud in scientific publications: Data which don’t follow Zipf’s pattern have sometimes been found to have been faked. I’m willing to bet that you use your most commonly used pen or pencil twice as often as the next and three times as often as the one after that, and that you’ll find Zipf’s law appearing in how often you use the various ePrint stations around campus.

The reason that so many different phenomena fall under this pattern isn’t very clear, and it’s likely that each case has a different set of root causes. What has been noted, however, is that Zipf’s law is an inevitable mathematical result for situations that behave according to a principle of least effort. When an event is more likely to occur after it has already occurred, the long-run frequency of events tends to be proportional to the inverse of their rank, producing the 1/n rule. Language, it seems, follows a principle of least effort: common usage of a particular word increases the chances it is used in the future.
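The self-reinforcing mechanism described above can be tried out with a small simulation. This is a sketch under stated assumptions, not a model of any real language: at each step a brand-new kind of event appears with a small probability (`new_item_prob`, a value I chose for illustration), and otherwise a past event is repeated, picked at random from the whole history so that frequent events are more likely to recur. The function name `simulate_reinforcement` is likewise hypothetical.

```python
import random
from collections import Counter

def simulate_reinforcement(steps, new_item_prob=0.1, seed=0):
    """Simulate a rich-get-richer process and return counts, largest first."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    history = [0]               # start with a single event of type 0
    next_id = 1
    for _ in range(steps - 1):
        if rng.random() < new_item_prob:
            # A never-before-seen kind of event occurs.
            history.append(next_id)
            next_id += 1
        else:
            # Repeat a past event; common events are picked more often
            # because they appear more often in the history.
            history.append(rng.choice(history))
    return sorted(Counter(history).values(), reverse=True)

if __name__ == "__main__":
    counts = simulate_reinforcement(5000)
    print(counts[:10])  # a few very common events...
    print(counts[-10:]) # ...and a long tail of rare ones
```

The result is a handful of very common events and a long tail of rare ones, the same lopsided shape the 1/n rule describes, though the exact steepness depends on the probability chosen.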

A convenient and easy-to-remember result of Zipf’s law is the Pareto “80-20” Rule: 20 percent of the causes account for 80 percent of the effects. The implications of this in all sorts of fields are staggering: about 20 percent of Medicare participants will use 80 percent of available resources; about 80 percent of the meals ordered in West Union will be the most popular 20 percent of menu options; about 80 percent of land is owned by 20 percent of landowners. For a typical institution, business or cause, solving just a few of the biggest problems will result in a huge improvement in efficiency, profit and success.
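How close does a pure 1/n rule come to 80-20? A short calculation gives a feel for it. This is a minimal sketch that assumes popularity follows the 1/n rule exactly; the function name `top_share` and the item counts are illustrative choices of mine.

```python
def top_share(n_items, top_fraction=0.2):
    """Share of the total held by the top `top_fraction` of items,
    assuming the item of rank n has weight exactly 1/n."""
    weights = [1 / rank for rank in range(1, n_items + 1)]
    k = max(1, round(n_items * top_fraction))  # how many items are "the top"
    return sum(weights[:k]) / sum(weights)

if __name__ == "__main__":
    for n in (100, 1000):
        print(n, round(top_share(n), 3))
```

Under these assumptions the top 20 percent hold roughly 69 percent of the total when there are 100 items and roughly 79 percent when there are 1,000, so "80-20" is best read as a rule of thumb, not an exact figure.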

There’s benefit to be gained on a more personal level as well. What if 80 percent of your life’s joy and best memories came from 20 percent of your life’s experiences? It’s perhaps uncomfortable to think of a human life in discrete terms like this. But recently I’ve begun applying this principle in my own daily life and have found it quite useful. I’m aware of the Zipf-ian patterns of my own existence: where I eat or study, who I spend my time with and how I browse the internet all follow the same basic pattern.

But do the same 20 percent of people receive 80 percent of my kind words and deeds? Do 20 percent of my academic assignments receive 80 percent of my total effort? I’m not yet sure, and I’m not sure if such a lopsided distribution is necessarily good or bad.

What I think is certain, however, is that with enough effort it’s possible to live per any distribution one desires. Zipf’s law fits a lot of data, but not all of it—the choices we make, not a mathematical function, determine the events of our lives.

Eidan Jacob is a Trinity junior. His column, "barely functional," runs on alternate Tuesdays.

