Malcolm Gladwell and the statistical basis of Outliers
Malcolm Gladwell promotes his book on Colbert. Go watch it.
Isn't it just wonderful how comedy can get to the nub of things and
throw thoughtful, serious people off-balance?
So, Malcolm Gladwell has published a new book: Outliers. It is
Gladwell's summary of, and meditation on, the large volumes of
research on the topic of extraordinary achievement. As usual, the
reviews marvel at his clear prose and credit him with making a tough
topic easy to understand.
Now that Gladwell is quite a success story himself, people are getting
tougher on him. Part of his talent as a great writer is that he makes it
seem easy; lots of stuff is folded neatly in his sentences and
narrative. It takes some getting used to before you can attempt to
unravel his logic. Possibly the strongest criticism of Gladwell's
body of work (NYT, New Yorker, and the three books) is that he
simplifies too much.
I have not read the book! But he sure picks 'em - the topics. From
the optimistic message of "The Tipping Point", via the equally
optimistic "Blink", he now tackles a topic that has more or less
created the Self-Help movement: extraordinary success.
His message, as reported by the reviews, is again optimistic:
extraordinary success (outlier success) is a product of
highly specific circumstances, and therefore there is an
awfully large amount of talent out there that we should not
overlook because it does not act, look, or sound like "successful"
people do.
I once gave a research tutorial on my PhD topic: Data Clustering,
and in one of the feedback forms, I had this guy saying nice things
about my talk, except that, because the subject of his PhD is
"Outliers", he found my talk lacking in that respect. He wrote that
he wanted a "more solid treatment of the topic of Outliers". So,
take what I write below as totally un-solid.
In the field of Pattern Recognition, where my scientific training
has been, Outliers is a special topic. Pattern Recognition - for
those who need a small introduction - is a subject usually
classified under Computer Science that combines statistics and
computer algorithms for the purposes of learning models from data.
For example, you want a computer to learn someone's voice from a
number of his speech recordings for the purpose of recognising that
voice in new, unheard-before recordings. That's a typical pattern
recognition problem.
The fundamental assumption employed in these situations is that a
person's voice can be typified. Often, however, it cannot. It varies
according to his stress level, background noise, time of day, etc.
And sometimes the voice that the computer thinks is his, is not, and
vice versa. Typically, the decision rests on whether a given voice
is within the margins of expectation of the voice we are trying to
recognise, or outside of it. Is it an outlier or not?
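To make the "margins of expectation" idea concrete, here is a minimal sketch, not how a real speaker-recognition system works. It assumes the voice has been boiled down to a single number (a made-up average pitch in Hz) and that "outside the margins" means more than three standard deviations from the mean. The function name, the threshold and the data are all my own assumptions for illustration.

```python
from statistics import mean, stdev

def is_outlier(sample, reference, threshold=3.0):
    """Return True if `sample` lies more than `threshold` standard
    deviations away from the mean of the `reference` values."""
    mu = mean(reference)
    sigma = stdev(reference)
    if sigma == 0:
        # All reference values are identical: anything different is "outside".
        return sample != mu
    return abs(sample - mu) / sigma > threshold

# Hypothetical average pitch (Hz) taken from recordings of one speaker.
known_pitch = [118.0, 121.0, 119.5, 122.0, 120.5, 117.5, 120.0]

print(is_outlier(120.8, known_pitch))  # a new recording close to the usual pitch
print(is_outlier(210.0, known_pitch))  # a recording far from anything heard before
```

The threshold of 3.0 is only a convention; moving it changes which recordings get rejected, which is exactly the difficulty discussed further down.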
Thus, the topic of "Outliers" - an established sub-field of
statistics by itself - got linked to pattern recognition. I am
not an expert in Outliers - otherwise I would not have got that
comment from the student - but I know enough to know that it is
almost a philosophical question.
Something could be an outlier in one 'representation space' (just
think of 'space' for now) but very typical in another representation
space. For example, in one space - a courtroom (say) - a man could
be exceptionally important - the judge, but in another room - a
hospital waiting room - he is just another patient. Representation
spaces transform outliers into typicals, and vice versa.
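For the numerically minded, here is a small sketch of the same effect with points instead of judges. It assumes twelve made-up points scattered around a ring, plus one candidate point at the centre. Described by its x coordinate the candidate is perfectly typical; described by its distance from the centre it is wildly unusual. The helper name and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def z_score(value, reference):
    """Distance of `value` from the reference mean, in standard deviations."""
    return abs(value - mean(reference)) / stdev(reference)

# Twelve hypothetical points around a ring of radius roughly 1
# (radii alternate between 1.05 and 0.95 so the spread is not zero).
ring = [
    ((1.0 + 0.05 * (-1) ** k) * math.cos(k * math.pi / 6),
     (1.0 + 0.05 * (-1) ** k) * math.sin(k * math.pi / 6))
    for k in range(12)
]
candidate = (0.0, 0.0)  # the centre of the ring

# Representation space 1: the x coordinate of each point.
z_x = z_score(candidate[0], [x for x, _ in ring])

# Representation space 2: the distance of each point from the centre.
z_r = z_score(math.hypot(*candidate), [math.hypot(x, y) for x, y in ring])

print(f"x coordinate: {z_x:.2f} standard deviations from the mean")
print(f"radius:       {z_r:.2f} standard deviations from the mean")
```

Same point, same data, two descriptions: in the first it sits right at the mean, in the second it is many standard deviations away.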
Also, even with the representation space fixed, just how far do
you go before you say something is an outlier? Maybe it is
noise (a random, unwanted artefact) and not an outlier.
Distinguishing between noise and outliers is a huge pain in the neck
for people in the field.
Another angle: maybe with more data, or in more time, the outliers
cluster around each other, which would mean they are not outliers
but actually a distinct but tiny cluster. Once something is part of
a cluster it is not an outlier.
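A toy sketch of this, under a deliberately crude assumption: a point counts as "isolated" if no other point lies within a fixed radius of it. The function name, the radius and the numbers are mine, not a standard algorithm.

```python
def isolated_points(values, radius=0.5):
    """Return the values that have no other value within `radius`."""
    return [
        v for i, v in enumerate(values)
        if not any(abs(v - w) <= radius
                   for j, w in enumerate(values) if j != i)
    ]

# Hypothetical measurements: early on, 9.0 sits alone, far from the rest.
early = [1.0, 1.2, 0.9, 1.1, 9.0]
# Later, two more measurements arrive near 9.0.
later = early + [9.1, 8.9]

print(isolated_points(early))  # [9.0]
print(isolated_points(later))  # []
```

With the extra data, 9.0 has neighbours and stops looking like an outlier; it now belongs to a small cluster of three.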
To recap, the statistical definition of outliers is that they are
not noise, and they do not congregate in tiny clusters. When you
factor in that they are tricky to hunt down because they change
status from outlier to typical when the representation space is
changed, you realise how tough this problem of identifying outliers
is.
Malcolm Gladwell says that Bill Gates and Mozart are outliers. Why
aren't we saying they are noise? (Bill Gates is a random, unwanted
artefact!) And in what representation space are we working? Are we
measuring money, acclaim, extraordinary musical talent, what?
If we had more data, would Mozart still be an outlier, or would he
coalesce with Timbaland, Prince, Beethoven, and others into a
distinct, tiny cluster of "exquisite musicianship"?
Still, I agree with Gladwell's fundamental message: there is an
awful lot of luck involved in success, and an awful lot of wasted
talent on earth. If Warren Buffett had been born in Egypt, would he have
been as rich and as famous? I hope Malcolm Gladwell's book manages
to create a dent in how people assess the "potentially successful".
Amen. Yes sir, please.
Comments
Have you checked out William Gibson's novel 'Pattern Recognition'?
I had just gone to read about Gladwell's new book, Outliers, and came straight from there to see what you were writing about today.
Isn't the problem of outliers (as well as the variety of variations of derivative terms describing similarly exceptional things) really a problem - shared by all mathematics - of the MEANINGS of the things oversimplified into numbers?
* Lynn, thanks for your comment. I remember our conversations on Blink in your house and by the Bluffs. He is such an interesting writer, isn't he? Look forward to sharing notes with you on Outliers.
* math, yes, you are right, meanings are lost. But without Maths there would be no pyramids, no bridges, and no computers. In any case, true mathematicians are the real outliers of humanity!
I thought about Bill Gates being an outlier. I have an older friend who was very close friends with Bill's late mother. She sometimes tells stories about the incredible community of people who were connected to Bill's family when he was growing up. It was apparently an amazing family long before Bill co-founded Microsoft.
The bit I read about Gladwell's take on outliers said they never do it by themselves. It has everything to do with the family or community the person is a part of or comes from.
Since Bill Gates, the mainstream of software development has certainly been exclusive of intelligent or even responsible software. Microsoft has made outliers of all creators of good software, with good engineering, good design and good user-interfaces.
Looking generally at an industry or a community, we might ask where we should expect to find good ideas, innovative work and visionary leadership? In the mainstream? Or on the outlier fringe?
I remember reading an interview with Gates back in 1992 and he said he was an avid reader of his dad's Fortune (or similar business magazine) since age 8. He was a geek with a keen business sense. Not many of those around.
I agree with math that Gates has not contributed very much to software innovation. The reason we talk about him is mostly to do with his business acumen - his wealth. I am sure he is an absolutely engaging, highly intelligent man; but it is his business instincts that were remarkable. Microsoft imitated better products shamelessly, out-spent smaller competitors until they sank, and monopolised PC operating systems. Highly ambitious for a programmer!
The unrewarded developers of Unix (*the* seminal operating system of the 1970s, untouchable even today), the open-source developers of Linux, the developers of Lotus 1-2-3 (the dominant early spreadsheet, which Excel copied), the developers of WordPerfect (which Word copied), the developers of Borland compilers (which Microsoft almost put out of business), ... my God, how these guys must resent Microsoft.
This is probably the problem with Gladwell's theory: agreeing on outliers is probably impossible. Why aren't Dennis Ritchie et al. (of Unix) outliers? If they are, fine, but then how come Linus Torvalds (of Linux) is not an outlier? If you let him in as an outlier, what about Richard Stallman of the GNU Project ... and so on and on.
I also think that lumping people like Mozart together with Gates is wrong. They are two truly different souls. To be exceptional in the arts is something dependent on you as a person (and you can lose it) and your audience (and they can change); but to be exceptional in computing or business is more logical.