kip/bot/blog

apophenic pretentia

Category Archives: analytics

Good line from NYT Book Review

Posted on October 4, 2008 by kipbot
No Comments

“The plural of anecdote is not data”
- from a review of Friedman’s new book

Categories: analytics

Numerati Generation Gap: Nate Silver & Dan Rather

Posted on October 2, 2008 by kipbot
2 Comments

Fun interview by Dan Rather of fivethirtyeight’s Nate Silver:

Some interesting things to note:

  • it’s fun to look at Dan Rather’s bemused near-smirk. You can just hear him thinking “you dork, why don’t you stick to baseball stats”
  • the number of times Rather refers to complex statistical methods for either the baseball work or the fivethirtyeight work
  • the psychohistory line about “any one game doesn’t matter” but when you hit a critical mass of data, in polls or stats, you can “find nuggets of wisdom”
  • There’s a weird thing going on in this discussion about stats and polling where some very simple math is being turned into high science. If you spend a little time looking at Baseball Prospectus, it’s all algebra. There may be some underlying techniques in the crunching of the numbers, like regression, but the formulas are pretty simple. fivethirtyeight is largely a question of weighting polls, based on some historical data. It’s just not that complicated. Silver’s dissection of the GWU/Battleground poll is barely even a dissection — he just looked at the methodology and saw that they over-indexed older voters! I’m starting to find it frightening how innumerate people are . . . or is it how illogical they are given that it’s middle school math level?

    Categories: analytics, politics, Uncategorized

    Barnacles, Butterflies, and . . . Buffoons?

    Posted on October 1, 2008 by kipbot
    No Comments

    I’m reading Numerati, a fun read about the rising importance of data and modelling (and a healthy antidote to some of the extremes of Super Crunchers). In general, the book has a better, less fetishistic tone, one that acknowledges the power of what’s going on, but keeps it real:

    The only folks who can make sense of the data are crack mathematicians, computer scientists, and engineers. They know how to turn the bits of our lives into symbols . . . [he has a nice jag about using index cards to keep track of dietary patterns, and how inefficient that would be. It's a bit of humanizing text, but I don't feel like typing it.] The key to this process is to find similarities and patterns. We humans do this instinctively, it’s how we figured out, long ago, which plants to eat and how to talk. But while some of us were focusing on more specific challenges, others were thinking more symbolically. I picture early humans sitting around a fire. Some, naturally, are jousting for the biggest piece of meat or busy with mating rituals. But off to the side, a select few are toying with stones thinking “if each of these pebbles represent one mammoth, then this rock . . . ”

    Somehow, those paleo-nerds playing with the stones instead of mating or eating meat managed to survive long enough to pass on their genes until, millions of years later, they could become the Hari Seldons of the 21st century.

    The key thread of the first fourth of the book (which is where I am, according to the impossible-to-count progress dots on my Kindle) is how people are trying to turn data points into meaningful models of people. The first test cases are supermarkets, where discount programs and smart carts are being deployed to gather data points about people. One of the first things that emerges is that there are customers who do too good a job of taking advantage of sales and promotions. These people, called “barnacles” by the numerati and by marketers who never really intended for people to take advantage of sales, are the people who watch the movies they rent on Netflix rather than let them sit on the coffee table collecting dust, or the people who actually go to the gym and try to live up to their New Year’s resolutions or lower their blood pressure. These barnacles should be “fired” by retailers, as they drag down profits.

    On the other side, you have “butterflies”: “customers who drop in at the store on occasion, spend good money, and then flit away, sometimes for months or years on end.” Since they’re unreliable, it’s a waste of time to lavish courteous, much less fawning, treatment on them.

    I suppose that means that the most desirable customers are buffoons . . . those who don’t scrutinize, price-seek or use the products and services they buy and those who are easily ensnared in a seller’s field of gravity.

    It’s kind of fun to watch marketing lurch between respecting the customer’s individuality and trying to model them into flippable switches.

    Categories: advertising, analytics, marketing

    Everything is SABERMetrics, even politics

    Posted on September 30, 2008 by kipbot
    No Comments

    As part of my poll-obsessing, I finally checked out fivethirtyeight, recommended to me by Alex. Short version is that Nate Silver, the author of the site, is also a leader of Baseball Prospectus. He is credited with creating the very powerful PECOTA system, which rethinks baseball statistics (mostly through pure intelligence, though some of the math exceeds the AD&D level) and in the process creates a much better explanatory and predictive tool. (It also played no small part in helping to create fantasy baseball’s popularity and even in helping baseball make a comeback when people thought the fast-paced, pre-felonious NBA was going to surpass America’s pastime.)

    fivethirtyeight is, and I don’t think this is oversimplifying, doing for political polling what Silver did for baseball stats: finding truths by refining, critiquing, and improving simplistic polling data. Today’s post on the site was one of those aha moments:

    I have gotten an increasing number of questions about the GWU/Battleground Poll, which presently gives John McCain a 2-point national lead, even as essentially every other current national poll shows Barack Obama with a lead of at least 5 points.

    Just because a poll is an outlier doesn’t necessarily mean that it’s doing something wrong. Pollsters may have legitimate reasons for having a different perspective on the election, and they may also occasionally produce odd results due to chance alone.

    In this case, however, the poll seems to be making a relatively fundamental mistake: it is not weighting by age.

    For months, I’ve been wondering why the hell some polls have been reporting a neck and neck race, while others show Obama steadily gaining ground. (Even stranger, why on earth is the always admirable John McCain pulling such silly stunts, throwing hail Marys, if it’s a dead heat?) Finally, someone explains it, and oh how bizarrely simple it turns out to be.

    For those who are curious, here’s the weighting of the battleground poll in question:

    18-34 17%
    35-44 12%
    45-64 40%
    65+ 31%

    Compared to the US Census/2004 election data:

    18-34 26%
    35-44 17%
    45-64 38%
    65+ 19%

    Pretty clear. This poll massively overrepresents older voters who, at a local polling level, have been averse to Obama for a variety of reasons, and massively underrepresents the younger voters whom Obama has targeted in campaign activities and who are likely to respond to the post-Baby-boomer voice he’s cultivated.
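For anyone who wants to check the arithmetic, here is a minimal sketch of what "weighting by age" amounts to, using only the two sets of percentages quoted above. The function names are my own illustration, not anything from fivethirtyeight or the pollster. The post doesn't include the poll's candidate support by age group, so the second function takes those numbers as an input rather than guessing them.

```python
# Age-group shares quoted in the post, in percent.
POLL_SHARE = {"18-34": 17, "35-44": 12, "45-64": 40, "65+": 31}
CENSUS_SHARE = {"18-34": 26, "35-44": 17, "45-64": 38, "65+": 19}


def age_weights(sample, target):
    """Post-stratification weight per group: target share / sample share."""
    return {group: target[group] / sample[group] for group in sample}


def reweighted_support(support_by_group, target):
    """Topline support once each group counts at its target share.

    support_by_group holds a candidate's share (0-100) within each age
    group. These crosstabs are not in the post; callers supply their own.
    """
    total = sum(target.values())
    return sum(support_by_group[g] * target[g] for g in target) / total


weights = age_weights(POLL_SHARE, CENSUS_SHARE)
for group, weight in weights.items():
    print(f"{group}: each respondent counts as {weight:.2f}")
```

With these shares, a respondent aged 65+ would count as roughly 0.61 of a person and one aged 18-34 as roughly 1.53, which is the whole correction: divide one column by the other.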

    So simple, no math. Can’t tell if I’m impressed at the baseball-stats freaks or disgusted at the innumeracy of the media, or even literate newspaper reading people.

    Categories: analytics, politics, science

    Data and Original Thinking

    Posted on September 10, 2008 by kipbot
    No Comments

    Nice quotation of Darwin in Glut recommended by @mokindo:

    I am a firm believer that without speculation there is no good and original observation

    Categories: analytics, expertise

    Number-crunching: Bill James going soft?

    Posted on July 30, 2008 by kipbot
    No Comments

    Just kindle-bought Rob Neyer’s Big Book of Baseball Legends, and am amused by Bill James’s prologue:

    The academics have won. The standards of accuracy that began in academia have been embraced by paid reporters and have now spread to the limitless legions of dignified researchers, pounding out accurate if boring biographies about absent and long-dead heroes.

    And I’m not saying that’s a bad thing, you know? Dinosaurs are more interesting than unicorns. I don’t even read fiction; history is always more interesting. I am just saying… something humanizing and indefinable has been lost in the search for the truth — lost or, worse yet, thrown away. For thousands of years, men made slightly heroic fiction out of their own petty lives. You can’t get away with that anymore.

    It’s a strange introduction to a book that is largely about debunking baseball myths. Stranger still, coming from someone who almost single-handedly turned number-crunching into part of America’s pastime and may have done more for increasing overall numeracy in the country than any government initiative. Still, it’s nice to hear a high priest of truth-by-numbers acknowledge that there’s more to baseball, and other things, than the numbers might be able to tell.

    Categories: analytics

    Politicos are More Social than Designers

    Posted on May 8, 2008 by kipbot
    2 Comments

    Technology Review ran an article about blogosphere and social network traffic visualizations which featured pretty and interesting pictures as well as insights into what’s worth measuring in social networks. (The full article isn’t yet available to non-subscribers.) The picture below visualizes a number of things including, apparently, the relative ego size/socialness of political junkies and designers.
    [Image: blogosphere.jpg]

    The two regions are held together by popular blogs with ties to both subject areas. The size of the circle representing a given blog is proportional to the number of other blogs linked to it. Hurst notes an apparent difference in culture between the two regions: pink lines, which represent reciprocal links, are much denser among the political blogs than they are among blogs focused on technology.
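The two measures in that caption are simple enough to sketch. This is a rough illustration only: the link data is a made-up toy graph, the function names are hypothetical, and I'm assuming "linked to it" means inbound links. It is not Hurst's actual method or data.

```python
from collections import Counter

# Hypothetical toy data: (source blog, target blog) directed links.
LINKS = [
    ("pol_a", "pol_b"), ("pol_b", "pol_a"),
    ("pol_a", "pol_c"), ("pol_c", "pol_a"),
    ("tech_a", "tech_b"), ("tech_c", "tech_b"),
    ("pol_b", "tech_b"),
]


def inbound_counts(links):
    """Number of distinct blogs linking to each blog (drives circle size)."""
    return Counter(target for _, target in set(links))


def reciprocal_fraction(links):
    """Share of directed links whose reverse link also exists."""
    link_set = set(links)
    if not link_set:
        return 0.0
    mutual = sum(1 for src, dst in link_set if (dst, src) in link_set)
    return mutual / len(link_set)


political = [link for link in LINKS if all(b.startswith("pol_") for b in link)]
technical = [link for link in LINKS if all(b.startswith("tech_") for b in link)]
print(inbound_counts(LINKS).most_common(1))
print(reciprocal_fraction(political), reciprocal_fraction(technical))
```

In this toy graph every political link is returned and no technology link is, which is the pattern the pink lines are said to show.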

    Categories: analytics, design

    “Biology gives way to chemistry”, or Number-Crunching Reductionism

    Posted on March 12, 2008 by kipbot
    No Comments

    Came across a line in Omnivore’s Dilemma that captures some of my frustration with super-crunching and marketing models:

    To reduce [a complex agricultural system under discussion in the book] represented the scientific method at its worst. Complex qualities are reduced to simple quantities; biology gives way to chemistry . . . that method can only deal with one or two variables at a time. The problem is that once science has reduced a complex phenomenon to a couple of variables, however important they may be, the natural tendency is to overlook everything else, to assume that what you can measure is all there is, or at least all that really matters. When we mistake what we can know for all there is to know, a healthy appreciation of one’s ignorance in the face of a mystery . . . gives way to the hubris that we can treat nature as a machine.

    I love the idea of biology giving way to chemistry: systemic thinking giving way to engineering problems. How often do designers struggle against models of people that focus on two factors to the exclusion of everything else, that reduce people to the actions we want them to take?

    Stick a pin in it and it dies.

    Categories: analytics, reading

    Grounding Abstract Methods in Design Needs

    Posted on February 28, 2008 by kipbot
    6 Comments

    Two articles, once again from Todd Walker, highlight how research (or research-driven techniques) needs to be (re)-grounded in the needs of design.

    The first, Design Meets Research from AIGA, has a useful survey of leading testing techniques and provides some pros and cons about each of them. In the middle of the piece is a paragraph that summarizes the key problem most designers have with research:

    There is a group of brand consultants and cultural anthropologists alike that believe now that it is not the actual research itself that is the problem. It is rather about how research is often misused, what type of design concepts and stimulus are tested, and how data is analyzed that is most often at fault. When used correctly, research shouldn’t stifle creativity but rather offer designers stronger inspiration and focus.

    They remind designers that there’s a critical interpretation phase that comes between research and design.  No one would disagree with that statement, but where it gets tricky is how people define interpretation and who participates in it.  In more than one work environment, interpretation meant a summary of major findings, was conducted by the strategy group or account lead, and somehow straight-lined to design recommendations.  (“Only 49% of respondents viewed element x favorably -> Replace element x or remove it.”)

    The article hits some other high points:  know what you’re testing for; remember that testing is ultimately about better understanding a customer (heightening designer empathy with the audience) and not about having customers do design; ethnographic activities are still the best things for designers to do no matter what; research is an art not a science; interpretation is a joint activity between design and research.

    The other article, Personas and the Role of Design Documentation, has similar themes, but is more focused on personas. Specifically, it focuses on the way in which most people treat personas as a deliverable that needs to be done, not as a tool with a purpose and communication goal. Key point for the writer:

    Personas are not documents, and they are not the result of a step-by-step method that automagically pops out convenient facsimiles of your users. Personas are actually the designer’s focused act of empathetic imagination, grounded in first-hand user knowledge.

    The best part of the article is a distillation of lessons from Alan Cooper’s ‘origin of personas’ story (mythic in its grandeur, but true):

    1. Cooper based his persona on a real person he’d actually met, talked with, and observed.
    This was essential. He didn’t read about “Kathy” from a market survey, or from a persona document that a previous designer (or a separate “researcher” on a team) had written. He worked from primary experience, rather than re-using some kind of user description from a different project.

    2. Cooper didn’t start with a “method”—or especially not a “methodology”!
    His approach was an intuitive act of design. It wasn’t a scientific gathering of requirements and coolly transposing them into a grid of capabilities. It came from the passionate need of a designer to really understand the user—putting on the skin of another person.

    3. The persona wasn’t a document. Rather, it was the activity of empathetic role-play.
    Cooper was telling himself a story, and embodying that story as he told it. The persona was in the designer, not on paper. If Cooper created a document, it would’ve been a description of the persona, not the persona itself. Most of us, however, tend to think of the document—the paper or slide with the smiling picture and smattering of personal detail—as the persona, as if creating the document is the whole point.

    4. Cooper was doing this in his “spare time,” away from the system, away from the cubicle.
    His slow computer was serendipitous—it unwittingly gave him the excuse to wander, breathe and ruminate. Hardly the model of corporate efficiency. Getting away from the office and the computer screen were essential to arriving at his design insights. Yet, how often do you see design methods that tell you to get away from the office, walk around outside and talk to yourself?

    5. His persona gained clarity by focusing on a particular person—“Kathy”.
    I wonder how much more effective our personas would be if we started with a single, actual person as the model, and were rigorous about adding other characteristics—sticking only to things we’d really observed from our users. Starting with a composite, it’s too easy to cherry-pick bits and pieces from them to make a Frankenstein Persona that better fits our preconceptions.

    There are, of course, challenges embodied in these lessons. Grounding a persona in one person could lead to endless ratholes about which one person, and number wonks will immediately jump all over the “method”/“methodology” point. But the key point is that personas are ways of creating empathy with the user, of getting us (our team and clients and other stakeholders) out of our own heads and into someone else’s, of creating conversations with potential customers and users.

    Categories: analytics, design

    Wisdom applied to Number Crunching

    Posted on February 27, 2008 by kipbot
    No Comments

    Terrific TNR article referred to me by Todd Walker describes how the Obama team uses data and wonky policy techniques in a way that seems relevant for many of us in an increasingly number-rich, -doused, -drenched, -dictated world.

    The article starts with a description of the influence of neo-classical refiner Richard Thaler:

    Behaviorists like Thaler believed that the perfectly rational, utterly self-interested maximizers of economists’ imaginations had little in common with actual human beings, who frequently err when making simple calculations, who have trouble with self-control, who often act out of altruism or spite.

    But what’s really interesting is how Thaler and his fellow behaviorists responded to this fairly critical insight. Though rational self-interest was the central tenet of neoclassical (i.e., modern) economics, they didn’t take a wrecking ball to the field and replace it with some equally sweeping theory of human behavior. Instead, they labored to bring economics closer in line with how the world actually works, one small adjustment at a time. “‘Discovery commences with the awareness of anomaly,’” Thaler wrote in the introduction to The Winner’s Curse, quoting the philosopher Thomas Kuhn. “I hope to accomplish that first step–awareness of anomaly. Perhaps at that point we can start to see the development of the new, improved version of economic theory.”

    One of my biggest gripes with data and marketing models (funnels) is that people tend to approach them as rules to live by. When faced with an anomaly, there are two responses: 1) wave it off as an anomaly; or 2) try to force the anomaly into the ‘model’. It’s a bit like the retrograde motion of planets: when the observational data pointed to non-circular motion of the planets, retrenching astronomers created these weird circle-within-circle movements that had no plausible explanation, but preserved the pretty circles. A third approach would be to evolve the model, soften its hard edges, add some dynamics to it.

    The divide in economics between numbers and working models is becoming a chasm. What’s great about Thaler’s approach is that it sits somewhere between the wrecking ball of a brand-new model and the retrograde patching of an old one. The thinking embraces the anomaly and allows for a burst of punctuated-equilibrium development in the model. “Like their intellectual godfather Thaler, the Obama wonks aren’t particularly interested in tearing down existing paradigms, just adjusting and extending them when they become outdated. (Thaler urges his students to master the same traditional, mathematical models their colleagues do if they want to be taken seriously.)”

    Another nice passage highlights that there is still something along the lines of expertise and judgement that can live well with numbers:

    The second difference is that the Obama hands tend to feel less hemmed in by establishment opinion. As one Obama adviser puts it, “Democrats want to be just a little bit different from Republicans, but not so different that they get attacked for being weak.” Like Hamilton, the Obamanauts generally reject this calculus–not because they favor some radical alternative, but because clinging to received foreign policy wisdom can preclude highly practical courses of action.

    Of course, here they’re talking about foreign policy, which is not numbers-based. But the idea of “practical courses of action” (things which just make sense or feel right, pass the sniff test, resonate with a highly trained nerve ending) has a place in their discussions, agenda, and plans.

    It also allows for leadership without ignoring the polls, or innovation without ignoring the data.

    Categories: analytics, marketing, politics