Archive for the ‘dataviz’ Category

Surprising experience of the persistence of advertising amidst noise

Thursday, July 21st, 2011

Getting circulated in the twittersphere is a 7-minute video displaying front “pages” of the NY Times website across 11 months. There are 12,000 screenshots, generally displayed for a fixed period of time (with some punctuation in the form of quick holds on a screenshot). For the first minute, I thought this was just another data stunt. I stuck with it, though, curious to see if anything popped. There were a handful of images that popped even though they appeared once (Obama’s frowning face the day after the election, Jared Loughner’s disturbing head shot, World Series and big sports game shots). News stories that crossed several days (the Arab Spring, Chilean Miners) created some persistence and lasting impression. The most persistent, and memorable, parts, however, were the advertisements. Brands that bought the masthead banners for extended periods of time, and take-overs just below the masthead (also for extended periods), became visual foundations for the crazy flow of seemingly disconnected stories, nuggets, and factoids racing underneath. As someone who has grown up believing that advertising doesn’t affect me and that it’s the stuff between the stuff we really want, that odd sense of solidity in contrast to the important stuff of the real world was jarring.

The nerdification of sports/everything

Monday, February 28th, 2011

I love this commercial:

A few days ago, I posted some critical comments about the data visualization techniques used in an iPad app. The designer responded (the links can be found in the post) and it highlighted some larger design issues.

There is no longer a dichotomy of stats-people and civilians. Everyone is surrounded by data and everyone is increasingly using data and, with it, data visualization. The commercial above highlights this amusingly, Steven Johnson writes about it in Everything Bad is Good for You, and newspapers and TV news outlets are building teams to define that new literacy.

All info-graphics and data visualizations have the same standards: to bring meaning to the data or turn that data into a story. While the goal of the content may change and the technical proficiency of the audiences may vary, those two standards apply universally. New info-graphic techniques should improve the meaning or the story-telling ability of earlier techniques. When we replace an efficient, clear, easy-to-scan table with a map containing blobs, there needs to be an improvement in either or both of those criteria.

Tufte is not a statistician’s statistician; he’s more the Orwell of the data-literate age. Having referenced Tufte in my earlier post, I got hit with a lot of comments about ivory tower, academic approaches, and statistical wonkiness. The truth is, Tufte studies a whole range of data visualizations, from restaurant menus to ballots to subway maps to train timetables to sunspot charts. Throughout his writing, he rails, like Orwell, against data treatment that obscures meaning, muddles thought, or deliberately distorts. For Tufte, and you see this most clearly in his analysis of the Space Shuttle disaster, there’s an ethical responsibility to be clear and accessible to as broad an audience as possible.

Anyway, I just love that commercial. It shows as clearly as anything that people love richness, complexity, and depth in their content.

Pennant App: Disappointing Data Design

Friday, February 25th, 2011

Note: The app’s designer, coder, all-around maker, responded in his blog. Some additional comments, responses to this post, in a later post.

I just downloaded the Pennant iPad app. While connected to the internet, this app lets you look at every play of every game of professional baseball going back to 1951. It’s gotten glowing reviews from Wired and other places with praise for its “rich interface” and all the fun they bring to stats. I was stoked to buy it — I like reading about baseball more than watching it, I loves me my data, and I was genuinely happy with the Jazz app, which has a similar visual language and navigation tropes.

Sadly, Pennant is just prettied up data, prettied up so much that it underperforms the highly evolved system of box scores and the recent and insufficiently explored sparklines of Edward Tufte. It’s also loaded with some bad usability posing as visualization.

The first problem comes with the app designers’ attachment to cover flow. I’ve never been a big fan of cover flow, finding it imprecise for task completion, and way too low in information density for exploration of anything larger than 20 items. In Pennant, it’s a real waste of space and a strange distortion of a timeline.

img_0115.PNG

The colors are meaningless and confusing, the covers themselves add nothing to the exploration (might be nice to have the jerseys to differentiate between the different iterations of the Giants, or some basic information about team founding, league, or notable players — anything beyond text to justify a visual treatment). Worse, they don’t even show enough items, requiring extra work to get around. While I dislike it, at least the iTunes version previews information and provides more:

coverflow.png

The cover flow fixation obscures the drill-downs in the experience as well. This is a screenshot of the 1981 Pirates season:

img_0135.PNG

The real information is the line across the bottom of the screen, but the cover, which simply confirms the user’s choice and serves as an over-sized title, dominates. The line across the bottom of the screen is also problematic — the individual data points are hard to pinpoint, as an adult fingertip can actually touch three at a time and the finger obscures your sight line. (It’s also a repetition of the cover flow above, but with the added, and admittedly useful though inefficiently executed, depiction of the average.)

This disregard for information and usefulness is pretty much the problem with the whole app. At a brisker pace, some screenshots and critiques:

img_0116.PNG

Maps are possibly the most abused data visualization technique out there. To make room for the map, you have to shrink the actual data (team names) to pinpoints. And for what? Spatial relationships? For this particular data set, the map actually creates confusion: if you don’t remember that a team moved, if you simply want to find a team name in that cluster of points in the northeast, or if you don’t consider the renamed Angels to be different teams in the way you think of the Brooklyn Dodgers and LA Dodgers as different.

img_0136.PNG

This is the one that most people praise. It sure looks nifty, and you need code to draw it and do the transition, but why a circle? Usually the stats are listed in proximity and tell a quick story, and are clumped in ways that tell interesting stories. Here, all you have are wedges next to each other, forcing you to spatially assess the relative stats. Worse, they’ve got cumulative stats (number of walks, hits, runs) intermixed with percentage stats (OBP and AVG). Worse than useless, this actually reduces the clarity and usefulness of the data. (The pitching one is a drag too. A standard measure is strike out to walk ratios and you don’t even have the numbers and the shapes that might make for comparison are on opposite sides of the wheel.)

The most difficult thing about the screen is that, at its core, it’s a pie graph. There is a set of wedges indicating some proportional relationship with the other items, further implying that how far out it radiates is an extra dimension. None of these implied relationships, basic knowledge for an adult reader, is delivered on, thus frustrating user expectations.
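To make the point about mixing cumulative stats with percentage stats concrete, here is a minimal Python sketch that keeps the two kinds apart: counting stats go in, rate stats come out. The function names and the sample numbers are hypothetical, made up for illustration. They are not taken from the app or from any real player.

```python
def batting_rates(ab, h, bb, hbp=0, sf=0):
    """Derive rate stats (AVG, OBP) from counting stats.

    AVG = H / AB
    OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    """
    avg = h / ab if ab else 0.0
    plate_chances = ab + bb + hbp + sf
    obp = (h + bb + hbp) / plate_chances if plate_chances else 0.0
    return {"AVG": round(avg, 3), "OBP": round(obp, 3)}


def strikeout_to_walk(so, bb):
    """Pitcher's strikeout-to-walk ratio; None when there are no walks."""
    return so / bb if bb else None


# Hypothetical season line, for illustration only.
rates = batting_rates(ab=500, h=150, bb=50)
kbb = strikeout_to_walk(so=200, bb=50)
print(rates, kbb)
```

Counts and rates live on different scales, so a display would normally group them separately, which is roughly what a box score or stat line already does.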

img_0137.PNG

Quick hit here: the win loss line on the bottom is hard to access (see adult finger stuff above) and the representation makes the Loss look like half of a Win, not the opposite. Compare to a Tuftean style presentation:

sparklinemlbwinloss.png

Here, the color pops for winning and losing streaks, and you get a sense of home and away performance. And, oh yeah, you have summary data at the end of the line.
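As a rough sketch of the idea (not Tufte’s own method, and not how the graphic above was produced), a win-loss sparkline can be approximated in plain text: one tick above the midline for a win, one below for a loss, with the summary record at the end of the line. The function names and the sample results are assumptions made up for illustration.

```python
def win_loss_sparkline(results):
    """Render a string of 'W'/'L' results as a one-line text sparkline."""
    # Wins sit above the midline, losses below it.
    ticks = {"W": "▀", "L": "▄"}
    line = "".join(ticks[r] for r in results)
    wins = results.count("W")
    losses = results.count("L")
    # Summary data goes at the end of the line.
    return f"{line} {wins}-{losses}"


def longest_streak(results, outcome):
    """Length of the longest unbroken run of the given outcome ('W' or 'L')."""
    best = current = 0
    for r in results:
        current = current + 1 if r == outcome else 0
        best = max(best, current)
    return best


# Made-up results, for illustration only (not real game data).
sample = "WWLWLLLWWW"
print(win_loss_sparkline(sample))
print(longest_streak(sample, "L"))
```

Even in this crude form, wins and losses read as opposites rather than as a full bar and a half bar, and streaks show up as unbroken runs on one side of the line.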

img_0138.PNG

I have no idea what they were thinking with this one. And, yes, you can move the bubbles around – and get absolutely nothing out of it.

img_0139.PNG

The most annoying of all. This is the meat of the experience, the play-by-play, the true fan’s recreation of the great moments in baseball. Why a circle? Is there something full-circle about the game? What happens to the already maddening smallness of the lines when you go into extra innings? The metaphor dominates at the expense of the information and the narrative.

It gets weirder when you set it to play:

img_0145.PNG

The weirdness comes in a couple places. First, all of the lines from the earlier screen are made the same length when it turns into a wheel. All you have now is the number of at bats and a preview that the last out is happening. The second is that, by ticking off the plays individually, you actually lose the narrative that would come with the list. In this case, Gary Carter tripled, but there’s no context – how many men on base, how many outs, what’s the score, how long has the pitcher been in? These are the moments that make individual plays dramatic. But, by putting the display at the service of the interface metaphor, the events are reported with no context and no outcome (which a list of plays would have allowed the user to put together, and which newspaper scorecards fill in).

Enough of the hating. A quick note about how good infographics are pleasing to the eye, engage you in conversation and add to the story the data tells.

sparklinesaleast.png

Tufte shows cleaner, higher-resolution versions in his book, but I chose the larger, jaggier one to highlight the point. This graphic shows the pattern of a team’s season: it visualizes surges and slumps, shows tight races, conveys the hopelessness of being a fan of some teams, and it has data.

Pennant is mostly chartjunk. Sad, but maybe there will be other efforts, as the data isn’t exclusive.

The simplest data tells/inspires a story

Tuesday, August 24th, 2010

A colleague (Ed) walked into my office today saying something about “becoming a doctor” as he came through the door. Slow on the uptake, I needed the explanation that this was a reference to Field of Dreams, specifically the scene where Burt Lancaster, playing Moonlight Graham, had to leave the eternal youth of the field to save Kevin Costner’s kid, who was choking on a hot dog. All of which brought to mind the tidbit I had to tell Ed: Moonlight Graham was a real player and the story was true!

picture-21.png

I always assumed that W. P. Kinsella, the author of Shoeless Joe, the novel on which the movie was based, was a baseball nerd who browsed the sadly no-longer-needed Baseball Encyclopedia and found that one line of data that inspired a story.

I owned a copy of Baseball Encyclopedia and got goosebumps when it occurred to me to look up Moonlight Graham and see if he really existed. There it was. This guy got to put on a uniform, get on the official scorecard, maybe even took the field, but didn’t get to bat. Out of that line of zeroes, a string of non-data, really, Kinsella imagined a whole potential person and life story. Dig it.

Video: “Pie charts suck so beware of them”

Friday, January 8th, 2010

Alex Lundry, who, according to a quick Google hit, does a lot of market and political research and is a consultant to the GOP, has a really great Ignite talk about data viz, visual thinking, and some politics.

Data viz in 5 minutes

Tuesday, September 22nd, 2009

Nice Ignite talk (“enlighten us. just be quick about it” by O’Reilly) about the basics of data visualization. The presenter, Matthias Shapiro, gives some nice conceptual frameworks to work with: pick your metric, ask a specific question, choose the dimensions (time, location, network, color).

Kindred thought: Fact Marketing != Data visualization

Tuesday, September 22nd, 2009

Quick post from Michael Surtees has a nice line:

Unfortunately a lot that passes for data visualization isn’t much more than data fire works. It makes an impressive pop but fades into darkness. Entertaining but not really informative.

and some good links to other, true, data viz stuff.

Nice, but is it data visualization?

Tuesday, September 15th, 2009

I like this video (found via Flowing Data) quite a bit, especially the reminder about the fragility of the atmosphere and curve of the horizon line. But is this data visualization or is it fact marketing? Data visualization should take data points and reveal patterns unseeable (or hard to see), or coax a story out of a perceived bunch of noise.

Vizworld just posted an interview with Edward Tufte which is a nice reminder of first principles of data visualization:

Not much has changed since Tufte began offering the Presenting Data And Information lecture years ago, other than a fourth book and a couple of new examples, but not much has to change when the point is returning to the first principles of information design: make wise comparisons, show causality, employ multiple variables and, above all, focus on the content. This point was driven home for me early on in the lecture as I internally formulated a question on one of my favorite topics: “How will the techniques presented in this lecture help me better represent 3d digital cities?” As if my mind had been read, the answer came: “Don’t ask how visualization techniques can help display data. Ask how data can be best represented.”

I like that it’s a statement of positive principles — show causality and comparisons, seek out complexity and richness, etc. — rather than the anti-prescriptions that are often associated with Tufte (avoid chartjunk, eschew Powerpoint).

Evolving the Origin of Species

Thursday, September 10th, 2009

Ben Fry, creator of Processing (or Proce55ing for those who remember) and data viz guru at MIT, has an absolutely fascinating visualization of how Darwin changed the text of “On the Origin of Species” in the thirteen years following its publication.

darwinfry.png

The labels across the top are chapter numbers, the dashes underneath represent text from the book which you can see on mouse-over. The color bars indicate the different editions.

I called it fascinating on first look, but should probably be more measured or specific. I hate when we fail to distinguish between fact illustration (making a single thing visual) and data visualization (revealing previously unseen stories through a rich visual worth looking at several times). This falls somewhere in between. The final state of the chart, after the 6th, and lengthiest, revision does tell a story:

darwinfry2.png

The most obvious part of the narrative is the addition of an entire section and extensive revisions to the final section in the 6th edition, indicating a structural bolstering of the argument and possibly responses to ten years of critique. The speckle patterns, small bits of color, show a lot of tinkering/revising in the first three editions. These all support Fry’s introductory point:

We often think of scientific ideas, such as Darwin’s theory of evolution, as fixed notions that are accepted as finished. In fact, Darwin’s On the Origin of Species evolved over the course of several editions he wrote, edited, and updated during his lifetime

I’m wondering, though, whether this illustration tells the story better than the text?

What does make it fascinating overall is the ability to mouse over the sections (the small gray and colored stripes) and read the text underneath. Might be a better tool (if the stripes were a little bit bigger and easier to mouse over) than it is a data viz.