Statistical Visualization

Original author: Nathan Yau

Show ratings for 24

The quality of television shows follows all kinds of patterns. Some shows stink in the beginning and slowly gain steam, whereas others are great at first and then lose momentum towards eventual cancellation. Using data from the Global Episode Opinion Survey, Andrew Clark visualized ratings over time for many popular shows in an interactive graphic.

The graph represents the average ranking for the show over time. The red lines indicate changepoints, estimations of when the properties of the time-series, typically the mean changes. The intensity of the plot varies according to the number of respondents. An episode of a show that is favourably rated tends to get more people ranking as do earlier episodes in long-running show.
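
To make the changepoint idea concrete, here is a minimal sketch of finding a single shift in the mean. It is not necessarily the method in Clark's R code; it simply tries every split point and keeps the one where the two segments are best described by their own averages. The function name and the ratings below are made up for illustration.

```python
def best_mean_changepoint(values):
    """Return the index k (1 <= k < len(values)) that splits the series
    into values[:k] and values[k:] with the smallest total squared
    deviation from each segment's own mean."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        left, right = values[:k], values[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        # Cost of describing each segment by a single mean.
        cost = (sum((v - mean_left) ** 2 for v in left)
                + sum((v - mean_right) ** 2 for v in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical episode ratings: a show that drops from the 8s to the 7s.
ratings = [8.4, 8.5, 8.3, 8.6, 7.4, 7.5, 7.3, 7.6]
changepoint = best_mean_changepoint(ratings)
```

This finds only one changepoint and ignores the number of respondents per episode, so treat it as a toy version of what a dedicated changepoint library would do.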

For example, the chart above shows ratings for 24. The ratings started in the 8s and finished in the 7s, which isn't a huge difference really when you compare it to ratings for The Simpsons.

Simpsons

There's a self-selection challenge here. To participate in the GEOS survey, you have to create an account, so there's probably going to be some polarity in the ratings as well as limited sampling for many episodes. So take it all with some salt. Nevertheless, it's fun to poke around and see how your favorite shows changed over time. Most of the ratings matched my expectations.

The R code is available on GitHub if you want to have a go at the data.

Internet map

Upon discovering hundreds of thousands of open embedded devices on the Internet, an anonymous researcher conducted a Census of the Internet, mapping 460 million IP addresses around the world.

While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address usage.

It's a pretty thorough analysis, but the conclusion interested me most:

The why is also simple: I did not want to ask myself for the rest of my life how much fun it could have been or if the infrastructure I imagined in my head would have worked as expected. I saw the chance to really work on an Internet scale, command hundred thousands of devices with a click of my mouse, portscan and map the whole Internet in a way nobody had done before, basically have fun with computers and the Internet in a way very few people ever will. I decided it would be worth my time.

It makes me feel...uneasy. [Thanks, Roger]

Traffic fatalities - alcohol a factor

I made a graphic a while back that showed traffic fatalities over a year. John Nelson expanded on that, pulling five years of data and subsetting by some factors: alcohol, weather, and whether a pedestrian was involved. And he aggregated by time of day and day of week instead of calendar dates.

For example, the graphic above is the breakdown of accidents that involved alcohol. As you might expect, there's a higher count of traffic fatalities during the weekend and late-night hours since people don't have to work the next day. Or you can see when weather is a factor:

Weather a factor

See more breakdowns here.
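
If you want to try the same kind of aggregation yourself, the idea of counting by day of week and hour instead of calendar date can be sketched in a few lines. This is only an illustration, not Nelson's actual process; the function name and the timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_by_weekday_hour(timestamps):
    """Count events per (weekday, hour) cell.

    weekday follows datetime.weekday(): 0 is Monday, 6 is Sunday.
    """
    counts = Counter()
    for ts in timestamps:
        counts[(ts.weekday(), ts.hour)] += 1
    return counts


# Hypothetical crash times, not real data.
crashes = [
    datetime(2012, 1, 7, 23, 30),   # Saturday, 11 p.m. hour
    datetime(2012, 1, 14, 23, 5),   # the following Saturday, same hour
    datetime(2012, 1, 9, 8, 0),     # Monday, 8 a.m. hour
]
grid = count_by_weekday_hour(crashes)
```

Each factor (alcohol, weather, pedestrian) would just be a filter on the records before counting, so the same function gives one grid per subset.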

History of Film

In something of an homage to the Genealogy of Pop & Rock Music by Reebee Garofalo, designer Larry Gormley visualized 100 years of film.

This graphic chronicles the history of feature films from the origins in the 1910s until the present day. More than 2000 of the most important feature-length films are mapped into 20 genres spanning 100 years. Films selected to be included have: won important awards such as the best picture Academy Award; achieved critical acclaim according to recognized film critics; are considered to be key genre films by experts; and/or attained box office success.

Available in print for 34 bones.

The Forest of Advocacy is a series of animations that explores the political contribution patterns among eight organizations, such as Bain Capital, Goldman Sachs, and Harvard Business School.

These visualizations provide a dynamic look at the partisan tilt of giving within organizations. For each organization, individuals are characterized as points sketching out a line over time. The X axis is time, and the Y axis represents the net partisan tilt of contributions over the preceding 6 months. Over the decades, one sees lines sketched out, reflecting the partisanship of individuals over time. For each organization, we also provide the net contributions of the entire organization, and the names of biggest Democratic, Republican, and "bipartisan" contributors (the individual with the highest product of Democratic and Republican contributions).

At the core, each animation is a time series chart, but the aesthetic and the narrated animation provide a more organic feel. In particular, the movements of people, represented by squares shifting straight across or up and down, make it easy to see consistent and not so consistent contributions. [Thanks, Mauro]
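
The "bipartisan" label described above is easy to work out by hand: total each person's Democratic and Republican giving, multiply the two, and take the largest product. Here is a rough sketch under the assumption that contributions arrive as (name, party, amount) records with party coded "D" or "R"; that record layout and the function name are mine, not the project's.

```python
from collections import defaultdict


def bipartisan_contributor(contributions):
    """Return the name with the highest product of Democratic and
    Republican contribution totals.

    contributions: iterable of (name, party, amount), party "D" or "R".
    Records for any other party are ignored in this sketch.
    """
    totals = defaultdict(lambda: {"D": 0.0, "R": 0.0})
    for name, party, amount in contributions:
        if party in ("D", "R"):
            totals[name][party] += amount
    # Someone who gives to only one side has a product of zero.
    return max(totals, key=lambda n: totals[n]["D"] * totals[n]["R"])


# Hypothetical donors, not real data.
records = [
    ("Ann", "D", 1000.0),
    ("Bob", "D", 300.0),
    ("Bob", "R", 200.0),
    ("Cy", "R", 900.0),
    ("Cy", "D", 10.0),
]
top = bipartisan_contributor(records)
```

The product rewards giving that is both large and balanced, which is why a donor who leans heavily to one side scores low even with a big total.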

Movie poster colors, the evolution

We've seen a number of looks at movie poster cliches, but this is the first time I've seen how the color of movie posters has changed over time. Vijay Pandurangan downloaded 35,000 poster thumbnails from a movie site, counted the color pixels in each image, and then grouped them by year and sorted by hue.
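
For a sense of what counting color pixels by hue might look like, here is a small sketch using Python's standard colorsys module. It assumes each poster has already been decoded into a list of (R, G, B) values from 0 to 255; the function name, the bin count, and the saturation cutoff are my own choices, not Pandurangan's.

```python
import colorsys
from collections import Counter


def hue_histogram(pixels, bins=12, min_saturation=0.0):
    """Count pixels per hue bin.

    Bins are centered on multiples of 360/bins degrees, so with 12 bins
    pure red falls in bin 0 and pure blue in bin 8. Pixels whose
    saturation is at or below min_saturation (grays, black, white) are
    skipped because they have no meaningful hue.
    """
    counts = Counter()
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s <= min_saturation:
            continue
        counts[int(h * bins + 0.5) % bins] += 1
    return counts


# Hypothetical pixels: two red, one blue, one gray.
poster = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (128, 128, 128)]
histogram = hue_histogram(poster)
```

Grouping by year would then be a matter of adding up the histograms for every poster released that year and drawing the bins in hue order.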

Some thoughts from Pandurangan's designer friend Cheryle Cranbourne:

The movies whose posters I analysed "cover a good range of genres. Perhaps the colors say less about how movie posters' colors as a whole and color trends, than they do about how genres of movies have evolved. For example, there are more action/thriller/sci-fi [films] than there were 50-70 years ago, which might have something to do with the increase in darker, more 'masculine' shades.”

There's no mention of the blanked out 1924. That must've been a sad year. Oh wait, there were movies during that year, so there was either a massive ink shortage or it's just missing data.

[via @DataPointed]

Recruiters looking at resumes

In a study by TheLadders (n = 30), recruiters looked at resumes and made some judgments. Eye-tracking software recorded the evaluations, and the study found that the recruiters spent about six seconds on a resume looking for six main things: name, current company and title, previous company and title, previous position start and end dates, current position start and end dates, and education. After that, it was a crapshoot.

Beyond these six data points, recruiters did little more than scan for keywords to match the open position, which amounted to a very cursory "pattern matching" activity. Because decisions were based mostly on the six pieces of data listed above, an individual resume’s detail and explanatory copy became filler and had little to no impact on the initial decision making. In fact, the study’s eye tracking technology shows that recruiters spent about 6 seconds on their initial "fit/no fit" decision.

If I ever have to submit a resume, I'm just going to put those six things as bullets and then the rest will all be keywords in small, light print. It'll be like job search SEO.

Update: I've been told that TheLadders' reputation might be less than savory, and a quick search shows some in agreement, so it might be wise to sidestep the service. Instead, go with my awesome six-bullet advice and you're gold.

[via @alexlundry]

Roulette single bet odds

Jay Jacobs has some fun with roulette simulations and explores the odds of winning for different bets. The graphic above shows a simulation of 250 spins, run 20,000 times. Or to put it differently, it's like simulating the play of 20,000 people, who each took 250 spins and always bet on a single number.

I'm not sure why it doesn't start to get red until you're $500 in the hole, but bottom line: the longer you play, the higher the probability that you will lose all your money. That was my main takeaway from Probability 101 in undergrad. The rest is a blur.
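
If you want to run a simulation like this yourself, here is a rough sketch. Jacobs's own code may differ; I'm assuming an American wheel with 38 pockets, a flat $1 bet, and the usual 35-to-1 payout on a single number, none of which is stated above. The function names are made up.

```python
import random


def simulate_player(rng, spins=250, bet=1, pockets=38, payout=35):
    """Net winnings for one player who bets on the same single number
    every spin. pockets=38 assumes an American wheel (0 and 00)."""
    net = 0
    for _ in range(spins):
        # Pocket 0 stands in for the player's number; each pocket is
        # equally likely, so the bet wins with probability 1/pockets.
        if rng.randrange(pockets) == 0:
            net += bet * payout
        else:
            net -= bet
    return net


def simulate_many(players=20000, spins=250, seed=0):
    """Net winnings for many independent players, one value per player."""
    rng = random.Random(seed)
    return [simulate_player(rng, spins=spins) for _ in range(players)]


# A smaller run than the 20,000 players in the graphic, to keep it quick.
results = simulate_many(players=2000, spins=250, seed=1)
share_behind = sum(1 for net in results if net < 0) / len(results)
```

Under these assumptions each $1 bet loses 2/38 of a dollar on average, since you win $35 once in 38 spins and lose $1 the other 37 times. That small edge is what drags the average down the longer people play.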
