The raw data behind the story "Al Gore's New Movie Exposes The Big Flaw In Online Movie Ratings" https://fivethirtyeight.com/features/al-gores-new-movie-exposes-the-big-flaw-in-online-movie-ratings/.

ratings

Format

A data frame with 80053 rows representing movie ratings and 27 variables:

timestamp

The date at which the rating was recorded.

respondents

The number of respondents in a category associated with a given timestamp.

category

The subgroups of respondents differentiated by demographics like gender, age, and nationality.

link

The website associated with a given category's responses.

average

The average rating reported by a given category.

mean

The mean rating reported by a given category.

median

The median rating reported by a given category.
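As a rough illustration of how a mean and a median relate to the per-rating vote counts described below, the sketch here expands a set of made-up counts into individual ratings. The vector `toy_counts` is hypothetical, and treating the dataset's `mean` and `median` columns as unweighted statistics over the votes is an assumption, not something this page confirms.

```r
# Hypothetical vote counts for ratings 1..10 (made-up numbers, not from the dataset).
toy_counts <- c(10, 0, 0, 0, 5, 5, 10, 20, 20, 30)

# Expand the counts into one entry per vote: ten 1s, five 5s, and so on.
toy_votes <- rep(1:10, times = toy_counts)

# Unweighted mean and median over all individual votes.
mean_rating <- mean(toy_votes)
median_rating <- median(toy_votes)

print(c(mean = mean_rating, median = median_rating))
```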

votes_1

The count of votes denoting a rating of one that respondents gave.

votes_2

The count of votes denoting a rating of two that respondents gave.

votes_3

The count of votes denoting a rating of three that respondents gave.

votes_4

The count of votes denoting a rating of four that respondents gave.

votes_5

The count of votes denoting a rating of five that respondents gave.

votes_6

The count of votes denoting a rating of six that respondents gave.

votes_7

The count of votes denoting a rating of seven that respondents gave.

votes_8

The count of votes denoting a rating of eight that respondents gave.

votes_9

The count of votes denoting a rating of nine that respondents gave.

votes_10

The count of votes denoting a rating of ten that respondents gave.

pct_1

The percentage of votes denoting a rating of one that respondents gave.

pct_2

The percentage of votes denoting a rating of two that respondents gave.

pct_3

The percentage of votes denoting a rating of three that respondents gave.

pct_4

The percentage of votes denoting a rating of four that respondents gave.

pct_5

The percentage of votes denoting a rating of five that respondents gave.

pct_6

The percentage of votes denoting a rating of six that respondents gave.

pct_7

The percentage of votes denoting a rating of seven that respondents gave.

pct_8

The percentage of votes denoting a rating of eight that respondents gave.

pct_9

The percentage of votes denoting a rating of nine that respondents gave.

pct_10

The percentage of votes denoting a rating of ten that respondents gave.
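The relationship between the `votes_*` and `pct_*` columns can be sketched as follows. The one-row data frame `toy` is hypothetical, and whether the dataset's percentages were computed or rounded exactly this way is an assumption.

```r
# Hypothetical one-row data frame with the same votes_1..votes_10 column names
# as the dataset (the counts are made up).
vote_cols <- paste0("votes_", 1:10)
toy <- as.data.frame(as.list(setNames(c(10, 0, 0, 0, 5, 5, 10, 20, 20, 30), vote_cols)))

# Total number of votes in each row.
total <- rowSums(toy[vote_cols])

# Share of each rating as a percentage of the row total.
pct <- 100 * toy[vote_cols] / total
names(pct) <- paste0("pct_", 1:10)

print(pct)
```

Within each row, the percentages computed this way sum to 100.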

Source

IMDb http://www.imdb.com/title/tt6322922/ratings; see also https://github.com/fivethirtyeight/data/tree/master/inconvenient-sequel

Examples

# To convert data frame to tidy data (long) format, run:
library(dplyr)
library(tidyr)
library(stringr)
ratings_tidy <- ratings %>%
  gather(votes, count, -c(timestamp, respondents, category, link, average, mean, median)) %>%
  arrange(timestamp)
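The gather() call above stacks every non-excluded column, so the `votes_*` counts and the `pct_*` percentages end up in the same `count` column. One possible alternative, sketched here on a small hypothetical data frame (`toy`, with only two rating levels and made-up values), uses tidyr's `pivot_longer()` to keep counts and percentages in separate columns. This is an illustrative variation, not the package's documented example.

```r
library(tidyr)

# Hypothetical miniature of the dataset: two rows, two rating levels, made-up values.
toy <- data.frame(
  timestamp = c("2017-07-17 12:00:00", "2017-07-17 12:10:00"),
  category  = c("Males", "Females"),
  votes_1   = c(30, 5),
  votes_2   = c(10, 15),
  pct_1     = c(75, 25),
  pct_2     = c(25, 75)
)

# Split each column name at "_": the first part ("votes" or "pct") becomes an
# output column via the ".value" sentinel, and the second part becomes `rating`.
toy_tidy <- pivot_longer(
  toy,
  cols = matches("^(votes|pct)_"),
  names_to = c(".value", "rating"),
  names_sep = "_",
  names_transform = list(rating = as.integer)
)

print(toy_tidy)
```

The result has one row per timestamp, category, and rating level, with `votes` and `pct` as separate columns.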