I often use the average ratings of books/movies from Goodreads/IMDB to decide what to read/watch next. Unfortunately, when an item has only a few ratings its average is often inflated, and the average seems to decrease as the number of ratings grows. Plots of two datasets, one of books and one of movies, show that the story is slightly more complicated.

Recently I used the Goodreads API to obtain my list of books marked as read on the site. For each of those books I also scraped the other books that Goodreads lists as similar, resulting in a list of some 30K books. My plan was to devise a recommendation algorithm. Unfortunately, including the average rating turns out to give poor results, mainly because there are many books with a very high average rating which are not very interesting. These books also have a low number of ratings. Therefore we need a method to quantify how much the average rating can be trusted.
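One common way to discount averages that rest on few ratings is a Bayesian (shrinkage) average, which pulls each average toward a global mean with a pull that weakens as ratings accumulate. This is not the method used for the plots in this post, only a minimal sketch of the idea; the function name, `prior_mean`, `prior_weight`, and the two example books are made up for illustration.

```python
def shrunk_rating(avg_rating, n_ratings, prior_mean=3.8, prior_weight=50):
    """Pull an average rating toward prior_mean.

    prior_mean and prior_weight are illustrative assumptions: prior_weight
    acts like a number of "phantom" ratings sitting at prior_mean.
    """
    return (prior_weight * prior_mean + n_ratings * avg_rating) / (prior_weight + n_ratings)


# Hypothetical books: (title, average rating, number of ratings)
books = [
    ("obscure title", 4.9, 12),
    ("popular title", 4.3, 25000),
]

# Rank by the shrunk rating instead of the raw average.
ranked = sorted(books, key=lambda b: shrunk_rating(b[1], b[2]), reverse=True)
for title, avg, n in ranked:
    print(f"{title}: raw={avg:.2f}, shrunk={shrunk_rating(avg, n):.2f}, n={n}")
```

With these made-up numbers the obscure book's 4.9 shrinks to roughly 4.0, while the popular book's 4.3 barely moves, so the popular book ranks first. How strongly to shrink (`prior_weight`) is a tuning choice rather than something the data above determines.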

A plot of the number of ratings vs. the average rating shows a surprising pattern.

There seems to be a tendency for books rated by many people to have lower average ratings, but only in the interval 4.2-5. Below 4.2, books with more ratings have a higher average rating.
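A trend like this can also be checked numerically by grouping books into bins of average rating and looking at the typical number of ratings in each bin. The following is a minimal sketch, assuming the data is available as `(avg_rating, n_ratings)` pairs; the function name, the bin width, and the sample values are hypothetical and not taken from the dataset.

```python
from collections import defaultdict
from statistics import median


def median_count_by_rating_bin(items, bin_width=0.2):
    """Map each rating bin (by its lower edge) to the median number of ratings.

    Works in hundredths of a star so that bin edges such as 4.2 are not
    shifted by floating-point rounding.
    """
    step = round(bin_width * 100)
    bins = defaultdict(list)
    for avg_rating, n_ratings in items:
        bin_start = (round(avg_rating * 100) // step) * step / 100
        bins[bin_start].append(n_ratings)
    return {b: median(counts) for b, counts in sorted(bins.items())}


# Made-up sample: (average rating, number of ratings)
sample = [(4.25, 100), (4.3, 300), (4.9, 10), (3.5, 50)]
summary = median_count_by_rating_bin(sample)
print(summary)
```

If the pattern described above holds, the medians should rise with the bin edge up to about 4.2 and fall after it. The median is used here rather than the mean because rating counts tend to be heavily skewed by a few very popular titles.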

As one can also find many poor movies with very high ratings, I decided to try the same plot for a list of some 80K movies.

The shape is different, but in the lower right corner of the plot one could imagine there is a class of movies for which the average vote decreases with the number of votes.

In order to explain these plots I propose the following mechanism. Each book/movie accumulates ratings at a rate that depends on its current average rating: the higher the average, the more new ratings it attracts. This would explain the positive slope on the left side. In addition, most books/movies start with over-inflated ratings and are downgraded as more people read/watch them. These books/movies, still overvalued, are responsible for the negative slope on the right side. The difference between the two plots could be explained by a larger network effect for movies: people are more influenced by the crowd when deciding what movie to watch or how to rate it.
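The proposed mechanism can be explored with a toy simulation. The sketch below is only one possible reading of it, and every number in it is an assumption: ratings are on a 1-5 scale, the chance of attracting a new rating at each step grows linearly with the current average, and the first `early_count` ratings are inflated by `early_bias` stars.

```python
import random


def simulate_item(true_quality, steps=200, early_bias=1.0, early_count=20, rng=random):
    """Simulate one book/movie; return (number of ratings, average rating).

    All parameters are illustrative assumptions, not fitted to any data.
    """
    total, n = 0.0, 0
    for _ in range(steps):
        # An unrated item is treated as if it had a perfect average.
        avg = total / n if n else 5.0
        # Higher current average -> higher chance of attracting a new rating.
        if rng.random() < (avg - 1.0) / 4.0:
            # Early raters are assumed to be overly generous.
            bias = early_bias if n < early_count else 0.0
            rating = min(5.0, max(1.0, rng.gauss(true_quality + bias, 0.7)))
            total += rating
            n += 1
    return n, total / n


rng = random.Random(0)
for quality in (2.0, 3.0, 4.0, 4.5):
    n, avg = simulate_item(quality, rng=rng)
    print(f"true quality {quality}: {n} ratings, average {avg:.2f}")
```

Running many such items with varied `true_quality` and `steps` and plotting the number of ratings against the average would show whether this mechanism can produce both slopes. A stronger dependence of the rating probability on the current average would be one way to mimic the larger network effect suggested for movies.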