Some time ago I modified a Chrome extension to track the time spent on individual web pages. Over the past year it has recorded 80K visits to different pages on 3982 domains, together with the time actively spent on each of them.

Most pages are visited for a very short time, while there is a significant number of pages on which much more time is spent.

The histogram corresponding to the first few minutes shows a transition to a power law distribution around a few seconds.

The distribution is more complicated but still compatible with a power law. The plot of the number of pages on which a time longer than \(t\) is spent has a power law tail consistent with the histogram.
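The count of pages on which a time longer than \(t\) is spent can be sketched in a few lines. This is only an illustration: the function name `ccdf` and the sample durations are assumptions, not the code or data behind the plots.

```python
from bisect import bisect_right


def ccdf(times):
    """For each distinct time t, count the visits that lasted longer than t."""
    ordered = sorted(times)
    n = len(ordered)
    points = []
    for t in sorted(set(ordered)):
        # bisect_right counts the values <= t, so the remainder is strictly longer.
        points.append((t, n - bisect_right(ordered, t)))
    return points


# Hypothetical visit durations, for illustration only.
sample = [1, 2, 2, 3, 5, 8, 13, 40, 120, 900]
for t, count in ccdf(sample):
    print(t, count)
```

Plotting these pairs on log-log axes turns a power law tail into a straight line, which makes it easy to compare with the histogram.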

The pure power law distribution is often called the Pareto distribution. Moreover, the fact that the number \(n\) of most visited sites is proportional to the inverse of the typical time, \(\frac{1}{t}\), is known as Zipf’s law. This law also states that the time spent on the \(i\)-th most visited site decreases as \(\frac{1}{i}\).
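For a pure Pareto tail, the number of visits longer than \(t\) falls off as \(t^{-\alpha}\), and \(\alpha = 1\) corresponds to the \(\frac{1}{t}\) behaviour. A rough sketch of how the exponent could be estimated is given here, using the standard maximum likelihood estimator \(\hat{\alpha} = n / \sum_i \ln(t_i / t_{min})\). The function name, the cutoff `t_min` and the sample values are assumptions made for illustration; they are not taken from the recorded data.

```python
import math


def pareto_tail_index(times, t_min):
    """Maximum likelihood estimate of alpha for P(T > t) = (t_min / t)**alpha."""
    # Only the tail at or above the cutoff is assumed to follow the power law.
    tail = [t for t in times if t >= t_min]
    log_sum = sum(math.log(t / t_min) for t in tail)
    if log_sum == 0:
        raise ValueError("need at least one time strictly above t_min")
    return len(tail) / log_sum


# Hypothetical durations; t_min is a cutoff picked by eye, not a fitted value.
sample = [3, 4, 6, 9, 15, 30, 70, 200, 900]
print(pareto_tail_index(sample, t_min=3))
```

An estimate close to 1 would be in line with Zipf’s law, while the result depends noticeably on where the cutoff is placed.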

Something very similar happens if the times are accumulated for individual domains.
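Accumulating the times per domain could look roughly like the sketch below. It assumes that each record is a `(url, time)` pair, which is a guess about how the extension stores its data, and the helper name `time_per_domain` is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def time_per_domain(visits):
    """Sum the time of all visits per host name, largest total first."""
    totals = defaultdict(float)
    for url, duration in visits:
        # Fall back to the raw string when no host name can be parsed.
        domain = urlsplit(url).hostname or url
        totals[domain] += duration
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical records, for illustration only.
visits = [
    ("https://www.bbc.com/news", 120),
    ("https://en.wikipedia.org/wiki/Zipf%27s_law", 300),
    ("https://www.bbc.com/sport", 45),
]
for rank, (domain, total) in enumerate(time_per_domain(visits)):
    print(rank, domain, total)
```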

The time versus rank plot shows that there are a few sites on which a long time is spent. In fact I seem to spend 95% of the time on fewer than 5% of the sites. This is similar to the less extreme 80%-20% distribution known as the Pareto principle. This rule has been found to hold in many contexts. Indeed, most of the internet traffic comes from a few pages, most of the Covid infections are related to a few patients and 80% of women appreciate only 20% of men (and probably vice-versa). The original 80%-20% rule observed by Pareto for wealth distribution must be significantly more extreme these days.
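A figure such as "95% of the time on fewer than 5% of the sites" can be checked with a short calculation. The sketch below is a minimal version with made-up totals; the function name `top_share` and the sample numbers are assumptions, not the recorded data.

```python
import math


def top_share(times, fraction):
    """Share of the total time that falls on the top `fraction` of the sites."""
    total = sum(times)
    if total <= 0:
        raise ValueError("total time must be positive")
    ordered = sorted(times, reverse=True)
    # Always keep at least one site, and round the number of sites up.
    k = max(1, math.ceil(fraction * len(ordered)))
    return sum(ordered[:k]) / total


# Hypothetical per-site totals: one dominant site and four minor ones.
sample = [80, 5, 5, 5, 5]
print(top_share(sample, 0.2))
```

With these made-up totals the top 20% of the sites (a single site) account for 80% of the time, which is the classic Pareto split.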

The list of the 20 domains on which most of my time was spent shows that I read too much news and search compulsively for books I will never read and movies I will never have time to watch. The first three sites nicely summarize my activity in front of the computer: reading news, working on programming projects and watching movies.

    domain                  time
 0  feedly.com            925365
 1  jupyter notebook      316436
 2  www.netflix.com       234897
 3  www.goodreads.com     147447
 4  TV series             139312
 5  en.wikipedia.org      112632
 6  www.youtube.com       101562
 7  HomeAssistant          99961
 8  www.google.com         91973
 9  mail.google.com        86839
10  leetcode.com           82419
11  www.hotnews.ro         80254
12  www.bbc.com            77420
13  comics                 72836
14  boingboing.net         57888
15  freethoughtblogs.com   54147
16  Audiobooks             50952
17  www.nature.com         48181
18  Ebooks                 46664
19  www.mega-image.ro      36204