Of course you would produce statistics! This is not entirely scientific because the A/S/L data is not verified in any way, but as the dataset is beefy it will be perfect for cool charts. First of all, the biggest surprise came from something I thought was a natural law: women have worse passwords than men. This is not the case in this dataset, by a margin of about 3 Markov levels, which is a lot.
Let’s start with the year of birth distribution: click here. As you can see, the dataset is large, and the distribution looks nice. There are spikes at the multiples of ten. I’m not sure there is a lot to comment on here, but it gets much more interesting in the next paragraph.
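For readers who want to reproduce this kind of chart, here is a minimal sketch of how the counts and the "multiple of ten" spikes could be measured. The function names and the sample years are made up for illustration; this is not the code used for the chart.

```python
from collections import Counter

def birth_year_histogram(years):
    """Count how many accounts claim each year of birth."""
    return Counter(years)

def round_year_share(years):
    """Fraction of accounts whose claimed birth year is a multiple of ten."""
    years = list(years)
    if not years:
        return 0.0
    return sum(1 for y in years if y % 10 == 0) / len(years)

# Made-up sample, not taken from the actual dataset.
sample_years = [1980, 1980, 1983, 1990, 1991, 1970, 1985]
hist = birth_year_histogram(sample_years)
share = round_year_share(sample_years)
print(sorted(hist.items()))
print(f"share of round years: {share:.2f}")
```

With honest answers, roughly one year in ten would be a multiple of ten, so a share well above 0.1 is a hint that people are picking round numbers.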
What about computing the average password strength per age class? It is here, and shows huge differences between (claimed) age classes. What is not surprising is that the people who lie about their age (born before 1920) have better passwords than the large majority of users. What is surprising is that the possibly younger people (born after the 90s) have stronger passwords.
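The per-age-class computation can be sketched as follows, assuming each record is a (claimed birth year, strength) pair and that age classes are decades. Both the record layout and the decade bucketing are assumptions made for this example, and the strength values are invented.

```python
from collections import defaultdict

def mean_strength_by_decade(records):
    """Map each birth decade to the mean strength of its passwords.

    records: iterable of (birth_year, strength) pairs (hypothetical layout).
    """
    totals = defaultdict(lambda: [0.0, 0])  # decade -> [sum, count]
    for year, strength in records:
        decade = (year // 10) * 10
        totals[decade][0] += strength
        totals[decade][1] += 1
    return {d: s / n for d, (s, n) in sorted(totals.items())}

# Invented values, only to show the shape of the result.
sample_records = [(1915, 40.0), (1975, 30.0), (1978, 32.0), (1992, 36.0)]
by_decade = mean_strength_by_decade(sample_records)
for decade, mean in by_decade.items():
    print(decade, mean)
```

Swapping the mean for a median would make the result less sensitive to a few very strong or very weak passwords in small age classes.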
Oh, and the geographical distribution of password strength: rocks! If somebody knows of a software package that would let me produce that kind of map, I am interested.
EDIT: actually there is a reason why I cracked so many passwords: I wrote code to attack the “short” ones, and it produced a lot of false positives. This code was applied to all the MD5 hashes, which explains the high success rate.