
    • efji, Oct 26th 2011 (edited)
    If you like stats :

    I downloaded the 1625 shots in FF between Aug 24 (#176473) and Sep 24 (#182451). For these shots, the ratings and solve counts are final.

    Mean value of the ratings = 7.785

    Here is the graph of ratings versus number of solves. Note the strong correlation!
    A shot having only 1 solve has an expected rating of 7.2 while it is above 8 for a shot with 1000+ solves.



    I have also checked some popular tags and computed the mean rating of the shots tagged with the following words:

    animated : 7.85 (104 shots = 6.4%)
    b/w: 7.80 (257 shots = 15.8%)
    nudity: 7.76 (71 shots = 4.4%)
    boobs: 7.69 (26 shots = 1.6%)
    gore: 7.53 (29 shots = 1.8%)

    It means that animated and b/w movies are popular, while nudity, sex and gore do not help a shot reach FF :)
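
    A minimal sketch of how per-tag figures like the ones above could be computed. The record layout (a "rating" number and a "tags" list per shot) and the function name tag_summary are assumptions for illustration, and the sample records are made up, not the downloaded data.

```python
from statistics import mean

def tag_summary(shots, needle):
    """Return (mean rating, matching count, share in %) for shots
    that have at least one tag containing `needle`, or None if no shot matches."""
    hits = [s for s in shots if any(needle in tag.lower() for tag in s["tags"])]
    if not hits:
        return None
    share = 100.0 * len(hits) / len(shots)
    return mean(s["rating"] for s in hits), len(hits), share

# Made-up records for illustration only; the field names are hypothetical.
shots = [
    {"rating": 8.0, "tags": ["animation", "b/w"]},
    {"rating": 7.0, "tags": ["gore"]},
    {"rating": 9.0, "tags": ["animated"]},
    {"rating": 6.0, "tags": ["nudity"]},
]

avg, count, share = tag_summary(shots, "anim")
print(f"anim: {avg:.2f} ({count} shots = {share:.1f}%)")
```

    The share is simply the matching count divided by the total, so 104 matching shots out of 1625 gives 6.4%.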
    • Fireball, Oct 26th 2011
    Nice stuff efji, really great work ! Much respect to you for putting so much dedication into this.

    However, I personally think you should have checked the tag "animation", because it is more commonly used than "animated".

    http://whatthemovie.com/search?t=tag&q=animated

    http://whatthemovie.com/search?t=tag&q=animation

    I also partly disagree about the nudity shots: in most cases they instantly get around 2-5 favs (unless they are completely cheap shots). This often helps to get your shots into FF as well.

    Maybe the number of favs should be taken into consideration as well to make this work more significant, what do you think?
    • efji, Oct 26th 2011
    @Fireball

    Thanks for your comments. It is not such a tough job, just a few lines of script :)

    Actually I checked the string "anim", so it includes "animated" and "animation" (and maybe "animal" too...).
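
    To illustrate what a substring check catches, here is a small sketch. The function name tags_matching and the tag list are hypothetical; this is not the script that was actually used.

```python
def tags_matching(substring, tags):
    """Return the tags that contain `substring`, ignoring case."""
    needle = substring.lower()
    return [t for t in tags if needle in t.lower()]

# Hypothetical tag list for illustration only.
sample_tags = ["animated", "animation", "animal", "b/w", "gore"]

matches = tags_matching("anim", sample_tags)
print(matches)  # ['animated', 'animation', 'animal']
```

    Matching on "animat" instead would keep "animated" and "animation" while leaving out "animal".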

    Yes, I could check the number of favs too. No doubt that "boobs" gives you a big bunch of additional favs.
    • RDPL55, Oct 26th 2011
    Excellent job... Between 10° and 101 I recognized all my shots... :)
    • fungus, Oct 28th 2011 (edited)
    Very interesting efji. Being indeed very interested in statistics, I wonder if you could also show the t-value of your regression. It seems there is heteroskedasticity in the data, meaning the variance is much higher on the left side of your graph than on the right side. So it's possible the correlation isn't statistically significant.
    • efji, Oct 31st 2011 (edited)
    You are right @fungus. It should be considered.
    For the above regression t=16.
    If I exclude very low and very high ratings (<6 and >9), which correspond to awesome uploads that I cannot detect automatically, then I get this plot, with t=19 and an even stronger correlation between log(number of solves) and ratings.
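
    For reference, a rough sketch of how a t-value like this can be obtained for a simple linear regression of rating on log10(number of solves). It uses the standard identity t = r * sqrt((n - 2) / (1 - r^2)), which equals the t-statistic of the slope in a one-variable least-squares fit. The function name and the sample numbers are made up for illustration, and the sketch assumes the points are not perfectly collinear.

```python
import math

def correlation_t_value(xs, ys):
    """Return (r, t): the Pearson correlation of xs and ys and the t-statistic
    of the slope of the least-squares line through them (n - 2 degrees of freedom)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t

# Made-up (solves, rating) pairs for illustration; not the real data.
solves = [1, 10, 100, 1000, 10000]
ratings = [7.2, 7.5, 7.6, 7.9, 8.1]

log_solves = [math.log10(s) for s in solves]
r, t = correlation_t_value(log_solves, ratings)
print(f"r = {r:.3f}, t = {t:.1f}")
```

    Note that this plain t-value assumes constant variance; with heteroskedastic data, robust standard errors would be the more careful choice.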