User blog:Uknownada/Zipf's Law and One Piece

Hey, Wiki! Nada here. So I've been watching a lot of Vsauce on YouTube lately, and one recent video made me a little curious. It's called The Zipf Mystery, and it goes over Zipf's Law and how it applies not just to our language, but to pretty much everything we do. It's an interesting topic, and I recommend giving the video a watch before reading this blog.

In a nutshell, Zipf's Law is a phenomenon where the words of a given document (book, poem, essay, etc.) follow a roughly consistent pattern: the second-most used word appears about half as often as the most used word, the third-most used word appears about one third as often, and so on. In other words, a word's frequency is roughly proportional to 1 divided by its rank. This pattern is not limited to what's written.

Curious about how this all works, I decided to take the first chapter of One Piece and see how well it applies. To keep things consistent, I used the Viz translation so no confusion with fansubs or anything would come up. Viz has some grammatical errors here, but I'm not worrying about that. What I did was take every word spoken by a character and list them all, just to see how frequently each one appears and whether it matches Zipf's Law at all. Surprisingly... it did.
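
For anybody who'd rather not count by hand, here's a rough sketch of how the tally could be done in Python. This is not how I actually did it, just one possible way. The function names and the sample sentence are made up for illustration, and it assumes the transcript is plain text with straight apostrophes.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count words case-insensitively, keeping contractions like "i'm" whole."""
    # Assumption: straight apostrophes only; curly ones would split the word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)

def ranked(counts):
    """Return (rank, word, count) rows; words with the same count share a rank."""
    rows = []
    rank = 0
    previous = None
    for word, count in counts.most_common():
        if count != previous:
            rank += 1
            previous = count
        rows.append((rank, word, count))
    return rows

# Hypothetical sample text, not a line from the actual chapter.
sample = "You! Are you a pirate? I'm a pirate, you know."
for rank, word, count in ranked(word_frequencies(sample)):
    print(rank, word, count)
```

Tied words share a rank here, which is the same reason eleven words can end up in only 8 ranks.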

http://i1091.photobucket.com/albums/i389/uknownada/20151111111956_zpspv8emepo.png

I was a little too lazy and pressed for time to make my own graph, so I threw one together with a generator on some crappy free website. Here, you'll find eleven of the most frequently used words in Chapter 1 of One Piece (some of them are tied, so there are 8 ranks).

you i ha me i'm just Luffy what pirates your pirate

Interesting looking graph, huh? It just curves right down. But it also seems to line up with Zipf's Law pretty nicely.

"You" is the most used word, at 112 occurrences. Divide this by 2, and you get 56. That's a bit off from 45, which is where "I" is, but not by much. 112 divided by 3 is about 37.333...which is PRETTY close to 38. HA! Divided by 4? 28. By 5? 22.4. By 6? 18.6666... By 7? 16. And by 8? 14.

Now let's compare how close these numbers all are.
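
The division above can also be sketched in a few lines of Python. It only uses the three counts quoted in this post (112, 45, and 38 for the top three ranks); the other ranks just get their expected value, since their counts aren't written out here. The names `zipf_expected` and `observed` are made up for this sketch.

```python
def zipf_expected(top_count, rank):
    """Expected count at a given rank under Zipf's Law: top count divided by rank."""
    return top_count / rank

TOP = 112  # occurrences of "you", the most used word

# Only the counts quoted in the post; ranks 4 to 8 are left out on purpose.
observed = {1: 112, 2: 45, 3: 38}

for rank in range(1, 9):
    expected = zipf_expected(TOP, rank)
    line = f"rank {rank}: expected {expected:.1f}"
    if rank in observed:
        diff = observed[rank] - expected
        line += f", observed {observed[rank]} ({diff:+.1f})"
    print(line)
```

Rank 2 comes out 11 under the prediction, while rank 3 is within 1 of it.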

Obviously it's not a perfect match. Zipf's Law never is. But when looking over everything, it seems pretty close. Closer than I expected. So what do you do with this information? I dunno. Try to prove it wrong. Might not be that hard. Maybe there were some words I missed. I put them all into my subpage in case anybody wants to confirm. The whole transcript of Chapter 1 by Viz Media is there.

Anyways, I'm off. Continue what you were doing before. And as always, thanks for reading.