So who exactly is out there blogging? Well, you've probably got a pretty fair idea of the demographics just from bopping around the blogosphere. But we're gonna be talking real numbers here.
My student, JS, downloaded over 100,000 blogs, all of which have blogger profiles. That's a pretty decent sample. We ran all manner of fun experiments, some of which I'll tell you about.
Here are the facts:
% male bloggers -- 54.3
% female bloggers -- 45.7
Age distribution:
15 and below 7.2%
16-20 20.6%
21-25 18.3%
26-30 9.4%
31-35 4.7%
36-40 2.2%
41-45 1.3%
46-50 0.8%
51-up 1.4%
unreported 34.1%
Yes, it's a kids' game.
I've got the numbers on occupation, too, but they're tedious. It's mostly students, a bunch of techies and a little of everything else.
Now for the fun stuff. The game we were playing is to see if we could rig the computer to check frequencies of various words in a blog and then correctly guess the gender and age of a blogger. Suffice it to say it can be done. There is a measure known as "information gain" which tells you the extent to which the presence or absence of a feature tells you what you want to know. In this case, we want to measure the information gain of words for gender and age determination of bloggers.
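For the curious, here is a rough sketch of how information gain can be computed for a single word. It is only an illustration under assumptions of my own: each blog is reduced to the set of words it contains, the feature is simply whether the word is present, and the toy blogs, labels and function names are all made up rather than taken from our actual experiments.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(docs, labels, word):
    """Entropy of the labels minus the expected entropy after splitting
    the documents on whether `word` appears in them."""
    with_word = [lab for doc, lab in zip(docs, labels) if word in doc]
    without_word = [lab for doc, lab in zip(docs, labels) if word not in doc]
    total = len(labels)
    remainder = ((len(with_word) / total) * entropy(with_word)
                 + (len(without_word) / total) * entropy(without_word))
    return entropy(labels) - remainder

# Toy, made-up data: each blog is the set of words it contains.
toy_blogs = [
    {"cute", "hair", "mom"},
    {"cute", "boyfriend"},
    {"software", "web"},
    {"system", "government", "web"},
]
toy_genders = ["F", "F", "M", "M"]

# "cute" splits the toy blogs perfectly by gender; "mom" only partly.
gain_cute = information_gain(toy_blogs, toy_genders, "cute")
gain_mom = information_gain(toy_blogs, toy_genders, "mom")
```

In this toy set, a word that separates the two groups cleanly gets the maximum gain of one bit, while a word that shows up in only one blog gets much less. Ranking every word by this score is one way to produce lists like the ones in this post.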
The words with highest information gain for gender are the following:
cute
boyfriend
mom
cry
hair
system
software
based
government
web
Do I need to tell you which words indicate which gender? Without getting into gory details, each of the first five is used about three times as often by bloggerettes and the second five in about the same proportion reversed. Is this not embarrassing? Can one even post this list without being accused of something bad?
The words with highest information gain for age are:
lol
im
haha
wanna
bored
girl
office
job
company
women
policy
political
administration
Would you believe that for bloggers in their teens or early twenties, a random sample of 10,000 words will include an average of 30 appearances of the word "girl"? For those above the age of thirty, that number drops to zero. Zilch.
Another interesting example is "bored", which in a random sample of 10,000 words is used 26 times by teens, 9 times by those in their twenties and zero times by those in their thirties. Do people have less and less time to be bored or are they just less inclined to tell the world about it? lol
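If you want to try this sort of count on your own posts, a per-10,000-words rate can be sketched roughly as follows. This is a simplification I'm assuming for illustration: it pools all of a group's text and scales the count, rather than averaging over random 10,000-word samples, and the tokenizer, function name and snippets are made up.

```python
import re
from collections import Counter

def rate_per_10k(texts, word):
    """Occurrences of `word` per 10,000 tokens across all `texts`."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    if not tokens:
        return 0.0
    return 10_000 * Counter(tokens)[word] / len(tokens)

# Made-up snippets standing in for each age group's posts.
toy_posts = {
    "teens": ["so bored lol", "im bored wanna go out"],
    "thirties": ["the office job at the company", "new policy at the office"],
}
rates = {group: rate_per_10k(posts, "bored") for group, posts in toy_posts.items()}
```

With only a handful of words per group, the toy rates are wildly inflated; the point is just the arithmetic of scaling a raw count to a common base of 10,000 words so that groups of different sizes can be compared.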
Fascinating. Especially the boredom thing.
The adults just call it the "glassy-eyed feeling"
Maybe you should also look for "boring" rather than "bored," since Ben Chorin himself wrote:
"Most of these papers are as boring to me as they'd be to you."
implying that his audience is as bored as he is, albeit placing the fault with "boring" external factors.
See, you really should have deleted THAT post. Like I said, it was obnoxious.
Anonymous,
I'm sorry you're paying such careful attention. You must be bored.
In any case, you are right that the sentence you cite was obnoxious. (The rest of the post was fine.) I've edited that sentence.
"(The rest of the post was fine.)"
Now there's an obnoxious comment!
Are you going to take to editing your comments too?
Please publish the age distribution for each gender. Aren't there some interesting differences?
John Dvorak argues that most people lose interest in blogging in less than a year. Please give us some stats on blogs that are currently active and over a year old; might be interesting.
It's been claimed in the past that stay-at-home mothers are a large percentage of all bloggers. I take it you are refuting this notion?
I looked at predicting gender using the word-tagger at http://www.bookblog.net/gender/genie.html . I think that attempts to identify gender tend to fail frequently when the writer is an intelligent woman.
The age-identifiers you've collected simply suggest that multiple generations are blogging, and we really speak different flavors of English; no news here.
But I must admit, I've used the word Girl only once in nearly two years.
You said "Do people have less and less time to be bored or are they just less inclined to tell the world about it?"
I suggest you look at your data. I've used the word 'bored' three times but never about myself:
Bored of the Rings (parody).
But when the kids get bored, we bring out ...
Candyland is a board game that remarkably young children can play. When you get bored, make it a more skilful game.
- The Precision Blogger
http://precision-blogging.blogspot.com
Daniel,
Will provide gender/age breakdown but it'll take some time until I get to it.
haha. It’s funny how many office women will cry if their boyfriend calls them a girl. At the company they’re all about the job and they’re all serious government policy types and have the right political contacts in the administration, but really they’re just about cute hair. lol. They never wanna talk to a guy who’s into web-based system software, cause they think they’ll be bored. Gotta go. I promised mom I’d im her.
Dr. Bean,
It's official. You are an ageless hermaphrodite.
Finally! Some recognition!
Have you spoken with Shlomo Argamon about this? It's his current research area, telling gender of anonymous authors, even identifying which of a pool of possible authors best matches an anonymous work. You can find his email via Google or Google-groups.
The "bored" thing may also relate to TV programs, which can create and spread slang. "Bored now" was popularized via Buffy the Vampire Slayer, an American TV show, known for its creative teen slang. There's a whole study on this, "Slayer Slang", that my wife read.
Thanbo,
Shlomo is a good friend of mine.
Oy, your poor wife. Buy her some decent books.
>Buy her some good books.
Already have. Harold McGee, "The Science of Cooking". Art Spiegelman, "In the Shadow of No Towers". Stephen Gladstone, "Blink". She already has the whole Austen oeuvre on her PDA, to read in spare moments. It's our anniversary this month, doncha know. 24 Adar, and it's a leap year, so we drag it out all month long from one to the other.
>Shlomo is a good friend of mine
Wonder if you also know my cousin Dov K, originally from Bethesda?