This post is a blatant ripoff of Rick Turoczy’s Silicon Florist word cloud, but that’s how memes work, right? I made two different 2009 word clouds for Netninja: one for just titles and one for full post content. I’ll get into the “how” after the images.
This first image is of just titles. I guess you can infer that I did a lot of talking about Portland and Ignite, talking about gadgets (iPhone, Kindle, Apple), and blogs (LiveJournal, WordPress, Netninja).
The second image is for post content and contains a lot more mundane words. A few words carry over from titles, like Kindle and Portland. Apparently, I, like, use the word like a little, like, too much.
To generate the word map for post content took two steps. First, I had to extract the text from the MySQL server that my WordPress installation is using. I have a shell account to my web server, so all it took was a simple(-ish) Unix command. If you only have access your database through PHPMyAdmin or other GUI tool, you may have to figure out how to do this a different way. The command I used was:
mysql -uUSERNAME -p -hDBHOST --batch --skip-column-names -e 'select post_content from wp_posts where post_date between "2009-01-01" and "2010-01-01" and post_status="publish" and post_type="post"' DBNAME | perl -pne 's/<.*?>/ /g' | perl -pne 's/\x5Cn/ /g' > post_content_2009.txt
Obviously, replace your username, server name, and database name where appropriate. Then type your database password when prompted. The content of all of your 2009 posts ends up in post_content_2009.txt.
Specifically, what this does is:
- Extracts the post content from the database between the dates specified (1 January 2009 at 00:00 and 1 January 2010 at 00:00)
- Only public posts, not pages, drafts, or revisions
- Strips out HTML tags
- Strips out the literal string “\n” (representative of newlines)
- Puts the result in post_content_2009.txt
You can then open post_content_2009.txt in a text editor, select all, copy, and paste the text into Wordle and play with fonts, colors, and cloud shapes.
You can do a similar thing, but with only post titles by using a similar command (replacing “content” with “title” in the above command):
mysql -uUSERNAME -p -hDBHOST --batch --skip-column-names -e 'select post_title from wp_posts where post_date between "2009-01-01" and "2010-01-01" and post_status="publish" and post_type="post"' DBNAME | perl -pne 's/<.*?>/ /g' | perl -pne 's/\x5Cn/ /g' > post_title_2009.txt