
WikiPub: WikiLeaks on your eBook reader

I read. I read a lot. I do not necessarily read as fast as the folks who can power through a book in a weekend, but I get a lot of enjoyment out of it nonetheless. Part of this reading occurs during my daily commute.

Recently, I managed to grab a copy of the WikiLeaks cablegate archive. It’s pretty easy if you know where to look and know how to operate BitTorrent. I wanted to read the cables in the mornings, but ran into a few snags. When downloading a mirror of the cables, you get a whole bunch of HTML files that are formatted for a desktop browser. I do most of my reading on the iPhone and only occasionally switch to the iPad (if I’m not juggling a travel mug, messenger bag, and transit pass). It was easy enough to load the zip into GoodReader, extract the raw HTML, and browse it within that app, but it really wasn’t a good reading experience on the small screen due to formatting.

What I really wanted was something I could load into iBooks. I wanted something I could bookmark, underline, and search that was nicely formatted and word-wrapped for the screen. I wanted WikiLeaks as an ePub file. That’s where this project comes in.

Overview

WikiPub is a Ruby script that will take a WikiLeaks archive, scan it, and convert it into an ePub that you can load into your book-reader of choice.
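For the curious, here is a rough sketch of what the final packaging step of any ePub build can look like when it leans on the "zip" app. This is not lifted from WikiPub itself. The method names are hypothetical, and the OEBPS folder name is only a common convention that I'm assuming here. The one firm rule comes from the ePub container format: the mimetype entry has to be the first file in the archive and must be stored uncompressed.

```ruby
# Hypothetical sketch, not WikiPub's actual code.
# Assumption: build_dir already contains the unpacked book as
#   mimetype, META-INF/ and OEBPS/ (OEBPS is a conventional name).

# Returns the two zip invocations needed to package an ePub.
def epub_zip_commands(epub_path)
  [
    # The mimetype entry goes in first, stored without compression (-0)
    # and without extra file attributes (-X).
    ["zip", "-X0", epub_path, "mimetype"],
    # Everything else is added recursively (-r) with maximum compression (-9)
    # and without separate directory entries (-D).
    ["zip", "-Xr9D", epub_path, "META-INF", "OEBPS"]
  ]
end

# Runs both commands from inside build_dir; returns true only if both succeed.
def build_epub(build_dir, epub_path)
  absolute = File.expand_path(epub_path)
  epub_zip_commands(absolute).all? do |command|
    system(*command, chdir: build_dir)
  end
end
```

Keeping the command list separate from the code that runs it makes it easy to print the commands and check them by hand before anything touches the disk.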

Screenshots

The ePub can be read on your iPhone or iPad, or any other device that can handle that format. Do keep in mind that the text is quite lengthy, so iBooks can take a little while to index and paginate it the first time you open it.

Requirements

The first big requirement is Ruby on a Unix-like system (Linux, OS X, etc.) and enough command-line experience to not be afraid of opening a terminal window, navigating to a folder, and running a few commands.

The other big requirement is that you have downloaded and extracted the WikiLeaks cablegate archive. WikiPub does not contain, download, or otherwise provide the actual cables. You’re on your own for that. WikiPub only converts the cables you’ve already obtained into a bookreader-friendly format.

The specific technical requirements are listed in the readme.txt file contained in the archive. You’ll need the “tidy” and “zip” apps, but your OS probably already has them installed.
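If you’d like to confirm that both apps are installed before starting a long run, a few lines of Ruby will do it. This is a minimal sketch and not part of WikiPub; the helper names are made up for illustration, and it only looks for executables on your PATH.

```ruby
# Hypothetical helper, not part of WikiPub: checks whether the external
# tools mentioned above can be found on the PATH.

# True if an executable file called `name` exists in any PATH directory.
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Returns the subset of `names` that could not be found.
def missing_tools(names, path = ENV.fetch("PATH", ""))
  names.reject { |name| executable_on_path?(name, path) }
end

missing = missing_tools(%w[tidy zip])
if missing.empty?
  puts "tidy and zip were both found."
else
  puts "Not found on PATH: #{missing.join(', ')}"
end
```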

Operating

  1. Open a terminal window
  2. Change directories to the source folder
  3. Run: “./wikipub.rb /path/to/your/cablegate/files”, substituting the appropriate path to your extracted cablegate archive
  4. Wait. Depending on the speed of your computer, and whether you’ve previously run this script, you might be waiting an hour. You’ll see a progress indicator such as “Parsing 242 of 1095 ...”
  5. Your output files will be in the current folder, with names like “wikileaks-2010.epub” (one per year). You can copy these to your book reader, drag them into iTunes, or load them however you normally consume your epub files.
  6. MOST IMPORTANTLY: Share the epub with your friends. Not everybody has Ruby command-line mojo. Share it with the people in your life who don’t know how to turn a cablegate archive into an epub with this script.
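To give a feel for how a pile of cable files turns into one ePub per year, here is a small illustrative sketch. It is not the script’s real code. It assumes the extracted archive stores each cable at a path shaped like cable/YYYY/MM/NAME.html, and both the method names and the single-file name “wikileaks.epub” are hypothetical.

```ruby
# Hypothetical sketch, not WikiPub's actual code.
# Assumption: each cable lives at a path like .../cable/YYYY/MM/NAME.html.

# Extracts the four-digit year directory from a cable path, or nil.
def year_of(path)
  match = path.match(%r{(?:\A|/)(\d{4})/\d{2}/[^/]+\.html\z})
  match && match[1].to_i
end

# Maps output file names to the cable paths that belong in each one.
# With split: false, everything lands in a single (hypothetically named) file.
def output_names(paths, split: true)
  dated = paths.select { |path| year_of(path) }
  return { "wikileaks.epub" => dated.sort } unless split

  dated
    .group_by { |path| year_of(path) }
    .sort
    .map { |year, files| ["wikileaks-#{year}.epub", files.sort] }
    .to_h
end
```

Files that don’t match the assumed layout (an index page, for example) are simply skipped.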

Options

  • You can add “--nomono” as a command line flag to convert the document from a monospaced font to a proportional one. This conversion is still considered beta. Due to quirks in formatting in the original document, sometimes newlines get misinterpreted as paragraph breaks and sometimes too-close-together paragraphs get misinterpreted as one continuous paragraph.
  • You can add “--nosplit” if you want one humongous ePub file with all cables; otherwise you will get one file per year.
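The proportional-font quirks are easier to picture with code. The sketch below is a deliberately naive take on the idea, not the script’s actual logic, and the method name is invented: it treats blank lines as paragraph breaks and joins every other line break with a space.

```ruby
# Hypothetical sketch of reflowing preformatted text, not WikiPub's code.
# Heuristic: one or more blank lines separate paragraphs; single line
# breaks inside a paragraph are treated as soft wraps.
def paragraphs_from(pre_text)
  pre_text
    .gsub(/\r\n?/, "\n")          # normalize line endings
    .split(/\n\s*\n/)             # blank line(s) mark a paragraph break
    .map { |block| block.split("\n").map(&:strip).reject(&:empty?).join(" ") }
    .reject(&:empty?)
end
```

A heuristic like this fails in exactly the two ways described above: a stray blank line in the middle of a paragraph splits it in two, and paragraphs separated only by indentation run together into one.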

Other Notes

Kindles prefer mobi files instead of ePub files. If you want to read the WikiLeaks cables on your Kindle, you will need to convert them. This can be done for free with an app like Calibre, or for a small charge by emailing them to the special email address attached to your Amazon account.
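If you go the free route and have several yearly files to convert, Calibre ships a command-line tool called ebook-convert that takes an input file and an output file. Here is a minimal Ruby sketch that builds that command. The method name is made up, and it assumes ebook-convert is installed and on your PATH.

```ruby
# Hypothetical helper: builds the Calibre command that converts an ePub
# to a mobi file with the same base name.
# Assumption: Calibre's ebook-convert tool is installed and on the PATH.
def mobi_command(epub_path)
  unless epub_path =~ /\.epub\z/i
    raise ArgumentError, "expected a path ending in .epub: #{epub_path}"
  end

  mobi_path = epub_path.sub(/\.epub\z/i, ".mobi")
  ["ebook-convert", epub_path, mobi_path]
end

# To actually run the conversion for every ePub in the current folder:
#   Dir.glob("*.epub").each { |epub| system(*mobi_command(epub)) }
```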

Current Version

  • WikiPub-1.1.zip: defaults to split files (one per year) rather than one giant file, adds option for proportional font

Previous Versions

5 thoughts on “WikiPub: WikiLeaks on your eBook reader”

  1. Great program, but I don’t like to read the cables in fixed width font, and iBooks won’t allow me to change the font. Is there a way to work around this?

    1. I may throw that in (as an option) in the next version. It’ll make the processing time go up because I can’t just use the “pre” sections verbatim. I’ll have to convert line breaks to paragraphs and such. It shouldn’t be too difficult to do, but is definitely more of a next-weekend project due to this weekend being pretty busy.

  2. Thanks, looking forward to it!

    By the way, another welcome addition would be a way to create an ebook just with a restricted number of wikileaks rather than with the whole bunch of them. This way, every week or so, one could download the newest ones, compile them into an ebook, and add them to one’s library.
