In celebration of moving my downloads over to WP DownloadManager, I decided to release Epub versions of Mike and Psmith and Psmith in the City. ETA: And also Psmith, Journalist. For more about Psmith, see my Kindle-licious series.
I’ve written this warning a few times before, but I might as well do it again:
Warning Warning Warning
The above texts are public domain only in the United States, anywhere with a Berne-convention-style copyright that expires 25 years after author death, and anywhere else without copyright laws.
If you live anywhere else, especially in Canada, Mexico, the United Kingdom, Ireland, every country in continental Europe, almost every country in Asia, South America, and Africa—these are not legal for you to download, read, read aloud, print, or store on a computer or server unless it happens to be housed in the United States, etc etc etc. ((If you think this is ridiculous, join the club. I’m not against copyright in general—far from it—but the man’s been dead for over 25 years.))
For more information, see Copyright and Wodehouse.
End Warning
I wrote a few scripts to make the Epub process a snap for those of us working by hand. I’m not ready to release them, but here’s an example session (warning: extreme geek):
[Thu Dec 18, 1:56PM]-(s000)-{~/Documents/Publications/demo}
tufor% create-epub Wodehouse_P.G.-Mike_and_Psmith
Title?
Mike and Psmith
Entering new creator: role [aut|ill|...]?
aut
Last name?
Wodehouse
First name, middle initial, etc?
P. G.
Add another creator [y/n]?
n
Book rights?
Public Domain in the United States
Book license?
[Thu Dec 18, 1:57PM]-(s000)-{~/Documents/Publications/demo}
tufor% ls
Wodehouse_P.G.-Mike_and_Psmith/
[Thu Dec 18, 1:57PM]-(s000)-{~/Documents/Publications/demo}
tufor% ls Wodehouse_P.G.-Mike_and_Psmith
META-INF/ content/ metadata.opf mimetype toc.ncx
[Thu Dec 18, 1:57PM]-(s000)-{~/Documents/Publications/demo}
tufor% cd Wodehouse_P.G.-Mike_and_Psmith
[Thu Dec 18, 1:59PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% ls content
stylesheet.css
[Thu Dec 18, 1:59PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% ls META-INF
container.xml
stylesheet.css is a small one for Adobe Digital Editions. metadata.opf and toc.ncx are the base versions seen in the epub tutorial.
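For readers who want to try something similar, here is a minimal sketch of the scaffolding step only. It is not the create-epub script itself: the function name scaffold_epub is made up, the interactive prompts are left out, and metadata.opf, toc.ncx, and stylesheet.css are created empty rather than filled from templates. The container.xml contents assume the layout shown above, with metadata.opf sitting in the root directory.

```ruby
require "fileutils"

# container.xml tells a reading system where the package file lives.
# Here it points at metadata.opf in the root, matching the listing above.
CONTAINER_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
      <rootfile full-path="metadata.opf" media-type="application/oebps-package+xml"/>
    </rootfiles>
  </container>
XML

# Hypothetical stand-in for the scaffolding part of create-epub.
def scaffold_epub(book_dir)
  FileUtils.mkdir_p(File.join(book_dir, "META-INF"))
  FileUtils.mkdir_p(File.join(book_dir, "content"))
  # Written without a trailing newline in this sketch.
  File.write(File.join(book_dir, "mimetype"), "application/epub+zip")
  File.write(File.join(book_dir, "META-INF", "container.xml"), CONTAINER_XML)
  # Placeholders: a real script would write template contents here.
  FileUtils.touch(File.join(book_dir, "metadata.opf"))
  FileUtils.touch(File.join(book_dir, "toc.ncx"))
  FileUtils.touch(File.join(book_dir, "content", "stylesheet.css"))
  book_dir
end
```

Calling scaffold_epub("Wodehouse_P.G.-Mike_and_Psmith") would produce the same directory layout as the ls output above.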
At this point, I write Ruby scripts to split the HTML file into chapters and to add the Project Gutenberg header/footer and the preface. (To the person who downloaded this file before it was completely ready: you’re missing these bits.) I make prodigious use of sed and vim to mass-correct any mistakes that epubcheck complains about. ((Mac OS X is cool because it comes with Ruby, sed, and vim pre-installed. It even has the Ruby Gem framework. And Rails. Rock on, Apple!))
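The splitting scripts vary from book to book, but the general idea can be sketched in a few lines. This is only an illustration under a big assumption: that every chapter in the source HTML starts with an h2 heading. The function name split_chapters is hypothetical, and wrapping each chunk in a full XHTML document is left out.

```ruby
# Split one long HTML body into chapters, keyed by output file name.
# Assumes (hypothetically) that each chapter opens with an <h2> heading.
def split_chapters(body_html)
  # The zero-width lookahead splits *before* each <h2>, so every heading
  # stays attached to the chapter it introduces.
  chunks = body_html.split(/(?=<h2\b)/i)
  # Anything before the first heading (front matter) is dropped here.
  chapters = chunks.select { |chunk| chunk =~ /\A<h2\b/i }
  chapters.each_with_index.map do |chunk, index|
    [format("chapter%02d.html", index + 1), chunk]
  end.to_h
end
```

The zero-padded names (chapter01.html, chapter02.html, and so on) keep the files in reading order when sorted alphabetically.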
So after all that, the content/ directory looks like this:
tufor% ls content/
chapter01.html chapter13.html chapter25.html
chapter02.html chapter14.html chapter26.html
chapter03.html chapter15.html chapter27.html
chapter04.html chapter16.html chapter28.html
chapter05.html chapter17.html chapter29.html
chapter06.html chapter18.html chapter30.html
chapter07.html chapter19.html gutenberg-footer.html
chapter08.html chapter20.html gutenberg-header.html
chapter09.html chapter21.html preface.html
chapter10.html chapter22.html stylesheet.css
chapter11.html chapter23.html title.html
chapter12.html chapter24.html
And now we get to the fun part: updating the manifest. Not at all tedious:
[Thu Dec 18, 2:05PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% ls
META-INF/ metadata.opf toc.ncx
content/ mimetype
[Thu Dec 18, 2:06PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% update-manifest
Skipping over ncx item
Generating items for manifest
content/chapter01.html: chapter01 application/xhtml+xml
Will add chapter01 to spine
content/chapter02.html: chapter02 application/xhtml+xml
Will add chapter02 to spine
content/chapter03.html: chapter03 application/xhtml+xml
Will add chapter03 to spine
content/chapter04.html: chapter04 application/xhtml+xml
Will add chapter04 to spine
content/chapter05.html: chapter05 application/xhtml+xml
Will add chapter05 to spine
content/chapter06.html: chapter06 application/xhtml+xml
Will add chapter06 to spine
content/chapter07.html: chapter07 application/xhtml+xml
Will add chapter07 to spine
content/chapter08.html: chapter08 application/xhtml+xml
Will add chapter08 to spine
content/chapter09.html: chapter09 application/xhtml+xml
Will add chapter09 to spine
content/chapter10.html: chapter10 application/xhtml+xml
Will add chapter10 to spine
content/chapter11.html: chapter11 application/xhtml+xml
Will add chapter11 to spine
content/chapter12.html: chapter12 application/xhtml+xml
Will add chapter12 to spine
content/chapter13.html: chapter13 application/xhtml+xml
Will add chapter13 to spine
content/chapter14.html: chapter14 application/xhtml+xml
Will add chapter14 to spine
content/chapter15.html: chapter15 application/xhtml+xml
Will add chapter15 to spine
content/chapter16.html: chapter16 application/xhtml+xml
Will add chapter16 to spine
content/chapter17.html: chapter17 application/xhtml+xml
Will add chapter17 to spine
content/chapter18.html: chapter18 application/xhtml+xml
Will add chapter18 to spine
content/chapter19.html: chapter19 application/xhtml+xml
Will add chapter19 to spine
content/chapter20.html: chapter20 application/xhtml+xml
Will add chapter20 to spine
content/chapter21.html: chapter21 application/xhtml+xml
Will add chapter21 to spine
content/chapter22.html: chapter22 application/xhtml+xml
Will add chapter22 to spine
content/chapter23.html: chapter23 application/xhtml+xml
Will add chapter23 to spine
content/chapter24.html: chapter24 application/xhtml+xml
Will add chapter24 to spine
content/chapter25.html: chapter25 application/xhtml+xml
Will add chapter25 to spine
content/chapter26.html: chapter26 application/xhtml+xml
Will add chapter26 to spine
content/chapter27.html: chapter27 application/xhtml+xml
Will add chapter27 to spine
content/chapter28.html: chapter28 application/xhtml+xml
Will add chapter28 to spine
content/chapter29.html: chapter29 application/xhtml+xml
Will add chapter29 to spine
content/chapter30.html: chapter30 application/xhtml+xml
Will add chapter30 to spine
content/gutenberg-footer.html: gutenberg-footer application/xhtml+xml
Will add gutenberg-footer to spine
content/gutenberg-header.html: gutenberg-header application/xhtml+xml
Will add gutenberg-header to spine
content/preface.html: preface application/xhtml+xml
Will add preface to spine
content/stylesheet.css: stylesheet text/css
content/title.html: title application/xhtml+xml
Will add title to spine
Adding chapter01 to spine
Adding chapter02 to spine
Adding chapter03 to spine
Adding chapter04 to spine
Adding chapter05 to spine
Adding chapter06 to spine
Adding chapter07 to spine
Adding chapter08 to spine
Adding chapter09 to spine
Adding chapter10 to spine
Adding chapter11 to spine
Adding chapter12 to spine
Adding chapter13 to spine
Adding chapter14 to spine
Adding chapter15 to spine
Adding chapter16 to spine
Adding chapter17 to spine
Adding chapter18 to spine
Adding chapter19 to spine
Adding chapter20 to spine
Adding chapter21 to spine
Adding chapter22 to spine
Adding chapter23 to spine
Adding chapter24 to spine
Adding chapter25 to spine
Adding chapter26 to spine
Adding chapter27 to spine
Adding chapter28 to spine
Adding chapter29 to spine
Adding chapter30 to spine
Adding gutenberg-footer to spine
Adding gutenberg-header to spine
Adding preface to spine
Adding title to spine
[Thu Dec 18, 2:06PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor%
I have to edit metadata.opf a little bit, to reorder stuff in the spine (in particular, I move title.html to the top, followed by gutenberg-header.html and preface.html), but other than that I can safely regenerate the metadata every time. Even if I add a new file, it won’t disturb the order of existing items on the spine, and will append new items at the end.
Let’s watch.
[Thu Dec 18, 2:09PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% cp content/title.html content/new-thing.html
[Thu Dec 18, 2:09PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% ls content
chapter01.html chapter13.html chapter25.html
chapter02.html chapter14.html chapter26.html
chapter03.html chapter15.html chapter27.html
chapter04.html chapter16.html chapter28.html
chapter05.html chapter17.html chapter29.html
chapter06.html chapter18.html chapter30.html
chapter07.html chapter19.html gutenberg-footer.html
chapter08.html chapter20.html gutenberg-header.html
chapter09.html chapter21.html new-thing.html
chapter10.html chapter22.html preface.html
chapter11.html chapter23.html stylesheet.css
chapter12.html chapter24.html title.html
[Thu Dec 18, 2:09PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% update-manifest
Skipping over ncx item
Generating items for manifest
... lots of manifest generation output ...
Will add title to spine
title already in spine
gutenberg-header already in spine
preface already in spine
chapter01 already in spine
chapter02 already in spine
chapter03 already in spine
chapter04 already in spine
chapter05 already in spine
chapter06 already in spine
chapter07 already in spine
chapter08 already in spine
chapter09 already in spine
chapter10 already in spine
chapter11 already in spine
chapter12 already in spine
chapter13 already in spine
chapter14 already in spine
chapter15 already in spine
chapter16 already in spine
chapter17 already in spine
chapter18 already in spine
chapter19 already in spine
chapter20 already in spine
chapter21 already in spine
chapter22 already in spine
chapter23 already in spine
chapter24 already in spine
chapter25 already in spine
chapter26 already in spine
chapter27 already in spine
chapter28 already in spine
chapter29 already in spine
chapter30 already in spine
gutenberg-footer already in spine
Adding new-thing to spine
Fun!
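The spine-preserving behaviour is simple enough to sketch. The following is not the update-manifest script, just a small illustration of the same idea with made-up function names (manifest_items, merge_spine). It assumes only .html and .css files are present and that ids are the file names without their extensions, as in the output above.

```ruby
# Media types by extension, matching the types printed in the session.
MEDIA_TYPES = {
  ".html" => "application/xhtml+xml",
  ".css"  => "text/css"
}.freeze

# Build one manifest entry per file: id, href, and media type.
def manifest_items(paths)
  paths.sort.map do |path|
    ext = File.extname(path)
    { id: File.basename(path, ext), href: path, media_type: MEDIA_TYPES.fetch(ext) }
  end
end

# Leave the existing spine order untouched and append any XHTML items
# that are not in the spine yet. Stylesheets never go into the spine.
def merge_spine(existing_spine, items)
  xhtml_ids = items.select { |item| item[:media_type] == "application/xhtml+xml" }
                   .map { |item| item[:id] }
  existing_spine + (xhtml_ids - existing_spine)
end
```

With an existing spine of title, gutenberg-header, preface, chapter01 and a newly added content/new-thing.html, merge_spine returns the same four ids followed by new-thing, which is the behaviour seen in the session.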
The next sticking point is the table of contents, toc.ncx. I set up my script so that it generates the play order from the spine order in metadata.opf, using the <title> in the header of each HTML file as the navigation point label.
Let’s watch.
[Thu Dec 18, 2:12PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% update-ncx
[Thu Dec 18, 2:14PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% head -30 toc.ncx
Mike and Psmith
Mike and Psmith
Project Gutenberg Front Matter
Preface
Unlike metadata.opf, toc.ncx is currently regenerated from scratch each time, so any hand edits to it are lost.
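As a rough illustration of how navigation points can be derived from the spine, here is a sketch that is not the update-ncx script. It assumes each label comes from the title element of the corresponding HTML file, and it takes the spine as a plain array of id and href pairs instead of parsing metadata.opf. The function names html_title and nav_points are hypothetical.

```ruby
# Pull the text of the <title> element out of an HTML document.
# Regex-based: fine for simple, well-formed files, not a real HTML parser.
def html_title(html)
  match = html.match(%r{<title[^>]*>(.*?)</title>}mi)
  match ? match[1].strip : nil
end

# Build navPoint elements in spine order.
# spine:  array of [id, href] pairs, already in reading order
# titles: hash mapping href to the label text, copied as-is
#         (text taken from XHTML is assumed to be XML-escaped already)
def nav_points(spine, titles)
  spine.each_with_index.map do |(id, href), index|
    <<~XML
      <navPoint id="#{id}" playOrder="#{index + 1}">
        <navLabel><text>#{titles.fetch(href)}</text></navLabel>
        <content src="#{href}"/>
      </navPoint>
    XML
  end.join
end
```

Because playOrder is just the position in the spine plus one, reordering the spine in metadata.opf and regenerating is enough to renumber the whole table of contents.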
Now, update-manifest and update-ncx leave backup files behind them:
tufor% ls
META-INF/ metadata.opf.20081218-14951-560400
content/ mimetype
metadata.opf toc.ncx
metadata.opf.20081218-141150-490432 toc.ncx.20081218-141435-146263
metadata.opf.20081218-14659-942229
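A backup step along these lines takes only a few lines of Ruby. This is a sketch with a hypothetical backup_file helper, not the code from the scripts. The suffix format (date, time, microseconds) is a guess based on the file names above, and unlike some of those names this version always zero-pads the time.

```ruby
require "fileutils"

# Copy a file to "<name>.<date>-<time>-<microseconds>" before it gets
# overwritten, and return the backup path. The timestamp is injectable
# so the naming can be checked without depending on the clock.
def backup_file(path, now = Time.now)
  # %6N is the fractional second, truncated to six digits (microseconds).
  backup_path = "#{path}.#{now.strftime('%Y%m%d-%H%M%S-%6N')}"
  FileUtils.cp(path, backup_path)
  backup_path
end
```

Calling backup_file("metadata.opf") just before rewriting the file gives the kind of trail shown in the listing.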
It would be nice not to include everything in the root directory (after all, that’s why content/ is there: to leave less junk floating around the metadata files). So I created a script that compiles the epub and leaves out the excess files. It also adds the mimetype file to the zip archive first and uncompressed (zip’s “stored” method rather than deflate), and names the epub file after the current directory.
[Thu Dec 18, 2:15PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% compile-epub
Creating Wodehouse_P.G.-Mike_and_Psmith.epub
adding: mimetype (stored 0%)
adding: META-INF/container.xml (deflated 35%)
adding: content/chapter01.html (deflated 57%)
adding: content/chapter02.html (deflated 54%)
adding: content/chapter03.html (deflated 55%)
adding: content/chapter04.html (deflated 60%)
adding: content/chapter05.html (deflated 60%)
adding: content/chapter06.html (deflated 56%)
adding: content/chapter07.html (deflated 56%)
adding: content/chapter08.html (deflated 56%)
adding: content/chapter09.html (deflated 59%)
adding: content/chapter10.html (deflated 57%)
adding: content/chapter11.html (deflated 59%)
adding: content/chapter12.html (deflated 57%)
adding: content/chapter13.html (deflated 53%)
adding: content/chapter14.html (deflated 59%)
adding: content/chapter15.html (deflated 55%)
adding: content/chapter16.html (deflated 57%)
adding: content/chapter17.html (deflated 56%)
adding: content/chapter18.html (deflated 58%)
adding: content/chapter19.html (deflated 56%)
adding: content/chapter20.html (deflated 60%)
adding: content/chapter21.html (deflated 60%)
adding: content/chapter22.html (deflated 59%)
adding: content/chapter23.html (deflated 53%)
adding: content/chapter24.html (deflated 59%)
adding: content/chapter25.html (deflated 56%)
adding: content/chapter26.html (deflated 56%)
adding: content/chapter27.html (deflated 59%)
adding: content/chapter28.html (deflated 59%)
adding: content/chapter29.html (deflated 62%)
adding: content/chapter30.html (deflated 56%)
adding: content/gutenberg-footer.html (deflated 64%)
adding: content/gutenberg-header.html (deflated 41%)
adding: content/preface.html (deflated 48%)
adding: content/stylesheet.css (deflated 46%)
adding: content/title.html (deflated 43%)
adding: metadata.opf (deflated 84%)
adding: toc.ncx (deflated 80%)
[Thu Dec 18, 2:18PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% unzip -lv Wodehouse_P.G.-Mike_and_Psmith.epub
Archive: Wodehouse_P.G.-Mike_and_Psmith.epub
Length Method Size Ratio Date Time CRC-32 Name
-------- ------ ------- ----- ---- ---- ------ ----
21 Stored 21 0% 12-18-08 13:57 eff6d13b mimetype
234 Defl:X 153 35% 12-18-08 13:57 eadbee10 META-INF/container.xml
11781 Defl:X 5103 57% 12-18-08 14:05 82252b1b content/chapter01.html
8323 Defl:X 3808 54% 12-18-08 14:05 3c097760 content/chapter02.html
8532 Defl:X 3858 55% 12-18-08 14:05 1eb7a219 content/chapter03.html
12635 Defl:X 5105 60% 12-18-08 14:05 51858951 content/chapter04.html
14013 Defl:X 5645 60% 12-18-08 14:05 e82ddc17 content/chapter05.html
10943 Defl:X 4864 56% 12-18-08 14:05 1d06481f content/chapter06.html
11325 Defl:X 4983 56% 12-18-08 14:05 5b852aef content/chapter07.html
11615 Defl:X 5111 56% 12-18-08 14:05 59254e00 content/chapter08.html
13873 Defl:X 5621 60% 12-18-08 14:05 a3a9cb1e content/chapter09.html
9105 Defl:X 3878 57% 12-18-08 14:05 98bd768c content/chapter10.html
18743 Defl:X 7636 59% 12-18-08 14:05 3b4cd0bb content/chapter11.html
10291 Defl:X 4396 57% 12-18-08 14:05 1986e66a content/chapter12.html
7656 Defl:X 3585 53% 12-18-08 14:05 0c6835e2 content/chapter13.html
10073 Defl:X 4174 59% 12-18-08 14:05 fe43b8a3 content/chapter14.html
10050 Defl:X 4572 55% 12-18-08 14:05 48951a2b content/chapter15.html
13029 Defl:X 5568 57% 12-18-08 14:05 c3442a78 content/chapter16.html
7180 Defl:X 3155 56% 12-18-08 14:05 ee262e95 content/chapter17.html
11163 Defl:X 4657 58% 12-18-08 14:05 bdb9bf24 content/chapter18.html
11613 Defl:X 5052 57% 12-18-08 14:05 4e3ed93b content/chapter19.html
12792 Defl:X 5133 60% 12-18-08 14:05 8bb75279 content/chapter20.html
11435 Defl:X 4596 60% 12-18-08 14:05 f3c5c1f1 content/chapter21.html
13212 Defl:X 5381 59% 12-18-08 14:05 523813d8 content/chapter22.html
6565 Defl:X 3060 53% 12-18-08 14:05 ce2bfe78 content/chapter23.html
11961 Defl:X 4868 59% 12-18-08 14:05 51731ede content/chapter24.html
9273 Defl:X 4073 56% 12-18-08 14:05 38cba99d content/chapter25.html
12833 Defl:X 5621 56% 12-18-08 14:05 18f1dda1 content/chapter26.html
10913 Defl:X 4420 60% 12-18-08 14:05 aabfda85 content/chapter27.html
12606 Defl:X 5166 59% 12-18-08 14:05 5dc98847 content/chapter28.html
16818 Defl:X 6375 62% 12-18-08 14:05 08bd089f content/chapter29.html
10767 Defl:X 4739 56% 12-18-08 14:05 991a2d3d content/chapter30.html
21245 Defl:X 7684 64% 12-18-08 14:05 0b1e8638 content/gutenberg-footer.html
1186 Defl:X 705 41% 12-18-08 14:05 b2803314 content/gutenberg-header.html
3023 Defl:X 1567 48% 12-18-08 14:05 450a9aa7 content/preface.html
270 Defl:X 147 46% 12-18-08 13:57 395f8d82 content/stylesheet.css
566 Defl:X 322 43% 12-18-08 14:05 944003e1 content/title.html
4827 Defl:X 792 84% 12-18-08 14:11 b4ddcbb0 metadata.opf
6022 Defl:X 1185 80% 12-18-08 14:14 9a2cc700 toc.ncx
-------- ------- --- -------
378512 156779 59% 39 files
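The packaging step boils down to two zip invocations: one that stores mimetype uncompressed as the first entry, and one that deflates everything else. Here is a minimal sketch of that idea, assuming the Info-ZIP zip command is on the PATH. It is not the compile-epub script; the function names are made up, and the list of entries is hard-coded to the layout used in this post, which is also what keeps the backup files out of the archive.

```ruby
# Build the two zip command lines for an epub.
def zip_commands(epub_name, entries)
  [
    # -X: leave out extra file attributes; -0: store, no compression.
    ["zip", "-X0", epub_name, "mimetype"],
    # -r: recurse into directories; -9: best compression;
    # -D: do not add separate entries for directories.
    ["zip", "-Xr9D", epub_name, *entries]
  ]
end

# Package book_dir into "<directory name>.epub" inside that directory.
def compile_epub(book_dir = Dir.pwd)
  epub_name = "#{File.basename(book_dir)}.epub"
  entries = ["META-INF", "content", "metadata.opf", "toc.ncx"]
  Dir.chdir(book_dir) do
    # Start fresh so stale entries from an earlier build do not linger.
    File.delete(epub_name) if File.exist?(epub_name)
    zip_commands(epub_name, entries).each do |command|
      system(*command) or raise "zip failed: #{command.join(' ')}"
    end
  end
  epub_name
end
```

Running the mimetype command first matters because zip appends to an existing archive, so the stored mimetype entry stays at the front, as in the unzip listing above.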
And then there’s running epubcheck, which I wrapped in a script that sets up the Java classpath properly:
[Thu Dec 18, 2:18PM]-(s000)-{..lications/demo/Wodehouse_P.G.-Mike_and_Psmith}
tufor% epubcheck Wodehouse_P.G.-Mike_and_Psmith.epub
No errors or warnings detected
All told, not counting the time I’m spending copying and pasting into this blog post, and assuming that content is prepared (it rarely is), the process takes at most a couple minutes if you type slowly.
Much fun. The scripts are pretty rough, so unless you are slightly geeky, you won’t like them.