LRF to HTML: The Rough Guide

As of this writing, calibre, which converts between many formats and ships with command-line tools, does not convert LRF to HTML or, indeed, to much of anything other than LRS, an XML format. This is not currently a high-priority fix for calibre itself, because calibre is aimed at converting things to LRF. (The ePub conversion is still relatively new and shiny.)

ETA: Here’s the LRS specification.

So. Heck. Why not. I’m using Ruby, by the way, because Ruby has the kick-ass REXML library, which also forms the cornerstone for my ruby-epub stuff (still in the making).

The scope of this code: extremely basic. It should be run on the LRS file produced by calibre’s lrf2lrs utility. The finer details of calibre’s LRS are skipped over, and there are some hacks. It is smart enough to cope with strange formatting, though not with illegal formatting.
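
For a feel of what the style step involves, here is a minimal sketch using REXML. It is not the script itself: the sample XML, the `TextStyle`, `objid` and `align` names, and the head/center/foot to left/center/right mapping are all assumptions about what lrf2lrs emits, and `styles_from` is a made-up helper name.

```ruby
require 'rexml/document'

# Assumed LRS fragment; real lrf2lrs output is far larger and may differ.
SAMPLE_LRS = <<~XML
  <BBeBXylog version="1.0">
    <Style>
      <TextStyle objid="208" align="center"/>
      <TextStyle objid="221" align="foot"/>
      <TextStyle objid="230" align="head"/>
    </Style>
  </BBeBXylog>
XML

# Assumption: "head" and "foot" mean start and end of line (left and right
# for left-to-right text). Unknown values are passed through untouched.
ALIGN_TO_CSS = { 'head' => 'left', 'center' => 'center', 'foot' => 'right' }.freeze

# Hypothetical helper: build a hash of style id => CSS declaration string.
def styles_from(xml)
  doc = REXML::Document.new(xml)
  styles = {}
  doc.elements.each('//TextStyle') do |el|
    align = el.attributes['align']
    next if align.nil?
    styles[el.attributes['objid']] = "text-align: #{ALIGN_TO_CSS.fetch(align, align)}; "
  end
  styles
end
```

Calling `styles_from(SAMPLE_LRS)` gives a hash keyed by style id, the same general shape as the `Styles:` line in the sample run.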

But basically it does this:

prompt% ./lrs2html AnExampleBook.lrs
Parsing XML
Done parsing
Attributes: #
Processing Styles
Styles: {"208"=>"text-align: center; ", "220"=>"text-align: center; ", "209"=>"text-align: center; ", "221"=>"text-align: foot; ", "210"=>"text-align: center; ", "213"=>"text-align: center; ", "214"=>"text-align: center; "}
Processing Pages (Sections)
Procesing text for Section 0
Title: An Example Book
Processed section An Example Book
Procesing text for Section 1
Title: Chapter 1: In the Beginning
Processed section Chapter 1: In the Beginning
Procesing text for Section 2
Title: Chapter 2: Flowering
Processed section Chapter 2: Flowering
Procesing text for Section 3
Title: Chapter 3: Autumn
Processed section Chapter 3: Autumn
Creating directory An_Example_Book
Writing sections
Writing 'An Example Book' to title.html
Writing 'Chapter 1: In the Beginning' to section-01.html
Writing 'Chapter 2: Flowering' to section-02.html
Writing 'Chapter 3: Autumn' to section-03.html
Writing TOC
DONE
prompt% ls An_Example_Book
section-01.html
section-02.html
section-03.html
title.html
toc.html
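
The naming scheme visible in that run (a directory named after the book title, `title.html` for section 0, zero-padded `section-NN.html` after that) could be sketched roughly like this. The helper names are hypothetical, and the exact character-replacement rule is a guess based on the single example above.

```ruby
# Hypothetical helper: turn a book title into a directory name.
# Assumption: every run of non-alphanumeric characters becomes one underscore.
def directory_name(title)
  title.strip.gsub(/[^A-Za-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')
end

# Hypothetical helper: section 0 is the title page, the rest are numbered.
def section_filename(index)
  index.zero? ? 'title.html' : format('section-%02d.html', index)
end

puts directory_name('An Example Book')
(0..3).each { |i| puts section_filename(i) }
```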

So, here’s the code, which is cheerily commented as always. You might want to download the file, since there’s a control character that WordPress wisely does not allow me to post.

Code Download

5 thoughts on “LRF to HTML: The Rough Guide”

  1. I get error:

    /usr/bin/lrs2html.rb:22:in `require': no such file to load -- FileUtils (LoadError)
    from /usr/bin/lrs2html.rb:22

    I’m not familiar with ruby. Running on ubuntu karmic. Any idea what to do?

  2. Hello eris23,

    It’s probably a capitalization issue; I forget when Ruby switched from ‘FileUtils.rb’ to ‘fileutils.rb’ in the base library. Try changing ‘FileUtils’ to ‘fileutils’ in the require on line 22 of the script.

  3. Thanks for this script … it’ll give me a starting point to work with …

    One thing I did notice is that when the argument is missing, it doesn’t abort execution like it should, which might confuse Ruby newbies a bit.

  4. Landon,

    You’re welcome. And yes, good point. I didn’t pay enough attention to that sort of thing.

    Veeery nice gravatar, by the way.
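
Picking up the two points raised in the comments, here is a minimal sketch of a lowercase `require` plus an argument check. It is not taken from the script: `lrs_path_from` is a hypothetical helper, and the usage message is invented for illustration.

```ruby
require 'fileutils' # lowercase file name; the module itself is still FileUtils

# Hypothetical helper: return the LRS path from the argument list,
# or raise ArgumentError with a message suitable for showing the user.
def lrs_path_from(argv)
  path = argv.first
  raise ArgumentError, 'usage: lrs2html FILE.lrs' if path.nil? || path.empty?
  raise ArgumentError, "no such file: #{path}" unless File.file?(path)

  path
end

# In the script itself, something along these lines would stop early:
#
#   begin
#     path = lrs_path_from(ARGV)
#   rescue ArgumentError => e
#     abort(e.message)
#   end
```

Raising from the helper and calling `abort` only at the top level keeps the check easy to test without ending the process.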
