Thinking about Print

Every once in a while I get mail from someone asking for a downloadable version of one or another of the “books” on the MDC wiki. Since last March, I’ve gotten this request less than a dozen times, but it’s something I occasionally sit and try to bend my brain around.

There are many questions involved.

The first (and key) question is: Is there enough demand for a downloadable version of, say, the JavaScript 1.5 Reference, to invest a lot of time/effort into producing it? I don’t know the answer to this at the moment, so I’ll ask you: how often do you find yourself in a situation where you want to look something up but can’t because you’re offline and can’t access the MDC wiki?

That aside, the real difficulty begins. Wikis are, by their very nature, malleable. The MDC wiki content changes daily with new edits, corrections, additions, page moves, and deletions; in fact, I think the wiki has changed several times every day since it was first launched last March. This is a very powerful feature of the wiki: it’s easy to make changes and improvements, and so people do. I might be mistaken, but I’m fairly certain that the MDC currently hosts the most up-to-date JavaScript reference in existence (warts and all).

The complicated bit here is that in order to maintain a completely up-to-date downloadable/printable version of any particular collection of content within the wiki, the process of generating that content would have to be wholly automated. One would think that computers would be good at that sort of thing, but evidence appears to show otherwise. I’m certainly not the first or smartest person to think about this problem, and as far as I can tell every other project started towards a solution has been abandoned well before completion.

(Aside: if you know of a current and/or complete project that does what I’m talking about, be it wiki->PDF or wiki->docbook or wiki->xml, please send me a note.)

The good folk over at the Hula project have sort of addressed this issue with their Single Page Administration Guide (warning: it’s a long page…164 pages when saved to PDF). Using wiki includes, they’ve simply collated all the disparate pages into a single long page, which you can then print or save to PDF or what-have-you.
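
For the curious, the trick looks roughly like this (the page names here are invented, and the exact transclusion syntax depends on the wiki engine; this sketch uses the MediaWiki-style `{{:Page}}` form):

```
{{:AdminGuide/Installation}}
{{:AdminGuide/Configuration}}
{{:AdminGuide/MessageStore}}
```

Each include pulls the named page’s content inline, so the composite page always reflects the latest edits to the individual pages.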

This include-everything-in-a-single-page trick is an OKish solution, in that it does let people get a copy of the content they can use offline. But there are problems. The table of contents has no page numbers. The page section headings aren’t numbered, so the section numbers in the TOC aren’t very useful. Links aren’t clickable (or even rendered as links), so something like “see Message Store” in the HTML version shows up in the PDF as plain text with nothing to follow. And so forth.

I think the trick will be figuring out how to turn wiki pages into DocBook XML fragments (using only a simple subset of DocBook elements), then stitching those fragments together into full DocBook books. Once a DocBook book is available, there are a host of tools that can render it into a variety of formats, including much-more-useful PDFs.
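
For reference, a single wiki page might map onto a fragment like the following. The element names (`section`, `title`, `para`) are standard DocBook; the `id` and the content are made up for illustration:

```xml
<!-- Hypothetical fragment generated from one reference page -->
<section id="array-pop">
  <title>Array.prototype.pop()</title>
  <para>
    Removes the last element from an array and returns that element,
    changing the length of the array.
  </para>
</section>
```

Fragments like this would then be stitched together under `chapter` and `book` elements, at which point the standard DocBook XSL stylesheets can produce HTML, PDF, and other formats.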

While that seems simple enough on the surface, the number of dead projects that have attempted to do this in a fully automated fashion seems to indicate otherwise.

So, if it can’t be fully automated, could it be partially automated? Could wiki markup be turned into a rough approximation of DocBook fragments which could then be finessed and pieced together by hand?
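
As a toy illustration of what that partial automation might look like, here is a minimal sketch that converts a tiny, assumed subset of MediaWiki-style markup (`== headings ==`, `'''bold'''`, `''italic''`) into DocBook fragments. Everything it doesn’t recognize would still need hand-finessing, which is exactly the “rough approximation” idea:

```python
import re

def wiki_to_docbook(text):
    """Convert a small subset of MediaWiki-style markup to DocBook fragments.

    Handles only == headings ==, '''bold''', ''italic'', and plain
    paragraphs; everything else is passed through for hand-editing.
    """
    out = []
    for block in text.strip().split("\n\n"):
        block = block.strip()
        heading = re.match(r"^==\s*(.+?)\s*==$", block)
        if heading:
            out.append("<title>%s</title>" % heading.group(1))
            continue
        # Bold must be handled before italic, since ''' contains ''.
        block = re.sub(r"'''(.+?)'''", r"<emphasis role='bold'>\1</emphasis>", block)
        block = re.sub(r"''(.+?)''", r"<emphasis>\1</emphasis>", block)
        out.append("<para>%s</para>" % block)
    return "\n".join(out)
```

A real converter would need to cope with lists, tables, links, and templates, which is precisely where the hand-finessing (and the abandoned projects) come in.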

This is where the first question becomes important. If this requires human intervention, is it worth doing at all? Would it be worth the effort to generate a DocBook version of the JavaScript Reference once or twice per year given that it will be rendered almost immediately obsolete by updates to the wiki?

I don’t know. Maybe. Maybe not. What do you think? Is it worth it? Is there an easy way to do this wiki->DocBook or wiki->PDF generation that would generate a proper book without requiring a lot of human involvement?

The wiki has been an awesome boon for the state of Mozilla developer documentation. In less than a year, over 22,000 edits and additions have been made, each of which has improved the content we deliver. The web version of the content is XHTML-compliant (with occasional markup errors from editing), and it’s relatively usable and friendly, with a nice layout. The kicker is turning this incredible resource into usable offline formats. Obviously we don’t want to stop using the wikis, so if we want to generate offline content, we have to figure out how to do that with the tools at our disposal.

And this is apparently what I spend my Friday evenings thinking about.


16 thoughts on “Thinking about Print”

  1. Jimmy Wales is thinking about this now as he’s trying to do the same thing for Wikipedia, as textbooks for kids. We can probably ask him what he’s doing. And I don’t know what’s worse: you thinking about this on a Friday evening, or me responding…and thinking about Firefox marketing in football terms.

  2. James: do you mean the new “Rough Cuts” book series, or something else? If something else, could you give me a URL where I can get more information? Thanks 🙂

  3. That’s definitely another possibility. I started thinking about what would be required to dump a static version of each language wiki this morning, actually. Thanks for the link.

  4. Thanks for thinking this way!
    I’m really in need of an offline version of the MDC wiki, since I’m developing in the Mozilla context but rarely have the opportunity to log on to the internet. Please notify me (by email, or via your blog, which I’m tracking ;-)) as soon as you’ve got any solution. I would be pleased to help with testing and would also use it in any alpha state.
    Sadly though 😦 I do not know any solution for the problem.

    Keep up the thinking!

  5. If one could get all the content into a single page, you may find that to be very useful when used in combination with the technique outlined in this List Apart article:

    http://www.alistapart.com/articles/boom

    It describes the process they used to publish a book from HTML/CSS source. I’m not sure if this is entirely relevant to what you’re attempting to do, but you may find it helpful.
