Convert Documents From doc/ppt/xls/etc to html/pdf/flash/etc Using OpenOffice.org

For a long time I’ve wanted a solution for converting documents into easily digestible formats - namely, I want Word docs, Excel spreadsheets, and PowerPoint decks to automagically convert to html or flash so I can view them in the browser without opening a separate app. This is particularly urgent on my lovely Mac that I love, since opening PowerPoint melts the titanium and strains the California power grid. I also want to eventually move email out of email apps and view it as proper documents, possibly as a wiki.

So after much futzing around I finally got the conversion working today. The magic solution is simply to use JODConverter with OpenOffice.

In particular, I downloaded OpenOffice 2.3.1 for OS X and started it “headless”, listening on local port 8100:

/Applications/OpenOffice.org\ 2.3.app/Contents/MacOS/soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard &

and then used the JODConverter command-line tool to convert files:

alias ooconv='java -jar /Users/parand/Packages/jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar'
ooconv SomePowerpointFile.ppt -f swf
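If the converter complains that it can't connect, the first thing worth checking is whether the headless OpenOffice instance is actually accepting connections. Here is a minimal sketch in Python, assuming the host and port from the soffice command above (127.0.0.1 and 8100). The function name is_listening is my own invention for illustration, and it only tests that something answers on the TCP port, not that it really is OpenOffice.

```python
import socket


def is_listening(host="127.0.0.1", port=8100, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    The defaults mirror the -accept string used to start soffice above.
    This does not verify that the listener is OpenOffice, only that the
    port is open.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, etc.
        return False


if __name__ == "__main__":
    print("soffice listener up:", is_listening())
```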

This works well and I’m happy with it. Ideally I’d have liked to get the Python conversion script working, but the uno library supplied with OpenOffice seemed not to like my Python version. Oh well.
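One way to still drive the conversion from Python without touching uno is to shell out to the same JODConverter jar. The following is a rough sketch, not a tested recipe. The jar path is copied from the alias above, the argument order mirrors the ooconv call, and the names build_command and convert are hypothetical. It also assumes that the converter writes its output next to the input file with the new extension, which you should verify on your own setup.

```python
import subprocess
from pathlib import Path

# Path copied from the alias above; adjust to wherever the jar lives.
JOD_JAR = "/Users/parand/Packages/jodconverter-2.2.1/lib/jodconverter-cli-2.2.1.jar"


def build_command(source, fmt, jar=JOD_JAR):
    """Build the same command line as `ooconv <source> -f <fmt>`."""
    return ["java", "-jar", str(jar), str(source), "-f", fmt]


def convert(source, fmt, jar=JOD_JAR):
    """Run the converter on one file and return the assumed output path.

    Assumption: the output lands beside the input with the new suffix.
    Raises FileNotFoundError if the input is missing, and
    subprocess.CalledProcessError if java exits with a non-zero status.
    """
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(source)
    subprocess.run(build_command(source, fmt, jar), check=True)
    return source.with_suffix("." + fmt)
```

Calling convert("SomePowerpointFile.ppt", "swf") should then be equivalent to the ooconv line above, provided the headless OpenOffice instance is running.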

I also have a Python script to pull down my mail and convert it to threaded html, look for meeting notices, attachments, etc, and call the converter on them. A little bit of spit and polish and I’ll be close to my email-in-a-twiki idea, but I probably won’t get around to it for another year…
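For the curious, the attachment-hunting part of such a script can be sketched with nothing but the standard library email package. This is not my actual script, just an illustration of the idea. The names parse_message, office_attachments, and OFFICE_SUFFIXES are made up for the example, and the suffix list is an assumption about which files are worth handing to the converter.

```python
from email import message_from_bytes, policy
from pathlib import PurePath

# Assumed set of extensions worth sending to the converter.
OFFICE_SUFFIXES = {".doc", ".ppt", ".xls"}


def parse_message(raw_bytes):
    """Parse raw message bytes into an EmailMessage (modern email API)."""
    return message_from_bytes(raw_bytes, policy=policy.default)


def office_attachments(msg):
    """Yield (filename, content_bytes) for each office-style attachment."""
    for part in msg.iter_attachments():
        name = part.get_filename()
        if name and PurePath(name).suffix.lower() in OFFICE_SUFFIXES:
            # decode=True undoes the base64 / quoted-printable encoding.
            yield name, part.get_payload(decode=True)
```

Each yielded pair can be written to disk and passed to the converter; the threading and meeting-notice detection are a separate matter and not shown here.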
