Welcome to NeoOffice developer notes and announcements
This website is an archive and is no longer active
NeoOffice announcements have moved to the NeoOffice News website


Successful mixing of NeoO and KompoZer HTML code
 
vicjoe
Captain


Joined: Oct 31, 2005
Posts: 56
Location: Victoria BC Canada

Posted: Sun Nov 25, 2007 4:22 am    Post subject: Successful mixing of NeoO and KompoZer HTML code

I originally posted this to the KompoZer (Open Source WYSIWYG app) forum and am cross-posting it here for anyone interested.

I've discovered that HTML saved from OpenOffice (or NeoOffice for Mac, which is what I use) imports nicely into KompoZer. I particularly like that hyperlinks I have set up from a footnote or endnote superscript, which jump to the proper note or reference, are preserved by the HTML "Save As..." (not "Export", which in OOo/NeoO only offers to generate an XHTML file). The saved-as file is HTML 4.0 and has inline CSS styles.

When the OOo-generated HTML file is opened in KompoZer, one can then proceed to manipulate it (for example, pull in the margins, which by default run to the edge of the browser window). This is much better than saving text from one's word processor and then having to do a bunch of formatting; the extended characters are already coded properly, as is the hyperlinking for footnotes. So for someone like me who wishes to convert a word processing document to HTML with all the formatting intact (and not struggle with the nasty code that Word either puts on the clipboard or inserts in a generated HTML file), this is a real time-saver.

And, here is possibly the best part: the OOo/NeoO file's inline cascading style sheet (even after making changes in KompoZer) exports nicely to an external style sheet using CaScadeS's stylesheet exporting option.
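For anyone who wants to see roughly what such an export amounts to, here is a minimal Python sketch. It is not how CaScadeS does it; it assumes the styles live in a `<style>` block in the document head, and the function name `externalize_styles` and the file name `styles.css` are made up for illustration.

```python
import re
from typing import Tuple

# Matches a whole <style ...> ... </style> block, case-insensitively.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)


def externalize_styles(html: str, css_href: str = "styles.css") -> Tuple[str, str]:
    """Return (html_with_link, css_text).

    Hypothetical helper: pulls the rules out of every <style> block and
    replaces them with a single <link> to an external sheet.
    """
    css_parts = []

    def _collect(match):
        body = match.group(1)
        # Drop the <!-- ... --> wrapper that some generators put around rules.
        body = re.sub(r"^\s*<!--", "", body)
        body = re.sub(r"-->\s*$", "", body)
        css_parts.append(body.strip())
        return ""  # remove the block from the HTML

    stripped = STYLE_BLOCK.sub(_collect, html)
    if not css_parts:
        return html, ""

    link = '<link rel="stylesheet" type="text/css" href="%s">' % css_href
    # Put the link just before </head>; fall back to the top of the file.
    result, count = re.subn(
        r"</head>", lambda m: link + "\n" + m.group(0), stripped,
        count=1, flags=re.IGNORECASE,
    )
    if count == 0:
        result = link + "\n" + stripped
    return result, "\n\n".join(css_parts)
```

One would write the returned CSS text to the file named in `css_href` and save the returned HTML next to it. Style attributes on individual tags are left alone by this sketch.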

And here is a small catch: all the code from OOo/NeoO, and any modifications subsequently made in KompoZer, validates except for one single bit in the footnote linking, an attribute called "sdfixed" that the validator says is "proprietary" (no useful information found on Google). Nonetheless the hyperlink anchors work for me in both directions in Firefox, Opera and Safari. I don't have a Windows machine to test that function in IE, so I'd like to ask somebody to test the file for me in MS IE on a Win box and see if the footnote hyperlink works. There is only one such link in the sample file, a superscripted number 1 immediately after the word "Culture" in the document's head title. It is at:
http://members.shaw.ca/vicjoe/pub/culture-no_explan_power.html

Feel free to look at the source code, too. If the footnote hyperlink works in IE, I'll be a happy camper and this tip may be hugely useful to others as well, as this obviates the need to learn about CSS, at least for formatted/converted word processing documents. Note that all the "special characters" like curly quotes and em dashes came through, too. Nice!
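
If the one validation complaint matters, a small script could drop the attribute before publishing. The following is only a sketch: it assumes "sdfixed" appears as a bare, value-less attribute on the footnote anchor tags, and I have not confirmed whether OOo/NeoO relies on it when re-opening the file, so keep the original. The helper name `strip_sdfixed` is hypothetical.

```python
import re

# Any opening <a ...> tag (closing </a> tags do not match).
ANCHOR_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
# A bare "sdfixed" attribute, i.e. one that is not followed by "=".
SDFIXED_ATTR = re.compile(r"\s+sdfixed\b(?!\s*=)", re.IGNORECASE)


def strip_sdfixed(html: str) -> str:
    """Remove the value-less sdfixed attribute from anchor tags only.

    Assumption: the attribute never occurs inside a quoted attribute value.
    Text outside of <a ...> tags is left untouched.
    """
    return ANCHOR_TAG.sub(lambda m: SDFIXED_ATTR.sub("", m.group(0)), html)
```

Running the result through the validator again would show whether that attribute was really the only complaint.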
sardisson
Town Crier


Joined: Feb 01, 2004
Posts: 4588

Posted: Sun Nov 25, 2007 8:31 am    Post subject:

Not completely related, but I've often used Neo's Writer/Web to clean up HTML I've gotten from other sources (e.g., exported from MS Office), which has saved tons of time compared to pulling out the chaff manually....

Smokey

_________________
"[...] whether the duck drinks hot chocolate or coffee is irrelevant." -- ovvldc and sardisson in the NeoWiki
vicjoe
Captain


Joined: Oct 31, 2005
Posts: 56
Location: Victoria BC Canada

Posted: Sun Nov 25, 2007 12:39 pm    Post subject: cleaning MS Office HTML junk

Quote:
Not completely related, but I've often used Neo's Writer/Web to clean up HTML I've gotten from other sources (e.g., exported from MS Office), which has saved tons of time compared to pulling out the chaff manually

That's interesting; I gather what you are saying is that Neo's Writer/Web isn't simply acting as a browser when it opens a file from somewhere, but is re-coding it and in the process getting rid of 'mso-normal' and similar junk? I'll have to try it next time I get a file someone has saved in HTML format from Office. One more reason to use OOo/Neo and dump MS Orifice.

I've occasionally used other means to clean MSO HTML files, e.g. Online Tidy http://valet.htmlhelp.com/tidy/ or Word Cleaner http://textism.com/wordcleaner/
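
To give an idea of the kind of junk in question, here is a rough Python sketch of a cleaner. It is not what Writer/Web, Tidy or Word Cleaner actually do; it only assumes that the clutter takes the form of `mso-` declarations inside `style` attributes and `class` values beginning with "Mso", and the function name `strip_mso` is made up.

```python
import re

# style="..." or style='...' attributes, with the quote character captured.
STYLE_ATTR = re.compile(r"\sstyle\s*=\s*([\"'])(.*?)\1", re.IGNORECASE | re.DOTALL)
# class=MsoNormal, class="MsoBodyText" and the like.
MSO_CLASS = re.compile(r"\sclass\s*=\s*([\"']?)Mso\w+\1", re.IGNORECASE)


def _clean_style(match):
    quote, body = match.group(1), match.group(2)
    # Naive split on ";" (assumes no semicolons inside declaration values).
    kept = [
        decl.strip()
        for decl in body.split(";")
        if decl.strip() and not decl.strip().lower().startswith("mso-")
    ]
    if not kept:
        return ""  # nothing left, so drop the whole attribute
    return " style=" + quote + "; ".join(kept) + quote


def strip_mso(html: str) -> str:
    """Drop mso-* style declarations and Mso* class attributes."""
    html = STYLE_ATTR.sub(_clean_style, html)
    return MSO_CLASS.sub("", html)
```

Real Office output has more in it than this (conditional comments, extra XML namespaces), so treat the sketch as a starting point only.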
OPENSTEP
The One


Joined: May 25, 2003
Posts: 4752
Location: Santa Barbara, CA

Posted: Wed Nov 28, 2007 8:41 am    Post subject:

Terry, one of the guys who helped everyone out before he passed on, also wrote the standalone MacTidy applications to do some HTML cleanup:

http://www.geocities.com/terry_teague/tidy.html

It doesn't surprise me that OOo is rewriting the HTML. I believe the path is that the HTML gets translated into the internal document format in memory and is then saved back to HTML through a filter.

ed
vicjoe
Captain


Joined: Oct 31, 2005
Posts: 56
Location: Victoria BC Canada

Posted: Thu Nov 29, 2007 3:29 am    Post subject: OOo HTML generator invokes IE7 failure

I ran into a problem with my OOo/NeoO-saved HTML further refined in KompoZer, namely that some of the CSS attributes, most grievously the margin specification, were being ignored by MS IE 7 (and possibly IE 6).

What I discovered was, though my test file displayed generally okay in 10 major browsers, it always looked awful in IE, especially in that the margins would run to the far right edge of the browser window.

Since MS browsers still constitute 65% of overall use, I was dismayed to say the least. To make a long story short (I hunted around to find out what IE was doing, er, differently, and found lots on that, but that is another story), what it came down to is that the DOCTYPE generated by the OOo HTML generator was:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

That was not close enough for IE, which would kick into a particularly unforgiving "quirks mode" and ignore CSS attributes.

So, to use my little OOo-NeoO/KompoZer trick and expect it to display properly in IE, I determined that one must change the DOCTYPE to this (assuming one wants to use 4.01 Transitional):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

I don't know about other DOCTYPEs, but I'd assume the same: that it has to be rendered character for character as set forth by the W3C.
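
The DOCTYPE swap described above is easy to script if there are many files. Here is a minimal Python sketch; the function name `fix_doctype` is made up, and it assumes the file has at most one DOCTYPE declaration with no internal subset.

```python
import re

# The HTML 4.01 Transitional DOCTYPE, including the system identifier URL.
NEW_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Any existing DOCTYPE declaration (assumes no ">" inside the declaration).
DOCTYPE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)


def fix_doctype(html: str) -> str:
    """Replace the first DOCTYPE with NEW_DOCTYPE, or prepend it if missing."""
    if DOCTYPE.search(html):
        # A function replacement avoids any backslash handling in the new text.
        return DOCTYPE.sub(lambda m: NEW_DOCTYPE, html, count=1)
    return NEW_DOCTYPE + "\n" + html
```

Applying it to the text of a saved file and writing the result back out leaves the rest of the markup untouched, and running it twice gives the same result as running it once.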

Of course IE will recognize a completely non-standard DOCTYPE generated by any MS Office app, but that is to be expected. For example, Word 9 generates this gem:
xmlns="http://www.w3.org/TR/REC-html40">

Conclusion: with the above DOCTYPE correction, a combo of OOo/NeoO HTML output and refinement in KompoZer works fairly well. Once this has all been incorporated into a workflow, it saves a lot of time (particularly since at this stage a number of the coding features of KompoZer's CSS editor, CaScadeS, do not work in the Mac version).
All times are GMT - 7 Hours
