Why NeoOffice MS Word 97 doc File Size Smaller than OOo ?
 
drgerafe
Red Pill


Joined: Dec 06, 2007
Posts: 5
Location: Toledo, Ohio USA

Posted: Thu Dec 06, 2007 10:09 pm    Post subject: Why NeoOffice MS Word 97 doc File Size Smaller than OOo ?

Can anyone here explain why text documents saved as MS Word 97 from NeoOffice (2.2.2p3) are so much smaller than the same documents saved from OpenOffice (2.3.x) on either FreeBSD or Solaris?

A relatively simple 3-page OOo text document saved as an MS Word 97 .doc file from OpenOffice on either my FreeBSD or Solaris systems approaches 140 Kbytes, while the *same* OOo/NO text document saved as an MS Word 97 .doc file from NeoOffice (Mac OS X 10.4.11) is around 44 Kbytes!

How can I get my OpenOffice installations to generate Word 97 documents as compact as the ones NeoOffice makes?

Personally, I *only* use OpenOffice or NeoOffice to create simple MS Word 97 text documents that need to be attached to e-mail messages (as required by the recipients), and I'd much prefer to send smaller rather than *much* larger attachments whenever possible.

Thanks for any insights on this.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Thu Dec 06, 2007 10:46 pm

Since NeoOffice uses the OpenOffice.org 2.2.1 file import and export code, the most likely cause is that Sun's engineers made a change in OOo 2.3.x. I wouldn't think they would purposely make the file size bigger, but I would assume that they made changes to fix a bug. Other than that, I don't have any theories to offer.

Patrick
Samwise
Captain Naiobi


Joined: Apr 25, 2006
Posts: 2315
Location: Montpellier, France

Posted: Fri Dec 07, 2007 7:10 am

Patrick is correct. Word 97 documents created with NeoOffice 2.2.1 and OpenOffice.org 2.2.1 are the same size, whereas (Word 97) documents created with OOo 2.3.0 are much larger.
drgerafe
Red Pill


Joined: Dec 06, 2007
Posts: 5
Location: Toledo, Ohio USA

Posted: Fri Dec 07, 2007 11:45 am

Thanks for the follow-up on this.

I just compared MS Word 97 DOC file sizes on an ODT document
saved as follows:

NeoOffice-2.2.1-p3: 46,080 bytes
OpenOffice-2.2.1 (FreeBSD 6.2): 117,248 bytes
OpenOffice-2.3.0 (FreeBSD 6.2): 115,200 bytes

I didn't have access to the Solaris system yet today,
but I know from past experience that files similar to this
particular ODT when saved from either StarOffice8
or OpenOffice-2.3.1 (Sparc Solaris 10) are 100,000+ bytes.

OK, so we're only talking about an extra 70 Kbytes or so of overhead here.
That overhead matters, however, when the size of the DOC file is capped:
one site to which I need to submit DOC files on occasion
limits submitted DOC files to 150 Kbytes, which forces me
to cut more of the document's content than I otherwise would.

I like the idea that the NeoOffice binary can translate ODT documents
to DOC files from the shell using appropriate macros,
but this approach won't help me when I'm not logged into the console.

Something else must be at play here.
drgerafe
Red Pill


Joined: Dec 06, 2007
Posts: 5
Location: Toledo, Ohio USA

Posted: Wed Dec 19, 2007 10:17 pm

Another follow-up on this.

I made three new ODT documents with OpenOffice 2.3.1 on Sparc-Solaris
10, all with Default style, 12 pt Times New Roman font.

Version 00 (empty document):
http://drgerlists.googlepages.com/ooowritertest00.odt

Version 01 ("hello, world!"):
http://drgerlists.googlepages.com/ooowritertest01.odt

Version 02 (Lincoln's Gettysburg Address):
http://drgerlists.googlepages.com/ooowritertest02.odt
Number of words: 272
Number of characters: 1471
Number of lines: 15

In each case, these documents were saved as MS Word 97/2000/XP
documents on three platforms to which I have direct access (Solaris
10/Sparc, FreeBSD 6.2-R/i386, and Darwin 10.4.11/ppc) with OpenOffice,
StarOffice, and NeoOffice hosted on their respective platforms, as follows:

Code:

=================================================================
                             Save As MS Word 97 File Size (bytes)
OS/Platform/Application       Version 00  Version 01  Version 02
-----------------------------------------------------------------
Solaris 10/Sparc
OpenOffice.org 2.3.1             104,448     104,448     107,520
StarOffice 8 Prod. Update 8      104,448     104,448     107,520
-----------------------------------------------------------------
FreeBSD 6.2-R/i386
OpenOffice.org 2.1.0              80,384      80,384      83,456
OpenOffice.org 2.2.1              80,384      80,384      83,456
OpenOffice.org 2.3.0              80,384      80,384      83,456
-----------------------------------------------------------------
Darwin 10.4.11/ppc
NeoOffice 2.2.2 Patch 3            8,704       8,704      11,776
-----------------------------------------------------------------
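
The numbers in the table above can be checked with a little arithmetic. The following is only a sketch in Python: the byte counts are copied from the table (one OpenOffice.org row per platform), while the dictionary layout and the function names `growth` and `overhead` are made up for illustration. It computes how much each platform's file grows between the empty document and the Gettysburg Address, and how much larger each empty document is than NeoOffice's.

```python
# Byte counts copied from the table above: (Version 00, Version 01, Version 02).
SIZES = {
    "Solaris 10/Sparc, OpenOffice.org 2.3.1": (104448, 104448, 107520),
    "FreeBSD 6.2-R/i386, OpenOffice.org 2.3.0": (80384, 80384, 83456),
    "Darwin 10.4.11/ppc, NeoOffice 2.2.2 Patch 3": (8704, 8704, 11776),
}
BASELINE = "Darwin 10.4.11/ppc, NeoOffice 2.2.2 Patch 3"


def growth(sizes):
    """Bytes added by the text itself: Version 02 minus the empty Version 00."""
    return sizes[2] - sizes[0]


def overhead(sizes, baseline):
    """Extra bytes in the empty document relative to the baseline's empty document."""
    return sizes[0] - baseline[0]


for name, sizes in SIZES.items():
    print(f"{name}: growth {growth(sizes):,} bytes, "
          f"overhead vs NeoOffice {overhead(sizes, SIZES[BASELINE]):,} bytes")
```

On these figures the text itself costs the same 3,072 bytes on every platform, so the gap looks like a fixed per-file overhead (95,744 bytes on Solaris, 71,680 bytes on FreeBSD) rather than something that scales with the content.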


My question to this list is, essentially: how and why does NeoOffice
manage to write such compact MS Word 97 document files compared
to OpenOffice running on either Solaris or FreeBSD?

I did notice while conducting this informal test that NeoOffice did not
render the Version 02 document exactly the same as the other two
platforms (i.e., an extra line of text was rendered by NeoOffice).

Comments/discussion welcome.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Wed Dec 19, 2007 10:54 pm

It looks like your .odt from NeoOffice has no Thumbnails entries in it. I would suspect that if you do an "unzip -l" on the Solaris and FreeBSD .odt files, the big file in the .odt is Thumbnails/thumbnail.png.

I am not sure why your NeoOffice .odt files are missing this entry, but I do know that on Mac OS X, both NeoOffice and OpenOffice generate Thumbnails/thumbnail.png files that are only 10K unzipped.

Patrick
drgerafe
Red Pill


Joined: Dec 06, 2007
Posts: 5
Location: Toledo, Ohio USA

Posted: Thu Dec 20, 2007 11:20 am

Thanks for looking at this.

The .odt files posted in the previous message were generated by OpenOffice running on a Solaris 10 system and had the thumbnail stripped from the archive prior to posting.
The file sizes reported previously are for the MS Word 97 .doc files that come
from these .odt files.
While I'm certainly curious as to why the .doc file sizes differ between Solaris
and FreeBSD, I'm really more interested in understanding why the
NeoOffice .doc files are so much smaller than the others.

Can another curious forum reader note the .doc file sizes that come from
OpenOffice running on Win32 and Linux systems ?

You mention that PNG thumbnails are generated by NeoOffice (and, of course,
OpenOffice).
My recent experience, however, is that NeoOffice generates (quite large) PDF
thumbnails (which I prefer to strip out of the archive before I transport the
NeoOffice-generated .odt file to another platform) -- the PNG thumbnails are
much smaller, in comparison, but that is the topic for another thread.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Thu Dec 20, 2007 11:37 am

drgerafe wrote:
The .odt files posted in the previous message were generated by OpenOffice running on a Solaris 10 system and had the thumbnail stripped from the archive prior to posting.


Never mind. I was working too late when I wrote my post. The thumbnail only affects the .odt file size and no thumbnail is put in a .doc file AFAIK.

To finally answer your question, the difference is caused by changes to the Word import/export code made by Novell's engineers in the ooo-build project.

While I don't understand how their changes work, a quick look through the ooo-build patches that NeoOffice checks out from ooo-build's svn repository shows that a few patches are applied to OOo's sw/source/filter/ww8 directory. This is the directory where OOo's Word import/export code resides.

Patrick
drgerafe
Red Pill


Joined: Dec 06, 2007
Posts: 5
Location: Toledo, Ohio USA

Posted: Thu Dec 20, 2007 12:27 pm

Thank you for that explanation, Patrick.

I was unaware of the "go-oo" project,
and thus, Novell's involvement.

If their work proves to be stable,
we can investigate having it added to the native FreeBSD OOo port,
as we work mostly in that environment.

Gary
All times are GMT - 7 Hours