Gust Councilperson
Joined: Oct 09, 2007 Posts: 137
|
Posted: Tue Jan 27, 2009 8:04 am Post subject: File size |
|
In the days of OpenOffice.org 1.0, I was always impressed by how small documents were when saved in the native format compared to the MS Office formats.
During the NeoOffice 2 life cycle, file sizes grew ever closer, for a reason I don't know. For most documents, both file formats resulted in broadly the same file size.
Now there is NeoOffice 3, and it seems that the native format keeps getting fatter, whereas MS Office formatted files (binary, not XML) are often much smaller now (probably the export filter was improved). As an example: open a new text document, type 'blah' and save it in the respective file formats. The doc file will be 8 KB, whereas the odt file is 20 KB. (Note that odt is a zipped container format, so it is a genuine achievement to produce 20 KB of entropy using 'blah' as the single input.)
I'm not sure if this behaviour is specific to NeoOffice; its cause seems more likely to be found in the underlying OpenOffice.org code. But I am sure that I do not appreciate the evolution. |
sirloxelroy Sentinel
Joined: Nov 16, 2006 Posts: 26 Location: BFE
|
Posted: Tue Jan 27, 2009 8:35 am Post subject: |
|
Strange, I am running NeoOffice 2.0 Patch 2 and "BlahBlahBlah" was 16 KB in file size (16,725 bytes), and six "Blah"s was also 16 KB, but 16,758 bytes. _________________ Wisdom from IRC:
Man watching 6 MSCE's around a sun box, looks alot like the opening scene's of 2001:space odyssey and the monkey's with the monolith. |
James3359 The Merovingian
Joined: Jul 05, 2005 Posts: 685 Location: North West England
|
Posted: Tue Jan 27, 2009 8:56 am Post subject: |
|
FWIW the values I get are 8k (.doc) and 16k (.odt).
But I notice that similar issues arise with larger files. One, for example, is 64k (.doc), 96k (.odt, NeoOffice 2.2.5) and 112k (.odt, NeoOffice 3 EA).
None of these include graphics.
I remember there has been some discussion of this before, and I tracked down this Trinity thread, but it is not entirely enlightening about why this should be. |
pluby The Architect
Joined: Jun 16, 2003 Posts: 11949
|
Posted: Tue Jan 27, 2009 9:07 am Post subject: Re: File size |
|
Gust wrote: | I'm not sure if this behaviour is specific to NeoOffice; its cause seems more likely to be found in the underlying OpenOffice.org code. But I am sure that I do not appreciate the evolution. |
Since NeoOffice 2.2.1, NeoOffice adds the following PDF subfile to all .od* files, which does add a little extra size:
Thumbnails/thumbnail.pdf
This PDF subfile is a PDF export of the first page of your document and is used by Mac OS X 10.5's Quick Look service to render the preview of your document in the Finder. Apple so generously used its own developer resources to implement Quick Look support for the Microsoft Office 2007 file formats, but Apple only implemented rather weak support for .od* files, so we had to fill the gap. Since doing a full on-the-fly renderer for .od* files was way beyond our resources, we implemented the PDF subfile approach.
Other than that one subfile, NeoOffice's underlying OpenOffice.org code generates the .od* file, so there should be no other differences between NeoOffice and OpenOffice.org files.
Patrick |
Gust Councilperson
Joined: Oct 09, 2007 Posts: 137
|
Posted: Tue Jan 27, 2009 9:08 am Post subject: |
|
James3359 wrote: | FWIW the values I get are 8k (.doc) and 16k (.odt).
But I notice with larger files that similar issues arise - one, for example is 64k (.doc) 96k (.odt NeoOffice 2.2.5) and 112k (.odt NeoOffice 3 EA)
None of these include graphics. |
Thanks James, this fits in with my impression that odt is getting fat. I have a two page text document (approx. 4500 characters) with no graphics, no custom styles, only three styles in use and no other formatting. NeoOffice 3 EA 2 needs a whopping 96 KB for this, whereas saving it in the binary doc format reduces the file size to 20 KB (which is further reduced to 8 KB after zipping). |
pluby The Architect
Joined: Jun 16, 2003 Posts: 11949
|
Posted: Tue Jan 27, 2009 9:19 am Post subject: |
|
Gust wrote: | Thanks James, this fits in with my impression that odt is getting fat. I have a two page text document (approx. 4500 characters) with no graphics, no custom styles, only three styles in use and no other formatting. NeoOffice 3 EA 2 needs a whopping 96 KB for this, whereas saving it in the binary doc format reduces the file size to 20 KB (which is further reduced to 8 KB after zipping). |
What are the differences in the subfile sizes for each of these? The following Terminal command will list the subfiles and their uncompressed sizes. If the PDF subfile accounts for the difference, then most likely you are using a different font for your document in the two NeoOffice versions, as PDF embeds a subset of the font for accurate rendering and font sizes can vary widely.
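Since an .od* file is a zip container, the subfiles and their sizes can also be listed with a few lines of Python. This is only a minimal sketch using the standard `zipfile` module; the function name `list_subfiles` is made up for illustration and is not part of NeoOffice or OpenOffice.org.

```python
import sys
import zipfile


def list_subfiles(path):
    """Return (name, uncompressed_size, compressed_size) for each zip entry.

    `path` may be a filename or a file-like object holding an .od* file.
    """
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]


if __name__ == "__main__":
    # Hypothetical usage: python list_subfiles.py blah.odt
    for document in sys.argv[1:]:
        for name, size, packed in list_subfiles(document):
            print(f"{size:>10} {packed:>10}  {name}")
```

Comparing the output for the same document saved by two NeoOffice versions should show whether `Thumbnails/thumbnail.pdf` or some other subfile accounts for the difference.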
As an aside, I don't recall the designers of the ODF file format ever promising that it would create small files. Instead, my impression is that it is a very verbose format. The idea being that the more data about the document, its layout, and its style settings, the more likely that other applications can support the format.
If you are looking for the most compact file size, why not use RTF or plain text? Granted, plain text has no special formatting data in it, but it is compact and is the most compatible format available by far.
Patrick |
James3359 The Merovingian
Joined: Jul 05, 2005 Posts: 685 Location: North West England
|
Posted: Tue Jan 27, 2009 9:32 am Post subject: |
|
My 112 KB file from NeoOffice 3 EA unzips to 196 KB and contains the following elements, with their sizes:
- content.xml - 44KB
- layout-cache - 4KB
- manifest.xml - 4KB
- meta.xml - 4KB
- mimetype - 4KB
- settings.xml - 12KB
- styles.xml - 20KB
- thumbnail.pdf - 72KB
- thumbnail.png - 32KB
so it looks as though stripping out the thumbnails would knock 100KB off the file size.
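As a quick check of the figures listed above, this small Python sketch adds up the sizes as posted (in KB, uncompressed) and works out how much of the total the two thumbnails make up. The dictionary simply restates the list; nothing here is measured independently.

```python
# Uncompressed subfile sizes in KB, copied from the list in this post.
sizes_kb = {
    "content.xml": 44,
    "layout-cache": 4,
    "manifest.xml": 4,
    "meta.xml": 4,
    "mimetype": 4,
    "settings.xml": 12,
    "styles.xml": 20,
    "thumbnail.pdf": 72,
    "thumbnail.png": 32,
}

total_kb = sum(sizes_kb.values())
thumbnails_kb = sizes_kb["thumbnail.pdf"] + sizes_kb["thumbnail.png"]
share = thumbnails_kb / total_kb

print(f"total: {total_kb} KB, thumbnails: {thumbnails_kb} KB ({share:.0%})")
```

These are uncompressed sizes, so the saving in the zipped .odt file may differ from the raw sum.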
I guess that for documents containing a full page of text/graphics or more, the size of the thumbnails is pretty much the same whatever the complexity of the overall document, so this probably starts to make less of a difference with larger file sizes. [Cross posted with Patrick on this - I had overlooked the impact of embedding fonts - so a multi-font first page is likely to generate larger thumbnails and larger files overall.]
But you might get an anomalous result with a document with a blank first page. Yes, just checked: blah.odt is 16 KB; blah.odt with a page break inserted before "blah" is 12 KB.
So you might be able to do a quick and dirty file size reduction simply by adding a blank first page to your .odt document. |
ovvldc Captain Naiobi
Joined: Sep 13, 2004 Posts: 2352 Location: Zürich, CH
|
Posted: Tue Jan 27, 2009 1:57 pm Post subject: |
|
Also, the thumbnail matters for small files, but doesn't make as much of a difference for larger ones, especially when you insert images after the first page.
I understand that an extra 100 KB can have some impact, but I would also contend that while this used to be 1/3 of a floppy disk, it is now just a minute fraction of the storage capacity of a USB flash drive.
Best wishes,
Oscar _________________ "What do you think of Western Civilization?"
"I think it would be a good idea!"
- Mohandas Karamchand Gandhi |
ahneobugs Sentinel
Joined: Jan 16, 2008 Posts: 20
|
Posted: Fri Feb 13, 2009 12:40 pm Post subject: |
|
File size is an issue. An extra 100 KB may not be a big deal for a few dozen documents, but once you get into larger numbers, it does become significant.
For instance, it can mean that backing up takes significantly longer and incremental backups will require new media 5-10x as often (compared to WordPerfect for Mac).
It would be great if this feature could be turned off. Personally, I would sacrifice the Quick Look preview for space efficiency. |
OPENSTEP The One
Joined: May 25, 2003 Posts: 4752 Location: Santa Barbara, CA
|
Posted: Fri Feb 13, 2009 12:54 pm Post subject: |
|
You can get a 1.5 TB drive for less than $150. My computations may be wrong, but 100K is worth about 0.000009 dollars, or roughly 0.001 cents.
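The arithmetic can be checked with a short Python sketch. It assumes decimal units (1.5 TB = 1.5e12 bytes, 100 KB = 100,000 bytes) and takes the full $150 as the drive price; the function name `storage_cost_cents` is made up for illustration. Under those assumptions the cost comes out at roughly 0.001 cents (about 0.00001 dollars).

```python
def storage_cost_cents(size_bytes, drive_bytes, drive_price_dollars):
    """Cost in cents of storing `size_bytes` on a drive of the given size and price."""
    return size_bytes / drive_bytes * drive_price_dollars * 100


# 100 KB on a 1.5 TB drive bought for $150 (decimal units assumed).
cost = storage_cost_cents(100 * 1000, 1.5e12, 150)
print(f"{cost:.4f} cents")
```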
This functionality is beneficial for the majority of users and will not be turned off. You can always edit OpenDocument files from any application manually using the command line zip utilities.
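For anyone who prefers a script to the command line zip utilities, the manual edit could be sketched in Python roughly as follows. This is an assumption-laden sketch, not a NeoOffice feature: `strip_thumbnails` is a hypothetical helper, it copies the document while skipping everything under `Thumbnails/`, and it does not touch `META-INF/manifest.xml`, so whether every application accepts the result is untested here. Work on a copy of your document.

```python
import zipfile


def strip_thumbnails(src_path, dst_path, prefix="Thumbnails/"):
    """Copy an OpenDocument file to dst_path, leaving out entries under `prefix`.

    Returns the names of the entries that were skipped.
    """
    removed = []
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # infolist() keeps the archive order, so "mimetype" stays first.
        for info in src.infolist():
            if info.filename.startswith(prefix):
                removed.append(info.filename)
                continue
            # Passing the original ZipInfo keeps each entry's compression
            # method (the "mimetype" entry is normally stored uncompressed).
            dst.writestr(info, src.read(info.filename))
    return removed
```

A call such as `strip_thumbnails("blah.odt", "blah-small.odt")` (file names chosen for illustration) writes the slimmed copy and returns the list of removed subfiles.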
ed |
pluby The Architect
Joined: Jun 16, 2003 Posts: 11949