Welcome to NeoOffice developer notes and announcements
This website is an archive and is no longer active
NeoOffice announcements have moved to the NeoOffice News website


File size
 
   NeoOffice Forum Index -> NeoOffice Beta Releases
Gust
Councilperson


Joined: Oct 09, 2007
Posts: 137

Posted: Tue Jan 27, 2009 8:04 am    Post subject: File size

In the days of OpenOffice.org 1.0 I was always very impressed by the file size of the documents when saved in the native format as compared to the MS Office formats.

During the NeoOffice 2 life cycle, file sizes got ever closer, for reasons unknown to me. For most documents, both file formats resulted in broadly the same file size.

Now there is NeoOffice 3, and it seems that the native format keeps getting fatter, whereas MS Office formatted files (binary, not XML) are often much smaller now (probably the export filter got improved). As an example: open a new text document, type 'blah' and save it in each of the two file formats. The doc file will be 8KB, whereas the odt file is 20KB (note that odt is a zipped container format; it actually is a genuine achievement to produce 20KB of entropy using 'blah' as the single input).

I'm not sure if this behaviour is specific to NeoOffice; its cause seems more likely to be found in the underlying OpenOffice.org code. But I am sure that I do not appreciate the evolution.
sirloxelroy
Sentinel


Joined: Nov 16, 2006
Posts: 26
Location: BFE

Posted: Tue Jan 27, 2009 8:35 am

Strange. I am running NeoOffice 2.0 Patch 2, and "BlahBlahBlah" came to 16KB (16,725 bytes); six Blahs was also 16KB, but 16,758 bytes.
_________________
Wisdom from IRC:
Man watching 6 MCSEs around a Sun box looks a lot like the opening scenes of 2001: A Space Odyssey and the monkeys with the monolith.
James3359
The Merovingian


Joined: Jul 05, 2005
Posts: 685
Location: North West England

Posted: Tue Jan 27, 2009 8:56 am

FWIW the values I get are 8k (.doc) and 16k (.odt).

But I notice that similar issues arise with larger files. One, for example, is 64k (.doc), 96k (.odt, NeoOffice 2.2.5) and 112k (.odt, NeoOffice 3 EA).

None of these include graphics.

I remember there has been some discussion of this before, and I tracked down this Trinity thread, but it is not entirely enlightening about why this should be.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Tue Jan 27, 2009 9:07 am    Post subject: Re: File size

Gust wrote:
I'm not sure if this behaviour is specific to NeoOffice; its cause seems more likely to be found in the underlying OpenOffice.org code. But I am sure that I do not appreciate the evolution.


Since NeoOffice 2.2.1, NeoOffice has added the following PDF subfile to all .od* files, which does add a little extra size:

Thumbnails/thumbnail.pdf

This PDF subfile is a PDF export of the first page of your document and is used by Mac OS X 10.5's Quick Look service to render the preview of your document in the Finder. Apple so generously used its own developer resources to implement Quick Look support for the Microsoft Office 2007 file formats, but Apple only implemented rather weak support for .od* files, so we had to fill the gap. Since writing a full on-the-fly renderer for .od* files was way beyond our resources, we implemented the PDF subfile approach.

Other than that one subfile, NeoOffice's underlying OpenOffice.org code generates the .od* file, so there should be no other differences between NeoOffice and OpenOffice.org files.

Patrick
Gust
Councilperson


Joined: Oct 09, 2007
Posts: 137

Posted: Tue Jan 27, 2009 9:08 am

James3359 wrote:
FWIW the values I get are 8k (.doc) and 16k (.odt).

But I notice that similar issues arise with larger files. One, for example, is 64k (.doc), 96k (.odt, NeoOffice 2.2.5) and 112k (.odt, NeoOffice 3 EA).

None of these include graphics.

Thanks James, this fits in with my impression that odt is getting fat. I have a two page text document (approx. 4500 characters) with no graphics, no custom styles, only three styles in use and no other formatting. NeoOffice 3 EA 2 needs a whopping 96 KB for this, whereas saving it in the binary doc format reduces the file size to 20 KB (which is further reduced to 8 KB after zipping).
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Tue Jan 27, 2009 9:19 am

Gust wrote:
Thanks James, this fits in with my impression that odt is getting fat. I have a two page text document (approx. 4500 characters) with no graphics, no custom styles, only three styles in use and no other formatting. NeoOffice 3 EA 2 needs a whopping 96 KB for this, whereas saving it in the binary doc format reduces the file size to 20 KB (which is further reduced to 8 KB after zipping).


What are the differences in the subfile sizes for each of these? The following Terminal command will list the subfiles and their uncompressed sizes. If the PDF subfile accounts for the difference, then most likely you are using a different font for your document in the two NeoOffice versions, as PDF embeds a subset of the font for accurate rendering and font sizes can vary widely.
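
For readers who want to run this comparison, here is a minimal Python sketch of the same idea. It is not the Terminal command from the post; it relies only on the fact that .od* files are zip containers. The function name `list_subfiles` and the in-memory sample package are made up for illustration.

```python
import io
import zipfile


def list_subfiles(package):
    """Return (name, uncompressed bytes, compressed bytes) for each entry.

    `package` may be a path to an .od* file or a binary file-like object.
    """
    with zipfile.ZipFile(package) as archive:
        return [(i.filename, i.file_size, i.compress_size) for i in archive.infolist()]


# Build a tiny stand-in package in memory so the sketch runs without a real document.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("content.xml", "<text>" + "blah " * 200 + "</text>")
    archive.writestr("Thumbnails/thumbnail.pdf", b"%PDF-1.4 placeholder")

# Print one row per subfile: uncompressed size, compressed size, name.
for name, size, packed in list_subfiles(sample):
    print(f"{size:>8} {packed:>8}  {name}")
```

With a real document, something like `list_subfiles("blah.odt")` for each NeoOffice version should show which subfile accounts for the difference.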

As an aside, I don't recall the designers of the ODF file format ever promising that it would create small files. Instead, my impression is that it is a very verbose format. The idea is that the more data a file carries about the document, its layout, and its style settings, the more likely it is that other applications can support the format.

If you are looking for the most compact file size, why not use RTF or plain text? Granted, plain text has no special formatting data in it, but it is compact and is the most compatible format available by far.

Patrick
James3359
The Merovingian


Joined: Jul 05, 2005
Posts: 685
Location: North West England

Posted: Tue Jan 27, 2009 9:32 am

My 112KB file with NO3 EA unzips to 196KB and contains the following elements, by size:
  • content.xml - 44KB
  • layout-cache - 4KB
  • manifest.xml - 4KB
  • meta.xml - 4KB
  • mimetype - 4KB
  • settings.xml - 12KB
  • styles.xml - 20KB
  • thumbnail.pdf - 72KB
  • thumbnail.png - 32KB

so it looks as though stripping out the thumbnails would knock 100KB off the file size.
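
A quick tally of the figures above, as a small Python sketch. The sizes are the unzipped values listed in this post; how much of the 112KB zipped file the thumbnails account for depends on how well each part compresses, which this sketch does not measure.

```python
# Unzipped subfile sizes in KB, as listed above.
sizes_kb = {
    "content.xml": 44,
    "layout-cache": 4,
    "manifest.xml": 4,
    "meta.xml": 4,
    "mimetype": 4,
    "settings.xml": 12,
    "styles.xml": 20,
    "thumbnail.pdf": 72,
    "thumbnail.png": 32,
}

total_kb = sum(sizes_kb.values())
thumbnails_kb = sum(size for name, size in sizes_kb.items() if name.startswith("thumbnail"))
share = 100 * thumbnails_kb / total_kb

print(f"total: {total_kb}KB, thumbnails: {thumbnails_kb}KB ({share:.0f}% of the unzipped size)")
# prints "total: 196KB, thumbnails: 104KB (53% of the unzipped size)"
```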

I guess that for documents containing a full page of text/graphics or more, the size of the thumbnails is pretty much the same whatever the complexities of the overall document, so this probably starts to make less of a difference with larger file sizes. [Cross posted with Patrick on this - I had overlooked the impact of embedding fonts - so a multi-font first page is likely to generate larger thumbnails and larger files overall.]

But you might get an anomalous result with a document with a blank first page. Yes, just checked: blah.odt is 16KB, while blah.odt with a page break inserted before "blah" is 12KB :)

So you might be able to do a quick and dirty file size reduction simply by adding a blank first page to your .odt document.
ovvldc
Captain Naiobi


Joined: Sep 13, 2004
Posts: 2352
Location: Zürich, CH

Posted: Tue Jan 27, 2009 1:57 pm

Also, the thumbnail is important for small files, but doesn't make as much of a difference for larger ones, especially when you insert images after the first page.

I understand that 100kB extra can make some impact, but I would also contend that while this used to be 1/3 of a floppy disk, it is now just a minute fraction of the storage capacity of a USB flash drive.

Best wishes,
Oscar

_________________
"What do you think of Western Civilization?"
"I think it would be a good idea!"
- Mohandas Karamchand Gandhi
ahneobugs
Sentinel


Joined: Jan 16, 2008
Posts: 20

Posted: Fri Feb 13, 2009 12:40 pm

File size is an issue. An extra 100k may not be a big deal for a few dozen documents, but once you get into larger numbers, it does become significant.

For instance, it can mean that backing up takes significantly longer and incremental backups will require new media 5-10x as often (compared to WordPerfect for Mac).

It would be great if this feature could be turned off. Personally, I would sacrifice the Quick Look preview for space efficiency.
OPENSTEP
The One


Joined: May 25, 2003
Posts: 4752
Location: Santa Barbara, CA

Posted: Fri Feb 13, 2009 12:54 pm

You can get a 1.5 TB drive for less than $150. My computations may be wrong, but 100K is worth about $0.000009, or roughly a thousandth of a cent.
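
As a rough check of that estimate, the arithmetic can be sketched in Python as below. It assumes exactly $150 for the drive and binary units (1 TB = 1024^4 bytes, 100K = 100 x 1024 bytes). Both are assumptions, since the post only says "less than $150".

```python
# Assumed inputs: $150 for a 1.5 TB drive, binary units throughout.
drive_price_usd = 150.0
drive_bytes = 1.5 * 1024**4
extra_bytes = 100 * 1024

# Price per byte times the number of extra bytes.
cost_usd = drive_price_usd * extra_bytes / drive_bytes
cost_cents = cost_usd * 100

print(f"{cost_usd:.7f} dollars = {cost_cents:.5f} cents")
```

Under those assumptions the extra 100K costs on the order of a thousandth of a cent per document.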

This functionality is beneficial for the majority of users and will not be turned off. You can always edit OpenDocument files from any application manually using the command line zip utilities.
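
For anyone who prefers a script to the zip utilities, here is a minimal Python sketch of that kind of manual edit. The helper name `strip_entries` and the in-memory sample are hypothetical, and the sketch makes two assumptions worth testing on a copy first: it does not touch META-INF/manifest.xml, which may still mention the removed subfile, and NeoOffice will presumably write the thumbnail again the next time the document is saved.

```python
import io
import zipfile


def strip_entries(src, dst, unwanted=("Thumbnails/thumbnail.pdf",)):
    """Copy an ODF package from src to dst, leaving out the unwanted entries."""
    with zipfile.ZipFile(src) as old, zipfile.ZipFile(dst, "w") as new:
        for info in old.infolist():
            if info.filename in unwanted:
                continue
            # ODF expects "mimetype" to be stored uncompressed; deflate the rest.
            method = zipfile.ZIP_STORED if info.filename == "mimetype" else zipfile.ZIP_DEFLATED
            new.writestr(info.filename, old.read(info.filename), compress_type=method)


# Tiny stand-in package so the sketch runs without a real document.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as archive:
    archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
    archive.writestr("content.xml", "<text>blah</text>")
    archive.writestr("Thumbnails/thumbnail.pdf", b"%PDF-1.4 placeholder")

slimmed = io.BytesIO()
strip_entries(original, slimmed)

with zipfile.ZipFile(slimmed) as archive:
    print(archive.namelist())
```

With real files, the same helper accepts paths, for example `strip_entries("report.odt", "report-slim.odt")`, where both file names are placeholders.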

ed
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Wed Feb 25, 2009 12:55 pm

FYI. I have moved the posts concerning how to disable NeoOffice's Quick Look plugin in the Finder to the following new topic:

https://trinity.neooffice.org/modules.php?name=Forums&file=viewtopic&t=7051

Patrick
All times are GMT - 7 Hours
All logos and trademarks in this site are property of their respective owner. The comments are property of their posters, all the rest © Planamesa Inc.
NeoOffice is a registered trademark of Planamesa Inc. and may not be used without permission.