Welcome to NeoOffice developer notes and announcements
This website is an archive and is no longer active
NeoOffice announcements have moved to the NeoOffice News website


NeoOffice :: View topic - Encoding for ".xls" files
alanterra
Agent


Joined: Jul 03, 2006
Posts: 10

Posted: Sun Jan 24, 2010 11:10 pm    Post subject: Encoding for ".xls" files

I am asking this question here because I would like to give a webmaster an answer that he can use to make his website more Mac/NeoOffice friendly.

Bajaflora.org serves data from the San Diego Natural History Museum Herbarium. Not all of the site is currently publicly available, but I hope that will change soon.

If I ask it to download some data, I get an ".xls" file. I am not sure if this is really a ".xls" file or in some other format with an ".xls" extension. See attached.

It appears to be in UTF-8, because if I open it in TextMate, the non-ASCII characters are interpreted correctly (e.g., San Pedro Mártir). But if I open it in NeoOffice, the non-ASCII characters are "all kerflooey".

What I need to know is, whose problem is this, and can it be easily solved (like with a byte order mark in the file, or a different extension)? I would like to tell the webmaster of Bajaflora.org "please do the following, so that the downloaded files can be easily opened by platforms other than Office/Windows."

Thanks!

Alan
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Sun Jan 24, 2010 11:38 pm    Post subject: Re: Encoding for ".xls" files

alanterra wrote:
If I ask it to download some data, I get an ".xls" file. I am not sure if this is really a ".xls" file or in some other format with an ".xls" extension. See attached.

It appears to be in UTF-8, because if I open it in TextMate, the non-ASCII characters are interpreted correctly (e.g., San Pedro Mártir). But if I open it in NeoOffice, the non-ASCII characters are "all kerflooey".


Your file is definitely not an Excel file. Instead, it is a UTF-8 encoded partial HTML file. Specifically, it contains an HTML table of data whose cells are encoded in UTF-8.
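For anyone who wants to check a download the same way, here is a minimal Python sketch of the distinction described above. It relies on the fact that binary .xls workbooks are OLE2 compound files and start with a fixed 8-byte signature; the function name and the list of HTML prefixes are assumptions made for illustration, not anything NeoOffice itself does.

```python
# 8-byte signature at the start of OLE2 compound files (binary .xls workbooks).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Hypothetical helper: a rough guess only, not a full format detector.
def sniff_spreadsheet_download(data: bytes) -> str:
    """Guess what a downloaded ".xls" file really contains."""
    if data.startswith(OLE2_MAGIC):
        return "binary-xls"
    head = data[:1024]
    # Skip an optional UTF-8 byte order mark, then leading whitespace.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    head = head.lstrip().lower()
    # Assumed prefixes for an HTML document or a bare table fragment.
    if head.startswith((b"<table", b"<html", b"<!doctype html")):
        return "html"
    return "unknown"
```

Reading the first kilobyte of the file in binary mode and passing it to the function is enough; a result of "html" means the file is markup with a misleading extension.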

All that needs to be changed for this file to open in Safari or Firefox is to change the file extension to .html and add the missing HTML tags at the beginning and end of the file.

A renamed and edited version of your file is attached.
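The attached file itself is not reproduced in this thread, so the following is only a sketch of the two steps just described: wrap the bare table in the usual "html", "head", and "body" tags and save the result with an .html extension. The function names are made up for illustration, and the charset meta tag in the head is an assumption about what a complete file would reasonably contain.

```python
from pathlib import Path

def wrap_html_fragment(fragment: str) -> str:
    """Wrap a bare HTML table fragment in a minimal, complete HTML document."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        # Assumption: declare the encoding explicitly, since the data is UTF-8.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "</head>\n"
        "<body>\n"
        f"{fragment.strip()}\n"
        "</body>\n"
        "</html>\n"
    )

def convert_download(src: Path) -> Path:
    """Write a wrapped copy of src next to it, using an .html extension."""
    dst = src.with_suffix(".html")
    fragment = src.read_text(encoding="utf-8")
    dst.write_text(wrap_html_fragment(fragment), encoding="utf-8")
    return dst
```

Calling convert_download(Path("export.xls")) on a hypothetical download named export.xls would leave the original untouched and write export.html beside it.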

Patrick
alanterra
Agent


Joined: Jul 03, 2006
Posts: 10

Posted: Sun Jan 24, 2010 11:41 pm    Post subject:

thank you
sardisson
Town Crier


Joined: Feb 01, 2004
Posts: 4588

Posted: Mon Jan 25, 2010 12:09 am    Post subject: Re: Encoding for ".xls" files

alanterra wrote:
If I ask it to download some data, I get an ".xls" file. I am not sure if this is really a ".xls" file or in some other format with an ".xls" extension. See attached.

Right, it's part of an HTML document (an HTML table) with an .xls extension.

alanterra wrote:
What I need to know is, whose problem is this, and can it be easily solved (like with a byte order mark in the file, or a different extension)? I would like to tell the webmaster of Bajaflora.org "please do the following, so that the downloaded files can be easily opened by platforms other than Office/Windows."

My guess is that there's a bug in the underlying OOo HTML import code where it doesn't do any character set detection and always assumes the character set is something other than UTF-8.

Luckily, it's easy enough to fix: just add a meta tag with charset information to the beginning of the HTML fragment. I added one and the document then imported just fine. (trinity really does not like me trying to post the content of the meta tag, even without the angle brackets... grr. I've spent the last 20 minutes trying to do so, with absolutely no success.)
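Since the tag itself could not be posted, here is a hedged Python sketch of the idea instead. The meta tag shown is the conventional http-equiv form and is an assumption, not necessarily the exact tag that was added; the helper name is also made up for illustration.

```python
# Assumed tag: the conventional way to declare UTF-8 in an HTML document.
META_CHARSET = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

def prepend_charset_meta(fragment: str) -> str:
    """Put a charset declaration at the start of an HTML fragment."""
    # Naive check: leave the fragment alone if it already declares a charset.
    if "charset=" in fragment.lower():
        return fragment
    return META_CHARSET + "\n" + fragment
```

A site generating these downloads could apply the same change on the server side, so every exported file carries its own encoding declaration.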

You might also be interested in this article section from the wiki: Opening auto-generated xls files in Calc

Smokey

_________________
"[...] whether the duck drinks hot chocolate or coffee is irrelevant." -- ovvldc and sardisson in the NeoWiki
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Mon Jan 25, 2010 7:12 am    Post subject: Re: Encoding for ".xls" files

sardisson wrote:
My guess is that there's a bug in the underlying OOo HTML import code where it doesn't do any character set detection and always assumes the character set is something other than UTF-8.


I don't think you can blame NeoOffice's underlying OpenOffice.org code for this. HTML files have a very standard way to specify what the character set is. Our website's pages use that standard way, but the attached file did not have any of the standard HTML "html", "head", or "body" tags. In other words, the HTML was very incomplete.

Patrick
All times are GMT - 7 Hours