Posted: Sun Nov 09, 2008 2:26 am Post subject: XML support in v3
I've been using a spreadsheet app on my iPhone, curiously called Spreadsheet by Softalk. They've just released v1.0.2, which amongst other things fixes 'importing XML files from NeoOffice'. I've tested this and found that while Neo v2.2.5 makes XML docs that import correctly, OOo v3 (and therefore Neo v3) XML docs are different and do not parse correctly.
I've contacted Softalk about this, but I'm wondering whether you know if the changes in XML are to do with some newer XML spec, or whether it's a bug in the OOo v3 code that's made this difference (in which case NeoOffice might be able to reimplement compatibility?)
Here's a zip file which contains a spreadsheet called 'fuel reckoner' in various formats - NeoOffice v3 .ods, NeoOffice v3 .xls (Excel 2003), and XML files exported from Excel 08, OOo v3, and Neo v2.2.5.
I know this isn't high on your list of priorities, but it'd be very nice to add 'NeoOffice v3 compatibility' to their list of supported apps, especially since OOo isn't on there yet.
- padmavyuha
*edit* one difference that's obvious between the XML in the Neo2 and the OOo3 version is this:
This is obviously a different XML spec, now that I look at it, so I've asked the Softalk people whether they're going to be able to support it, given that lots of folk now are using OOo3, and will be using Neo3 too.
What do you mean different? The keys in the header are in a different order, but I can't spot any difference in the contents. Or am I just about ready for bed now?
-Oscar _________________ "What do you think of Western Civilization?"
"I think it would be a good idea!"
- Mohandas Karamchand Gandhi
The keys are the same and just in a different order. All reasonably well-supported XML parsers should handle that type of change without any problem.
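To illustrate the point about ordering, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The two sample documents are made up for this example (they are not the files from the zip); they differ only in the order of the namespace declarations and attributes, and a conforming parser yields the same elements and attributes for both.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: same content, different declaration
# and attribute order in the header and on the Worksheet element.
DOC_A = (
    '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" '
    'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" '
    'xmlns:x="urn:schemas-microsoft-com:office:excel">'
    '<Worksheet ss:Name="Sheet1" ss:Protected="0"/>'
    '</Workbook>'
)
DOC_B = (
    '<Workbook xmlns:x="urn:schemas-microsoft-com:office:excel" '
    'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" '
    'xmlns="urn:schemas-microsoft-com:office:spreadsheet">'
    '<Worksheet ss:Protected="0" ss:Name="Sheet1"/>'
    '</Workbook>'
)


def summarize(xml_text):
    """Return (tag, attributes) for every element, in document order."""
    root = ET.fromstring(xml_text)
    # dict comparison ignores key order, which mirrors how XML treats attributes
    return [(el.tag, dict(el.attrib)) for el in root.iter()]


same = summarize(DOC_A) == summarize(DOC_B)
print(same)
```

If an importer treats these two documents differently, the problem is in the importer, not in the order of the header keys.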
Note that this is not a NeoOffice problem. The XML that you are showing is Microsoft Office XML. That file format was defined by Microsoft Office and two different versions of that XML specification were approved: one by ECMA and one by the ISO.
Seems to me that this third-party tool is not properly handling the newer ISO version of Microsoft Office XML files. NeoOffice 2.2.5 uses odf-converter version 1.1, which generates the ECMA-approved Office XML version, and NeoOffice 3.0 Early Access builds (OpenOffice.org 3.0 does not support saving as Office XML) use odf-converter version 2.0, which generates the ISO-approved Office XML version.
In other words, NeoOffice 3.0 Early Access will support the latest Office XML specification more completely than NeoOffice 2.2.5 does. So if there are new XML tags and attributes that a third-party tool does not recognize, then their Office XML support needs to add support for those new tags and attributes.
Oh, sorry about the header keys, I wasn't looking closely enough.
Okay, I get it about the ECMA vs. ISO versions, I'll get on to the Softalk developers about supporting both.
However, I don't understand what you mean when you say 'OpenOffice.org 3.0 does not support saving as Office XML', since in its Save As file format dropdown list it clearly includes Microsoft Excel 2003 XML (.xml), just as Neo v3 EA does. I've created and saved the same spreadsheet in both and they produce identical XML docs... (pauses to check what version of OOo he is using) - oh, I'm using a 3.1 DEV version - still, we know it will be supported in the next update of OOo (as well as in Neo v3 from the get-go), so that's a good bargaining chip to negotiate with Softalk about this.
Thanks for explaining this to me.
- padmavyuha
*edit* I've pitched it to them as potentially broadening their user base massively - folk who don't have to fork out for a copy of MS Office can afford an iPhone, and their $10 app too.
Last edited by yoxi on Sun Nov 09, 2008 4:25 pm; edited 1 time in total
Actually, ignore my last post as I did not look closely enough at your XML snippets.
Your snippets indicate that these are Office 2003 XML files. The Office 2003 XML export code was written by Sun's OpenOffice.org engineers.
While the Cell tag is defined here, the contents of its Formula attribute are not a public standard and so Sun's OpenOffice.org engineers have been slowly reverse engineering what valid values can go into that attribute.
If you think that OpenOffice.org's engineers are putting the wrong values in this attribute, you should file a bug against OpenOffice.org 3.0. However, just because software application X does not support it does not necessarily mean that it is a bug in OpenOffice.org.
Now I'm confused again. In all of my examples (Neo v2, Neo v3 EA, OOo v3.1 DEV) the apps state that they're saving in Office Excel 2003 XML format - but Neo v2.2.5 produces different output from the other two. So you're saying this is nothing to do with odf-converter versions, or with ECMA/ISO issues? Halp!
- padmavyuha
Oh, and I've just checked, and OOo v3.0 does support Office Excel 2003 XML export. I guess then you were talking about the 2007/2008 .xlsx format not being supported?
Last edited by yoxi on Sun Nov 09, 2008 4:43 pm; edited 1 time in total
Yes. odf-converter is only used for saving Office 2007 XML and we only use odf-converter for that format because OpenOffice.org does not have any code to export to Office 2007 XML.
Microsoft creates a new file format with every major release. Sun's OpenOffice.org engineers then try to reverse engineer them. The Office 2003 XML format was a proprietary format just like the older Office formats.
So, as of OpenOffice.org 2.2.1, Sun had figured out many of the possible values for converting to Office 2003, but they have probably received many bug reports since 2.2.1 was released regarding incorrect exported values, and included fixes for such bugs in OpenOffice.org 3.0.
Okay, getting there. The issue then is a dramatic change between OOo v2 and v3's cell definition XML code. Softalk need to find out from someone over there what the differences are (and why) if they want to make their XML importer handle OOo v3 Excel 2003 XML files.
This is obviously not a bug, so have you any suggestions about how/who I might contact where over at OOo to ask about this?
Close, but not quite. OOo v3 has formulas that are more compliant with what Microsoft Office expects to be in those cells. I highly doubt the OOo engineers created any new formulas as then Office would not be able to understand them. Instead, most likely OOo users found that OOo 2.2.1's code was writing some (or maybe even all) formulas incorrectly.
Hmm - the difference seems to be between relative and absolute cell referencing. The earlier version references cell positions relative to the current cell, the later one uses letter/number cell referencing. But it also has these weird 'of:' bits added to the formula strings. It's just a completely different lingo. I hope Softalk can find documentation on this.
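For an importer that has to cope with both flavours, a rough heuristic for telling them apart might look like the sketch below. The function name, the regular expressions, and the sample formulas are all assumptions made for illustration; the only facts taken from this thread are the 'of:' prefix, the bracketed [.A1] references, and the relative R/C references.

```python
import re

# Relative/absolute R1C1 references such as RC, R[-1]C, R2C3, RC[-2].
# The lookarounds stop the pattern matching "RC" inside names like SEARCH.
R1C1_REF = re.compile(
    r'(?<![A-Za-z0-9_])R(\[-?\d+\]|\d+)?C(\[-?\d+\]|\d+)?(?![A-Za-z0-9_])'
)
# Bracketed references such as [.A1], [.$B$2] or [.A1:.B2].
ODFF_REF = re.compile(r'\[\.\$?[A-Z]+\$?\d+(:\.\$?[A-Z]+\$?\d+)?\]')


def formula_dialect(formula):
    """Guess which reference style a Formula attribute value uses."""
    if formula.startswith('of:') or ODFF_REF.search(formula):
        return 'odff'
    if R1C1_REF.search(formula):
        return 'r1c1'
    return 'unknown'


print(formula_dialect('=R[-1]C*RC[-2]'))    # relative R/C style
print(formula_dialect('of:=[.B1]*[.A2]'))   # 'of:' prefix, bracketed refs
print(formula_dialect('=1+2'))              # no references at all
```

This is only a guess based on the shape of the string; a formula with no cell references cannot be classified this way.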
An interesting update: this is, in fact, an OOo bug! The deal is that somewhere between OOo v2 and v3, they changed the Office Excel 2003 XML formula export from the R1C1 schema (which MS Excel supports) to the ODFF bracketed [.A1] schema, which therefore breaks compatibility with MS Excel.
I have posted this as a bug now, as I've tried importing these v3 XML files into Excel 2008 on my Mac, and they throw up an error message and all the formulae get replaced with the calculated values.
It would make sense to have the ODF XML export format available as an option, along with all the other ODF file types - but to break the Excel compatibility is a bit stupid.
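As a rough illustration of the gap between the two notations, the sketch below rewrites simple bracketed references into R1C1 form relative to the cell that holds the formula. It is a sketch under stated assumptions, not what OOo or Excel actually do internally: the helper names are hypothetical, the sample formula is invented, and only single-cell references on the same sheet are handled. Ranges, sheet-qualified references, and other syntax differences (function names, argument separators) are left untouched.

```python
import re

# Single-cell bracketed reference, e.g. [.B1] or [.$A$2]
ODFF_CELL = re.compile(r'\[\.(\$?)([A-Z]+)(\$?)(\d+)\]')


def col_to_index(letters):
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord('A') + 1)
    return n


def _axis(prefix, target, current, absolute):
    """Format one half (row or column) of an R1C1 reference."""
    if absolute:
        return f'{prefix}{target}'
    offset = target - current
    return prefix if offset == 0 else f'{prefix}[{offset}]'


def odff_to_r1c1(formula, row, col):
    """Rewrite [.A1]-style refs as R1C1, relative to the cell at (row, col)."""
    body = formula[3:] if formula.startswith('of:') else formula

    def repl(match):
        col_abs, letters, row_abs, digits = match.groups()
        return (_axis('R', int(digits), row, bool(row_abs))
                + _axis('C', col_to_index(letters), col, bool(col_abs)))

    return ODFF_CELL.sub(repl, body)


# Formula stored in cell C3 (row 3, column 3)
print(odff_to_r1c1('of:=[.B1]*[.$A$2]', row=3, col=3))  # =R[-2]C[-1]*R2C1
```

The conversion needs to know which cell the formula lives in, because unanchored A1-style references become offsets in R1C1. That is one reason a plain search-and-replace on the exported file would not be enough.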
Wow. That is kind of scary that this kind of change was made. I did not even notice that they were using ODF formulas but now after you found it, it seems so obvious.
Thanks for tracking this down and filing a bug with OOo. Hopefully they will fix this bug in OOo 3.0.1 so that we can get the fix before Neo 3.0 Early Access goes out in January.
I'm glad I was able to pin this down; I'm interested to see what the qa geezers make of this. I can't believe they'd want to break Excel compatibility like that on purpose, but maybe they've got some conscious reason and won't change it back.
Joined: Apr 05, 2009 Posts: 2 Location: Fort Wayne, Indiana
Posted: Sun Apr 05, 2009 12:19 pm Post subject: Why was 'Use R1C1' dropped from Release 3?
When I upgraded to Release 3 and opened a spreadsheet, Format:Sheet:Use R1C1 is missing! Since I use this format, I reverted back to the previous release until this is fixed. Does anyone know why this format would be dropped in the latest release? There seems to be nothing in the release notes about this change. I'm sure many others have noticed this, as well. I did not try to open an existing spreadsheet to see if the R1C1 format would be accepted.