Welcome to NeoOffice developer notes and announcements
This website is an archive and is no longer active
NeoOffice announcements have moved to the NeoOffice News website


xml file formats and search by content
 
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Wed Mar 09, 2005 11:43 am    Post subject: xml file formats and search by content

Since yesterday I have been trying to get search by content to find particular files containing particular words. I noticed that it just would not find some files, though others were found. I got desperate and searched all possible topics on the net. But then I copied the text that Finder could not find into an MS Word document and saved it. Finder found it immediately. The same text saved by NeoOffice could not be found when searching by content. Now I stumbled across the help in NeoOffice, which talks about XML file formats and says that it actually uses some kind of compression like that of zip files. So I thought: maybe the content in this format is so deeply concealed that Finder just cannot get to it and therefore does not find the file? If that is so, is that a bug or is it a feature of OpenOffice?
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Wed Mar 09, 2005 12:01 pm    Post subject:

Now I experimented a bit more with this and found out that the file can be found if I enter the search words in the properties of the file. When I remove them and reset the properties, Finder again cannot find the file.

It also seems that a file saved by NeoOffice in .doc format becomes searchable by Finder.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Wed Mar 09, 2005 12:29 pm    Post subject:

The reason that Finder won't search the contents of Neo/J files is because Finder does not support searching of the OOo file format. Apple did put in searching of MS Office formats, but Apple has not bothered to make any of their tools handle the OOo file format.

Patrick
ovvldc
Captain Naiobi


Joined: Sep 13, 2004
Posts: 2352
Location: Zürich, CH

Posted: Wed Mar 09, 2005 2:47 pm    Post subject:

pluby wrote:
The reason that Finder won't search the contents of Neo/J files is because Finder does not support searching of the OOo file format. Apple did put in searching of MS Office formats, but Apple has not bothered to make any of their tools handle the OOo file format.


Is Spotlight going to work on compressed files? If it is, chances are good they can be searched without problem. The good thing about OOo is that it is (somewhat) human-readable text inside a zip, right? The ingredients should be about there..

Oscar

_________________
"What do you think of Western Civilization?"
"I think it would be a good idea!"
- Mohandas Karamchand Gandhi
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Wed Mar 09, 2005 2:53 pm    Post subject:

Yes, the OOo files are merely plain text XML files zipped up. In fact, you can manually unzip them using the following terminal command:

jar xvf <OOo file>

Patrick
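For readers without the jar tool, the same unzip step can be sketched in Python. This is a minimal sketch, not anything posted in the thread: the function names list_members and extract_text are made up for illustration, and it assumes the document text lives in a member called content.xml, which is where OOo 1.x and OpenDocument files normally keep it.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_members(path):
    """Return the names of the files stored inside the zip container."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def extract_text(path, member="content.xml"):
    """Return the text nodes of one XML member, joined by single spaces.

    Assumption: the member is well-formed XML. Joining with spaces keeps
    paragraphs apart, but a word split across formatting spans will show
    up with a space in the middle.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read(member))
    chunks = (chunk.strip() for chunk in root.itertext())
    return " ".join(chunk for chunk in chunks if chunk)
```

Calling list_members("letter.sxw") on a hypothetical file shows what the archive holds, and extract_text("letter.sxw") gives the plain text that a search tool would need to look at.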
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Wed Mar 09, 2005 4:05 pm    Post subject:

pluby wrote:
Yes, the OOo files are merely plain text XML files zipped up. In fact, you can manually unzip them using the following terminal command:

jar xvf <OOo file>


Can doing that somehow help me solve the search by content problem? I tend to have thousands of documents with texts and sometimes I have to find where a particular word is mentioned. If search by content is unavailable, it is a big drawback for me.
sardisson
Town Crier


Joined: Feb 01, 2004
Posts: 4588

Posted: Wed Mar 09, 2005 4:53 pm    Post subject:

Some time ago someone mentioned trying to get support for searching OOo files in certain search applications (other than the Finder). The end result was that the author of at least one of the programs agreed to look into supporting OOo files. If you do a search here, you'll probably find the thread and the names of the apps.

Ed has at least given some thought to writing a plugin for the forthcoming 10.4 Spotlight engine and the OpenDocument (OOo 2.0) format...whether he'll have the time to do so only he can answer.

Smokey

_________________
"[...] whether the duck drinks hot chocolate or coffee is irrelevant." -- ovvldc and sardisson in the NeoWiki
fabrizio venerandi
Guest





Posted: Thu Mar 10, 2005 12:25 am    Post subject:

If you have no problem with HD space or formatting, you can use the old StarOffice file format. I use the StarOffice format for big documents, for example, because opening a Calc file in NeoOffice format is too slow.
And OS X search by content does look into StarOffice files.


f.
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Thu Mar 10, 2005 9:13 am    Post subject:

Thank you all for the useful advice. Eventually I managed to find an open source program called docsearch which can search NeoOffice files and other kinds. The interface is not very friendly to the eye but at least it works.
sardisson
Town Crier


Joined: Feb 01, 2004
Posts: 4588

Posted: Fri Mar 11, 2005 11:31 pm    Post subject:

I can't find the old posts I was looking for; I think they probably got lost when Ed restored trinity after he came back from vacation one month. There's one thread where SOB noted he had requested the authors of EasyFind and SpeedSearch to add OOo/Neo doc support. But I remember a post where someone reported that some search author had agreed to add support to the next version of his product, and I can't find it now.

Here are some other relevant OOo links, just to have them in one place:
http://www.danielnaber.de/loook/
http://oootools.free.fr/fooox/
http://www.openoffice.org/issues/show_bug.cgi?id=14468

Smokey

_________________
"[...] whether the duck drinks hot chocolate or coffee is irrelevant." -- ovvldc and sardisson in the NeoWiki
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Sat Mar 12, 2005 10:50 am    Post subject:

I had an interesting discussion on this topic at the Apple Panther support page, and here are some workarounds that I found:
1) docSearcher, which I mentioned (it requires indexing and gives back search results that were not very comprehensible to me, but others might think differently). docSearch finds all documents in all formats that contain a particular string. The download page is here: http://www.brownsite.net/docsearch.htm
2) Someone posted a terminal search code (if that's what it is called) ( http://discussions.info.apple.com/webx?13@116.yy3YaI37RO4.927450@.68a8d1b9/6 ) which works quite fast, uses no indexing and gives a simple list of files containing the string. But it looks only at the XML file formats, so all the other files have to be searched separately. That person also said that it might be quite easy to write similar code in AppleScript and make it run from NeoOffice or Finder or whatever. Maybe there are people who want to take up this challenge and write a script? I have no knowledge of AppleScript and thus cannot do that myself.
3) At this link http://www.darwinwars.com/lunatic/bugs/oo_macros.html someone posted a set of macros, among which there is supposed to be one that does the search inside OpenOffice. The other macros worked fine for me but the search one made NeoOffice quit. I don't know if that was my mistake or the fault of the script. If someone else tries it and makes it work, let me know.
4) EasyFind did not work for me when I wanted to find the NeoOffice documents, which might mean that their promises to look at this problem remained just promises :(
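The kind of index-free search described in item 2 can be sketched in Python as well. This is not the terminal code from the Apple discussion board, which is not reproduced in this thread; it is a rough illustration under stated assumptions. The names EXTENSIONS, document_text and find_documents are invented here, the extension list is an assumption about which files count as OOo or OpenDocument documents, and the text is assumed to sit in content.xml inside each zip.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

# Assumed extensions: OOo 1.x (sxw, sxc, sxi) and OpenDocument (odt, ods, odp).
EXTENSIONS = (".sxw", ".sxc", ".sxi", ".odt", ".ods", ".odp")


def document_text(path):
    """Return all text nodes of content.xml concatenated without separators.

    Joining without separators keeps phrases intact when they cross
    formatting spans, at the cost of gluing neighbouring paragraphs together.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return "".join(root.itertext())


def find_documents(folder, phrase):
    """Return sorted paths of documents under folder whose text contains phrase."""
    needle = phrase.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            if not name.lower().endswith(EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            try:
                text = document_text(path)
            except (zipfile.BadZipFile, KeyError, ET.ParseError, OSError):
                continue  # not a readable OOo-style file; skip it
            if needle in text.lower():
                hits.append(path)
    return sorted(hits)
```

Something like find_documents(os.path.expanduser("~/Documents"), "budget") would list the matching files. Like the terminal code, it only covers the zipped XML formats, so .doc and other files still have to be searched separately.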
sardisson
Town Crier


Joined: Feb 01, 2004
Posts: 4588

Posted: Sun Mar 13, 2005 10:52 am    Post subject:

bezvardis wrote:
2) Someone posted a terminal search code (if that's what it is called) ( http://discussions.info.apple.com/webx?13@116.yy3YaI37RO4.927450@.68a8d1b9/6 ) which works quite fast, uses no indexing and gives a simple list of files containing the string. But it looks only at the XML file formats, so all the other files have to be searched separately. That person also said that it might be quite easy to write similar code in AppleScript and make it run from NeoOffice or Finder or whatever. Maybe there are people who want to take up this challenge and write a script? I have no knowledge of AppleScript and thus cannot do that myself.


There are a couple of Neo/J folks who could probably convert that into a nice AppleScript (Max, yoxi, me, that I know of). I unfortunately don't have time to look into it right now :( but if no one else has taken it on after next week, I should have some time then.

Smokey

_________________
"[...] whether the duck drinks hot chocolate or coffee is irrelevant." -- ovvldc and sardisson in the NeoWiki
Max_Barel
Oracle


Joined: May 31, 2003
Posts: 219
Location: French Alps

Posted: Sun Mar 13, 2005 5:32 pm    Post subject:

Smokey wrote:
I unfortunately don't have time to look into it right now :( but if no one else has taken it on after next week, I should have some time then.

Same here. To avoid duplicating effort, we should coordinate by posting in this thread before starting.
OPENSTEP
The One


Joined: May 25, 2003
Posts: 4752
Location: Santa Barbara, CA

Posted: Mon Mar 14, 2005 9:53 pm    Post subject:

FWIW I've had the idea to do a Spotlight filter for OOo docs for some time but have not had the time to do it. Most of my relevant research is in the NSFilter proposal on dashboardbuddha, and the bulk of the concept would apply to an OSS Spotlight plugin framework. For better or for worse, I tend to put bug fixing (or narrowing down bugs) at a higher priority level than Spotlight plugins as, well, the OS isn't even fully released to the public yet :)

ed
bezvardis
Keymaker


Joined: Dec 10, 2004
Posts: 89
Location: Latvia

Posted: Tue Mar 15, 2005 9:49 am    Post subject:

Gib Henry has posted a script (written by biovizer) that resulted from the discussion on this topic on one of the Apple discussion boards: http://trinity.neooffice.org/modules.php?name=Forums&file=viewtopic&t=1186
The script has some limitations, though (e.g., one cannot quit it easily while it runs, it only searches for one word or a phrase but not keywords, and it searches only OpenOffice files and no others, things like that). Otherwise it works pretty well and quite fast and looks nice. If anyone would like to improve it, that would be even more fantastic than the fact that biovizer wrote this piece :)
All times are GMT - 7 Hours