Posted: Fri Jan 19, 2007 9:33 am Post subject: regular expression [:space:]
Does anybody know why, in Find & Replace, the regular expression [:space:] (which is also mentioned in the help) doesn't work on its own, but needs to be followed by a "+"?
I mean [:space:] doesn't work, while [:space:]+ does the job.
The same problem happens with OOo/X11 2.0.3 and 2.1.
I cannot use " " because this finds just a single space, while with [:space:]+ I'm able to find any length of spaces: it finds one space but also a run of several spaces.
I started looking into this because of another thread where one of our community wants to turn a text into a table. The text is arranged in columns divided by differing numbers of spaces, and I suggested finding [:space:]+, replacing each match with \t, and then doing the "tableization"...
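For readers who want to try the same idea outside the Find & Replace dialog, here is a minimal sketch in Python. It assumes the columns are separated by ordinary spaces only and that no cell contains a space itself; the helper name spaces_to_tabs and the sample rows are made up for illustration.

```python
import re

def spaces_to_tabs(line: str) -> str:
    """Replace every run of one or more spaces with a single tab."""
    # " +" means "one or more spaces", like [:space:]+ in the post above
    return re.sub(r" +", "\t", line.strip())

# Hypothetical sample data: columns separated by varying numbers of spaces
rows = [
    "Name     Qty   Price",
    "Apples   3     1.20",
]

converted = [spaces_to_tabs(row) for row in rows]
for row in converted:
    # Each row now splits cleanly into its cells
    print(row.split("\t"))
```

Once the text is tab-separated, a text-to-table conversion that splits on tabs should see one cell per column.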
Joined: Nov 21, 2005 Posts: 1285 Location: Witless Protection Program
Posted: Fri Jan 26, 2007 7:00 pm Post subject: Re: regular expression [:space:]
valterb wrote:
Does anybody know why, in Find & Replace, the regular expression [:space:] (which is also mentioned in the help) doesn't work on its own, but needs to be followed by a "+"?
I mean [:space:] doesn't work, while [:space:]+ does the job.
The same problem happens with OOo/X11 2.0.3 and 2.1.
Is it only my problem?
Valter
Ah HA!
I was re-reading the OOo2.0 Migration Guide (compiled 8-May-2006 - PDF 2.3MB) trying to understand why Valter was having so much trouble.
Well, after the 4th or 5th time I think I have the answer - refer to page 59.
Quote:
Regular expressions:
"Regular expressions" are significantly different in OOo from MSO's "Use wildcards". See Help > OpenOffice.org help > Index tab > and type in "regular expressions" then move to "Searching" and press Display. Some common examples are in Table 5. To use regular expressions, click the More Options button of the Find & Replace dialog and make sure the Regular expressions checkbox is checked.
On reopening the Find & Replace dialog, the Regular Expressions checkbox is always unchecked.
Quote:
Table 5. Sample regular expressions
Problem -> Search -> Replace
Replace multiple spaces with just one space. "[:space:]" finds both non-breaking spaces and normal spaces but not tabs. Type a normal space in the Replace field. -> [:space:]* -> normal space
Note: The asterisk "*" means any number of the preceding character. Where in MSO you might have just "*" the equivalent in OOo is ".*" because "." stands for any single character (like MSO's "?").
So there are two problems:
1. Difference between MSO and OOo on "*" (wild card vs multiple occurrences)
2. You have to use the "*" after [:space:] to find multiple occurrences.
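As a rough illustration of the Table 5 entry, here is a sketch in Python rather than OOo. It assumes, as the guide says, that the space class covers normal and non-breaking spaces but not tabs; SPACE_CLASS is a hand-built stand-in for [:space:], not an OOo feature. It also shows why "+" is the safer quantifier in a conventional engine: "*" can match zero characters, so it also "finds" the empty gaps between letters.

```python
import re

NBSP = "\u00a0"                  # non-breaking space
SPACE_CLASS = "[ " + NBSP + "]"  # stand-in for [:space:]: space or NBSP, no tab

def collapse_spaces(text: str) -> str:
    """Replace each run of spaces / non-breaking spaces with one normal space."""
    return re.sub(SPACE_CLASS + "+", " ", text)

sample = "one  two" + NBSP + NBSP + "three\tfour"
collapsed = collapse_spaces(sample)
print(repr(collapsed))  # the tab between "three" and "four" is left alone

# With "*" the pattern also matches empty strings, so Python inserts
# extra spaces where there were none.
star_result = re.sub(SPACE_CLASS + "*", " ", "a  b")
print(repr(star_result))
```

Whether OOo treats the zero-length matches the same way is not something this sketch can show; it only demonstrates the textbook meaning of the two quantifiers.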
I strongly suggest that anyone trying to do character replacement in OOo read, and reread, the "Regular expressions" section of the OOo2.0 Migration Guide.
I think there are less detailed sections in the other Guides (Getting Started, Writer (pg 51), etc) and HELP. I have not found a good reference page that lists ALL the "Regular expressions" ... yet! Suggestions?
I have been trying to find an answer and I think that this is what is causing problems. It's different than MS Office and some other programming languages. Guess we have to follow the OOo rules?!?
Time to add this extra information to the Wiki
It seems to be a ... popular question that requires more than a quick answer. sigh
Philip ( Can't code as well as others, but I can ... read up a storm! ) With the help of glasses _________________ Have you checked the NeoWiki Documentation Page for more answers?
http://neowiki.neooffice.org/index.php/Documentation_and_Related_Resources
includes User Guides, eBooks, Blogs, additional resource links, and much more!
Problem -> Search -> Replace
Replace multiple spaces with just one space. "[:space:]" finds both non-breaking spaces and normal spaces but not tabs. Type a normal space in the Replace field. -> [:space:]* -> normal space
Note: The asterisk "*" means any number of the preceding character. Where in MSO you might have just "*" the equivalent in OOo is ".*" because "." stands for any single character (like MSO's "?").
The last time I used MS Word I was a child, so I'm not conditioned by MS: during my Windows days (they were Win98 days) I used WordPerfect...
Anyway, [:space:]* and [:space:]+ seem to work the same way in the Search field within NeoOffice.
And if I use [:space:].* in the Search field, the whole sentence after the first space is replaced with what I enter in the Replace field.
Using [:space:]? replaces any single space with what I enter in the Replace field.
Does what you quote mean this?
Joined: Nov 21, 2005 Posts: 1285 Location: Witless Protection Program
Posted: Sun Jan 28, 2007 4:37 pm Post subject:
Valter,
I don't think it's a problem with English grammar. It's just that the definitions are not clear. I have had to read the "OpenOffice.org User Guide for 2.0, page 55" 3 or 4 more times to notice the tiny differences.
OpenOffice.org User Guide wrote:
* -- Finds zero or more of the character immediately in front of the "*".
For example,"Ab*c" finds "Ac", "Abc", "Abbc", "Abbbc", and so on.
+ -- The character before this symbol must appear at least once:
"AX+4" finds "AX4", "AXX4", but not "A4".
? -- Finds zero or one of the characters in front of the "?".
For example, "Texts?" finds the words "Texts" and "Text".
So I think your examples are covered by the User Guide Examples.
It also explains why "+" and "*" seem to work the same, the only difference being that "*" also allows zero occurrences.
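The three User Guide examples can be checked in any conventional regex engine. Here is a small sketch using Python's re module; the helper name hits and the extra non-matching candidates are added for illustration only.

```python
import re

def hits(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

# "*" : zero or more of the preceding character
star = hits(r"Ab*c", ["Ac", "Abc", "Abbbc", "Abx"])

# "+" : one or more of the preceding character
plus = hits(r"AX+4", ["AX4", "AXX4", "A4"])

# "?" : zero or one of the preceding character
optional = hits(r"Texts?", ["Text", "Texts", "Textss"])

print(star)      # ['Ac', 'Abc', 'Abbbc']
print(plus)      # ['AX4', 'AXX4']
print(optional)  # ['Text', 'Texts']
```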
Valterb wrote:
And if I use [:space:].* in the Search field the whole sentence after the first space is replaced with what I enter in the Replace field.
The [:space:] IS the character before the "*", so you don't have to use the '.' before the "*".
That would be Space-ANYTHING-zero or more times - hence the whole sentence gets replaced - an error.
So [:space:]+ IS the proper Regular Expression for: SPACE must appear one or more times.
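The same effect shows up in Python's re module, which makes it easy to compare the two patterns side by side. This is only a sketch: a literal space stands in for [:space:], and the sample sentence is made up.

```python
import re

sentence = "The  quick brown fox"  # note the double space after "The"

# Space followed by ".*": the dot matches ANY character, and "*" is greedy,
# so the match runs from the first space to the end of the line.
greedy = re.sub(r" .*", "#", sentence)

# Space followed by "+": only runs of spaces are matched.
runs = re.sub(r" +", "#", sentence)

print(greedy)  # The#
print(runs)    # The#quick#brown#fox
```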
Thank you for bringing these details to our attention!
Philip ( didn't know any of these "tiny details" before researching these questions! )
Joined: Oct 24, 2005 Posts: 561 Location: Edinburgh, Scotland
Posted: Mon Jan 29, 2007 7:40 am Post subject:
Philip,
What you say is entirely correct; however, it's not the problem that Valter was having!
The problem is that the regular expression "[:space:]" on its own without a following *, + or ? finds exactly nothing!
For example, given the text
Code:
Is Fred fat? Not all Freds are thin.
Now search for the string "Fred" - it will be found twice, once in Fred and again in Freds
Now search for the string "Fred " (note trailing space) - it will be found once in the first sentence
Now search for the string "Fred ?" - it will find "Fred " in the 1st sentence and "Fred" in the second.
All OK so far. But now if we replace the character " " with the regular expression "[:space:]" things go wrong...
Search for the string "Fred[:space:]" - matches nothing!
Search for the string "Fred[:space:]?" - matches the same occurrences as in the third example using " ".
Search for the string "Fred[:space:]f" - matches the "Fred f" in the first sentence.
So it's not possible to search for a single space using "[:space:]", unless you know what the character following it is.
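For comparison, here is how a conventional engine handles the same searches. This sketch uses Python's re module, where \s plays the role of the space class (an assumption for illustration: \s is broader than [:space:] as described in the guide, since it also matches tabs). A class at the end of the pattern matches without any trouble here, which suggests the behaviour above is specific to OOo.

```python
import re

text = "Is Fred fat? Not all Freds are thin."

plain = re.findall(r"Fred", text)         # both occurrences
with_space = re.findall(r"Fred ", text)   # only the first sentence
optional = re.findall(r"Fred ?", text)    # "Fred " and "Fred"

# Same searches with a whitespace class instead of a literal space
class_end = re.findall(r"Fred\s", text)   # class at the END of the pattern
class_mid = re.findall(r"Fred\sf", text)  # class followed by a character

print(plain, with_space, optional)
print(class_end, class_mid)
```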
Joined: Nov 21, 2005 Posts: 1285 Location: Witless Protection Program
Posted: Tue Jan 30, 2007 12:25 am Post subject:
Ahhhhhhh ... Shoot!
Thanks for the extra information amayze. (et al)
It seems that regular expression spaces (aka [:space:] ) are ... bad, evil, twisted.
(Who would want to look for a SINGLE space - anyWAY! )
The more I read, the more cornfused I become.
Check out the following article about OOo Regular Expressions.
Be sure to Check out the sections: Wildcards ("because REs, by nature, are greedy"), Special sets ([:alpha:], [:space:], [:digit:], etc), and especially Caveat.
Philip ( '(what|where|how)ever' )
\. "because REs, by nature, are greedy"? Who knew?!?
Joined: Oct 24, 2005 Posts: 561 Location: Edinburgh, Scotland
Posted: Tue Jan 30, 2007 6:18 am Post subject:
I think I've found the solution to this problem!
It would appear that any search term that ends with a special set ([:space:], [:digit:], [:alpha:] etc.) will fail to match anything. But if the special set is not at the end of the search term then it works as advertised.
The work around?
Make sure the special set is not at the end of the search term by enclosing the whole search term in parentheses, i.e.:
Searching for "[:space:]" will fail, however searching for "([:space:])" will correctly match exactly one space, and "(Fred[:space:])" will match "Fred ". Alternatively you could just put the special set in parentheses, so "Fred([:space:])" would also work.
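In a conventional engine the parentheses only create a group and do not change what is matched, so the workaround should be harmless if the behaviour is ever changed. A small sketch in Python, again using \s as an assumed stand-in for the space class:

```python
import re

text = "Is Fred fat? Not all Freds are thin."

def matched(pattern):
    """Return the full text of every match of the pattern in the sample."""
    return [m.group(0) for m in re.finditer(pattern, text)]

bare = matched(r"Fred\s")             # no parentheses
grouped_all = matched(r"(Fred\s)")    # whole term in parentheses
grouped_class = matched(r"Fred(\s)")  # only the class in parentheses

# All three patterns find exactly the same text
print(bare, grouped_all, grouped_class)
```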
Joined: Nov 21, 2005 Posts: 1285 Location: Witless Protection Program
Posted: Tue Jan 30, 2007 2:20 pm Post subject:
Bravo, BRAVO! ([:clap:]) ~ single "clap"
and ... here is another case where OOo has implemented something (REs) in a non-standard way. Accident or on purpose? This is bound to create a LOT of user problems and questions. sigh
Thank you Andy for your "extra" efforts to help resolve, or at least explain, this problem.
Philip ( objects to non-standard use of (non-) regular expressions. non&sense if you ask me! )