Find and Replace - help needed...
bryoman
Sentinel


Joined: Jan 07, 2009
Posts: 27
Location: Carlisle, England

Posted: Mon Mar 18, 2013 8:28 am    Post subject: Find and Replace - help needed...

I have been sent an RTF of a huge bibliography - 900 pages!!! - in which the industrious typist has tried so hard to FORMAT his work...

Yes - you guessed.

1) There is a return at the end of EVERY line on his page - so on any editing or reformatting we have multitudes of returns WITHIN the text for each separate entry.

2) There are (or were) multiple spaces here and there, at the start of lines, to 'push the text along' to where he wanted, instead of tabbing or indenting, as appropriate....

I'm working in Writer. The 2) above is easy enough to remedy with F & R, but I've not yet been able to sort the 1) problem.

The redeeming feature is that for **each new entry in his bibliography he has uppercased AND bolded AND underlined the author's name**.

Hence I should be able to PROTECT the essential returns (i.e. those occurring BEFORE an author's bolded and/or underlined name) - I'd perhaps replace them with a 'zxzx': but you'll maybe tell me a better (proper) way - THEN delete all other returns, THEN reinstate these essential returns. Job sorted. Hopefully.

But in spite of much fiddling about with Regular Expressions (combined with ordinary characters, or not) and Attributes and Format, and Wiki advice, and so on, I haven't managed to come up with the critical combinations for the Find and the Replace fields to achieve this - I just get 'Not found's for my efforts.

Must be easy, when you know the trick!

SOO grateful for your help.

(Have added a JPEG of the appearance in case my wording is unclear: a picture speaks a thousand words...)
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Mon Mar 18, 2013 11:46 am    Post subject: Re: Find and Replace - help needed...

Sorry, find and replace won't replace two consecutive lines with a single line (that is essentially what you are trying to do). Regular expressions work only on a single line at a time. You are going to have to edit the document manually.

Patrick
bryoman
Sentinel


Joined: Jan 07, 2009
Posts: 27
Location: Carlisle, England

Posted: Mon Mar 18, 2013 3:23 pm    Post subject: Re: Find and Replace - help needed...

pluby wrote:
Sorry, find and replace won't replace two consecutive lines with a single line (that is essentially what you are trying to do). Regular expressions work only on a single line at a time. You are going to have to edit the document manually.

Patrick


Oh, boo. That's bad news... :-o

The job probably won't get done, then - at least not by me!

Is there any other Mac WP app that might do the job, I wonder???? ;-)

Thanks,

JR
ovvldc
Captain Naiobi


Joined: Sep 13, 2004
Posts: 2352
Location: Zürich, CH

Posted: Mon Mar 18, 2013 3:25 pm    Post subject:

Perhaps the industrious typist used a citation manager. This would be likely if all the citations are properly formatted, because people make mistakes in such a long document even if they triple check.

If so, maybe you could ask the typist what they used and give an alternate template.

_________________
"What do you think of Western Civilization?"
"I think it would be a good idea!"
- Mohandas Karamchand Gandhi
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Mon Mar 18, 2013 4:38 pm    Post subject: Re: Find and Replace - help needed...

bryoman wrote:
Is there any other Mac WP app that might do the job, I wonder???? ;-)


I would follow the recommendation in ovvldc's post since, unfortunately, neither Ed nor I have any spare time to evaluate third-party software so we cannot offer any product evaluations or opinions. Unlike many other sites, these forums are not a "user to user" or "customer service" forum. Instead, since Ed and I are software engineers, the support services that we provide are largely fixing the critical crashing, hanging, and data loss bugs posted by our users. In return for the US$100 payment, we focus all our attention immediately on finding and fixing such critical bugs.

Patrick
bryoman
Sentinel


Joined: Jan 07, 2009
Posts: 27
Location: Carlisle, England

Posted: Mon Mar 18, 2013 5:00 pm    Post subject: Re: Find and Replace - help needed...

pluby wrote:
bryoman wrote:
Is there any other Mac WP app that might do the job, I wonder???? ;-)


I would follow the recommendation in ovvldc's post since, unfortunately, neither Ed nor I have any spare time to evaluate third-party software so we cannot offer any product evaluations or opinions. Unlike many other sites, these forums are not a "user to user" or "customer service" forum. Instead, since Ed and I are software engineers, the support services that we provide are largely fixing the critical crashing, hanging, and data loss bugs posted by our users. In return for the US$100 payment, we focus all our attention immediately on finding and fixing such critical bugs.

Patrick


Yes - I'll follow up the previous post... I DID put a winky after my pondering about other WP apps doing the job that NeoO couldn't do - I most certainly wasn't expecting you or Ed to spend any time answering my query in any serious way, and apologies if you thought I might be expecting that! :-)


JR.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Mon Mar 18, 2013 6:30 pm    Post subject: Re: Find and Replace - help needed...

bryoman wrote:
Yes - I'll follow up the previous post... I DID put a winky after my pondering about other WP apps doing the job that NeoO couldn't do - I most certainly wasn't expecting you or Ed to spend any time answering my query in any serious way, and apologies if you thought I might be expecting that! :-)


I did not assume that you expected us to spend time. Instead, I was trying to convey that we really don't use any office suites other than NeoOffice.

As software engineers, Ed and I rarely create complex documents. Probably 90% of our work is done in the Terminal application and when we do use office suite software, we usually only work with simple spreadsheets and rarely open any Microsoft Office documents. Ironically, we are only light users of our own product and we haven't had to work with any documents that required other products like Microsoft Office or Apple iWork.

Patrick
James3359
The Merovingian


Joined: Jul 05, 2005
Posts: 685
Location: North West England

Posted: Tue Mar 19, 2013 8:25 am    Post subject:

There may be a way forward through using a three-stage process to mark your 'protected' line breaks before stripping out all line breaks, and then reinserting line breaks at the points which are marked. I strongly recommend you make a copy of your file to work on, so that the original will be safe if something goes wrong.

So first of all you need to identify a character which does not appear in your text. Often the bullet point character • (Alt-8) works well for this. Be careful not to choose a character which means something as part of a regular expression.

Next you need to use Find and Replace to insert the chosen character at your 'protected' line breaks.

In NeoOffice choose Edit::Find and Replace (⌘-F). If the full Find and Replace dialog is not already showing then click the More Options button to reveal it. In the More Options part of the dialog, select the Regular Expressions check box and click the Format button. This brings up the TextFormat Search dialog. Select the Font tab and in the Typeface section select Bold. Then select the Font Effects tab and select Single from the Underlining drop down. Click OK. You have now told NeoOffice that you are looking for underlined and bolded text, and under the Search field in the Find and Replace dialog you will see the words 'Single underline, bold'.

Now to tell NeoOffice what to search for using regular expressions. In the Search field put the ^ character (which anchors the match at the start of the paragraph). NeoOffice is now looking for bolded underlined text at the beginning of a paragraph. Then put (.), that is, a period/full stop enclosed within normal brackets. The period/full stop character tells NeoOffice to find any single character; the () form a group, which tells NeoOffice to remember what it has found.

At this point you might like to test the find function. You should see that it will select characters which are bolded and underlined and at the beginning of a paragraph.

Now we want to tell NeoOffice to insert the marker character. In the Replace field insert • the bullet point (or whatever other character you are using), then $0. The $0 tells NeoOffice to insert the text it has just found, so it inserts the bullet point followed by whatever text it has just found.
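
For anyone who wants to see the same search and replace logic outside the dialog, here is a minimal Python sketch. It is only an illustration, not what NeoOffice does internally. Plain text carries no bold or underline attributes, so `is_entry_start` is a hypothetical stand-in for the Format filter, and `looks_like_author` simply assumes that an uppercased surname before the first comma marks a new entry. All function and variable names are made up for the example.

```python
import re

MARKER = "•"  # must not occur anywhere in the document

def mark_paragraph_starts(paragraphs, is_entry_start):
    """Prefix MARKER to every paragraph accepted by is_entry_start.

    is_entry_start is a hypothetical stand-in for the bold + underline
    attribute filter set up with the Format button.
    """
    first_char = re.compile(r"^(.)")  # same pattern as the Search field
    marked = []
    for paragraph in paragraphs:
        if is_entry_start(paragraph):
            # r"\g<0>" is Python's spelling of "the whole match" ($0 above)
            paragraph = first_char.sub(MARKER + r"\g<0>", paragraph, count=1)
        marked.append(paragraph)
    return marked

def looks_like_author(paragraph):
    # Assumption: an uppercased surname before the first comma marks an entry.
    head = paragraph.split(",")[0]
    return head.replace(" ", "").isalpha() and head.isupper()

paragraphs = ["SMITH, J.", "1990 A paper about", "mosses.", ""]
print(mark_paragraph_starts(paragraphs, looks_like_author))
```

Empty paragraphs are left alone because `^(.)` needs at least one character to match, which is also why the dialog search skips them.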

You may want to test this with two or three clicks of the replace button to check it is working and then click Replace All.

You should find now that every paragraph beginning with bolded single underlined text is now preceded by a bullet point.

All that remains is to strip out all the returns from your document, and then, finally, to replace all the bullet points with returns.

I hope this will work for you. I've devised it on the basis of the text shown in your post, but obviously not tested it on your actual file.
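
The whole three-stage idea (mark, strip, restore) can also be sketched on plain text in Python. This is a rough sketch under stated assumptions, not a tested recipe for the real file: the default `entry_start` pattern (an all-caps surname followed by a comma) is a guess standing in for the bold and underline search, `rejoin_entries` is a made-up name, and the sketch joins wrapped lines with a single space, whereas the procedure above replaces the returns with nothing.

```python
import re

MARKER = "•"  # must not occur anywhere in the text

def rejoin_entries(text, entry_start=r"^[A-Z][A-Z'\-]+,"):
    """Mark entry starts, strip every return, then restore the marked ones."""
    # Stage 1: protect the wanted breaks by marking the start of each entry.
    marked = re.sub(entry_start, MARKER + r"\g<0>", text, flags=re.MULTILINE)
    # Stage 2: strip every return (and padding spaces), joining with one space.
    joined = re.sub(r"[ \t]*\n[ \t]*", " ", marked).strip()
    # Stage 3: turn each marker back into a return.
    restored = re.sub(r" ?" + re.escape(MARKER), "\n", joined)
    return restored.lstrip("\n")

sample = (
    "SMITH, J.\n"
    "1990 A paper about\n"
    "   mosses.\n"
    "JONES, A.\n"
    "2001 Another.\n"
)
print(rejoin_entries(sample))
```

Stage 2 also swallows the spaces used to push text along at the start of lines, so both problems from the first post are handled in one pass in this toy version.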
bryoman
Sentinel


Joined: Jan 07, 2009
Posts: 27
Location: Carlisle, England

Posted: Tue Mar 19, 2013 3:32 pm    Post subject:



Ah! Many thanks for taking all this trouble, James3359!

This seems to be close to the ideas I had in my first post - 'protecting' the desired para breaks using the bold-underline, etc. - which I didn't manage to get to work. Your method involves several steps I've never come across before...

This DOES work nicely, and I can get each author's name on a separate line! Thank you.

I then hit the next problem (which I didn't really predict in my OP). After going through this procedure, each author's entry becomes concatenated into a single para (obviously), but in the original, most authors have several separate entries starting with a year, each entry thus a separate para.

No problem - I added •s to them too, with the Search as ^[1-2] which found any year starting a para.
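
As a side note, here is a small Python sketch of that year search. `^[1-2]` matches any paragraph whose first character is 1 or 2, which is not necessarily a year. The stricter four-digit pattern below is only an illustrative assumption about how the years are written, and the names are made up for the example.

```python
import re

MARKER = "•"

LOOSE_YEAR = re.compile(r"^[1-2]", re.MULTILINE)       # pattern from the post
STRICT_YEAR = re.compile(r"^[12]\d{3}", re.MULTILINE)  # assumption: four-digit years

def mark_years(text, pattern=STRICT_YEAR):
    """Put MARKER in front of every line that the pattern accepts."""
    return pattern.sub(lambda m: MARKER + m.group(0), text)

sample = "1987a First paper\n2 copies held\n2003 Second paper\n"
print(mark_years(sample, LOOSE_YEAR))
print(mark_years(sample, STRICT_YEAR))
```

With the loose pattern the line "2 copies held" is marked as well; the strict one leaves it alone, which may matter if continuation lines in the bibliography can start with a digit.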

However, the next step of stripping the para breaks seems to cause problems... With just $ in the Search and blank in Replace, Find and Replace worked as expected, BUT when I click on Replace ALL, NeoO seems to freeze... Is it just taking A Very Long Time about it, and I'm too impatient? Or has it hung? Or have I got something wrong in the procedure?

I've done the full routine on a much smaller sample of about 30 (out of the 900) pages, and that seems to work fine.

Finally, last step - weird: in starting to Replace the • with the para break, it takes about 5 seconds to replace a single instance when clicking on Replace, whilst clicking on Replace ALL, after a short pause replaces hundreds in a split second. (Should I have \n for that step?)

Grateful for your thoughts.

Jeremy.
ovvldc
Captain Naiobi


Joined: Sep 13, 2004
Posts: 2352
Location: Zürich, CH

Posted: Tue Mar 19, 2013 5:19 pm    Post subject:

Glad you got it (mostly) sorted.

If NeoOffice hangs, take a sample and post it for Patrick to have a look at.

It is possible that find-replace on such a large document is tricky because the computing resources needed do not scale linearly with the size.

For example (and this was a problem in the past for someone), NeoOffice might be trying to reflow your document after every replace it does. Which means that it needs to reflow 900 pages for every line ending. With so many pages and so many reflows triggered, that will take a while.

Here the quick solution would be to add a dozen or so hard page breaks in the document to limit the reflowing to a subsection of the document.

I am not sure this is the cause, but that is what popped into my head. Only a sample will tell what NeoOffice is actually doing.

pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Tue Mar 19, 2013 5:25 pm    Post subject:

ovvldc wrote:
If NeoOffice hangs, take a sample and post it for Patrick to have a look at.

It is possible that find-replace on such a large document is tricky because the computing resources needed do not scale linearly with the size.


FYI. A quick way to determine if NeoOffice is hanging or just slogging through a lot of processing is to open a new, empty document before you start the problematic search and replace.

If you can type text into the new, empty document, then NeoOffice is just slogging through a lot of processing. But if NeoOffice is hanging, you will not be able to type any text in the new, empty document while the search and replace is going. In the latter case, obtain a sample of the NeoOffice application using the steps in this NeoWiki article and attach the sample.

Patrick
James3359
The Merovingian


Joined: Jul 05, 2005
Posts: 685
Location: North West England

Posted: Wed Mar 20, 2013 12:40 am    Post subject:

Replace All: I do get a noticeable lag when I process a 13 page document in this way, so I expect you will get an even longer lag with a longer document.

900 pages is quite a large document and executing the replace may be quite demanding on memory. Also, when you strip out the line breaks you create a single paragraph document which may exceed the maximum paragraph size the software can handle. I believe OpenOffice.org, on which NeoOffice is based, has a maximum paragraph size of 64kB.

It may be better not to do the whole document at once, but to select chunks of pages at a time and use the 'Current selection only' checkbox, or just to allow a long time.
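
To make the chunking idea concrete, here is a hedged Python sketch that works on already-marked plain-text lines. It cuts only in front of a marked entry, so no entry is split, and it flags chunks that would exceed a size limit once their returns are stripped. The limit simply reuses the 64kB figure mentioned above (which I have not verified), and all names are hypothetical.

```python
MARKER = "•"
MAX_PARAGRAPH_CHARS = 64 * 1024  # assumption: the 64kB figure quoted above

def chunk_by_entries(lines, entries_per_chunk=200):
    """Yield lists of lines, breaking only in front of a marked entry start."""
    chunk, entries = [], 0
    for line in lines:
        if line.startswith(MARKER):
            if entries == entries_per_chunk:
                yield chunk
                chunk, entries = [], 0
            entries += 1
        chunk.append(line)
    if chunk:
        yield chunk

def joined_length(chunk):
    """Length of the single paragraph left after stripping the chunk's returns."""
    return sum(len(line) for line in chunk)

def oversized_chunks(lines, entries_per_chunk=200):
    """Indexes of chunks whose joined text would exceed MAX_PARAGRAPH_CHARS."""
    return [
        index
        for index, chunk in enumerate(chunk_by_entries(lines, entries_per_chunk))
        if joined_length(chunk) > MAX_PARAGRAPH_CHARS
    ]
```

If `oversized_chunks` reports anything, a smaller `entries_per_chunk` would be the obvious thing to try before stripping the returns.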

\n is what you need to insert breaks.

I don't know why the last step should be so fast :-o
James3359
The Merovingian


Joined: Jul 05, 2005
Posts: 685
Location: North West England

Posted: Wed Mar 20, 2013 12:22 pm    Post subject:

I've really gone further than I should on this topic. As Patrick has said above, the support services he and Ed provide are largely fixing the critical crashing, hanging, and data loss bugs posted by NeoOffice users.

It isn't appropriate for me to go on offering ideas which haven't been tested and are not likely to be something that NeoOffice can perform on such a massive document. This post lists a range of resources to help you tackle the challenge. I wish you well with it, but this forum should be left clear for critical crashing, hanging, and data loss bugs.

James
bryoman
Sentinel


Joined: Jan 07, 2009
Posts: 27
Location: Carlisle, England

Posted: Wed Mar 20, 2013 4:10 pm    Post subject:

James3359 wrote:
I've really gone further than I should on this topic. As Patrick has said above, the support services he and Ed provide are largely fixing the critical crashing, hanging, and data loss bugs posted by NeoOffice users.

It isn't appropriate for me to go on offering ideas which haven't been tested and are not likely to be something that NeoOffice can perform on such a massive document. This post lists a range of resources to help you tackle the challenge. I wish you well with it, but this forum should be left clear for critical crashing, hanging, and data loss bugs.

James


James, Patrick, ovvldc:

Well - I got the job done - although NeoO crashed once, the rest of the time it DID do each process OK - mainly fast, the very slow one being deleting all para breaks and making one huge para.

I did several hundred pages at a time, as suggested.

Very grateful indeed for your invaluable help, without which I'd not have got anywhere with this huge job.

I'm so sorry if I have been cluttering up the wrong forum for this - my ignorance - and thanks James for the link to that resources page, which I'll look at more closely, 'next time'.

Thanks again, and all the best,

Jeremy.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

Posted: Wed Mar 20, 2013 4:37 pm    Post subject:

bryoman wrote:
Well - I got the job done - although NeoO crashed once, the rest of the time it DID do each process OK - mainly fast, the very slow one being deleting all para breaks and making one huge para.


Can you reproduce the crash? While NeoOffice can be slow, it should not crash.

If you can reproduce the crash, would it be possible for you to attach the document and the find and replace steps that trigger the crash?

Patrick
All times are GMT - 7 Hours
Page 1 of 2

 