Welcome to NeoOffice developer notes and announcements
NeoOffice
Developer notes and announcements
 
 

This website is an archive and is no longer active
NeoOffice announcements have moved to the NeoOffice News website


NeoOffice :: View topic - Database size
Database size
 
   NeoOffice Forum Index -> NeoOffice Releases
clac1
Red Pill


Joined: Jan 25, 2009
Posts: 5

PostPosted: Sun Jan 25, 2009 1:01 am    Post subject: Database size

Hello,
I created a database with 5 tables, relationships, and 4 forms (total size = 110 MB).
In order to make structural modifications to the biggest table, I copied and pasted it several times.
The result is that the database size increased a lot (total size is now 650 MB). So I dropped the copied tables, but the database size remains the same...

And now the performance is very, very bad...
How can I reduce the size in order to speed up response time?

Thanks in advance for your help!
narf
The Anomaly


Joined: Jan 21, 2007
Posts: 1075

PostPosted: Sun Jan 25, 2009 10:28 am    Post subject:

I doubt the file size itself affects performance. The most likely cause of slow performance is how the tables are defined and what queries are run against them.

Can you give us some more information about the table structure, relationships, and queries you are using, so we can try to understand the problem a little better? The best thing to do is probably to take a closer look at your table structure and queries and see if there are any obvious performance problems.

Specifically, does each table have a primary key field? What is the name and type of the primary key field?

Do you have queries that join (or link) more than one table? What fields are they joined on?

I ask because if tables are joined without primary keys, then for each record in the first table the engine has to look through every record in the next table, which is like going through all the records in table 1 multiplied by all the records in table 2. Each additional table queried without a key multiplies the work again, so the cost grows very quickly.
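The cost described above can be sketched with a rough back-of-the-envelope comparison (the table sizes and the dictionary "index" are made up purely for illustration; this is not how HSQLDB is implemented):

```python
# Rough illustration: joining two tables without a usable key means a
# nested-loop scan, i.e. every row of A compared against every row of B.
rows_a, rows_b = 1_000, 1_000

# Without a key: compare every pair of rows.
comparisons_without_key = rows_a * rows_b

# With a primary key, the engine can probe table B directly, roughly one
# lookup per row of table A (index build cost aside).
index_b = {i: ("row", i) for i in range(rows_b)}  # stand-in for a PK index
comparisons_with_key = sum(1 for i in range(rows_a) if i in index_b)

print(comparisons_without_key)  # 1000000
print(comparisons_with_key)     # 1000
```

So two 1,000-row tables joined without keys cost on the order of a million comparisons, and each extra keyless table multiplies that again.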

--fran
clac1
Red Pill


Joined: Jan 25, 2009
Posts: 5

PostPosted: Sun Jan 25, 2009 11:50 am    Post subject:

Thanks for your help!
Yes, every table has a primary key.
My performance problem appears when opening the DB, when saving it, when opening the main table, when closing it...

I created a new empty DB and copied/pasted all tables and forms. After that, I rebuilt the relationships, and the size is very reasonable (less than 100 MB).
But if I must repeat those operations on a regular basis, it'll be tedious!
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

PostPosted: Sun Jan 25, 2009 1:07 pm    Post subject:

clac1 wrote:
Thanks for your help!
Yes, every table has a primary key.
My performance problem appears when opening the DB, when saving it, when opening the main table, when closing it...

I created a new empty DB and copied/pasted all tables and forms. After that, I rebuilt the relationships, and the size is very reasonable (less than 100 MB).
But if I must repeat those operations on a regular basis, it'll be tedious!


Since an .odb file is a file database, adding, deleting, or editing tables or records may cause the table data to eventually become fragmented. Fragmented means that the table data is scattered throughout the .odb file and is not stored in sequence, so querying the tables forces the database engine - HSQLDB - to jump back and forth to different points in the file.

Realistically, a file database like HSQLDB is designed for small data sets, and I suspect that a 100 MB database is close to the point where performance in any file database is going to drop off sharply.

Since you have 100 MB of data, I assume that reconstructing that data if it were lost or damaged would be a major effort. If so, I would recommend that you move to a true relational database like MySQL. MySQL is a free database that runs on the Mac and that NeoOffice Base can connect to.

We use MySQL as the database for this site and our Bugzilla site, primarily because MySQL's performance does not degrade as badly as file databases' does with large data sets. Also, MySQL runs as its own server on your machine and, because of this, is much better at avoiding fragmentation and accidental database deletion or corruption.

Patrick
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

PostPosted: Sun Jan 25, 2009 5:14 pm    Post subject:

pluby wrote:
Since an .odb file is a file database, adding, deleting, editing tables or records in tables may cause the database table data to eventually become fragmented....


I think my explanation of fragmentation in the database was not very good, so I will try to explain it a little better.

Basically, your .odb file is a big zip file. Each table, query, and form is stored as a compressed subfile within the zip file. Since Mac OS X files do not allow inserting new data or deleting existing data in the middle of a file, when you change the data in a table, the changed data is appended onto the end of the zip file and the old data remains in the middle of the file as dead bytes. As a result, if you have large tables, making changes to those tables will cause your .odb file to quickly grow to a huge size.
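The dead-bytes effect can be illustrated with Python's zipfile module (a sketch only, not NeoOffice's actual code; the file names and sizes are made up). Writing a second, smaller version of an entry leaves the first version's bytes in the archive; rebuilding the archive from its live entries drops them:

```python
# Illustration: an archive holding a stale copy of an entry stays large
# until it is rebuilt from only the live (latest) entries.
import os
import zipfile

def repack(src, dst):
    """Copy only the latest version of each entry into a fresh archive."""
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for name in dict.fromkeys(zin.namelist()):  # dedupe, keep order
            zout.writestr(name, zin.read(name))     # read() returns the last copy

with zipfile.ZipFile("bloated.zip", "w") as z:
    z.writestr("table.bin", os.urandom(50_000))  # original table data
    z.writestr("table.bin", os.urandom(1_000))   # "edited" table, appended

repack("bloated.zip", "compact.zip")
print(os.path.getsize("bloated.zip"), ">", os.path.getsize("compact.zip"))
```

The bloated archive stays around 51 KB even though only 1 KB of it is live data; the repacked copy shrinks to roughly the live data's size. That is essentially what the manual unzip/rezip procedure below does for an .odb file.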

While I still recommend that you migrate your database to a more scalable database like MySQL, there is a manual way to strip out the old unused bytes in your .odb. However, it requires use of the "unzip" and "zip" commands in the Terminal application.

If you are comfortable with using the Terminal application and using Unix commands, let us know and I can post the commands.

Patrick
Lorinda
Captain Mifune


Joined: Jun 20, 2006
Posts: 2051
Location: Midwest, USA

PostPosted: Mon Jan 26, 2009 6:26 am    Post subject:

With the database in question open, go to the Tools menu and select SQL...
In the "command to execute" box of the window that opens, type

Code:
CHECKPOINT DEFRAG


Then hit the Execute button.

This should shrink the database size.

Lorinda
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

PostPosted: Mon Jan 26, 2009 8:23 am    Post subject:

Lorinda wrote:
With the database in question open, go to the Tools menu and select SQL...
In the "command to execute" box of the window that opens, type

Code:
CHECKPOINT DEFRAG


Then hit the Execute button.

This should shrink the database size.


Sorry Lorinda, I had already tried that before my posts and it did not have any significant impact in my tests. Copying a table, deleting the table, and running that SQL command will still cause the .odb file size to grow.

That is when I realized that the issue isn't the actual size of the tables but is due to pushing those changes from memory back into the zip file. The zip file issue would also explain why clac1 is seeing really slow performance opening and saving the file, as opening causes unzipping of the big .odb file and saving causes zipping the changes and appending them onto the .odb file.

Patrick
clac1
Red Pill


Joined: Jan 25, 2009
Posts: 5

PostPosted: Mon Jan 26, 2009 8:39 am    Post subject:

Yes, I have no problem with the Terminal or Unix commands. And I'm very interested in your commands.
clac1
Red Pill


Joined: Jan 25, 2009
Posts: 5

PostPosted: Mon Jan 26, 2009 8:40 am    Post subject:

I forgot to mention that the big size is the result of storing images in a table.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

PostPosted: Mon Jan 26, 2009 8:57 am    Post subject:

clac1 wrote:
Yes, I have no problem with the Terminal or Unix commands. And I'm very interested in your commands.


Use the following steps:

1. Make a new, empty temporary folder using the following Terminal command:

Code:
mkdir -p "/path/to/new/temp/folder"


2. Unzip the .odb file in that temporary folder using the following commands:

Code:
cd "/path/to/new/temp/folder"
unzip "/path/to/odb/file"


3. Rezip the unzipped files in that temporary folder into a new .odb file (I highly recommend that you do not overwrite your existing .odb file):

Code:
cd "/path/to/new/temp/folder"
zip -r "/path/to/new/odb/file" *


Patrick
clac1
Red Pill


Joined: Jan 25, 2009
Posts: 5

PostPosted: Mon Jan 26, 2009 9:12 am    Post subject:

Very, very interesting!

The DB is 158 MB now.
The Terminal commands created a new file: 150 MB...
Running CHECKPOINT DEFRAG after that reduced the size to ... 20 MB!

That's fantastic!

Thanks a lot for your help!
Lorinda
Captain Mifune


Joined: Jun 20, 2006
Posts: 2051
Location: Midwest, USA

PostPosted: Mon Jan 26, 2009 9:18 am    Post subject:

pluby wrote:


Sorry Lorinda, I had already tried that before my posts and it did not have any significant impact in my tests. Copying a table, deleting the table, and running that SQL command will still cause the .odb file size to grow.

Patrick


Interesting! Checkpoint Defrag is the standard answer to these types of questions over at the OpenOffice.org Community forums, which is why I suggested it here. But obviously there's more to it than that in some cases.

Lorinda
Markk
Operator


Joined: Mar 15, 2007
Posts: 43
Location: Wisconsin US

PostPosted: Mon Jan 26, 2009 10:30 am    Post subject:

pluby wrote:

Basically, your .odb file is a big zip file. Each table, query, and form is stored as a compressed subfile within the zip file. Since Mac OS X files do not allow inserting new data or deleting existing data in the middle of a file, when you change the data in a table, the changed data is appended onto the end of the zip file and the old data remains in the middle of the file as dead bytes. As a result, if you have large tables, making changes to those tables will cause your .odb file to quickly grow to a huge size.


Is this true? Gack! No database I know of, including embedded ones like SQLite or Berkeley DB, works like this. Yes, there will be fragmentation from different-sized gaps, and growth because of logging, but updates and inserts generally make new versions inside reused records if they can, and the fragmented space is reused over time to a degree in all the other DBs I've used.

Storing pictures in the OO database sounds like it is pushing it, though. Although I guess that will become more common.

Checkpoint defrag or its equivalent (vacuuming, etc.) should solve this to some degree anyway; if it didn't, that would be odd, if it really was fragmentation you were causing.

I assumed that Neo keeps the database unzipped while it is open, so that access can be faster. I'll have to check that out! I can see why opening and closing would matter, because the compression/decompression at that time wouldn't care that space is unused, only that it is allocated.
pluby
The Architect


Joined: Jun 16, 2003
Posts: 11949

PostPosted: Mon Jan 26, 2009 10:42 am    Post subject:

Markk wrote:
Is this true? Gack! No database I know of, including embedded ones like SQLite or Berkeley DB, works like this. Yes, there will be fragmentation from different-sized gaps, and growth because of logging, but updates and inserts generally make new versions inside reused records if they can, and the fragmented space is reused over time to a degree in all the other DBs I've used.


I assume that some of the gaps will eventually get reused. However, in clac1's case, copied tables and lots of inserted images leave a pretty large block of data available for reuse.

As for using CHECKPOINT DEFRAG, I could not get the size of the .odb file ever to shrink after copying an existing table to a new one, deleting the new table, running CHECKPOINT DEFRAG, and saving the .odb file. No matter what I did, the .odb file grew. Maybe at some point the growth would stop as the unused zip file gaps were filled in, but I did not reach that point.

Patrick

 

Powered by phpBB © 2001, 2005 phpBB Group

All logos and trademarks in this site are property of their respective owner. The comments are property of their posters, all the rest © Planamesa Inc.
NeoOffice is a registered trademark of Planamesa Inc. and may not be used without permission.
PHP-Nuke Copyright © 2005 by Francisco Burzi. This is free software, and you may redistribute it under the GPL. PHP-Nuke comes with absolutely no warranty, for details, see the license.