Posted: Sun Jan 25, 2009 1:01 am Post subject: Database size
Hello,
I created a database with 5 tables, relationships, and 4 forms (total size = 110 MB).
In order to make structure modifications to the biggest table, I copied and pasted it several times.
The result is that the database size increased a lot (total size is now 650 MB). So I dropped the copied tables, but the database size remains the same...
And now the performance is very, very bad...
How can I reduce the size in order to speed up the response time?
I doubt the file size affects performance. The most likely cause of slow performance is due to how tables are defined and what queries are used against them.
Can you give us some more information about your database so we can try to understand it a little better? Most likely the best thing to do is take a closer look at your table structure and queries and see if there are any obvious performance problems.
Can you give us some more information about the table structure, relationships, and queries you are using? Specifically, does each table have a primary key field? What is the name and type of the primary key field?
Do you have queries that join (or link) more than one table? What fields are they joined on?
I ask this because if tables are joined without primary keys, then for each record in the first table the engine has to scan every record in the next table, which is like going through all the records in table 1 multiplied by all the records in table 2. Each additional table queried without a primary key multiplies the work again.
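To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module rather than HSQLDB (the join principle is the same in both engines); the table and column names are made up for the example:

```python
# Hypothetical example: joining on a primary-key column lets the
# engine do one keyed lookup per row instead of scanning the whole
# second table for every row of the first.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id));
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 2);
""")

# The join condition targets customers' primary key, so each
# order resolves its customer with an indexed lookup.
rows = con.execute("""
    SELECT orders.id, customers.name
    FROM orders JOIN customers ON orders.customer_id = customers.id
    ORDER BY orders.id
""").fetchall()
print(rows)  # [(10, 'Ann'), (11, 'Bob')]
```

Without a primary key (or any index) on the join column, the same query degrades to the table-1-times-table-2 scan described above.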
Thanks for your help !
Yes, every table has a primary key.
My performance problem appears when opening the DB, when saving it, when opening the main table, when closing it...
I created a new empty DB and copied/pasted all tables and forms. After that, I rebuilt the relationships, and the size is very reasonable (less than 100 MB).
But if I must repeat those operations on a regular basis, it'll be tedious!
Since an .odb file is a file database, adding, deleting, or editing tables or records in tables may cause the database table data to eventually become fragmented. Fragmented means that the table data is scattered throughout the .odb file out of sequence, so querying the tables forces the database engine, HSQLDB, to jump back and forth to different points in the file.
Realistically, a file database like HSQLDB is designed for small data sets and I suspect that a 100 MB database is close to the point where performance in any file database is going to drop off sharply.
Since you have 100 MB of data, I assume that reconstructing that data if it were lost or damaged would be a major effort. If so, I would recommend that you move to a true relational database like MySQL. MySQL is a free database that runs on the Mac and that NeoOffice Base can connect to.
We use MySQL as the database for this site and our Bugzilla site primarily because MySQL's performance does not degrade as badly as file databases do with large data sets. Also, MySQL runs as its own server on your machine and, because of this, is much more capable of avoiding fragmentation and accidental database deletion or corruption.
Since an .odb file is a file database, adding, deleting, editing tables or records in tables may cause the database table data to eventually become fragmented....
I think my explanation of fragmentation in the database was not very good, so I will try to explain it a little better.
Basically, your .odb file is a big zip file. Each table, query, and form is stored as a compressed subfile within the zip file. Since Mac OS X files do not allow inserting new data or deleting existing data in the middle of a file, when you change the data in a table, the changed data is appended onto the end of the zip file and the old data remains in the middle of the file as dead bytes. As a result, if you have large tables, making changes to those tables will cause your .odb file to quickly grow to a huge size.
While I still recommend that you migrate your database to a more scalable database like MySQL, there is a manual way to strip out the old unused bytes in your .odb. However, it requires use of the "unzip" and "zip" commands in the Terminal application.
If you are comfortable with using the Terminal application and using Unix commands, let us know and I can post the commands.
With the database in question open, go to the Tools menu and select SQL...
In the "command to execute" box of the window that opens, type
Code:
CHECKPOINT DEFRAG
Then hit the Execute button.
This should shrink the database size.
Sorry Lorinda, I had already tried that before my posts and it did not have any significant impact in my tests. Copying a table, deleting the table, and running that SQL command will still cause the .odb file size to grow.
That is when I realized that the issue isn't the actual size of the tables but is due to pushing those changes from memory back into the zip file. The zip file issue would also explain why clac1 is seeing really slow performance opening and saving the file, as opening causes unzipping of the big .odb file and saving causes zipping the changes and appending them onto the .odb file.
Joined: Jun 20, 2006 Posts: 2051 Location: Midwest, USA
Posted: Mon Jan 26, 2009 9:18 am Post subject:
pluby wrote:
Sorry Lorinda, I had already tried that before my posts and it did not have any significant impact in my tests. Copying a table, deleting the table, and running that SQL command will still cause the .odb file size to grow.
Patrick
Interesting! Checkpoint Defrag is the standard answer to these types of questions over at the OpenOffice.org Community forums, which is why I suggested it here. But obviously there's more to it than that in some cases.
Joined: Mar 15, 2007 Posts: 43 Location: Wisconsin US
Posted: Mon Jan 26, 2009 10:30 am Post subject:
pluby wrote:
Basically, your .odb file is a big zip file. Each table, query, and form is stored as a compressed subfile within the zip file. Since Mac OS X files do not allow inserting new data or deleting existing data in the middle of a file, when you change the data in a table, the changed data is appended onto the end of the zip file and the old data remains in the middle of the file as dead bytes. As a result, if you have large tables, making changes to those tables will cause your .odb file to quickly grow to a huge size.
Is this true? Gack! No database I know of, including embedded ones like SQLite or Berkeley DB, works like this. Yes, there will be fragmentation from different-sized gaps, and growth because of logging, but updates or adds generally make new versions inside reused records when they can, and the fragmented space is reused over time to some degree in all the other DBs I've used.
Storing pictures in the OO database sounds like it is pushing it, though. Although I guess that will become more common.
Checkpoint defrag or its equivalent (vacuuming, etc.) should solve this to some degree anyway; if you didn't see any improvement, that would be odd, if it really was fragmentation you were causing.
I assumed that Neo keeps the database unzipped while open, so that access could be faster. I'll have to check that out! I could see why opening and closing would matter, because the compression/decompression at that point wouldn't care that space is unused, only allocated.
Is this true? Gack! No database I know of, including embedded ones like SQLite or Berkeley DB, works like this. Yes, there will be fragmentation from different-sized gaps, and growth because of logging, but updates or adds generally make new versions inside reused records when they can, and the fragmented space is reused over time to some degree in all the other DBs I've used.
I assume that some of the gaps will eventually get reused. However, in clac1's case, copying tables and inserting lots of images leaves a pretty large block of data available for reuse each time, so the file grows long before any of it gets refilled.
As for using checkpoint defrag, I could not get the size of the .odb file to ever shrink after copying an existing table to a new one, deleting the new table, running checkpoint defrag, and saving the .odb file. No matter what I did, the .odb file grew. Maybe at some point the growing would stop as the unused zip file gaps would be filled in, but I did not reach that point.