Files in database

Hello,

I want to “save” file information in my database (SQLite; different DB would be possible).

Better save the files in the database (BLOB), or just save the link to the files in the database?

What is your opinion on that?

Thank you very much!

Best,

Olaf

Hi, at the company i work we store documents (plans, invoices etc) on the server, and links in the database, BUT we store images of products in database too. So when one of my coworkers want’s to see how the product looks like, he just checks it in an app connected to the database.
tl;dr: store links, small images in database / store documents on the file server

SQLite:
I would just encode the binary file content and store it in a TEXT field. A TEXT field can hold in fact anything.
Encoding / decoding is simple: use Encodebase64

…thank you very much for your replies!!!

Storing files in the database may have the advantage to have everything in one file. But it might inflate the database…

Have no experience with that… :frowning:

Maybe saving (mall) images within the DB and larger files on the file system plus link tothe files stored in a table is the best solution.

Storing files in a DB is an age-old question.

There are pros and cons. Obviously DB will become bloated, which may impact performance. Storing them separately means the links could easily be broken, however.

Most databases provide a way to store large objects in them. For SQLite, you can use BLOBs and keep the BLOB in a separate related table so it does not impact queries on the primary table. For PostgreSQL you can use LargeObjects.

I would avoid using EncodeBase64 to embed binary data into Text fields as that will be slower and bloat the DB even more.

…thank you very much!

The MBS SQLite extension lets you write files to the database and it stores them as files separate from your database, but treated as if BLOB fields. I have tested it and it works and have an application in mind for it.
http://www.monkeybreadsoftware.de/SQLiteExtension/

Currently, one of my apps stores files in it as zipped, EncodeHex’ed BLOBS in PostgreSQL and, while it works, is very slow to add or retrieve the files since I have to do it in chunks.

A while ago, when I had to take a decision regarding this matter for a database driven project, I took into consideration where and how to use data.
What about the usage of triggers and version control of data (audit trail) ?
What about retrieving pictures from the database for reporting ?
What about replication of data … . etc.
And I did some performance testing and learned that encoded TEXT did not perform much worse than BLOB read / written by blocks in a transaction . PostgreSQL provides the functionality for encoding / decoding.
It is true that you will end up in a bigger database filesize using encode TEXT.

This article could be interesting.

[quote=320420:@Joost Rongen]SQLite:
I would just encode the binary file content and store it in a TEXT field. A TEXT field can hold in fact anything.
Encoding / decoding is simple: use Encodebase64[/quote]

unless the files are really small, I would place them in a folder and reference them. I did this and my sqlite DB files grew to over TB in size fairly quickly. I switched to files in a folder and references in the DB and the DB file was under a GB. Well under.

Microsoft did a study on this and determined that smaller files (under a MB or less perhaps) it was faster to store in the DB. Any modern database is essentially is its own file system and can read/write faster than the more generic file system of the OS. Most file systems in use today are very old and not optimized for what DB’s are good at.

Another factor to consider is most database servers have built in authentication. If you keep the files on the file system you now have two permission sets to maintain, logs to record, etc.

That being said you have to be very intelligent on how you do it. There is no right or wrong answer because they are just different levels of abstraction.

…very interesting comments; thx to all of you!

I tended to store the files in the file system and just save links to the files in the db.

But Phillips aspect - permissions - is also very interesting…?!

A OS filesystem is in fact a database … or ? (Mostly not maintained and optimized for years.)

and actually the limitations of modern PostgreSQL, SQL Server etc. are far beyond what we need with our applications.
Back in 2005 I was in a project with SQL Server containing 80 million records over just a few tables with lots of simultaneous busy connections making 30 changes per second at random to the database. With a little DBA work it was very stable and reliable. I prefer a modern database above storing files in the filesystem.

An OS filesystem is a database of paths, folders, permissions, etc. The main contention is if you are in a corporate environment storing sensitive data you might have Active Directory users, SQL users, and application-level users. There might be some convergence on those depending on your environment.

If you sensitive data on the file system you have to assure that the same employees have access to the necessary data despite having to manage multiple authentication schemes.

Larger companies when you separate have checklists of applications, servers, groups, etc. that they have to remove the user from to verify data security. Storing the data in one central place that is optimized for access/retrieval/permissions can be optimal because one central authority on data access and one central point of logging.

Phillips point is interesting; in most comanies you have to make sure that only authorized persons have access to data and files and…

I have no experience with PostgreSQL; but MS-SQL and Oracle have no problems dealing with huge data files with lots of (BLOB) data.

As for SQLite I do not know how reliable handing BLOB data is and if it might cause performance problems if the db file gets too big with too much (B)LOB data.

Maybe it is a good idea to have at least two data files: One “main” file, and one separated file for (B)LOB data…?!

Cheers…

I see three big problems with storing files inside database:

  1. User Stupidity
    You must stop users performing a SELECT * … on a database that has BLOBs. The RAM and network issues are potentially crippling.

  2. Database crash
    If your database becomes corrupted, you potentially lose EVERY file.

  3. Backups
    Unless you have highly intelligent backups, incremental backups of tables with BLOBs is very hard, whereas it is trivial if the files are stored in the file system.

[quote=320837:@David Cox]I see three big problems with storing files inside database:

  1. User Stupidity
    You must stop users performing a SELECT * … on a database that has BLOBs. The RAM and network issues are potentially crippling.

  2. Database crash
    If your database becomes corrupted, you potentially lose EVERY file.

  3. Backups
    Unless you have highly intelligent backups, incremental backups of tables with BLOBs is very hard, whereas it is trivial if the files are stored in the file system.[/quote]

#1 Not sure who you allow to run ad-hoc queries. If the BLOB field is separated from data it should not be a concern.

#2 SQLite is more resilient to corruption than your file system is.

#3 This one can go either way depending on what backup tools you use, etc. Syncing/backing up hundreds of thousands of files could take potentially longer than a SQL database. Especially if you use logging as your backup to only backup new data/changes.