Hi, at the company I work for we store documents (plans, invoices, etc.) on the server and links to them in the database, BUT we store product images in the database too. So when one of my coworkers wants to see what a product looks like, he just checks it in an app connected to the database.
tl;dr: store links, small images in database / store documents on the file server
SQLite:
I would just encode the binary file content and store it in a TEXT field. A TEXT field can in fact hold anything.
Encoding / decoding is simple: use EncodeBase64.
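EncodeBase64 is the Xojo function; the same round trip can be sketched in Python with the standard library (table and column names here are made up for illustration):

```python
import base64
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (name TEXT, content TEXT)")

raw = b"\x89PNG\r\n\x1a\n fake image bytes"
encoded = base64.b64encode(raw).decode("ascii")   # binary -> Base64 text
conn.execute("INSERT INTO file VALUES (?, ?)", ("logo.png", encoded))

stored = conn.execute(
    "SELECT content FROM file WHERE name = 'logo.png'"
).fetchone()[0]
assert base64.b64decode(stored) == raw            # round-trips losslessly
```

Note the cost: Base64 inflates the payload by roughly a third compared with storing the raw bytes in a BLOB.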
There are pros and cons. Obviously the DB will become bloated, which may impact performance. Storing the files separately, however, means the links can easily break.
Most databases provide a way to store large objects. In SQLite you can use BLOBs and keep them in a separate, related table so they do not impact queries on the primary table. PostgreSQL offers Large Objects.
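The separate-table idea can be sketched with Python's sqlite3 module (the `product` / `product_image` schema is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- BLOBs live in their own table, keyed to the parent row,
    -- so scans and joins on product never touch the image pages
    CREATE TABLE product_image (
        product_id INTEGER PRIMARY KEY REFERENCES product(id),
        data       BLOB NOT NULL
    );
""")
conn.execute("INSERT INTO product (id, name) VALUES (1, 'Widget')")
conn.execute("INSERT INTO product_image (product_id, data) VALUES (1, ?)",
             (b"\x89PNG fake image bytes",))

# Queries on the primary table stay cheap
row = conn.execute("SELECT name FROM product WHERE id = 1").fetchone()
print(row[0])  # Widget
```

The image is only pulled when explicitly requested via `product_image`.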
I would avoid using EncodeBase64 to embed binary data into Text fields as that will be slower and bloat the DB even more.
The MBS SQLite extension lets you write files to the database; it stores them as files separate from your database file but treats them as if they were BLOB fields. I have tested it, it works, and I have an application in mind for it. http://www.monkeybreadsoftware.de/SQLiteExtension/
Currently, one of my apps stores files as zipped, EncodeHex’ed BLOBs in PostgreSQL and, while it works, it is very slow to add or retrieve the files since I have to do it in chunks.
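Part of the slowness is inherent to the encoding: hex doubles the payload on top of whatever compression saves. A rough Python equivalent of that round trip (zlib standing in for the zip step):

```python
import zlib

def pack(data: bytes) -> str:
    # compress, then hex-encode so the result can live in a text column
    return zlib.compress(data).hex()

def unpack(stored: str) -> bytes:
    return zlib.decompress(bytes.fromhex(stored))

original = b"invoice contents " * 1000
stored = pack(original)
assert unpack(stored) == original
# hex always emits two characters per compressed byte
assert len(stored) == 2 * len(zlib.compress(original))
```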
A while ago, when I had to make a decision on this matter for a database-driven project, I took into consideration where and how the data would be used.
What about the usage of triggers and version control of data (audit trail)?
What about retrieving pictures from the database for reporting?
What about replication of data, etc.?
And I did some performance testing and learned that encoded TEXT did not perform much worse than BLOBs read / written in blocks within a transaction. PostgreSQL provides the functionality for encoding / decoding.
It is true that you will end up with a bigger database file size using encoded TEXT.
[quote=320420:@Joost Rongen]SQLite:
I would just encode the binary file content and store it in a TEXT field. A TEXT field can in fact hold anything.
Encoding / decoding is simple: use EncodeBase64.[/quote]
Unless the files are really small, I would place them in a folder and reference them. I did this and my SQLite DB files grew to over a TB in size fairly quickly. I switched to files in a folder with references in the DB, and the DB file was under a GB. Well under.
Microsoft did a study on this and determined that smaller files (under a MB or so) were faster to store in the DB. Any modern database is essentially its own file system and can read / write faster than the more generic file system of the OS. Most file systems in use today are very old and not optimized for what DBs are good at.
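That size threshold suggests a hybrid scheme: small payloads in the DB, large ones on disk with a link. A minimal sketch, assuming a 1 MB cutoff and a made-up `doc` table:

```python
import pathlib
import sqlite3
import tempfile

THRESHOLD = 1 << 20  # 1 MB -- a hypothetical cutoff, tune for your workload

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, blob BLOB, path TEXT)")
blob_dir = pathlib.Path(tempfile.mkdtemp())

def store_document(doc_id: int, data: bytes) -> None:
    """Small payloads go into the DB; large ones to disk, with a link in the DB."""
    if len(data) <= THRESHOLD:
        conn.execute("INSERT INTO doc (id, blob, path) VALUES (?, ?, NULL)",
                     (doc_id, data))
    else:
        path = blob_dir / f"{doc_id}.bin"
        path.write_bytes(data)
        conn.execute("INSERT INTO doc (id, blob, path) VALUES (?, NULL, ?)",
                     (doc_id, str(path)))

store_document(1, b"small invoice")        # lands in the blob column
store_document(2, b"x" * (2 * THRESHOLD))  # lands on disk, path stored
```

Either way the DB remains the single index over all documents, which preserves the "one place to query" advantage.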
Another factor to consider is that most database servers have built-in authentication. If you keep the files on the file system, you now have two permission sets to maintain, two sets of logs to record, etc.
That being said, you have to be very deliberate about how you do it. There is no right or wrong answer because they are just different levels of abstraction.
And actually, the limitations of modern PostgreSQL, SQL Server, etc. are far beyond what we need for our applications.
Back in 2005 I was on a project with a SQL Server database containing 80 million records across just a few tables, with lots of busy simultaneous connections making 30 changes per second at random. With a little DBA work it was very stable and reliable. I prefer a modern database over storing files in the filesystem.
An OS filesystem is itself a database of paths, folders, permissions, etc. The main contention is that in a corporate environment storing sensitive data you might have Active Directory users, SQL users, and application-level users, with perhaps some convergence among them depending on your environment.
If you keep sensitive data on the file system, you have to ensure that the same employees have access to the necessary data despite having to manage multiple authentication schemes.
In larger companies, when you separate the two, you end up with checklists of applications, servers, groups, etc. that a departing user must be removed from to verify data security. Storing the data in one central place that is optimized for access / retrieval / permissions can be optimal: one central authority on data access and one central point of logging.
Phillip’s point is interesting; in most companies you have to make sure that only authorized persons have access to data and files and…
I have no experience with PostgreSQL; but MS-SQL and Oracle have no problems dealing with huge data files with lots of (BLOB) data.
As for SQLite, I do not know how reliable its handling of BLOB data is, or whether it might cause performance problems if the DB file gets too big with too much (B)LOB data.
Maybe it is a good idea to have at least two database files: one “main” file, and one separate file for (B)LOB data…?!
I see three big problems with storing files inside a database:
User Stupidity
You must stop users performing a SELECT * on a database that has BLOBs. The RAM and network issues are potentially crippling.
Database crash
If your database becomes corrupted, you potentially lose EVERY file.
Backups
Unless you have highly intelligent backups, incremental backups of tables with BLOBs is very hard, whereas it is trivial if the files are stored in the file system.
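On the SELECT * point above, one common guard is to expose ad-hoc users only to a view that omits the BLOB column. A sketch with sqlite3 and invented table / column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attachment (
        id       INTEGER PRIMARY KEY,
        filename TEXT,
        data     BLOB
    );
    -- Ad-hoc users query the view, so SELECT * never drags BLOBs over the wire
    CREATE VIEW attachment_info AS
        SELECT id, filename, length(data) AS size FROM attachment;
""")
conn.execute("INSERT INTO attachment (filename, data) VALUES (?, ?)",
             ("a.pdf", b"\x00" * 4096))

row = conn.execute("SELECT * FROM attachment_info").fetchone()
print(row)  # (1, 'a.pdf', 4096)
```

In a server database you would additionally grant read rights on the view only, not the underlying table.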
[quote=320837:@David Cox]I see three big problems with storing files inside a database:
User Stupidity
You must stop users performing a SELECT * on a database that has BLOBs. The RAM and network issues are potentially crippling.
Database crash
If your database becomes corrupted, you potentially lose EVERY file.
Backups
Unless you have highly intelligent backups, incremental backups of tables with BLOBs is very hard, whereas it is trivial if the files are stored in the file system.[/quote]
#1 Not sure who you are allowing to run ad-hoc queries. If the BLOB field is separated from the data, it should not be a concern.
#2 SQLite is more resilient to corruption than your file system is.
#3 This one can go either way depending on what backup tools you use, etc. Syncing / backing up hundreds of thousands of files could potentially take longer than backing up a SQL database, especially if you use logging to back up only new data / changes.