Data Storage Approach

Scott_Crick · February 27, 2025, 1:08am

I wrote before that I’m working on an app that reads data from a Davis weather station. With the help provided in that thread, I have something working well in retrieving the data and displaying it.

This project is really only going to be for personal use (it’s not something I’m developing for public distribution). I’m mostly using it as a testbed application to tinker with and learn about things I’ve not really needed to deal with in the past.

Now I want to start storing historical data in a way that makes it easy to retrieve. My initial thought was something as simple as a CSV file, but I have also considered using a simple SQLite database.

All the data that I would be storing would be simple strings, a date and some integers (which would probably be stored as strings and converted as necessary). I’d like to be able to retrieve data from the datastore for a specific date/time or a range of dates. All well and good, that simply points to a SQLite database, right?
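A minimal sketch of what that could look like, using Python's built-in sqlite3 module. The table name, column names, and sample values are assumptions for illustration, not anything from the actual app. Timestamps are stored as ISO 8601 text, which sorts in chronological order as long as every row uses the same format and time zone, so a date range becomes a plain comparison.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the datastore. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS readings (
               taken_at TEXT PRIMARY KEY,  -- ISO 8601, e.g. 2025-02-27T01:00:00
               temp_f   INTEGER,
               humidity INTEGER,
               wind_dir TEXT
           )"""
    )
    return conn

def add_reading(conn, taken_at, temp_f, humidity, wind_dir):
    """Store one reading; a repeated timestamp overwrites the old row."""
    conn.execute(
        "INSERT OR REPLACE INTO readings VALUES (?, ?, ?, ?)",
        (taken_at, temp_f, humidity, wind_dir),
    )
    conn.commit()

def readings_between(conn, start, end):
    """Return readings with start <= taken_at < end, oldest first."""
    cur = conn.execute(
        "SELECT taken_at, temp_f, humidity, wind_dir FROM readings "
        "WHERE taken_at >= ? AND taken_at < ? ORDER BY taken_at",
        (start, end),
    )
    return cur.fetchall()

# Made-up sample data, only to exercise the query.
conn = open_store()
add_reading(conn, "2025-02-26T23:00:00", 41, 80, "NW")
add_reading(conn, "2025-02-27T01:00:00", 39, 84, "N")
add_reading(conn, "2025-02-28T01:00:00", 45, 70, "S")

# Everything recorded on February 27.
rows = readings_between(conn, "2025-02-27T00:00:00", "2025-02-28T00:00:00")
```

Declaring the numeric columns as INTEGER rather than storing everything as strings lets SQLite hand the values back as numbers, which avoids the convert-as-necessary step.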

But then there’s a wrinkle. As I consider how I want to structure all this, I realize that I may want to run this application on multiple computers, all pulling from the same datastore. If I have the application store something locally, that rules out multiple application instances on different computers, right? Of course, I could simply have each instance of the application pull its own data from the weather station, but that seems wasteful and also opens up the possibility of data that doesn’t match. So, this points to a server solution.

I have a server host that offers a MySQL database, but something in my head tells me that this is way overkill for the very simple need I’m looking at here, and I wonder if there is a simpler, lighter solution that I’m not thinking about.

I’d appreciate any thoughts on how I should structure and store this data within the context I describe above.

What am I missing in the way I’m thinking about this?

Scott_C · February 27, 2025, 1:19am

If your computers are all on the same network, CubeSQL could be a good choice. CubeSQL uses SQLite on the backend.

@Jürg_Otter was just talking about how the CubeSQL plugin is open source

As well, recently there was a blog post about using PocketBase on your local network. PocketBase also uses SQLite.

Scott_Crick · February 27, 2025, 3:50am

I thought about running a local database server on my Mac mini server, but one of the Macs I’d like to be able to access the data from is a MacBook Pro that travels with me, so I can’t guarantee it’ll always be on my network. Which, I know, points more specifically at my hosting solution, unless there’s some other simpler solution I’m completely whiffing on.

But, I’ll look into CubeSQL. Maybe a good solution if I just give up remote access.

Scott_C · February 27, 2025, 4:14am

In the past (8-10 years ago?) when I used to have servers on my home network (part of a remote job requirement), I could expose a port externally via my ISP and access those servers when I travelled. But I don’t know if ISPs let you do that sort of thing anymore, unless you have a business account or something with a static external IP address.

Of course you’d need a router you can configure yourself, with a good firewall.

Craig_Boyd · February 27, 2025, 1:14pm

This alone, as you correctly note, means that you need a database server of some kind. If you already have a server host running inside your network, then why not use it? I have a simple PostgreSQL server that has 23 different databases on it. In the interest of full disclosure, only two of those are constantly active. The others are various projects that I have worked on or am working on, and they all range in size.

If the MySQL database is external to your network then you will need to add a layer of protection on top of that rather than going straight to it. Generally that layer would be a set of API calls that are secured in some way. It is generally considered a “bad idea” to have a database running openly on the internet. Somebody here or Uncle Google can explain it in greater/better detail than I can.
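To make the idea of a secured API layer concrete, here is a rough sketch in Python using only the standard library. Everything named in it is an assumption for illustration: the `/readings` endpoint, the `start` and `end` parameters, the bearer token, and the in-memory SQLite table standing in for the real database. The point is only that clients talk to a small service that checks a credential and runs a fixed, parameterized query, and never connect to the database directly.

```python
import hmac
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

API_TOKEN = "change-me"  # hypothetical shared secret; load from config in practice

# Stand-in for the real database, with one made-up row.
DB = sqlite3.connect(":memory:", check_same_thread=False)
DB.execute("CREATE TABLE readings (taken_at TEXT PRIMARY KEY, temp_f INTEGER)")
DB.execute("INSERT INTO readings VALUES ('2025-02-27T01:00:00', 39)")

def handle(path, auth_header):
    """Return (status, payload) for a GET request. No network code here."""
    expected = "Bearer " + API_TOKEN
    # Constant-time comparison of the supplied credential.
    if not hmac.compare_digest((auth_header or "").encode(), expected.encode()):
        return 401, {"error": "unauthorized"}
    url = urlparse(path)
    if url.path != "/readings":
        return 404, {"error": "not found"}
    query = parse_qs(url.query)
    start = query.get("start", [""])[0]
    end = query.get("end", ["9999"])[0]
    # Only this fixed, parameterized query is reachable from outside.
    rows = DB.execute(
        "SELECT taken_at, temp_f FROM readings "
        "WHERE taken_at >= ? AND taken_at < ? ORDER BY taken_at",
        (start, end),
    ).fetchall()
    return 200, [{"taken_at": t, "temp_f": f} for t, f in rows]

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = handle(self.path, self.headers.get("Authorization"))
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def run(port=8080):
    """Serve on localhost only; call run() to start the sketch."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

On its own this speaks plain HTTP, so the token would travel in the clear. Anything reachable from the internet would need HTTPS in front of it at a minimum.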

I hear lots of good things about CubeSQL so if you are interested in learning that bit of technology then I would say go for it.

As for remote access, I would set up a router that will allow you to VPN into your home network. I know people open up ports on their routers and allow incoming traffic to a specific host, but since I do not consider myself expert enough to monitor something like that and do a good job of defending against bad actors, I always try to find ways to NOT do that. Plus this feeds directly into my previous comment about how it is generally a bad idea to expose databases to the internet as a whole.

Jean-Yves_Pochez · February 27, 2025, 4:12pm

The home-automation measurement data on my home network is stored in a PostgreSQL database on a 2010 Mac mini.
Works like a charm.