Postegresql How to find the next available number?

Tim_Seyfarth · October 19, 2023, 12:05am

Hello all,

I have a table that has account numbers in them. Over time either through a mistake or the account being deleted, there are now holes in the list. What is the best SQL command to tell it to find the next available number starting at X and moving to the end of the ordered record set?

For example:
1000
1001
1002
1004
1005
1008
1009

How to advise that the next available would be 1003?

Thanks,
Tim

Greg_O · October 19, 2023, 12:37am

My immediate advice is not to add this to your software. If you ever add a feature that relies on related tables which use the account as the key (since they have to be unique) and you delete a row in this table, you run the risk of having related rows pointing to the wrong data.

Rick_Araujo · October 19, 2023, 12:55am

Usually we don’t reuse IDs released by any reason. Specially if your system need to track what happened. Imagine a sale cancellation, the items sold return to the shelves and the record is “released”. Well, we don’t delete them, we just mark them as “undone/returned”, so the 1003 sale is there for historical reasons. The sequence generators just go forward not reusing abandoned numbers.

https://database.guide/how-nextval-works-in-postgresql/

Tim_Seyfarth · October 19, 2023, 1:14am

Thanks guys!

Those are very good points!
In my case, I have many blocks open, and would like to fill them - with new entries.

I have done what you suggested - Do not Delete - and have marked them with an R in the Active Column instead of a Y or N.

So either I have to fill in the unused numbers, OR start a new sequence.
If fill in numbers, then I need that code to do so as stated in the original question

Tim

Eric_Williams · October 19, 2023, 1:23am

I think you’re going to have to write a query that returns unused numbers.

But why do you feel compelled to do this? I’m guessing there is some real-world, physical reason - perhaps a filing system? As another poster noted, it’s really not a good idea to reuse IDs like you are contemplating.

Tim_Seyfarth · October 19, 2023, 1:27am

In this case, they are not re-used. They were never used before but there was a problem with the code and over time holes appeared.

Tim

Eric_Williams · October 19, 2023, 1:35am

Ok. So - the question still stands: why are you concerned with them?

Tim_Seyfarth · October 19, 2023, 1:38am

Because there a both large and small blocks of missing numbers.

Anal I suppose…
Tim

Greg_O · October 19, 2023, 1:52am

Almost every database engine has a way to make a field auto-increment and set the starting number. I’d suggest that you do it that way.

Kem_Tekinay · October 19, 2023, 1:53am

Something like this should work:

select
  g.v
from
  generate_series(1, 10) as g (v)
  left join t on t.id = g.v
where
  t.id is null
order by 
  g.v
limit 1
;

Replace 1 and 10 with the range you are testing.

Edit: But I agree that backfilling gaps is not recommended, and you should use a Sequence to pick the next value for the column automatically. PostgreSQL lets you assign BIGSERIAL or SERIAL to set that up for you automagically.

Tim_Seyfarth · October 19, 2023, 2:09am

Thanks everyone!
I appreciate your insight and suggestions!

Tim

Eric_Williams · October 19, 2023, 4:18am

Yeah, you’ve correctly described your impulse. I can relate and I suggest you just leave well enough alone.

Eric_Williams · October 19, 2023, 4:22am

Sounds like he has a Swiss cheese index, where he has noncontiguous blocks of missing numbers. I don’t think the standard field options are going to work here.

Tim_Hare · October 19, 2023, 4:41am

Greg’s approach still applies, though.

Unless the numbers have grown incredibly large, there’s no point in filling in the gaps.

Greg_O · October 19, 2023, 1:00pm

To be fair, I once did what @Tim_Seyfarth is describing when working for a mail forwarding service where the box numbers were originally reused after a time. I warned my bosses about the potential issues, but they wanted to do it anyway. Many years later, they also gave up on “filling the gaps” as it also meant that you could not assume an account’s age based on its number, which employees were prone to do.

Thom_McGrath · October 19, 2023, 1:26pm

In PG, the SQL to change the value of the sequence is ALTER SEQUENCE tablename_columnname_seq RESTART WITH anynumber;

The problem with changing the value of the sequence is that it’ll re-choose values that were already used. If you have ids 1 and 5, the normal next value would be 6 of course. If you set the next value to 2, once you have an id 4, currval will return 5 and you’ll get a conflict.

My advice, like so many others have repeated, is just don’t mess with it. It doesn’t really matter.

PG’s currval function is what powers their serial column type. But for very good reason, it’s not transaction safe. When called, it’ll return the next value in the sequence and immediately increment the sequence. If your transaction is aborted, the sequence is still incremented. It does this so that two transactions have no possibility of choosing the same id. My guess is this is why there are gaps. Gaps are normal.

Next time you’re designing a table, consider using a UUID for your primary key. They’re easy to use and there won’t ever be gaps because they are unordered anyway. And PG stores them as 128 bit numbers, so they still compare quickly. It would be a nightmare to transition to them now for this table, but something to keep in mind for next time.

Paul_Chance · October 19, 2023, 5:07pm

One of the difficulties/challenges in designing computer schemes is getting out of the “human” mindset.

There was a great cartoon on the xkcd.com site. The essence of it was that various password content requirements just made it difficult for humans to remember the password but a brute force computer doesn’t care.

I saw the result of this at work where we were given new password creation requirements (note that intrusions should have been stopped at a firewall) with the result that many had their passwords written on PostIt notes stuck to their monitors.

So with your number gap scheme - the computer doesn’t care. And the concern about “history” is very valid. I’d be asked to design systems where the client wanted a product price pulled from the product file whenever a surer exited the product code in a line item. Problem was, the client never considered “history”. Under their scheme, if the user brought up an old order, as requested, the current prices would be pulled over from the product file - which might not be what was on the original invoice.

I get it - those numbers are not being ReUsed - they were never used in the first place. But, because of the increment, you have an implied chronology. If you start “plugging in” the holes, you lose a bit of information because the highest number is no longer the most recent activity.

Sure - if you have a date field, you can sort/view on it. But still, all the nerves in my programming fingers are screaming that sometime in the future, skipping around on account numbers with respect to issue dates, will cause more confusion.

All that said - I see the irony. I’m telling you the gaps are a human feeling concern, not a computer concern. Yet, with a date field for chronology sorting, the actual account number doesn’t matter either.

Which begs the question, why am I so proud of my low digit REI account number