sql select include 10 rows previous to query match

Andrew_Willyan · January 28, 2014, 1:41pm

Hi all,

Im after an sql statement (if it exists) or how to set up a method using several sql statements to achieve the following.

I have a listbox and a search text box.

in the search box, user would enter a surname e.g. smith.
i then want to query the database for the search with something like this
select * FROM customer where surname LIKE searchparam

This would give me all the results for customers with surname containing SMITH, simple, right?

What i need to do is limit the results returned. This statement could give me 1000’s of rows if the search param was just S.
What i want is the result, limited to the first 20 matches AND the 10 rows prior to the 1st match.

For example, SMI search

Sives
Skimmings
Skinner
Skipper
Slater
Sloan
Slow
Small
Smallwood
Smetain
Smith ----------- This is the first match of my query. But i want the previous 10 and following 20.
Smith
Smith
Smith
Smith
Smoday
Smyth
Snedden
Snell
Snow
Sohn
Solis
Solomon
Solway
Sommer
Sommers
Soper
Sorace
Spears
Spedding

Is there anyway to do this?
As few sql statements as possible.

Reason? I am creating an app for users with slow internet connections.

I am using POSTGRE v9

Thanks
Andrew

Andrew_Willyan · January 28, 2014, 1:42pm

Also something of importance, I don’t want to trigger any Beginning of file or end of file errors (or whatever you sql gurus call them).

Jay_Menna · January 28, 2014, 2:00pm

Post this question on stack overflow. The SQL community there is gigantic, fanatic, all over the world And focused only on SQL. I’ll bet you have an answer in under an hours

Andrew_Willyan · January 28, 2014, 2:02pm

will do, thanks.

I guess it isn’t really xojo related, regardless of the medium, it is still sql.

Cheers.

DaveS · January 28, 2014, 2:29pm

I don’t think it can be done “directly”
what you would need is a numeric field that was concurrent and sequential (ie. 1…2…3…4…etc) with no gaps that matched the alphabetic order of your key field.

Then you could do something like this

SELECT CUST_ID FROM customer WHERE surname like 'SMITH%' and cust_id=MIN(select cust_id from customer where surname like 'SMITH%')

This will return the FIRST Smith in case there are multiple

cust_id=rs.field("cust_id").integervalue

Get the value of that First Smith

SELECT * FROM customer WHERE cust_id>="+str(cust_id-10)+" and cust_id<="+str(cust_id+10)

This is off the top of my head…but gives you an idea.

Jay_Menna · January 28, 2014, 2:45pm

[quote=61826:@Andrew Willyan]will do, thanks.

I guess it isn’t really xojo related, regardless of the medium, it is still sql.

Cheers.[/quote]

I hope i didn’t sound like I was nit picking. I have just found the depth of knowledge at Stack overflow, specifically for SQL queries to be amazing. Just trying to solve you issue quickly and elegantly

Eduardo_Gutierrez_de_Oliveira · January 28, 2014, 3:34pm

While I agree you should ask in a place where this topic is relevant, you can do this with multiple queries or with subselects (Dave’s proposal can be made into a single query with multiple sub-queries within it).

Keep in mind that using a DB like this is a horrible practice and the StackOverflow guys will probably chastise you for it (as it depends on row sorting to be assumed in a way). SQL queries are designed to actively discourage practices like this (which is the reason you can’t just get the row number for the line you’re looking for).

Tobias_Bussmann · January 29, 2014, 1:06am

have a look to Window Functions and / or Common Table Expressions (CTE). With these it should be possible in an elegant way.

I quickly hacked something together on my psql, can be done more cleanly for sure:

create table t (v varchar); insert into t select chr((random() *(90-65)+65)::int) from generate_series(1,20); select * from t order by v; with a as (select * from t union all select 'C'), o as (select v, row_number() over (order by v) as n from a), m as (select min(n) as x, min(n)-5 as l, min(n)+10 as u from o where v='C') select v from o where n between (select l from m) and (select u from m) and n <> (select x from m);

further readings: http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS , http://www.postgresql.org/docs/devel/static/functions-window.html#FUNCTIONS-WINDOW , http://www.depesz.com/tag/window-functions/ http://www.depesz.com/tag/cte/

Mike_Charlesworth · January 29, 2014, 1:14am

Are you using an auto inc id field?

Mike_Charlesworth · January 29, 2014, 1:25am

You could do something like (Please check syntax this is on the fly untested)

SELECT TOP 10 * FROM customer WHERE surname LIKE 'SM%' ORDER BY Surname UNION SELECT * FROM customer WHERE surname LIKE 'SMI%' ORDER BY Surname LIMIT 0 ,20

Note: The first query uses ‘SM’ in the like clause whereas the second query uses ‘SMI’.

DaveS · January 29, 2014, 1:26am

That would only work (assuming I think I know where you are headed), it the field is sequential AND consecutive (no gaps)

Also… Doesn’t you above code return a RANDOM 20 records… I think the OP wants to BRACKET a specific record +/- 10 records

They would have to have a unique index value (NOT row_id), otherwise the order of return is not guaranteed.

DaveS · January 29, 2014, 1:28am

You are making assumptions about the content of the data… there could be ONE Smith, preceded by “Cxxx” and followed by “Qxxx” nothing says the data is known or must meet a pattern. 10 previous records from an alphabetic sequence, regardless of the distance of the values.

Tim_Hare · January 29, 2014, 2:04am

While the intent appears to be to bracket a given record inside a “window” of data, the example given is nonsensical, as it combines a LIKE selection with records that don’t match the criteria. A more precise statement of the problem/desired result would be in order.

Mike_Charlesworth · January 29, 2014, 2:07am

That’s why I saiid he would need SM and SMI I his like statements

Andrew_Willyan · January 29, 2014, 12:45pm

Hi guys,

Thanks for all the replies…

To clear up a few things, The data will be out of order (not alphabetical) and there will be a primary key id field, which my app will auto inc the id.

I want the table sorted by surname and then find the closest match using search%. once found, list the previous 10 records and the following 20 records in surname order. So an ID field would not work in this situation. (I think Dave and Tobias get what i am after)

I posted over on stack overflow (Thanks Jay) and received a reply somewhat similar to the example Tobias gave, but a bit cleaner and checked with SQLfiddle.com.

the solution that works is:

WITH ranked AS (
SELECT , ROW_NUMBER() over (ORDER BY surname) AS rowNumber FROM customer
)
SELECT ranked.
FROM ranked, (SELECT MIN(rowNumber) target FROM ranked WHERE surname LIKE ‘Smi%’) AS found
WHERE ranked.rowNumber BETWEEN found.target - 10 AND found.target + 20
ORDER BY ranked.rowNumber;

This is the SQLfiddle result:

http://www.sqlfiddle.com/#!15/a47b1/43/0

Now i just need to know how to wrap this up into an sql execute? command in xojo.
Never done anything with CTE before just simple select * from table where x like ‘search’ type stuff.
Any help on this one please? Tobias, it is a lot like yours, how would you implement this in xojo code?

Thanks again for all your input, I’m nearly there!
Andrew

Maximilian_Tyrtania · January 29, 2014, 3:33pm

What exactly your SQL looks like makes no difference. You can just go ahead and do your MyRecordSet=myDBConnection.SQLSelect(MyCoolCTESQL) as if it was your everyday select. Oh, don’t use SQLExecute, you’re expecting a result.

Tobias_Bussmann · January 29, 2014, 5:09pm

As Maximilian already said, use a SQLSelect with the Query, not an SQLExecute. As usual, you should better do a prepared statement to hand over the parameters.
The main difference between my solution and the one you’ve posted from stack overflow seems to be that I first add the search term itself to the data source and remove it later from the result to ensure we get a result-window even if the search does not find any matches by itself. If you are working on a big table, make sure to test this query with EXPLAIN toughly, I suspect inserting the ranking will led to a full table scan, whereas the prefix sub string match alone could benefit from a index.
Knowing the full use case and table definitions may allow a more optimized query.

Eduardo_Gutierrez_de_Oliveira · January 29, 2014, 9:38pm

[quote=62056:@Tobias Bussmann]As Maximilian already said, use a SQLSelect with the Query, not an SQLExecute. As usual, you should better do a prepared statement to hand over the parameters.
The main difference between my solution and the one you’ve posted from stack overflow seems to be that I first add the search term itself to the data source and remove it later from the result to ensure we get a result-window even if the search does not find any matches by itself. If you are working on a big table, make sure to test this query with EXPLAIN toughly, I suspect inserting the ranking will led to a full table scan, whereas the prefix sub string match alone could benefit from a index.
Knowing the full use case and table definitions may allow a more optimized query.[/quote]

God time to remember this fantastic thread from the old forum:

http://forums.realsoftware.com/viewtopic.php?f=3&t=4342

Andrew_Willyan · January 30, 2014, 3:52am

Great, Thanks a bunch, got it working.

Did a test on Tobias’s code and stack overflows with a 20000 record table randomly generated like Tobias did
insert into t select chr((random() *(90-65)+65)::int) from generate_series(1,20000);

Stack overflows way worked quicker in all 15 tests i did. was getting execution time of around 300-600ms with Tobias’s and 120-300ms with stack overflows.

So thanks heaps Tobias for your code, at very least seeing it done a couple of ways helped me to understand how this work better.