How do I count records???

Hi,

I have some code which is driving me nuts:

If DigestStorage.Connect() then dim rs as RecordSet rs = DigestStorage.SQLSelect( "SELECT COUNT(*) FROM tblPeptide" ) sr.Text = Str( rs.RecordCount ) + EndOfLine + EndOfLine

I can follow it through in the debugger and I get

rs = DigestStorage.SQLSelect( “SELECT COUNT(*) FROM tblPeptide” ) <- rs has a RecordCount of 3

sr.Text = Str( rs.RecordCount ) + EndOfLine + EndOfLine <- sr.Text is 1

The weird thing is that when I mouse over rs.RecordCount the IDE tells me it’s a database cursor.

I can’t see anything wrong with the code (but it probably is) … any ideas?

Using (for backwards compatibility) REALSQLdatabase in RS2012R2.1 or Xojo 2014 R2 on MacOS X

If DigestStorage.Connect() then dim rs as RecordSet rs = DigestStorage.SQLSelect( "SELECT COUNT(*) FROM tblPeptide" ) sr.Text = rs.IdxField(0).StringValue + EndOfLine + EndOfLine

[quote=114307:@Eli Ott]If DigestStorage.Connect() then
dim rs as RecordSet
rs = DigestStorage.SQLSelect( “SELECT COUNT(*) FROM tblPeptide” )
sr.Text = rs.IdxField(0).StringValue + EndOfLine + EndOfLine[/quote]
Thanks, but that results in a NilObjectsException on the last line.

SELECT COUNT(*) FROM tblPeptide

as long as the table exists you’ll get a recordset with 1 row but the CONTENTS of the 1 field in the row is the count

so rs.RecordCount will be 1
but rs.IdxField(0).IntegerValue will be the results of the Count(*)

Do you have to assign an alias to COUNT(*) in order to get the value into the recordset vector?

SELECT COUNT(*) AS CNT FROM tblPeptide

Thanks Norman, but I still get a NOE with

sr.Text = Str( rs.IdxField(0).IntegerValue ) + EndOfLine + EndOfLine

idx fields are base 1 not 0. Also you should not use count(*), count(1) is way more efficient.

You’re right, so sorry… I need some rest!

yeah use the documentation to verify “forum code” :stuck_out_tongue:
and idxfield(1)

Eli Ott had it nearly - but it has to be IdxField 1.

IndexField 0 contains “Count(*)”

Thanks everyone.

Thanks Wayne. How do you read that? Count(*) I read as “count all”, but Count(1) ? I would have expected that to count the first record???

Count(*) will in some circumstances create a count for each field in your table. Count(1) will count each record - could be any number, it’s just placing a literal in the select statement. The other advantage is it will work with any table regardless of columns.

Thanks, good to know.

and if you have lots of really big blobs you avoid retrieving them all as part of the counting query

I can see how THAT would slow it down to a crawl :wink:

More importantly, COUNT(*) creates a data vector for the table, and passes some level of data thru it. Count(1) (or more traditionally COUNT(8)) does not which leads to the decrease in overhead

Why is COUNT(8) “tradtional” you ask? Look at your keyboard :slight_smile: 8 is the unshifted version of “*” :slight_smile:

Dave, I tried your suggestion in a large Postgres table (actually, two tables joined, just to increase the level of difficulty), and the “explain” output as presented by Valentina Studio looked identical to me. This suggests that some databases already optimize COUNT(*). Do you agree, or am I missing something?

SELECT count(*) FROM provider_nppe, provider_nppe_identifier 
WHERE provider_nppe_identifier.provider_nppe_id = provider_nppe.id

 23:10:04 select count(*) from provider_nppe, provider_nppe_identifier ...
  
  Aggregate  (cost=450245.69..450245.70 rows=1 width=0)
    Output: count(*)
    ->  Hash Join  (cost=214849.62..438059.02 rows=4874669 width=0)
          Hash Cond: (provider_nppe_identifier.provider_nppe_id = provider_nppe.id)
          ->  Seq Scan on public.provider_nppe_identifier  (cost=0.00..83143.69 rows=4874669 width=4)
                Output: provider_nppe_identifier.provider_nppe_id
          ->  Hash  (cost=144814.83..144814.83 rows=4268783 width=4)
                Output: provider_nppe.id
                ->  Seq Scan on public.provider_nppe  (cost=0.00..144814.83 rows=4268783 width=4)
                      Output: provider_nppe.id

SELECT count(8) FROM provider_nppe, provider_nppe_identifier 
WHERE provider_nppe_identifier.provider_nppe_id = provider_nppe.id

 23:10:48 select count(8) from provider_nppe, provider_nppe_identifier ...
  
  Aggregate  (cost=450245.69..450245.70 rows=1 width=0)
    Output: count(8)
    ->  Hash Join  (cost=214849.62..438059.02 rows=4874669 width=0)
          Hash Cond: (provider_nppe_identifier.provider_nppe_id = provider_nppe.id)
          ->  Seq Scan on public.provider_nppe_identifier  (cost=0.00..83143.69 rows=4874669 width=4)
                Output: provider_nppe_identifier.provider_nppe_id
          ->  Hash  (cost=144814.83..144814.83 rows=4268783 width=4)
                Output: provider_nppe.id
                ->  Seq Scan on public.provider_nppe  (cost=0.00..144814.83 rows=4268783 width=4)
                      Output: provider_nppe.id

I am upgraded count(8) is it!

Some database engines very well could optimize the count function where “*” is concerned…
I am going to try this on a large ORACLE table under PL/SQL at work and see what happens.

Also remember that the “*” is an argument to COUNT… you might say COUNT(distinct colors) which would be different from COUNT(colors)

Marcus,

You always seem to be working on apps for doing protein ID by MS.

What I am curious about is why, as most Mass Spec vendors have very capable software…

We have a Waters QToF with MassLynx along with Protein Lynx Global Server and BioPharmaLynx etc etc… and by your naming of objects/variables it appears that the code you post always seems to be related to things that type of software does…

Out of curiosity what does your code do that the commercial packages don’t?

  • Karen