Is there a difference in the compiled code?

I’m working on upgrading some old code to Xojo and began wondering whether the compiler generates identical code for the two forms below and, if not, which form produces the better output.

If Instr(Somestring, "xyz") > 0 Then

Or

If Somestring.Instr("xyz") > 0 Then

I guess a follow-up question is whether the compiler always generates identical output for “standalone” functions and the dot functions. If it doesn’t, it would be great to have some official guide on optimization.

Thanks

Should be the same.

BTW the new Xojo framework (both desktop and iOS) mostly uses namespaces (dot notation), so it would be good practice to start using it now too.

It would make sense that there should not be differences, yet there are.

I placed the code in a button and built for each option. I obtained two Unix executable files on Mac.

The first is 1,609,292 bytes and the second 1,609,300 bytes.

Binary comparison with Hexedit shows numerous single-byte differences and a couple of multi-byte sequences.

I would not say that justifies radically changing programming style.

Then I measured execution time with this :

dim somestring as string
dim old as double = microseconds
If Somestring.Instr("xyz") > 0 Then
//If Instr(Somestring, "xyz") > 0 Then
  beep
end if
msgbox str(microseconds-old)

As you see, to switch methods, just uncomment one line and comment out the other.

The first style gives an average of 0.8 microseconds, the second 0.85 microseconds.

Once again, no need to lose sleep over it.

Could be a valid reason, though it consumes a few more bytes and takes 5/100ths of a microsecond longer to execute (smile).
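A single call is very hard to time reliably, since the result sits close to the timer's resolution and to whatever else the machine is doing. Below is a minimal sketch of the same comparison repeated many times and averaged. The loop count (kLoops) and the sample string are assumptions chosen for illustration; note that the original test used an empty string. The loop overhead is included in both timings equally.

[code]#pragma BackgroundTasks False

// Hypothetical values: adjust the loop count and sample text as needed.
Const kLoops = 1000000
Dim somestring As String = "The quick brown fox xyz"
Dim hits As Integer
Dim i As Integer
Dim t As Double

// Standalone function style
t = Microseconds
For i = 1 To kLoops
  If InStr(somestring, "xyz") > 0 Then hits = hits + 1
Next
Dim standalone As Double = (Microseconds - t) / kLoops

// Dot notation style
t = Microseconds
For i = 1 To kLoops
  If somestring.InStr("xyz") > 0 Then hits = hits + 1
Next
Dim dotted As Double = (Microseconds - t) / kLoops

// Average time per call, in microseconds, for each style
MsgBox "Standalone: " + Format(standalone, "#0.0000") + " microseconds per call" + EndOfLine + _
  "Dot: " + Format(dotted, "#0.0000") + " microseconds per call"
[/code]

Running the built app several times and swapping the order of the two loops should help show whether any remaining gap is real or just noise.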

I understand that they should be the same however I do have some routines/calculations that any optimization would help out.

I’ve recently been wading thru the same kind of question over in the world of .NET and its IL code. Since that is so fresh in my mind, when I started back on my Xojo code it made me start to wonder what results in optimal code generation. As a matter of preference, I prefer the dot notation of using functions but being old school I can do either depending on what gets me the best performing code.

[quote=127637:@Michael Bierly]I understand that they should be the same however I do have some routines/calculations that any optimization would help out.

I’ve recently been wading thru the same kind of question over in the world of .NET and its IL code. Since that is so fresh in my mind, when I started back on my Xojo code it made me start to wonder what results in optimal code generation. As a matter of preference, I prefer the dot notation of using functions but being old school I can do either depending on what gets me the best performing code.[/quote]

I took the time to test for you. Have you read what I posted ?

The difference is negligible. Go by whatever comes to mind first and preference in this case.

Optimization is usually not so much a question of writing style as of programming style. Complex operations can often be simplified, and choices made in the program flow can make a big difference.

Unless your program is made exclusively of InStr tests, do not expect a miracle from one notation over another. A difference of 0.05 microseconds per call (about 6%) is measurable, but in practical terms it makes no difference. What is important is to measure what the changes imply and make an informed decision, rather than go on a whim.

You mention mathematical routines. There may be more room for improvement there, not least by using XojoScript instead of regular code, since tests do show very significant differences.

Here is what I posted in https://forum.xojo.com/14123-xojo-benchmark

As you can read, the difference is approximately 50%. I would not call that negligible, and note that the code is exactly the same.

In the same thread, Sam Rowlands used the same code on a Mac and went down to 0.24 seconds…

— Posted on July 24th, 2014 —

Xojo has not said its last word. There are actually two compilers that can execute the benchmark code. The regular one used by the IDE, and the LLVM compiler used for the XojoScript class.

I used the very same code as posted by Djamel in a XojoScript control source, except I used PRINT in the last line like so

[code]#pragma BackgroundTasks False

Dim x As double
Dim y As Double
Dim i As int32
Dim t As double

x = 1
y = 1.000001
t = Microseconds

For i = 1 To 100000000
x = x * y
Next

t = (microseconds - t)/1000000

Print "X= " + str(x) + " T= " + format(t, "#0.00") + " Sec"
[/code]

And placed msgbox msg in the Print event of XojoScript1.

On my now old PC, a built app gives :

  • Regular code : 0.94 s.
  • XojoScript : 0.48 s

It is already way below 0.66 s. If I apply the same ratio to Djamel's benchmark, which is probably the time he would get from the same code on his machine, that would give a whopping 0.35 s. Almost half the VB6 time :slight_smile:

For these kinds of mathematical operations, at least, and within the limits of the exercise, XojoScript could be a good tip to accelerate code.

Here is the project in case you want to try : benchmark.xojo_binary_project

Note : the 0.66 s execution time refers to the same code in VB6.

For me, it looks like the same code is generated:

[code]// If Instr(Somestring, "xyz") > 0 Then

00000081 E846280000 call 0x28cc
00000086 8B65C4 mov esp,[ebp-0x3c]
00000089 50 push eax
0000008A 683C161500 push dword 0x15163c
0000008F FF75E8 push dword [ebp-0x18]
00000092 6800000000 push dword 0x0
00000097 E8B28AFCFF call 0xfffc8b4e
0000009C 8B65C4 mov esp,[ebp-0x3c]
0000009F 8945E0 mov [ebp-0x20],eax
000000A2 33C9 xor ecx,ecx
000000A4 3BC8 cmp ecx,eax
000000A6 0F9CC2 setl dl
000000A9 83E201 and edx,byte +0x1
000000AC 8855DE mov [ebp-0x22],dl
000000AF 33DB xor ebx,ebx
000000B1 3BD3 cmp edx,ebx
000000B3 0F8400000000 jz near 0xb9
000000B9 E80E280000 call 0x28cc

// If Somestring.Instr("xyz") > 0 Then

000000C9 E8FE270000 call 0x28cc
000000CE 8B65C4 mov esp,[ebp-0x3c]
000000D1 50 push eax
000000D2 683C161500 push dword 0x15163c
000000D7 6800000000 push dword 0x0
000000DC FF75E8 push dword [ebp-0x18]
000000DF E8CF89FCFF call 0xfffc8ab3
000000E4 8B65C4 mov esp,[ebp-0x3c]
000000E7 8945D4 mov [ebp-0x2c],eax
000000EA 33C9 xor ecx,ecx
000000EC 3BC8 cmp ecx,eax
000000EE 0F9CC2 setl dl
000000F1 83E201 and edx,byte +0x1
000000F4 8855D2 mov [ebp-0x2e],dl
000000F7 33DB xor ebx,ebx
000000F9 3BD3 cmp edx,ebx
000000FB 0F8400000000 jz near 0x101
00000101 E8C6270000 call 0x28cc
[/code]

There are just minor differences like the order of push commands or the offsets.

The execution time difference is tiny but measurable, though.

Yes - I’ve read every post that each of you has made. I also appreciate the time that everyone has taken to test and/or prove things out.

I guess what I was really hoping for is to get something more definitive from the brains behind the compiler itself - figuring that would be the authoritative source.

As I mentioned briefly, I have had recent experience with .NET (and countless hours spent going over the ramifications) showing that if I take the exact same code and compile it even minutes apart, VS will not generate the exact same executable. If you dig deep enough into the MS knowledge base you can find that MS states this is the case. I was hoping to get something from the Xojo compiler people as to whether there is a difference or not.

I’d be VERY surprised if the same code compiled at different times would result in different code.

And I’m fairly sure Joe would consider it a bug.

Sorry if what you were after was a comparison with VS. You should have made it clearer in your original post, which was simply a question about two notation styles for instr.

It is always nice to acknowledge what people do for you and have the courtesy to entertain a conversation, rather than ignore, if you think the answer is not to your taste.

Christian provided the code you are after, by the way. A simple comment seems a minimum, as a mark of gratitude.

I cannot believe compiling at different times would produce different code. That seems preposterous. How could a compiler, which is a program with no random aspect, produce different code depending on the hours it is run at ? As I wrote before, choices have to be based on science. Not whim, and especially not wild belief.

About science, your code run in XojoScript :
First method : 0.5599976 Microsecond
Dot method : 0.5169983 Microsecond

That is objective measurement. Not fantasies :confused:

Now you have had complete replies to the question you asked, in terms of code and performance, by users like you who use Xojo to produce professional software everyday, and have tried their best to assist :

I guess a follow up question is whether the compiler always generates the identical output for "standalone" functions and the dot functions? If it doesn't it would be great to have some official guide on optimization.

Although I would not dare quote his statement as “official”, Norman Palardy is a prominent engineer at Xojo, Inc. and I believe you have your answer about output.

Interesting. It would be great if Joe could explain this.

one is an extension method
one is not
so the extension method pushes arguments in the order required by that style of call
I’m pretty certain that’s the only difference

BUT I’m NOT the compiler engineer so until/unless Joe replies mine is just an educated guess :slight_smile:
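For readers who have not written one, here is a minimal sketch of what an extension method looks like in Xojo. The method name ContainsText and the idea of putting it in a module of your own are hypothetical, for illustration only; this is not how the framework's own InStr is implemented. The point is simply that the value written before the dot is passed as the parameter marked Extends, which fits the different push order seen in the disassembly.

[code]// Placed in a Module (extension methods must live in a module).
// ContainsText is a hypothetical helper, not part of the Xojo framework.
Function ContainsText(Extends s As String, find As String) As Boolean
  // s receives the value written before the dot at the call site.
  Return InStr(s, find) > 0
End Function

// Elsewhere, for example in a button's Action event:
Dim somestring As String = "abcxyz"
If somestring.ContainsText("xyz") Then
  Beep
End If
[/code]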

Thank you Norman! This is the type of explanation that I was hoping to get. I do understand that you say it is an educated guess. However, considering you have “background” information that none of us possess AND your expertise in general, I am grateful for your “educated guess”.

I apologize to anyone if they felt I was making a comparison to VS. I was simply using that as a point of reference since I know that there can be a difference in that environment and I was not trying to make a direct comparison. Also because of native language differences, I did not mean to offend anyone by not replying directly to each posting. :slight_smile:

Thanks for everyone’s contribution!

Well here you go
http://blogs.msdn.com/b/ericlippert/archive/2012/05/31/past-performance-is-no-guarantee-of-future-results.aspx
And once you read this it makes sense as to why C# might not generate the exact same exe each time.
In fact Eric points out that this is REQUIRED
Now, all this is very interesting, which is why I told you all about it. I could have just cut right to the chase, which is to say: the C# compiler by design never produces the same binary twice. The C# compiler embeds a freshly generated GUID in every assembly, every time you run it, thereby ensuring that no two assemblies are ever bit-for-bit identical. To quote from the CLI specification:

The Mvid column shall index a unique GUID […] that identifies this instance of the module. […] The Mvid should be newly generated for every module […] While the [runtime] itself makes no use of the Mvid, other tools (such as debuggers […]) rely on the fact that the Mvid almost always differs from one module to another.

And for the time difference, maybe it depends on how busy the computer is when we launch the tested applications? We should run the first build many times, and the second build many times.

[quote=127705:@Norman Palardy]Well here you go
http://blogs.msdn.com/b/ericlippert/archive/2012/05/31/past-performance-is-no-guarantee-of-future-results.aspx
And once you read this it makes sense as to why C# might not generate the exact same exe each time.
In fact Eric points out that this is REQUIRED
Now, all this is very interesting, which is why I told you all about it. I could have just cut right to the chase, which is to say: the C# compiler by design never produces the same binary twice. The C# compiler embeds a freshly generated GUID in every assembly, every time you run it, thereby ensuring that no two assemblies are ever bit-for-bit identical. To quote from the CLI specification:

The Mvid column shall index a unique GUID […] that identifies this instance of the module. […] The Mvid should be newly generated for every module […] While the [runtime] itself makes no use of the Mvid, other tools (such as debuggers […]) rely on the fact that the Mvid almost always differs from one module to another.[/quote]

Thank you for this reference. I had completely missed the necessity for a compiler to use unique identifiers, even though I routinely see the concept used in Web Edition control IDs. That is a very obvious reason. And having a unique GUID makes perfect sense.

What is more interesting is this :
Moreover: let’s not forget here that it’s not the IL that runs; yet another compiler will translate the IL to machine code. The jitter certainly does not guarantee that jit compiling the same code twice produces the same machine code; the jitter can be using all kinds of runtime information to tweak the generated code to optimize it for size, for speed, for debuggability, and so on, at its whim.

Xojo does not seem to exhibit such variability. I built the same project twice and compared the two binary files with Hexedit. Only two spans differ: a 16-byte string at offset 04D6, and a 3-byte string that looks very much like a build number at offset 19455 (0000030A and 00000615).
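For anyone who wants to repeat this comparison without a hex editor, here is a rough sketch in Xojo that counts the differing bytes between two files picked by the user. It is only an illustration of the idea, not the tool used above: it assumes both files fit in memory and are not empty, and it has no error handling for unreadable files.

[code]// Let the user pick the two builds to compare.
Dim f1 As FolderItem = GetOpenFolderItem("")
Dim f2 As FolderItem = GetOpenFolderItem("")
If f1 Is Nil Or f2 Is Nil Then Return

// Read both files fully into memory (assumes they are small enough).
Dim s1 As BinaryStream = BinaryStream.Open(f1, False)
Dim s2 As BinaryStream = BinaryStream.Open(f2, False)
Dim m1 As MemoryBlock = s1.Read(s1.Length)
Dim m2 As MemoryBlock = s2.Read(s2.Length)
s1.Close
s2.Close

// Compare byte by byte over the length both files share.
Dim n As Integer = Min(m1.Size, m2.Size)
Dim diffs As Integer
Dim firstDiff As Integer = -1
For i As Integer = 0 To n - 1
  If m1.Byte(i) <> m2.Byte(i) Then
    diffs = diffs + 1
    If firstDiff < 0 Then firstDiff = i
  End If
Next

// firstDiff stays at -1 when no differing byte was found.
MsgBox "Sizes: " + Str(m1.Size) + " / " + Str(m2.Size) + EndOfLine + _
  "Differing bytes: " + Str(diffs) + EndOfLine + _
  "First difference at offset: " + Str(firstDiff)
[/code]

A byte-by-byte count like this only makes sense when the two files have the same layout; if one build inserts or removes bytes early on, everything after that point will show up as different.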