JumpToCase- a structured High performance Jump table feature

  1. 6 months ago

    Karen A

    Jun 15 Pre-Release Testers
    Edited 6 months ago

    I have been thinking about ways to get the fastest code execution when one need it in Xojo without resorting to plugins or arcane coding tricks...
    That was the impetus of my Freezable MemoryBlock request.

    My GoTo thread reminded me of a feature in fortran I used back in teh stone age of computing ... It was essentially a jump table... so I stated think of how could such a feature fit into the Xojo philosophy and this is the feature request I came up with:

    Feedback Case #56070

    In general Xojo has a reputation for being slow at certain things so that good performance for them can only be obtained from using plugs or som very non intuitive code…

    This is one suggestion to start adding features to help change that (another is 55820 - Enable “freezing” Memoryblocks… enable making a mutable memoryblocks Immutable for processing)

    The Xojo Select Case is very flexible and very readable… but the flexibility limits it’s performance …

    Very often Select Case statements are used to select one option from a number of predefined ones say by an enum. There in that case all one needs is jump table.

    From back when i learned fortran there used to be something called a computed GoTo… What I am proposing here is essentially a structured computed goto..

    It could look something like this:

    JumpTo Case MyEnumVariable
        Case Enum.member1
            ’Some code
        Case Enum.member2
           ’Some code
        Case Enum.member3
           ’Some code
        Else
           ’Some code
    End JumpTo

    Under the hood Xojo could setup a jump table based an Enum values as the index into the table, What then could happen under the hood could be:

    JumpToAddress AddressArray(Ctype(MyEnumValiable, Integer) )

    If there is no case for that Enum member, it goes to the Else Address.

    If an integer that is not in the enum definition was cast to the enum type, it could either go to the ELSE clause or throw an exception. (I would rather see it go to ELSE)

    The execution of this should be blazing fast compared to a LONG select case statement. It would also be a lot less work (as well as more flexible) than settng up an array of delegates to do the same sort of thing

    Please consider implementing something like this for cases when high performance is required… and maybe we can can think of more such things that can be part of API 2.0 for situations when performance is paramount and the current features don’t offer a good solution.

    Thanks
    -Karen

    Yet the computed GoTo was 25% faster than a C Switch statement? That a Xojo Select Case could be faster than a jump table just seems wrong... and as I said I suspect it had something to do with a compiler optimization made possible by you calling it in a loop like that.

    On that link (https://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables ) he said:

    Benchmarking
    I did some simple benchmarking with random opcodes and the goto version is 25% faster than the switch version. This, naturally, depends on the data and so the results can differ for real-world programs.

    Comments inside the CPython implementation note that using computed goto made the Python VM 15-20% faster, which is also consistent with other numbers I've seen mentioned online.

    Well I'm not a professional coder, don't know C or the inner workings of compilers ... but logically it seems a computed goto SHOULD be significantly faster than a Select Case...

    Anyway you are probably right that Xojo Inc won't implement my feature request, so it's all academic with respect to Xojo.

    - karen

  2. Norman P

    Jun 15 Pre-Release Testers, Xojo Pro Outside Innisfail

    You have just reinvented "select case"

  3. Karen A

    Jun 15 Pre-Release Testers
    Edited 6 months ago

    @Norman P You have just reinvented "select case"

    I don't think so.

    This not as flexible and all the case options are essentially resolved at compile time (No IS < or IsA or variables in each case to deal with). Because of that it enables a lot more efficient/faster code as it becomes a simple jumpable .. which select case cannot be by it's nature.

    As does not have to test each Case condition individually as Select case does, it just jumps to the desired code.

    You yourself said a Jump table of using array of delegates would be blazingly fast... Well this would be as fast or faster and a heck of ago easier to set up...

    So maybe you want to reconsider your opinion?

    - karen

  4. Norman P

    Jun 15 Pre-Release Testers, Xojo Pro Outside Innisfail

    nope
    Used something almost analogous to this in basic way back when and it cannot be completely resolved at compile time since the variable that controls it HAS to be compared to figure out which jump to take
    That can only be done at runtime
    Its, at most, a special case of "select case" with nothing but integer cases

    Its not the same as the jump table I was talking about

  5. Norman P

    Jun 15 Pre-Release Testers, Xojo Pro Outside Innisfail

    current docs for fortran computed goto from a currently shipping compiler

    https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-goto-computed

    GOTO - Computed
    Statement: Transfers control to one of a set of labeled branch target statements based on the value of an expression. It is an obsolescent feature in Fortran 95.

    When the computed GO TO statement is executed, the expression is evaluated first. The value of the expression represents the ordinal position of a label in the associated list of labels. Control is transferred to the statement identified by the label. For example, if the list contains (30,20,30,40) and the value of the expression is 2, control is transferred to the statement identified with label 20.
    If the value of the expression is less than 1 or greater than the number of labels in the list, control is transferred to the next executable statement or construct following the computed GO TO statement.

    and they replaced the computed goto in fortran 95 with ... select case :)

  6. Karen A

    Jun 15 Pre-Release Testers
    Edited 6 months ago

    @Norman P Used something almost analogous to this in basic way back when and it cannot be completely resolved at compile time since the variable that controls it HAS to be compared to figure out which jump to take
    That can only be done at runtime
    Its, at most, a special case of "select case" with nothing but integer cases

    Logically I think you are wrong in that.

    Using a Enum means all the possibilities are know at compile time and the jump table can certainly be created then... The array would contain a entry for every enum... for those not used in the statement that address would be that of the else clause. That is why I said an exception might be thrown is some other number was cast to that enum type that was not in the definition. That is how it differs from the just using integers

    Yes some code has to execute at run time but it is only using the Enum as an index into precompiled jump Table

    BTW how is it different in basic concept from using an array of delegates?

    As I said I am looking for ways to speed things up in xojo...

    BTW I am aware they dropped it in fortran ...but in Xojo it would still give a performance improvement when needed .. and (i know famous last words) should not be difficult to implement

    - karen

  7. Beatrix W

    Jun 15 Pre-Release Testers, Third Party Store Europe (Germany)

    Where do you see performance problems with a select case? Please show us.

  8. Norman P

    Jun 15 Pre-Release Testers, Xojo Pro Outside Innisfail

    @Karen A As I said I am looking for ways to speed things up in xojo...

    If you can actually measure the speed of a select case over everything else in your app I'd be impressed - and skeptical at the same time
    I seriously doubt select case is a performance bottle neck

  9. Norman P

    Jun 15 Pre-Release Testers, Xojo Pro Outside Innisfail
    Edited 6 months ago

    I suspect that given a select case with simple types & comparisons where the case expressions are completely computable at compile time that they are and that LLVM would already optimize code to take advantage of something like you describe

    like this http://lists.llvm.org/pipermail/llvm-dev/2017-February/110172.html

  10. Sam R

    Jun 15 Pre-Release Testers, Xojo Pro, Third Party Store Hengchun, Pingtung, Taiwan

    One way in which I speed up functions using select case, was to use computed properties.

    The processing is separated out into different methods on the object, all with the same structure.

    Using a computed property, when a governing value is set, it then creates a delegate to use the appropriate method.

    When the processing is performed, it then simply calls the delegates, rather than doing the select case or if, else & end if.

    In the IDE the performance boost is quite significant; in the compiled application not so much, but still a welcome improvement.

  11. @Karen A — Honestly, I understand your point but is it factually a problem somewhere? If Xojo were that slow, projects like CustomEditField (which analyzes and format source code in real-time) could not exist at all.

    There is plenty of room for optimization in Xojo, but I do not think that if...endif or select case are part of it.

  12. Norman P

    Jun 16 Pre-Release Testers, Xojo Pro Outside Innisfail

    here an example of the kind of jump table I'm talking about

    maxrounds is a constant set to 100000

    Private Sub JumpTable()
      Dim startTime As Double = Microseconds
      
      Dim a() As NoParamDelegate
      
      a.append WeakAddressOf Untitled0
      a.append WeakAddressOf Untitled1
      a.append WeakAddressOf Untitled2
      a.append WeakAddressOf Untitled3
      a.append WeakAddressOf Untitled4
      a.append WeakAddressOf Untitled5
      a.append WeakAddressOf Untitled6
      a.append WeakAddressOf Untitled7
      a.append WeakAddressOf Untitled8
      a.append WeakAddressOf Untitled9
      
      
      For i As Integer = 1 To MaxRounds
        
        For j As Integer = 0 To 9
          
          a(j).Invoke
          
        Next
      Next
      
      Dim endtime As Double = Microseconds
      
      Dim difftime As Double = endtime - startTime
      
      System.DebugLog CurrentMethodName + " " + Str(difftime)
    End Sub

    when compared to

    Dim startTime As Double = Microseconds
    
    For i As Integer = 1 To MaxRounds
      
      For j As Integer = 0 To 9
        
        select case j
          
        Case 0
          Untitled0
        Case 1
          Untitled1
        Case 2
          Untitled2
        Case 3
          Untitled3
        Case 4
          Untitled4
        Case 5
          Untitled5
        Case 6
          Untitled6
        Case 7
          Untitled7
        Case 8
          Untitled8
        Case 9
          Untitled9
          
        End Select
      Next
    Next
    
    Dim endtime As Double = Microseconds
    
    Dim difftime As Double = endtime - startTime
    
    system.DebugLog CurrentMethodName + " " + str(difftime)

    there are 10 methods named untitled0 to untitled9

    when this code is run in the debugger the jump table performing all those iterations is nearly 2x as fast as select case
    when compiled with aggressive optimization the SELECT CASE is faster by about 20%
    this is what makes me think that in certain circumstances the LLVM optimizations apply and are quite good

  13. Karen A

    Jun 16 Pre-Release Testers

    @Norman P when this code is run in the debugger the jump table performing all those iterations is nearly 2x as fast as select case
    when compiled with aggressive optimization the SELECT CASE is faster by about 20%
    this is what makes me think that in certain circumstances the LLVM optimizations apply and are quite good

    I understood what you meant by your type of jump table and how to set it up.

    I don't know how smart LLVM is .... but if it is smart enough, in your code (which would not be the general case) it MIGHT simplify to a jump table (or some other optimization) only because it can know that the select case will only see the arguments 0-9... something that for the more general case the compiler could not know. I suspect making sure one is testing what one thinks they are, could be tricky with an optimizing compiler!

    So for the general case i suspect my type of Jump statement could still be significantly faster than the complier optimized select case as it should be able to have less overhead. It could ALWAYS literally just be a jump table with the argument being the table index - I don't think anything could take less cycles... but without it being implemented there is no way to prove it.

    That said I guess the argument against it is that the speed gain may not be worth the effort. While in theory I think it should be significant, it is hard to argue the point without data.

    BTW in the class I am writing with my testing data, Enum based Select case statements are being called a total of about 370,000 times and the select case statements have about 40 items each ... That data set is bigger than what i expect to have to typically deal with, but I think it's a good one to code against for tuning performance.

    - karen

  14. Norman P

    Jun 16 Pre-Release Testers, Xojo Pro Outside Innisfail

    there are hundreds, if not thousands, of folks working on LLVM

    If its possible to optimize select case down to a jump table I would expect its already being done (that was the gist of the post I referred to previously)

    Personally I would say Xojo focus on things that are much much higher priority than an experiment like this

  15. Karen A

    Jun 16 Pre-Release Testers
    Edited 6 months ago

    BTW I came across this with respect to C code:

    https://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables

  16. Norman P

    Jun 16 Pre-Release Testers, Xojo Pro Outside Innisfail

    this is precisely what the code I posted earlier did - and that was surprisingly SLOWER than select case

  17. Karen A

    Jun 16 Pre-Release Testers Answer
    Edited 6 months ago

    Yet the computed GoTo was 25% faster than a C Switch statement? That a Xojo Select Case could be faster than a jump table just seems wrong... and as I said I suspect it had something to do with a compiler optimization made possible by you calling it in a loop like that.

    On that link (https://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables ) he said:

    Benchmarking
    I did some simple benchmarking with random opcodes and the goto version is 25% faster than the switch version. This, naturally, depends on the data and so the results can differ for real-world programs.

    Comments inside the CPython implementation note that using computed goto made the Python VM 15-20% faster, which is also consistent with other numbers I've seen mentioned online.

    Well I'm not a professional coder, don't know C or the inner workings of compilers ... but logically it seems a computed goto SHOULD be significantly faster than a Select Case...

    Anyway you are probably right that Xojo Inc won't implement my feature request, so it's all academic with respect to Xojo.

    - karen

  18. Carsten B

    Jun 16 Pre-Release Testers, Xojo Pro Germany, NRW, Europe
    Edited 6 months ago

    My measurements of Norman's methods compiled as 64-bit app with aggressive optimization in Xojo 2019r1.1:
    Mac: SelectCase is always 5-10 ms faster than JumpTable.
    Win: JumpTable is always about 5 ms faster than SelectCase.

    5-10 ms difference at 100.000 calls!

    BTW: Norman's JumpTable sample is not realistic compared to the SelectCase because it's missing the comparison. If it's used in a real-world solution, a VM for example which method to invoke relies on an op-code, you first have to determine the index of the jump table entry.

  19. Norman P

    Jun 16 Pre-Release Testers, Xojo Pro Outside Innisfail
    Edited 6 months ago

    @Karen A Well I'm not a professional coder, don't know C or the inner workings of compilers ... but logically it seems a computed goto SHOULD be significantly faster than a Select Case...

    Except that LLVM already HAS optimizations in palce for "switch lowering" that basically do EXACTLY what you're suggesting
    They just DONT expose it as a language construct that requires you to choose select case vs computed goto
    It just does the very best it can, and in certain circumstances, it amounts to a computed goto

    And thats the beauty of something like Xojo
    You should not have to pick a certain language feature or another to get best performance (esp if you turn on aggressive optimizations) It should just be done when & where it can be and you dont do anything more than write the same code you always did

    At least in theory thats how it should be
    I know there are some internal details that may hinder this as code is transformed from the text we write into intermediate code then optimized and machine code eventually output BUT I'm not permitted to talk about internals so ........ I just know it would take a "compiler guy" to alter them and its a non-trivial bit of work.

  20. Norman P

    Jun 16 Pre-Release Testers, Xojo Pro Outside Innisfail

    @Carsten B BTW: Norman's JumpTable sample is not realistic compared to the SelectCase because it's missing the comparison. If it's used in a real-world solution, a VM for example which method to invoke relies on an op-code, you first have to determine the index of the jump table entry.

    Actually that jump table is simply too brief - it was simply to illustrate a point
    But I have actually used a set up like that as a finite state machine for handling streams of characters very fast
    No IF statement required - you simply use ASC(char) as the index into a table and execute what ever delegate is there
    Thats the point

    Now the entries in the array may not be static and could change as states change but there's no "if" required
    That also is how it was used in the VM

or Sign Up to reply!