TextArea and StyledText.TextColor > Performance?

Currently I am working on a little text editor as a personal practice project. By now the highlighting (keywords, strings, comments) is working as I intended it to, but there is one thing bothering me.

The highlighting works in two steps: 1. tokenizing the text, 2. iterating through the token list and coloring the corresponding parts with the TextColor method of my TextArea’s StyledText property. The lexing works reasonably fast, but the coloring seems to be rather slow: when highlighting a long Java example file, the lexing takes 27 milliseconds, whereas the coloring takes 70(!) additional milliseconds, although it does nothing more than:

For Each item As Token In Tokens
  Me.StyledText.TextColor(item.Start, item.Length) = item.Col // "Me" is my subclassed TextArea
Next

Is there a faster way to color certain text parts (I would try StyleRuns, but as I remember they take even longer than the TextColor method), or am I just hitting the limits of Xojo’s TextArea control?
The problem mainly occurs when loading a text file or changing bigger parts of the text, because while typing the highlighting method only processes the changed part. But still, that’s bugging me …

I tried that exact same method and rejected it in favor of a RegEx-based solution.

It takes a few passes but is actually quite fast:

  • Highlight all the keywords (regardless of their context)
  • Highlight all the numeric digits
  • Highlight all Quoted Strings (this undoes any keywords inside quotes)
  • Highlight any line comments starting with // or ’ (through the end of the line, where the comment marker is not inside a quoted string)

This is assuming you are going for something like the Xojo IDE. BLOCK comments (/* */) are a lot trickier.
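
The passes listed above could be sketched roughly like this. It is written in Python only to keep it short and runnable; the keyword list, the style names and the function highlight_line are made up for illustration, and a Xojo version would use its own RegEx class and real colors instead. Later passes simply overwrite earlier ones, which is how a keyword inside a quoted string ends up styled as a string.

```python
import re

# Hypothetical keyword list and style names, chosen only for illustration.
KEYWORDS = ("if", "then", "else", "for", "next", "dim", "as")

# Passes run in this order; each later pass overwrites the earlier ones.
PASSES = [
    ("keyword", re.compile(r"\b(?:" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)),
    ("number", re.compile(r"\b\d+(?:\.\d+)?\b")),
    ("string", re.compile(r'"[^"\n]*"')),
]
COMMENT_START = re.compile(r"//|'")


def highlight_line(line):
    """Return one style name per character of a single line of text."""
    styles = ["plain"] * len(line)
    for name, pattern in PASSES:
        for m in pattern.finditer(line):
            styles[m.start():m.end()] = [name] * (m.end() - m.start())
    # Comment pass: the first marker that is not inside a quoted string
    # turns the rest of the line into a comment.
    for m in COMMENT_START.finditer(line):
        if styles[m.start()] != "string":
            styles[m.start():] = ["comment"] * (len(line) - m.start())
            break
    return styles
```

Because everything is decided within one line, this only covers line comments; block comments would need state carried over from the previous lines.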

Thanks for the suggestion, I will give it a try!

(Block comments or keywords in strings etc. are no problem, my tokenizer takes care of that.)

EDIT: I think I misunderstood your suggestion. The problem is not recognizing the tokens I have to color, but the coloring method itself. My first approach was to call SelStart/SelLength/SelTextColor for each token, but that was much, much slower than using StyledText.TextColor. I wondered whether there is an even faster approach …

EDIT 2: I tried a RegEx (and RegExMBS) approach before, but that was a bit slower (about 20%) than tokenizing the text.

First off, you don’t recolor the entire document, only the part that is visible and/or changed.
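
As a rough illustration of limiting the work to a character range (Python, not Xojo; the Token class and the function tokens_to_recolor are hypothetical names, with start/length mirroring the Start/Length fields from the loop in the question):

```python
from dataclasses import dataclass


@dataclass
class Token:
    start: int   # offset of the first character
    length: int  # number of characters
    color: str   # placeholder for a real color value


def tokens_to_recolor(tokens, first, last):
    """Keep only tokens that overlap the half-open character range [first, last)."""
    return [t for t in tokens if t.start < last and t.start + t.length > first]
```

How the visible range is obtained from the control is left out here, since that depends on the editor.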

But if you have BLOCK comments, that is where it gets tricky… if the start or end isn’t visible, did you just start a new block, end a block, or are you in the middle? I think that is one reason Xojo doesn’t support them, so the IDE rehighlights one line at a time, and the context is entirely within that line.
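
One common way around the "am I in the middle of a block?" question is to remember, for every line, whether it starts inside an open block comment. A minimal sketch in Python (not Xojo; block_comment_states is a hypothetical name, and for brevity it assumes that /* and */ never appear inside strings or line comments):

```python
def block_comment_states(lines):
    """For each line, report whether it begins inside an open /* ... */ block."""
    states = []
    inside = False
    for line in lines:
        states.append(inside)
        pos = 0
        while True:
            # Look for the closing marker while inside a block, else the opening one.
            marker = "*/" if inside else "/*"
            found = line.find(marker, pos)
            if found < 0:
                break
            inside = not inside
            pos = found + 2
    return states
```

With such a list cached, a single line can be rehighlighted using only its own text plus the stored state for that line; the states after an edited line need to be recomputed only until they stop changing.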

My implementation is many times faster, as it doesn’t need to tokenize each line (I do have code for that, which is used by my Transpiler project, but it is not part of the “editor”), and it too relies on the content of a single “line” of text.

My TextArea only highlights changed text while typing, so that’s not the problem. But when I load/reload a whole file, I have to color all tokens, and it takes a bit too long for my taste (1800 tokens need approximately 70 milliseconds to be colored), and that was where my question was heading…
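
For scale, 70 ms over 1800 tokens is roughly 0.04 ms per TextColor assignment. If that cost is mostly per call (an assumption, not something measured in this thread), one thing that could be tried is cutting the number of calls: skip tokens that already have the default color and merge directly adjacent tokens of the same color. A sketch in Python (not Xojo; Token, coalesce and the color names are hypothetical, and dropping default-colored tokens assumes the text was reset to the default color first, e.g. right after loading):

```python
from dataclasses import dataclass


@dataclass
class Token:
    start: int   # offset of the first character
    length: int  # number of characters
    color: str   # placeholder for a real color value


def coalesce(tokens, default_color="black"):
    """Merge touching tokens of the same color and drop default-colored ones."""
    runs = []
    for tok in tokens:
        if tok.color == default_color or tok.length == 0:
            continue  # nothing to color for this token
        last = runs[-1] if runs else None
        if last and last.color == tok.color and last.start + last.length == tok.start:
            last.length += tok.length  # extend the previous run
        else:
            runs.append(Token(tok.start, tok.length, tok.color))
    return runs
```

Whether this actually shortens the 70 milliseconds would have to be timed in the real TextArea.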