Hi all
I’m working on a small HTML parser that loads a page into a hierarchical structure of HTMLNode objects. My current approach uses a recursive method that steps down into each node, and my 24.5 KB test file parses in about 64 ms when run in the IDE.
I’m now looking at ways to improve the speed where possible and I wanted to check that recursion is still an appropriate choice here given a couple of issues I have:
- Theoretically, with an HTML structure many levels deep, I could run out of stack space during processing. I haven’t encountered this yet, so I’m not sure how likely it is.
- I imagine there is a time penalty each time another function is called. If I’m processing hundreds or thousands of objects via recursion, is there a point where the overhead of all those calls outweighs doing the same processing in a loop inside a single function call? How significant is this?
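
To make the comparison concrete, here is a rough sketch in Python of the two options I’m weighing. This is not my actual parser code: the HTMLNode class below is a simplified stand-in, and the names visit_recursive, visit_iterative and build_chain are made up for illustration. The first function walks the tree recursively; the second does the same walk with an explicit stack, so its depth is limited by available memory rather than by the call stack.

```python
class HTMLNode:
    """Simplified, hypothetical stand-in for the parser's node type."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children if children is not None else []


def visit_recursive(node, visit):
    """Depth-first walk; uses one call-stack frame per nesting level."""
    visit(node)
    for child in node.children:
        visit_recursive(child, visit)


def visit_iterative(root, visit):
    """Same depth-first walk, but with an explicit stack instead of recursion."""
    stack = [root]
    while stack:
        node = stack.pop()
        visit(node)
        # Push children in reverse so they are popped in document order.
        stack.extend(reversed(node.children))


def build_chain(depth):
    """Build a degenerate tree: `depth` nodes, each nested inside the previous."""
    root = HTMLNode("div")
    node = root
    for _ in range(depth - 1):
        child = HTMLNode("div")
        node.children.append(child)
        node = child
    return root


if __name__ == "__main__":
    deep = build_chain(5000)
    seen = []
    visit_iterative(deep, lambda n: seen.append(n.tag))
    print(len(seen))
```

Both functions visit nodes in the same order, so switching between them should not change the resulting structure. Whether the explicit stack is actually faster would need measuring in the real language and runtime; the sketch only shows that the depth limit goes away.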
If anyone could offer any advice or tips on this, it would be much appreciated.