DynaPDF - Extract text from a specific area

Pietro_Beccegato · June 7, 2014, 4:03pm

I have to develop an application with Xojo and DynaPDF that extracts text from a specific area of a PDF, defined as page, x, y, width, height.
I’ve spent many hours searching for a method in the DynaPDF manual and in the MonkeyBread DynaPDF examples folder but, although I have the feeling that it is possible, I could not find one.
Does anyone have previous experience with this?
In the MBS DynaPDF examples there’s a project called “Text extraction.rbp”, which should be a good starting point.
A small piece of sample code would be great!

Any suggestion is appreciated.
Thanks in advance

Pietro

Christian_Schmitz · June 7, 2014, 4:05pm

We have the Extract text.rbp example which also shows the coordinates of the text fragments found.
You could filter those fragments by their coordinates and build the text for the area you need yourself.

Pietro_Beccegato · June 7, 2014, 4:25pm

I studied the project “Extract text.rbp”; it returns what was found in the PDF and its coordinates.
What I need is a little bit different.
My app will monitor some folders, and when a PDF arrives in one of them, I’ll have to search an area previously defined on a sample PDF. So my requirement is not to know what was found and at what position; my requirement is to know what is in a specific area, which could also be empty.

I hope I have explained it clearly.

Christian_Schmitz · June 7, 2014, 6:03pm

The example gets a list of text fragments with their coordinates, and you could filter on those coordinates to keep only the text inside the area.
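
For illustration, the filtering idea might look roughly like the sketch below. It is written in Python rather than Xojo and does not use the DynaPDF API: `Fragment`, `Area`, and `text_in_area` are hypothetical names, and the sketch assumes each fragment has already been read out with a page number and an x/y anchor point in the same coordinate system as the area (origin at the bottom-left, as is common in PDF). An area with no matching fragments simply yields an empty string.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical text fragment as reported by a text extraction pass."""
    page: int
    text: str
    x: float
    y: float


@dataclass
class Area:
    """Rectangle on a page: (x, y) is the lower-left corner (assumption)."""
    page: int
    x: float
    y: float
    width: float
    height: float

    def contains(self, frag: Fragment) -> bool:
        # Only the fragment's anchor point is tested, not its full extent.
        return (
            frag.page == self.page
            and self.x <= frag.x <= self.x + self.width
            and self.y <= frag.y <= self.y + self.height
        )


def text_in_area(fragments: list[Fragment], area: Area) -> str:
    """Return the text of all fragments anchored inside the area.

    Returns an empty string when the area is blank.
    """
    hits = [f for f in fragments if area.contains(f)]
    # Read top to bottom, then left to right. Sorting by descending y
    # assumes the origin is at the bottom-left of the page.
    hits.sort(key=lambda f: (-f.y, f.x))
    return " ".join(f.text for f in hits)
```

Because only the anchor point of each fragment is checked, a fragment that starts just outside the rectangle but extends into it would be missed; if that matters, the test would need the fragment's width and height as well.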

Pietro_Beccegato · June 7, 2014, 6:15pm

I will try what you suggested as soon as possible.

Thanks for your reply.