Anyone familiar with raw PDF Code?

DaveS · October 31, 2013, 9:53pm

I cannot for the life of me get it to recognize the CropBox. Based on everything I have read, if the CropBox is SMALLER than and INSIDE the MediaBox, then it crops the drawing/text to make that the “visible” area. The idea here is to use the CropBox to create page margins, as PDF is capable of drawing edge to edge, and printers are not.

This code creates a RED box with two diagonal lines. It SHOULD be CENTERED in the page, but seeing as it is ignoring the Cropbox, it appears in the upper left corner instead.

Anyone have any ideas? I cannot believe I have what I thought were the “hard parts” figured out, and this should be so “simple”…

I will be glad to email the actual PDF file to anyone… and the first person that can come up with a viable solution or point out what I am doing wrong, will get two things from me. My grateful appreciation… and a free license to the end product I am creating

Note : my understanding is this
/MediaBox [0 0 612 792]
This line sets the PAGESIZE : 8.5x11 at 72dpi

/Cropbox [144 144 324 504]
This SHOULD set 2 inch margins all the way around… but seems to do absolutely nothing.

%PDF-1.2
%
1 0 obj
<<
/Author (Not Provided)
/Title (Not Provided)
/Subject(My Application.debug)
/CreationDate (D:20131031143808Z00'00')
/ModDate (D:20131031143808Z00'00')
/Producer (SimplePDF for XOJO)
/Creator (My Application.debug)
>>
endobj

2 0 obj
<<
/Type /Catalog
/Outlines 3 0 R
/Pages 4 0 R
>>
endobj

3 0 obj
<<
/Type /Outlines
/Count 0
>>
endobj

4 0 obj
<<
/Type /Pages
/MediaBox [0 0 612 792]
/Cropbox [144 144 324 504]
/Kids [5 0 R]
/Count 1
>>
endobj

5 0 obj
<<
/Type /Page
/Parent 4 0 R
/Contents 6 0 R
/MediaBox [0 0 612 792]
/Cropbox [144 144 324 504]
>>
endobj

6 0 obj % page content
<<
/Length 76
>>
stream
5 w
1. 0. 0. RG
0 792 m
324 288 l
s
324 792 m
0 288 l
s
0 792 324 -504 re
s
endstream
endobj

xref
0 5
0000000000 65535 f
0000000236 00000 n
0000000302 00000 n
0000000349 00000 n
0000000458 00000 n
0000000573 00000 n
trailer
<<
/Size 5
/Info 1 0 R
/Root 2 0 R
>>
startxref
714
%%EOF

Christian_Schmitz · October 31, 2013, 10:06pm

If I’m correct, the coordinates are relative to media box.
Cropbox may or may not be applied by the viewer to crop.

DaveS · October 31, 2013, 10:14pm

Yes… and assuming they are… why does it not work?
This should define a page 8.5"x11" (which by the way it does)… and a viewable area of 4.5"x7" offset inside that area by 2"

144/72=2"

And to the best I can tell… this matches what few clear code PDF examples I can find… it is amazing how convoluted the documentation seems to be.

Jeff_Tullin · October 31, 2013, 10:38pm

I have a sneaking suspicion that what you need is not CropBox but TrimBox…
Cropbox is optional.
Usually these are set to the same where both are present.

Peter_Truskier · October 31, 2013, 10:52pm

[quote=43551:@Christian Schmitz]If I’m correct, the coordinates are relative to media box.
Cropbox may or may not be applied by the viewer to crop.[/quote]
I’m not sure this is true. It generally works this way because in virtually all PDFs, the MediaBox has a 0,0 origin (as Dave’s does). But, there is no requirement in the PDF spec that this be the case. I believe that all the boxes exist in the same coordinate space.

While it is optional, it is what determines how the PDF will be displayed in a PDF Viewer, so I think it IS what Dave wants.

I’m not sure why you say that it should be centered. When I open the PDF in Acrobat, the red object is drawn flush to the upper left corner. So, if you set the crop box as you have it, I would expect the red object to be partially cropped.

You are using a lower left origin, right (unlike Xojo graphics)? Also, I notice that when I open it in Acrobat, it seems to “fix” it, as evidenced by the fact that Acrobat offers to save it for me when I close it. There is an ever so brief dialog box that appears when I open it, but computers are fast enough and the PDF simple enough that it is barely noticeable. This usually means that the XRef offsets are screwed up. This could be due to my text editor having added a BOM or otherwise having disturbed the byte counts, but you should see if Acrobat offers to save it for you using your “real” PDF.

Of course, now that I’ve told three of you that you are wrong, I haven’t yet been able to get it to crop, either

I’ll try to have a deeper look a little later.

DaveS · October 31, 2013, 11:03pm

Ok… I threw in ALL the PDF boxes in order

ArtBox->BleedBox->TrimBox->MediaBox->CropBox

and ultimately got it to work… and I’m not REALLY sure what changed…

/MediaBox [0 0 612 792]
/CropBox [144 144 468 648]

gives the exact output I want… although I did find one item that confused the crud out of me.

PREVIEW.APP ONLY SHOWS THE CROPBOX contents… it does NOT show a “PAGE” you have to do a PRINT PREVIEW to see that …

But what I changed was this.

The MediaBox is based on the Pagesize (8.5"x11") times 72dpi… per the PDF specs.
Originally I was calculating the Cropbox as LeftMargin, TopMargin, Width-LeftMargin-RightMargin, Height-TopMargin-BottomMargin

NOW (the working model) is Cropbox as LeftMargin, TopMargin, Width-LeftMargin, Height-TopMargin

I printed a test… and the edges of my red box were all EXACTLY 2" from the edges of the paper…
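For reference, a minimal Python sketch of the margin arithmetic. It rests on two facts about PDF that may explain why the second version works: a rectangle is written as two opposite corners (lower-left x, lower-left y, upper-right x, upper-right y) rather than as an origin plus a width and height, and name keys are case-sensitive, so `/Cropbox` is not the same key as `/CropBox`. The function names are hypothetical and not part of SimplePDF. Using separate right and bottom margins is an assumption that generalises the formula above, which only matches when the margins are symmetric.

```python
# Hypothetical helpers for illustration; not taken from SimplePDF.
POINTS_PER_INCH = 72

def crop_box(page_w, page_h, left, right, top, bottom):
    """Return [llx, lly, urx, ury] in points; all arguments are in points."""
    # Lower-left corner is (left, bottom); upper-right is measured
    # back from the page's right and top edges.
    return [left, bottom, page_w - right, page_h - top]

def rect_to_pdf(values):
    """Format a list of numbers as a PDF array, e.g. [144 144 468 648]."""
    return "[" + " ".join(str(v) for v in values) + "]"

margin = 2 * POINTS_PER_INCH          # 2 inch margins on a Letter page
box = crop_box(612, 792, margin, margin, margin, margin)
print("/CropBox " + rect_to_pdf(box))  # /CropBox [144 144 468 648]
```

Read as corners, the original `[144 144 324 504]` is still a valid rectangle, just a smaller one that ends at (324, 504).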

In answer to someone’s question above.

For all the graphics (and soon text) items in my class…

The X coordinate becomes : X+MARGIN_LEFT
The Y coordinate becomes : HEIGHT-Y-MARGIN_TOP

So “Xojo” coordinates go in, but are transformed immediately to PDF coordinates

This makes the TOP/LEFT corner of the CROPBOX the 0,0 point.
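The transform above can be sketched in a few lines of Python. The constant and function names are hypothetical; the values assume the Letter page and 2 inch margins used in this thread.

```python
# Illustrative constants: Letter page height and 2 inch margins, in points.
PAGE_HEIGHT = 792
MARGIN_LEFT = 144
MARGIN_TOP = 144

def to_pdf(x, y):
    """Map a top-left-origin point (relative to the CropBox) to PDF user space."""
    return (x + MARGIN_LEFT, PAGE_HEIGHT - y - MARGIN_TOP)

print(to_pdf(0, 0))      # (144, 648): top-left corner of the CropBox
print(to_pdf(324, 504))  # (468, 144): bottom-right corner of the CropBox
```

These two points match the `144 648 m` and `468 144 l` operators in the working listing.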

Here is the working model (as compared to the code in the first post)

%PDF-1.2
%
1 0 obj
<<
/Author (Not Provided)
/Title (Not Provided)
/Subject(My Application.debug)
/CreationDate (D:20131031160134-00'00')
/ModDate (D:20131031160134-00'00')
/Producer (SimplePDF for XOJO)
/Creator (My Application.debug)
>>
endobj

2 0 obj
<<
/Type /Catalog
/Outlines 3 0 R
/Pages 4 0 R
>>
endobj

3 0 obj
<<
/Type /Outlines
/Count 0
>>
endobj

4 0 obj
<<
/Type /Pages
/MediaBox [0 0 612 792]
/CropBox [144 144 468 648]
/Kids [5 0 R]
/Count 1
>>
endobj

5 0 obj
<<
/Type /Page
/Parent 4 0 R
/Contents 6 0 R
>>
endobj

6 0 obj % page content
<<
/Length 82
>>
stream
5 w
1. 0. 0. RG
144 648 m
468 144 l
s
468 648 m
144 144 l
s
144 648 324 -504 re
s
endstream
endobj

xref
0 5
0000000000 65535 f
0000000236 00000 n
0000000302 00000 n
0000000349 00000 n
0000000458 00000 n
0000000522 00000 n
trailer
<<
/Size 5
/Info 1 0 R
/Root 2 0 R
>>
startxref
669
%%EOF

Peter_Truskier · October 31, 2013, 11:15pm

Glad you got it working, Dave, though I’m not sure what the difference in the definition of the cropbox in the newer version is. But, it works here, too.

I am still getting Acrobat offering to save it for me when I close it, and I checked with a hex editor to confirm that there are no extra bytes at the beginning of my files, so I think you may have an XRef issue. If you don’t have Acrobat Pro, feel free to send me your actual PDF (peter at premediasystem dot com), I’ll check it in Acrobat…

DaveS · October 31, 2013, 11:50pm

Uh… that email doesn’t work… tried twice it bounced… and yes “at=@” and “dot=.”

PREVIEW in OSX doesn’t complain… but I viewed it with AcroReader in Win7 and you’re right…it asked to save

So I did… and it created a PDF-1.6 version [I labeled it v1.2]
and the format is TOTALLY different… some due to internal zipping, and then it added a whole new section at the end.

So not sure what the deal is.

Peter_Truskier · November 1, 2013, 12:11am

Sorry; I’m an idiot. I misspelled my own domain name!

But, I don’t suppose you need to send it to me now since you’ve confirmed that it is a problem. (I didn’t realize that Reader would perform the same function.)

It is not at all easy to get Adobe to save to an older PDF version than the one for which the version of Acrobat was designed. I guess they want everything to be the most recent version. PDF 1.6 has a lot of new features and facilities, and the files that a newer version of Acrobat produces can be a lot different, as you’ve discovered - not the least of which is due to the compression and encoding differences.

Being off by just one in any of the byte offset references is enough to cause this behavior, so it’s tricky to debug and fix, though once you do, you should be able to forget about the problem

Looking forward to seeing what you come up with. Though we do have a DynaPDF license, a lightweight “simple” PDF generation in Xojo would be very handy.

DaveS · November 1, 2013, 12:19am

I think it is more that it wants to deflate the file as opposed to the version #.

I set it to be v1.7 and Acroreader STILL wanted to change it… it left it as v1.7 but mucked with the rest of it

bound to be some parameters I can add that say “Hey! Leave me alone”

Sean_Mitchell · November 1, 2013, 2:20am

Dave that is one very simple pdf. I would not set the pdf version 1.7. There is nothing in that pdf that requires the 1.7 spec. You might want to try pdf version 1.2 or 1.3

DaveS · November 1, 2013, 6:45am

That is the plan actually… I just used 1.7 to see if that would keep AcroReader from wanting to save it thinking it was “changed” when in fact it wasn’t

KarenA · November 1, 2013, 1:39pm

Did it help? I get that using Asher Dunn’s classes too… I would love to get rid of that message.

DaveS · November 1, 2013, 4:25pm

Karen… no it did not help

What I think is happening is the AcroReader INSISTS that PDF files contain ZIP formatted datastreams instead of just properly formed text based PDF code. So it converts it…

I bet if you create a PDF using Dunn’s classes (which I have not seen), and then view it with AcroReader and save that as another file… When you open the two… they will not look ANYTHING alike

If anyone can come up a clue as to a tag or something that would tell AcroReader to just leave it alone… that would be great

Peter_Truskier · November 1, 2013, 6:05pm

I’m quite sure this is not the problem. If you open a “proper” PDF 1.2, 1.3, or other earlier version in Acrobat or Reader, you will not see the fleeting warning, and you will not be prompted to save on close. If you DO choose to save, newer versions of Acrobat will rewrite the PDF as you’ve seen.

The problem with your current PDFs, and apparently with Asher’s as well, is that the xrefs are somehow, possibly very subtly, incorrect. Acrobat, and some other viewers are clever enough to “look around” the specified byte offsets for the content that they are looking for, and recover from the error(s). When this happens, Acrobat flags the file as dirty, leading to the behavior you see. Some other PDF viewers will simply barf on a file with damaged refs.

Here are a couple of links that might help:

What’s wrong with this PDF? And how to get more Info?
The Trouble With the XREF Table
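
One way to hunt for this kind of xref damage is to check every offset mechanically. The Python sketch below is a rough checker, not a full PDF parser. It assumes a file with a single classic xref section, LF or CRLF line ends, and one `startxref`. The function name `check_xref` is hypothetical.

```python
import re

def check_xref(data: bytes) -> list:
    """List problems found in the xref table of a simple, uncompressed PDF."""
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF", data)
    if match is None:
        return ["no startxref ... %%EOF found"]
    xref_pos = int(match.group(1))
    if not data.startswith(b"xref", xref_pos):
        return [f"startxref {xref_pos} does not point at the 'xref' keyword"]

    # The subsection header ("first_object count") is the line after 'xref'.
    header_start = data.index(b"\n", xref_pos) + 1
    header_end = data.index(b"\n", header_start)
    first, count = (int(n) for n in data[header_start:header_end].split())

    problems = []
    # Each entry is exactly 20 bytes: offset, generation, n/f, 2-byte line end.
    entry_re = re.compile(rb"(\d{10}) (\d{5}) ([nf])(?: \n| \r|\r\n)")
    start = header_end + 1
    for i in range(count):
        entry = data[start + 20 * i : start + 20 * (i + 1)]
        m = entry_re.fullmatch(entry)
        if m is None:
            problems.append(f"entry {first + i} is not a 20-byte record: {entry!r}")
            continue
        offset, gen, kind = int(m.group(1)), int(m.group(2)), m.group(3)
        expected = b"%d %d obj" % (first + i, gen)
        if kind == b"n" and not data.startswith(expected, offset):
            problems.append(f"object {first + i}: nothing like '{expected.decode()}' at offset {offset}")
    return problems
```

Reading the file in binary mode (`open(path, "rb").read()`) and passing the bytes to `check_xref` returns an empty list when every in-use entry points at its object header.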

DaveS · November 1, 2013, 6:12pm

I thought it might be that I neglected to count the LF in my tally… but that obviously made a different XREF, and Preview still displayed it… AcroReader still changed it… but FireFox refused to read it …

Let me check out those links… and if I find the issue I will post details

DaveS · November 1, 2013, 7:22pm

SOLVED IT!!!
Those links had most of the answer…

XREF entries MUST be 20 characters long… INCLUDING EOL… so I had to add an extra space since I am using just LF
The 2nd line, which is normally ASCII high-byte chars, screwed things up as they were TWO-byte characters… so XOJO LEN had to be replaced with LENB
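
Both fixes can be shown in a short Python sketch (Python rather than Xojo, purely for illustration). The helper name `xref_entry` is hypothetical, and the four high-byte characters in the header are an assumed example of a binary marker line, since the actual bytes are not shown in this thread.

```python
def xref_entry(offset, generation=0, in_use=True):
    """Build one xref entry: always 20 bytes, including the line end."""
    flag = "n" if in_use else "f"
    # With a bare LF as the line end, a space before it pads the entry to 20 bytes.
    return f"{offset:010d} {generation:05d} {flag} \n".encode("ascii")

print(len(xref_entry(236)))  # 20

# Character count and byte count differ once high-byte characters are
# stored in a multi-byte encoding, which throws off every later offset.
header = "%PDF-1.2\n%\u00e2\u00e3\u00cf\u00d3\n"  # assumed marker characters
as_utf8 = header.encode("utf-8")
as_latin1 = header.encode("latin-1")
print(len(header), len(as_utf8), len(as_latin1))  # 15 19 15
```

Offsets in the xref table are byte offsets, so they have to be computed from the encoded bytes, not from the character count of the string.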

Peter_Truskier · November 1, 2013, 11:13pm

Ah. Yes; I forgot about the 20 byte requirement. Great! Very glad you’ve got it sorted…

DaveS · November 2, 2013, 1:08am

LOL…
Now back to square one… I can’t figure out the stupid font resources syntax… it LOOKS right… but now PREVIEW says the file is corrupt to the point where it can’t even open it.


4 0 obj
<<
   /Type /Font
   /Subtype /Type1
   /Name /F5
   /BaseFont /Helvetica
>>
endobj


5 0 obj
<<
   /Type /Pages
   /MediaBox [0 0 612 792]
   /CropBox [18 18 594 774]
   /Kids [6 0 R]
   /Count 1
   /Resources
   <<
      /Font <<
         /F5 4 0 R
      >>
   >>
>>
endobj