PDF Generating Tool Support Forum

  Subject: Odd problem with a UTF-8 dash/minus character
   Posted: 19 Nov 2018, 17:20
Hello,

We are in the process of upgrading our system from JDK8 to JDK9 and have run into a bit of a roadblock with our tests. A large part of our system is the transformation of HTML to PDF, and several tests are failing for the following reason.

We have a test that reads HTML documents and makes sure that proper PDF data is generated from them. Basically, it converts HTML -> PDF and then uses a different PDF library to check that the generated bytes are indeed a PDF and that the text extracted from it contains what we expect.

This process gets stuck right at the beginning of the document. It seems that a Unicode minus sign (U+2212) is written instead of the normal ASCII hyphen-minus (U+002D), which is used everywhere else in the generated document. This causes the token to be read as a PDF string literal rather than a number, which in turn causes a ClassCastException.

Basically, this concerns the "% modifyCTM" section in the generated data. The two lines concerned used to look like this using PD4ML v.3.10.6 in JDK8:

% modifyCTM
0.8125 0 0 -0.8125 50 817 cm

and now they look like this in JDK11:

% modifyCTM
0.8125 0 0 −0.8125 50 817 cm

Notice the changed dash/minus character.
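
To confirm which character is which, a minimal sketch like the following can help. The class and method names (`MinusCheck`, `firstNonAscii`, `parsesAsNumber`) are made up for illustration and are not part of PD4ML or of the PDF library we test with; the two strings are the lines quoted above.

```java
public class MinusCheck {

    /** Returns the index of the first non-ASCII char in the line, or -1 if there is none. */
    static int firstNonAscii(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) > 0x7F) {
                return i;
            }
        }
        return -1;
    }

    /** Returns true if the token is accepted by Double.parseDouble. */
    static boolean parsesAsNumber(String token) {
        try {
            Double.parseDouble(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String jdk8Line = "0.8125 0 0 -0.8125 50 817 cm";        // ASCII hyphen-minus
        String jdk11Line = "0.8125 0 0 \u22120.8125 50 817 cm";  // U+2212 minus sign

        System.out.println(firstNonAscii(jdk8Line)); // prints -1

        int idx = firstNonAscii(jdk11Line);
        // prints "index 11: U+2212"
        System.out.printf("index %d: U+%04X%n", idx, (int) jdk11Line.charAt(idx));

        System.out.println(parsesAsNumber("-0.8125"));      // prints true
        System.out.println(parsesAsNumber("\u22120.8125")); // prints false
    }
}
```

Java's own number parser rejects the U+2212 token as well, which matches what we see in the PDF library: the token is not recognised as a number.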

It seems odd that the character is changed only in this section and not in the rest of the file, and that the problem appears only with the move from JDK8 to JDK9 (going from JDK9 to JDK11 still produces the same error).

Is there a bug in PD4ML that we have happened to stumble upon?

The source HTML file is available upon request.


  Subject: Re: Odd problem with a UTF-8 dash/minus character
   Posted: 19 Nov 2018, 17:42
We have, btw, updated our license and tried with v3.10.7 as well, but the same problem occurs.


  Subject: Re: Odd problem with a UTF-8 dash/minus character
   Posted: 19 Nov 2018, 18:00
Also, using:

-Djava.locale.providers=COMPAT

solves this issue, so it seems to be related to the locale data changes in JDK9 (CLDR locale data is used by default from JDK9 on) rather than to UTF-8 as such:

https://docs.oracle.com/javase/9/intl/i ... 009D2B36AF
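
A possible explanation, sketched below under assumptions: as far as I understand, JDK9 uses CLDR locale data by default, and some locales there define U+2212 as the minus sign, so any number formatted with the default locale can pick it up. We do not know how PD4ML formats these numbers internally; `PdfNumberSketch`, `formatWithMinus` and `formatLocaleNeutral` are hypothetical names, and the minus sign is set by hand here so the result does not depend on the locale of the JVM running it.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class PdfNumberSketch {

    /** Formats the value with an explicitly chosen minus sign, imitating locale-specific symbols. */
    static String formatWithMinus(char minus, double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setMinusSign(minus);
        return new DecimalFormat("0.####", symbols).format(value);
    }

    /** Formats the value with locale-neutral symbols, independent of the default locale. */
    static String formatLocaleNeutral(double value) {
        return new DecimalFormat("0.####", new DecimalFormatSymbols(Locale.ROOT)).format(value);
    }

    public static void main(String[] args) {
        // Symbols whose minus sign is U+2212, as some locales' data define it
        String localized = formatWithMinus('\u2212', -0.8125);
        System.out.printf("U+%04X%n", (int) localized.charAt(0)); // prints U+2212

        // Locale-neutral symbols keep the ASCII hyphen-minus
        System.out.println(formatLocaleNeutral(-0.8125)); // prints -0.8125
    }
}
```

If the cause is along these lines, formatting the numeric PDF operands with fixed symbols (for example `Locale.ROOT`) would avoid the need for the `-Djava.locale.providers=COMPAT` setting.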


  Subject: Re: Odd problem with a UTF-8 dash/minus character
   Posted: 30 Nov 2018, 15:25
Sorry for the delay in replying, and thanks a lot for the very useful info you provided. We are now trying to reproduce the issue in order to find a workaround (to avoid the need for the environment setting)...


