HTML special character not printed with the newest PD4ML v394 - PD4ML

This topic has 2 replies, 2 voices, and was last updated Feb 02, 2015
10:08:51 by vfieschi.

Viewing 3 posts - 1 through 3 (of 3 total)

Author

Posts
vfieschi
January 29, 2015 at 16:31
#26990
Hi,
I am trying to convert this HTML code to PDF with the pd4ml jar library:

<html>
<head></head>
<body>
<p>
Š
</p>
</body>
</html>

With the 380 Pro version, PD4ML renders it correctly. With the newest 394 (and with the earlier 384), the character is not decoded and a “?” is printed instead.

The PDFs are attached.

Can someone help me?
PD4ML
January 30, 2015 at 16:06
#29665
The reason for the problem is that recent PD4ML versions are stricter about character encoding conversion.

ISO-8859-1 (also called Latin-1) is identical to Windows-1252 (also called CP1252) except for the code points 128-159 (Š = 138). ISO-8859-1 assigns several control codes in this range. Windows-1252 has several characters, punctuation, arithmetic and business symbols assigned to these code points.
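
To make the difference concrete, here is a minimal sketch that uses only the standard JDK charset classes, not PD4ML itself. The class name `Cp1252VsLatin1` and the helper `decode` are made up for this illustration. It decodes the single byte 138 under both encodings and prints the resulting Unicode code point.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252VsLatin1 {
    // Hypothetical helper: decode one byte value with the given charset.
    static String decode(int byteValue, Charset cs) {
        return new String(new byte[] {(byte) byteValue}, cs);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");

        String asCp1252 = decode(138, cp1252);
        String asLatin1 = decode(138, StandardCharsets.ISO_8859_1);

        // Windows-1252 maps byte 138 to U+0160 (S with caron).
        System.out.printf("windows-1252: U+%04X%n", (int) asCp1252.charAt(0));
        // ISO-8859-1 maps byte 138 to U+008A, a C1 control code.
        System.out.printf("ISO-8859-1:   U+%04X%n", (int) asLatin1.charAt(0));
    }
}
```

The same byte therefore yields a printable letter in one encoding and an invisible control code in the other.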

As far as I can see, you use Windows-1252 as the default, whereas PD4ML relies on ISO-8859-1.
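
One plausible way a “?” can end up in the output is shown below with the plain JDK. This is only an assumption about the general mechanism, not a description of PD4ML internals, which this thread does not cover. The class name `UnmappableDemo` and the helper `roundTrip` are hypothetical. ISO-8859-1 has no slot for Š (U+0160), so a strict conversion to that charset cannot represent it, and the JDK's `String.getBytes` substitutes the charset's replacement byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UnmappableDemo {
    // Hypothetical helper: encode text to bytes, then decode the bytes
    // again with the same charset.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String s = "\u0160"; // S with caron, written as an escape to avoid source-encoding issues
        Charset cp1252 = Charset.forName("windows-1252");

        // ISO-8859-1 cannot encode U+0160; Windows-1252 can (as byte 138).
        System.out.println(StandardCharsets.ISO_8859_1.newEncoder().canEncode(s)); // false
        System.out.println(cp1252.newEncoder().canEncode(s));                      // true

        // getBytes replaces unmappable characters with the charset's
        // replacement byte, which is '?' for ISO-8859-1.
        System.out.println(roundTrip(s, StandardCharsets.ISO_8859_1)); // ?
    }
}
```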

A general solution for this type of issue is to use the TTF embedding feature of PD4ML.

As long as you run the DMS edition of PD4ML, the easiest workaround would be to enable PDF/A generation mode with the pd4ml.generatePdfa(true) API call (make sure pd4ml_rc.jar is in the same directory as pd4ml.jar).
vfieschi
February 2, 2015 at 10:08
#29666
We enabled PDF/A generation mode and it works correctly.

Many thanks

The forum ‘HTML/CSS rendering issues’ is closed to new topics and replies.