I got the same result after fixing the HTML so that it is well-formed, setting the charset in the HTML header, and calling readHTML with the encoding parameter:
PD4ML pd4ml = new PD4ML();
String html = "<html>" +
" <head>\n" +
" <meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>\n" +
" </head>" +
" <body>" +
" First Pageä|ä|á|ą|â|à|ả|ã|ạ|ă|ằ|ắ|ẳ|ẵ|ặ|ầ|ấ|ẩ|ẫ|ậ|å" +
" </body>" +
"</html>";
System.out.println(html);
byte[] myBytes = html.getBytes(StandardCharsets.UTF_8);
System.out.println(new String(myBytes, StandardCharsets.UTF_8));
InputStream stream = new ByteArrayInputStream(myBytes);
pd4ml.overrideDocumentEncoding("utf-8");
pd4ml.useTTF("C:\\Windows\\Fonts", true);
pd4ml.readHTML(stream, new URL("https://google.com"), "utf-8");
String output_path = "C:\\test\\zxccc.pdf";
try (OutputStream outputStream = new FileOutputStream(output_path)) {
    pd4ml.writePDF(outputStream);
}
Desktop.getDesktop().open(new File(output_path));
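
To rule out the source file or console encoding as the cause, the accented characters can be dumped as code points and UTF-8 bytes before they reach PD4ML. This is only a minimal diagnostic sketch: the class `EncodingCheck` and its helpers `codePoints` and `utf8Hex` are hypothetical names, not part of PD4ML, and the sample string uses Unicode escapes so it does not depend on how the .java file was saved.

```java
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Hypothetical helper: renders every code point as U+XXXX, so the output
    // is plain ASCII and cannot be distorted by the console encoding.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Hypothetical helper: renders the UTF-8 bytes of the string as hex pairs.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // "ä|ą" written with escapes; replace with the literal from the
        // HTML string to see what the compiler actually produced.
        String sample = "\u00E4|\u0105";
        System.out.println(codePoints(sample)); // U+00E4 U+007C U+0105
        System.out.println(utf8Hex(sample));    // C3 A4 7C C4 85
    }
}
```

If the literal "ä" from the HTML string prints as U+00C3 U+00A4 instead of U+00E4, the source file was most likely compiled with the wrong encoding, and the string is already damaged before readHTML is called.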