Viewing 11 posts - 1 through 11 (of 11 total)
  • #26754

    Hi guys,

    I’m new to PD4ML, and I’m having trouble converting an HTML page that contains images to PDF.

    The problem is that not all of the images are converted. For example, the following images are converted fine:

    http://3.bp.blogspot.com/-2NUocmek-R8/TflZgp0ZTAI/AAAAAAAAAUQ/JYX9Ieb0cmA/s320/Paul%20Newman%20Pinterest.jpg

    http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg

    but the following are not:

    Case 1:

    http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v

    Case 2:

    http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D

    All of these URLs are taken from img src attributes in the HTML, and all of the images appear in the HTML page that is being converted to PDF.

    For case 1, the URL redirects to the actual image.

    For case 2, the URL points to the actual image but carries extra query parameters, i.e. the part after “?”.
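For readers with a similar case 2 URL, it may help to look at what the part after “?” actually contains. The sketch below splits the query string and reads the Expires value as seconds since the Unix epoch. That reading is an assumption based on the usual convention for signed S3 links, and the class and method names (SignedUrlInspector, rawQueryParams, expiry) are made up for this illustration; nothing here is part of PD4ML.

```java
import java.net.URI;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PD4ML: inspects the query part of a signed image URL.
final class SignedUrlInspector {

    /** Splits the raw (still percent-encoded) query string into name/value pairs. */
    static Map<String, String> rawQueryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params; // no "?" part at all
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                params.put(pair, "");
            } else {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Assumption: Expires holds seconds since the Unix epoch. Returns null if absent. */
    static Instant expiry(String url) {
        String value = rawQueryParams(url).get("Expires");
        return value == null ? null : Instant.ofEpochSecond(Long.parseLong(value));
    }

    public static void main(String[] args) {
        // The case 2 URL from the post above.
        String url = "http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg"
                + "?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431"
                + "&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D";
        Instant expires = expiry(url);
        System.out.println("Expires at: " + expires);
        System.out.println("Already expired: " + expires.isBefore(Instant.now()));
    }
}
```

Under that assumption the first line printed is "Expires at: 2012-07-10T16:40:31Z". If a link of this kind has passed its expiry time, the server may reject it no matter which client requests it, so this is worth ruling out before looking at the converter.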

    I’m not sure what the problem is. I hope some of you will be able to help!

    Just got some debug logging:

    cache disabled. (re-)reading http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v
    resource http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v not found.
    image http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v not found.
    can not load image: http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v

    cache disabled. (re-)reading http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D
    resource http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D not found.
    image http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D not found.
    can not load image: http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1341938431&Signature=uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D

    I’m not sure why PD4ML cannot load those images. I hope the log helps!

    Many thanks,
    Xin

    #29060

    I tried the same HTML with PD4ML Probe, and it works for all the images. Now I’m totally confused. Does this have anything to do with the packages I imported?

    I imported
    import org.zefer.pd4ml.PD4Constants
    import org.zefer.pd4ml.PD4ML
    import org.zefer.pd4ml.PD4PageMark
    in my code, and it has the issue. I’m not sure what else PD4ML Probe imports.

    Thanks for your advice!

    Xin

    #29061

    I guess (but am not 100% sure right now) that the %2F and %3D codes are automatically replaced with the corresponding characters ‘/’ and ‘=’. Would that replacement break the image link?

    Also, redirects are not supported by PD4ML.

    Could you please post an HTML snippet to simplify testing on our side? And please let us know which PD4ML version you currently use.
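To make the guess about %2F and %3D concrete, here is a small sketch using the standard java.net.URLDecoder on the Signature value from the first post. Whether PD4ML performs this decoding internally is only the guess voiced above; the class name SignatureDecodingDemo is made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo, not PD4ML code: shows what percent-decoding does to the Signature value.
final class SignatureDecodingDemo {

    /** Percent-decodes a value as UTF-8, e.g. %2F becomes '/' and %3D becomes '='. */
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = "uX39Hooms%2FMHFZ8PwKNqCAdYLcA%3D"; // as written in the img src
        String decoded = decode(encoded);
        System.out.println("as written:     " + encoded);
        System.out.println("after decoding: " + decoded);
        System.out.println("identical:      " + encoded.equals(decoded));
    }
}
```

The decoded value is uX39Hooms/MHFZ8PwKNqCAdYLcA=, so a client that requests the decoded form sends a different query string from the one in the page. Whether the server accepts both forms is something only a test request can answer.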

    #29062

    Thanks for your reply,

    Please see the HTML below, which is the one I used:


    test

    The pd4ml version I’m using is 3.80.

    Thank you~

    Xin

    #29063

    Sorry, one image in the previous HTML doesn’t exist. Please use the one below for your internal testing:

    <html>
    <body><div class="printHead" style="height: 18pt; border-bottom: 1pt solid #CCC; padding-bottom: 5pt; margin-bottom: 5pt;"><span style="float: left; font-size: 20px; text-transform: uppercase; font-family: Helvetica, Tahoma, Arial, sans-serif;">test</span></div>
    <div class="moodboard" style="position:relative">
    <img src="http://3.bp.blogspot.com/-2NUocmek-R8/TflZgp0ZTAI/AAAAAAAAAUQ/JYX9Ieb0cmA/s320/Paul%20Newman%20Pinterest.jpg" top="144" left="34" sizewidth="191" sizeheight="235" style="top:144.0px;left:34.0px;z-index:1;width:191.0px;height:235.0px;position:absolute;"/>
    <img src="http://3.bp.blogspot.com/-2NUocmek-R8/TflZgp0ZTAI/AAAAAAAAAUQ/JYX9Ieb0cmA/s320/Paul Newman Pinterest.jpg" top="152" left="514" sizewidth="191" sizeheight="235" style="top:152.0px;left:514.0px;z-index:2;width:191.0px;height:235.0px;position:absolute;"/>
    <img src="http://www.tumblr.com/photo/1280/mojitosandblow/1347661703/1/tumblr_l9sxt2kDS71qafm1v" top="132" left="272" sizewidth="178" sizeheight="270" style="top:132.0px;left:272.0px;z-index:3;width:178.0px;height:270.0px;position:absolute;"/>
    <img src="http://25.media.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_500.jpg" top="554" left="48" sizewidth="134" sizeheight="202" style="top:554.0px;left:48.0px;z-index:3;width:134.0px;height:202.0px;position:absolute;"/>
    <img src="http://25.media.tumblr.com/tumblr_m5nxz04CeE1rttjy8o1_500.jpg" top="552" left="268" sizewidth="176" sizeheight="174" style="top:552.0px;left:268.0px;z-index:3;width:176.0px;height:174.0px;position:absolute;"/>
    <img src="http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg" top="154" left="768" sizewidth="160" sizeheight="244" style="top:154.0px;left:768.0px;z-index:3;width:160.0px;height:244.0px;position:absolute;"/>
    <img src="http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1342001418&Signature=vP617e4HcVaKvCN07IKk53A7kcw%3D" top="524" left="780" sizewidth="144" sizeheight="218" style="top:524.0px;left:780.0px;z-index:3;width:144.0px;height:218.0px;position:absolute;"/>
    <img src="http://2.bp.blogspot.com/-L7f1tx6NyvI/TkdiCmIh4dI/AAAAAAAAAxU/BlOxp8jws-Q/s400/pnewman%20via%20mojitosandblow.tumblr.png" top="534" left="494" sizewidth="156" sizeheight="196" style="top:534.0px;left:494.0px;z-index:3;width:156.0px;height:196.0px;position:absolute;"/>
    </div>
    </body></html>

    #29064

    @PD4ML wrote:

    I guess (but am not 100% sure right now) that the %2F and %3D codes are automatically replaced with the corresponding characters ‘/’ and ‘=’. Would that replacement break the image link?

    Also, redirects are not supported by PD4ML.

    Could you please post an HTML snippet to simplify testing on our side? And please let us know which PD4ML version you currently use.

    One thing I’m curious about: why does the HTML I use work perfectly in PD4ML Probe?

    Thanks.

    #29065

    PD4ML Probe currently embeds v380fx5, which is probably the reason. (I assume that in PD4ML Probe you check the generated PDF, not a web browser preview.)

    I’ve just tested your HTML with fx5: images #3 and #8 are missing in the resulting PDF.

    It seems the image #3 URL returns a redirect, which is not supported.

    If I open the HTML in a web browser, only image #8 is missing. However, if I copy and paste the image #8 URL into the browser’s address bar, the image is shown. I’ll let you know the results of our analysis.

    #29066

    @Guest wrote:

    PD4ML Probe currently embeds v380fx5, which is probably the reason. (I assume that in PD4ML Probe you check the generated PDF, not a web browser preview.)

    I’ve just tested your HTML with fx5: images #3 and #8 are missing in the resulting PDF.

    It seems the image #3 URL returns a redirect, which is not supported.

    If I open the HTML in a web browser, only image #8 is missing. However, if I copy and paste the image #8 URL into the browser’s address bar, the image is shown. I’ll let you know the results of our analysis.

    Thanks for your reply.

    Did you get image 7 working with v380fx5? It is <img src="http://s3.amazonaws.com/data.tumblr.com/tumblr_l9sxt2kDS71qafm1vo1_1280.jpg?AWSAccessKeyId=AKIAI6WLSGT7Y3ET7ADQ&Expires=1342001418&Signature=vP617e4HcVaKvCN07IKk53A7kcw%3D" top="524" left="780" sizewidth="144" sizeheight="218" style="top:524.0px;left:780.0px;z-index:3;width:144.0px;height:218.0px;position:absolute;"/>

    That image has arguments in the URL after the “?” mark. It does not work with the version of PD4ML we have here, but image 8 does.

    Thanks

    #29067

    I was assuming that the image 8 you mentioned is the one with a guy sitting on a motorbike. Also, I noticed that if I open the HTML in IE, one image is missing (the one with arguments), but all images appear in Firefox. I hope that explains something.

    #29068

    As we learned, the problem is most probably caused by HTTP 302 redirects: our HTML renderer does not implement any code to follow them.

    PD4ML uses a very standard approach with URLConnection to load URL resources. If the Java runtime implicitly follows 302 redirects, PD4ML supports them as well.
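For anyone who wants to see what the Java runtime receives for a given image URL, here is a minimal sketch that walks a redirect chain by hand with java.net.HttpURLConnection. It is a diagnostic aid run outside the converter and not PD4ML code; the names RedirectProbe, isRedirect, nextUrl and resolve are made up for this illustration. One relevant fact: HttpURLConnection follows redirects by default, but it does not follow a redirect that switches protocol (for example http to https), so automatic following is switched off here to make every hop visible.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical diagnostic helper, not part of PD4ML.
final class RedirectProbe {

    /** True for the HTTP status codes that normally carry a Location header. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    /** Resolves a Location header (absolute or relative) against the current URL. */
    static String nextUrl(String current, String location) {
        try {
            return new URL(new URL(current), location).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Follows up to maxHops redirects manually, printing each hop; HTTP(S) URLs only. */
    static String resolve(String start, int maxHops) throws IOException {
        String current = start;
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection conn = (HttpURLConnection) new URL(current).openConnection();
            conn.setInstanceFollowRedirects(false); // show each hop instead of hiding it
            conn.setRequestMethod("HEAD");          // headers are enough for this check
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            System.out.println(status + " " + current);
            if (!isRedirect(status) || location == null) {
                return current; // final answer: either the image or an error status
            }
            current = nextUrl(current, location);
        }
        throw new IOException("more than " + maxHops + " redirects from " + start);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java RedirectProbe <image-url>");
            return;
        }
        System.out.println("final URL: " + resolve(args[0], 5));
    }
}
```

If the chain ends at a plain image URL, one possible workaround is to put that final URL into the img src before conversion, so that nothing depends on redirect handling.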

    #29069

    @csueaq wrote:

    Hello everyone,
    An image is not displayed when converting from HTML to PDF.
    I picked an image at random from the internet, for example
    http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG
    I am able to view the image when I open the URL in a browser.
    How can I fix this?
    Any help is appreciated. Thanks.

    ~/Downloads/pd4>java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml_demo.jar:ss_css2-0.9.4_pd4ml2.jar Pd4Cmd 'file:sample1.html' 1200 LETTER -bookmarks HEADINGS -pdfforms -debug -out ./pd4ml.pdf
    version: PD4ML 399 (eval)
    default built-in stylesheet parsed [28ms]
    loading file:sample1.html [6ms]
    image not yet in cache: file:sf.jpg
    not yet in cache: file:sf.jpg
    loading file:sf.jpg [0ms]
    image size: 5510
    image not yet in cache: http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG
    not yet in cache: http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG
    loading http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG [26221ms]
    errno: 111 (Connection refused), error: Connection refused (local port 50735 to address 0:0:0:0:0:0:0:0, remote port 80 to address 74.208.20.166): http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG (socket timeout 0ms)
    image http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG has zero length.
    can not load image: http://www.daycomsolutions.com/Support/BatchImage/HPIM0050w50.JPG
    done in 27878ms.

    Many thanks

    Sudhakar
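The "Connection refused" line in the log above is a TCP-level failure: the machine running the conversion could not open a connection to port 80 of the image host, so no image data was ever received. A rough way to check this from the same machine, outside PD4ML, is a plain socket connect as sketched below. The class name PortCheck and the 5-second timeout are arbitrary choices for this illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical connectivity check, not part of PD4ML.
final class PortCheck {

    /** Tries to open a TCP connection; prints the reason and returns false on failure. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unknown hosts.
            System.out.println(host + ":" + port + " -> " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java PortCheck <host>");
            return;
        }
        System.out.println("reachable on port 80: " + canConnect(args[0], 80, 5000));
    }
}
```

If this check fails on the machine that runs the conversion but the image opens in a browser on another machine, the difference is likely in the network path (firewall, proxy, or the server blocking that address) and not in the HTML or the converter.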

