HTML to PDF / DOCX / RTF Java converter library Forums PD4ML v3 Archived Forums (Read Only) Troubleshooting PD4Device.finalize() closes output stream and causes mixup!

Viewing 4 posts - 1 through 4 (of 4 total)
  • Author
  • #26685

    We are running a PD4ML Pro 323, which I am aware of is a bit old. Lately we had an incident where arbitrary requests/responses were mixed up, that is users seeing each other’s data! Being a web application that requires login and uses HTTPS to protect our user’s data that is very serious business. The problem was traced back to our usage of PD4ML.

    PD4ML.render(StringReader, OutputStream) was called from a web application, and HttpServletResponse.getOutputStream() was used as the second argument. It turns out that PD4ML closes the OutputStream, and does it twice. Once from within PD4ML.render() via PD4Device.dispose(). That is unexpected (since the usual contract is that the one who opened a stream is the one responsible for closing it), but it does not do any harm in our case. The second close comes from PD4Device.finalize(), and that is the harmful one! The finalize method will be invoked from another thread, the JVM FinalizerThread, at some later time in the future. It means that the ResponseOutputStream gets an unexpected close() call from an external thread, which is not allowed by the Servlet specification.

    We are using Apache + Tomcat with an AJP connector. In that case, the OutputStream returned by response.getOutputStream() is a org.apache.catalina.connector.CoyoteOutputStream, which is a thin wrapper dispatching directly to an org.apache.catalina.connector.OutputBuffer. The problem is that the latter is reused by the web container. So when the stream.close() comes from PD4Device, it is probably already pooled for reuse or even in use by a subsequent request! That unexpected close causes the connector and/or Apache mod_jk to mismatch request/response on that socket. Exactly how this bit happen has not been possible to debug yet. But clearly the web container makes no guarantees if the response stream is accessed out of scope.

    If I wrap the ServletOutputStream with a FilterOutputStream that makes all close() calls ignored, the bug disappears at once. That is how the call from finalize was found.

    at org.zefer.pd4ml.pdf.PD4Device.dispose(Unknown Source)<br /> at org.zefer.pd4ml.pdf.PD4Device.finalize(Unknown Source)<br /> at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)<br /> at java.lang.ref.Finalizer.runFinalizer(<br /> at java.lang.ref.Finalizer.access$100(<br /> at java.lang.ref.Finalizer$

    Why would you close a stream in finalize()? A stream that you did not even open and thus were not responsible for closing? I know there are several releases past 323, but since we are currently not paying for updates I can’t tell if the bug remains or not. If it does it should be fixed, or you must post big warning signs in the javadoc that no one must ever call PD4ML.render() with a ServletOutputStream as argument.


    The issue has been solved in the actual versions of PD4ML. Just give a try to any fully-functional evaluation version (or request a commercial one to evaluate from support pd4ml com).

    > usual contract is that the one who opened a stream is the one responsible for closing it

    You are right. But if we leave the output stream not closed, there is a probability the hosting framework writes some finalizing content/tags there, which corrupts the generated binary PDF data.

    A safer (and recommended) approach would be to pass ByteArrayOutputStream to render() method. That would also let you measure generated PDF size and send correct Content-length HTTP header.


    So the stream.close() is now definitely removed from the finalizer, is that correct? Do you know in which release this was fixed? Any release note link?

    I’ll give the evaluation jar a spin in my local environment and see how it acts. Thanks.


    Currently PD4Device.finalize() looks like that:

    * @see java.lang.Object#finalize()
    public void finalize() {
    // dispose();

    Unfortunately I could not find in our source control system when exactly the dispose() call has been commented out. But it definitely happened more that 2 years ago.

    dispose() method is triggered once now. The method forces collected PDF metadata to be written and it still closes the output stream.

    I would recommend to wait the next 2-3 hours – we plan to get from QA and to release the newest version today. Of you can get it on Monday.

Viewing 4 posts - 1 through 4 (of 4 total)

The forum ‘Troubleshooting’ is closed to new topics and replies.