Extracted attachment from DXL says it is damaged
- This topic has 12 replies, 2 voices, and was last updated Apr 16, 2013
12:56:30 by PD4ML.
April 3, 2013 at 09:31 #26830
Hi,
I am able to extract the attachment from some of the DXL files. But for the attached DXL files, the extracted attachment is reported as damaged. The attachment opens fine when I open the DXL in Lotus Notes. Kindly help to resolve the issue.
Thanks.
April 5, 2013 at 15:42 #29264
Here are our research results:
We extracted the base64-encoded PDF attachment from the 619895_1.dxl file. After base64 decoding, we found that Acroread failed to show the decoded file.
Further research showed that the PDF gets corrupted between offsets 30496 (the beginning of the last PDF object found) and 34150 (that offset plus the declared PDF object length). The remaining binary data looks like garbage.
PD4ML just copies the problematic DXL attachment to the PDF attachment “as is”. So I am afraid the problem is somewhere in the DXL export phase on your side.
BTW: The offsets are suspiciously close to 32768 (2^15, one more than the maximum 16-bit signed integer value). I’ll check 720553_1.dxl to see whether it is the “magic limit” for it as well.
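A check like the one described above (base64-decode the attachment, then see whether it still looks like a PDF) could be sketched roughly as follows. This is only an illustration, not PD4ML code: the class and method names are hypothetical, and it assumes the base64 text has already been pulled out of the DXL. Since the corruption reported here starts tens of kilobytes into the file, the header test alone would pass; the missing `%%EOF` marker near the end is the more telling symptom.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of PD4ML or the Notes API.
final class AttachmentCheck {

    /** Decodes base64 text; the MIME decoder ignores the line breaks used in DXL/XML. */
    static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    /** Returns true if the data starts with the PDF signature "%PDF-". */
    static boolean hasPdfHeader(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns true if the end-of-file marker "%%EOF" occurs in the last 1024 bytes. */
    static boolean hasPdfTrailer(byte[] data) {
        int from = Math.max(0, data.length - 1024);
        // ISO-8859-1 maps every byte to one char, so binary data decodes safely.
        String tail = new String(data, from, data.length - from, StandardCharsets.ISO_8859_1);
        return tail.contains("%%EOF");
    }

    public static void main(String[] args) {
        // Tiny stand-in for a real attachment, encoded the way it would appear in DXL.
        byte[] fake = "%PDF-1.4\n...\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        byte[] decoded = decode(Base64.getMimeEncoder().encodeToString(fake));
        System.out.println("header ok: " + hasPdfHeader(decoded));
        System.out.println("trailer ok: " + hasPdfTrailer(decoded));
    }
}
```

A file that passes the header check but fails the trailer check would be consistent with data that is intact at the start and damaged (or still compressed) further on.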
April 5, 2013 at 15:54 #29265
No. The PDF from 720553_1.dxl starts getting corrupted between offsets 130286 and 132722.
April 6, 2013 at 09:50 #29266
Hi,
The same DXL file attachment opens fine in Lotus Notes. It doesn’t say the PDF is corrupted.
Many DXL file attachments, almost 50 out of the 100 tested, have this problem. I attached only 2, as the problem is similar. Could you kindly let me know whether there is any way to resolve this? Since the attachment opens in Lotus Notes, our client expects PD4ML to extract it as well.
Thanks.
April 6, 2013 at 11:28 #29267
In Notes, the documents are stored in an internal database format, and Lotus Notes opens the documents (and their attachments) directly.
The problem occurs when the internal document format is converted to DXL: the DxlExporter module of Notes (not PD4ML) somehow corrupts the attachment data. Why does that happen? That is a question for IBM, I guess. PD4ML can do nothing about it, as it receives DXL that is already damaged and not repairable.
Could you please publish, at least, the code snippet that exports DXL from the Notes database, and the version/platform of your Notes environment? We’ll try to google for an explanation or a possible workaround.
April 6, 2013 at 11:53 #29268
It looks like the attachments are compressed with an adaptive LZ1 (LZ77) algorithm, which may leave long portions of data uncompressed when compression brings no significant gain. Hopefully we’ll find a good Java decompression library for it.
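To illustrate the general idea only: an LZ77-style stream mixes runs of literal bytes (copied through unchanged) with back-references into data already produced. The toy decoder below uses a made-up token layout (a tag byte, then either a literal run or a distance/length pair). It is an assumption for illustration and is not the Lotus Notes LZ1 format, which is undocumented; the class name and tags are hypothetical.

```java
import java.util.Arrays;

// Toy LZ77-style decoder with an invented token format (NOT the Notes LZ1 format).
final class ToyLz77 {
    static final int LITERAL_RUN = 0;    // tag, length, then `length` raw bytes
    static final int BACK_REFERENCE = 1; // tag, distance, length

    static byte[] decode(byte[] in) {
        byte[] out = new byte[Math.max(16, in.length * 2)];
        int n = 0;   // number of bytes produced so far
        int pos = 0; // read position in the input
        while (pos < in.length) {
            int tag = in[pos++] & 0xFF;
            if (tag == LITERAL_RUN) {
                // Uncompressed stretch: copy the bytes through as they are.
                int len = in[pos++] & 0xFF;
                out = ensure(out, n + len);
                System.arraycopy(in, pos, out, n, len);
                pos += len;
                n += len;
            } else if (tag == BACK_REFERENCE) {
                // Repeat `len` bytes starting `distance` bytes back in the output.
                int distance = in[pos++] & 0xFF;
                int len = in[pos++] & 0xFF;
                if (distance == 0 || distance > n) {
                    throw new IllegalArgumentException("bad distance " + distance);
                }
                out = ensure(out, n + len);
                // Byte-by-byte copy so that overlapping references work.
                for (int i = 0; i < len; i++) {
                    out[n] = out[n - distance];
                    n++;
                }
            } else {
                throw new IllegalArgumentException("unknown tag " + tag);
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Grows the buffer when the requested size does not fit.
    private static byte[] ensure(byte[] buf, int needed) {
        return needed <= buf.length ? buf : Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
    }
}
```

In this toy format, a literal run of `abc` followed by a back-reference with distance 3 and length 6 decodes to `abcabcabc`. An extractor that treats such a stream as plain data would see a valid beginning followed by apparent garbage, which matches the symptoms described earlier in the thread.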
April 6, 2013 at 12:06 #29269
Do you mean you will find a way and fix the issue?
April 6, 2013 at 12:14 #29270
Hopefully. When I am back in the office on Monday, I’ll do a few tests and let you know.
April 10, 2013 at 11:18
April 10, 2013 at 11:35 #29272
In the meantime we tried to implement our own LZ1 decompression method; so far without success. Unfortunately, there is no documentation that could clarify some aspects of the Lotus Notes LZ1 implementation.
I also have no information about which Notes version you use or how you export DXL.
The only possible workaround would be to force DxlExporter to always uncompress attachment data. The relevant DxlExporter property takes effect in Notes 8.0 or newer.
April 15, 2013 at 11:18 #29273
Did you try exporting the DXL with attachments forced to be uncompressed? Did it help?
April 16, 2013 at 12:34 #29274
I will check and let you know. But all the DXL files are already exported, so I think it would be difficult (probably not possible) to find the DXL files that have compressed attachments and re-export them with the attachments uncompressed.
It would be great if you could handle this in code and uncompress this type of attachment.
April 16, 2013 at 12:56 #29275
As I wrote, Lotus Notes LZ1 compression is undocumented, which makes implementing the feature a nontrivial task. On the other hand, we have already invested quite a lot of time in the analysis, and the implementation now seems doable (although a few open questions remain).
If your company is ready to (partially) fund the development, please contact support pd4ml com by email.
The forum ‘Troubleshooting’ is closed to new topics and replies.