Posted by David Rejdemyhr on June 29, 2026 at 15:28 (#41137)
# Bug report: duplicate font subset prefixes cause Adobe Acrobat to abort page rendering
## Summary
PD4ML emits multiple **distinct** embedded font subset programs that share the
**same 6-character subset prefix and BaseFont name** (e.g. `CIPRXO+ArialMT`
applied to six different font programs in a single one-page document). This
violates ISO 32000-1 §9.6.4, which requires the subset tag to be unique per
subset.

Lenient viewers (Apple Preview / Quartz, MuPDF) render the affected documents
correctly. Adobe Acrobat does not: it renders content up to the first font
whose name collides with an incompatible cached program, then silently stops
drawing the remainder of the page. The text is present and extractable in the
PDF; Acrobat simply does not paint it.

## Environment
Taken from the PD4ML diagnostic comments embedded in the page content stream:
- PD4ML version: **4.1.0** (the build dated 2026-02-22)
- JDK version: 21.0.11
- OS version: Mac OS X 26.5.1
- File encoding: UTF-8
- Output: single-page A4 PDF (595 x 842), PDF 1.7
- Fonts embedded as CIDFontType2 (Type0 / Identity-H), TrueType FontFile2

The document uses two watermarks injected as HTML, per the diagnostics:

- `setWatermark: 345,595.0,1.0,1.0,90.0,2.0,true,true,1+ (injectHtml 502)`
- `setWatermark: 345,0.0,841.0,1.0,-90.0,2.0,true,true,1+ (injectHtml 503)`
## Observed behaviour
A certificate template (heading, a few centred paragraphs, a bulleted list,
two italic lines) is converted to PDF.

- **Apple Preview / Quartz:** renders the full page correctly.
- **MuPDF (PDF rasterizer):** renders the full page correctly; all 17 text
  spans extract correctly.
- **Adobe Acrobat:** renders only the heading and the following bold lines,
  then stops. Everything from the first italic line onward is blank. No error
  dialog is shown for this particular file. (An earlier variant of the same
  template did raise an Acrobat error dialog and truncated at the same point.)

## Root cause (verified from the PDF bytes)
The page content stream is well-formed and balanced (19 `q` / 19 `Q`,
17 `BT` / 17 `ET`) and contains all text. The defect is in the embedded fonts.

Ten font resources (`/f1` .. `/f10`) are present. Each is a **distinct font
object** with a **distinct FontFile2 program** (distinct object numbers and,
except for objects 55 and 18 at 19768 bytes each, distinct byte lengths), yet
they collide on BaseFont name:

| BaseFont | Used by | FontFile2 objects (sizes in bytes) |
|---|---|---|
| `CIPRXO+ArialMT` | f1, f5, f6, f7, f8, f10 | 55 (19768), 18 (19768), 15 (28368), 43 (31340), 39 (21748), 59 (25148) |
| `EGOKFG+Arial-BoldMT` | f2, f3 | 32 (24356), 34 (30624) |
| `QDIRYT+Arial-ItalicMT` | f4, f9 | 40 (16436), 31 (17420) |

So a single subset prefix (`CIPRXO`) is reused for six different subset
programs, `EGOKFG` for two, and `QDIRYT` for two. Per ISO 32000-1 §9.6.4 the
6-letter tag must uniquely identify the subset; here it does not.

Acrobat appears to key its embedded-font cache on the BaseFont name. When a
second resource references a name it has already cached, Acrobat seems to reuse
the first program (whose CIDToGIDMap and glyph set do not match the second
resource) and then aborts rendering of the remaining content on the page.

The truncation begins, empirically, at the first use of the italic face
(`QDIRYT+Arial-ItalicMT`).

## Likely trigger
The collision correlates with the **watermark / injectHtml** usage. Those
appear to run as separate render passes, each of which independently subsets
the fonts it needs while reusing the same per-face prefix. The result is
multiple distinct subset programs sharing one tag. (PD4ML 4.0.9fx5 / 3.11.4fx5
fixed a related “possible name clash of embedded fonts by PDF document merge”;
the watermark-overlay path may not be covered by that fix.)

This is **reproducible on the current 4.1.0 build (2026-02-22)**.
## Confirmation of the cause
Post-processing the generated PDF to assign a **unique** subset prefix to each
embedded font (rewriting `/BaseFont` on the Type0 dictionary and its descendant
CIDFont, and `/FontName` on the FontDescriptor) makes Acrobat render the entire
page correctly. This was verified two ways:

1. Renaming **all** subsets to unique prefixes: full render in Acrobat.
2. Renaming **only** the two italic subsets to unique prefixes, leaving the
   `CIPRXO` (x6) and `EGOKFG` (x2) collisions in place: also a full render in
   Acrobat. This isolates the italic-face collision as the fatal one in this
   document, although the underlying defect (non-unique subset tags) applies to
   all three groups.

No change to the font programs themselves is required; only the names need to
be made unique.

## Expected behaviour
Every distinct embedded font subset should receive a unique 6-character subset
prefix (and correspondingly unique `/BaseFont` and `/FontName`), per
ISO 32000-1 §9.6.4, regardless of how many render passes (including watermark /
injectHtml passes) contribute fonts to the page.

## Reproduction
Minimal repro: a single-page HTML template containing bold heading text, normal
body text, and at least two italic text runs, converted with two injected HTML
watermarks. Open the result in Adobe Acrobat (not Preview).

A sample failing PDF is attached. Its `/Font` resources exhibit the collisions
listed above.

## Severity / impact
High for any workflow whose output is consumed in Adobe Acrobat. The document
looks correct in Preview and most browsers, so the defect is easy to ship
unnoticed and only surfaces for end users opening the file in Acrobat, where the
bulk of the page content silently disappears.
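
Because the defect is invisible in most viewers, a check on the generated PDF
may help catch it before shipping. The sketch below assumes the
`(BaseFont, FontFile2 object number)` pairs have already been extracted with a
PDF library of your choice (extraction is not shown). `find_tag_collisions` is
a hypothetical helper, and the sample data is copied from the table in this
report.

```python
from collections import defaultdict


def find_tag_collisions(fonts):
    """Return BaseFont names that are backed by more than one font program.

    `fonts` is an iterable of (base_font, font_file_object_number) pairs.
    Two resources pointing at the same FontFile2 object are not a collision.
    """
    programs = defaultdict(set)
    for base_font, font_file_obj in fonts:
        programs[base_font].add(font_file_obj)
    return {name: sorted(objs) for name, objs in programs.items() if len(objs) > 1}


# BaseFont names and FontFile2 object numbers as listed in this report.
REPORTED = (
    [("CIPRXO+ArialMT", n) for n in (55, 18, 15, 43, 39, 59)]
    + [("EGOKFG+Arial-BoldMT", n) for n in (32, 34)]
    + [("QDIRYT+Arial-ItalicMT", n) for n in (40, 31)]
)

collisions = find_tag_collisions(REPORTED)
for name, objs in collisions.items():
    print(f"{name}: {len(objs)} distinct FontFile2 objects {objs}")
```

An empty result means no name is shared between distinct font programs; any
entry in the result indicates the kind of collision described in this report.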
