Accessibility - PDF/UA
Version 6.1.1 · 2024-08-01
In this article
- What is PDF/UA?
- How PDF/UA works
- Creating a PDF/UA document
- The base API
- Extension methods
- Structure the document
This article describes how to use Universal Accessibility with PDFsharp to create accessible PDF files.
What is PDF/UA?
PDF/UA (PDF/Universal Accessibility), formally ISO 14289, is an International Organization for Standardization (ISO) standard for accessible PDF technology. It is a technical specification that PDF-producing software follows to create documents that meet accessibility requirements for people with disabilities. PDF/UA-conforming documents contain additional information about language and document structure, which screen readers use to navigate and read the electronic content.
How PDF/UA works
When PDFsharp creates a PDF/UA-conforming document, it adds hints to the PDF file so that screen readers know what to read, in which order, and what to skip. PDFsharp provides a base API and extension methods to add these UA hints.
In PDF/UA, the document is organized in a structure tree consisting of various UA structure elements. In this structure tree, a Document may contain, e.g., Parts, Sections and Articles. You can find the different structure element types in the enums PdfGroupingElementTag, PdfBlockLevelElementTag, PdfIllustrationElementTag and PdfInlineLevelElementTag.
Creating a PDF/UA document
To make a PDF document a PDF/UA document, you first have to get a UAManager for your document.
// Get the UAManager for the document, creating it if necessary.
var uaManager = UAManager.ForDocument(document);
This creates a UAManager and initializes the document for UA. Any further call for the same document returns the same UAManager object. Get the UAManager before performing any other operation on the document (e.g. creating a page).
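The ordering requirement can be sketched as a minimal program. It assumes that the PDFsharp NuGet package is referenced and that the standard PdfDocument members AddPage and Save behave as usual; the output file name is a placeholder.

```csharp
using System;
using PdfSharp.Pdf;
using PdfSharp.UniversalAccessibility;

// Create the document, but do not perform any other operation on it yet.
var document = new PdfDocument();

// Request the UAManager first, before any page is added.
var uaManager = UAManager.ForDocument(document);

// A second request for the same document returns the same instance.
var again = UAManager.ForDocument(document);
Console.WriteLine(ReferenceEquals(uaManager, again));

// Only now continue with regular document operations.
document.AddPage();

// "accessible.pdf" is a placeholder file name.
document.Save("accessible.pdf");
```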
The base API
Using the base API, you place the drawing instructions that represent one logical structure element between a BeginElement and an End call on the StructureBuilder. This gives you maximum flexibility in writing drawing instructions and building the structure tree of the document.
To define the document structure with the base API, you first have to retrieve the StructureBuilder from the UAManager.
// Get the structure builder.
var sb = uaManager.StructureBuilder;
Example using the base API
To make clear which drawing instructions belong to which structure element, you can structure your code with additional curly braces.
// Start an Article structure element.
sb.BeginElement(PdfGroupingElementTag.Article);
{
    // Create a Page and an XGraphics object as usual.
    var page = document.AddPage();
    var gfx = XGraphics.FromPdfPage(page);
    // Start a Heading1 structure element.
    sb.BeginElement(PdfBlockLevelElementTag.Heading1);
    {
        gfx.DrawString("Header Text", fontH1, XBrushes.DarkBlue, 50, 100);
    }
    // End the Heading1 element.
    sb.End();
    // Start a Paragraph structure element.
    sb.BeginElement(PdfBlockLevelElementTag.Paragraph);
    {
        // This string contains a trailing space needed for screen readers.
        gfx.DrawString("Line one ", font, XBrushes.DarkBlue, 50, 200);
        gfx.DrawString("Line two", font, XBrushes.DarkBlue, 50, 250);
    }
    // End the Paragraph structure element.
    sb.End();
}
// End the Article structure element.
sb.End();
Important
Note the trailing blank in the gfx.DrawString call for "Line one ". It is needed so that screen readers can identify the word break between "Line one" and "Line two". Otherwise, screen readers would read "Line oneLine two".
If there are several DrawString calls inside one structure element, always insert trailing blanks as if you were writing the content as plain text.
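The effect of the trailing blank can be illustrated without PDFsharp. The following sketch is a simplified model that assumes a screen reader receives the text fragments of one structure element joined in drawing order; ReadingText is a hypothetical helper, not part of any library.

```csharp
using System;

// The fragments as they would be passed to separate DrawString calls.
string[] withoutBlank = { "Line one", "Line two" };
string[] withBlank = { "Line one ", "Line two" };

// In this model, the reader sees the fragments joined in order.
Console.WriteLine(ReadingText.Join(withoutBlank)); // Line oneLine two
Console.WriteLine(ReadingText.Join(withBlank));    // Line one Line two

static class ReadingText
{
    // Hypothetical helper: concatenates the fragments without adding separators.
    public static string Join(params string[] fragments) => string.Concat(fragments);
}
```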
Extension methods
Extension methods allow combining drawing instructions and UA hints in a single function call. This reduces the amount of code, but it only works for simple combinations of structure elements. For more complex combinations, you still have to use the base API.
To use all extension methods, add PdfSharp.UniversalAccessibility and PdfSharp.UniversalAccessibility.Drawing to your using directives.
Example using extension methods
Instead of using the base API’s BeginElement and End functions, you can use the extension methods for simple combinations of structure elements. For example, for this excerpt using the base API...
// Start a Heading1 structure element.
sb.BeginElement(PdfBlockLevelElementTag.Heading1);
{
    gfx.DrawString("Header Text", fontH1, XBrushes.DarkBlue, 50, 100);
}
// End the Heading1 structure element.
sb.End();
...an extension method can be used instead:
// Create the Heading1 structure element, draw the string and end the structure element
// in one line of code.
gfx.DrawString("Header Text", fontH1, XBrushes.DarkBlue, 50, 100,
    PdfBlockLevelElementTag.Heading1);
You can choose between several DrawString overloads. This way you can, e.g., use the different PdfBlockLevelElementTag and PdfInlineLevelElementTag enum values to define the structure element type.
Structure the document
Language
Screen readers use the language information to pronounce the text correctly. You can set the document language like this:
uaManager.SetDocumentLanguage("en-US");
Using the base API, you can also change the language of the current structure element with the SetLanguage function.
// Create a Paragraph structure element.
sb.BeginElement(PdfBlockLevelElementTag.Paragraph);
{
    // Change the language for the current structure element.
    sb.SetLanguage("de-DE");
    // Insert the text in the new language.
    gfx.DrawString("Herzlichen Glückwunsch!", font, XBrushes.DarkBlue, 50, 200);
}
// End the Paragraph structure element.
sb.End();
Text and Paragraphs
For writing text and paragraphs, see the base API and extension method examples above. All textual content of the document should be organized in the structure tree, e.g. using Articles and Paragraphs. Structure elements used for this simple grouping can be created as shown in those examples. Below you will find examples for structure elements that should be used in a specific way.
Abbreviations
As the meaning of abbreviations may vary, it is recommended to define the expanded text for each abbreviation. This can be done with the SetExpandedText method.
In this example using the base API, the first part of the sentence is written as usual, followed by the abbreviation that is encapsulated in its own structure element. The rest of the sentence is again written as usual.
// Create a Paragraph structure element.
sb.BeginElement(PdfBlockLevelElementTag.Paragraph);
{
    // Insert simple text. It contains a trailing space needed for screen readers.
    gfx.DrawString("A text with an ", font, XBrushes.DarkBlue, 50, 100);
    // Create a Span structure element for the abbreviation.
    sb.BeginElement(PdfInlineLevelElementTag.Span);
    {
        // Draw the abbreviation.
        gfx.DrawString("abbr.", font, XBrushes.DarkBlue, 50, 120);
        // Set the expanded text for the abbreviation to be used by screen readers.
        // It contains a trailing space needed for screen readers.
        sb.SetExpandedText("abbreviation ");
    }
    // End the Span structure element.
    sb.End();
    // Insert further text.
    gfx.DrawString("in a structure element in the middle.", font, XBrushes.DarkBlue, 50, 140);
}
// End the Paragraph structure element.
sb.End();
As explained earlier, each blank that would be written in a plain text representation must also be present in a PDF/UA document. For this reason, there is a trailing blank in the first DrawString call ("A text with an ") and in the SetExpandedText call ("abbreviation "). Otherwise, a screen reader would read "A text with anabbreviationin a structure element in the middle.".
You can also use extension methods to add the same abbreviation:
// Create the Span structure element, draw the abbreviation, add the expanded text
// and end the structure element in one line of code.
// "abbreviation " contains a trailing space needed for screen readers.
gfx.DrawAbbreviation("abbr.", "abbreviation ", font, XBrushes.DarkBlue, 50, 120);
You can choose between several DrawAbbreviation overloads.
Artifacts
Artifacts are text characters, drawn paths, or images that are not part of the content of the document; they exist only for layout or design, not to convey meaning. Everything that is drawn or written inside an artifact is excluded from the structure tree and ignored by screen readers.
// Create an Artifact.
sb.BeginArtifact();
{
    // This text is, as it is included in the artifact, not handled as actual content
    // of the document. A screen reader will skip this text.
    gfx.DrawString("An artifact", font, XBrushes.Gray, 50, 150);
}
// End the Artifact.
sb.End();
Links
A Link structure element must include the PdfLinkAnnotation and an alternate text. There is a BeginElement overload to create the Link structure element accepting these two parameters.
// Create the link’s bounding rect.
var rect = new PdfRectangle(gfx.Transformer.WorldToDefaultPage(new XRect(new XPoint(50, 90),
    new XPoint(205, 105))));
// Create a LinkAnnotation referring to page 2.
var link = PdfLinkAnnotation.CreateDocumentLink(rect, 2);
// Create a new Link structure element with the LinkAnnotation
// and the alternative text passed as parameters.
sb.BeginElement(link, "This is a link to page 2");
{
    // Insert the text shown as link.
    gfx.DrawString("Go to next page", font, XBrushes.DarkBlue, 50, 100);
}
// End the Link structure element.
sb.End();
Instead of creating a document link, you could, of course, create a web or file link.
You can also use extension methods to add Links:
// Create the Link structure element, draw the link’s text and end the structure element
// in one line of code.
gfx.DrawLink("Go to next page", font, XBrushes.DarkBlue, 50, 100, link,
    "This is a link to page 2");
You can choose between several DrawLink overloads.
Page break
A page break is not relevant for the structure tree. To add a page break inside a structure element, simply use the base API and add the code for the page break:
// Create a Paragraph structure element.
sb.BeginElement(PdfBlockLevelElementTag.Paragraph);
{
    // Draw text on the first page.
    // It contains a trailing space needed for screen readers.
    gfx.DrawString("A paragraph that contains text content on the first page and that ",
        font, XBrushes.DarkBlue, 50, 100);
    // Break to the second page.
    page = document.AddPage();
    gfx = XGraphics.FromPdfPage(page);
    // Draw text on the second page. The text is still content of the same Paragraph.
    gfx.DrawString("continues on the second page.", font, XBrushes.DarkBlue, 50, 100);
}
// End the Paragraph structure element.
sb.End();
Lists
The List structure element contains several ListItem structure elements. Each ListItem contains a Label and a ListBody structure element. The Label contains the bullet, number, or another kind of label, and the ListBody contains the text of the list item.
According to these rules a list is created like this:
// Create a List structure element.
sb.BeginElement(PdfBlockLevelElementTag.List);
{
    // Create a ListItem structure element for the first list item.
    sb.BeginElement(PdfBlockLevelElementTag.ListItem);
    {
        // Create a Label structure element for the first list item.
        sb.BeginElement(PdfBlockLevelElementTag.Label);
        {
            // Draw the label.
            gfx.DrawString("1)", font, XBrushes.DarkBlue, 50, 80);
        }
        // End the Label structure element.
        sb.End();
        // Create a ListBody structure element for the first list item.
        sb.BeginElement(PdfBlockLevelElementTag.ListBody);
        {
            // Draw the list item’s content.
            gfx.DrawString("Item 1", font, XBrushes.DarkBlue, 70, 80);
        }
        // End the ListBody structure element.
        sb.End();
    }
    // End the ListItem structure element.
    sb.End();
    // Create further list items.
    …
}
// End the List structure element.
sb.End();
You can also use extension methods to create the ListItem structure element. This way, the code of the example above can be reduced like this:
// Create a List structure element.
sb.BeginElement(PdfBlockLevelElementTag.List);
{
    // Create the first ListItem, Label and ListBody structure elements, draw the list item’s
    // label and text and end the structure elements in one line of code.
    gfx.DrawListItem("1)", "Item 1", font, XBrushes.DarkBlue, 50, 80, 20);
    // Create further list items.
    …
}
// End the List structure element.
sb.End();
The last parameter in the DrawListItem call above is the labelWidth, which is the x-offset between the position of the label and the position of the list item’s text. You can choose between several DrawListItem overloads.
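As a small worked example of the offset, the values of the DrawListItem call above (x = 50, labelWidth = 20) place the item text at x = 70, which is the x position used for the ListBody text in the base API example. ListLayout is a hypothetical helper used only for this illustration.

```csharp
using System;

// Values taken from the DrawListItem call above.
double x = 50;          // x position of the label
double labelWidth = 20; // offset between label and text

// Resulting x position of the list item text.
double textX = ListLayout.TextX(x, labelWidth);
Console.WriteLine(textX); // 70

static class ListLayout
{
    // Hypothetical helper: the text starts labelWidth points to the right of the label.
    public static double TextX(double labelX, double labelWidth) => labelX + labelWidth;
}
```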
Nested lists
You can also create nested lists by beginning a further List structure element inside the ListBody structure element of a list item of the parent list:
// Create a ListBody structure element for a list item.
sb.BeginElement(PdfBlockLevelElementTag.ListBody);
{
    // Draw some text.
    gfx.DrawString("Item 1", font, XBrushes.DarkBlue, 70, 100);
    // Create a List structure element for the nested list.
    sb.BeginElement(PdfBlockLevelElementTag.List);
    {
        // Create list items of the nested list.
        …
    }
    // End the List structure element of the nested list.
    sb.End();
}
// End the ListBody structure element.
sb.End();
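With deeply nested elements, it is easy to forget an End call. One possible way to keep the calls balanced is a small disposable scope, sketched here. ElementScope is a hypothetical helper and not part of PDFsharp; the Begin and End functions below are stand-ins that only record the call order so that the sketch runs without the library.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for sb.BeginElement(...) and sb.End(), recording the call order.
var log = new List<string>();
void Begin(string tag) => log.Add("Begin " + tag);
void End() => log.Add("End");

Begin("List");
using (new ElementScope(End))
{
    Begin("ListItem");
    using (new ElementScope(End))
    {
        log.Add("draw");
    }
}

Console.WriteLine(string.Join(", ", log));
// Begin List, Begin ListItem, draw, End, End

// Hypothetical helper: runs the given end action when the using block is left.
sealed class ElementScope : IDisposable
{
    private readonly Action _end;
    public ElementScope(Action end) => _end = end;
    public void Dispose() => _end();
}
```

With a real StructureBuilder, the scope could be created as new ElementScope(() => sb.End()) directly after the corresponding BeginElement call. Whether this reads better than the explicit End calls is a matter of taste.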
Drawn list bullets
To preserve the accessibility of the document, you should provide an alternative text for drawn list bullets. You can do this by using the SetAltText method.
// Create the Label structure element of a list item.
sb.BeginElement(PdfBlockLevelElementTag.Label);
{
    // Draw an ellipse as the list item’s label.
    gfx.DrawEllipse(XBrushes.DarkBlue, 50, 75, 3, 3);
    // Set the alternative text for the label’s structure element.
    sb.SetAltText("Bullet");
}
// End the Label structure element.
sb.End();
Images
In PDF/UA, you must define a bounding box and an alternate text for each graphic. This can be done simply with the BeginElement function for a Figure structure element.
The following image is inserted without changing its size. Nevertheless, you need the image’s size to define its bounding box. For this reason, create the image first and use its PointWidth and PointHeight to create the bounding box of the structure element.
// Load the image.
var image = XImage.FromFile("Z3.jpg");
// Create the Figure structure element and pass its alternate text and its bounding box
// as parameter. Use the image size as size for the bounding box.
sb.BeginElement(PdfIllustrationElementTag.Figure, "A BMW Z3 driving through a desert.",
    new XRect(50, 100, image.PointWidth, image.PointHeight));
{
    // Draw the previously created image as usual without specifying a size.
    gfx.DrawImage(image, 50, 100);
}
// End the Figure structure element.
sb.End();
If you specify a size explicitly, you can move the XImage.FromFile call inside the DrawImage call, because you don’t need to access the image’s size.
// Create the Figure structure element and pass its alternate text and its bounding box
// as parameter.
sb.BeginElement(PdfIllustrationElementTag.Figure, "A scaled BMW Z3 driving through a desert.",
    new XRect(50, 100, 400, 300));
{
    // Load the image and draw it as usual specifying a size.
    gfx.DrawImage(XImage.FromFile("../../assets/Z3.jpg"), 50, 100, 400, 300);
}
// End the Figure structure element.
sb.End();
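The example above uses a fixed size of 400 x 300. If the scaled size should keep the image’s aspect ratio, it can be computed from the original size first and then used for both the bounding box and the DrawImage call. The following sketch shows only the arithmetic; ImageLayout and FitToWidth are hypothetical names, and the 800 x 600 input size is an assumed example value.

```csharp
using System;

// Assumed example: an image of 800 x 600 points scaled to a width of 400 points.
var (width, height) = ImageLayout.FitToWidth(800, 600, 400);
Console.WriteLine($"{width} x {height}"); // 400 x 300

static class ImageLayout
{
    // Hypothetical helper: scales a size to the target width, keeping the aspect ratio.
    public static (double Width, double Height) FitToWidth(
        double pointWidth, double pointHeight, double targetWidth)
    {
        if (pointWidth <= 0)
            throw new ArgumentOutOfRangeException(nameof(pointWidth));
        double scale = targetWidth / pointWidth;
        return (targetWidth, pointHeight * scale);
    }
}
```

In a real document, the first two arguments would be the image’s PointWidth and PointHeight, as in the unscaled example.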
You can also use extension methods to add images:
// Create the Figure structure element, draw the image and end the structure element
// in one line of code.
gfx.DrawImage(image, 50, 100, "A BMW Z3 driving through a desert.");
You can choose between several DrawImage overloads. If you don’t specify a bounding box, it will be created from the non-transformed image position and size.