PDF/A Compliance

Generate archival-grade PDFs that conform to the PDF/A-2 and PDF/A-3 standards (ISO 19005). FolioPDF handles XMP metadata, font embedding, color space requirements, and structure tagging automatically.

Overview

PDF/A is an ISO-standardized subset of PDF designed for long-term preservation of electronic documents. Its goal is a file that renders identically decades from now, regardless of which viewer or operating system is used. PDF/A achieves this by restricting certain PDF features (external references, encryption, JavaScript) and mandating others (font embedding, color profiles, XMP metadata).

FolioPDF supports six PDF/A conformance levels across two standard versions:

Level | Enum Value | Standard | Description
PDF/A-2a | PdfAConformance.PdfA_2A | ISO 19005-2 | Full accessibility. Requires complete semantic tagging (headings, tables, lists, figures with alt text). The strictest level.
PDF/A-2b | PdfAConformance.PdfA_2B | ISO 19005-2 | Basic visual fidelity. Guarantees the document reproduces its visual appearance. No accessibility requirements.
PDF/A-2u | PdfAConformance.PdfA_2U | ISO 19005-2 | Unicode mapping. Like 2b, but all text must have a Unicode mapping (ToUnicode CMap). Enables text search and copy.
PDF/A-3a | PdfAConformance.PdfA_3A | ISO 19005-3 | Full accessibility + arbitrary file attachments. Use for ZUGFeRD/Factur-X invoices that must also be accessible.
PDF/A-3b | PdfAConformance.PdfA_3B | ISO 19005-3 | Basic visual fidelity + file attachments. The most common level for automated invoice generation.
PDF/A-3u | PdfAConformance.PdfA_3U | ISO 19005-3 | Unicode mapping + file attachments.

Quick Start

Set the conformance level in DocumentSettings:

using FolioPDF;
using FolioPDF.Fluent;
using FolioPDF.Helpers;
using FolioPDF.Infrastructure;

Document.Create(doc =>
{
    doc.Page(page =>
    {
        page.Size(PageSizes.A4);
        page.Margin(50);
        page.Content().Text("This is a PDF/A-2b document.").FontSize(14);
    });
})
.WithSettings(new DocumentSettings
{
    PdfAConformance = PdfAConformance.PdfA_2B
})
.WithMetadata(new DocumentMetadata
{
    Title = "Annual Report 2026",
    Author = "Finance Department",
    Language = "en-US"
})
.GeneratePdf("report-pdfa.pdf");

Understanding the Conformance Levels

Level A vs Level B vs Level U

Each PDF/A version (2 and 3) comes in three "conformance levels" that add progressively stricter requirements:

Level | Suffix | Requirements Beyond Previous Level
b (basic) | -2b, -3b | Visual fidelity only. All fonts embedded. Color spaces defined. XMP metadata present. No external references.
u (unicode) | -2u, -3u | Everything in b, plus: every glyph in the document must have a Unicode mapping. This makes text searchable and copy-pasteable.
a (accessible) | -2a, -3a | Everything in u, plus: the document must have a complete structure tree with semantic tags (headings, paragraphs, tables, figures with alt text). This enables screen readers and assistive technology.
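
Because each level includes everything in the one before it, a document produced at a stricter level also meets any looser requirement within the same PDF/A part. The sketch below only illustrates that ordering; ConformanceRank and its method names are hypothetical and are not part of FolioPDF.

```csharp
using System;

// Hypothetical helper, not part of FolioPDF: ranks the level letters b < u < a.
public static class ConformanceRank
{
    // Returns 0 for b, 1 for u, 2 for a; throws for any other letter.
    public static int Of(char level) => char.ToLowerInvariant(level) switch
    {
        'b' => 0,
        'u' => 1,
        'a' => 2,
        _ => throw new ArgumentOutOfRangeException(nameof(level), level, "Expected a, b, or u.")
    };

    // True when a document produced at `actual` also meets `required`.
    public static bool Satisfies(char actual, char required) => Of(actual) >= Of(required);
}
```

For example, ConformanceRank.Satisfies('a', 'u') is true, while ConformanceRank.Satisfies('b', 'u') is false.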

PDF/A-2 vs PDF/A-3

The main practical difference between PDF/A-2 and PDF/A-3 is what may be attached:

  • PDF/A-2 does NOT allow arbitrary file attachments. Embedded files must themselves be PDF/A-1 or PDF/A-2 compliant.
  • PDF/A-3 allows embedding arbitrary files (XML, CSV, images, other PDFs) as document-level attachments with a mandatory /AFRelationship entry.

This makes PDF/A-3 the required standard for:

  • ZUGFeRD (German e-invoicing) — embeds a factur-x.xml file
  • Factur-X (French/EU e-invoicing) — same XML standard
  • Hybrid documents that carry both a human-readable PDF and machine-readable data

Required Metadata

PDF/A mandates certain metadata fields. FolioPDF sets defaults for most, but you should provide meaningful values:

Document.Create(doc => { /* ... */ })
.WithMetadata(new DocumentMetadata
{
    Title = "Q1 Financial Report",     // Required by PDF/A
    Author = "Jane Smith",             // Strongly recommended
    Subject = "Quarterly financials",  // Optional but recommended
    Keywords = "finance, Q1, 2026",    // Optional
    Language = "en-US",                // Required for level A (PDF/A-2a, PDF/A-3a)
    Creator = "Report Generator v3.0", // Optional
})
.WithSettings(new DocumentSettings
{
    PdfAConformance = PdfAConformance.PdfA_2A
})
.GeneratePdf("q1-report.pdf");

Title is mandatory. PDF/A requires the document to have a non-empty title in the XMP metadata (the dc:title property; /Title is the matching key in the document information dictionary). FolioPDF will set a default if you omit it, but veraPDF may flag a missing title as a compliance issue. Always set Title explicitly.

Language is required for level A. PDF/A-2a and PDF/A-3a require a /Lang entry on the document catalog so screen readers know how to pronounce the text. Set Language to a BCP 47 tag like "en-US", "de-DE", or "ja".
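
The two rules above can be checked before rendering. The following is a minimal pre-flight sketch under stated assumptions: MetadataPreflight and its parameters are hypothetical and not FolioPDF API, and the language test only checks the general shape of a BCP 47 tag rather than validating subtags against the IANA registry.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical pre-flight helper, not part of FolioPDF.
public static class MetadataPreflight
{
    // Loose shape check only: a 2-3 letter language subtag followed by
    // optional 1-8 character alphanumeric subtags ("ja", "en-US", "zh-Hant-TW").
    private static readonly Regex LanguageShape =
        new Regex(@"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$");

    // Returns human-readable problems; an empty list means none were found.
    public static List<string> Check(string title, string language, bool levelA)
    {
        var problems = new List<string>();

        if (string.IsNullOrWhiteSpace(title))
            problems.Add("Title is empty; set DocumentMetadata.Title explicitly.");

        if (levelA && string.IsNullOrWhiteSpace(language))
            problems.Add("Language is required for PDF/A-2a and PDF/A-3a.");
        else if (!string.IsNullOrWhiteSpace(language) && !LanguageShape.IsMatch(language))
            problems.Add($"Language '{language}' does not look like a BCP 47 tag.");

        return problems;
    }
}
```

Calling MetadataPreflight.Check("Q1 Financial Report", "en-US", true) returns an empty list; an empty title or, for level A, an empty language each add one entry.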

Semantic Tagging for Level A

Level A (PDF/A-2a, PDF/A-3a) requires a complete structure tree. FolioPDF's auto-tagging system handles most content automatically, but headings must be tagged explicitly.

Auto-Tagging (Default Behavior)

When DocumentSettings.AutoTagDocument is true (the default), FolioPDF automatically wraps:

  • Text blocks as <P> (Paragraph)
  • Images as <Figure>
  • Tables as <Table>
  • Headers and footers as Artifacts (excluded from the structure tree)

This is sufficient for level B and U. Level A requires explicit heading tags:

Explicit Semantic Tags

doc.Page(page =>
{
    page.Size(PageSizes.A4);
    page.Margin(50);

    page.Content().Column(col =>
    {
        col.Spacing(10);

        // Heading level 1
        col.Item().SemanticHeader1().Text("Annual Report").FontSize(24).Bold();

        // Paragraph (auto-tagged, but can be explicit)
        col.Item().SemanticParagraph().Text("This report covers the fiscal year 2026.");

        // Heading level 2
        col.Item().SemanticHeader2().Text("Revenue").FontSize(18).Bold();
        col.Item().SemanticParagraph().Text("Total revenue: $4.2M");

        // Semantic list
        col.Item().SemanticList().Column(list =>
        {
            list.Item().SemanticListItem().Row(row =>
            {
                row.AutoItem().SemanticListLabel().Text("1. ");
                row.RelativeItem().SemanticListItemBody().Text("Q1: $1.0M");
            });
            list.Item().SemanticListItem().Row(row =>
            {
                row.AutoItem().SemanticListLabel().Text("2. ");
                row.RelativeItem().SemanticListItemBody().Text("Q2: $1.1M");
            });
        });

        // Image with alt text (required for level A)
        col.Item().SemanticFigure("Chart showing quarterly revenue growth")
            .Image(File.ReadAllBytes("revenue-chart.png"));

        // Semantic table
        col.Item().SemanticTable().Table(table =>
        {
            table.ColumnsDefinition(c =>
            {
                c.RelativeColumn();
                c.RelativeColumn();
            });
            table.Cell().SemanticTableHeaderCell().Text("Quarter").Bold();
            table.Cell().SemanticTableHeaderCell().Text("Revenue").Bold();
            table.Cell().SemanticTableCell().Text("Q1");
            table.Cell().SemanticTableCell().Text("$1.0M");
        });
    });
});

Available Semantic Tags

Extension Method | PDF Tag | Purpose
SemanticHeader1() through SemanticHeader6() | H1 through H6 | Heading levels 1-6
SemanticParagraph() | P | Paragraph text
SemanticList() | L | List container
SemanticListItem() | LI | List item
SemanticListLabel() | Lbl | List item label (bullet/number)
SemanticListItemBody() | LBody | List item body text
SemanticTable() | Table | Table container
SemanticTableRow() | TR | Table row
SemanticTableHeaderCell() | TH | Table header cell
SemanticTableCell() | TD | Table data cell
SemanticTableHeaderGroup() | THead | Table header group
SemanticTableBodyGroup() | TBody | Table body group
SemanticTableFooterGroup() | TFoot | Table footer group
SemanticFigure(altText) | Figure | Image or illustration with alt text
SemanticTableOfContents() | TOC | Table of contents
SemanticTableOfContentsItem() | TOCI | TOC entry

Choosing the Right Level

Scenario | Recommended Level | Why
Long-term archival (no accessibility requirement) | PdfA_2B | Simplest to implement. Guarantees visual fidelity without tagging overhead.
Archival with text search/copy | PdfA_2U | All text is Unicode-mapped. Users can search and copy text from the document.
Government/regulated documents | PdfA_2A | Full accessibility. Tagged output helps meet accessibility mandates that many government agencies and legal frameworks impose (e.g. Section 508).
ZUGFeRD/Factur-X e-invoicing | PdfA_3B | Allows embedding the machine-readable XML invoice alongside the visual PDF.
Hybrid documents (PDF + embedded data) | PdfA_3B or PdfA_3U | Arbitrary file attachments are permitted.
Accessible e-invoicing | PdfA_3A | Combines file attachments with full accessibility.
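
The table reduces to two independent choices: whether arbitrary attachments are needed (part 2 or 3) and how strict the level must be (b, u, or a). A small sketch of that decision follows; LevelChooser is a hypothetical helper, not part of FolioPDF, and it returns the enum member name as a string so that it stays self-contained.

```csharp
// Hypothetical helper, not part of FolioPDF: maps three yes/no requirements
// to the name of a PdfAConformance member used in this guide.
public static class LevelChooser
{
    public static string Recommend(bool needsAttachments, bool needsAccessibility, bool needsTextExtraction)
    {
        // Only PDF/A-3 permits arbitrary embedded files.
        string part = needsAttachments ? "3" : "2";

        // Level a includes u, and u includes b, so pick the strictest one requested.
        string level = needsAccessibility ? "A" : needsTextExtraction ? "U" : "B";

        return $"PdfA_{part}{level}";
    }
}
```

For instance, LevelChooser.Recommend(true, false, false) yields "PdfA_3B", the row recommended for ZUGFeRD/Factur-X e-invoicing.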

Combining with ZUGFeRD / Factur-X

ZUGFeRD and Factur-X are European e-invoicing standards that require a PDF/A-3 document (PDF/A-3b is the usual choice) with an embedded XML invoice file. FolioPDF supports the full workflow:

using System.IO;
using FolioPDF;
using FolioPDF.Fluent;
using FolioPDF.Fluent.DocumentOperation;
using FolioPDF.Helpers;
using FolioPDF.Infrastructure;

// 1. Generate the visual invoice as PDF/A-3b
byte[] pdfBytes = Document.Create(doc =>
{
    doc.Page(page =>
    {
        page.Size(PageSizes.A4);
        page.Margin(40);
        page.Content().Column(col =>
        {
            col.Spacing(8);
            col.Item().Text("Invoice #2026-0042").FontSize(20).Bold();
            col.Item().Text("Date: 2026-04-11");
            col.Item().Text("Total: EUR 1,234.56").FontSize(16);
        });
    });
})
.WithSettings(new DocumentSettings
{
    PdfAConformance = PdfAConformance.PdfA_3B
})
.WithMetadata(new DocumentMetadata
{
    Title = "Invoice 2026-0042",
    Author = "Acme Corp Billing"
})
.GeneratePdf();

// 2. Attach the Factur-X XML and set the AFRelationship
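byte[] xmlInvoice = File.ReadAllBytes("factur-x.xml");

// XMP packet declaring the Factur-X extension schemas.
// "factur-x-xmp.xml" is a placeholder path (supply your own packet); this
// assumes ExtendMetadata accepts the XML as a string.
string zugferdXmpXml = File.ReadAllText("factur-x-xmp.xml");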

byte[] finalPdf = DocumentOperation.FromBytes(pdfBytes)
    .AddAttachment(new DocumentAttachment
    {
        Name = "factur-x.xml",
        Data = xmlInvoice,
        MimeType = "text/xml",
        Description = "Factur-X invoice data",
        Relationship = "Alternative"    // PDF/A-3 AFRelationship
    })
    .ExtendMetadata(zugferdXmpXml)      // XMP with Factur-X extension schemas
    .ToBytes();

AFRelationship is required. PDF/A-3 mandates that every embedded file has an /AFRelationship entry and is referenced from the document-level /AF array. FolioPDF's qpdf backend handles this automatically via a downstream patch when you set the Relationship property on DocumentAttachment.
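
Since Relationship is a plain string, a typo would only surface at validation time. A minimal guard, assuming the five relationship names listed under Common Validation Issues are the accepted set, might look like this; AfRelationship is a hypothetical helper and not part of FolioPDF.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical guard, not part of FolioPDF.
public static class AfRelationship
{
    // The five relationship names this guide lists for PDF/A-3 attachments.
    // PDF name objects are case-sensitive, so the comparison is ordinal.
    private static readonly HashSet<string> Allowed = new HashSet<string>(StringComparer.Ordinal)
    {
        "Source", "Data", "Alternative", "Supplement", "Unspecified"
    };

    // True only for an exact, case-sensitive match with one of the names above.
    public static bool IsValid(string value) => value != null && Allowed.Contains(value);
}
```

AfRelationship.IsValid("Alternative") is true, while "alternative" and the empty string are rejected.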

Validation with veraPDF

veraPDF is the industry-standard open-source PDF/A validator. FolioPDF's conformance tests use it to verify every generated document.

Install veraPDF

# Download from https://verapdf.org/software/
# Or via a package manager, if your distribution provides a verapdf package (Linux):
sudo apt install verapdf

# macOS (Homebrew):
brew install verapdf

Validate a Document

# Validate against auto-detected profile
verapdf report-pdfa.pdf

# Validate against a specific PDF/A flavour
# (--profile expects a path to a validation profile file instead)
verapdf --flavour 2b report-pdfa.pdf

# Machine-readable output
verapdf --format json report-pdfa.pdf

Programmatic Validation

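using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Run veraPDF as a child process and capture its JSON report
using var process = new Process
{
    StartInfo = new ProcessStartInfo
    {
        FileName = "verapdf",
        Arguments = "--format json report-pdfa.pdf",
        RedirectStandardOutput = true,
        UseShellExecute = false
    }
};

process.Start();
// Read stdout before waiting so a full pipe buffer cannot block the child
string output = process.StandardOutput.ReadToEnd();
process.WaitForExit();

// Tolerate any whitespace around the colon in the JSON report
bool isCompliant = Regex.IsMatch(output, "\"compliant\"\\s*:\\s*true");
Console.WriteLine($"PDF/A compliant: {isCompliant}");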
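
Matching on raw text is fragile if the report contains several results. As a sketch of a more structured check, the helper below parses the JSON with System.Text.Json and collects every boolean property named "compliant" at any depth. ReportReader is hypothetical; the "compliant" key name is taken from the string check above, and no other part of the veraPDF report layout is assumed.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper; only the "compliant" key name is assumed.
public static class ReportReader
{
    // Collects every boolean property named "compliant", at any depth.
    public static List<bool> FindCompliantFlags(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var flags = new List<bool>();
        Walk(doc.RootElement, flags);
        return flags;
    }

    private static void Walk(JsonElement element, List<bool> flags)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (JsonProperty property in element.EnumerateObject())
                {
                    bool isFlag = property.NameEquals("compliant")
                        && (property.Value.ValueKind == JsonValueKind.True
                            || property.Value.ValueKind == JsonValueKind.False);
                    if (isFlag)
                        flags.Add(property.Value.GetBoolean());
                    else
                        Walk(property.Value, flags);
                }
                break;

            case JsonValueKind.Array:
                foreach (JsonElement item in element.EnumerateArray())
                    Walk(item, flags);
                break;
        }
    }

    // Compliant only if at least one flag was found and all of them are true.
    public static bool IsCompliant(string json)
    {
        List<bool> flags = FindCompliantFlags(json);
        return flags.Count > 0 && flags.TrueForAll(f => f);
    }
}
```

With this in place, ReportReader.IsCompliant(output) can stand in for the text match. Treating a report with no flags as non-compliant is a deliberate, conservative choice.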

What FolioPDF Handles Automatically

When a PDF/A conformance level is set, FolioPDF's rendering pipeline automatically:

  • Embeds all fonts as subsetted OpenType/TrueType programs (no external font references)
  • Includes ToUnicode CMaps for all text (required by levels U and A)
  • Sets the output intent to sRGB IEC61966-2.1 (the default RGB color space for screen)
  • Generates XMP metadata with the correct pdfaid:part and pdfaid:conformance entries
  • Marks the document as tagged with /MarkInfo << /Marked true >>
  • Sets the document language from DocumentMetadata.Language
  • Auto-tags content elements as P, Figure, and Table when AutoTagDocument is true
  • Strips JavaScript and multimedia (forbidden in PDF/A)
  • Ensures all images use supported color spaces (sRGB, Grayscale)
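
To make the pdfaid entries concrete, the sketch below builds the identification block that an XMP packet carries for a given part and level. It is an illustration only, not FolioPDF's implementation: PdfAIdentification is a hypothetical helper, and a real packet wraps this fragment in the usual x:xmpmeta and rdf:RDF elements.

```csharp
using System;

// Hypothetical helper, not FolioPDF's implementation.
public static class PdfAIdentification
{
    // Namespace URI of the PDF/A identification (pdfaid) schema.
    public const string Namespace = "http://www.aiim.org/pdfa/ns/id/";

    // part is 2 or 3; conformance is A, B, or U (written upper case in XMP).
    public static string BuildDescription(int part, char conformance)
    {
        char level = char.ToUpperInvariant(conformance);
        if (part != 2 && part != 3)
            throw new ArgumentOutOfRangeException(nameof(part), part, "Expected 2 or 3.");
        if (level != 'A' && level != 'B' && level != 'U')
            throw new ArgumentOutOfRangeException(nameof(conformance), conformance, "Expected A, B, or U.");

        return
            $"<rdf:Description rdf:about=\"\" xmlns:pdfaid=\"{Namespace}\">\n" +
            $"  <pdfaid:part>{part}</pdfaid:part>\n" +
            $"  <pdfaid:conformance>{level}</pdfaid:conformance>\n" +
            "</rdf:Description>";
    }
}
```

PdfAIdentification.BuildDescription(3, 'b') produces a fragment with pdfaid:part set to 3 and pdfaid:conformance set to B, which corresponds to PdfAConformance.PdfA_3B.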

Configuration Reference

var settings = new DocumentSettings
{
    // PDF/A conformance level (default: None)
    PdfAConformance = PdfAConformance.PdfA_2A,

    // Auto-tag content for structure tree (default: true)
    // Set to false only if you provide all semantic tags manually
    AutoTagDocument = true,

    // Image compression quality (default: High)
    // PDF/A allows JPEG compression; lower quality = smaller files
    ImageCompressionQuality = ImageCompressionQuality.High,

    // Document content direction (default: LeftToRight)
    ContentDirection = ContentDirection.LeftToRight,
};

Common Validation Issues

veraPDF Rule | Cause | Fix
Missing /Title in XMP | DocumentMetadata.Title not set | Always set Title in WithMetadata()
Missing /Lang on document catalog | DocumentMetadata.Language not set | Set Language = "en-US" (or appropriate BCP 47 tag)
Untagged content (level A) | Content not wrapped in semantic tags | Use SemanticHeader*, SemanticParagraph, SemanticFigure etc.
Figure without alt text (level A) | Image not wrapped in SemanticFigure(altText) | Wrap images: .SemanticFigure("Description of image").Image(...)
Non-embedded font | Rare: using a system font that cannot be subset | Use a bundled font or register the font via FontManager
Missing AFRelationship (PDF/A-3) | Attachment added without Relationship property | Set Relationship = "Alternative" (or Data, Source, Supplement, Unspecified)

Known Limitations

  • PDF/A-1 is not supported. FolioPDF targets PDF/A-2 and PDF/A-3 (ISO 19005-2/3, based on PDF 1.7). PDF/A-1 (ISO 19005-1, based on PDF 1.4) is an older standard whose stricter constraints, such as its ban on transparency, are incompatible with modern Skia output.
  • CMYK output intent is not yet supported. FolioPDF uses sRGB as the output intent. Professional print workflows that require ISO Coated v2 or similar CMYK profiles should convert the output in a post-processing step.
  • The content of PDF/A-3 attachments is not validated. The attachment is embedded with correct metadata, but its content is not inspected, and a PDF/A validator such as veraPDF does not check whether the XML is valid Factur-X either.
  • Auto-tagging does not detect headings. Font-size heuristics are unreliable, so auto-tagging only produces P, Figure, and Table tags. Use SemanticHeader1 through SemanticHeader6 explicitly for heading structure.