eCTD XML Backbone: Complete Technical Guide to index.xml and regional.xml

eCTD XML backbone technical guide covering index.xml, regional.xml, DTD specifications, leaf elements, and common XML validation errors. Master FDA and EMA submission structure.

Assyro Team
25 min read

eCTD XML Backbone: Complete Technical Guide to Structure, DTD, and Validation

Quick Answer

The eCTD XML backbone consists of index.xml and regional.xml files that serve as the navigational foundation for regulatory submissions. Index.xml defines the structure of Modules 2-5 using leaf elements with unique IDs, MD5 checksums, and lifecycle operations (new, replace, append, delete), while regional.xml files contain Module 1 content specific to FDA, EMA, Health Canada, or PMDA requirements. XML validation errors account for 23% of all eCTD gateway rejections, making backbone validation critical for submission success.

The eCTD XML backbone is the foundational XML file structure that provides navigation, metadata, and lifecycle management for Electronic Common Technical Document (eCTD) submissions. Consisting of the index.xml master file and region-specific XML files (us-regional.xml, eu-regional.xml), the eCTD XML backbone creates the hyperlinked structure that enables regulatory reviewers to navigate through thousands of documents efficiently.

For regulatory teams, understanding the eCTD XML backbone is essential for successful submission publishing. XML validation errors account for 23% of all gateway rejections - the single largest category of eCTD failures. A malformed backbone file means your entire submission cannot be processed, regardless of how perfect your PDF documents are.

In this guide, you'll learn:

  • The complete technical structure of index.xml and how leaf elements define document metadata
  • Regional.xml requirements for FDA, EMA, Health Canada, and PMDA submissions
  • DTD and schema specifications including ich-ectd-3-2.dtd and v4.0 XSD
  • Lifecycle operations (new, replace, append, delete) and when to use each
  • Common eCTD XML validation errors and how to prevent them

What Is the eCTD XML Backbone?

Definition

The eCTD XML backbone is the set of XML files (index.xml and regional.xml) that serve as the structural foundation and navigation layer for an eCTD submission. These files define which documents are included, where they are located, how they relate to each other, and how they change across submission sequences. The backbone is validated against Document Type Definition (DTD) or XML Schema (XSD) specifications, which lets regulatory reviewers and automated systems process the submission.

Key characteristics of the eCTD XML backbone:

  • Creates the navigable table of contents for regulatory reviewers
  • Stores metadata for every document (title, location, checksum, operation)
  • Enables lifecycle management across multiple submission sequences
  • Validates against Document Type Definition (DTD) or XML Schema (XSD)
  • Supports hyperlinked navigation between related documents

Key Statistic

The eCTD XML backbone dates back to the ICH eCTD specification of 2003 and is now maintained under the ICH M8 topic. It has evolved through version 3.2.2 (DTD-based) to version 4.0 (XSD-based), with v3.2.2 remaining the most widely implemented globally as of 2026.

The backbone consists of two primary file types that work together:

| File Type | Purpose | Scope |
| --- | --- | --- |
| index.xml | Master navigation file containing Modules 2-5 structure | Global (harmonized) |
| regional.xml | Region-specific Module 1 content and metadata | Regional (varies by agency) |

Together, these files create a complete map of the submission that both human reviewers and automated validation systems use to process the eCTD.

Understanding eCTD XML Structure: index.xml Explained

The index.xml file is the heart of the eCTD XML backbone. It serves as the master table of contents for the entire submission, defining the structure of harmonized Modules 2 through 5 and linking to regional XML files for Module 1 content.

index.xml File Location and Purpose

The index.xml file must be located at the root of each sequence folder.
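
As a rough illustration of the location rule, the Python sketch below checks a sequence folder for a root-level index.xml. The sequence folder name `0000` and the helper name `missing_backbone_files` are hypothetical and used only for the demo.

```python
import tempfile
from pathlib import Path


def missing_backbone_files(sequence_dir: Path) -> list[str]:
    """Return the backbone files that are missing from a sequence folder."""
    # Only index.xml is checked here, because that is the one file the
    # text above places at the sequence root.
    expected = ["index.xml"]
    return [name for name in expected if not (sequence_dir / name).is_file()]


# Demo in a throwaway directory; "0000" is a hypothetical sequence folder.
with tempfile.TemporaryDirectory() as tmp:
    sequence = Path(tmp) / "0000"
    sequence.mkdir()
    before = missing_backbone_files(sequence)
    (sequence / "index.xml").write_text("<placeholder/>", encoding="utf-8")
    after = missing_backbone_files(sequence)

print(before, after)
```

A check like this is cheap to run on every sequence folder before any XML validation starts.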

index.xml Structure Breakdown

The index.xml follows a hierarchical structure matching the eCTD module organization.

Pro Tip

Always validate your index.xml against the correct DTD version before submission. Version mismatches (e.g., using v3.2.2 DTD when your XML declares v4.0) are among the easiest errors to introduce and hardest to catch without proper validation tools. Use XML editors with built-in DTD validation to catch these errors in real-time.
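
To make the module hierarchy of index.xml concrete, here is a minimal sketch that builds an empty Modules 2-5 skeleton with Python's standard library. The container element names are given as they are commonly seen in the ICH v3.2 DTD. Treat them as assumptions and confirm them against the DTD copy shipped with your own sequence.

```python
import xml.etree.ElementTree as ET

ECTD_NS = "http://www.ich.org/ectd"
ET.register_namespace("ectd", ECTD_NS)

# Assumed container element names for Modules 2-5; verify against your DTD.
MODULES = [
    "m2-common-technical-document-summaries",
    "m3-quality",
    "m4-nonclinical-study-reports",
    "m5-clinical-study-reports",
]


def build_skeleton() -> ET.Element:
    """Build a root element with one empty container per harmonized module."""
    root = ET.Element(f"{{{ECTD_NS}}}ectd", {"dtd-version": "3.2"})
    for name in MODULES:
        ET.SubElement(root, name)
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

The skeleton is not a valid submission on its own. It only shows the order in which the module containers nest under the root.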

XML Declaration and DTD Reference

Every index.xml must begin with the XML declaration and DTD reference:

| Element | Purpose | Required |
| --- | --- | --- |
| `<?xml version="1.0" encoding="UTF-8"?>` | Declares XML version and character encoding | Yes |
| `<!DOCTYPE ectd:ectd SYSTEM "...">` | References the DTD for validation | Yes |
| `xmlns:ectd` | eCTD namespace declaration | Yes |
| `xmlns:xlink` | XLink namespace for hyperlinks | Yes |
| `dtd-version` | Specifies DTD version (3.2 for v3.2.2) | Yes |

Root Element Requirements

The root element must declare the eCTD and XLink namespaces and the DTD version:

  • xmlns:ectd="http://www.ich.org/ectd" - Required for all eCTD elements
  • xmlns:xlink="http://www.w3.org/1999/xlink" - Required for document references
  • dtd-version="3.2" - Must match the referenced DTD version
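
The root-element requirements above can be spot-checked with a short script. The sketch below is an illustration only: the sample document, the `util/dtd/` path in its DOCTYPE, and the rule that the DTD file name is `ich-ectd-` plus the version with dots turned into hyphens are all assumptions drawn from the names used in this guide.

```python
import re
import xml.etree.ElementTree as ET

ECTD_NS = "http://www.ich.org/ectd"

# Hypothetical minimal backbone used only for this demo.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ectd:ectd SYSTEM "util/dtd/ich-ectd-3-2.dtd">
<ectd:ectd xmlns:ectd="http://www.ich.org/ectd"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           dtd-version="3.2"/>
"""


def check_root(xml_text: str) -> list[str]:
    """Return problems found in the root element and DOCTYPE declaration."""
    problems = []
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != f"{{{ECTD_NS}}}ectd":
        problems.append(f"unexpected root element: {root.tag}")
    version = root.get("dtd-version")
    if version is None:
        problems.append("missing dtd-version attribute")
    doctype = re.search(r'<!DOCTYPE\s+\S+\s+SYSTEM\s+"([^"]+)"', xml_text)
    if doctype is None:
        problems.append("missing DOCTYPE with SYSTEM identifier")
    elif version is not None:
        # Assumed naming rule: dtd-version 3.2 -> ich-ectd-3-2.dtd
        expected = "ich-ectd-" + version.replace(".", "-") + ".dtd"
        if not doctype.group(1).endswith(expected):
            problems.append(
                f"DOCTYPE {doctype.group(1)} does not match dtd-version {version}"
            )
    return problems


print(check_root(SAMPLE))
```

ElementTree is a non-validating parser, so this catches only the version and namespace mismatches named here. Full DTD validation still needs a validating tool.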

eCTD Leaf Elements: The Building Blocks of XML Backbone

Leaf elements are the fundamental building blocks of the eCTD XML backbone. Each leaf represents a single document in the submission and contains all metadata necessary for validation, navigation, and lifecycle management.

Leaf Element Anatomy

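
One way to see how the pieces of a leaf fit together is to generate one. The following sketch assembles a leaf with the attributes listed in the reference table in this section. The helper name `make_leaf`, the sample bytes, and the `title` child element are assumptions for illustration (the title is inferred from the metadata list earlier in this guide).

```python
import hashlib
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def make_leaf(leaf_id, href, content, title, operation="new", modified_file=None):
    """Build a leaf element whose checksum is computed from the file bytes."""
    attrs = {
        "ID": leaf_id,
        "operation": operation,
        "checksum": hashlib.md5(content).hexdigest(),
        "checksum-type": "md5",
        f"{{{XLINK_NS}}}href": href,
    }
    if modified_file is not None:
        # Only lifecycle operations that touch an earlier leaf carry this.
        attrs["modified-file"] = modified_file
    leaf = ET.Element("leaf", attrs)
    ET.SubElement(leaf, "title").text = title  # assumed child element
    return leaf


demo_bytes = b"%PDF-1.7 demo bytes"  # stand-in for real PDF content
leaf = make_leaf("m3-s-stability-001", "m3/32-body-data/spec.pdf", demo_bytes, "Specification")
leaf_xml = ET.tostring(leaf, encoding="unicode")
print(leaf_xml)
```

Computing the checksum from the same bytes that get written to disk keeps the attribute and the file from drifting apart.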

Leaf Element Attributes Reference

| Attribute | Description | Required | Values |
| --- | --- | --- | --- |
| `ID` | Unique identifier for the document | Yes | Alphanumeric, unique within submission |
| `operation` | Lifecycle action for this document | Yes | new, replace, append, delete |
| `checksum` | MD5 hash of the file content | Yes | 32-character hexadecimal string |
| `checksum-type` | Hash algorithm used | Yes | md5 (only valid value) |
| `modified-file` | Original leaf ID when replacing | Conditional | ID of leaf being modified |
| `xlink:href` | Relative path to the document | Yes | Valid relative path from sequence root |

Leaf ID Best Practices

The leaf ID must be unique within the entire submission lifecycle, not just the current sequence.

Recommended leaf ID conventions:

| Module | Pattern | Example |
| --- | --- | --- |
| Module 2 | `m2-[section]-[number]` | `m2-23-qos-001` |
| Module 3 | `m3-[substance/product]-[section]-[number]` | `m3-s-stability-001` |
| Module 4 | `m4-[study-type]-[number]` | `m4-tox-repeat-001` |
| Module 5 | `m5-[study-id]-[doc-type]` | `m5-study001-csr` |

Leaf ID rules:

  • Must start with a letter (not a number)
  • Can contain letters, numbers, hyphens, and underscores
  • No spaces or special characters
  • Maximum 128 characters (recommended under 64)
  • Must remain consistent across sequences when referencing the same document

Key Statistic

Using inconsistent leaf IDs across sequences is one of the most common causes of lifecycle operation failures. Establish a leaf ID naming convention before your first submission and maintain it throughout the product lifecycle.
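
The leaf ID rules listed above translate directly into a small validator. This is a minimal sketch of those rules as stated in this guide, not an official check. The function name and the three result labels are hypothetical.

```python
import re

# Letter first, then letters, digits, hyphens, or underscores; 128 characters at most.
LEAF_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,127}")
RECOMMENDED_LIMIT = 64  # the guide recommends staying under 64 characters


def check_leaf_id(leaf_id: str) -> str:
    """Classify a leaf ID as 'invalid', 'long', or 'ok'."""
    if LEAF_ID_PATTERN.fullmatch(leaf_id) is None:
        return "invalid"
    if len(leaf_id) >= RECOMMENDED_LIMIT:
        return "long"
    return "ok"


for candidate in ["m3-s-stability-001", "001-spec", "m3 spec"]:
    print(candidate, "->", check_leaf_id(candidate))
```

Running a check like this when IDs are first assigned is cheaper than discovering a bad ID after later sequences already reference it.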

Pro Tip

Create a master leaf ID registry spreadsheet documenting every leaf ID, its module/section, first appearance sequence, and all subsequent lifecycle operations. This prevents duplicate IDs and makes it easy to verify modified-file references are correct when using replace operations.

Regional.xml: Region-Specific eCTD XML Requirements

The regional.xml file contains Module 1 content and region-specific metadata. Unlike index.xml, regional files vary significantly between regulatory authorities.

Regional XML File Naming by Agency

| Region | File Name | DTD Reference |
| --- | --- | --- |
| FDA (US) | `us-regional.xml` | `us-regional-v3-0.dtd` |
| EMA (EU) | `eu-regional.xml` | `eu-regional.dtd` |
| Health Canada | `ca-regional.xml` | `ca-regional.dtd` |
| PMDA (Japan) | `jp-regional.xml` | `jp-regional.dtd` |
| TGA (Australia) | `au-regional.xml` | `au-regional.dtd` |

FDA us-regional.xml Structure

FDA Application Type Codes

The us-regional.xml requires specific application type codes:

| Code | Application Type | Description |
| --- | --- | --- |
| `nda` | New Drug Application | Original NDA submission |
| `anda` | Abbreviated New Drug Application | Generic drug application |
| `bla` | Biologics License Application | Biologic product application |
| `ind` | Investigational New Drug | Clinical trial application |
| `dmf` | Drug Master File | Manufacturing information file |
| `pmsr` | Post-Marketing Safety Report | Safety update reporting |

FDA Submission Type Codes

| Code | Submission Type | Usage |
| --- | --- | --- |
| `orig` | Original Application | Initial submission |
| `efficacy-suppl` | Efficacy Supplement | New indication |
| `manuf-suppl` | Manufacturing Supplement | CMC changes |
| `labeling-suppl` | Labeling Supplement | Labeling changes |
| `safety-suppl` | Safety Supplement | Safety updates |
| `annual-report` | Annual Report | IND/NDA annual reports |
| `amendment` | Amendment | Pre-approval amendments |

EMA eu-regional.xml Structure

Regional XML Comparison Table

| Element | FDA (US) | EMA (EU) | Health Canada |
| --- | --- | --- | --- |
| Root element | `us:us-regional` | `eu:eu-regional` | `ca:ca-regional` |
| Application ID | `us:application-number` | `eu:procedure-number` | `ca:control-number` |
| Module 1 structure | Sections 1.1-1.16 | Sections 1.0-1.10 | Sections 1.0-1.7 |
| Forms section | `m1-1-forms` | `m1-2-application-form` | `m1-1-forms` |
| Labeling section | `m1-14-labeling` | `m1-3-pi` | `m1-3-1-product-monograph` |
| DTD version | v3.0 | v3.0 | v3.0 |

eCTD DTD Specifications: Technical Reference

The Document Type Definition (DTD) specifies the valid structure, elements, and attributes for eCTD XML files. Understanding DTD specifications is essential for troubleshooting validation errors.

ICH eCTD DTD Versions

| Version | Release | Status | Key Changes |
| --- | --- | --- | --- |
| ich-ectd-2-0.dtd | 2005 | Obsolete | Initial stable release |
| ich-ectd-3-0.dtd | 2008 | Legacy | Added lifecycle operations |
| ich-ectd-3-2.dtd | 2016 | Current Standard | Refined element structure |
| eCTD v4.0 XSD | 2024 | Implementing | Schema-based validation |

DTD Location Requirements

DTD files must be placed in the util/dtd folder within each sequence.

Key DTD Element Definitions

The ich-ectd-3-2.dtd defines the valid elements for index.xml, including the module container elements and the leaf element.

DTD Validation Rules

| Rule | Requirement | Error if Violated |
| --- | --- | --- |
| Element order | Elements must appear in DTD-specified order | `Element out of order` |
| Required attributes | ID, operation, checksum, xlink:href always required | `Missing required attribute` |
| Attribute values | operation must be new/replace/append/delete | `Invalid attribute value` |
| ID uniqueness | Each ID must be unique within document | `Duplicate ID value` |
| IDREF validity | modified-file must reference existing ID | `Invalid IDREF` |
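
Two of the rules in the table, required attributes and ID uniqueness, are easy to pre-check before running a full DTD validation. The sketch below assumes unprefixed `leaf` elements and uses a hypothetical three-leaf sample. The message wording is illustrative and does not reproduce any gateway's output.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
# Qualified attribute name -> name to show in messages.
REQUIRED = {"ID": "ID", "operation": "operation", "checksum": "checksum", XLINK_HREF: "xlink:href"}

SAMPLE = """<root xmlns:xlink="http://www.w3.org/1999/xlink">
  <leaf ID="a-1" operation="new" checksum="0" xlink:href="m2/a.pdf"/>
  <leaf ID="a-1" operation="new" checksum="0" xlink:href="m2/b.pdf"/>
  <leaf ID="a-2" operation="new" xlink:href="m2/c.pdf"/>
</root>"""


def check_leaves(root: ET.Element) -> list[str]:
    """Report missing required attributes, then duplicate leaf IDs."""
    problems = []
    leaves = list(root.iter("leaf"))
    for leaf in leaves:
        for attr, label in REQUIRED.items():
            if leaf.get(attr) is None:
                problems.append(f"{leaf.get('ID')}: missing required attribute {label}")
    counts = Counter(leaf.get("ID") for leaf in leaves)
    for leaf_id, count in counts.items():
        if leaf_id is not None and count > 1:
            problems.append(f"duplicate ID value: {leaf_id}")
    return problems


leaf_problems = check_leaves(ET.fromstring(SAMPLE))
print(leaf_problems)
```

Because ElementTree does not validate, it happily parses the duplicate IDs, which is what lets the script report them instead of failing on load.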

eCTD Lifecycle Operations: new, replace, append, delete

Lifecycle operations define how documents change across submission sequences. Proper use of lifecycle operations is critical for maintaining submission integrity and regulatory compliance.

Operation Definitions and Usage

| Operation | Purpose | When to Use | modified-file Required |
| --- | --- | --- | --- |
| new | First appearance of document | Initial sequence or new content | No |
| replace | Replaces entire document | Updated version of existing doc | Yes |
| append | Adds content to existing doc | Additional data for same topic | Yes |
| delete | Removes document from submission | Withdrawn or superseded content | Yes |
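
The modified-file column above can be enforced mechanically. The following sketch applies only the rules in that table to a hypothetical set of leaves. The element layout and message text are assumptions for the demo.

```python
import xml.etree.ElementTree as ET

VALID_OPERATIONS = {"new", "replace", "append", "delete"}
NEEDS_MODIFIED_FILE = {"replace", "append", "delete"}

SAMPLE = """<sequence>
  <leaf ID="spec-001" operation="new"/>
  <leaf ID="spec-002" operation="replace" modified-file="spec-001"/>
  <leaf ID="spec-003" operation="delete"/>
  <leaf ID="spec-004" operation="update"/>
</sequence>"""


def check_operations(root: ET.Element) -> list[str]:
    """Flag unknown operations and operations missing modified-file."""
    problems = []
    for leaf in root.iter("leaf"):
        leaf_id = leaf.get("ID")
        operation = leaf.get("operation")
        if operation not in VALID_OPERATIONS:
            problems.append(f"{leaf_id}: invalid operation {operation!r}")
        elif operation in NEEDS_MODIFIED_FILE and not leaf.get("modified-file"):
            problems.append(f"{leaf_id}: {operation} requires modified-file")
    return problems


operation_problems = check_operations(ET.fromstring(SAMPLE))
print(operation_problems)
```

The check deliberately says nothing about whether the referenced leaf exists. That is a separate question that needs the earlier sequences.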

Operation Examples with XML

  • New operation: initial submission
  • Replace operation: updated document
  • Delete operation: document withdrawal

Lifecycle Operation Best Practices

Operation selection criteria:

| Scenario | Correct Operation | Incorrect Operation |
| --- | --- | --- |
| First time document appears | `new` | replace (nothing to replace) |
| Document content updated | `replace` | new (loses history) |
| Additional stability data | `append` or `new` leaf | replace (overwrites existing) |
| Document no longer relevant | `delete` | Just omitting (ghost reference) |
| Correcting title only | Keep the same leaf with the corrected title | new (breaks traceability) |

Key Statistic

Using `new` instead of `replace` for updated documents breaks the submission audit trail. Regulatory agencies track document history using the modified-file reference chain - breaking this chain can raise data integrity questions.

Sequence-Based Lifecycle Tracking

Across multiple sequences, the lifecycle chain must be maintained. When each replacement leaf points back to its predecessor through modified-file (for example, spec-002 replacing spec-001, then spec-003 replacing spec-002), the result is an auditable chain: spec-001 -> spec-002 -> spec-003
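
Tracing such a chain backwards is straightforward to automate. The sketch below assumes you have already collected a mapping from each leaf ID to the ID named in its modified-file attribute (None for a leaf submitted as new). The `registry` dictionary and function name are hypothetical.

```python
def trace_chain(leaf_id: str, modified_file_of: dict) -> list:
    """Follow modified-file references back to the original leaf.

    Returns the chain oldest-first. Raises ValueError when a reference
    points at an unknown leaf or when the references form a cycle.
    """
    chain = [leaf_id]
    while True:
        current = chain[-1]
        if current not in modified_file_of:
            raise ValueError(f"broken chain: {current} is not a known leaf")
        previous = modified_file_of[current]
        if previous is None:
            return chain[::-1]
        if previous in chain:
            raise ValueError(f"cycle detected at {previous}")
        chain.append(previous)


# Hypothetical registry: leaf ID -> ID it modifies (None for a new leaf).
registry = {"spec-001": None, "spec-002": "spec-001", "spec-003": "spec-002"}
print(" -> ".join(trace_chain("spec-003", registry)))
```

If your backbone stores modified-file as a path with a fragment rather than a bare ID, extract the ID part before building the mapping.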

Pro Tip

Before submitting a multi-sequence eCTD, trace the full lifecycle chain for each document by following the modified-file references backwards from the final sequence. Any broken link in the chain is a critical error that will likely be caught by the gateway. Regulatory agencies audit these chains during reviews to verify data integrity.

MD5 Checksum in eCTD: File Integrity Verification

The MD5 checksum is a critical component of the eCTD XML backbone that ensures file integrity throughout the submission process.

What Is the MD5 Checksum?

The MD5 (Message-Digest Algorithm 5) checksum is a 128-bit hash value that acts as a fingerprint of file content. Any accidental modification to a file, even a single byte change, produces a different checksum in practice.

MD5 checksum characteristics:

  • 32-character hexadecimal string
  • Deterministic (same file always produces same checksum)
  • One-way (cannot reverse-engineer file from checksum)
  • Collisions are extremely unlikely for accidental changes, although MD5 is no longer considered resistant to deliberately crafted collisions

Checksum Format in eCTD XML

In the backbone, each leaf stores the hash in its checksum attribute as a 32-character hexadecimal string, with checksum-type set to md5.

Checksum Validation Process

| Step | Process | Validation Check |
| --- | --- | --- |
| 1 | Publisher calculates MD5 of PDF file | Store in leaf checksum attribute |
| 2 | Gateway receives submission | Recalculates MD5 of each file |
| 3 | Gateway compares checksums | If mismatch, reject submission |
| 4 | Reviewer accesses document | Integrity verified via checksum |

Common Checksum Errors and Prevention

Error causes:

| Cause | Description | Prevention |
| --- | --- | --- |
| Post-calculation modification | File edited after checksum generated | Lock files immediately after publishing |
| Antivirus modification | AV software modifies file during scan | Disable real-time scanning during assembly |
| Compression issues | ZIP/unzip alters file content | Verify after decompression |
| Character encoding | Line ending differences (CR/LF) | Standardize on UTF-8 |
| Publishing tool error | Tool calculates incorrectly | Verify with independent tool |

Verification command (cross-platform):

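
One platform-independent way to recompute a checksum is a few lines of Python using the standard `hashlib` module. This is a sketch for independent verification, not the command originally shown here. The file name and contents are stand-ins for a real submission document.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in binary chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:  # binary mode: no line-ending translation
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: Path, declared: str) -> bool:
    """Compare a file's MD5 with the value declared in the leaf."""
    return md5_of_file(path) == declared.strip().lower()


# Demo: the file is changed after its checksum was recorded.
with tempfile.TemporaryDirectory() as tmp:
    document = Path(tmp) / "spec.pdf"
    document.write_bytes(b"hello")
    declared = md5_of_file(document)
    ok_before = checksum_matches(document, declared)
    document.write_bytes(b"hello!")
    ok_after = checksum_matches(document, declared)

print(ok_before, ok_after)
```

Reading in binary mode matters: a text-mode read or transfer can rewrite line endings and change the hash even though the document looks identical.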

Common eCTD XML Validation Errors and Solutions

XML validation errors are the leading cause of eCTD gateway rejections. Understanding common errors and their solutions enables faster troubleshooting and prevention.

Top 10 eCTD XML Validation Errors

| Rank | Error | Frequency | Impact | Typical Fix Time |
| --- | --- | --- | --- | --- |
| 1 | Schema/DTD validation failure | 23% | Critical | 2-4 hours |
| 2 | Invalid leaf operation | 15% | Critical | 1-2 hours |
| 3 | Checksum mismatch | 12% | Critical | 1 hour |
| 4 | Missing required element | 10% | Critical | 1-2 hours |
| 5 | Invalid attribute value | 9% | High | 30 min - 1 hour |
| 6 | Duplicate leaf ID | 8% | High | 1 hour |
| 7 | Broken xlink:href path | 7% | High | 30 min |
| 8 | Invalid modified-file reference | 6% | High | 1-2 hours |
| 9 | Character encoding error | 5% | Medium | 30 min |
| 10 | Namespace declaration error | 5% | Medium | 30 min |

Error 1: Schema/DTD Validation Failure

Symptoms:

  • Gateway returns "XML parsing error" or "DTD validation failed"
  • Validation tool reports "Element not allowed" or "Invalid content"

Common causes and solutions:

| Cause | Error Message | Solution |
| --- | --- | --- |
| Missing closing tag | `Element not closed` | Add missing `</element>` tag |
| Wrong element order | `Element out of order` | Reorder per DTD specification |
| Invalid nesting | `Element not allowed here` | Check parent-child relationships |
| Wrong DTD version | `DTD not found` | Verify DTD path and version |

Prevention: Use XML-aware editors with real-time validation against the DTD.

Error 2: Invalid Leaf Operation

Symptoms:

  • "Invalid operation for leaf" error
  • "modified-file reference not found" error

Cause-solution mapping:

| Scenario | Error | Correct Approach |
| --- | --- | --- |
| Using `replace` without prior leaf | No leaf to replace | Use `new` for first appearance |
| Using `new` for updated document | Breaks lifecycle chain | Use `replace` with modified-file |
| modified-file points to wrong ID | Reference not found | Verify ID matches prior sequence |
| Using `delete` without modified-file | Missing reference | Include modified-file attribute |

Error 3: Checksum Mismatch

Symptoms:

  • "MD5 checksum does not match" error
  • "File integrity verification failed"

Diagnostic steps:

  1. Regenerate checksum for the file independently
  2. Compare with the checksum value in the backbone XML (index.xml or the regional file)
  3. If different, file was modified after publishing
  4. Check for antivirus, compression, or transfer issues

Error 4: Invalid xlink:href Path

Symptoms:

  • "File not found" error
  • "Invalid href reference"

Path validation checklist:

| Requirement | Correct | Incorrect |
| --- | --- | --- |
| Relative path | `m3/32-body-data/spec.pdf` | `C:\ectd\m3\32-body-data\spec.pdf` |
| Case sensitivity | `m3/32-body-data/Spec.pdf` if file is Spec.pdf | `m3/32-body-data/spec.pdf` when file is Spec.pdf |
| Forward slashes | `m3/32-body-data/spec.pdf` | `m3\32-body-data\spec.pdf` |
| No spaces | `m3/32-body-data/spec-final.pdf` | `m3/32-body-data/spec final.pdf` |
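
The checklist above can be turned into a quick pre-flight check. The sketch below covers the four rows of the table. The function names and problem labels are hypothetical, and the demo files are created in a temporary folder standing in for a sequence root.

```python
import re
import tempfile
from pathlib import Path


def href_problems(href: str) -> list[str]:
    """Check an xlink:href value against the syntactic rules in the checklist."""
    problems = []
    if "\\" in href:
        problems.append("uses backslashes")
    if href.startswith("/") or re.match(r"[A-Za-z]:", href):
        problems.append("is an absolute path")
    if " " in href:
        problems.append("contains spaces")
    return problems


def exists_with_exact_case(sequence_root: Path, href: str) -> bool:
    """True only if every path component exists with exactly this casing."""
    current = sequence_root
    for part in href.split("/"):
        if not current.is_dir():
            return False
        # Compare against the names stored on disk, so the check also
        # works on case-insensitive file systems.
        if part not in {entry.name for entry in current.iterdir()}:
            return False
        current = current / part
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "m3" / "32-body-data").mkdir(parents=True)
    (root / "m3" / "32-body-data" / "Spec.pdf").write_bytes(b"demo")
    exact = exists_with_exact_case(root, "m3/32-body-data/Spec.pdf")
    wrong_case = exists_with_exact_case(root, "m3/32-body-data/spec.pdf")

print(href_problems("m3/32-body-data/spec final.pdf"), exact, wrong_case)
```

The case check is the one most often missed on Windows workstations, where a mismatched name opens fine locally and fails only after upload.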

XML Validation Against Schema: eCTD v4.0 Considerations

eCTD version 4.0 transitions from DTD-based to XSD (XML Schema Definition) validation, introducing more rigorous validation capabilities.

DTD vs. XSD Comparison

| Feature | DTD (v3.2.2) | XSD (v4.0) |
| --- | --- | --- |
| Syntax | Own syntax | XML-based |
| Data types | Limited (CDATA, ID) | Rich types (date, integer, etc.) |
| Validation strength | Basic structure | Structure + content |
| Namespace support | Limited | Full support |
| Extensibility | Difficult | Built-in extension mechanisms |
| Controlled vocabulary | External | Integrated |

eCTD v4.0 XML Structure Changes

Key structural changes in v4.0:

Controlled Vocabulary in v4.0

eCTD v4.0 introduces controlled vocabulary for standardized values:

| Element | v3.2.2 Approach | v4.0 Approach |
| --- | --- | --- |
| Operation | Enumerated attribute value (new, replace) | CV code (1=new, 2=replace) |
| Application type | Regional code | Global CV |
| Submission type | Regional code | Global CV with regional extensions |
| Document type | Title text | Standardized CV code |

Key Takeaways

The eCTD XML backbone is the set of XML files (index.xml and regional.xml) that provide navigation, metadata, and lifecycle management for Electronic Common Technical Document submissions. The backbone creates a hyperlinked table of contents using leaf elements that define each document's location, checksum, and lifecycle operation. All major regulatory agencies (FDA, EMA, Health Canada, PMDA) require valid XML backbone files for eCTD submission acceptance.

  • The eCTD XML backbone consists of index.xml and regional.xml files that together create the navigable structure, metadata layer, and lifecycle management system for regulatory submissions. XML validation errors account for 23% of gateway rejections - the largest single error category.
  • Leaf elements are the building blocks of eCTD XML containing unique IDs, lifecycle operations (new, replace, append, delete), MD5 checksums, and file path references. Consistent leaf ID conventions across sequences are essential for maintaining audit trails.
  • Regional XML files vary significantly between agencies with FDA using us-regional.xml (sections 1.1-1.16), EMA using eu-regional.xml (sections 1.0-1.10), and each requiring region-specific DTD validation and application type codes.
  • DTD validation ensures XML structural compliance while MD5 checksums verify file integrity. Both validation layers must pass for gateway acceptance. eCTD v4.0 introduces XSD-based validation with richer data typing and integrated controlled vocabularies.
  • Lifecycle operations must follow strict rules where `replace` requires modified-file references to prior leaf IDs, creating an auditable chain across sequences. Breaking this chain raises data integrity concerns during regulatory review.

Next Steps

Understanding the eCTD XML backbone is essential for regulatory submission success, but manually verifying XML structure across thousands of documents is error-prone and time-consuming. XML validation errors remain the leading cause of gateway rejections.

Eliminate XML backbone errors before submission. Assyro's AI-powered platform validates your eCTD XML structure against all ICH M8 and regional DTD specifications, checking leaf elements, lifecycle operations, checksums, and cross-references in real-time during publishing - not just at final assembly.
