Difference between revisions of "EDelivery"

From DE4A

Revision as of 13:03, 10 March 2023

The exchange of a single document between a DE and a DO always requires two eDelivery exchanges: the first one is initiated by the DE and targeted at the DO, and the second one is initiated by the DO and targeted at the DE. Technically speaking, both transmissions are “requests” even though their semantics are “request” and “response”.

The foundation of the document exchange is the so-called “4-corner model”, which differentiates between the business sender of a document (Corner 1 aka C1), the technical sender of a document (Corner 2 aka C2), the technical receiver of a document (Corner 3 aka C3) and the business receiver of a document (Corner 4 aka C4). The assignment of the corners depends on the direction of the message exchange. In DE4A the “DE4A Connector” (sometimes just “Connector”) can play the role of both DR and DT and therefore acts as C2 or C3, depending on whether a message is sent or received.

eDelivery business request from DE to DO

The figure above depicts the structural message exchange initiated by DE (C1), sent by DR (C2), received by DT (C3) and forwarded to DO (C4). The message exchange between C1 and C2 as well as the message exchange between C3 and C4 are not specified by eDelivery, even though AS4 may be used for this, but they must be defined by the DE4A Connector.

If the DO sends a message back to the DE, the order of the messages changes, and so does the corner assignment. As shown in the following figure, the DO becomes C1 and forwards the response to the DT, which is now C2. The AS4 transmission targets the DR as C3, which in turn forwards the payload to the DE, which is C4 in this scenario.

eDelivery business response from DO to DE

This duality of the message exchange means that each of the named nodes (DE, DR, DT and DO) requires both sending and receiving capabilities.
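The corner assignment for both directions can be summarised in a few lines of Python. This is only an illustrative sketch: the class name `CornerAssignment`, the function `assign_corners` and the direction labels "request" and "response" are hypothetical and not part of any DE4A component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CornerAssignment:
    c1: str  # business sender
    c2: str  # technical sender
    c3: str  # technical receiver
    c4: str  # business receiver


def assign_corners(direction: str) -> CornerAssignment:
    """Return which DE4A node plays which corner for a given direction."""
    if direction == "request":
        # DE asks DO for a document: DE -> DR -> DT -> DO
        return CornerAssignment(c1="DE", c2="DR", c3="DT", c4="DO")
    if direction == "response":
        # DO answers DE: DO -> DT -> DR -> DE
        return CornerAssignment(c1="DO", c2="DT", c3="DR", c4="DE")
    raise ValueError(f"unknown direction: {direction!r}")
```

Every node appears once as a sender and once as a receiver across the two directions, which is why all four nodes need both capabilities.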

For the sake of clarity, the figures in the rest of the document only show messages flowing from DE to DO. Unless stated otherwise, each figure is equally valid for the return direction from DO to DE.

The eDelivery message exchange in DE4A uses the so-called “Dynamic Discovery”, which extends basic eDelivery by adding the usage of SML and SMP. Both components as well as the lookup process are described below.

Identification of components

Each C1 and C4 of a message exchange is called a “Participant” and is uniquely identified by a “Participant Identifier”. The nodes C2 and C3 are not participants and have no such identifier; they are only accessed by URLs.

Different types of documents exchanged via eDelivery are classified via “Document Type Identifiers”. The orchestrations in which document types are exchanged are classified via “Process Identifiers”.

There is a separate policy document on the usage of identifiers within the DE4A network. This document, called DE4A Policy for use of identifiers.pdf, contains the details about the following identifier types:

  • Participant Identification
  • Type Identification
  • Process Identification
  • Transport Profile Identification

Each Participant ID, Document Type ID and Process ID consists of two separate parts: one “scheme” part and one “value” part. A scheme defines the layout and constraints of the value. This allows new types of identifiers to be added in different scenarios without interfering with the identifiers already in use.
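The two-part structure can be illustrated with a small Python sketch. It assumes that the scheme and the value are joined by the first "::" in the identifier, which is inferred from the examples used in this document (such as iso6523-actorid-upis::9915:de4atest) and should be checked against the policy document. The names `Identifier` and `parse_identifier` are hypothetical.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    scheme: str
    value: str


def parse_identifier(text: str) -> Identifier:
    """Split 'scheme::value' at the first '::' (assumed separator)."""
    scheme, separator, value = text.partition("::")
    if not separator or not scheme or not value:
        raise ValueError(f"not a scheme::value identifier: {text!r}")
    return Identifier(scheme=scheme, value=value)
```

For the participant identifier iso6523-actorid-upis::9915:de4atest this yields the scheme iso6523-actorid-upis and the value 9915:de4atest.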

The policy document was heavily inspired by the “Peppol Policy for use of Identifiers”, the identifier reference document of the Peppol network.

In addition to the policy document, the project also created a set of Code Lists that contain the allowed values.

SML/DNS

The terms SML and DNS are often used synonymously, but the two components serve different purposes. The SML is the registry of all known SMPs and Participants from any project or service that uses the dynamic discovery mode of eDelivery, and it is responsible for creating DNS entries in a specific DNS zone. For every Participant a unique DNS entry is created. Only SMPs interact with the SML. Other components never explicitly interact with the SML.

During the main message exchange between C2 and C3 the DNS is queried for the SMP URL of C3. The SML is neither queried in the message exchange nor before or after. Due to the distributed nature of DNS, this message exchange is performed without a single point of failure, which is one of the key benefits of eDelivery.

The SML is operated centrally by the European Commission. Its service is kindly offered free of charge to the DE4A project.

The SML creates DNS records for each SMP itself (see SMP) and for each Participant (aka Service Group; see SMP) pointing to the owning SMP. These DNS NAPTR records need to be read by the sending party (the DE4A Connector), to determine the URL of the SMP that needs to be queried.

The SML and the DNS are centralized third-party components for DE4A. For the SMPs to be able to communicate with the SML, a Client Certificate is needed (see Certificates), so only legitimate requestors can create DE4A participants in the SML.

SML query process in DE4A

DE4A only uses the production SML and the dedicated domain de4a.edelivery.tech.ec.europa.eu. for its purposes.

The lookup process from the Connector solely requires the Participant Identifier (see Identification of components) of the receiver. Using a SHA-256 hash value of the Participant Identifier, a unique domain name is created, which will be looked up from the DNS using the “NAPTR” record query type. This NAPTR record then contains the base path of the SMP to be queried. The details of the SMP query are described in chapter 3.1.4.5.

The structure of the Participant IDs used in DE4A is described in chapter 3.1.3.2.

A real-life example for looking up the Participant Identifier iso6523-actorid-upis::9915:de4atest looks like this:

  • Apply the following algorithm to the Participant Identifier:
    • strip-trailing (base32 (sha256 (lowercase (ID-VALUE))), "=") + "." + ID-SCHEME + "." + SML-ZONE-NAME
    • In the above example identifier, the “ID-SCHEME” is “iso6523-actorid-upis” and the “ID-VALUE” is “9915:de4atest”.
  • The created domain name is: 54VMPCQA26DNZS74VHQOKJ7U6IRBBI5KPMQ6AO3KVCQC3F6YR2YA.iso6523-actorid-upis.de4a.edelivery.tech.ec.europa.eu
  • The look up can be done with the “dig” tool on the command line:
    Figure: DNS NAPTR record query result
  • The relevant part of the response is "!.*!https://de4a-smp.usp.gv.at!". It contains the link to the SMP of this particular Participant Identifier, which can be queried for the details. The link is embedded in a regular expression as defined by the NAPTR record specification.
  • Using the extracted URL https://de4a-smp.usp.gv.at the regular SMP query process, as described in chapter 3.1.4.5, can be performed.
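The lookup steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the Connector's implementation: the function names are hypothetical, the identifier value is assumed to be hashed as UTF-8 bytes, and the NAPTR regular expression field is assumed to have the simple delimiter/pattern/replacement form shown in the example. The DNS query itself is not performed here.

```python
import base64
import hashlib

# DNS zone named in the text above.
SML_ZONE = "de4a.edelivery.tech.ec.europa.eu"


def participant_dns_name(participant_id: str, sml_zone: str = SML_ZONE) -> str:
    """Build the domain name to query for a 'scheme::value' identifier."""
    scheme, _, value = participant_id.partition("::")
    # sha256 over the lowercased value (UTF-8 encoding is an assumption)
    digest = hashlib.sha256(value.lower().encode("utf-8")).digest()
    # base32 without the trailing '=' padding
    label = base64.b32encode(digest).decode("ascii").rstrip("=")
    return f"{label}.{scheme}.{sml_zone}"


def smp_url_from_naptr_regexp(regexp: str) -> str:
    """Extract the replacement part from a field such as '!.*!https://host!'."""
    delimiter = regexp[0]
    parts = regexp.split(delimiter)
    # Expected shape: ['', pattern, replacement, flags-or-empty]
    if len(parts) != 4 or parts[0] != "":
        raise ValueError(f"unexpected NAPTR regexp field: {regexp!r}")
    return parts[2]
```

Calling `participant_dns_name("iso6523-actorid-upis::9915:de4atest")` produces a name whose first label is 52 characters long, followed by the scheme and the SML zone, and `smp_url_from_naptr_regexp("!.*!https://de4a-smp.usp.gv.at!")` returns the SMP base URL.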

As can be seen from this description, the SML system itself is not invoked in the lookup – only the DNS system is involved. Because the DNS system is inherently replicated, a fast and fail-safe operation is guaranteed.

Domains and types of Participant IDs used in DE4A

In DE4A, there are two types of participants:

  • Imaginary participants: participants for whom example evidence and fake data have been made up to be used for testing purposes.
  • Real participants: the actual DE4A partners participating in the pilots of the project.

At the same time, for the real participants, there are also two “domains”:

  • Test domain: only fake data is used for test purposes. It is the domain related to the DE4A Playground, where participants return datasets from test sources.
  • Pilot domain: real data from real citizens and companies is expected to be used. It is the environment used for the running phase of the pilots, where participants access their real registries and return real evidence from them.

Finally, the project is divided into two pilot iterations, each of which needs a copy of those two domains, and some overlap between them occurs.

Because the DE4A project uses the dynamic discovery of eDelivery, the same participant ID cannot point to different SMPs in the DNS depending on the domain concerned. Thus, a different participant ID scheme has been defined for each domain:

  • PRE1a: Imaginary participants used for testing during iteration 1 of the project. The related evidence is stored in the Mocked DO of the Playground iteration 1.
    • Scheme identifier used: “9999”.
    • Suffix used: “-it1”.
    • E.g. iso6523-actorid-upis::9999:ess2833002e-it1
  • PRE1b: Real participants queried about testing data through the Playground iteration 1. Simulated data will be returned when querying for the participants. The related evidence is located in each partner’s infrastructure, within their test domains.
    • Scheme identifier used: “99XX”, where “XX” depends on the identifier each participant is using.
    • Suffix used: “-it1”.
    • E.g. iso6523-actorid-upis::9920:ess2833002e-it1
  • PRO1: Real participants queried about actual data during the execution of phase 1 of the pilot. The related evidence is located in each partner’s infrastructure, within their pilot domains.
    • Scheme identifier used: “99XX”, where “XX” depends on the identifier each participant is using. It is the “real” participant ID.
    • Suffix used: none, since there is no overlap between phases 1 and 2 of the pilots.
    • E.g. iso6523-actorid-upis::9920:ess2833002e
  • PRE2a: Imaginary participants used for testing during iteration 2 of the project. The related evidence is stored in the Mocked DO of the Playground iteration 2.
    • Scheme identifier used: “9999”.
    • Suffix used: “-mock-it2”.
    • E.g. iso6523-actorid-upis::9999:ess2833002e-mock-it2
  • PRE2b: Real participants queried about testing data through the Playground iteration 2. Simulated data will be returned when querying for the participants. The related evidence is located in each partner’s infrastructure, within their test domains.
    • Scheme identifier used: “99XX”, where “XX” depends on the identifier each participant is using.
    • Suffix used: “-test-it2”.
    • E.g. iso6523-actorid-upis::9920:ess2833002e-test-it2
  • PRO2: Real participants queried about actual data during the execution of phase 2 of the pilot. The related evidence is located in each partner’s infrastructure, within their pilot domains.
    • Scheme identifier used: “99XX”, where “XX” depends on the identifier each participant is using. It is the “real” participant ID.
    • Suffix used: none, since there is no overlap between phases 1 and 2 of the pilots.
    • E.g. iso6523-actorid-upis::9920:ess2833002e
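The naming rules listed above can be captured in a small lookup table. The following Python sketch is illustrative only: the names `DOMAIN_RULES` and `build_participant_id` are hypothetical, and the helper simply concatenates the parts the way the examples in the list do.

```python
from typing import Dict, Optional, Tuple

ID_SCHEME = "iso6523-actorid-upis"

# domain -> (forced scheme identifier or None for the participant's own "99XX", suffix)
DOMAIN_RULES: Dict[str, Tuple[Optional[str], str]] = {
    "PRE1a": ("9999", "-it1"),
    "PRE1b": (None, "-it1"),
    "PRO1": (None, ""),
    "PRE2a": ("9999", "-mock-it2"),
    "PRE2b": (None, "-test-it2"),
    "PRO2": (None, ""),
}


def build_participant_id(domain: str, scheme_code: str, local_value: str) -> str:
    """Build the participant ID for one domain from the listed rules."""
    forced_code, suffix = DOMAIN_RULES[domain]  # KeyError for unknown domains
    # Imaginary participants always use the forced "9999" scheme identifier.
    code = forced_code if forced_code is not None else scheme_code
    return f"{ID_SCHEME}::{code}:{local_value}{suffix}"
```

For example, `build_participant_id("PRE2b", "9920", "ess2833002e")` gives iso6523-actorid-upis::9920:ess2833002e-test-it2, matching the PRE2b example above.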

These different participant IDs must be stored in their corresponding IDK components and SMPs and configured to return the proper information:

  • PRE1a: Imaginary participants during the first iteration
    • IDK: none (the DemoUI of the Playground knows which participants are imaginary and which DO to query for them, namely the Mocked DO of the first iteration).
    • SMP: shared SMP of the Playground IT1. It returns the routing information of the targeted Connector (DR/DT) of the Playground IT1.
  • PRE1b: Simulated data from real participants during the first iteration
    • IDK: Mocked IDK of the Playground IT1.
    • SMP: shared SMP of the Playground IT1. It returns the routing information of the targeted Connector (DR/DT) of the requested participant.
  • PRO1: Real data from real participants during the first iteration
    • IDK: Mocked IDK of the pilot running environment.
    • SMP: national SMP of the requested participant. It returns the routing information of the targeted Connector (DR/DT) of the requested participant.

For the first pilot running phase, partners who had not deployed their own SMPs were able to use a shared SMP provided by one of the partners with an available SMP. In such cases, the routing information of those partners was stored in that shared SMP.

  • PRE2a: Imaginary participants during the second iteration
    • IDK: Central IAL. Its information is automatically updated and fed from the SMPs connected to it.
    • SMP: shared SMP of the Playground IT2. It returns the routing information of the targeted Connector (DR/DT) of the Playground IT2.
  • PRE2b: Simulated data from real participants during the second iteration
    • IDK: Central IAL. Its information is automatically updated and fed from the SMPs connected to it.
    • SMP: national SMP of the requested participant. It returns the routing information of the targeted Connector (DR/DT) of the requested participant.
  • PRO2: Real data from real participants during the second iteration
    • IDK: Central IAL. Its information is automatically updated and fed from the SMPs connected to it.
    • SMP: national SMP of the requested participant. It returns the routing information of the targeted Connector (DR/DT) of the requested participant.

SMP

The Service Metadata Publisher (SMP) is a decentralized registry with routing information. For DE4A a solution that is compatible with the OASIS BDXR SMP v1 specification must be used, as indicated by the eDelivery specification.

The SMP is responsible for maintaining the relationship between a Participant Identifier and its technical addressing details, such as the AS4 endpoint URL and the X.509 certificate. Every SMP must implement a standardized REST API for querying.

All SMPs MUST provide the two REST APIs mandated by the specification, identified as /{participantID} and /{participantID}/services/{docTypeID}. The first API returns a list of all document types the participant is capable of receiving (which may be an empty list). The second API returns the details of the receiving “Endpoints”, including the endpoint URL and the X.509 certificate of the receiver. Both APIs can only return XML content, and only the second API response is digitally signed with the SMP certificate of the SMP maintainer. Each of the response data structures contains an optional Extension element that could be used for additional content.
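As an illustration, the two query URLs can be assembled as follows. This is a sketch, not a reference client: the function names are hypothetical, and percent-encoding every reserved character in the identifiers (including ":") is an assumption that should be checked against the SMP specification and the SMP product in use.

```python
from urllib.parse import quote


def service_group_url(smp_base_url: str, participant_id: str) -> str:
    """URL of the /{participantID} API (list of supported document types)."""
    # safe="" percent-encodes every reserved character, e.g. ':' becomes %3A
    return f"{smp_base_url.rstrip('/')}/{quote(participant_id, safe='')}"


def service_metadata_url(smp_base_url: str, participant_id: str, doc_type_id: str) -> str:
    """URL of the /{participantID}/services/{docTypeID} API (endpoint details)."""
    base = service_group_url(smp_base_url, participant_id)
    return f"{base}/services/{quote(doc_type_id, safe='')}"
```

Both functions only build strings; sending the HTTP GET request and verifying the XML signature of the second response are left to the caller.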

Each SMP MUST have one Certificate from the SMP PKI configured (see Certificates), independent of the number of Participants it manages. This certificate is used as a client certificate for the communication with the SML (see SML/DNS), as a client certificate for the communication with the DE4A Directory and as an XML signing certificate for its REST responses.

An SMP MUST be registered once in the SML before it can be used in the network.

Big Picture for SMP Registration

The initial registration of an SMP to the SML is depicted in the above figure. This process needs to be triggered manually by the SMP Administrator. It requires a trusted SMP certificate which is used as a client certificate when invoking the SML’s API.

Input parameters to the SML registration are:

  • The SMP ID
  • The public IP address of the SMP server (which is a legacy parameter)
  • The public hostname of the SMP server

The results of this registration process are:

  • The SMP ID is linked inside the SML with the SMP certificate.
  • The creation of the “Publisher DNS entry”, which is a generated DNS CNAME entry based on the SMP’s “SMP ID” and the base DNS zone the SML is operating in. The target is the public hostname of the SMP server.

Note: every time an SMP certificate is updated it MUST be updated in the SMP and the SML.

Big Picture for Participant Registration

The above figure shows the Participant registration process. It looks exactly the same as the SMP registration figure because the involved components are the same. The process requires a trusted SMP certificate which is used as a client certificate when invoking the SML’s API.

Each Participant Registration in the SML performs the following actions:

  • Link the Participant ID with the owning SMP. This implicitly checks that the Participant ID is unique and not already registered.
  • Create a new DNS record of type “NAPTR” that links the Participant ID with the owning SMP. NAPTR is a special DNS record type, different from the usual “A” or “CNAME” record types, and is able to store absolute URLs. The domain name is created using a hash algorithm and the target of the DNS entry is the public domain name of the SMP.

Note: Of course, an SMP should be able to handle multiple participants.

Business Card Extension

As an addition to the routing information, an SMP must also support the “Business Card” API as specified by the Peppol Directory specification. It adds non-routing information to a “Participant Identifier”. The data model of the Business Cards is depicted in the following figure.

All the “Business Cards” will be collected in the DE4A Directory and made available centrally for querying. The SMP is responsible for keeping the data in the Directory up to date.

The API to query the business cards from an SMP is /businesscard/{participantID} and returns XML only. The supported XML Schemas are available on GitHub; any of these versions may be returned from an SMP.

The SMP itself triggers the DE4A Directory via a REST API to indicate that the data of a Participant needs to be re-indexed. This API only takes the participant ID. The Directory then performs a DNS lookup with the participant ID (see SML/DNS) to determine the owning SMP, and queries the Business Card from that SMP via the previously mentioned REST API. The SMP does not send the full Business Card itself, so that nobody other than the data owner can publish data into the Directory.

The phoss SMP, provided in GitHub, is the only open source SMP known to the author that supports the Business Card extension out of the box.

Big Picture for Message Transmission

The figure above shows the big picture of a message exchange. It includes the following steps. Error handling is purposely left out.

  • Step 1: C1 submits the document anyhow to C2. “Anyhow” from an eDelivery perspective means that it is not specified by the eDelivery components: any protocol, payload and addressing mechanisms can be used here. In DE4A this communication is defined by a DE4A Connector interface.
  • Step 2: C2 takes the Participant ID of C4, calculates the DNS name (as outlined in Big Picture for Participant Registration) and performs a DNS lookup for the “NAPTR” record. The outcome is the public URL of the SMP. The DNS is a distributed system itself and is one of the cornerstones of the Internet as we know it.
  • Step 3: C2 performs the SMP client lookup with the public URL from Step 2. The response XML format is described in the OASIS SMP specification. C2 selects the best matching SMP Endpoint, based on the business requirements, which results in an “Endpoint URL” and the X.509 Certificate to be used in the AS4 transmission. The specific SMP Endpoint selection rules may vary from pilot to pilot. Access to the SMP SHOULD be transport-layer secured (since the response of an SMP is digitally signed, authenticity can be verified anyway).
  • Step 4: C2 creates the AS4 message, encrypts it with the certificate from Step 3 and signs it with its own AS4 certificate. It transmits the document via AS4 to the URL retrieved in Step 3. The transmission MUST be transport-layer secured.
  • Step 5: C3 submits the document anyhow to C4. In DE4A this communication is defined by a DE4A Connector interface.

The list of these steps is complete, and no central, single-instance components are involved in the transmission.
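The sequence of steps 2 to 4 can be expressed as a short orchestration function. The sketch below is purely illustrative: `Endpoint`, `send_via_edelivery` and the three injected callables are hypothetical names, and the actual DNS, SMP and AS4 work is delegated to whatever implementations the caller passes in. Checking for an "https://" prefix is a simplified stand-in for the transport-layer security requirement of Step 4.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Endpoint:
    url: str          # AS4 endpoint URL taken from the SMP response
    certificate: str  # X.509 certificate of the receiver (C3)


def send_via_edelivery(
    receiver_id: str,
    doc_type_id: str,
    payload: bytes,
    resolve_smp_url: Callable[[str], str],
    lookup_endpoint: Callable[[str, str, str], Endpoint],
    send_as4: Callable[[Endpoint, bytes], None],
) -> Endpoint:
    """Run steps 2 to 4 of the message transmission on behalf of C2."""
    # Step 2: DNS NAPTR lookup yields the SMP base URL
    smp_url = resolve_smp_url(receiver_id)
    # Step 3: SMP query yields the endpoint URL and certificate
    endpoint = lookup_endpoint(smp_url, receiver_id, doc_type_id)
    # Step 4: the AS4 transmission must be transport-layer secured
    if not endpoint.url.lower().startswith("https://"):
        raise ValueError(f"endpoint is not transport-layer secured: {endpoint.url}")
    send_as4(endpoint, payload)
    return endpoint
```

Passing the three operations in as parameters keeps the sketch free of any assumptions about concrete DNS, SMP or AS4 libraries.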

SMP Query Process

After the base URL of the SMP has been determined using the DNS lookup (see SML query process in DE4A), the regular SMP lookup can take place. This action is performed by the DE4A Connector using the specified query API /{participantID}/services/{docTypeID}. The parameters to this query API are the Participant ID (that is already required for the DNS lookup) and the Document Type ID. The Document Type ID defines what kind of document should be received and needs to be taken from the DE4A Code List (see Identification of components).

For the Participant ID iso6523-actorid-upis::9915:de4atest and the Document Type ID urn:de4a-eu:CanonicalEvidenceType::CompanyRegistration:1.0 (used to identify company registration data from the DBA pilot) the following query URL is built:

/iso6523-actorid-upis::9915:de4atest/services/urn:de4a-eu:CanonicalEvidenceType::CompanyRegistration:1.0

The result of this SMP HTTP GET query is an XML document that contains a list of all Processes and for each Process the list of Endpoints.

The structure and values of the Participant IDs, Document Type IDs and Process IDs used during the SMP query process are described in Values of the parameters for the SMP query process.

Inside each Process returned by the SMP query is a list of so-called “Endpoints”. Each endpoint represents the connection details for one particular transport protocol. DE4A only supports the AS4 transport protocol with the identifier bdxr-transport-ebms3-as4-v1p0, so each returned Process may contain only a single Endpoint.

The above figure shows the technical content of an SMP Endpoint. Inside an SMP Endpoint the two main elements that are of interest are the contents of:

  • element EndpointURI containing the URL of the Connector where this Participant can retrieve the queried Document Type and
  • element Certificate containing the public X.509 certificate of the receiving Connector, so that the message can be digitally encrypted for that specific receiver.

With these information elements at hand, the message can be encrypted for the specific receiver and sent to the correct URL. The main message exchange via AS4 can start now.
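Extracting these two elements from an SMP response can be sketched with the Python standard library. The sketch makes several assumptions: besides EndpointURI and Certificate, the element names Process, ProcessIdentifier and Endpoint and the attribute transportProfile are taken from the OASIS BDXR SMP v1 data model as the author of this sketch understands it, elements are matched by local name so that the exact XML namespace does not matter, and the XML signature is not verified. The function name `list_as4_endpoints` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Transport profile identifier named in the text above.
AS4_PROFILE = "bdxr-transport-ebms3-as4-v1p0"


def _local(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def list_as4_endpoints(xml_text: str) -> List[Tuple[str, str, str]]:
    """Return (process identifier value, endpoint URL, certificate) per AS4 endpoint."""
    results = []
    root = ET.fromstring(xml_text)
    for process in root.iter():
        if _local(process.tag) != "Process":
            continue
        process_id = ""
        for node in process.iter():
            if _local(node.tag) == "ProcessIdentifier":
                process_id = (node.text or "").strip()
                break
        for endpoint in process.iter():
            if _local(endpoint.tag) != "Endpoint":
                continue
            if endpoint.get("transportProfile") != AS4_PROFILE:
                continue
            fields = {_local(child.tag): (child.text or "").strip() for child in endpoint}
            results.append(
                (process_id, fields.get("EndpointURI", ""), fields.get("Certificate", ""))
            )
    # NOTE: a real client must also verify the XML signature of the response.
    return results
```

The caller would then pick the entry whose process identifier matches the message being sent and use its URL and certificate for the AS4 transmission.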

Values of the parameters for the SMP query process

This section specifies the general values of the Participant ID, Document Type ID and Process ID parameters when querying the SMPs for the three main types of DE4A messages. Remember that, in DE4A, the Transport Protocol ID value is always the same (see SMP Query Process). The complete set of values for the following evidenceID and catalogueID variables are available at Annex I. Lists of codes.

Evidence Request

When a Data Consumer wants to send an evidence request message based on the IM, USI or LU patterns, or when it wants to send a subscription request based on the S&N pattern, its DE4A Connector will query the SMP of the recipient Data Provider about this data:

  • ParticipantId:
  • DocumentTypeId:
    • Identifier of the required item, depending on whether it is a canonical evidence type or a canonical event catalogue, and whether it is a multi-item request or not.
    • For a single piece of evidence:
      • urn:de4a-eu:CanonicalEvidenceType::evidenceID
    • For multiple pieces of evidence:
      • urn:de4a-eu:CanonicalEvidenceType::MultiItem
    • For a single subscription:
      • urn:de4a-eu:CanonicalEventCatalogueType::catalogueID
    • For multiple subscriptions:
      • urn:de4a-eu:CanonicalEventCatalogueType::MultiItem
  • ProcessId:
    • urn:dea4-eu:MessageType::request

Data Provider’s SMP will reply with the endpoint and the certificate of the Data Transferor’s AS4 Gateway.

Evidence Response

When a Data Provider wants to send an evidence response message based on the IM, USI or LU patterns, or a subscription confirmation message based on the S&N pattern, or a redirection user message based on the USI pattern, its DE4A Connector will query the SMP of the recipient Data Consumer about this data:

  • ParticipantId:
  • DocumentTypeId:
    • Identifier of the required item, depending on whether it is a canonical evidence type or a canonical event catalogue, and whether it is a multi-item request or not.
    • For a single piece of evidence:
      • urn:de4a-eu:CanonicalEvidenceType::evidenceID
    • For multiple pieces of evidence:
      • urn:de4a-eu:CanonicalEvidenceType::MultiItem
    • For a single subscription:
      • urn:de4a-eu:CanonicalEventCatalogueType::catalogueID
    • For multiple subscriptions:
      • urn:de4a-eu:CanonicalEventCatalogueType::MultiItem
  • ProcessId:
    • urn:dea4-eu:MessageType::response

Data Consumer’s SMP will reply with the endpoint and the certificate of the Data Requestor’s AS4 Gateway.

Notification

When a Data Provider wants to send an event notification message based on the S&N pattern, its DE4A Connector will query the SMP of the recipient Data Consumer about this data:

  • ParticipantId:
  • DocumentTypeId:
    • Identifier of the involved canonical event catalogue, depending on whether it is a multi-item request or not.
    • For a single event notification:
      • urn:de4a-eu:CanonicalEventCatalogueType::catalogueID
    • For multiple event notifications:
      • urn:de4a-eu:CanonicalEventCatalogueType::MultiItem
  • ProcessId:
    • urn:dea4-eu:MessageType::notification

Data Consumer’s SMP will reply with the endpoint and the certificate of the Data Requestor’s AS4 Gateway.
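The identifier values from the three subsections above can be generated by a small helper. This Python sketch is illustrative only: the function name `smp_query_parameters` and the labels "evidence" and "event" are hypothetical, the identifier strings are copied verbatim from the lists above (including the spelling of the ProcessId prefix), and the example item ID is the one used earlier in this document.

```python
from typing import Optional, Tuple

# Prefixes copied verbatim from the lists above.
DOC_TYPE_SCHEMES = {
    "evidence": "urn:de4a-eu:CanonicalEvidenceType",
    "event": "urn:de4a-eu:CanonicalEventCatalogueType",
}
PROCESS_PREFIX = "urn:dea4-eu:MessageType"  # spelled exactly as in the text
MESSAGE_TYPES = ("request", "response", "notification")


def smp_query_parameters(
    message_type: str, kind: str, item_id: Optional[str] = None
) -> Tuple[str, str]:
    """Return (DocumentTypeId, ProcessId); item_id=None means a multi-item message."""
    if message_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {message_type!r}")
    if message_type == "notification" and kind != "event":
        # Notifications only refer to canonical event catalogues.
        raise ValueError("notifications require kind='event'")
    scheme = DOC_TYPE_SCHEMES[kind]
    doc_type_id = f"{scheme}::{item_id if item_id is not None else 'MultiItem'}"
    return doc_type_id, f"{PROCESS_PREFIX}::{message_type}"
```

For instance, `smp_query_parameters("request", "evidence", "CompanyRegistration:1.0")` returns the Document Type ID urn:de4a-eu:CanonicalEvidenceType::CompanyRegistration:1.0 together with the request ProcessId.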

AS4

AS4 is the cornerstone of the eDelivery document exchange. It ensures that messages are transmitted in a secure, reliable, standardised and interchangeable manner. AS4 1.0 is an OASIS Standard and defines a subset of OASIS ebMS 3.0. The technology builds on top of SOAP messages and the usage of SOAP with attachments. The security specifications used are WS Security and WS Reliable Messaging.

The usage of AS4 is mandatory when using eDelivery: every message that is exchanged between a DE and a DO, independent of its direction, must be sent via AS4. The sending DE4A Connector acting as the DR and receiving Connector acting as the DT are the only components that directly deal with AS4. Neither DE nor DO need to know the details of the protocol.

AS4 messages are encrypted and signed on the protocol level (leveraging the WS-Security 1.1.1 specification) and by governance the usage of TLS 1.2 or later on the transport layer (with strong cipher suites only) is required – see the CEF eDelivery specification for details.

Messages sent from C2 to C3 are encrypted with the public key of C3 and signed with the private key of C2. The public key of C2 is transmitted to C3 as a Binary Security Token (BST) inside the message. Each AS4 installation needs exactly one X.509 certificate (see Certificates), independent of the number of Participants for which it exchanges messages.

Each AS4 message exchange matches one HTTP exchange: it always consists of one request and one response. The message exchange pattern used by DE4A is the “One Way Push” pattern. This means that the requestor always sends a so-called “User Message” with a payload to the receiver, who has to respond with a so-called “Signal Message” that contains either an Error or a Receipt with the non-repudiation of receipt information. In case of a successful message transmission and a positive response, no further payload besides the pure acceptance information is allowed.

In DE4A each request from a DR to a DT contains a RegRep document (see RegRep) that contains the DE4A Core Data Model (e.g. an Evidence Request). The response from the DT back to the DR also contains a RegRep document containing the DE4A Core Data Model (e.g. an Evidence Response), and in addition carries the main pieces of evidence as so-called “attachments”. Depending on the number of pieces of evidence requested and the number of formats provided, the number of AS4 attachments in the response may vary.

The above figure shows the structural parts of an AS4 message. The figure depicts a DE4A evidence response that only contains the Canonical Evidence but no other evidence formats. A DE4A evidence request looks very similar – just without the Canonical Evidence.