Sahana Eden

S3XML

S3XML is a data exchange format for Sahana Eden.

S3XML is a meta-format and does not specify any particular data elements. The interface introspects the underlying data model, so the constraints defined in the data model also apply to S3XML documents.

Conventions

Name Space

In the current implementation of S3XML, no name space identifier shall be used. Where a name space identifier for the native S3XML format is needed (e.g. when embedding S3XML in other XML), it shall be:

xmlns:s3xml="http://eden.sahanafoundation.org/wiki/S3XML"

Character Encoding

Generally, XML documents can specify their character encoding in the XML header:

 <?xml version="1.0" encoding="utf-8"?>

Sources in non-XML formats (JSON, CSV) used with S3XML on-the-fly conversion/transformation are expected to be UTF-8 encoded.

All exported data are always UTF-8 encoded.

Import Sources

There are 3 different ways to specify or submit data sources for import:

Files on the Server

A source file in the server file system can be specified using the filename URL variable:

PUT http://<server>/<controller>/<resource>.xml?filename=<path>

Multiple files can be specified as a comma-separated list of pathnames:

PUT http://<server>/<controller>/<resource>.xml?filename=<path>,<path>

Remote Files

A source file can be specified by its URL using the fetchurl URL variable:

PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>

Multiple files can be specified as a comma-separated list of URLs:

PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>,<url>

Supported protocols are http, ftp and file, where file is interpreted in the server file system context. URLs of different protocols can be mixed.

The specified URLs must be accessible either without authentication or, if credentials are included in the URLs, with unsolicited HTTP basic authentication. HTTP 401 retries are not handled by the interface.

The URLs must be properly quoted (see http://www.w3schools.com/tags/ref_urlencode.asp for more details), and must not contain commas.
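
The quoting and comma rules above can be sketched as follows. This is a minimal illustration, not part of the interface: the helper name build_fetch_import_url is hypothetical, and it assumes that "properly quoted" means percent-encoding each source URL as a query value.

```python
from urllib.parse import quote


def build_fetch_import_url(base, controller, resource, source_urls):
    """Return an import URL that names remote sources via fetchurl.

    Hypothetical helper: only the URL layout and the fetchurl variable
    are taken from the text; the rest is an assumption.
    """
    quoted = []
    for url in source_urls:
        if "," in url:
            # Commas separate the sources, so a source URL cannot contain one.
            raise ValueError("source URL must not contain commas: %r" % url)
        # Percent-encode everything, including ':' and '/', so the source
        # URL is carried as a single query value.
        quoted.append(quote(url, safe=""))
    return "%s/%s/%s.xml?fetchurl=%s" % (
        base.rstrip("/"), controller, resource, ",".join(quoted))
```

The resulting URL would then be used as the target of the PUT request. Sources of different protocols can be passed in the same list.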

Request Attachments

Source files can also be attached to a multipart-request. In this case the file extension of the source file must match the request URL file extension. Multiple files can be attached.

Multiple Sources

Where multiple sources are specified or attached, they are first converted and transformed one-by-one and then combined into a single element tree before import.

Duplicate Resolution

The S3XML Importer does not handle duplicates within the same source. As the order of elements in the resulting element tree is not defined, and the last update time attribute is not mandatory in source elements, there is no predictable rule of precedence.

Records in the source must not be split into fragments; each record must be submitted in a single element. The Importer does not merge fragments of a record, and it is not predictable which of the fragments would finally be imported.

Source elements using unique keys are automatically matched with existing records. Where the match is ambiguous (e.g. a set of keys matching multiple existing records), the import element will be rejected as invalid. For certain resources, the server may have additional duplicate finders and resolvers configured. How these resolvers handle duplicates can differ from resource to resource.

The duplicate resolution strategy in standard import mode is to update the existing record with the values from the source record. In synchronization mode the default strategy is to accept/keep the newest data (the last update time attribute is mandatory in this case).
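
The two default strategies can be illustrated with a small sketch. It is not the Importer's code: records are modelled as plain dictionaries, the function names are hypothetical, and keeping the existing record when both timestamps are equal is an assumption.

```python
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse a YYYY-MM-DDTHH:mm:ssZ timestamp (always UTC)."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def resolve_duplicate(existing, incoming, synchronization=False):
    """Return the record that results from importing a matched duplicate.

    existing, incoming: dicts of field values; 'modified_on' holds the
    last update time (hypothetical in-memory model of a record).
    """
    if not synchronization:
        # Standard import mode: update the existing record from the source.
        merged = dict(existing)
        merged.update(incoming)
        return merged
    if "modified_on" not in incoming:
        # The last update time is mandatory in synchronization mode.
        raise ValueError("modified_on is required in synchronization mode")
    newer = (parse_timestamp(incoming["modified_on"])
             > parse_timestamp(existing["modified_on"]))
    if newer:
        merged = dict(existing)
        merged.update(incoming)
        return merged
    # Assumption: on older or equal timestamps the existing data is kept.
    return dict(existing)
```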

XML Format

Document Types and Structure

S3XML defines 3 types of documents:

Document Type | Description
Schema Documents | describe the data schema for a resource
Field Option Documents | describe the currently acceptable options for fields in a record
Data Documents | provide the current contents (data) of resources

Schema Documents

Schema documents describe the data schema for a resource. Clients can use these documents e.g. for automatic generation of forms.

Schema documents can be retrieved from Sahana Eden by sending an empty GET request (i.e. without source) to the create.xml method of a resource, e.g.:

GET http://localhost:8000/eden/pr/person/create.xml

Document Tree:

<s3xml>
  <resource>
    <field>
    ...
    <resource>
      <field>
      ...
    </resource>
  </resource>
</s3xml>

or (if requested from the fields.xml method):

<fields resource="name">
  <field/>
  <field/>
  <field/>
  ...
</fields>

Note:

  • These documents can only be requested (GET); they cannot be submitted for import.
  • Schema documents support on-the-fly transformation (see chapter Web Services).
  • The URL query parameter ?options=true adds a list of field options to those fields where options are defined. Combined with the parameter &reference=true, options for foreign key references are included as well.
  • The URL query parameter ?meta=true includes the meta fields (as <meta> elements). In data documents, the meta fields appear as attributes of the <resource> element.

Field Options Documents

Field options documents describe the currently acceptable options for fields in a record. Clients can use these documents e.g. for automatic generation and/or client-side validation of forms.

Field options documents can be requested from Sahana Eden by sending a GET request to the options.xml method of a resource, e.g.:

GET http://localhost:8000/eden/pr/person/options.xml

Document Tree:

<options>
  <select>
    <option>
    <option>
    <option>
    ...
  </select>
  <select>
    ...
  </select>
  ...
</options>

Note:

  • The field URL variable can be used to specify a particular field in the resource. The enclosing <options> element is then omitted (i.e. <select> becomes the root element).
  • On-the-fly transformation of field options documents is not supported.
  • Field options documents can only be requested (GET); they cannot be submitted for import.
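
A client could read a field options document along these lines. This is a sketch under assumptions: the function name is hypothetical, and the value attribute and text content of <option> are not described above, so they are assumed here for illustration only.

```python
import xml.etree.ElementTree as ET


def options_by_select(document):
    """Return one list of (value, label) pairs per <select> element.

    Assumption: each <option> carries a 'value' attribute and a text
    label; neither is specified in the text above.
    """
    root = ET.fromstring(document)
    # When a single field is requested, <select> is the root element.
    selects = [root] if root.tag == "select" else root.findall("select")
    return [
        [(option.get("value"), option.text) for option in select.findall("option")]
        for select in selects
    ]
```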

Data Documents

Data documents provide the current contents (data) of resources.

Data documents can be requested from Sahana Eden by sending a GET request to the URL of the resource, e.g.:

GET http://localhost:8000/eden/pr/person.xml

Data documents can be submitted to Sahana Eden by sending PUT requests to the URL of the resource, e.g.:

PUT http://localhost:8000/eden/pr/person.xml

Note that sending data with POST starts an interactive review of the source data before it is imported, so POST cannot be used by purely non-interactive clients.

Document Tree:

<s3xml>
  <resource> <!-- primary resource element -->
    <data> <!-- field data -->
    <data>
    ...
    <resource> <!-- component resource inside the primary resource -->
      <data>
      <data>
      <reference/> <!-- reference -->
      ...
    </resource>
    <reference/> <!-- reference -->
    <reference> <!-- reference with embedded resource element -->
       <resource>
         <data>
         ...
       </resource>
    </reference>
  </resource>
</s3xml>
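
A data document with this shape can be assembled with a standard XML library. The sketch below is illustrative only: the table names (pr_person, pr_contact, gis_location), the field names and the values are hypothetical examples, not taken from an actual data model.

```python
import json
import xml.etree.ElementTree as ET


def build_data_document():
    """Build a small data document as a string (hypothetical content)."""
    root = ET.Element("s3xml")

    # Primary resource element with one field.
    person = ET.SubElement(root, "resource", name="pr_person")
    first_name = ET.SubElement(person, "data", field="first_name",
                               value=json.dumps("Jane"))
    first_name.text = "Jane"

    # Component resource nested inside the primary resource.
    contact = ET.SubElement(person, "resource", name="pr_contact")
    ET.SubElement(contact, "data", field="value").text = "jane@example.com"

    # Reference with an embedded resource element.
    reference = ET.SubElement(person, "reference", field="location_id",
                              resource="gis_location")
    location = ET.SubElement(reference, "resource", name="gis_location")
    ET.SubElement(location, "data", field="name").text = "Example Town"

    return ET.tostring(root, encoding="unicode")
```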

Components

Components of resources are <resource> elements nested inside the master <resource> element. Component records will be imported automatically and the required key references added (i.e. no explicit <reference> element is required).

Foreign key references of component records to their primary record will not be exported, and where they appear in import sources, they will be ignored.

Components of components are not allowed (maximum depth 1), and where they appear in import sources, they will be ignored.

References

Foreign key references (except those linking components to their primary record) are represented by <reference> elements.

Foreign keys can be importable UIDs (uuid attribute, which will be both imported and used to find and/or link to existing records in the DB) or temporary UIDs (tuid attribute, which will not be imported but only used to find records within the current tree). If a <resource> element with a matching UID key attribute is found in the same tree, it will be automatically imported.

References inside referenced elements will be resolved (unlimited depth) and also be imported. Circular references will be detected and properly resolved.

Multi-references (list:reference type in web2py) use a list of UID keys separated by vertical bars, like uuid=|uid1|uid2|uid3|. The leading and trailing vertical bars must be present.
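
The key list format can be produced and checked with two small helpers. The function names are hypothetical, and this is only a sketch of the notation described above.

```python
def join_uids(uids):
    """Format UID keys as |uid1|uid2|...| for a multi-reference attribute."""
    for uid in uids:
        if "|" in uid:
            raise ValueError("UID must not contain '|': %r" % uid)
    return "|%s|" % "|".join(uids)


def split_uids(value):
    """Split a multi-reference attribute value into its UID keys."""
    if len(value) < 2 or not (value.startswith("|") and value.endswith("|")):
        # Leading and trailing separators are mandatory.
        raise ValueError("multi-reference must start and end with '|'")
    inner = value[1:-1]
    return inner.split("|") if inner else []
```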

If a <resource> element is nested inside the <reference>, either or both of the UID keys can be omitted. Where both keys are used, however, they must match. Multiple embedded <resource> elements are allowed for multi-references.

Element Descriptions

<s3xml>

This is the root element (in schema and data documents).

<s3xml success="true" results="2" domain="mycomputer" url="http://127.0.0.1:8000/eden" latmin="-90.0" latmax="90.0" lonmin="-180.0" lonmax="180.0">
   ...
</s3xml>
Parent elements: none (root element)
Child elements: <resource>
Contents: empty

Attributes:

Name | Type | Description | Mandatory?
domain | string | the domain name of the data repository | no
url | string | the URL of the data repository | no
success | boolean | true if the page contains any records, otherwise false | no
results | integer | the total number of records matching the request | no
start | integer | the index of the first record returned (in paginated requests) | no
limit | integer | the maximum number of records returned (in paginated requests) | no
latmin, latmax, lonmin, lonmax | float | geo-location bounding box of the results | no
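
Since all of these attributes are optional, a client should not rely on any of them being present. A minimal sketch of reading them (the function name and the returned dictionary layout are hypothetical):

```python
import xml.etree.ElementTree as ET


def read_envelope(document):
    """Read the optional attributes of the <s3xml> root element."""
    root = ET.fromstring(document)

    def cast(name, convert):
        # Every attribute is optional, so absent ones map to None.
        raw = root.get(name)
        return convert(raw) if raw is not None else None

    corners = [cast(name, float)
               for name in ("latmin", "latmax", "lonmin", "lonmax")]
    return {
        "success": cast("success", lambda raw: raw == "true"),
        "results": cast("results", int),
        "start": cast("start", int),
        "limit": cast("limit", int),
        # Only report a bounding box when all four corners are given.
        "bbox": tuple(corners) if None not in corners else None,
    }
```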

<resource>

This element represents a record (in data documents) or a database table (in schema documents).

<s3xml>
  <resource name="xxx_yyy">
     ...
  </resource>
</s3xml>
Parent elements: <s3xml>, <resource>, <reference>
Child elements: <resource>, <data>, <field>
Contents: empty

Attributes:

Name | Type | Description | Mandatory?
name | string | the name of the database table | yes
uuid | string | a unique identifier for the record | no*
tuid | string | a temporary unique identifier for the record | no*
created_on | datetime | date and time when the record was created | no**
modified_on | datetime | date and time when the record was last updated | no, default: time of the request** ***
created_by | string | email address of the user who created the record | no
modified_by | string | email address of the user who last updated the record | no
owned_by_user | string | email address of the user who owns the record***** | no
owned_by_role | string | name of the user group who collectively own the record***** | no
mci | integer | master-copy-index | no, default: 2*** ****

  • (*) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.
  • (**) as YYYY-MM-DDTHH:mm:ssZ, always UTC
  • (***) the last update date/time and mci are required in synchronization
  • (****) the master copy index specifies how often a record has been copied across sites, see below
  • (*****) record ownership will be retained if the record owners can be matched against existing users/user groups

The uuid will be stored in the database together with the record. If uuid is present and matches an existing record in the database, then this record will be updated. If there's no match or no uuid specified in the resource element, then the importer will create a new record in the database (and automatically generate a uuid if required).

The mci - master-copy-index - indicates how often this record has been copied across sites:

  • when importing a new record the mci value is always *imported* as-is from the source
  • when updating a record, the mci of the database record remains unchanged
  • the mci of a record is *exported* as its current database value + 1.
  • the repository first creating a record sets mci=0 in the database record, which appears as mci=1 in the exported XML.
  • a copying site then imports mci=1 into its database, which appears as mci=2 in its export XML, and so forth...

The mci can be used to filter records for whether they have been originated at a repository or not. If there's a fixed set of synchronization paths between a number of Sahana Eden instances, the mci can be used for conflict resolution. If the mci is not specified, it defaults to 2.

MCI handling is optional for non-synchronizing peers.
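
The mci rules can be traced with a toy model of three repositories. The Repository class below is purely illustrative; it stores nothing but the mci per uuid and is not how Sahana Eden stores records.

```python
class Repository:
    """Toy repository that tracks only the mci of each record."""

    def __init__(self):
        self.mci = {}  # uuid -> mci value stored in the database

    def create(self, uuid):
        # The repository first creating a record stores mci=0.
        self.mci[uuid] = 0

    def export_mci(self, uuid):
        # The mci is exported as the current database value + 1.
        return self.mci[uuid] + 1

    def import_record(self, uuid, mci=2):
        # New records take the mci as-is from the source (default 2);
        # updates leave the stored mci unchanged.
        if uuid not in self.mci:
            self.mci[uuid] = mci


origin, first_copy, second_copy = Repository(), Repository(), Repository()
origin.create("urn:uuid:example")
first_copy.import_record("urn:uuid:example", origin.export_mci("urn:uuid:example"))
second_copy.import_record("urn:uuid:example", first_copy.export_mci("urn:uuid:example"))
```

In this model a stored mci of 0 marks a record that originated at the repository itself.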

<data>

This element represents the value of a single field in the record.

<s3xml>
  <resource>
    <data field="fieldname" value="value">...</data>
  </resource>
</s3xml>
Parent elements: <resource>
Child elements: none (leaf element)
Contents: Text

Attributes:

Name | Type | Description | Mandatory?
field | string | the field name in the record | yes
value | JSON | the native field value | no
url | URL | the URL to download the contents from* | no
filename | filename | the filename of the attached contents* | no

(*) If the field is for file upload, a url attribute should be provided to specify the location of the file. The importer will try to download and store the file (file transfer) from that URL (pull). It is also possible to send the file together with the HTTP request - in this case the filename must be specified instead of the url (push). The push variant for uploads is meant for peers which do not support pulling for some reason (e.g. mobile phones). Normal servers would always provide a URL for download in order to allow the consuming site to decide which files to download and when (saves bandwidth).

The text node in the data element provides a human-readable representation of the field value.

The value attribute contains a JSON representation of the field value that retains the original data type (i.e. strings must be double-quoted). The exceptions are date, time and datetime values, which are represented as simple strings in the respective standard format (no double quotes). The standard format for datetime values is YYYY-MM-DDTHH:mm:ssZ (ISO format, UTC); date values are represented as YYYY-MM-DD and time values as HH:mm:ss.
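
The encoding rules for the value attribute can be sketched as below. The function name is hypothetical, and treating naive datetime objects as already being in UTC is an assumption of this sketch.

```python
import json
from datetime import date, datetime, time, timezone


def encode_value(value):
    """Encode a Python value for the value attribute of a <data> element."""
    # datetime must be tested before date, because it is a subclass of date.
    if isinstance(value, datetime):
        if value.tzinfo is not None:
            value = value.astimezone(timezone.utc)
        # Assumption: naive datetimes are already UTC.
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    if isinstance(value, time):
        return value.strftime("%H:%M:%S")
    # Everything else is JSON, so strings come out double-quoted.
    return json.dumps(value)
```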

<data> elements representing passwords can contain the clear text password in the value attribute, or the encrypted password in the text node. Where a clear text password is given in the value attribute, it will be stored encrypted; otherwise the password will be stored as-is. Note that clear text passwords are accepted by the interface, but never exported.

<reference>

Represents a foreign key reference.

<s3xml>
    <resource name="xxx_yyy">
        <reference field="xy" resource="aaabbb" uuid="urn:uuid:e4bcb9fd-d890-4f2f-b221-1d75fff79e2d"/>
    </resource>
</s3xml>
Parent elements: <resource>
Child elements: <resource>
Contents: Text

Attributes:

Name | Type | Description | Mandatory?
field | string | the field name in the record | yes
resource | string | the name of the referenced database table | yes
uuid | string | the unique identifier of the referenced record (foreign key)* | (yes)**
tuid | string | a temporary identifier for a referenced record (foreign key)* | (yes)**

(*) Referenced records are always exported in the same output file. If a referenced record is found in the same input file, then it will be automatically imported.

(**) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.

If the referenced record is enclosed in the reference element, then uuid and tuid can be omitted:

<s3xml>
   <resource name="xxxyyy">
       <!-- content of the record goes here -->
       <reference field="xy" resource="aaabbb">
          <resource name="aaabbb">
            <!-- content of the referenced record goes here -->
          </resource>
       </reference>
   </resource>
</s3xml>
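
How a consumer might locate the record behind each <reference> within one tree is sketched below. This is not the Importer's algorithm; the function name is hypothetical, and a target of None merely means that the record is not part of the tree (a uuid could still match an existing record in the database).

```python
import xml.etree.ElementTree as ET


def resolve_references(document):
    """Return (field, target) pairs for every <reference> in the tree."""
    root = ET.fromstring(document)

    # Index all <resource> elements by their importable and temporary UIDs.
    by_uuid, by_tuid = {}, {}
    for resource in root.iter("resource"):
        if resource.get("uuid"):
            by_uuid[resource.get("uuid")] = resource
        if resource.get("tuid"):
            by_tuid[resource.get("tuid")] = resource

    resolved = []
    for reference in root.iter("reference"):
        # An embedded <resource> needs no key at all.
        target = reference.find("resource")
        if target is None:
            uuid, tuid = reference.get("uuid"), reference.get("tuid")
            # Identify by uuid, or by tuid if no uuid is specified.
            target = by_uuid.get(uuid) if uuid else by_tuid.get(tuid)
        resolved.append((reference.get("field"), target))
    return resolved
```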