Technical guide to AFP NewsML-G2

Introduction

AFP delivers information in a number of ways, tailored to its clients' needs. One delivery vector is NewsML-G2, an industry-driven format and processing model allowing rich machine-readable representation of news content.

This document is your technical guide to AFP NewsML-G2 documents. You'll make use of it when implementing systems that receive and process AFP NewsML-G2 documents. It describes how building blocks defined by NewsML-G2 are combined in AFP documents to convey news content and associated metadata (titles, genres, subjects, embargo, etc.). It should be used alongside the NewsML-G2 documentation provided by IPTC [G2Doc], which it assumes knowledge of.

AFP NewsML-G2 documents build upon the NewsML-G2 format and processing model defined by IPTC (International Press Telecommunications Council) in the context of the NAR (News Architecture). NewsML-G2 is itself an application of XML and makes use of XML Schema. AFP NewsML-G2 documents also make use of the XML syntax of HTML [HTMLSpec] (formerly referred to as "XHTML 5") to represent textual content along with rich structural information, since chunks of HTML in XML syntax can be embedded directly into NewsML-G2 content. To deal with AFP NewsML-G2 documents, you will make use of all these technologies.

Technology stack

Further sections provide an overview of the structure of AFP documents. They describe the information a document conveys and how to process it.

Mandatory processing

NewsML-G2 documents convey many pieces of metadata. For the most part you can pick some and ignore others as you see fit. For example, you can make use of IPTC media topics, or opt not to rely on them. Some metadata, however, cannot be ignored and must be processed, such as embargo instructions.
It is possible that, in addition to your NewsML-G2 integration, your workflows process AFP's metadata delivered by other means. For example, for video production we also deliver metadata in the form of a human readable dopesheet sent by email. In the end, these metadata must be correctly processed, be they obtained from NewsML-G2 or from another delivery medium.

Correctly processing the following metadata is mandatory:

Identifier and version number for basic document management including updates and corrections.
The publishing status, which allows canceling or withholding documents.
The embargo instructions.
The role "for byline" if the creator or contributor metadata is used by your system.
The correction signal and/or the general editorial note, in a way that ensures important corrections are processed correctly.
The general editorial note (again), which must be displayed to your editorial team as it can contain important instructions. If you cannot put such a workflow in place, one option is for your system to discard the document and associated content, including previous versions, when a general editorial note is present.

Should you encounter questions or difficulties when implementing these mandatory processes, please contact your AFP representative to get assistance.

Undocumented features

In actual NewsML-G2 documents delivered by AFP you will find several things neither documented here nor in the NewsML-G2 specification, such as undocumented XML elements and attributes. You must not rely on these undocumented features, unless specifically advised to do so by your AFP representative. These undocumented features are prone to change without notice and contain information that you cannot interpret reliably.

Overview

An AFP NewsML-G2 document provides metadata about content published by AFP. For example, when AFP publishes a picture it also publishes an associated NewsML-G2 document that provides metadata about this picture such as a caption, the name of the photographer, the location of the event, etc. Depending on the nature of the content, the NewsML-G2 document can be separate from the main content itself (e.g., a picture as a JPEG file along with a NewsML-G2 document) or can contain the main content (e.g., a textual story embedded inside the NewsML-G2 document).

There are eight main types of AFP news content for which NewsML-G2 can play a role:

Text story

The content is a textual news story. It is expressed using the XML syntax of HTML (formerly referred to as "XHTML 5") and can include some text structure markup as well as hypertext links. A NewsML-G2 document provides both this textual content and various associated metadata such as creation date, version number, media topics, content warning, etc. Such NewsML-G2 documents are said to be of type "text". See section Document walk-through for an example of such a document.
Picture

The content is a non-animated visual representation of a physical scene. It is typically a photo, but can also be produced by other means: for example, it can be an artist's drawing representing a physical scene (e.g., a courtroom drawing). Some pictures are made by combining multiple pictures as in a collage; we call such composite pictures photo combos. A picture is delivered in formats such as JPEG, at several resolutions. Along with this visual content, a NewsML-G2 document provides metadata such as creation date, version number, media topics, content warning, etc. Such NewsML-G2 documents are said to be of type "picture".
Still graphic

The content is a non-animated graphic composition of elements such as text, images, symbolic shapes, data, etc., providing news coverage and/or contextual information. For example, it can be charts, diagrams, maps, etc. It can include real photos: typically a photo overlaid with other elements of the composition. A still graphic is delivered in several formats such as PDF, Adobe Illustrator Artwork, JPEG, PNG, etc. Vector-based formats allow you to translate and customize the content. Bitmap formats are available at various alternative resolutions. Along with this visual content, a NewsML-G2 document provides metadata such as creation date, version number, media topics, content warning, etc. Such NewsML-G2 documents are said to be of type "still graphic".
Animated graphic

The content is an animated symbolic visual representation. It often includes text labels. For example, it can be animated charts, diagrams, maps, etc. It may include audio content. An animated graphic is delivered in a zip package which contains, among other things, Flash media. Along with this visual content, a NewsML-G2 document provides metadata such as creation date, version number, media topics, content warning, etc. Such NewsML-G2 documents are said to be of type "animated graphic". Note that such content is legacy: no new content of this kind is created by AFP. It has been replaced by videographics, described below.
Interactive graphic

The content is an interactive graphic composition of various elements such as text, images, symbolic shapes, data, etc. Interactive controls allow the viewer to explore, drill down and filter information. Technically, an interactive graphic is implemented as an interactive HTML document. It is hosted by AFP and is typically integrated into your application using an <iframe> tag. Along with this interactive content, a NewsML-G2 document provides metadata such as creation date, version number, media topics, content warning, etc. Such NewsML-G2 documents are said to be of type "interactive graphic".
Video

AFP delivers NewsML-G2 for two kinds of video content:
- Moving visual representation of one or more physical scene(s), possibly accompanied by audio such as natural sound, commentary or music. This content is typically made from digital video material.
- Animated graphic and audio composition of elements such as computer rendered 3D scenes, text, images, photos, videos, music and audio commentaries. It typically describes or explains an event, a phenomenon, a situation (historical, political, economic, etc.), a technique, or scientific or medical information. We call it videographic content.
These videos are delivered in several formats such as H.264, MPEG-2, Windows Media Video, Flash Video, etc. Illustration images are also provided as JPEG data. The audio and, for videographics, textual content, can be delivered separately allowing you to translate and customize it.
Along with this audiovisual content, a NewsML-G2 document provides metadata such as creation date, version number, media topics, content warning, shotlist, script, synthe, etc. Such NewsML-G2 documents are said to be of type "video".
In addition to this machine processable metadata, a textual summary of metadata, called the dopesheet, can be automatically sent to you by email.
Multimedia

The content is made of a textual news story intermingled with audiovisual components such as pictures, videos, graphics, etc. A NewsML-G2 document, said to be of type "multimedia", provides the multimedia content using the XML syntax of HTML. It also provides various metadata such as creation date, version number, media topics, content warning, etc. The visual components are delivered along with this NewsML-G2 document in their respective formats (JPEG, PDF, MPEG, etc.).
Live report

A live report provides live coverage of an ongoing event, delivering news bits as the story develops. In the context of live reports, these news bits are called "posts". The posts are organized chronologically, as each one is tagged with a timestamp. They contain real-time coverage in text, photo, video, graphics, tweets, etc., including contributions from AFP journalists on the ground. Each post is represented by a NewsML-G2 multimedia document (see description above), which provides access to the news content and associated metadata, including the timestamp. Another NewsML-G2 document, the "index", provides metadata about the live report itself (title, media topics, etc.) and the collection of posts in the form of a list of links to the NewsML-G2 documents representing posts. This NewsML-G2 document is said to be of type "live report index". See the presentation of the top-level structure of live reports for more information.

The type of a NewsML-G2 document defines important characteristics of the document such as the nature of its content, its XML structure, the metadata it provides as well as some elements of its processing model. As you can see, the type of a NewsML-G2 document is named after the type of news content the NewsML-G2 document is associated with.

AFP NewsML-G2 documents of type text, picture, video, still graphic, animated graphic and interactive graphic have the same top-level structure: a NewsML-G2 element called "news message". This news message is an envelope that contains one "news item". This news item represents some news content which can be either a news story in textual form, a photo, a video, a still graphic, an animated graphic or an interactive graphic.

AFP NewsML-G2 documents of type multimedia also have a news message as the top-level structure. This news message is an envelope that contains one or more news item(s): a main item with the multimedia content in the XML syntax of HTML and additional items for photos, videos, etc.

AFP NewsML-G2 documents of type live report index also have a news message as the top-level structure. This news message is an envelope that contains a "package item" providing metadata about the live report as a whole and links to NewsML-G2 documents representing the individual posts of the live report.

Section "Type of document" describes how to determine the type of a document. The following sections provide an overview of the structure of documents.

Text documents

Text documents have only one news item. This item contains metadata and textual news content. The content is represented by HTML (in its XML syntax) embedded right into the news item.

Top-level structure of text documents

Picture and still graphic documents

Picture and still graphic documents have only one news item that conveys only one logical visual content (e.g., one photo). However, this content may be available in different renditions (e.g., different formats, resolutions, etc.). In addition to metadata about the picture or still graphic, the news item contains links to the actual visual content (e.g., JPEG resources) for each rendition. The visual content for each rendition isn't provided in the NewsML-G2 document itself, but by external resources (e.g., accompanying files, Web resources, etc.).

Top level structure of picture and still graphic documents (example)

Video and animated graphic documents

Video and animated graphic documents have only one news item that conveys only one logical visual content (e.g., one video, one animated graphic). However, this content may be available in different renditions (e.g., different formats, resolutions, etc.). In addition to metadata about the video or animated graphic, the news item contains links to the actual visual content (e.g., MPEG resources) for each rendition. The visual content for each rendition isn't provided in the NewsML-G2 document itself, but by external resources (e.g., accompanying files, Web resources, etc.).

The news item may also contain links to renditions of an icon (aka "illustration" or "preview image"). The renditions of the icon aren't provided in the NewsML-G2 document itself, but in external resources (e.g., accompanying files, Web resources, etc.).

Top-level structure of video and animated graphic documents (example)

Multimedia documents

Multimedia documents have one or more news items. One of these items is the "main news item". It is always present and provides the multimedia content using the XML syntax of HTML. It also provides metadata about the document, much like the news item of a text document. It also contains links to other items of the document. These additional items convey information about visual content: pictures, videos or graphics. They are much like the items found in picture, video or graphic documents.

The figure below provides an example of a multimedia document with one main item, a picture item and a video item.

Top-level structure of multimedia documents (example)

The main news item is identified by the presence of a specific element in its item metadata section: a link element whose rel attribute conveys the concept URI http://cv.iptc.org/newscodes/conceptrelation/isA (using the QCode crel:isA) and whose href attribute, a URI, is equal to http://cv.afp.com/itemnatures/mmdMainComp.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <!-- This link element tells that this news item is the main item of the multimedia document  -->
                <link rel="crel:isA" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
        </newsItem>
        
        <!-- Additional, non-main items  -->
        <newsItem></newsItem>
        <newsItem></newsItem>
     </itemSet>
</newsMessage>

You'll find more information about QCodes in section Controlled vocabularies and qualified codes.
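
As a minimal sketch, the main item of a multimedia document could be located as follows with the Python standard library. The function name find_main_item and the guid values in the sample are hypothetical. The sketch compares the rel QCode directly rather than resolving it, and it ignores letter case in that comparison; both are simplifying assumptions, not part of the format.

```python
import xml.etree.ElementTree as ET

# Namespace used by NewsML-G2 elements, in ElementTree's {uri}tag notation.
NAR = "{http://iptc.org/std/nar/2006-10-01/}"
MAIN_ITEM_HREF = "http://cv.afp.com/itemnatures/mmdMainComp"


def find_main_item(news_message):
    """Return the main newsItem of a multimedia document, or None."""
    for item in news_message.iter(NAR + "newsItem"):
        for link in item.findall(NAR + "itemMeta/" + NAR + "link"):
            rel = link.get("rel", "")
            # Assumption: QCode compared as-is, ignoring case, instead of
            # being resolved to its concept URI first.
            if rel.lower() == "crel:isa" and link.get("href") == MAIN_ITEM_HREF:
                return item
    return None


# Illustrative document; the guid values are made up for this sketch.
SAMPLE = """
<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet>
    <newsItem guid="urn:example:main">
      <itemMeta>
        <link rel="crel:isA" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
      </itemMeta>
    </newsItem>
    <newsItem guid="urn:example:picture"><itemMeta/></newsItem>
  </itemSet>
</newsMessage>
"""

main_item = find_main_item(ET.fromstring(SAMPLE))
print(main_item.get("guid"))
```

Returning None rather than raising lets the caller decide how to treat a document without a main item.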

Live reports

A live report is represented by multiple NewsML-G2 documents:

A document called the "index" which provides metadata about the live report and contains links to the various posts. The index is structured as a "package item" embedded in a news message. A package item is a standard NewsML-G2 construct used to represent collections. See the section on list of posts for a detailed description of how we represent the links to the various posts. See also the NewsML-G2 specification [G2Doc] for a general description of package items.
A set of multimedia documents, each one representing a post of the live report.

The figure below shows the top level structure of a live report. You can see the index on the left and the various posts on the right.

Top-level structure of live reports (example)

Document walk-through

Below is an example of a simple text document, with just a few metadata and some textual content. Using this example, we will walk through some structural elements that are common to every type of AFP NewsML-G2 documents.

Note that while the XML in this example is formatted to ease reading, the actual documents you receive will usually be in a compact form (e.g., all XML on one line).

<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:afp="http://www.afp.com/format/internal/">
   <header>
      <sent>2009-02-23T20:44:07+02:00</sent>
   </header>
   <itemSet>
      <newsItem standard="NewsML-G2" standardversion="2.28" conformance="power" 
                guid="http://doc.afp.com/863OC" version="3" xml:lang="en">
         <catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_26.xml"/>
         <catalogRef href="http://cv.afp.com/std/catalog/catalog.AFP-IPTC-G2-V2_4.xml"/>
         <itemMeta>
            <itemClass qcode="ninat:text"/>
            <provider qcode="nprov:AFP">
               <name>AFP </name>
            </provider>
            <versionCreated>2009-02-23T20:43:00+02:00</versionCreated>
            <pubStatus qcode="stat:usable"/>
         </itemMeta>
         <contentMeta>
            <headline>
               YSL-Bergé collection sets new world record at auction 
               for a private collection
            </headline>
            <subject qcode="medtop:20000031" type="cpnat:abstract">
               <name>visual art</name>
            </subject>
            <subject qcode="medtop:20000011" type="cpnat:abstract">
               <name>fashion</name>
            </subject>
         </contentMeta>    
         <contentSet>
            <inlineXML contenttype="application/xhtml+xml" wordcount="70">
               <html xmlns="http://www.w3.org/1999/xhtml">
                  <head>
                     <title>
                        YSL-Bergé collection sets new world record at auction 
                        for a private collection
                     </title>
                  </head>
                  <body>
                     <p>The Yves Saint Laurent and Pierre Bergé collection sets 
                     new world record at auction for a private collection. 
                     Hundreds of art treasures amassed by late fashion designer
                     Yves Saint Laurent and his companion Pierre Berge over half
                     a century are being auctioned.</p>
                     <p>Bids hit 206 million euros (261 million dollars) on February
                     23, 2009 making it the biggest private collection ever 
                     auctioned with two days of sales still left to run.</p>
                  </body>
               </html>
            </inlineXML>
         </contentSet>
      </newsItem>
   </itemSet>
</newsMessage>

Some notes about this structure:

News message

The newsMessage element conveys the document. It includes attributes providing namespace declarations and other information such as schema location. This information is automatically interpreted by the standard XML software components you will likely be using when processing documents (e.g., parsers, validators, etc.).

The newsMessage element has two children: a header, which provides a transmission date (and possibly some additional information), and an item set which, in this example, contains one news item. In a multimedia document, the item set typically contains multiple items.
News item

The newsItem element provides the journalistic content along with metadata about this content and other information useful for processing. It has attributes stating the name, version and conformance level of the NAR standard further used in the item. AFP NewsML-G2 documents use NewsML-G2 version 2.10 or higher at the "power" conformance level.

The guid attribute is a persistent and globally unique identifier for this news item, in the form of an IRI [RFC3987].

The version attribute, if present, provides the version number of the item. It is incremented (not necessarily by one) when the document is updated.
Catalog information

The news item then carries catalog information using catalogRef and catalog elements (only the former is shown in the example above). This information specifies mappings between scheme aliases and scheme URIs. It allows you to resolve qualified codes found, for instance, in qcode attributes further in the item, to full URIs (i.e., unambiguous identifiers). In the example above, we reference a standard IPTC-provided catalog at http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_26.xml and an AFP specific catalog at http://cv.afp.com/std/catalog/catalog.AFP-IPTC-G2-V2_4.xml. In actual documents you may find references to other catalogs. The section Controlled vocabularies and qualified codes provides more information about qualified codes resolution.
Item meta data

The itemMeta element contains information about the news item itself. It always specifies the class of the item (e.g., text, picture, video, etc.), the provider of the item (that will be AFP or a specific AFP service), and the date of creation of this version of the item. It may also contain additional information, such as an embargo directive, a publication status, editorial notes, etc.
Content meta data

The contentMeta element contains information about the journalistic content of the item (e.g., title, subjects, genres, language, etc.)
Content set

The contentSet element contains the principal journalistic content of the item.

In text documents this content is provided inline, in an inlineXML element as shown in this example. See section "Data specific to text documents" for more information.

In picture, video, still graphic and animated graphic documents, this content is provided by reference: the content set contains links to the actual visual content (e.g., links to JPEG files, links to MPEG files, etc.). See section "Visual content" for more information.

In multimedia documents, the content set of the main item contains the multimedia content of the document expressed using the XML syntax of HTML. Inside, the picture, video, still graphic and animated graphic elements are provided by reference through links to other news items. In addition, default renditions are provided through standard HTML elements such as <img> or <video>. See section "Data specific to multimedia documents" for more information.
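
To make the walk-through concrete, here is a hedged sketch that reads a few of the fields discussed above from a text document, using the Python standard library. The helper names (clean, summarize) and the shortened sample document are ours, not part of the format. The sketch assumes a single news item and returns None for a missing version or publishing status instead of guessing a default.

```python
import xml.etree.ElementTree as ET

NS = {
    "nar": "http://iptc.org/std/nar/2006-10-01/",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def clean(text):
    """Collapse whitespace introduced by pretty-printing."""
    return " ".join((text or "").split())


def summarize(document_text):
    """Extract a few fields from a text document shaped like the example."""
    message = ET.fromstring(document_text)
    item = message.find("nar:itemSet/nar:newsItem", NS)
    item_class = item.find("nar:itemMeta/nar:itemClass", NS)
    pub_status = item.find("nar:itemMeta/nar:pubStatus", NS)
    version = item.get("version")
    paragraphs = item.findall(
        "nar:contentSet/nar:inlineXML/xhtml:html/xhtml:body/xhtml:p", NS)
    return {
        "guid": item.get("guid"),
        "version": None if version is None else int(version),
        "item_class": item_class.get("qcode"),
        "pub_status": None if pub_status is None else pub_status.get("qcode"),
        "headline": clean(
            item.findtext("nar:contentMeta/nar:headline", namespaces=NS)),
        "paragraphs": [clean("".join(p.itertext())) for p in paragraphs],
    }


# Shortened, illustrative variant of the example document above.
SAMPLE = """
<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet>
    <newsItem guid="http://doc.afp.com/863OC" version="3" xml:lang="en">
      <itemMeta>
        <itemClass qcode="ninat:text"/>
        <pubStatus qcode="stat:usable"/>
      </itemMeta>
      <contentMeta>
        <headline>
          Collection sets new world record at auction
        </headline>
      </contentMeta>
      <contentSet>
        <inlineXML contenttype="application/xhtml+xml" wordcount="70">
          <html xmlns="http://www.w3.org/1999/xhtml">
            <body><p>First paragraph.</p><p>Second
              paragraph.</p></body>
          </html>
        </inlineXML>
      </contentSet>
    </newsItem>
  </itemSet>
</newsMessage>
"""

summary = summarize(SAMPLE)
print(summary["guid"], summary["version"], summary["headline"])
```

Note that the HTML elements live in their own namespace, distinct from the NewsML-G2 one, so both prefixes are needed in the path to the paragraphs.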

Controlled vocabularies and qualified codes

Concepts are identified with concept URIs

Documents make use of a number of controlled vocabularies (aka taxonomies) to convey information. In this section, we focus on a specific set of controlled vocabularies called "NewsML-G2 schemes".

A NewsML-G2 scheme associates unambiguous identifiers to "concepts". These identifiers take the form of URIs (Uniform Resources Identifiers [RFC3986]).

For example, in NewsML-G2 a document is usable, withheld or canceled; this is known as the "publishing status":

The status "usable" is identified by http://cv.iptc.org/newscodes/pubstatusg2/usable
The status "withheld" is identified by http://cv.iptc.org/newscodes/pubstatusg2/withheld
The status "canceled" is identified by http://cv.iptc.org/newscodes/pubstatusg2/canceled

These identifiers are called "concept URIs". Together, they form a controlled vocabulary. While they may look like dereferencable HTTP URLs, they do not need to be. Their main purpose is to unambiguously identify various concepts.

A document can contain a pubStatus element that conveys the concept URI identifying its publishing status. Therefore, when you receive a document, you can process this concept URI (e.g., compare it to the three possible values given above) to determine what is the publishing status of the document.

Concept URIs might be represented by QCodes in NewsML-G2 documents

In NewsML-G2 documents, some concept URIs are not directly expressed using the URI syntax. Instead, they are conveyed as QCodes (short for "Qualified Codes"). A QCode is made of two parts separated by a colon. The leftmost part (before the leftmost colon) is called the scheme alias. The part on the right of the leftmost colon is called the code.

QCode structure

In some ways, a QCode can be seen as a compressed form of concept URI (actually it is a bit more than that, as it also identifies the controlled vocabulary the concept URI is part of, but this is an advanced topic that we won't develop further in this documentation). Determining the concept URI a QCode stands for is called resolving the QCode. We'll describe how this operation is to be performed at the end of this section.

Why it is useful to resolve QCodes to concept URIs

When processing NewsML-G2 documents it is useful to resolve QCodes to concept URIs and then to work in terms of concept URIs because QCodes are not universally unambiguous identifiers whereas concept URIs are.

For example, in a given document the publishing status "usable" may be expressed by the following QCode: stat:usable (see it in situ in section Document walk-through). However, in another document the same status might be expressed by the QCode pst:usable. These two QCodes are different but resolve to the same concept URI: http://cv.iptc.org/newscodes/pubstatusg2/usable.

Furthermore, while it does not happen within AFP production, if you consider NewsML-G2 documents in general it is even possible for the QCode stat:usable to express the publishing status "usable" in a given document while expressing something completely different in another document. In that case the resolution process will correctly yield http://cv.iptc.org/newscodes/pubstatusg2/usable in the context of the first document and a different concept URI in the context of the second document.

Important design principle: QCode resolution shields you from QCode-level variations or accidental homonymies and gives you unambiguous identifiers to work with.

What to do if you can't implement QCode resolution with your tool chain

Depending on your tool chain, QCode resolution might be difficult to implement. For example, standard XML tools such as XPath processors can't easily integrate QCode resolution. If you are in such a situation, you can bypass the QCode resolution step and work directly in terms of QCodes when dealing with AFP's production, because we ensure that QCodes are unambiguous in our NewsML-G2 documents (e.g., in all AFP documents the QCode stat:usable will represent the publishing status "usable").
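
For illustration only, working directly at the QCode level could look like the sketch below. It relies on the limited XPath subset of Python's xml.etree.ElementTree; the function name has_pub_status and the sample item are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}


def has_pub_status(news_item, qcode):
    """Return True if the item carries a pubStatus with exactly this QCode."""
    # The QCode is compared literally: no catalog lookup is involved.
    path = "nar:itemMeta/nar:pubStatus[@qcode='%s']" % qcode
    return news_item.find(path, NS) is not None


# Illustrative item, reduced to the element of interest.
SAMPLE = """
<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemMeta>
    <pubStatus qcode="stat:usable"/>
  </itemMeta>
</newsItem>
"""

item = ET.fromstring(SAMPLE)
print(has_pub_status(item, "stat:usable"))    # True
print(has_pub_status(item, "stat:canceled"))  # False
```

The function only reports whether a given QCode is present; what an absent pubStatus element means is left to the NewsML-G2 documentation [G2Doc].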

In this documentation we specify both concept URIs and QCodes wherever needed. Unless specified otherwise, for IPTC standardized NewsML-G2 schemes we use the IPTC recommended QCodes that you can look up in the corresponding IPTC documentation: for example, if you navigate with your Web browser to the resource identified by the concept URI for the publishing status "usable" (you can do it by clicking on this link: http://cv.iptc.org/newscodes/pubstatusg2/usable) you'll see that the IPTC recommended QCode for this publishing status is stat:usable.

When possible, however, it is advisable to resolve QCodes. Doing so has the following benefits:

It will be easier for your system to work with NewsML-G2 documents from other providers (e.g., Reuters, AP, etc.) which use their own sets of QCodes for representing the same concept URIs.
Some tools, APIs or other kinds of services you might want to use operate in terms of concept URIs; you won't be able to interoperate with such tools if you work at the QCode level. For example, the IPTC exposes standard NewsML-G2 controlled vocabularies by means of Web resources whose URIs are the concept URIs of the vocabulary elements: if you know the concept URI of a vocabulary element, then you can fetch information about it from the IPTC servers. If you only know the QCode and can't resolve it to the associated concept URI then you won't be able to fetch this information.

How to perform QCode resolution

The resolution process is described precisely in the NewsML-G2 documentation ([G2Doc]). In short, it consists of resolving the scheme alias part of the QCode to a scheme URI using the catalog information provided in the document at the item level, and then appending the code part of the QCode to that scheme URI. In our example, the QCode stat:usable has a scheme alias stat and a code usable. It is resolved to http://cv.iptc.org/newscodes/pubstatusg2/usable, because the catalog information of the enclosing news item contains the following element:

<scheme alias="stat" uri="http://cv.iptc.org/newscodes/pubstatusg2/"/>

This catalog information can appear inline in the item inside catalog elements, or in an external resource referenced by the item through a catalogRef element, as in the following example borrowed from the section Document walk-through:

<catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_26.xml"/>

Resolving a QCode yields a concept URI that unambiguously identifies a given concept on a global scale. In our example, the concept identified by http://cv.iptc.org/newscodes/pubstatusg2/usable is: the publishing status "usable". In the context of NewsML-G2 schemes, two logically different concepts are never given the same concept URI, even in different systems managed by different organizations.
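
The resolution steps summarized in this section can be sketched in Python as follows. This is a simplified illustration, not a complete implementation of the process described in [G2Doc]: it only reads inline catalog elements, and the function names load_inline_catalog and resolve_qcode are hypothetical. Mappings from catalogs referenced through catalogRef would have to be fetched separately and merged into the same dictionary.

```python
import xml.etree.ElementTree as ET

NAR = "{http://iptc.org/std/nar/2006-10-01/}"


def load_inline_catalog(news_item):
    """Collect alias -> scheme URI mappings from inline catalog elements."""
    aliases = {}
    for scheme in news_item.iterfind(NAR + "catalog/" + NAR + "scheme"):
        aliases[scheme.get("alias")] = scheme.get("uri")
    return aliases


def resolve_qcode(qcode, aliases):
    """Resolve a QCode to a concept URI using the given alias mappings."""
    # Split at the leftmost colon: scheme alias on the left, code on the right.
    alias, code = qcode.split(":", 1)
    # A KeyError here means the alias is not declared in the known catalogs.
    return aliases[alias] + code


# Illustrative item with an inline catalog declaring the "stat" alias.
SAMPLE = """
<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
  <catalog>
    <scheme alias="stat" uri="http://cv.iptc.org/newscodes/pubstatusg2/"/>
  </catalog>
  <itemMeta>
    <pubStatus qcode="stat:usable"/>
  </itemMeta>
</newsItem>
"""

item = ET.fromstring(SAMPLE)
aliases = load_inline_catalog(item)
status = item.find(NAR + "itemMeta/" + NAR + "pubStatus").get("qcode")
print(resolve_qcode(status, aliases))
# prints http://cv.iptc.org/newscodes/pubstatusg2/usable
```

With a different catalog, for instance one mapping the alias pst to the same scheme URI, resolve_qcode("pst:usable", ...) returns the same concept URI, which is the point of resolving before comparing.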

How to read the examples

The following sections of this document are dedicated to answering questions of the form "Where is data X in an AFP NewsML-G2 document (and how can I make use of it)?". For example: "Where is the title of the document?", "Where is the textual content?", "Where is the caption?", "Where is the visual content?" etc.

For each piece of data, XML examples are provided. These examples aren't complete documents, though: they are high-level representations of the format, omitting many aspects and focusing on the data in question.

For instance, here is the example we provide for the "word count" metadata in text documents (the word count gives an estimate of the size of the textual content):

<newsMessage>
    <itemSet>
        <newsItem>
            <contentSet>
                <inlineXML wordcount="450">
                </inlineXML>
            </contentSet>
        </newsItem>
    </itemSet>
</newsMessage>

As you can see, this example omits many elements: contrast it with the example of a complete document provided in section Document walk-through. What you get from it, however, is a sense of where the word count information can be found and what it looks like.
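
One practical consequence, shown in the hedged sketch below, is that the simplified examples leave out namespace declarations, whereas a complete document such as the one in section Document walk-through declares the NewsML-G2 namespace on its root element. Code written against the simplified form must therefore be namespace-aware. The function name word_count is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}


def word_count(news_message):
    """Return the word count of a text document, or None if not provided."""
    inline = news_message.find(
        "nar:itemSet/nar:newsItem/nar:contentSet/nar:inlineXML", NS)
    if inline is None or inline.get("wordcount") is None:
        return None
    return int(inline.get("wordcount"))


# Same structure as the simplified example, with the namespace restored.
SAMPLE = """
<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet>
    <newsItem>
      <contentSet>
        <inlineXML wordcount="450"/>
      </contentSet>
    </newsItem>
  </itemSet>
</newsMessage>
"""

print(word_count(ET.fromstring(SAMPLE)))  # 450
```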

Some examples contain XML comments. For example:

<!-- A subject represented by a QCode  -->
<subject qcode="medtop:20000273"/>

These comments won't appear in real documents; they are annotations specific to this documentation.

Common data

Some data may be present in most types of documents. For example, a creation date or content warning can appear in any document (text, picture, still graphic, animated graphic, video, multimedia, live report, ...). This section details these common data elements. Further sections detail data associated with specific types of documents.

Creators & Contributors

Text, picture, still graphic and video documents: creators and contributors may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <creator role="afpcrrol:writer afpctrol:forbyline">
                    <name>
                        John Doe
                    </name>
                </creator>
                <contributor role="afpctrol:editor afpctrol:validator">
                    <name>
                        Jeanne Dupont
                    </name>
                </contributor>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: creators and contributors to the multimedia document as a whole may be provided in the content metadata section of the main news item. Creators and contributors specific to an individual item may be provided in the content metadata section of that item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- The creators and contributors to the multimedia document as a whole -->
                <creator role="afpcrrol:writer afpctrol:forbyline">
                    <name>
                        John Doe
                    </name>
                </creator>
                <contributor role="afpctrol:forbyline">
                    <name>
                        Jeanne Dupont
                    </name>
                </contributor>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- The creators and contributors specific to this item -->
                <creator role="afpcrrol:photographer afpctrol:forbyline">
                    <name>
                        Al Dente
                    </name>
                </creator>
                <contributor>
                    <name>
                        Annie Mall
                    </name>
                </contributor>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: creators may be provided in the content metadata section of the package item. Note that no contributors are provided in live report indexes.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <creator role="afpcrrol:writer afpctrol:forbyline">
                    <name>
                        John Doe
                    </name>
                </creator>
                <creator role="afpcrrol:writer">
                    <name>
                        Walter Melon
                    </name>
                </creator>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

Creators and contributors may be provided by creator and contributor elements. Creators are persons who created the document or parts of the document. Contributors are persons who modified or enhanced the document or parts of the document. There may be any number of creators and contributors per news item.

For each creator and contributor we provide a name in the name element and optionally a list of roles, in the form of a QCode list, in the role attribute. The table below presents some roles often used in AFP documents.

Creator and contributor roles
Role	QCode	Concept URI
Writer	`afpcrrol:writer`	`http://cv.afp.com/creatorroles/writer`
Photographer	`afpcrrol:photographer`	`http://cv.afp.com/creatorroles/photographer`
Graphic designer	`afpcrrol:graphicDesigner`	`http://cv.afp.com/creatorroles/graphicDesigner`
For byline	`afpctrol:forbyline`	`http://cv.afp.com/contributorroles/forbyline`

Important: The "for byline" role has a special meaning: the names of creators and contributors without this role must not be published. You may use them for internal purposes such as contacting the journalist with questions, but you must not display them publicly in association with the content of the document.
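The byline rule above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name publishable_names is hypothetical, the sketch compares QCodes as literal strings (assuming the scheme alias afpctrol is the one declared in the document's catalog, whereas a robust integration would resolve QCodes to concept URIs first), and it relies on the {*} namespace wildcard of xml.etree.ElementTree (Python 3.8 or later) so that it works whether or not the elements are namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Assumption: the "for byline" role is matched by its literal QCode.
FOR_BYLINE = "afpctrol:forbyline"


def publishable_names(item):
    """Return the names of creators and contributors carrying the "for byline" role."""
    names = []
    for tag in ("creator", "contributor"):
        for party in item.findall(f"./{{*}}contentMeta/{{*}}{tag}"):
            # The role attribute is a whitespace-separated list of QCodes.
            roles = party.get("role", "").split()
            name = (party.findtext("{*}name") or "").strip()
            if FOR_BYLINE in roles and name:
                names.append(name)
    return names


SAMPLE = """
<newsItem>
  <contentMeta>
    <creator role="afpcrrol:writer afpctrol:forbyline">
      <name>
        John Doe
      </name>
    </creator>
    <contributor role="afpctrol:editor afpctrol:validator">
      <name>
        Jeanne Dupont
      </name>
    </contributor>
  </contentMeta>
</newsItem>
"""

print(publishable_names(ET.fromstring(SAMPLE)))
```

With this sample, only John Doe is returned: Jeanne Dupont has no "for byline" role, so her name stays out of anything displayed publicly.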

Content warning

Text, picture, still graphic, video and multimedia documents: a content warning may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <signal qcode="sig:cwarn"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: a content warning may be provided in the item metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <signal qcode="sig:cwarn"/>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

A document may include a warning about its content when it might be perceived as offensive. In such a case, you'll typically want to review the content of the document in order to decide how to use it. This warning takes the form of a signal element with a QCode sig:cwarn resolving to http://cv.iptc.org/newscodes/signal/cwarn.

When a content warning is present, we often provide a set of exclAudience elements that convey the reason(s) for the content warning. For example, in a document whose content contains potentially offensive violence and language:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <signal qcode="sig:cwarn"/>
            </itemMeta>
            <contentMeta>
                <exclAudience qcode="cwarn:violence"/>
                <exclAudience qcode="cwarn:language"/>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

In a live report index, the exclAudience elements are provided in the package item instead of in a news item:

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <signal qcode="sig:cwarn"/>
            </itemMeta>
            <contentMeta>
                <exclAudience qcode="cwarn:violence"/>
                <exclAudience qcode="cwarn:language"/>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

Used in this way, each exclAudience element identifies an audience that may be offended or distressed by a given characteristic of the content (e.g. "violence"). In order to specify these, the IPTC's content warnings vocabulary [IPTCCWarn] must be used.

At the time of this writing we make use of the following content warnings, using the standard IPTC scheme: death, language, nudity, sexuality, violence and suffering.
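As an illustration, the following Python sketch reads the content warning and its reasons from a news item or package item. It is written under assumptions: the function name content_warning is hypothetical, QCodes are compared as literal strings (which presumes the aliases sig and cwarn are bound to the IPTC schemes in the document's catalog), and the {*} wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def content_warning(item):
    """Return (flagged, reasons) for a newsItem or packageItem element."""
    signals = {s.get("qcode") for s in item.findall("./{*}itemMeta/{*}signal")}
    flagged = "sig:cwarn" in signals
    # Assumption: only exclAudience values in the cwarn scheme are warning reasons.
    reasons = [
        e.get("qcode", "")
        for e in item.findall("./{*}contentMeta/{*}exclAudience")
        if e.get("qcode", "").startswith("cwarn:")
    ]
    return flagged, reasons


SAMPLE = """
<newsItem>
  <itemMeta>
    <signal qcode="sig:cwarn"/>
  </itemMeta>
  <contentMeta>
    <exclAudience qcode="cwarn:violence"/>
    <exclAudience qcode="cwarn:language"/>
  </contentMeta>
</newsItem>
"""

flagged, reasons = content_warning(ET.fromstring(SAMPLE))
print(flagged, reasons)
```

A workflow could use the flag to route the document to an editor for review, and the reasons to tell that editor what to look for.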

Correction signal

Text, picture, still graphic, video and multimedia documents: a correction signal may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
   <itemSet>
      <newsItem>
         <itemMeta>
            <signal qcode="sig:correction"/>
         </itemMeta>
      </newsItem>
   </itemSet>
</newsMessage>

One particular type of update that can occur on a document is a correction. A correction occurs when an error has been found in a document and a corrected version is published. In such a case, you receive a new version of the document (i.e., a document with the same guid and a new version number) that contains a correction signal. This signal takes the form of a signal element with a qcode attribute sig:correction resolving to http://cv.iptc.org/newscodes/signal/correction.

Common practice at AFP is to use this mechanism only for corrections of great significance. For example, the correction of a typo that doesn't change the meaning of the news story shall not be marked as a correction but might be issued as a mere update.

When a serious error is found in key information in a document, rendering it unusable as such, the document will usually be canceled instead of corrected. A document is canceled by issuing a version with the "canceled" publishing status, as discussed in section Publishing Status.

The correction signal doesn't provide details about the correction (e.g., what or where was the error, how it has been corrected). Such details will usually be provided in the general editorial note, which is given by an edNote element with a role attribute afpnoteRole:client resolving to http://cv.afp.com/ednoteroles/client (see the section on the general editorial note). For example:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <edNote role="afpnoteRole:client">
                   CORRECTS the first sentence of the answer of the auctioneer, which was incorrectly translated.
                </edNote>
                <signal qcode="sig:correction"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Handling a correction correctly is of paramount importance and can be a complex process (you probably have it in place already). For example, you may want to have someone review the item, along with its previous versions and the editorial note, to understand the error. You may then ensure that this correction is applied to any published material that carries the original error. This may include making sure that recipients of such material are notified and provided with the corrected information.
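The detection step of such a process might look like the sketch below. It only finds the correction signal and the accompanying general editorial note so that they can be shown to a reviewer; everything after that is up to your workflow. The function name correction_note is hypothetical, QCodes are compared as literal strings, and the {*} wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def correction_note(item):
    """Return None if the item is not a correction, else the client note ('' if absent)."""
    meta = item.find("./{*}itemMeta")
    if meta is None:
        return None
    signals = [s.get("qcode") for s in meta.findall("{*}signal")]
    if "sig:correction" not in signals:
        return None
    for note in meta.findall("{*}edNote"):
        if note.get("role") == "afpnoteRole:client":
            # Collapse the indentation and line breaks of the XML source.
            return " ".join("".join(note.itertext()).split())
    return ""


SAMPLE = """
<newsItem>
  <itemMeta>
    <edNote role="afpnoteRole:client">
       CORRECTS the first sentence of the answer of the auctioneer, which was incorrectly translated.
    </edNote>
    <signal qcode="sig:correction"/>
  </itemMeta>
</newsItem>
"""

print(correction_note(ET.fromstring(SAMPLE)))
```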

Dates

Two date formats are used in this specification:

Full date and time. This is the XML Schema dateTime format [XMLSchemaDataTypes] with the added constraint that the time zone must be specified.
Truncated date and time. This format is based on the XML Schema dateTime format but allows for truncated forms (e.g. "2014-08"). The description of this format is provided in the NewsML-G2 specification as follows:
The date has an optional time part: it is optionally possible to omit one to many less significant components, from right to left. “From right to left” means starting from the least significant component (i.e., fraction of a second) and to continue with the full time part, the day part and the month part. The year part MUST NOT be omitted. If the time part is present the time zone SHOULD NOT be omitted.

In addition to the description provided below, you should refer to the NewsML-G2 specification for information on the processing model for these dates.
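To make the two formats concrete, here is a small Python sketch. It is a simplification under stated assumptions, not a full implementation of the NewsML-G2 processing model: parse_full and precision are hypothetical helper names, parse_full relies on datetime.fromisoformat (which, before Python 3.11, does not accept every lexical form allowed by XML Schema), and precision only reports which components are present without validating them.

```python
import re
from datetime import datetime


def parse_full(value):
    """Parse a full date and time, rejecting values without a time zone."""
    text = value.strip()
    if text.endswith("Z"):
        # Older Python versions do not accept the "Z" suffix for UTC.
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError(f"time zone missing in {value!r}")
    return parsed


# Year, then optional month, day and time part, truncated from the right.
_TRUNCATED = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2})(T.+)?)?)?")


def precision(value):
    """Tell how far a truncated date and time goes: year, month, day or time."""
    match = _TRUNCATED.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a truncated date and time: {value!r}")
    _year, month, day, time_part = match.groups()
    if time_part:
        return "time"
    if day:
        return "day"
    if month:
        return "month"
    return "year"


print(parse_full("2009-02-23T20:44:07+02:00").isoformat())
print(precision("2014-08"), precision("2009-02-22"))
```

Keeping the precision alongside the value lets you avoid displaying a day or a time that the document never stated.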

Document transmission date

All documents: the transmission date of the document is provided in the header of the news message.

<newsMessage>
    <header>
        <sent>2009-02-23T20:44:07+02:00</sent>
    </header>
</newsMessage>

The transmission date is provided by the sent element. It is always present and uses the full date and time format. The transmission date indicates when the document was transmitted from AFP to your system.

Document creation date

Text, picture, still graphic, video and multimedia documents: the creation date of the NewsML-G2 document may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <firstCreated>2009-02-23T18:22:08+02:00</firstCreated>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the creation date of the NewsML-G2 document may be provided in the item metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <firstCreated>2009-02-23T18:22:08+02:00</firstCreated>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

If present, the creation date of the NewsML-G2 document is provided by a firstCreated element in the full date and time format. This creation date specifies when the NewsML-G2 document was created (contrast this with the content creation date, which specifies when some content was created; e.g., when a given photo was shot). When a new version of the document is emitted, the creation date of the document isn't modified, but the version creation date is.

Document version creation date

Text, picture, still graphic, video and multimedia documents: the creation date of this version of the NewsML-G2 document is provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <versionCreated>2009-02-23T20:43:00+02:00</versionCreated>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the creation date of this version of the NewsML-G2 document is provided in the item metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <versionCreated>2009-02-23T20:43:00+02:00</versionCreated>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

The creation date of this version of the NewsML-G2 document is provided by a versionCreated element in the full date and time format. This date information is always present in documents.

Content creation date

The content creation date is the date of creation of the main journalistic content associated with the NewsML-G2 document. For a photo, this is the date of shooting, except for a photo combo where we provide the date at which the combo was produced. Likewise, for live video footage, this is a date at which the covered event was occurring. For other types of content (e.g., video report, graphic) this is typically the date at which the content was produced.

Picture, still graphic, animated graphic and video documents

The content creation date may be provided by a contentCreated element in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <contentCreated>2009-02-23T17:31:00+02:00</contentCreated>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents

The creation date of a specific picture, still graphic or video component may be provided in the content metadata section of the corresponding item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- This is the content creation date for this item -->
                <contentCreated>2009-02-23T17:31:00+02:00</contentCreated>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- This is the content creation date for this other item -->
                <contentCreated>2009-02-22</contentCreated>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

While content creation dates may be provided for components, none is provided for the multimedia document itself. The version creation date of the document often provides a good approximation. However, this might not be the case for all documents, so you should adopt this heuristic approach only if your usage of this date can support a "right most of the time" situation.

Text documents

As with multimedia documents, no content creation date is provided. The version creation date of the document often provides a good approximation. However, this might not be the case for all documents, so you should adopt this heuristic approach only if your usage of this date can support a "right most of the time" situation.

Live report indexes

No content creation date is provided for live report indexes.

Embargo

Text, picture, still graphic, video and multimedia documents: embargo information is provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <embargoed/>
                <edNote role="afpnoteRole:embargo">
                    Embargoed until end of first auction day
                </edNote>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Embargo information is specified through the embargoed element, which can be completed by an edNote element with a role attribute afpnoteRole:embargo resolving to http://cv.afp.com/ednoteroles/embargo.

Embargo-wise, an AFP document can have one of the three statuses described in the table below.

Embargo statuses
Embargoed	Representation	Example
No	No `embargoed` element.	N/A
Until given date and time	An `embargoed` element providing the date and time at which the embargo ends.	`<embargoed> 2009-02-23T21:00:00+02:00 </embargoed>`
Under other provided conditions	An empty `embargoed` element and an embargo editorial note specifying the embargo conditions. This form is used when the precise date and time at which the embargo expires is not known. Note that if the conditions are made of a date and time and additional conditions, all these conditions are expressed in the editorial note (i.e., the date and time aren't provided inside the `embargoed` element, but as part of the editorial note).	`<embargoed/> <edNote role="afpnoteRole:embargo"> Embargoed until end of first auction day </edNote>`

See the NewsML-G2 specification for more information on the representation and processing model of embargo information.

For multimedia documents, the way embargo information is conveyed differs from standard NewsML-G2. In NewsML-G2 each G2 item carries its own embargo information, and a G2 item without an embargoed element is defined as not embargoed. In AFP's multimedia documents the only embargoed element to consider is that of the main item. The embargoed elements of non-main items must be ignored. You must process multimedia documents in a way that applies embargo directives provided in the main news item to the entire content of the document (i.e., to all items in the document).
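The three embargo statuses and the multimedia rule can be sketched as follows. This is an illustration under assumptions: the names governing_item and embargo_status are hypothetical, the main item is recognized by the link to http://cv.afp.com/itemnatures/mmdMainComp shown in the multimedia examples of this guide, QCodes are compared as literal strings, live report indexes (package items) are left out, and the {*} wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

MAIN_ITEM = "http://cv.afp.com/itemnatures/mmdMainComp"


def governing_item(message):
    """Return the news item whose embargo applies to the whole document."""
    items = message.findall("./{*}itemSet/{*}newsItem")
    for item in items:
        for link in item.findall("./{*}itemMeta/{*}link"):
            if link.get("href") == MAIN_ITEM:
                return item
    # Not a multimedia document: fall back to the first (usually only) item.
    return items[0] if items else None


def embargo_status(item):
    """Return ('none', None), ('until', datetime) or ('conditional', note text)."""
    meta = item.find("./{*}itemMeta")
    embargoed = meta.find("{*}embargoed") if meta is not None else None
    if embargoed is None:
        return ("none", None)
    text = (embargoed.text or "").strip()
    if text:
        return ("until", datetime.fromisoformat(text))
    note = ""
    for ed_note in meta.findall("{*}edNote"):
        if ed_note.get("role") == "afpnoteRole:embargo":
            note = " ".join("".join(ed_note.itertext()).split())
    return ("conditional", note)


SAMPLE = """
<newsMessage>
  <itemSet>
    <newsItem>
      <itemMeta/>
    </newsItem>
    <newsItem>
      <itemMeta>
        <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
        <embargoed/>
        <edNote role="afpnoteRole:embargo">
          Embargoed until end of first auction day
        </edNote>
      </itemMeta>
    </newsItem>
  </itemSet>
</newsMessage>
"""

main = governing_item(ET.fromstring(SAMPLE))
print(embargo_status(main))
```

In this sample the first item carries no embargoed element, yet the whole document is embargoed because the main item says so. A "conditional" result cannot be lifted automatically and needs a human decision based on the note.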

Event identifiers

Text, picture, still graphic, video and multimedia documents: multiple event identifiers may be provided by subject elements in the content metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <subject qcode="QCode identifying an event" type="cpnat:event">
                    <name>
                        Auction for the Yves Saint Laurent and Pierre Bergé collection
                    </name>
                </subject>       
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: only one event identifier may be provided by a subject element in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <subject qcode="QCode identifying an event" type="cpnat:event">
                    <name>
                        Auction for the Yves Saint Laurent and Pierre Bergé collection
                    </name>
                </subject>       
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

The news coverage of an event often spans multiple NewsML-G2 documents. For example the auction for the Yves Saint Laurent and Pierre Bergé collection may be covered by two news stories (one announcing the event and one reporting on the event later on), two interview transcripts (one with Pierre Bergé and one with a Christie's representative), a multimedia document, a video report and a number of pictures of the event. It might be interesting for you to know that all these documents are about the same event. For example, it might help your editorial team to access all the documents available about the event. Another example: if you operate a Web site publishing news you could use this knowledge to automatically provide links to related content.

To let you know that multiple NewsML-G2 documents relate to the same event, AFP creates unique event identifiers and inserts them into documents. For example, a unique event identifier is assigned to the auction for the Yves Saint Laurent and Pierre Bergé collection, and each related document contains this identifier.

Different NewsML-G2 documents covering the same event

An event identifier is the concept URI of a subject element whose type attribute, the QCode cpnat:event, resolves to http://cv.iptc.org/newscodes/cpnature/event. It is conveyed by the qcode attribute.

In addition to event identifiers we provide, whenever possible, the names of the events. An event name provides a short description of the event in natural language. The name is provided by a name element inside the subject element.

See the section on subjects for more information about the subject element.

When a document covers multiple events it might contain multiple event identifiers, as shown in the example below:

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <subject qcode="QCode identifying an event" type="cpnat:event">
                    <name>Name of this event</name>
                </subject>       
                <subject qcode="QCode identifying another event" type="cpnat:event">
                    <name>Name of this other event</name>
                </subject>                
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Why are event identifiers provided using <subject> elements?

This is because events covered by a document are also subject matter of the document: things the document is about. Hence it is appropriate to convey their identifiers using the NewsML-G2 <subject> elements, along with other subjects of the documents. This allows them to be generically processed like any other subjects when that makes sense, or to be processed specifically as event identifiers when needed, thanks to the type attribute which marks them as such.
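The sketch below shows one way to single out event identifiers among subjects and to group documents by event. The helper names, the guids and the afpevent scheme alias used in the sample are hypothetical. The sketch compares QCodes as literal strings; since an event identifier is really a concept URI, a robust integration would resolve QCodes against each document's catalog before comparing them across documents. The {*} wildcard requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def event_ids(item):
    """Return the QCodes of the subjects marked as events."""
    return [
        subject.get("qcode")
        for subject in item.findall("./{*}contentMeta/{*}subject")
        if subject.get("type") == "cpnat:event"
    ]


def index_by_event(items):
    """Map each event identifier to the guids of the items covering it."""
    index = defaultdict(list)
    for item in items:
        for qcode in event_ids(item):
            index[qcode].append(item.get("guid"))
    return dict(index)


# Hypothetical guids and event QCodes, for illustration only.
STORY = ET.fromstring("""
<newsItem guid="http://d.afp.com/AAA1">
  <contentMeta>
    <subject qcode="medtop:20000273"/>
    <subject qcode="afpevent:1234" type="cpnat:event">
      <name>Name of this event</name>
    </subject>
  </contentMeta>
</newsItem>
""")
PHOTO = ET.fromstring("""
<newsItem guid="http://d.afp.com/BBB2">
  <contentMeta>
    <subject qcode="afpevent:1234" type="cpnat:event"/>
    <subject qcode="afpevent:5678" type="cpnat:event"/>
  </contentMeta>
</newsItem>
""")

print(index_by_event([STORY, PHOTO]))
```

Such an index is what a news website could use to link a story to the pictures and videos covering the same event.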

General editorial note

Text, picture, still graphic, video and multimedia documents: a general editorial note may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <edNote role="afpnoteRole:client">
                    Original source is unknown and unverified. This photo was posted on twitter.
                    Following an official ban in San Theodoros on foreign media outlets covering
                    demonstrations, AFP is using pictures from other sources.
                </edNote>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

The general editorial note provides some text in natural language addressed to the editorial people in your team receiving and processing the item. It can provide instructions or hints on how to handle the document, information about the nature of a correction (see example in the section on correction signal), excluded audience/usage, additional information about the content, etc. It is not intended for publication.

There is at most one general editorial note in a document. If present, it is provided by an edNote element whose role attribute, the QCode afpnoteRole:client, resolves to http://cv.afp.com/ednoteroles/client. Note that while NewsML-G2 allows for rich text by using some markup in the content of an editorial note, AFP's systems only output simple textual content not interspersed with markup.

The general editorial note is often used to express usage restrictions, as in the following example:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <edNote role="afpnoteRole:client">
                    EDITORIAL USE ONLY
                    NO MARKETING NO ADVERTISING CAMPAIGNS
                    NO ARCHIVE
                </edNote>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

The following table provides examples of common usage restrictions you might find in picture documents.

Examples of usage restrictions conveyed by the general editorial note
Phrase inside the general editorial note	Comment
RESTRICTED TO EDITORIAL USE	The picture can be used only by media outlets for news purposes (newspapers, magazines, radios, TVs, news websites and mobile news services...)
NO MARKETING NO ADVERTISING CAMPAIGNS	The picture cannot be used for advertising or marketing.
NO INTERNET	The picture cannot be published on Internet websites.
NO MOBILE	The picture cannot be used by mobile services.
NO ARCHIVE	The picture cannot be archived.
MANDATORY USE WITH AFP STORY	The handout picture shall be published with the corresponding AFP story only (this mention is only available for handouts).
TO BE USED WITHIN XX DAYS FROM XX/XX/XXXX	The picture cannot be used outside of the specified timeframe.
NO VIDEO EMULATION	The picture cannot be used in a sequence of pictures to simulate a video.

Genres

Text, picture, still graphic and video documents: genres of the document may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- A genre represented by a QCode and associated with a rank -->
                <genre rank="1" qcode="afpattribute:Interview"/>
                
                <!-- A genre represented by a QCode and a name and associated with a rank -->
                <genre rank="2" qcode="afpedtype:VideoWithTitling">
                    <name>Titling</name>
                </genre>  
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: genres of the document as a whole may be provided in the content metadata section of the main news item. Genres specific to a non-main item may be provided in the content metadata section of this item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- This genre is in the main news item:
                     it applies to the document as a whole -->
                <genre rank="1" qcode="afpattribute:Interview">
                    <name>Interview</name>
                </genre> 
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- This genre only qualifies this item -->
                <genre rank="1" qcode="afpattribute:Profile">
                    <name>Profile</name>
                </genre> 
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Genres of a document, and of individual items in the case of multimedia documents, may be provided by genre elements. Each genre element describes a nature or a style of the content (e.g., an intellectual or journalistic form). There may be multiple genre elements per item, as a given item may be at the intersection of multiple genres.

In AFP documents, a genre is specified by a QCode, optionally completed by a natural language name.

Genres often used in AFP documents are defined in the schemes http://ref.afp.com/attributes/ (scheme alias afpattribute) and http://ref.afp.com/editorialtypes/ (scheme alias afpedtype).

The name child element, if present, provides a natural language name for the genre.

Identifier and version number

Text, picture, still graphic, video and multimedia documents: the document identifier is provided in the news item (for multimedia documents: in the main news item). A version number may be present too.

<newsMessage>
    <itemSet>
        <newsItem guid="http://d.afp.com/MM48X" version="5">
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the document identifier is provided in the package item. A version number may be present too.

<newsMessage>
    <itemSet>
        <packageItem guid="http://d.afp.com/MM48X" version="5">
        </packageItem>
    </itemSet>
</newsMessage>

A document is a set of information carrying some journalistic content and associated metadata. As news stories develop or corrections are made, new versions of the document are published.

Each NewsML-G2 document has a global unique identifier (guid), which is provided by the guid attribute of a newsItem or packageItem element. A guid is a character string. It is designed to be globally unique among all NewsML-G2 documents, past and future. This guid makes it possible to identify a document as it moves through the news workflow and is transferred/duplicated from place to place and from system to system. It is also used as a basis for an updating mechanism: an update is carried on by sending you a new version of a document identified by a given guid (i.e., the original and the new version share the same guid).

In AFP's NewsML-G2 documents, guids can take multiple forms. Examples include URIs in the http scheme, URNs in the namespace "newsml" [RFC3085bis] or AFP UNOs (a format more or less equivalent to IIM UNO).

Note: most AFP GUIDs look like plain URLs, for example: http://doc.afp.com/11N38S. However, they are actually non-dereferenceable URIs and their only purpose is to serve as identifiers.

From a technical point of view, given two representations of some journalistic content in NewsML-G2, the guid is what tells whether these two representations are those of the same document (possibly different versions of it): same guids means same document, different guids means different documents.

When integrating AFP's NewsML-G2 production into your information system you'll often need to compare guids. For example, when receiving a document from AFP you'll want to check if you already received some version of this document in the past, an action you'll perform by looking in your system for a document with the same guid.

A version number may be provided by a version attribute in the form of an XML Schema positive integer. It identifies the version of the document. The first time you receive a given document (i.e., a document identified by a given guid), this document isn't necessarily in its first version. That is, the version number of a document you receive for the first time may be greater than 1. The version number is incremented by 1 or more each time the document is updated. If no version attribute is present, you must assume that the document is in version 1 (i.e., first version).

How should a new version of a document be dealt with?

The answer is given by the NewsML-G2 documentation:

In the absence of any specific instructions from the provider, a "usable" item [cf. section on publishing status] should be regarded as replacing any previous version of the item with the same GUID. In practice, a provider is likely to provide some supplementary information in the form of a human-readable <edNote> [cf. section on general editorial note] which can be displayed to inform recipients of the reason for the update.

Often, new versions are issued to enrich previous ones with additional information, especially as stories develop in real time. Sometimes, however, a new version is meant to correct some error found in a previous version. In such a case you may want to take some additional actions, as erroneous material may already have been published. Such correction-conveying versions are specifically tagged using a correction <signal>. For more information on this topic see the section on correction signal.
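The guid comparison and version handling described in this section can be sketched as below. It is a minimal illustration with hypothetical names (identity, ingest) and an in-memory dictionary standing in for your document store. Two choices in it are assumptions rather than requirements of this guide: a version that is not newer than the stored one is ignored, and the publishing status is not examined.

```python
import xml.etree.ElementTree as ET


def identity(item):
    """Return (guid, version); a missing version attribute means version 1."""
    return item.get("guid"), int(item.get("version", "1"))


def ingest(store, item):
    """Record the item in store if it is new or newer; return True if it was kept."""
    guid, version = identity(item)
    known = store.get(guid)
    # Assumption: an older or identical version arriving late is ignored.
    if known is not None and known >= version:
        return False
    store[guid] = version
    return True


store = {}
first = ET.fromstring('<newsItem guid="http://d.afp.com/MM48X" version="5"/>')
update = ET.fromstring('<newsItem guid="http://d.afp.com/MM48X" version="7"/>')

print(ingest(store, first))   # first reception, although not version 1
print(ingest(store, update))  # newer version replaces the previous one
print(ingest(store, first))   # stale version, ignored
print(store)
```

Note that the first version you receive need not be version 1, and that version numbers may skip values, so the comparison is "greater than", never "equal to the stored version plus one".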

Information sources

Text, picture, still graphic and video documents: information sources may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>                
                <!-- An information source represented by a name and a role -->
                <infoSource role="isrol:origcont">
                    <name>AP</name>
                </infoSource>
                            
                <!-- An information source represented by a QCode, a name and a role -->
                <infoSource qcode="afpsource:2648" role="isrol:origcont">
                    <name>CHRISTIE'S</name>
                </infoSource>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: information sources may be provided in the content metadata sections of the news items. When an information source appears in a news item which is not the main one, it describes an information source for the content of this item. When an information source appears in the main news item, it should be considered as an information source of the "document", with no indication of the specific part of the content it is associated with (if any).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- This information source is in the main news item: it is an information source of the document -->
                <infoSource role="isrol:origcont">
                    <name>AP</name>
                </infoSource>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- This information source is specific to this item -->
                <infoSource qcode="afpsource:2648" role="isrol:origcont">
                    <name>Business Wire</name>
                </infoSource> 
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Information sources of a document, and of individual items in multimedia documents, may be provided by infoSource elements.

In AFP NewsML-G2 documents, an information source is a party (person or organization) which originated, distributed, aggregated or supplied the content. For example, in a document created/published by AFP but reusing content provided by Business Wire, this source (i.e., Business Wire) will appear in an infoSource element.

In AFP documents, an information source is specified by either:

A URI expressed as a QCode, optionally complemented by a natural language name.
A natural language name.

The URI space used to specify information sources through QCodes is open and can evolve over time.

The name child element, if present, provides a natural language name for the information source.

The role attribute carries a QCode that specifies the role of the information source. AFP documents use the role "Content originator" whose QCode is isrol:origcont and whose concept URI is http://cv.iptc.org/newscodes/infosourcerole/origcont.
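As an illustration, the following minimal Python sketch reads information sources from a content metadata section. It is not an AFP-provided implementation: it assumes the standard library xml.etree.ElementTree parser, matches elements by local name (real documents declare the NewsML-G2 namespace, which the fragment omits just like the examples above), and the helper names local_name and info_sources are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the examples above (namespace declarations omitted).
SAMPLE = """
<contentMeta>
    <infoSource role="isrol:origcont">
        <name>AP</name>
    </infoSource>
    <infoSource qcode="afpsource:2648" role="isrol:origcont">
        <name>CHRISTIE'S</name>
    </infoSource>
</contentMeta>
"""


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def info_sources(content_meta):
    """List the information sources found directly under a contentMeta element."""
    sources = []
    for child in content_meta:
        if local_name(child) != "infoSource":
            continue
        names = [c.text.strip() for c in child if local_name(c) == "name" and c.text]
        sources.append({
            "qcode": child.get("qcode"),          # None when only a name is given
            "name": names[0] if names else None,  # None when only a QCode is given
            "role": child.get("role"),
        })
    return sources


sources = info_sources(ET.fromstring(SAMPLE))
for source in sources:
    # Prefer the QCode as a stable key and fall back to the name.
    print(source["qcode"] or source["name"], source["role"])
```

Because a source may be identified by a name only, a consumer that needs a stable key should be prepared to fall back to the name, as the loop above does.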

Keywords

Text, picture, still graphic and video documents: keywords may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <keyword>culture</keyword>
                <keyword>arts</keyword>
                <keyword>fashion</keyword>
                <keyword>auction</keyword>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: keywords of the document as a whole may be provided in the content metadata section of the main news item. Keywords specific to an individual item may be provided by the content metadata section of that item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- These keywords are in the main news item: 
                     they are associated with the document as a whole -->
                <keyword>culture</keyword>
                <keyword>arts</keyword>
                <keyword>fashion</keyword>
                <keyword>auction</keyword>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- These keywords are specifically associated with this news item -->
                <keyword>people</keyword>
                <keyword>money</keyword>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: keywords may be provided in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <keyword>culture</keyword>
                <keyword>arts</keyword>
                <keyword>fashion</keyword>
                <keyword>auction</keyword>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

Keywords are defined by NewsML-G2 as "free-text terms to be used for indexing or finding the content by text-based search engines".

If present, keywords are provided by keyword elements.

Some keywords may have a refined role, expressed by a role attribute. The value of this attribute is a QCode. Currently we may issue the QCode afpkrole:tagWeb, which resolves to http://cv.afp.com/keywordroles/tagWeb. For example:

<keyword role="afpkrole:tagWeb">culture</keyword>

Keywords with a http://cv.afp.com/keywordroles/tagWeb role are meant to be used to compute tag clouds [TagClouds].
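A possible way to collect keywords and to count the ones meant for tag clouds is sketched below in Python. This is only an illustration under stated assumptions: the two contentMeta fragments are hypothetical, elements are matched by local name, and the role is compared as a literal QCode (afpkrole:tagWeb) rather than resolved to its concept URI.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TAG_WEB = "afpkrole:tagWeb"  # QCode quoted in this section

# Two hypothetical documents reduced to their contentMeta element.
DOCUMENTS = [
    """<contentMeta>
        <keyword role="afpkrole:tagWeb">culture</keyword>
        <keyword>arts</keyword>
        <keyword role="afpkrole:tagWeb">auction</keyword>
    </contentMeta>""",
    """<contentMeta>
        <keyword role="afpkrole:tagWeb">auction</keyword>
        <keyword>fashion</keyword>
    </contentMeta>""",
]


def keywords(content_meta, role=None):
    """Return keyword texts; when role is given, keep only keywords with that role."""
    found = []
    for child in content_meta:
        if child.tag.rsplit("}", 1)[-1] != "keyword" or not child.text:
            continue
        if role is None or child.get("role") == role:
            found.append(child.text.strip())
    return found


# Count tagWeb keywords across documents: the counts can drive tag sizes in a cloud.
cloud = Counter()
for xml_text in DOCUMENTS:
    cloud.update(keywords(ET.fromstring(xml_text), role=TAG_WEB))

print(cloud.most_common())
```

Calling keywords without a role returns every keyword of the section, which is the behaviour a plain text-search index would typically want.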

Language of the content

Text, picture, still graphic and video documents: the language of the content may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <language tag="en"/>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: the language of the content may be provided in the content metadata section of each news item.

<newsMessage>
    <itemSet>
        <!-- An item whose content is in English -->
        <newsItem>
            <contentMeta>
                <language tag="en"/>
            </contentMeta>
        </newsItem>
        
        <!-- An item whose content is in French -->
        <newsItem>
            <contentMeta>
                <language tag="fr"/>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

The tag attribute of the language element carries a BCP 47 language tag [RFC5646] that specifies the main language of the content. The content is what is provided inline or linked to by the content set (i.e., the contentSet element). For example, in a text document this attribute specifies the main language the textual content is written in, and in a video document it typically specifies the main language used in the soundtrack.
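The Python sketch below shows one way to read the language of the content for each item of a multimedia document. It is a minimal illustration, not a reference implementation: the guid values are placeholders, elements are matched by local name, and the reduction of a tag to its lower-cased primary subtag is a design choice of this sketch (BCP 47 tags are case-insensitive and may carry additional subtags such as a region).

```python
import xml.etree.ElementTree as ET

# Hypothetical multimedia fragment; the guid values are placeholders.
SAMPLE = """<itemSet>
    <newsItem guid="item-1"><contentMeta><language tag="en"/></contentMeta></newsItem>
    <newsItem guid="item-2"><contentMeta><language tag="fr"/></contentMeta></newsItem>
    <newsItem guid="item-3"><contentMeta/></newsItem>
</itemSet>"""


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def content_language(news_item):
    """Return the BCP 47 tag from contentMeta/language, or None when it is absent."""
    for meta in news_item:
        if local_name(meta) != "contentMeta":
            continue
        for child in meta:
            if local_name(child) == "language":
                return child.get("tag")
    return None


def primary_subtag(tag):
    """Lower-cased primary language subtag, e.g. 'en' for 'en-GB'."""
    return tag.split("-", 1)[0].lower() if tag else None


languages = {item.get("guid"): primary_subtag(content_language(item))
             for item in ET.fromstring(SAMPLE)}
print(languages)
```

Since the language of the content may be omitted, the sketch returns None for such items instead of guessing a default.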

The main languages used by AFP, along with their BCP 47 tags, are shown in the table below.

Main languages in AFP production
Language	BCP 47 tag
Arabic	ar
English	en
French	fr
German	de
Portuguese	pt
Spanish	es

Language of metadata

Text, picture, still graphic and video documents: the language of metadata is specified by the news item.

<newsMessage>
    <itemSet>
        <newsItem xml:lang="en">
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: the language of metadata is specified by each news item.

<newsMessage>
    <itemSet>
        <newsItem xml:lang="en">
        </newsItem>
        <newsItem xml:lang="en">
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the language of metadata is specified by the package item.

<newsMessage>
    <itemSet>
        <packageItem xml:lang="en">
        </packageItem>
    </itemSet>
</newsMessage>

The xml:lang attribute carries a BCP 47 language tag [RFC5646] that specifies the main language of the metadata (e.g., titles, subject names, captions, etc.) provided by the item.

In a multimedia document, this attribute has the same value in every news item of the document (i.e., in a given document, all items use the same language for metadata).

Important design principle: In an AFP NewsML-G2 document, metadata is provided in a single language, with exceptions for a few elements. When some news content is of global interest we often provide metadata in multiple languages: in this case we do so by issuing multiple NewsML-G2 documents (e.g., one with metadata in French, another one with metadata in English, etc.). These are different documents: each one has its own GUID and lifecycle (see section on document identifiers).

The main languages used by AFP, along with their BCP 47 tags, are shown in the table below.

Main languages in AFP production
Language	BCP 47 tag
Arabic	ar
English	en
French	fr
German	de
Portuguese	pt
Spanish	es

While most metadata in a NewsML-G2 document uses the language specified by the xml:lang attribute of the item element as shown in the examples above, there may be exceptions for a few elements. For example, in a video document the original transcription of some speech is typically provided in the original language that was actually used by the speaker(s), which may differ from the main language of metadata. Whenever possible, the language for such metadata is provided by an xml:lang attribute on the XML element conveying the metadata in question.

The example below shows a document whose main language of metadata is English but whose "transcription" metadata is in French.

<newsMessage>
    <itemSet>
        <newsItem xml:lang="en">
            <partMeta>
                <description role="afpdescRole:contentDescription">
                    Pierre Bergé speaks about the auction. 
                </description>
                <description xml:lang="fr" role="afpdescRole:transcription">
                    C’est le jour ou le dernier objet sera passé sous le marteau d'un commissaire priseur
                    que à mon sens – a mon sens - cette collection pourra écrire le mot fin.
                </description>
            </partMeta>
        </newsItem>
    </itemSet>
</newsMessage>
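The way an element-level xml:lang overrides the language declared on the item can be sketched in Python as follows. This is an illustration under assumptions: the description texts are placeholders, elements are matched by local name, and the function name descriptions_with_language is hypothetical. The standard library parser exposes xml:lang under the XML namespace URI, and it does not keep parent links, so the inherited language is passed down explicitly.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<newsItem xml:lang="en">
    <partMeta>
        <description role="afpdescRole:contentDescription">Placeholder description.</description>
        <description xml:lang="fr" role="afpdescRole:transcription">Placeholder transcription.</description>
    </partMeta>
</newsItem>"""


def descriptions_with_language(element, inherited=None):
    """Yield (role, language) for each description, honouring xml:lang overrides."""
    # An xml:lang on the element itself wins over the value inherited from ancestors.
    language = element.get(XML_LANG, inherited)
    if element.tag.rsplit("}", 1)[-1] == "description":
        yield element.get("role"), language
    for child in element:
        yield from descriptions_with_language(child, language)


result = list(descriptions_with_language(ET.fromstring(SAMPLE)))
print(result)
```

With the fragment above, the content description inherits "en" from the news item while the transcription reports "fr".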

Locations

AFP's NewsML-G2 documents can convey information about locations. We establish a distinction between locations from which the content originates (e.g., the place where a news story was written) and locations that are subject matter of the content. These two kinds of locations are conveyed by different means, as described in the following sections.

Locations may be typed, using a type attribute. The following types are used in AFP documents:

Types of locations
Type	Description	QCode	Concept URI
Geopolitical area	In AFP documents, it is a generic type that may be used for any kind of location. It merely informs that the associated element represents a location.	`cpnat:geoArea`	`http://cv.iptc.org/newscodes/cpnature/geoArea`
Point of interest	In AFP documents, this type is used for locations that cannot be classified as cities, country areas or countries. For instance the Eiffel Tower and the White House will be typed as points of interest, as well as the Sherwood forest or a random building. Note that this may diverge a bit from NewsML-G2 standard usage, where areas such as forests, ponds, hills, streets or random places are not usually classified as points of interest.	`cpnat:poi`	`http://cv.iptc.org/newscodes/cpnature/poi`
City	Informs that the associated element represents a city.	`loctyp:City`	`http://cv.iptc.org/newscodes/location/City`
Country area	In AFP documents it is typically used for areas such as provinces, states or other areas that may contain multiple cities but which pertain themselves to countries.	`loctyp:CountryArea`	`http://cv.iptc.org/newscodes/location/CountryArea`
Country	Informs that the associated element represents a country.	`loctyp:Country`	`http://cv.iptc.org/newscodes/location/Country`

Locations from which the content originates (aka datelines)

Text, picture, still graphic and video documents: the locations from which the content originates are provided in the content metadata section of the news item (in the following example only one location is provided).

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <located qcode="afplocation:281108" type="cpnat:poi">
                    <name>White House</name>
                    <related qcode="afplocation:6666" rel="skos:broader" type="loctyp:City">
                        <name>Washington</name>
                        <related qcode="afplocation:1149" rel="skos:broader" type="loctyp:CountryArea"/>
                    </related>
                    <related qcode="afplocation:1149" type="loctyp:CountryArea">
                        <name>District of Columbia</name>
                        <related qcode="afplocation:206" rel="skos:broader" type="loctyp:Country"/>
                    </related>
                    <related qcode="afplocation:206" type="loctyp:Country">
                        <name>United States</name>
                        <related qcode="iso3166-1a3:USA" rel="skos:exactMatch"/>
                    </related>
                    <POIDetails>
                        <position latitude="38.89761" longitude="-77.03637"/>
                    </POIDetails>
                </located>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: for news items in the document, the locations from which the content of the item originates may be provided in the content metadata section of the item (in the following example only one location per item is provided).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- Location from which the content of the main news item originates -->
                <located qcode="afplocation:2500" type="loctyp:City">
                    <name>Paris</name>
                    <related qcode="afplocation:67" rel="skos:broader" type="loctyp:Country">
                        <name>France</name>
                        <related qcode="iso3166-1a3:FRA" rel="skos:exactMatch" /> 
                    </related>
                    <geoAreaDetails>
                        <position latitude="48.85341" longitude="2.34121" /> 
                    </geoAreaDetails>
                </located>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- Location from which the content of this news item originates -->
                <located qcode="afplocation:2613" type="loctyp:City">
                    <name>Marseille</name>
                    <related qcode="afplocation:719" rel="skos:broader" type="loctyp:CountryArea">
                        <name>Bouches-du-Rhône</name>
                        <related qcode="afplocation:67" rel="skos:broader" type="loctyp:Country"/>
                    </related>
                    <related qcode="afplocation:67" type="loctyp:Country">
                        <name>France</name>
                        <related qcode="iso3166-1a3:FRA" rel="skos:exactMatch"/>
                    </related>
                    <geoAreaDetails>
                        <position latitude="43.29695" longitude="5.38107"/>
                    </geoAreaDetails>
                </located>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the location from which the content originates is provided in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <located qcode="afplocation:6666" type="loctyp:City">
                    <name>Washington</name>
                    <related qcode="afplocation:1149" rel="skos:broader" type="loctyp:CountryArea">
                        <name>District of Columbia</name>
                        <related qcode="afplocation:206" rel="skos:broader" type="loctyp:Country"/>
                    </related>
                    <related qcode="afplocation:206" type="loctyp:Country">
                        <name>United States</name>
                        <related qcode="iso3166-1a3:USA" rel="skos:exactMatch"/>
                    </related>
                    <geoAreaDetails>
                        <position latitude="38.89511" longitude="-77.03637"/>
                    </geoAreaDetails>
                </located>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

In AFP NewsML-G2 documents, located elements specify the geographical origin of the editorial content conveyed by the <contentSet> of a news item: the text of a news story, the JPEG renditions of a picture document, etc. For live reports, the located element specifies the geographical origin of the live report. There is always at least one location provided per item.

Locations from which the content originates are not necessarily the locations the content is about. For example, a news story about an event taking place in Paris may be written in London; in such a case the city of London may be specified as the location from which the content originates. The locations the content is about are conveyed in another part of the document, as described in section "Locations that are subject matter of the document".

There are some subtleties about what "locations from which the content originates" means depending on the nature of the content; we discuss them in the table below. Note that the policy described here is specific to AFP. Other conventions might be in place at other news providers.

Policy used to specify the locations from which the content originates
Nature of content	Policy
Text	A location from which the content originates is usually a location (e.g., a city) where the text was written or from which it was dictated. Alternatively it might be the location of the event if an AFP reporter is present nearby. Multiple locations may be provided in the form of multiple `located` elements when the content originates (as defined here) from multiple locations; in this case the usual practice is to provide no more than two locations.
Picture	The location from which the content originates is the location of the camera when the picture was shot. Therefore it may differ from the location of what is shown in the picture. Knowing the location of the camera is useful as it lets one know "what the subject of the picture looks like when viewed from that location". Only one location is provided.
Video	The location from which the content originates is the location of the camera when the video was recorded. Therefore it may differ from the location of what is shown in the video. Knowing the location of the camera is useful as it lets one know "what the subject of the video looks like when viewed from that location". Only one location is provided. If the video is shot in different places, only one of these places is provided, usually the most significant.
Still or animated graphic	When a graphic is produced, it often accompanies or illustrates a separate production (typically of textual nature). In such a case the location from which the content originates is the same as that of this production. Otherwise, it is the location of the event the graphic is about.
Multimedia	Each news item in a multimedia document specifies the location(s) from which the content originates. The exact meaning for each news item is determined by the nature of its content as described in this table.
Live report	The location from which the content originates is the location of the event the live report is about. The value of this metadata can change as the live report develops. For example, the live report about the Bergé/Saint-Laurent auction may be tagged with the location where the auction takes place while we report on the auction, and later be tagged with the location where the Pierre Bergé press conference takes place while we report on this press conference.

The locations from which the content originates are provided by located elements in the content metadata section of news items. A given located element may convey several pieces of information about a location:

A QCode identifying it, provided by a qcode attribute.
A QCode identifying its type, provided by a type attribute. In AFP documents we typically make use of the IPTC location types [IPTCLocTypes] to specify whether the location is a city, a country area or a country. For locations that are classified as a "point of interest", we use the QCode cpnat:poi (concept URI: http://cv.iptc.org/newscodes/cpnature/poi) from [IPTCCPNatures]. For a description of the different location types see the table in section "Locations".
Its name, provided by a name element.
Its latitude and longitude in decimal degrees, provided by a position element inside a geoAreaDetails, or in a POIDetails if the location is classified as a "point of interest". We use the WGS84 geodetic system.
Several broader geographical entities the location is part of and how these entities relate to each other in terms of inclusion. Each broader entity is provided by a related element whose rel attribute, the QCode skos:broader, resolves to http://www.w3.org/2004/02/skos/core#broader. Combined with the base location described above this forms a geographical hierarchy. Typically we provide three levels in this hierarchy: a city, a country area and a country; but sometimes we may provide four levels (as in the example above where the location is the White House) or only one or two levels, and we may also provide more in the future. Each of these broader geographical entities may be described with:

A QCode identifying it, provided by a qcode attribute.
A QCode identifying its type, provided by a type attribute. As described above for the base location, we make use of the IPTC location types [IPTCLocTypes] to specify whether it is a city, a country area or a country.
Its name, provided by a name element.
Its ISO 3166-1 alpha 3 code [ISO3166], provided by a related element whose rel attribute, the QCode skos:exactMatch, resolves to http://www.w3.org/2004/02/skos/core#exactMatch and with a qcode attribute using the scheme alias iso3166-1a3 (scheme URI: http://cvx.iptc.org/iso3166-1a3/). The ISO 3166-1 alpha 3 code is the code part of this qcode attribute.
A reference to a broader geographical entity provided by a related element whose rel attribute, the QCode skos:broader, resolves to http://www.w3.org/2004/02/skos/core#broader.
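The structure described above can be walked with a short Python sketch such as the one below. It is an illustration only, under these assumptions: elements are matched by local name, QCodes are compared literally (no scheme resolution), and the helper names describe, hierarchy and position are hypothetical. The sketch follows the skos:broader references from the base location upwards and stops when a reference cannot be followed, since the number of levels may vary.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<located qcode="afplocation:6666" type="loctyp:City">
    <name>Washington</name>
    <related qcode="afplocation:1149" rel="skos:broader" type="loctyp:CountryArea">
        <name>District of Columbia</name>
        <related qcode="afplocation:206" rel="skos:broader" type="loctyp:Country"/>
    </related>
    <related qcode="afplocation:206" type="loctyp:Country">
        <name>United States</name>
        <related qcode="iso3166-1a3:USA" rel="skos:exactMatch"/>
    </related>
    <geoAreaDetails>
        <position latitude="38.89511" longitude="-77.03637"/>
    </geoAreaDetails>
</located>"""


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def describe(element):
    """Summarise a located or related element from its attributes and direct children."""
    info = {"qcode": element.get("qcode"), "type": element.get("type"),
            "name": None, "broader": None, "iso3": None}
    for child in element:
        kind = local_name(child)
        if kind == "name" and child.text:
            info["name"] = child.text.strip()
        elif kind == "related":
            qcode = child.get("qcode", "")
            if child.get("rel") == "skos:broader" and info["broader"] is None:
                info["broader"] = qcode
            elif child.get("rel") == "skos:exactMatch" and qcode.startswith("iso3166-1a3:"):
                # The code part of the QCode is the ISO 3166-1 alpha 3 code.
                info["iso3"] = qcode.split(":", 1)[1]
    return info


def hierarchy(located):
    """Return the chain from the base location up to the broadest described entity."""
    base = describe(located)
    entities = {r.get("qcode"): describe(r)
                for r in located if local_name(r) == "related"}
    chain, current, seen = [base], base, {base["qcode"]}
    # Follow skos:broader references; 'seen' guards against accidental cycles.
    while current["broader"] in entities and current["broader"] not in seen:
        current = entities[current["broader"]]
        seen.add(current["qcode"])
        chain.append(current)
    return chain


def position(located):
    """Return (latitude, longitude) from geoAreaDetails or POIDetails, or None."""
    for details in located:
        if local_name(details) in ("geoAreaDetails", "POIDetails"):
            for pos in details:
                if local_name(pos) == "position":
                    return float(pos.get("latitude")), float(pos.get("longitude"))
    return None


root = ET.fromstring(SAMPLE)
chain = hierarchy(root)
print(" > ".join(entity["name"] for entity in chain), position(root))
```

Applied to the White House example above, the same traversal would yield four levels, because the point of interest references the city, which references the country area, which references the country.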

In text documents or text components of multimedia documents we may provide multiple locations from which the content originates. In this case, the current practice is to provide at most two. Below is an example:

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- A location from which the content originates -->
                <located qcode="afplocation:2500" type="loctyp:City">
                    <name>Paris</name>
                    <related qcode="afplocation:67" rel="skos:broader" type="loctyp:Country">
                        <name>France</name>
                        <related qcode="iso3166-1a3:FRA" rel="skos:exactMatch" /> 
                    </related>
                    <geoAreaDetails>
                        <position latitude="48.85341" longitude="2.34121" /> 
                    </geoAreaDetails>
                </located>
                
                <!-- Another location from which the content originates -->
                <located qcode="afplocation:6666" type="loctyp:City">
                    <name>Washington</name>
                    <related qcode="afplocation:1149" rel="skos:broader" type="loctyp:CountryArea">
                        <name>District of Columbia</name>
                        <related qcode="afplocation:206" rel="skos:broader" type="loctyp:Country"/>
                    </related>
                    <related qcode="afplocation:206" type="loctyp:Country">
                        <name>United States</name>
                        <related qcode="iso3166-1a3:USA" rel="skos:exactMatch"/>
                    </related>
                    <geoAreaDetails>
                        <position latitude="38.89511" longitude="-77.03637"/>
                    </geoAreaDetails>
                </located>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

When are multiple locations provided?

Multiple locations may be provided when the content originates from multiple locations. For example, suppose that we publish a story about the Bergé/Saint-Laurent auction. To write this story we might use information provided by an AFP reporter present at the auction in Paris and by another AFP reporter present at a press conference given by Pierre Bergé at the same time in Washington. In this case we might provide Paris and Washington in located elements. Alternatively we might choose to provide the location where the story is actually written (say, London) instead of Paris and Washington.

Locations that are subject matter of the document

Text, picture, still graphic, video and multimedia documents: locations that are subject matter of the document may be provided in the news item (for multimedia documents: in the main news item) in the content metadata section. In text and multimedia documents only, additional information may be provided in assertions. Locations that are subject matter of the document are not provided in live report indexes.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                        
                <!-- The city of Beijing is a subject of the content -->
                <subject qcode="afplocation:2618" type="cpnat:geoArea">
                    <name>Beijing</name>
                </subject> 

                <!-- The city of Paris is a subject of the content and is a location of the event the content is about -->
                <subject qcode="afplocation:2500" type="cpnat:geoArea" afp:role="http://cv.afp.com/subjectroles/locationOfEvent">
                    <name>Paris</name>
                </subject>
                
                <!-- Some locations are not identified by a qcode attribute but by a uri attribute (typically providing a geo URI [rfc5870]) -->
                <subject uri="geo:43.82883,5.78688" type="cpnat:geoArea">
                    <name>Manosque</name>
                </subject>
            </contentMeta>
            
            <!-- This assertion provides additional information about Beijing  -->
            <assert qcode="afplocation:2618">
                <type qcode="loctyp:City"/>
                <geoAreaDetails>
                    <position latitude="39.9075" longitude="116.39723"/>
                </geoAreaDetails>
            </assert>
            
            <!-- This assertion provides additional information about Paris  -->
            <assert qcode="afplocation:2500">
                <type qcode="loctyp:City"/>
                <broader qcode="afplocation:67" type="loctyp:Country">
                    <name>France</name>
                    <related qcode="iso3166-1a3:FRA" rel="skos:exactMatch"/>
                </broader>
                <geoAreaDetails>
                    <position latitude="48.85341" longitude="2.3488"/>
                </geoAreaDetails>
            </assert>
            
            <!-- This assertion provides additional information about Manosque  -->
            <assert uri="geo:43.82883,5.78688">
                <type qcode="loctyp:City"/>
                <geoAreaDetails>
                    <position latitude="43.82883" longitude="5.78688"/>
                </geoAreaDetails>
            </assert>
            
        </newsItem>
    </itemSet>
</newsMessage>

Locations that are subject matter of the document may be provided by subject elements. Note that other entities such as persons, media topics, organizations and so on may also be conveyed using subject elements. To differentiate them, a type attribute is used. For locations its value, a QCode, is either cpnat:geoArea (resolving to http://cv.iptc.org/newscodes/cpnature/geoArea) or cpnat:poi (resolving to http://cv.iptc.org/newscodes/cpnature/poi). All these subjects share some common properties, such as optional type and afp:role attributes that are described in the section on subjects.

Additional information about these locations may be provided by assertions; an assertion is represented by an assert element. You can correlate assertions with specific locations using their concept URIs: the information provided by an assertion applies to the location whose concept URI is conveyed by the qcode or the uri attribute of the assertion. In the example above, there is a subject element whose qcode resolves to http://ref.afp.com/locations/2618 (in AFP documents, afplocation is a scheme alias for http://ref.afp.com/locations/). There is also an assert element whose qcode resolves to http://ref.afp.com/locations/2618. This means that both the subject and the assertion convey information about the same location.

If you don't perform QCode resolution (cf. section on controlled vocabularies and qualified codes) then you can correlate QCode-based assertions with specific locations using their QCodes directly.

A given assertion may convey several pieces of information about a location:

An identification provided either by a qcode attribute or an uri attribute. As discussed above it is used to correlate the assertion with a location.
A QCode identifying its type, provided by the qcode attribute of a type element (as in the example above). In AFP documents we typically make use of the IPTC location types [IPTCLocTypes] to specify whether the location is a city, a country area or a country. For locations that are classified as "point of interest", we use the QCode cpnat:poi (concept URI: http://cv.iptc.org/newscodes/cpnature/poi) from [IPTCCPNatures].
Its latitude and longitude in decimal degrees, provided by a position element inside a geoAreaDetails. Unless specified otherwise by a gpsdatum attribute, we use the WGS84 geodetic system.
A broader geographical entity the location is part of, provided by a broader element. Typically the broader entity we provide is a country. This broader geographical entity may be described with:

A QCode identifying it, provided by a qcode attribute.
A QCode identifying its type, provided by a type attribute. We make use of the IPTC location types [IPTCLocTypes] to specify whether it is a city, a country area or a country.
Its name, provided by a name element.
Its ISO 3166-1 alpha 3 code [ISO3166], provided by a related element whose rel attribute, the QCode skos:exactMatch, resolves to http://www.w3.org/2004/02/skos/core#exactMatch and with a qcode attribute using the scheme alias iso3166-1a3 (scheme URI: http://cvx.iptc.org/iso3166-1a3/). The ISO 3166-1 alpha 3 code is the code part of this qcode attribute.
A reference to a broader geographical entity provided by a related element whose rel attribute, the QCode skos:broader, resolves to http://www.w3.org/2004/02/skos/core#broader.
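The correlation between subjects and assertions described in this section could be implemented along the lines of the Python sketch below. It is a sketch under assumptions, not a definitive implementation: the SCHEMES table hard-codes the one scheme alias quoted in this section (a real integration would obtain aliases from the catalogs referenced by the document), elements are matched by local name, and the function names concept_uri and location_details are hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<newsItem>
    <contentMeta>
        <subject qcode="afplocation:2618" type="cpnat:geoArea"><name>Beijing</name></subject>
        <subject uri="geo:43.82883,5.78688" type="cpnat:geoArea"><name>Manosque</name></subject>
    </contentMeta>
    <assert qcode="afplocation:2618">
        <type qcode="loctyp:City"/>
        <geoAreaDetails><position latitude="39.9075" longitude="116.39723"/></geoAreaDetails>
    </assert>
    <assert uri="geo:43.82883,5.78688">
        <type qcode="loctyp:City"/>
        <geoAreaDetails><position latitude="43.82883" longitude="5.78688"/></geoAreaDetails>
    </assert>
</newsItem>"""

# Assumption: aliases are hard-coded here instead of being read from a catalog.
SCHEMES = {"afplocation": "http://ref.afp.com/locations/"}
LOCATION_TYPES = {"cpnat:geoArea", "cpnat:poi"}


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def first(element, name):
    """Return the first direct child with the given local name, or None."""
    return next((c for c in element if local_name(c) == name), None)


def concept_uri(element):
    """Concept URI of a subject or assert, from its uri or its resolved qcode."""
    if element.get("uri"):
        return element.get("uri")
    qcode = element.get("qcode")
    if not qcode:
        return None
    alias, code = qcode.split(":", 1)
    # Unknown alias: keep the raw QCode so that direct QCode correlation still works.
    return SCHEMES[alias] + code if alias in SCHEMES else qcode


def location_details(news_item):
    """Map each location subject (by concept URI) to details from its assertion."""
    asserts = {concept_uri(a): a for a in news_item.iter() if local_name(a) == "assert"}
    details = {}
    for subject in news_item.iter():
        if local_name(subject) != "subject" or subject.get("type") not in LOCATION_TYPES:
            continue
        name = first(subject, "name")
        entry = {"name": name.text if name is not None else None,
                 "type": None, "position": None}
        assertion = asserts.get(concept_uri(subject))
        if assertion is not None:
            type_el = first(assertion, "type")
            if type_el is not None:
                entry["type"] = type_el.get("qcode")
            geo = first(assertion, "geoAreaDetails")
            pos = first(geo, "position") if geo is not None else None
            if pos is not None:
                entry["position"] = (float(pos.get("latitude")), float(pos.get("longitude")))
        details[concept_uri(subject)] = entry
    return details


details = location_details(ET.fromstring(SAMPLE))
for uri, entry in details.items():
    print(uri, entry)
```

Subjects without a matching assertion are still returned, with their type and position left empty, since assertions are optional.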

Locations of the event(s)

Some locations that are subject matter of the document also happen to be locations of event(s). A location of event is a place where an event the document is about happens or is foreseen to happen. Locations of event(s) are provided by subject elements carrying a role attribute in the namespace http://www.afp.com/format/internal/ whose value is http://cv.afp.com/subjectroles/locationOfEvent.

For example, in our document about the auction of the Pierre Bergé and Yves Saint-Laurent collection, we could have the city of Paris as a subject because the news story mentions that the auction takes place in Paris. We could also have the city of Beijing as a subject because the news story mentions China's claims that some objects in the auction were stolen in Beijing during the opium wars and therefore should be returned. In this case, both cities would appear in dedicated subject elements. The city of Paris could be tagged as being a location of event using the role attribute because the auction happens in Paris and in our example the auction is the event the story is about. Beijing would not be tagged as being a location of event because while it is a subject of the story it is not a location of the event the story is about.

There is no default value for the role attribute: if a subject element conveying a location does not have a role attribute with a value of http://cv.afp.com/subjectroles/locationOfEvent, it doesn't mean that it isn't a location of the event, but merely that the information regarding this matter isn't provided by the element.
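The following Python sketch illustrates how the namespaced role attribute might be read. It rests on assumptions: the fragment declares the afp prefix on contentMeta for the sake of being self-contained (in a real document the declaration may sit elsewhere), elements are matched by local name, and the function name event_locations is hypothetical. Subjects without the role are reported as "unspecified" rather than "not a location of event", in line with the paragraph above.

```python
import xml.etree.ElementTree as ET

# Attribute 'role' in the AFP namespace, as ElementTree exposes it.
AFP_ROLE = "{http://www.afp.com/format/internal/}role"
LOCATION_OF_EVENT = "http://cv.afp.com/subjectroles/locationOfEvent"
LOCATION_TYPES = {"cpnat:geoArea", "cpnat:poi"}

SAMPLE = """<contentMeta xmlns:afp="http://www.afp.com/format/internal/">
    <subject qcode="afplocation:2618" type="cpnat:geoArea">
        <name>Beijing</name>
    </subject>
    <subject qcode="afplocation:2500" type="cpnat:geoArea"
             afp:role="http://cv.afp.com/subjectroles/locationOfEvent">
        <name>Paris</name>
    </subject>
</contentMeta>"""


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def event_locations(content_meta):
    """Split location subjects into (flagged as location of event, unspecified)."""
    flagged, unspecified = [], []
    for subject in content_meta:
        if local_name(subject) != "subject" or subject.get("type") not in LOCATION_TYPES:
            continue
        name = next((c.text.strip() for c in subject
                     if local_name(c) == "name" and c.text), None)
        if subject.get(AFP_ROLE) == LOCATION_OF_EVENT:
            flagged.append(name)
        else:
            # No role: the information is simply not provided by the element.
            unspecified.append(name)
    return flagged, unspecified


flagged, unspecified = event_locations(ET.fromstring(SAMPLE))
print("location of event:", flagged, "unspecified:", unspecified)
```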

Products the document belongs to

All documents: products the document belongs to may be provided in the header of the news message.

<newsMessage>
    <header>
        <afp:headerExtension xmlns:afp="http://www.afp.com/format/internal/">
            <!-- The document belongs to this product -->
            <afp:product name="EAA" uri="http://products.afp.com/wires/EAA"></afp:product>
            
            <!-- The document also belongs to this other product -->
            <afp:product name="MAX" uri="http://products.afp.com/wires/MAX"></afp:product>
        </afp:headerExtension>
    </header>
</newsMessage>

The commercial relationship between AFP and its clients is often structured around the notion of product. A product is a subset of AFP's production a client can subscribe to. Each product is defined by several characteristics such as subject matters, media types, languages, etc.

The product elements, if present, are provided in the headerExtension inside the header of the newsMessage. The headerExtension element is an AFP specific extension and is defined in namespace http://www.afp.com/format/internal/.

Each product element identifies a product the document belongs to. It can be a product you have subscribed to but it is not necessarily the case: typically, all products the document belongs to are listed regardless of your specific subscriptions.

In your information system, a possible usage of the product elements is to automatically route documents to specific teams or workflows. For example you might want to automatically route documents of the "Economic & Business News" product to your economics specialists.

Each product is uniquely identified by a URI, provided by the uri attribute. You can ask your AFP representative for the URIs of the products you have subscribed to.

The name attribute provides the name of the product, meant to be used for display purpose.
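The routing use case described above can be sketched as follows. This is only an illustration: it assumes the news message declares the standard NewsML-G2 namespace, and the ROUTES table and its team name are hypothetical placeholders for your own configuration.

```python
import xml.etree.ElementTree as ET

G2_NS = "http://iptc.org/std/nar/2006-10-01/"
AFP_NS = "http://www.afp.com/format/internal/"

# Hypothetical routing table: product URI -> internal team name.
ROUTES = {"http://products.afp.com/wires/EAA": "world-news-desk"}

def product_uris(news_message_xml):
    """Return the URIs of all products listed in the message header."""
    root = ET.fromstring(news_message_xml)
    path = f"{{{G2_NS}}}header/{{{AFP_NS}}}headerExtension/{{{AFP_NS}}}product"
    return [p.get("uri") for p in root.findall(path) if p.get("uri")]

def route(news_message_xml):
    """Return the teams that should receive the document."""
    return {ROUTES[u] for u in product_uris(news_message_xml) if u in ROUTES}

SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <header>
    <afp:headerExtension xmlns:afp="http://www.afp.com/format/internal/">
      <afp:product name="EAA" uri="http://products.afp.com/wires/EAA"/>
      <afp:product name="MAX" uri="http://products.afp.com/wires/MAX"/>
    </afp:headerExtension>
  </header>
</newsMessage>"""

teams = route(SAMPLE)
```

Products that are listed in the document but absent from the routing table are simply ignored, which matches the fact that a document may list products you have not subscribed to.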

The following table provides examples of products.

Examples of products
Name	Unique identifier	Description
EAA	`http://products.afp.com/wires/EAA`	The World News (EAA) wire offers up-to-the-minute, complete English-language global news, sports and business coverage delivered specifically to suit the needs of clients in Europe, Africa and the Middle East. EAA also provides in-depth coverage of Europe for Europe.
MAX	`http://products.afp.com/wires/MAX`	The world news wire, MAX, carries AFP's entire English-language news production and is designed specifically for clients who demand comprehensive global coverage.
FRS	`http://public.products.afp.com/wires/FRS`	The FRS wire is the AFP news feed intended mainly for French customers. This French-language feed offers French and foreign sources of information on varied topics (general news, politics, economy, culture, social, sport and equestrian), with emphasis on in-depth coverage of France.
DAB	`http://public.products.afp.com/wires/DAB`	The DAB wire in French language is designed primarily for African customers. Produced in Paris by a specialized desk, which processes and translates the information gathered by the largest networks of all international agencies active in Africa, it is also powered by the four other regional centers of AFP (Hong Kong, Nicosia, Washington and Montevideo) to provide comprehensive coverage of world news round the clock and seven days a week.

Provider

Text, picture, still graphic, video and multimedia documents: the provider of the document is given in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <provider qcode="afpprovider:AFP-TV">
                    <name>AFP-TV</name>
                    <broader qcode="nprov:AFP">
                        <name>AFP</name>
                    </broader>    
                </provider>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the provider of the document is given in the item metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <provider qcode="afpprovider:AFP-TV">
                    <name>AFP-TV</name>
                    <broader qcode="nprov:AFP">
                        <name>AFP</name>
                    </broader>    
                </provider>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

The provider of a document is the party responsible for the management and the release of the document (i.e., the publisher of the document). It is given by the qcode attribute of the provider element. This element is always present. The QCode is part of one of the following schemes:

The IPTC news provider scheme [IPTCNProviders], whose scheme URI is http://cv.iptc.org/newscodes/newsprovider/ and whose scheme alias is nprov.
An AFP-defined scheme, whose scheme URI is http://ref.afp.com/providers/ and whose scheme alias is afpprovider.

The name child element, if present, provides a natural language name for the provider.

The broader child element, if present, specifies a larger entity the provider is part of. This entity is identified by a qcode attribute, optionally completed by a natural language name in a name element.

In the example above, the document is provided by AFP-TV, a service inside AFP. The fact that this provider is part of AFP is expressed using the broader element.
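As an illustration, the provider and the larger entities it belongs to could be read as sketched below. The sketch assumes the item metadata section is in the standard NewsML-G2 namespace; the function name describe_provider is ours, not part of any AFP or IPTC API.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}

def describe_provider(item_meta):
    """Return the provider QCode, its name and its broader entities."""
    provider = item_meta.find("g2:provider", NS)
    if provider is None:
        return None
    return {
        "qcode": provider.get("qcode"),
        # Only direct children are matched, so names nested in
        # broader elements are not mistaken for the provider name.
        "name": provider.findtext("g2:name", namespaces=NS),
        "broader": [
            (b.get("qcode"), b.findtext("g2:name", namespaces=NS))
            for b in provider.findall("g2:broader", NS)
        ],
    }

SAMPLE = """<itemMeta xmlns="http://iptc.org/std/nar/2006-10-01/">
  <provider qcode="afpprovider:AFP-TV">
    <name>AFP-TV</name>
    <broader qcode="nprov:AFP">
      <name>AFP</name>
    </broader>
  </provider>
</itemMeta>"""

info = describe_provider(ET.fromstring(SAMPLE))
```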

Publishing Status

Text, picture, still graphic, video and multimedia documents: the publishing status is provided by the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <pubStatus qcode="QCode of scheme http://cv.iptc.org/newscodes/pubstatusg2/ specifying the publishing status"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the publishing status is provided by the item metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <pubStatus qcode="QCode of scheme http://cv.iptc.org/newscodes/pubstatusg2/ specifying the publishing status"/>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

A document can be usable, withheld or canceled. The table below describes how this is specified in documents and what it means.

Publishing statuses
Status	Representation	Meaning
Usable	No `pubStatus` element or a `pubStatus` element with a `qcode` attribute `stat:usable` resolving to `http://cv.iptc.org/newscodes/pubstatusg2/usable`	The document is usable. Note that "usable" does not necessarily mean "publishable"; for example, an embargo may prevent publication of an otherwise usable document.
Withheld	A `pubStatus` element with a `qcode` attribute `stat:withheld` resolving to `http://cv.iptc.org/newscodes/pubstatusg2/withheld`	The document and all its previous versions must not be used until further notice (except for a few metadata, as described below). This status is typically used when a serious problem with a document is suspected and is under investigation (e.g., important information in the document is suspected to be false). In the meantime, any usage of the document must be prohibited, if needed by way of alerts. If the document has been published, it must be rendered inaccessible until further notice. You must immediately remove it from all your online services and stop using it in any other fashion. People who may have viewed previous versions should be notified, whenever possible, that it is being retracted until further notice. If you have been authorized by AFP to distribute it to third parties, you must ensure that they carry out the same actions. In a withheld document, only the following metadata can be considered reliable/usable: GUID, version number, publication status, general editorial note (in this version of the document only).
Canceled	A `pubStatus` element with a `qcode` attribute `stat:canceled` resolving to `http://cv.iptc.org/newscodes/pubstatusg2/canceled`	The document and all its previous versions must not be used, ever (except for a few metadata, as described below). This status is typically used when a serious problem with a document is detected (e.g., important information in the document has been found to be false) and the scope of the problem is wide enough to warrant a complete kill of the document instead of issuing a correction. Any usage of the document must be prohibited, if needed by way of alerts. If the document has been published, it must be rendered inaccessible. You must immediately remove it from all your online services, stop using it in any other fashion and delete it from your servers. People who may have viewed previous versions should be notified, whenever possible, that it is being retracted. If you have been authorized by AFP to distribute it to third parties, you must ensure that they carry out the same actions. In a canceled document, only the following metadata can be considered reliable/usable: GUID, version number, publication status, general editorial note (in this version of the document only) and cancel-dedicated rendition(s). A cancel-dedicated rendition is specifically designed to be used in canceled documents: it allows you to publish something (e.g., a note about the cancellation) in place of the canceled content. It is conveyed by an `inlineXML` or `remoteContent` element and denoted through the `rendition` attribute by the QCode `afprnd:cancel`, resolving to `http://cv.afp.com/renditions/cancel`.

When a document is withheld or canceled, a general editorial note is often included to give additional information and/or instructions.

The NewsML-G2 specification provides detailed information on how you must make use of this publishing status when processing documents.

For multimedia documents, the way publishing status is conveyed differs from standard NewsML-G2. In NewsML-G2 each G2 item carries its own publishing status, and a G2 item without a pubStatus element is defined as usable. In AFP's multimedia documents the only pubStatus element to consider is that of the main item. The pubStatus elements of non-main items must be ignored. You must process multimedia documents in a way that applies the publishing status provided in the main news item to the entire content of the document (i.e., to all items in the document).
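The rules of this section can be sketched as follows: the status is read from the main item when the document is multimedia, from the single item otherwise, and a missing pubStatus element means usable. This is a minimal sketch, not a complete implementation; it assumes the scheme aliases stat and crel are bound as shown in this guide (a robust processor would resolve QCodes through the catalogs of the document), and the function names are ours.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
MAIN_ITEM_HREF = "http://cv.afp.com/itemnatures/mmdMainComp"

def governing_item(news_message):
    """Return the item whose pubStatus applies to the whole document."""
    items = news_message.findall("g2:itemSet/*", NS)
    for item in items:
        for link in item.findall("g2:itemMeta/g2:link", NS):
            if link.get("rel") == "crel:isa" and link.get("href") == MAIN_ITEM_HREF:
                return item  # main item of a multimedia document
    # Assumption: a non-multimedia document carries a single item.
    return items[0]

def publishing_status(news_message):
    """Return the pubStatus QCode, defaulting to usable when absent."""
    status = governing_item(news_message).find("g2:itemMeta/g2:pubStatus", NS)
    return "stat:usable" if status is None else status.get("qcode")

MULTIMEDIA = ET.fromstring("""<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet>
    <newsItem>
      <itemMeta><pubStatus qcode="stat:usable"/></itemMeta>
    </newsItem>
    <newsItem>
      <itemMeta>
        <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
        <pubStatus qcode="stat:withheld"/>
      </itemMeta>
    </newsItem>
  </itemSet>
</newsMessage>""")

status = publishing_status(MULTIMEDIA)  # the status of the main item wins
```

Whatever value is returned, remember that a withheld or canceled status applies to all previous versions of the document as well, so the status should be acted upon in your storage and publishing systems, not only at parsing time.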

Subjects

Text, picture, still graphic and video documents: subjects of the document may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- A subject represented by a natural language name  -->
                <subject>
                    <name>auction</name>
                </subject> 

                <!-- A subject represented by a QCode  -->
                <subject qcode="medtop:20000273"/>
                                
                <!-- A subject represented by a QCode and a natural language name  -->
                <subject qcode="medtop:01000000">
                    <name>arts, culture and entertainment</name>
                </subject> 

                <!-- A subject represented by a QCode, a natural language name, a type and a role -->
                <subject qcode="afplocation:2500" type="cpnat:geoArea" afp:role="http://cv.afp.com/subjectroles/locationOfEvent">
                    <name>Paris</name>
                </subject> 
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: subjects of the document as a whole may be provided in the content metadata section of the main news item. Subjects specific to an item may be provided in the content metadata section of this item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <!-- This subject is in the main news item: 
                     it applies to the document as a whole -->
                <subject qcode="medtop:20000031" type="cpnat:abstract">
                    <name>visual art</name>
                </subject> 
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- This subject only applies to this news item -->
                <subject qcode="medtop:20000011" type="cpnat:abstract">
                    <name>fashion</name>
                </subject> 
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: subjects of the document may be provided in the content metadata section of the package item. In live report documents, subjects expressed using a controlled vocabulary are only media topics and event identifiers.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <!-- A subject represented by a natural language name  -->
                <subject>
                    <name>auction</name>
                </subject>

                <!-- A subject represented by a QCode  -->
                <subject qcode="medtop:20000273"/>
                                
                <!-- A subject represented by a QCode and a natural language name  -->
                <subject qcode="medtop:01000000">
                    <name>arts, culture and entertainment</name>
                </subject> 
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

Subjects are important topics of the content; what the content is about. Some subjects of a document (and of individual items in the case of multimedia documents) may be provided by subject elements. Each subject element contains an indication on what the document's content (or item's content) is about.

Some subjects of the document may be described by keyword elements instead of subject elements. However, keywords may also be used for other purposes: while a keyword may describe a subject of the document, not all keywords do. See the Keywords section.

In AFP documents, a subject represented by a subject element is specified by either:

A natural language name.
A URI expressed as a QCode, conveyed in a qcode attribute, optionally completed by a natural language name. For example, in AFP documents medtop is a scheme alias for the scheme http://cv.iptc.org/newscodes/mediatopic/, therefore the QCode medtop:20000011 shown above resolves to the URI http://cv.iptc.org/newscodes/mediatopic/20000011, which identifies the media topic "fashion".
A URI conveyed in a uri attribute, optionally completed by a natural language name. This is used for specifying some locations, using 'geo' URIs. Geo URIs are defined by [RFC5870]. They allow identifying locations and conveying information such as latitude, longitude and so on. For example the URI geo:13.4125,103.8667 identifies the location at latitude 13.4125 and longitude 103.8667 in WGS-84. At the time of this writing AFP documents make use of simple geo URIs with only latitude and longitude, but in the future we may use additional features (e.g., altitude, uncertainty, etc.). An example is provided in the section "Locations that are subject matter of the document".
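A geo URI found in a uri attribute could be decoded as sketched below. The sketch reads latitude, longitude and an optional altitude, and deliberately ignores any parameters after the first semicolon (such as uncertainty), so that it keeps working if AFP documents start using them. The function name parse_geo_uri is ours.

```python
def parse_geo_uri(uri):
    """Return (latitude, longitude, altitude or None) from a geo URI."""
    if not uri.lower().startswith("geo:"):
        raise ValueError(f"not a geo URI: {uri!r}")
    # Parameters such as ;crs= or ;u= follow the coordinates; skip them.
    coordinates = uri[4:].split(";", 1)[0]
    values = [float(v) for v in coordinates.split(",")]
    if len(values) not in (2, 3):
        raise ValueError(f"unexpected number of coordinates in {uri!r}")
    altitude = values[2] if len(values) == 3 else None
    return values[0], values[1], altitude

latitude, longitude, altitude = parse_geo_uri("geo:13.4125,103.8667")
```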

The URI space used to specify subjects through qcode and uri attributes is open and can evolve over time. Often used in AFP documents are QCodes identifying IPTC media topics [IPTCMediaTopics], a standard taxonomy for categorizing news content. Also often used are QCodes identifying events, in order to associate a document with the events it covers. The table below presents common schemes used in AFP documents to identify subjects. Note that this list is not exhaustive.

Common types of subjects used in AFP documents
Type	Scheme URI	Scheme alias	Comment
Media topics	`http://cv.iptc.org/newscodes/mediatopic/`	`medtop`	Media topics is a standard IPTC taxonomy for categorizing news content. For example the concept URI `http://cv.iptc.org/newscodes/mediatopic/01000000` identifies the category "arts, culture and entertainment", which is defined as "Matters pertaining to the advancement and refinement of the human mind, of interests, skills, tastes and emotions".
Events	`http://eventmanager.afp.com/events/`	`afpevent`	An AFP specific scheme for identifying events. It is used to associate a document with the event it covers. For more on this topic see the section on event identifiers.
Persons	`http://ref.afp.com/persons/`	`afpperson`	AFP specific scheme for identifying persons. For example the concept URI `http://ref.afp.com/persons/193573` identifies Pierre Bergé.
Organizations	`http://ref.afp.com/organizations/`	`afporganization`	AFP specific scheme for identifying organizations. For example the concept URI `http://ref.afp.com/organizations/5308` identifies Christie's, the auction company.
Locations	`http://ref.afp.com/locations/`	`afplocation`	AFP specific scheme for identifying locations. For example the concept URI `http://ref.afp.com/locations/2500` identifies the city of Paris.
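To illustrate how the QCodes of the table above resolve to concept URIs, here is a minimal sketch. The alias table is hard-coded from this guide for the sake of the example; that is an assumption of the sketch, since in NewsML-G2 the binding between a scheme alias and a scheme URI is declared by the catalogs referenced by each item, which a production processor should use instead.

```python
# Scheme aliases as listed in this guide (hard-coded for illustration only).
SCHEME_ALIASES = {
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",
    "afpevent": "http://eventmanager.afp.com/events/",
    "afpperson": "http://ref.afp.com/persons/",
    "afporganization": "http://ref.afp.com/organizations/",
    "afplocation": "http://ref.afp.com/locations/",
}

def resolve_qcode(qcode, aliases=SCHEME_ALIASES):
    """Resolve a QCode (alias:code) to its concept URI."""
    alias, separator, code = qcode.partition(":")
    if not separator or alias not in aliases:
        raise ValueError(f"cannot resolve QCode {qcode!r}")
    return aliases[alias] + code

fashion = resolve_qcode("medtop:20000011")
paris = resolve_qcode("afplocation:2500")
```

Comparing resolved concept URIs rather than raw QCode strings keeps your processing independent of the particular alias a document happens to use.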

A subject element can have a name child element. If present it provides a natural language name for the subject.

In a given item, the order of appearance of subject elements provides a hint about their relative importance (i.e., editorial significance) in the context of this item: a subject should be considered as having either the same or a lesser importance than subjects appearing before it in the item. Note that while AFP's documents currently don't rank subjects with rank attributes, that may change in the future. In order to be forward compatible, if your NewsML-G2 processor interprets such ranks, the relative importance they convey should take precedence over the relative importance conveyed by the order of appearance of subject elements in the item. The rank attribute is described in the NewsML-G2 specification.

Optional attributes (these attributes may or may not be present in a given subject element):

type: this attribute carries a QCode that specifies the type of the subject (i.e., person, organization, event, abstract concept, etc.). The value space for this attribute is open, but in AFP documents you'll typically find types defined in the standard IPTC "Nature of a concept" controlled vocabulary [IPTCCPNatures].

role (in namespace http://www.afp.com/format/internal/): some subjects have a specific role, which is conveyed by this attribute in the form of a URI. This attribute is not defined by the NewsML-G2 standard: it is an AFP-specific extension and is therefore defined in a specific namespace.

Currently the only possible value for this attribute when it is present is http://cv.afp.com/subjectroles/locationOfEvent. If a subject is tagged with this role then this subject is a location of the event(s) the editorial content is about. This usage is described in detail in the section "Locations that are subject matter of the document".
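Because the role attribute lives in the AFP namespace, it must be looked up with its namespace-qualified name. The following sketch, which assumes a content metadata section in the standard NewsML-G2 namespace and uses a function name of our own, lists the subjects explicitly tagged as location of event.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
ROLE_ATTR = "{http://www.afp.com/format/internal/}role"
LOCATION_OF_EVENT = "http://cv.afp.com/subjectroles/locationOfEvent"

def event_locations(content_meta):
    """Return the names of subjects tagged as location of event."""
    return [
        s.findtext("g2:name", namespaces=NS)
        for s in content_meta.findall("g2:subject", NS)
        if s.get(ROLE_ATTR) == LOCATION_OF_EVENT
    ]

SAMPLE = ET.fromstring("""<contentMeta xmlns="http://iptc.org/std/nar/2006-10-01/"
    xmlns:afp="http://www.afp.com/format/internal/">
  <subject qcode="afplocation:2500" type="cpnat:geoArea"
           afp:role="http://cv.afp.com/subjectroles/locationOfEvent">
    <name>Paris</name>
  </subject>
  <subject type="cpnat:geoArea">
    <name>Beijing</name>
  </subject>
</contentMeta>""")

locations = event_locations(SAMPLE)
```

Since the attribute has no default value, a subject missing from the result is not necessarily unrelated to the location of the event; the document simply does not say.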

Titles & subtitles

Documents may contain various types of titles and multiple levels of subtitles.

Note that while NewsML-G2 allows for rich text by using some markup in the content of titles and subtitles, AFP's systems only output simple textual content not interspersed with markup.

Titles

Text, picture, still graphic, video and multimedia documents: titles may be provided in the content metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- The main title of the document -->
                <headline>
                    YSL-Bergé collection sets new world record at auction 
                    for a private collection
                </headline>
                
                <!-- The short title of the document -->
                <headline role="afpheadlinerole:shorttitle">
                    YSL-Bergé collection: a new record at auction
                </headline>
                
                <!-- The long title of the document -->
                <headline role="afpheadlinerole:longtitle">
                    Yves Saint Laurent/Pierre Bergé collection sets new world record at 
                    auction for a private collection with more than 206 million euros
                </headline>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: A title may be provided in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <!-- The title of the live report -->
                <headline>
                    YSL-Bergé auction live report
                </headline>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

All documents may contain a title. In addition, text, picture, still graphic, animated graphic, video and multimedia documents may include a short title and/or a long title. These titles, if present, are provided by headline elements located in the content metadata section of the first item. There is at most one title, one short title and one long title.

You can determine the type of a given title by looking for the presence and value of a role attribute, as described in the following table.

Title types
Type	Function	Identification
Title	The main title of the document: a short summary of the journalistic content.	No `role` attribute.
Short title	A shorter version of the title, suitable for displaying on space constrained surfaces (e.g., mobile handsets).	A `role` attribute whose value, the QCode `afpheadlinerole:shorttitle`, resolves to `http://cv.afp.com/headlineroles/shorttitle`
Long title	A longer version of the title. This is a short catch line, useful, for example, to display on a banner.	A `role` attribute whose value, the QCode `afpheadlinerole:longtitle`, resolves to `http://cv.afp.com/headlineroles/longtitle`
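The identification rules of the table above could be applied as in the following sketch. It assumes the afpheadlinerole alias is bound as described in this guide, and it normalizes whitespace because headline text may be wrapped and indented in the XML; the function name titles is ours.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
ROLE_TO_KIND = {
    None: "title",  # no role attribute: the main title
    "afpheadlinerole:shorttitle": "short_title",
    "afpheadlinerole:longtitle": "long_title",
}

def titles(content_meta):
    """Return a dict mapping title kinds to their text."""
    found = {}
    for headline in content_meta.findall("g2:headline", NS):
        kind = ROLE_TO_KIND.get(headline.get("role"))
        if kind is not None and kind not in found:
            # Collapse line breaks and indentation into single spaces.
            found[kind] = " ".join("".join(headline.itertext()).split())
    return found

SAMPLE = ET.fromstring("""<contentMeta xmlns="http://iptc.org/std/nar/2006-10-01/">
  <headline>
      YSL-Bergé collection sets new world record at auction
      for a private collection
  </headline>
  <headline role="afpheadlinerole:shorttitle">
      YSL-Bergé collection: a new record at auction
  </headline>
  <headline role="afpheadlinerole:subtitle" rank="0">A subtitle, not a title</headline>
</contentMeta>""")

result = titles(SAMPLE)
```

Headlines with other roles, such as subtitles, are skipped by this function.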

Subtitles

Text and multimedia documents: subtitles may be provided in the content metadata section of the news item (for multimedia documents: in the main news item). Subtitles are only provided for text and multimedia documents.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <headline role="afpheadlinerole:subtitle" rank="0">
                    Auction to continue Tuesday and Wednesday
                </headline>
                <headline role="afpheadlinerole:subtitle" rank="1">
                    Prestigious attendance noted on first day
                </headline>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

In addition to titles, text and multimedia documents may contain subtitles. Subtitles complement titles with additional information about the news content of the document. In current production there are at most two subtitles. Like titles, they are provided by headline elements in the content metadata section of the main news item. Their subtitle nature is denoted by a role attribute whose value, the QCode afpheadlinerole:subtitle, resolves to http://cv.afp.com/headlineroles/subtitle. A rank attribute may be present to specify the relative importance of subtitles. Ranks are nonnegative integers. Subtitles with a lower value for this attribute have a higher importance than subtitles with a higher value of this attribute, and subtitles without a rank attribute have a lower importance than subtitles with a rank attribute. See the NewsML-G2 specification for additional information on ranks and their processing model.
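The ordering rule for subtitles can be sketched with a sort key: ranked subtitles come first in ascending rank, and unranked subtitles follow in document order. The sketch assumes the afpheadlinerole alias is bound as described in this guide; the unranked subtitle in the sample is made up for the purpose of the example.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
SUBTITLE_ROLE = "afpheadlinerole:subtitle"

def importance(headline):
    """Sort key: ranked before unranked, then lower rank first."""
    rank = headline.get("rank")
    return (rank is None, int(rank) if rank is not None else 0)

def subtitles(content_meta):
    """Return subtitle texts, most important first."""
    found = [
        h for h in content_meta.findall("g2:headline", NS)
        if h.get("role") == SUBTITLE_ROLE
    ]
    # sorted() is stable, so ties keep their order of appearance.
    return [" ".join("".join(h.itertext()).split())
            for h in sorted(found, key=importance)]

SAMPLE = ET.fromstring("""<contentMeta xmlns="http://iptc.org/std/nar/2006-10-01/">
  <headline role="afpheadlinerole:subtitle">Unranked subtitle (made up)</headline>
  <headline role="afpheadlinerole:subtitle" rank="1">
      Prestigious attendance noted on first day
  </headline>
  <headline role="afpheadlinerole:subtitle" rank="0">
      Auction to continue Tuesday and Wednesday
  </headline>
</contentMeta>""")

ordered = subtitles(SAMPLE)
```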

Type of document

An AFP NewsML-G2 document can be of one of the following types:

Text
Picture
Video
Still graphic
Animated graphic
Interactive graphic
Multimedia
Live report index

The overview section provides a description of these types.

To determine the type of a document, you first need to determine if it is a multimedia or non-multimedia document. A document is multimedia if the item set of the news message contains a news item whose item metadata section contains a link element with both:

a rel attribute whose value, the QCode crel:isa, resolves to http://cv.iptc.org/newscodes/conceptrelation/isA
an href attribute whose value is the URI http://cv.afp.com/itemnatures/mmdMainComp

That is, a multimedia document contains the following:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

In a non-multimedia document, the type is the item class of the item present in the item set of the news message.

For text, picture, still graphic, video and multimedia documents the item class is given by the qcode attribute of the itemClass element in the item metadata section of a news item, as shown here:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <itemClass qcode="QCode specifying the type"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

For live reports the item class is given by the qcode attribute of the itemClass element in the item metadata section of a package item, as shown here:

<newsMessage>
    <itemSet>
        <packageItem>
            <itemMeta>
                <itemClass qcode="QCode specifying the type"/>
            </itemMeta>
        </packageItem>
    </itemSet>
</newsMessage>

The itemClass element is always present. For non-multimedia documents, its qcode attribute resolves to a concept URI that specifies the type of the document, as shown in the table below.

Item classes used in AFP documents
Type	QCode	Concept URI
Text	`ninat:text`	`http://cv.iptc.org/newscodes/ninature/text`
Picture	`ninat:picture`	`http://cv.iptc.org/newscodes/ninature/picture`
Video	`ninat:video`	`http://cv.iptc.org/newscodes/ninature/video`
Still graphic	`ninat:graphic`	`http://cv.iptc.org/newscodes/ninature/graphic`
Animated graphic	`ninat:animated`	`http://cv.iptc.org/newscodes/ninature/animated`
Interactive graphic	`afpinat:interactive`	`http://cv.afp.com/itemnatures/interactive`
Live report index	`afpinat:liveReport`	`http://cv.afp.com/itemnatures/liveReport`

The NewsML-G2 standard states that it is mandatory to use one of the IPTC News Item Nature NewsCodes schemes for item classes. AFP NewsML-G2 deviates from this rule by using an AFP specific scheme (whose URI is http://cv.afp.com/itemnatures/) in addition to the mandatory IPTC schemes.
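The two-step procedure of this section (first test for a multimedia document, then read the item class) can be sketched as follows. The sketch assumes the scheme aliases crel, ninat and afpinat are bound as shown in this guide; the function name document_type is ours.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
MAIN_ITEM_HREF = "http://cv.afp.com/itemnatures/mmdMainComp"
ITEM_CLASSES = {
    "ninat:text": "Text",
    "ninat:picture": "Picture",
    "ninat:video": "Video",
    "ninat:graphic": "Still graphic",
    "ninat:animated": "Animated graphic",
    "afpinat:interactive": "Interactive graphic",
    "afpinat:liveReport": "Live report index",
}

def document_type(news_message):
    """Return the document type, or None if it cannot be determined."""
    # Step 1: a news item flagged as main component means multimedia.
    for link in news_message.findall("g2:itemSet/g2:newsItem/g2:itemMeta/g2:link", NS):
        if link.get("rel") == "crel:isa" and link.get("href") == MAIN_ITEM_HREF:
            return "Multimedia"
    # Step 2: otherwise the item class of the (news or package) item decides.
    item_class = news_message.find("g2:itemSet/*/g2:itemMeta/g2:itemClass", NS)
    if item_class is None:
        return None
    return ITEM_CLASSES.get(item_class.get("qcode"))

def message(items_xml):
    """Wrap item markup in a minimal news message (test helper)."""
    return ET.fromstring(
        '<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">'
        f"<itemSet>{items_xml}</itemSet></newsMessage>"
    )

text_doc = message('<newsItem><itemMeta><itemClass qcode="ninat:text"/></itemMeta></newsItem>')
live_doc = message('<packageItem><itemMeta><itemClass qcode="afpinat:liveReport"/></itemMeta></packageItem>')
```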

Urgency

Text, picture, still graphic, animated graphic, video and multimedia documents: the urgency of the document may be provided in the content metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <urgency>1</urgency>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Live report indexes: the urgency of the document may be provided in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <urgency>1</urgency>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

A document may include an indication of the editorial urgency of its content in an urgency element. The content of this element is an integer from 1 (highest urgency) to 9 (lowest urgency). Usually, AFP documents are tagged with urgencies from 1 to 4.

There is often a correlation between this property and the role in workflow of the document. In our documents, flashes are typically issued with the highest urgency (i.e., a value of 1), alerts with an urgency of 2 and urgents with an urgency of 3.
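Reading the urgency is straightforward; the sketch below returns None when the element is absent and rejects values outside the 1 to 9 range described above. The function name urgency is ours.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}

def urgency(content_meta):
    """Return the urgency as an integer from 1 to 9, or None if absent."""
    text = content_meta.findtext("g2:urgency", namespaces=NS)
    if text is None:
        return None
    value = int(text.strip())
    if not 1 <= value <= 9:
        raise ValueError(f"urgency out of range: {value}")
    return value

SAMPLE = ET.fromstring(
    '<contentMeta xmlns="http://iptc.org/std/nar/2006-10-01/">'
    "<urgency>1</urgency></contentMeta>"
)

level = urgency(SAMPLE)
```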

Data specific to text and multimedia documents

Some data appear only in text and multimedia documents. This section details these data elements.

Catchline

Text documents: a catchline may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <headline role="afpheadlinerole:introduction">
                    The Yves Saint Laurent and Pierre Bergé collection sets new world record at  
                    auction for a private collection on Monday, the first of three auction
                    days, with more than 206 million euros. Participants describe first day
                    as "surprising, moving, electric!".
                </headline>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: a catchline may be provided in the content metadata section of the main news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentMeta>
                <headline role="afpheadlinerole:catchline">
                    The Yves Saint Laurent and Pierre Bergé collection sets new world record at  
                    auction for a private collection on Monday, the first of three auction
                    days, with more than 206 million euros. Participants describe first day
                    as "surprising, moving, electric!".
                </headline>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

A catchline, if present, provides a clear and concise summary of the story that tells the reader what has happened in simple language. It is designed to attract the reader's attention. It gives an overview of all the main elements of the news. A catchline may be found at most once per document.

In text documents the catchline is provided by a headline element whose role attribute, the QCode afpheadlinerole:introduction, resolves to http://cv.afp.com/headlineroles/introduction. At the time of this writing a catchline may be provided only for text documents produced by SID (Sport-Informations-Dienst), an AFP subsidiary. To determine if the kind of text documents you are interested in might contain a catchline you are advised to discuss the matter with your AFP representative.

In multimedia documents the catchline is provided by a headline element whose role attribute is either afpheadlinerole:catchline (resolving to http://cv.afp.com/headlineroles/catchline) or afpheadlinerole:introduction (resolving to http://cv.afp.com/headlineroles/introduction).

While NewsML-G2 allows for rich text by using some markup in the content of a catchline, AFP's systems only output simple textual content not interspersed with markup.

In some documents you might observe that the content of the catchline is the same as the first paragraph of the main textual content of the document. Note however that this is not always the case and that sometimes an original catchline is provided.
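A catchline lookup could be sketched as below. For simplicity the sketch accepts both roles for every kind of document, which is a superset of the rules above (text documents only use the introduction role); it also assumes the afpheadlinerole alias is bound as described in this guide.

```python
import xml.etree.ElementTree as ET

NS = {"g2": "http://iptc.org/std/nar/2006-10-01/"}
CATCHLINE_ROLES = (
    "afpheadlinerole:catchline",
    "afpheadlinerole:introduction",
)

def catchline(content_meta):
    """Return the catchline text, or None if the document has none."""
    for headline in content_meta.findall("g2:headline", NS):
        if headline.get("role") in CATCHLINE_ROLES:
            # Collapse line breaks and indentation into single spaces.
            return " ".join("".join(headline.itertext()).split())
    return None

SAMPLE = ET.fromstring("""<contentMeta xmlns="http://iptc.org/std/nar/2006-10-01/">
  <headline>YSL-Bergé auction live report</headline>
  <headline role="afpheadlinerole:introduction">
      Participants describe first day
      as "surprising, moving, electric!".
  </headline>
</contentMeta>""")

summary = catchline(SAMPLE)
```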

Number of hypertext links to external resources in textual or multimedia content

Text documents and multimedia documents: the number of hypertext links to external resources present in textual or multimedia content may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:afp="http://www.afp.com/format/internal/">
    <itemSet>
        <newsItem>
            <itemMeta>
                <afp:extension>
                    <afp:stats>
                        <afp:totalLinks>
                            3
                        </afp:totalLinks>
                    </afp:stats>
                </afp:extension>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

The HTML (in XML syntax) rendition of the textual or multimedia content can contain hypertext links to external resources, typically conveyed by <a> elements. External resources are resources that are not intrinsically part of the document; for example, in a multimedia document a link to one of the items of the document isn't a link to an external resource whereas a link to a Wikipedia page is.

As shown in the example above this number may be provided as an integer by a totalLinks element inside a stats element inside an extension element in the item metadata section of the (main) news item.

Note that the totalLinks, stats and extension elements are not standard NewsML-G2 vocabulary but part of an AFP-specific extension. They are defined in an XML namespace whose name is http://www.afp.com/format/internal/.
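Since these elements belong to the AFP namespace, they must be addressed with that namespace when reading the value. A minimal sketch, with a function name of our own, tolerating the whitespace that may surround the number:

```python
import xml.etree.ElementTree as ET

NS = {
    "g2": "http://iptc.org/std/nar/2006-10-01/",
    "afp": "http://www.afp.com/format/internal/",
}

def total_links(item_meta):
    """Return the number of external links, or None if not provided."""
    text = item_meta.findtext(
        "afp:extension/afp:stats/afp:totalLinks", namespaces=NS
    )
    if text is None or not text.strip():
        return None
    return int(text.strip())

SAMPLE = ET.fromstring("""<itemMeta xmlns="http://iptc.org/std/nar/2006-10-01/"
    xmlns:afp="http://www.afp.com/format/internal/">
  <afp:extension>
    <afp:stats>
      <afp:totalLinks>
          3
      </afp:totalLinks>
    </afp:stats>
  </afp:extension>
</itemMeta>""")

count = total_links(SAMPLE)
```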

Related production

Text and multimedia documents: mentions of the existence of related production may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <!-- The following signals that AFP is publishing/will publish related photo and video production -->
                <signal qcode="afpmedtype:Photo"/>
                <signal qcode="afpmedtype:Video"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Text and multimedia documents may contain mentions of the existence of related production, i.e., additional production covering the event(s) the document is about. For example, if AFP has released or plans to release photo(s) and video(s) of the Yves Saint Laurent auction, then this may be mentioned in the metadata of a text or multimedia news story covering this auction, as shown in the example above. To that end, we use signal elements specifying which type of related production exists or is planned, using a controlled vocabulary defined by the scheme http://ref.afp.com/mediatypes/ (scheme alias: afpmedtype).

We provide only one signal by type of related production. For example, if there are several related photos, there may be only one <signal qcode="afpmedtype:Photo"/> element.

Note that signal elements are also used for other purposes (e.g., correction signal). Only signal elements in the scheme http://ref.afp.com/mediatypes/ are mentions of related production.
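
The filtering described above could be sketched as follows in Python. The function name related_production is hypothetical, the sketch assumes the news item is in the NewsML-G2 namespace, and it matches on the scheme alias afpmedtype rather than resolving QCodes through the catalog; the sig:correction signal in the sample is only there to illustrate a signal from another scheme.

```python
import xml.etree.ElementTree as ET

# NewsML-G2 namespace in ElementTree's "{uri}" notation.
NAR = "{http://iptc.org/std/nar/2006-10-01/}"

SAMPLE = """<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
    <itemMeta>
        <signal qcode="sig:correction"/>
        <signal qcode="afpmedtype:Photo"/>
        <signal qcode="afpmedtype:Video"/>
    </itemMeta>
</newsItem>"""


def related_production(news_item, alias="afpmedtype"):
    """Return the related-production codes signalled in the item metadata."""
    codes = []
    for signal in news_item.findall(f"{NAR}itemMeta/{NAR}signal"):
        prefix, _, code = signal.get("qcode", "").partition(":")
        # Signals from other schemes (e.g. corrections) are skipped.
        if prefix == alias and code not in codes:
            codes.append(code)
    return codes


print(related_production(ET.fromstring(SAMPLE)))  # ['Photo', 'Video']
```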

The table below provides the QCodes/concept URIs that are used in these signal elements. See the overview section for a description of the various types of news content this table refers to.

Types of related production
Concept URI	QCode	Description
`http://ref.afp.com/mediatypes/Photo`	`afpmedtype:Photo`	Related picture(s). For example, a picture of the Yves Saint Laurent auction.
`http://ref.afp.com/mediatypes/PHOTOARCH`	`afpmedtype:PHOTOARCH`	Related picture(s) from archive material. It is typically an archive picture of someone or something that plays an important role in the event(s). For example an archive picture of Yves Saint Laurent, or an archive picture of Christie's salerooms. When this mention is used, the related archive pictures are republished by AFP.
`http://ref.afp.com/mediatypes/Video`	`afpmedtype:Video`	Related video(s). For example, a video report about the Yves Saint Laurent auction.
`http://ref.afp.com/mediatypes/LIVEVIDEO`	`afpmedtype:LIVEVIDEO`	Related video(s) providing live coverage. For example, a video of the Yves Saint Laurent auction broadcast live.
`http://ref.afp.com/mediatypes/VIDEOARCH`	`afpmedtype:VIDEOARCH`	Related video(s) from archive material. It is typically an archive video of someone or something that plays an important role in the event(s). For example an archive video of Yves Saint Laurent, or an archive video of Christie's salerooms. When this mention is used, the related archive videos are republished by AFP.
`http://ref.afp.com/mediatypes/Sketch`	`afpmedtype:Sketch`	Related courtroom sketch(es). A courtroom sketch is an artistic depiction of the proceedings in a court of law. In many jurisdictions, cameras are not allowed in courtrooms in order to prevent distractions and preserve privacy. Consequently, we rely on sketch artists for illustrations of the proceedings.
`http://ref.afp.com/mediatypes/Graphic`	`afpmedtype:Graphic`	Related still graphic(s).
`http://ref.afp.com/mediatypes/ANIGRAPHIC`	`afpmedtype:ANIGRAPHIC`	Related animated graphic(s).
`http://ref.afp.com/mediatypes/VIDEOGRAPHIC`	`afpmedtype:VIDEOGRAPHIC`	Related videographic(s).
`http://ref.afp.com/mediatypes/Multimedia`	`afpmedtype:Multimedia`	Related multimedia document(s).
`http://ref.afp.com/mediatypes/LIVEREPORT`	`afpmedtype:LIVEREPORT`	Related live report(s).
`http://ref.afp.com/mediatypes/INTERACTIVEGRAPHIC`	`afpmedtype:INTERACTIVEGRAPHIC`	Related interactive graphic(s).

The mechanism described in this section is not the only one to deal with related production. As described in the section on event identifiers, we also provide you with correlation keys allowing you to identify documents covering the same events.

Role in workflow

Text and multimedia documents: a role in workflow may be provided in the item metadata section of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <role qcode="QCode specifying the role in workflow"/>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Some text and multimedia documents carry an indication of their role in workflow (aka editorial role). This allows you to handle them in specific ways. This role, if present, is specified by the qcode attribute of the role element. The possible values for the role are taken from a controlled vocabulary provided by the IPTC (we do not use its whole value space, though). They are described in the table below, where the Concept URI column gives the URI the QCode resolves to.

Roles in workflow
Role	Description	QCode	Concept URI
Flash	A very short text – typically four or five words – on an event of exceptional importance. Flashes are rare. For example, only four events were reported by AFP by a flash in 2008: Kosovo’s declaration of independence; the opening of the Beijing Games; Russia’s recognition of South Ossetia and Abkhazia as independent states; and Barack Obama’s victory in the US presidential elections. A flash is usually followed within five minutes by an urgent providing more information.	`erol:flash`	`http://cv.iptc.org/newscodes/edrole/flash`
Alert	A very short text with high priority. An alert is usually followed within five minutes by an urgent providing more information. Fits in a single line.	`erol:alert`	`http://cv.iptc.org/newscodes/edrole/alert`
Urgent	A short text on a major development of a top story. An urgent is typically two paragraphs long, or longer when it provides a follow-up to multiple alerts. On a freshly breaking story, an urgent is typically followed within 10 minutes by a 200-250 word lead.	`erol:urgent`	`http://cv.iptc.org/newscodes/edrole/urgent`
Lead	A sum-up or a complete version of a developing story.	`erol:lead`	`http://cv.iptc.org/newscodes/edrole/lead`

When a document is updated, its role in workflow may be updated too. For example, it is typical for a breaking news story that deserves immediate distribution to start its life as an alert, then become an urgent, then a lead, as it gets refreshed/enriched with more content. All versions of the document share the same guid (see the section on identifiers).

Evolution over time of a developing story

Once a document is a lead, subsequent versions may be qualified as "second lead", "third lead" and so on up to a "ninth lead". However, this qualification is not done through the role in workflow property: this property uses the same concept URI of http://cv.iptc.org/newscodes/edrole/lead (QCode erol:lead) from the first lead through the ninth one. To convey what kind of lead the document is, we use a <genre> element (see the section on genres). For example, we typically convey that a document is a first lead by specifying a role in workflow with the concept URI http://cv.iptc.org/newscodes/edrole/lead and a genre with the concept URI http://ref.afp.com/editorialtypes/Lead (QCode afpedtype:Lead), as in the following example:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <role qcode="erol:lead"/>
            </itemMeta>
            <contentMeta>
                <genre qcode="afpedtype:Lead" />
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

For a second lead, the role in workflow is still http://cv.iptc.org/newscodes/edrole/lead and a genre with a concept URI of http://ref.afp.com/editorialtypes/2ndlead (QCode afpedtype:2ndlead) is provided:

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <role qcode="erol:lead"/>
            </itemMeta>
            <contentMeta>
                <genre qcode="afpedtype:2ndlead" />
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

A document with a role in workflow of "lead" can also be qualified by the genre "general lead", whose meaning is described at the end of the table below. Typically a general lead has a different guid than the various documents it consolidates. A document cannot be both a general lead and "first lead" or "second lead" etc.

The following table describes the various genres used to qualify a lead.

Genres used to qualify a lead
Genre	Description	QCode	Concept URI
Lead (typically used to mean "first lead")	A sum-up or a complete version of a developing story	`afpedtype:Lead`	`http://ref.afp.com/editorialtypes/Lead`
Second lead	A sum-up or a complete version of a developing story. For a given story, common usage is that a second lead is published only if a lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:2ndlead`	`http://ref.afp.com/editorialtypes/2ndlead`
Third lead	A sum-up or a complete version of a developing story. For a given story, common usage is that a third lead is published only if a second lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:3rdlead`	`http://ref.afp.com/editorialtypes/3rdlead`
Fourth lead	A sum-up or a complete version of a story. For a given story, common usage is that a fourth lead is published only if a third lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:4thlead`	`http://ref.afp.com/editorialtypes/4thlead`
Fifth lead	A sum-up or a complete version of a story. For a given story, common usage is that a fifth lead is published only if a fourth lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:5thlead`	`http://ref.afp.com/editorialtypes/5thlead`
Sixth lead	A sum-up or a complete version of a developing story. For a given story, common usage is that a sixth lead is published only if a fifth lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:6thlead`	`http://ref.afp.com/editorialtypes/6thlead`
Seventh lead	A sum-up or a complete version of a developing story. For a given story, common usage is that a seventh lead is published only if a sixth lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:7thlead`	`http://ref.afp.com/editorialtypes/7thlead`
Eighth lead	A sum-up or a complete version of a developing story. For a given story, common usage is that an eighth lead is published only if a seventh lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:8thlead`	`http://ref.afp.com/editorialtypes/8thlead`
Ninth lead	A sum-up or a complete version of a developing story. For a given story, common usage is that a ninth lead is published only if an eighth lead is already out. It provides a refreshed and/or enriched version of that story.	`afpedtype:9thlead`	`http://ref.afp.com/editorialtypes/9thlead`
General lead	A large sum-up or a complete version of a story. A general lead regroups, hierarchizes and develops all available elements of a developing story, including elements that were previously published under a number of different documents, each one focusing on specific facets of the more general story.	`afpedtype:LeadGeneral`	`http://ref.afp.com/editorialtypes/LeadGeneral`
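
To illustrate how the role in workflow and the genre combine, here is a minimal Python sketch. The function name describe_lead and its return values are hypothetical; the sketch matches QCodes literally (aliases erol and afpedtype) instead of resolving them through the catalog, and assumes the news item is in the NewsML-G2 namespace.

```python
import xml.etree.ElementTree as ET

NAR = "{http://iptc.org/std/nar/2006-10-01/}"

# Genre QCodes from the table above, mapped to the rank of the lead.
LEAD_RANKS = {
    "afpedtype:Lead": 1,
    "afpedtype:2ndlead": 2,
    "afpedtype:3rdlead": 3,
    "afpedtype:4thlead": 4,
    "afpedtype:5thlead": 5,
    "afpedtype:6thlead": 6,
    "afpedtype:7thlead": 7,
    "afpedtype:8thlead": 8,
    "afpedtype:9thlead": 9,
}

SAMPLE = """<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
    <itemMeta>
        <role qcode="erol:lead"/>
    </itemMeta>
    <contentMeta>
        <genre qcode="afpedtype:2ndlead"/>
    </contentMeta>
</newsItem>"""


def describe_lead(news_item):
    """Return None if the item is not a lead, "general" for a general lead,
    the rank (1 to 9) when a lead genre is present, or "unqualified" otherwise."""
    role = news_item.find(f"{NAR}itemMeta/{NAR}role")
    if role is None or role.get("qcode") != "erol:lead":
        return None
    for genre in news_item.findall(f"{NAR}contentMeta/{NAR}genre"):
        qcode = genre.get("qcode")
        if qcode == "afpedtype:LeadGeneral":
            return "general"
        if qcode in LEAD_RANKS:
            return LEAD_RANKS[qcode]
    return "unqualified"


print(describe_lead(ET.fromstring(SAMPLE)))  # 2
```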

Word count

Text and multimedia documents: the word count is provided in the inline XML rendition of the content of the news item (for multimedia documents: in the main news item).

<newsMessage>
    <itemSet>
        <newsItem>
            <contentSet>
                <inlineXML wordcount="450">
                </inlineXML>
            </contentSet>
        </newsItem>
    </itemSet>
</newsMessage>

The word count gives an approximation of the size of the textual content of the document (not including textual content provided in metadata). That size is provided as an approximate count of words: when it is computed, individual words do not necessarily count for one, as short words count for less than one and long words count for more than one.

The word count is provided by the wordcount attribute of the inlineXML element of the news item. It is a non-negative integer. It is present in all text and multimedia documents.
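
Reading this attribute could look like the following Python sketch. The function name word_count is hypothetical, and the sketch assumes the news item is in the NewsML-G2 namespace.

```python
import xml.etree.ElementTree as ET

NAR = "{http://iptc.org/std/nar/2006-10-01/}"

SAMPLE = """<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
    <contentSet>
        <inlineXML wordcount="450"/>
    </contentSet>
</newsItem>"""


def word_count(news_item):
    """Return the approximate word count, or None if it cannot be found."""
    inline = news_item.find(f"{NAR}contentSet/{NAR}inlineXML")
    if inline is None or inline.get("wordcount") is None:
        return None
    return int(inline.get("wordcount"))


print(word_count(ET.fromstring(SAMPLE)))  # 450
```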

Data specific to text documents

Some data is specific to text documents. This section details these data elements.

Textual content

Text documents: the textual content is provided in the content set of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentSet>
                <inlineXML contenttype="application/xhtml+xml">
                    <html xmlns="http://www.w3.org/1999/xhtml">
                        <head>
                            <title>
                                YSL-Bergé collection sets new world record at auction 
                                for a private collection
                            </title>
                        </head>
                        <body>
                            <p>The Yves Saint Laurent and Pierre Bergé collection sets 
                            new world record at auction for a private collection. 
                            Hundreds of art treasures amassed by late fashion designer
                            Yves Saint Laurent and his companion Pierre Berge over half
                            a century are being auctioned.</p>
                            <p>Bids hit 206 million euros (261 million dollars) on February
                            23, 2009 making it the biggest private collection ever 
                            auctioned with two days of sales still left to run.</p>
                            ...
                            ...
                            <!-- A hypertext link -->
                            The <a class="ignorableTextFalse" href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">
                            wikipedia page about Yves Saint-Laurent</a> claims that ...
                            ...
                        </body>
                    </html>
                </inlineXML>
            </contentSet>
        </newsItem>
    </itemSet>
</newsMessage>

The textual content of the document is the main journalistic text of the document. It is provided by an inlineXML element. It is expressed using the XML syntax of HTML. This is explicitly denoted by a contenttype attribute with a value of application/xhtml+xml.

The textual content can also contain links to entities that aren't logically part of the document, such as other NewsML-G2 documents, Web pages (as shown in the example above), etc. The sections below describe how these links are represented.

Note that text items of multimedia documents can also contain similar data, but with additional information such as links to visual content. This is described in section "Data specific to multimedia documents".

Hypertext links to other resources

The HTML can contain hypertext links to other resources such as Web pages. They may be provided by <a> elements. For example, here is a link to a Wikipedia page:

<a class="ignorableTextFalse"
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)" >wikipedia page about Yves Saint-Laurent</a>

The class attribute, if present, may be used to specify either the class name "ignorableTextFalse" or "ignorableTextTrue". These class names are meant to assist you if you need to remove hypertext links from the HTML content (this is a common need for some of our clients).

ignorableTextFalse

ignorableTextFalse means that if you process the HTML in order to remove links then not removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the hypertext links:

Pierre Bergé quoted the 
<a class="ignorableTextFalse" 
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">wikipedia page about Yves Saint-Laurent</a>
to illustrate...

After removing hypertext links the fragment should be:

Pierre Bergé quoted the wikipedia page about Yves Saint-Laurent to illustrate...

ignorableTextTrue

ignorableTextTrue means that if you process the HTML in order to remove links then also removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the hypertext links:

Some text before.
<a class="ignorableTextTrue" 
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">
   This Web page provides additional information.
</a> 
 Some text after.

After removing hypertext links the fragment should be:

Some text before. Some text after.
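
The two behaviours above can be sketched in Python as follows. This is only one possible approach, built on the standard library; the function name strip_links is hypothetical. It assumes the HTML is in the XHTML namespace, and it treats a link with no class attribute like ignorableTextFalse (the text is kept), which is an assumption of this sketch rather than a documented rule.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def strip_links(parent):
    """Remove <a> elements below parent, in place.

    The link text is kept unless the link carries the class
    "ignorableTextTrue", in which case it is dropped with the link.
    """
    previous = None  # last child that was kept
    for child in list(parent):
        if child.tag != XHTML + "a":
            strip_links(child)
            previous = child
            continue
        classes = child.get("class", "").split()
        kept = "" if "ignorableTextTrue" in classes else "".join(child.itertext())
        # Re-attach the kept text and the text following the link.
        replacement = kept + (child.tail or "")
        if previous is None:
            parent.text = (parent.text or "") + replacement
        else:
            previous.tail = (previous.tail or "") + replacement
        parent.remove(child)


paragraph = ET.fromstring(
    '<p xmlns="http://www.w3.org/1999/xhtml">Pierre Bergé quoted the '
    '<a class="ignorableTextFalse" '
    'href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">'
    "wikipedia page about Yves Saint-Laurent</a> to illustrate...</p>"
)
strip_links(paragraph)
print(paragraph.text)
# Pierre Bergé quoted the wikipedia page about Yves Saint-Laurent to illustrate...
```

Depending on how the source is laid out, removing a link may leave consecutive whitespace characters behind; you may want to normalize whitespace afterwards.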

Links to other NewsML-G2 documents

The HTML can contain links to other NewsML-G2 documents managed by AFP. Such links are associated with a part of the textual content. We represent these links using the g2document microformat. It consists of a span element with a class attribute that contains "g2document". In addition, we provide another class name denoting the type of the referenced document: "g2picture", "g2video", etc. Finally, we may provide a class name that gives a hint on how a link could be removed gracefully. For example:

<span class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    some text
</span>

The content of the span element is organized as follows:

The first child element of such a span is an <a> element whose href attribute provides the GUID of the NewsML-G2 document. Note that while it may look like a dereferenceable URI, it actually isn't. This element is marked as non-displayable as it is not meant to be directly displayed.
Following this element, another non-displayable <a> element may provide the dereferenceable URI reference of the NewsML-G2 document. Typically, this element will be present if the AFP delivery system determines that it has delivered the corresponding document to you and knows where to locate it in your delivery space.
Finally, we provide the part of the textual content the other NewsML-G2 document is associated with.

The following table lists the class names used to specify the type of a referenced NewsML-G2 document. See the overview section for a presentation of the various document types.

Types of referenced NewsML-G2 document
Class name	Type
g2text	Text
g2multimedia	Multimedia
g2picture	Picture
g2graphic	Still graphic
g2animated	Animated graphic
g2video	Video
g2liveReport	Live report index
g2interactive	Interactive graphic
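
As an illustration of the microformat, the following Python sketch extracts the information carried by such a span. The function name parse_g2document and the shape of the returned dictionary are hypothetical; the sketch assumes the span is in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Class names from the table above.
G2_TYPES = {
    "g2text": "Text",
    "g2multimedia": "Multimedia",
    "g2picture": "Picture",
    "g2graphic": "Still graphic",
    "g2animated": "Animated graphic",
    "g2video": "Video",
    "g2liveReport": "Live report index",
    "g2interactive": "Interactive graphic",
}

SAMPLE = """<span xmlns="http://www.w3.org/1999/xhtml"
      class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    some text
</span>"""


def parse_g2document(span):
    """Return a dict describing a g2document span, or None if it is not one."""
    classes = span.get("class", "").split()
    anchors = span.findall(XHTML + "a")
    if "g2document" not in classes or not anchors:
        return None
    return {
        # First <a>: the GUID (an identifier, not meant to be dereferenced).
        "guid": anchors[0].get("href"),
        # Optional second <a>: where the delivered document can be found.
        "location": anchors[1].get("href") if len(anchors) > 1 else None,
        "type": next((G2_TYPES[c] for c in classes if c in G2_TYPES), None),
        # Visible text, with whitespace collapsed.
        "text": " ".join("".join(span.itertext()).split()),
    }


print(parse_g2document(ET.fromstring(SAMPLE)))
```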

The class attribute may also be used to specify "ignorableTextFalse" or "ignorableTextTrue". These class names are meant to assist you if you need to remove links from the HTML content (this is a common need for some of our clients).

ignorableTextFalse

ignorableTextFalse means that if you process the HTML in order to remove links then not removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the links:

Pierre Bergé quoted  
<span class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    a recent AFP news story
</span>
to illustrate...

After removing links the fragment should be:

Pierre Bergé quoted a recent AFP news story to illustrate...

ignorableTextTrue

ignorableTextTrue means that if you process the HTML in order to remove links then also removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the links:

Some text before.
<span class="g2document g2text ignorableTextTrue">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    This AFP news story provides additional information.
</span>
Some text after.

After removing links the fragment should be:

Some text before. Some text after.

Data specific to visual content

Some data is associated with visual content. It may be present in picture, video, still graphic and animated graphic documents. It may also be present in picture, video, still graphic and animated graphic items of multimedia documents. This section details these data elements.

Caption

Picture, video, still graphic, animated graphic documents: a caption may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <description role="afpdescRole:contentDescription">
                    French businessman and head of Sidaction organisation Pierre Berge
                    attends at Marigny theater in Paris.
                </description>
                <description role="afpdescRole:contextDescription">
                    This is the first of the four auction days led by Christie's of 
                    Yves Saint-Laurent and Pierre Berge collection, which profit will 
                    fund campaigns against HIV-AIDS.
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: a caption may be provided in the content metadata section of each news item conveying picture, video, still graphic or animated graphic content.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- Caption for the content of this item -->
                <description role="afpdescRole:contentDescription">
                    French businessman and head of Sidaction organisation Pierre Berge
                    attends at Marigny theater in Paris. This is the first of the four auction days led by Christie's of 
                    Yves Saint-Laurent and Pierre Berge collection, which profit will 
                    fund campaigns against HIV-AIDS.
                </description>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- Caption for the content of this other item -->
                <description role="afpdescRole:contentDescription">
                    Christie's auctioneer François de Ricqles proceeds with the auction 
                    of a rabbit head, a Chinese imperial bronze on February 25, 2009 
                    at the Grand Palais in Paris. This object is part of a prized art collection assembled by 
                    Yves Saint Laurent and his partner Pierre Berge over half a 
                    century. One of the world's great private collections, it takes
                    in masterpieces by Picasso, Mondrian and Matisse, old masters, Art
                    Deco gems, bronzes, enamels and antiques. Two looted Chinese bronzes
                    sold for 15.7 million euros (20.3 million dollars) each to anonymous
                    telephone bidders at the Yves Saint Laurent art sale on Wednesday, 
                    despite protests from Beijing.
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

In picture, video, still graphic or animated graphic documents, the caption, if present, is provided in two parts. The content description is a concise textual description of what is shown in the visual content. The context description provides background information (e.g., context, meaning, etc.) about what is shown.

The content description may be provided in the associated news item by a description element whose role attribute, the QCode afpdescRole:contentDescription, resolves to http://cv.afp.com/descriptionRoles/contentDescription. The context description may be provided by a description element whose role attribute, the QCode afpdescRole:contextDescription, resolves to http://cv.afp.com/descriptionRoles/contextDescription.

In multimedia documents, the caption of each visual component is provided in one part, as shown in the example above.

There is no caption for text content. In picture, video, still graphic and animated graphic documents, there is a single news item, which, consequently, is the one that may provide a caption. For multimedia documents, the caption of each picture, video, still graphic and animated graphic may appear in each corresponding news item. There is at most one caption per news item.

Note that while NewsML-G2 allows for rich text by using some markup in the content of a caption, AFP's systems only output simple textual content not interspersed with markup.

From time to time the AFP NewsML-G2 format evolves, but you may still want to correctly process older documents that make use of previous versions of the format.

In older documents, captions are represented in a different way. In some documents the content description may be provided in the associated news item by a description element whose role attribute, the QCode afpdescRole:captionContentDescription, resolves to http://cv.afp.com/descriptionRoles/captionContentDescription. The context description may be provided by a description element whose role attribute, the QCode afpdescRole:captionContext, resolves to http://cv.afp.com/descriptionRoles/captionContext.

In even older documents, the content description and context description may not be provided as separate elements but instead in a single description element whose role attribute, the QCode drol:caption, resolves to http://cv.iptc.org/newscodes/descriptionrole/caption.
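
A reader that copes with all three generations of captions could be sketched as follows in Python. The function name read_caption is hypothetical. The sketch matches role QCodes literally instead of resolving them through the catalog, assumes the news item is in the NewsML-G2 namespace, and, as a design choice of this sketch, returns a legacy single-part caption as the content description with no context description.

```python
import xml.etree.ElementTree as ET

NAR = "{http://iptc.org/std/nar/2006-10-01/}"

# Current role first, then the older ones described above.
CONTENT_ROLES = (
    "afpdescRole:contentDescription",
    "afpdescRole:captionContentDescription",
    "drol:caption",
)
CONTEXT_ROLES = (
    "afpdescRole:contextDescription",
    "afpdescRole:captionContext",
)

SAMPLE = """<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
    <contentMeta>
        <description role="afpdescRole:contentDescription">
            French businessman and head of Sidaction organisation Pierre Berge
            attends at Marigny theater in Paris.
        </description>
        <description role="afpdescRole:contextDescription">
            This is the first of the four auction days.
        </description>
    </contentMeta>
</newsItem>"""


def read_caption(news_item):
    """Return (content_description, context_description); each may be None."""
    by_role = {}
    for node in news_item.findall(f"{NAR}contentMeta/{NAR}description"):
        role = node.get("role")
        if role is not None and role not in by_role:
            # Collapse the whitespace introduced by XML indentation.
            by_role[role] = " ".join((node.text or "").split())

    def first(roles):
        return next((by_role[r] for r in roles if r in by_role), None)

    return first(CONTENT_ROLES), first(CONTEXT_ROLES)


content, context = read_caption(ET.fromstring(SAMPLE))
print(content)
print(context)
```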

Copyright Notice

Picture, video, still graphic, animated graphic documents: a copyright notice may be provided in the rights information of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <rightsInfo>
                <copyrightNotice>Copyright AFP or licensors</copyrightNotice>
            </rightsInfo>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: a copyright notice may be provided in the rights information of each news item conveying picture, video, still graphic or animated graphic content.

<newsMessage>
    <itemSet>
        <newsItem>
            <!-- A copyright notice for this item -->
            <rightsInfo>
                <copyrightNotice>Copyright AFP or licensors</copyrightNotice>
            </rightsInfo>
        </newsItem>
        <newsItem>
            <!-- A copyright notice for this item -->
            <rightsInfo>
                <copyrightNotice>Copyright AFP or licensors</copyrightNotice>
            </rightsInfo>
        </newsItem>
    </itemSet>
</newsMessage>

Note that while NewsML-G2 allows for rich text by using some markup in the content of a copyright notice, AFP's systems only output simple textual content not interspersed with markup.

Visual content

Basic format

Picture, video, still graphic, animated graphic documents: one or multiple links to visual content may be provided in the content set of the news item.

<newsMessage>
    <itemSet>
        <!-- A visual item with three different renditions of the same visual content -->
        <newsItem>
            <contentSet>
                <remoteContent href="pictureItem/image1.jpg"/>
                <remoteContent href="pictureItem/image2.jpg"/>
                <remoteContent href="ftp://example.com/image3.gif"/>
            </contentSet>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: one or multiple links to visual content may be provided in the content set of each news item conveying picture, video, still graphic or animated graphic content.

<newsMessage>
    <itemSet>
        <!-- A visual item with three different renditions of the same visual content -->
        <newsItem>
            <contentSet>
                <remoteContent href="pictureItem/image1.jpg"/>
                <remoteContent href="pictureItem/image2.jpg"/>
                <remoteContent href="ftp://example.com/image3.gif"/>
            </contentSet>
        </newsItem>
        
        <!-- Another visual item with two renditions of some other visual content -->
        <newsItem>
            <contentSet>
                <remoteContent href="videoItem/video1.mp4"/>
                <remoteContent href="http://example.com/video2.mp4"/>
            </contentSet>
        </newsItem>        
    </itemSet>
</newsMessage>

Links to the actual visual content (e.g., bitmaps, vector graphics, video frames, etc.) are provided by href attributes of remoteContent elements. The value of each href attribute is a URI reference (while NewsML-G2 allows for IRI references, AFP NewsML-G2 documents use only URI references). See section "Accessing visual content through URI references" for additional directions on how to use these links.

Each picture, video, still graphic and animated graphic news item carries information about one piece of visual content (i.e., one picture, video or graphic). However, this content may be available in multiple renditions (e.g., low resolution, high resolution, JPEG format, TIFF format, etc.). Each rendition is described by a remoteContent element in the content set of the item.

In standard NewsML-G2 "Each rendition [in the content set of a given news item] MUST represent the same visual content, differentiated only by physical properties such as content type and format. [Renditions in the content set of a given news item are] different technical representations of the same logical content". AFP renditions for picture and graphic content do not always abide by this rule: in addition to providing different technical representations of the same logical content, our renditions may also consist of crops or other alterations of the content provided by other renditions of the same news item.

Additional properties of renditions

For each rendition, some information may be provided by attributes on remoteContent elements. These attributes are described below.

Rendition type

To aid selecting renditions, the type of a rendition may be provided by a rendition attribute in the remoteContent element describing the rendition, as in this example:

<!-- Three descriptions of renditions of different types -->
<remoteContent rendition="rnd:lowRes"    href="pictureItem/image1.jpg"/>
<remoteContent rendition="rnd:highRes"   href="pictureItem/image2.jpg"/>
<remoteContent rendition="rnd:thumbnail" href="pictureItem/image3.gif"/>

At the time of writing, some remoteContent elements may be delivered with no rendition attribute. For instance, this is the case for renditions in PostScript or PDF format for still graphics; these nevertheless carry a contenttype attribute identifying the format, as detailed in the section about rendition formats.

The rendition attribute provides a QCode whose possible values are taken from an IPTC controlled vocabulary and from AFP controlled vocabularies. The following tables provide examples of such values.

Examples of rendition types for picture documents
Concept URI	QCode	Description
`http://cv.iptc.org/newscodes/rendition/highRes`	`rnd:highRes`	High resolution image
`http://cv.iptc.org/newscodes/rendition/preview`	`rnd:preview`	Preview resolution image
`http://cv.iptc.org/newscodes/rendition/thumbnail`	`rnd:thumbnail`	A very small rendition of an image, giving only a general idea of its content

Examples of rendition types for still graphic documents
Concept URI	QCode	Description
`http://cv.afp.com/renditions/AIcs11`	`afprnd:AIcs11`	Rendition in Adobe Creative Suite 11 format
`http://cv.iptc.org/newscodes/rendition/highRes`	`rnd:highRes`	High resolution image
`http://cv.afp.com/renditions/jpeg_retina`	`afprnd:jpeg_retina`	A JPEG image in retina resolution. Typically, it contains four times more pixels than the jpeg_standard rendition.
`http://cv.afp.com/renditions/jpeg_standard`	`afprnd:jpeg_standard`	A JPEG image in standard resolution
`http://cv.afp.com/renditions/png_retina`	`afprnd:png_retina`	A PNG image in retina resolution. Typically, it contains four times more pixels than the png_standard rendition.
`http://cv.afp.com/renditions/png_standard`	`afprnd:png_standard`	A PNG image in standard resolution
`http://cv.iptc.org/newscodes/rendition/preview`	`rnd:preview`	Preview resolution image
`http://cv.iptc.org/newscodes/rendition/thumbnail`	`rnd:thumbnail`	A very small rendition of an image, giving only a general idea of its content

Examples of rendition types for visual components in multimedia documents
Concept URI	QCode	Description
`http://cv.iptc.org/newscodes/rendition/fullSize`	`afprnd:fullSize`	Documentation forthcoming
`http://cv.afp.com/renditions/highDef`	`afprnd:highDef`	Rendition of the highest definition of a visual component in a multimedia document
`http://cv.afp.com/renditions/ipad`	`afprnd:ipad`	Content intended to appear on iPad
`http://cv.iptc.org/newscodes/rendition/mobile`	`rnd:mobile`	Content intended to appear on a mobile or handheld device
`http://cv.afp.com/renditions/squaredThumbnail`	`afprnd:squaredThumbnail`	A small squared rendition of an image
`http://cv.iptc.org/newscodes/rendition/thumbnail`	`rnd:thumbnail`	A very small rendition of an image, giving only a general idea of its content
`http://cv.iptc.org/newscodes/rendition/web`	`rnd:web`	Content intended to appear on a web page

Examples of rendition types for interactive documents
Concept URI	QCode	Description
`http://cv.afp.com/renditions/interactive`	`afprnd:interactive`	The interactive rendition
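
As an illustration of how rendition types can drive selection, here is a minimal sketch in Python. It is not part of any AFP tooling. The alias table SCHEME_ALIASES, the function names and the fourth href (image4.pdf) are assumptions made for this example. In a real document, scheme aliases must be resolved through the document's catalog, and elements live in the NewsML-G2 namespace, which the sample omits just as the fragments above do.

```python
import xml.etree.ElementTree as ET

# Assumed scheme aliases; a real document declares them through its catalog.
SCHEME_ALIASES = {
    "rnd": "http://cv.iptc.org/newscodes/rendition/",
    "afprnd": "http://cv.afp.com/renditions/",
}


def resolve_qcode(qcode, aliases=SCHEME_ALIASES):
    """Turn a QCode such as 'rnd:highRes' into its concept URI."""
    alias, _, code = qcode.partition(":")
    return aliases[alias] + code


def pick_rendition(content_set, preferred_uris):
    """Return the href of the first rendition found in preference order, or None."""
    by_uri = {}
    for rc in content_set.iter("remoteContent"):
        qcode = rc.get("rendition")
        if qcode is None:
            # Renditions delivered without a type are ignored by this sketch.
            continue
        by_uri.setdefault(resolve_qcode(qcode), rc.get("href"))
    for uri in preferred_uris:
        if uri in by_uri:
            return by_uri[uri]
    return None


# Namespace-free sample; the last element (hypothetical) has no rendition type.
SAMPLE = """<contentSet>
  <remoteContent rendition="rnd:lowRes"    href="pictureItem/image1.jpg"/>
  <remoteContent rendition="rnd:highRes"   href="pictureItem/image2.jpg"/>
  <remoteContent rendition="rnd:thumbnail" href="pictureItem/image3.gif"/>
  <remoteContent contenttype="application/pdf" href="pictureItem/image4.pdf"/>
</contentSet>"""

content_set = ET.fromstring(SAMPLE)
print(pick_rendition(content_set, [
    "http://cv.iptc.org/newscodes/rendition/preview",
    "http://cv.iptc.org/newscodes/rendition/highRes",
]))
```

Comparing concept URIs rather than raw QCodes keeps the selection independent of the alias a given document happens to use.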

Media type and format

The media type of a rendition may be provided by a contenttype attribute on the remoteContent element describing the rendition, as in this example:

<!-- Three descriptions of renditions, each one with a media type -->
<remoteContent contenttype="image/jpeg" href="pictureItem/image1.jpg"/>
<remoteContent contenttype="image/jpeg" href="pictureItem/image2.jpg"/>
<remoteContent contenttype="image/gif"  href="pictureItem/image3.gif"/>

The value of the contenttype attribute is an IANA MIME media type name [MediaTypes].

The contenttype attribute may be complemented by a format attribute to refine information about the data format of the rendition. For example:

<!-- Three descriptions of renditions, each one with a media type complemented by a format -->
<remoteContent contenttype="image/jpeg" format="example:JPEG_Baseline"    
               href="pictureItem/image1.jpg"/>
<remoteContent contenttype="image/jpeg" format="example:JPEG_Progressive" 
               href="pictureItem/image2.jpg"/>
<remoteContent contenttype="image/gif"  format="example:GIF87a"
               href="pictureItem/image3.gif"/>

Visual dimensions

The width and height of a rendition may be provided by width and height attributes (whose values are non-negative integers) on the remoteContent element describing the rendition. The units in which these dimensions are expressed may be provided by widthunit and heightunit attributes. These attributes provide QCodes whose possible values are in the controlled vocabulary defined by IPTC for dimension units (cf. [IPTCDimUnits]). For example:

<remoteContent width ="640" widthunit ="dimensionunit:pixels" 
               height="400" heightunit="dimensionunit:pixels" href="pictureItem/image1.jpg"/>

This fragment states that the visual content at pictureItem/image1.jpg is 640 pixels wide and 400 pixels high (in this example, we suppose that dimensionunit is a scheme alias for the controlled vocabulary defined by IPTC for dimension units).

The possible dimension units are a subset of the IPTC dimension units controlled vocabulary. They are provided in the table below, where the "Concept URI" column gives the URI to which the heightunit and/or widthunit attributes resolve.

Dimension units
Unit	QCode	Concept URI
Pixel	`dimensionunit:pixels`	`http://cv.iptc.org/newscodes/dimensionunit/pixels`
Typographic Point	`dimensionunit:points`	`http://cv.iptc.org/newscodes/dimensionunit/points`
Millimeter	`dimensionunit:mm`	`http://cv.iptc.org/newscodes/dimensionunit/mm`

If a width and/or a height attribute is present but the corresponding dimension unit attribute is missing, then you must assume that the width and/or height is expressed in the default unit for that dimension. The default dimension units, which are specified by NewsML-G2, are given in the table below.

Default dimension units
Type of visual content	Default height unit	Default width unit
Picture	pixels	pixels
Graphic (still or animated)	points	points
Digital video	pixels	pixels
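
The fallback to default units can be sketched as follows in Python. This is only an illustration under stated assumptions: the content-kind labels ("picture", "graphic", "video") and the function name are hypothetical, and the code assumes that whatever alias appears in widthunit or heightunit resolves to the IPTC dimension unit vocabulary, which a real implementation should verify through the catalog.

```python
import xml.etree.ElementTree as ET

DIMENSION_UNIT_BASE = "http://cv.iptc.org/newscodes/dimensionunit/"

# Default units from the table above, keyed by hypothetical content-kind labels.
DEFAULT_UNIT = {"picture": "pixels", "graphic": "points", "video": "pixels"}


def dimension(element, name, content_kind):
    """Return (value, unit concept URI) for 'width' or 'height', or None if absent."""
    raw = element.get(name)
    if raw is None:
        return None
    unit_qcode = element.get(name + "unit")
    if unit_qcode is None:
        # No explicit unit: fall back to the default for this kind of content.
        unit = DEFAULT_UNIT[content_kind]
    else:
        # Assumption: the alias part resolves to the IPTC dimension unit scheme.
        unit = unit_qcode.partition(":")[2]
    return int(raw), DIMENSION_UNIT_BASE + unit


# A still graphic rendition whose width has no explicit unit.
rc = ET.fromstring(
    '<remoteContent width="640" height="400" '
    'heightunit="dimensionunit:pixels" href="pictureItem/image1.jpg"/>'
)
print(dimension(rc, "width", "graphic"))
print(dimension(rc, "height", "graphic"))
```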

Size

The size in bytes of a rendition may be provided by a size attribute on the remoteContent element describing the rendition, as in this example:

<remoteContent size="253476" href="pictureItem/image1.jpg"/>

In this example, the size attribute asserts that the representation of the resource identified by pictureItem/image1.jpg weighs 253476 bytes.

The value of the size attribute is a non-negative integer.

Data specific to picture and still graphic content

Some data is only present in picture and still graphic documents, and in picture and still graphic items of multimedia documents. This section describes these data elements.

Note that picture and still graphic documents/items also contain data common to visual content (see section "Data specific to visual content") and, of course, data common to all kinds of content (see section "Common data").

Additional data about visual content

As described in the section "Visual content", a given visual may have multiple renditions, each one described by a remoteContent element. This section describes additional data that may be used to describe a picture or still graphic rendition.

Orientation

The "orientation" of a rendition is an indication of orientation change from the original digital image. It may be provided by an orientation attribute on the remoteContent element describing the rendition. The value of this attribute is an integer in the range of 1 to 8 (inclusive). For example:

<remoteContent orientation="5" href="pictureItem/image1.jpg"/>

This fragment states that the image at pictureItem/image1.jpg has been flipped about the vertical axis and rotated 90 degrees counterclockwise with regard to the original image. See the NewsML-G2 specification for a comprehensive description of the meaning of each value.

If no orientation attribute is present, you should assume a value of 1, which means "upright, no flip, no rotation" (i.e., the visual top of the original image is at the top, the visual left side of the original image is on the left, etc.).
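
Reading the attribute with its default can be sketched in Python as below. The function name is hypothetical, and the sketch deliberately does not interpret the individual values, for which the NewsML-G2 specification remains the reference.

```python
import xml.etree.ElementTree as ET


def orientation(element):
    """Return the orientation of a rendition, assuming 1 when the attribute is absent."""
    value = int(element.get("orientation", "1"))
    if not 1 <= value <= 8:
        raise ValueError(f"orientation out of range 1..8: {value}")
    return value


flipped = ET.fromstring('<remoteContent orientation="5" href="pictureItem/image1.jpg"/>')
upright = ET.fromstring('<remoteContent href="pictureItem/image1.jpg"/>')
print(orientation(flipped), orientation(upright))
```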

Illustration images (aka previews or thumbnails)

Small illustration images may be provided as part of the content set through remoteContent elements, just like other renditions. They are distinguished by the value of their rendition attribute; e.g., http://cv.iptc.org/newscodes/rendition/thumbnail, http://cv.afp.com/renditions/squaredThumbnail. See the section on visual content for detailed information.

Note that illustration images for video or animated graphics are provided in a different way, as described in the section on icons.

Data specific to video and animated graphic content

Some data is only present in video and animated graphic documents, and in video and animated graphic items of multimedia documents. This section describes these data elements.

Note that video and animated graphic documents/items also contain data common to visual content (see section "Data specific to visual content") and, of course, data common to all kinds of content (see section "Common data").

Additional data about visual content

Duration

The duration of a rendition may be provided by a duration attribute (a non-negative integer) on the remoteContent element describing the rendition. The unit in which the duration is expressed may be provided by a durationunit attribute. This attribute provides a QCode whose possible values are in a subset of the controlled vocabulary for time units defined by IPTC [IPTCTimeUnits]. For example:

<remoteContent duration="120" durationunit="timeunit:seconds" 
               href="http://example.com/video2.mp4"/>

This fragment states that the content at http://example.com/video2.mp4 lasts 120 seconds (in this example, we suppose that timeunit is a scheme alias for the controlled vocabulary defined by IPTC for time units).

Possible time units are given in the table below, where the "Concept URI" column gives the concept URI to which the QCode provided by durationunit resolves.

Time units for video or animated graphic duration
Unit	QCode	Concept URI
Edit Unit	`timeunit:editUnit`	`http://cv.iptc.org/newscodes/timeunit/editUnit`
Second	`timeunit:seconds`	`http://cv.iptc.org/newscodes/timeunit/seconds`
Millisecond	`timeunit:milliseconds`	`http://cv.iptc.org/newscodes/timeunit/milliseconds`

If a duration attribute is present without a durationunit attribute, then you must assume that the duration is expressed in seconds.
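
A minimal Python sketch of this rule follows. The function name is hypothetical, and the code assumes that the alias used in durationunit resolves to the IPTC time unit vocabulary. Durations expressed in edit units are returned as None because this sketch has no information with which to convert them to seconds.

```python
import xml.etree.ElementTree as ET

# Divisors used to express a duration in seconds, for the units this sketch converts.
UNITS_PER_SECOND = {"seconds": 1, "milliseconds": 1000}


def duration_in_seconds(element):
    """Return the duration in seconds, or None if absent or not convertible here."""
    raw = element.get("duration")
    if raw is None:
        return None
    unit_qcode = element.get("durationunit")
    # Without a durationunit attribute, the duration is expressed in seconds.
    unit = "seconds" if unit_qcode is None else unit_qcode.partition(":")[2]
    divisor = UNITS_PER_SECOND.get(unit)
    if divisor is None:
        # For example editUnit: left unconverted by this sketch.
        return None
    return int(raw) / divisor


explicit = ET.fromstring(
    '<remoteContent duration="120" durationunit="timeunit:seconds" '
    'href="http://example.com/video2.mp4"/>'
)
implicit = ET.fromstring('<remoteContent duration="120" href="http://example.com/video2.mp4"/>')
millis = ET.fromstring(
    '<remoteContent duration="1500" durationunit="timeunit:milliseconds" '
    'href="http://example.com/video2.mp4"/>'
)
print(duration_in_seconds(explicit), duration_in_seconds(implicit), duration_in_seconds(millis))
```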

Icon (aka illustration or preview image)

Basic format

Video and animated graphic documents: icon renditions may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <!-- A visual item with two icons -->
        <newsItem>
            <contentMeta>
                <icon href="http://example.com/img1.jpg"/>
                <icon href="icons/img2.tiff"/>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: icon renditions may be provided in the content meta of each news item conveying video or animated graphic content.

<newsMessage>
    <itemSet>
        <!-- A video or animated graphic item with two icon renditions -->
        <newsItem>
            <contentMeta>
                <icon href="http://example.com/img1.jpg"/>
                <icon href="icons/img2.tiff"/>
            </contentMeta>
        </newsItem>
        <!-- A video or animated graphic item with one icon rendition -->
        <newsItem>
            <contentMeta>
                <icon href="ftp://example.com/img3.jpg"/>
            </contentMeta>
        </newsItem>        
    </itemSet>
</newsMessage>

An icon is an image illustrating a video or an animated graphic (in NewsML-G2, an icon can also be associated with pictures or still graphics, but AFP documents do not use this feature). An icon is typically a keyframe of the visual content, but it can also be a logo or any other illustration.

Each video or animated graphic document, and each video or animated graphic item of a multimedia document, may have at most one logical visual content as its icon. However, this content may be available in multiple renditions (e.g., low resolution, high resolution, JPEG format, TIFF format, etc.). Each rendition is described by an icon element in the content metadata section of the news item.

Links to the actual icon renditions are provided by href attributes of icon elements. The value of each href attribute is a URI reference (while NewsML-G2 allows for IRI references, AFP systems only output URI references). See section "Accessing visual content through URI references" for additional directions on how to use these links.

In standard NewsML-G2 "Each [icon] rendition [in the content metadata section of a given news item] MUST represent the same visual content, differentiated only by physical properties such as content type and format". AFP icon renditions do not always abide by this rule: in addition to providing different technical representations of the same visual content, our icon renditions may also consist of crops or other alterations of the content provided by other icon renditions.

For each icon rendition, some information might be provided by attributes on icon elements. These attributes are described below.

Icon rendition type

To aid selecting icon renditions, the type of a rendition may be provided by a rendition attribute in the icon element describing the rendition, as in this example:

<!-- Two icon renditions of different types -->
<icon rendition="rnd:thumbnail"  href="icons/img1.jpg"/>
<icon rendition="afprnd:squaredThumbnail" href="icons/img2.tiff"/>

The rendition attribute provides a QCode whose possible values are taken from an IPTC controlled vocabulary and from AFP controlled vocabularies. Typical values are shown below.

Icon rendition types
QCode	Concept URI	Description
`rnd:thumbnail`	`http://cv.iptc.org/newscodes/rendition/thumbnail`	A very small rendition of an image, giving only a general idea of its content
`afprnd:squaredThumbnail`	`http://cv.afp.com/renditions/squaredThumbnail`	A small squared rendition of an image

Media type and format

The media type of an icon rendition may be provided by a contenttype attribute on the icon element describing the rendition, as in this example:

<!-- Two descriptions of icon renditions of different types -->
<icon contenttype="image/jpeg" href="icons/img1.jpg"/>
<icon contenttype="image/tiff" href="icons/img2.tiff"/>

The value of the contenttype attribute is an IANA MIME media type name [MediaTypes].

The contenttype attribute may be complemented by a format attribute to refine information about the data format of the icon rendition. For example:

<!-- Two descriptions of icon renditions,
     each one with a media type complemented by a format -->
<icon contenttype="image/jpeg" format="example:JPEG_Baseline" href="icons/img1.jpg"/>
<icon contenttype="image/tiff" format="example:NSK-TIFF"      href="icons/img2.tiff"/>

Visual dimensions

The width and height of an icon rendition may be provided by width and height attributes (whose values are non-negative integers) on the icon element describing the rendition. The units for these dimensions may be provided by widthunit and heightunit attributes. These attributes provide QCodes whose possible values are in a subset of the controlled vocabulary for dimension units defined by IPTC [IPTCDimUnits]. For example:

<icon width ="640" widthunit ="dimensionunit:pixels" 
      height="400" heightunit="dimensionunit:pixels" href="icons/img1.jpeg"/>

This fragment states that the visual content at icons/img1.jpeg is 640 pixels wide and 400 pixels high (in this example, we suppose that dimensionunit is a scheme alias for the controlled vocabulary defined by IPTC for dimension units).

Dimension units
Unit	QCode	Concept URI
Pixels	`dimensionunit:pixels`	`http://cv.iptc.org/newscodes/dimensionunit/pixels`

If a width and/or a height attribute is present but the corresponding dimension unit attribute is missing, then you can assume that the width and/or height is expressed in pixels.

Size

The size in bytes of an icon rendition may be provided by a size attribute on the icon element describing the rendition, as in this example:

<icon size="253476" href="icons/img1.jpeg"/>

In this example, the size attribute asserts that the representation of the resource identified by icons/img1.jpeg weighs 253476 bytes.

The value of the size attribute is a non-negative integer.
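
To show how the icon attributes described in this section fit together, here is a small Python sketch that collects them into one record per icon rendition. The IconRendition class and the function names are hypothetical. The sample omits the NewsML-G2 namespace, as the fragments above do, and combines attributes from the separate examples into a single icon element for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IconRendition:
    href: Optional[str]
    rendition: Optional[str]    # QCode, e.g. "rnd:thumbnail"
    contenttype: Optional[str]  # IANA media type name
    width: Optional[int]        # in pixels when no widthunit is given
    height: Optional[int]       # in pixels when no heightunit is given
    size: Optional[int]         # in bytes


def _to_int(value):
    """Convert an optional attribute value to an integer."""
    return None if value is None else int(value)


def icon_renditions(content_meta) -> List[IconRendition]:
    """Build one record per icon element found in a contentMeta element."""
    return [
        IconRendition(
            href=icon.get("href"),
            rendition=icon.get("rendition"),
            contenttype=icon.get("contenttype"),
            width=_to_int(icon.get("width")),
            height=_to_int(icon.get("height")),
            size=_to_int(icon.get("size")),
        )
        for icon in content_meta.iter("icon")
    ]


SAMPLE = """<contentMeta>
  <icon rendition="rnd:thumbnail" contenttype="image/jpeg"
        width="640" height="400" size="253476" href="icons/img1.jpeg"/>
  <icon contenttype="image/tiff" href="icons/img2.tiff"/>
</contentMeta>"""

icons = icon_renditions(ET.fromstring(SAMPLE))
for icon in icons:
    print(icon)
```

Since every attribute other than href may be absent, the record keeps each of them optional rather than inventing a value.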

Script (aka verbatim or transcript)

Video and animated graphic documents: a script may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <description role="afpdescRole:script">
                    A rare glimpse of the art behind the label. 
                    What Yves Saint Laurent earned in the fashion industry he spent on 
                    masterpieces. At Christie’s auction house in London, a treasure trove of
                    paintings, sculpture, furniture and jewellery amassed by the fashion 
                    icon and his lover and business partner Pierre Bergé -- over a 50 year 
                    partnership.

                    SOUNDBITE 1: Thomas Seydoux, International Co-Head of Department, 
                    Christie’s Europe [English, 13 sec]:
                    "It's unprecedented - I mean we've never sold a collection in recent 
                    memory of that sort of outstanding quality throughout and I think it's
                    going to be most welcome by collectors who don't have that often a 
                    chance to acquire pieces of such quality"

                    Following the death of Yves Saint Laurent last year, Bergé chose to sell
                    the couple’s entire collection, which adorned their apartments in Paris.

                    For him, the sale is about finding some degree of closure: 

                    SOUNDBITE 2: Pierre Bergé, co-founder Yves Saint Laurent Couture house 
                    [French, 16 sec]: "C’est le jour ou le dernier objet sera passé sous le 
                    marteau d'un commissaire priseur que à mon sens – a mon sens - cette 
                    collection pourra écrire le mot fin."

                    "Only on the day that the last piece goes under the hammer of an 
                    auctioneer – in my view – will the last word of this collection be 
                    written"

                    In spite of the global economic slowdown, Christie’s hopes the 
                    collection will fetch around 400 million dollars when it goes up for 
                    sale in Paris at the end of February.

                    A cubist-era Picasso – valued at 40 million dollars – and a rare 
                    selection of Mondrians are among the highlights. But for Yves Saint
                    Laurent and Pierre Bergé, it was not about the price tags – more the
                    enjoyment of living amongst beautiful art.

                    SOUNDBITE 3: Jonathan Rendell, Deputy Chairman, Christie’s Americas 
                    [English, 19 sec]: "There was a great sense of everything being in the
                    right place - nothing dominating -and no trophies. I think it is a 
                    collection that's formed by two incredibly intelligent people working 
                    completely in concert with eachother - that's very unusual."

                    But it’s an unusual bond that is soon to be broken up amongst 
                    collectors, dealers and museums – the end of a long reign for 
                    the king of fashion.
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: a script may be provided in the content metadata section of each news item conveying video or animated graphic content.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- A script for the content of this item -->
                <description role="afpdescRole:script">
                    A rare glimpse of the art behind the label. 
                    What Yves Saint Laurent earned in the fashion industry he spent on 
                    masterpieces. At Christie’s auction house in London, a treasure trove of
                    paintings, sculpture, furniture and jewellery amassed by the fashion 
                    icon and his lover and business partner Pierre Bergé -- over a 50 year 
                    partnership.

                    SOUNDBITE 1: Thomas Seydoux, International Co-Head of Department, 
                    Christie’s Europe [English, 13 sec]:
                    "It's unprecedented - I mean we've never sold a collection in recent 
                    memory of that sort of outstanding quality throughout and I think it's
                    going to be most welcome by collectors who don't have that often a 
                    chance to acquire pieces of such quality"
                    ...
                    ...
                 </description>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- A script for the content of this item -->
                <description role="afpdescRole:script">
                    Hundreds of art buyers and lovers from around the world came for the
                    biggest private collection ever up for auction.
                    
                    SOUNDBITE 1: Vox pop (woman) (english, 3 sec)
                    "I arrived two days ago to attend the sale." 

                    SOUNDBITE 2: Vox pop (man) (English, 4 sec)
                    "I came especially for the exhibition. Going back to New York very
                    shortly."
                    ...
                    ...
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

A script, if present, provides the transcript of voices that can be heard in the video. This may include voices recorded when the video was shot as well as audio commentary written and voiced by a journalist, which is added to the images and recounts the events of the story. It may also contain indications of significant sounds (e.g., "the sound of an explosion"). These elements are provided in their order of occurrence in the video or animated graphic.

A script is provided by a description element whose role attribute, the QCode afpdescRole:script, resolves to http://cv.afp.com/descriptionRoles/script. It may appear at most once per item.

Note that in some documents, the content of a description element whose role attribute resolves to http://cv.afp.com/descriptionRoles/script isn't a voice/sound transcript or isn't only a voice/sound transcript:

It may contain only a "suggested script". One may have to listen to the video to determine whether the text is an actual transcript. Alternatively, this may be signaled in the text by a mention such as "Suggested script:".
It may contain a transcript of actual voices/sounds intermingled with "suggested script" elements. One may have to listen to the video to work out which text is an actual transcript and which is a suggested element.
It may contain a shot list, either standalone or in addition to the elements described above.

Shot lists have their dedicated slots in this XML format (see section "Shot list"), but in some documents they appear in the slots for scripts. For example, here is a description element that contains both a script and a shot list (we show only partial content):

<description role="afpdescRole:script">
    Script:
    Hundreds of art buyers and lovers from around the world came for the biggest 
    private collection ever up for auction.
    
    SOUNDBITE 1: Vox pop (woman) (english, 3 sec)
    "I arrived two days ago to attend the sale."
    ...
    ...
    
    Shotlist: (shot Feb 23, 2009)
    -wide of auctioneer
    -painting on screen
    -Berge arriving at auction
    -SOUNDBITE 1: Vox pop (woman) (english, 3 sec)
    -SOUNDBITE 2: Vox pop (man) (English, 4 sec)
    -close up of Matisse
    ...
    ...
</description>

Note that while NewsML-G2 allows for rich text by using some markup in the content of a script, AFP's systems only output simple textual content not interspersed with markup.
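
Extracting the text of a description by role can be sketched in Python as follows. The alias table and the function name are assumptions made for this example. In a real document the afpdescRole alias must be resolved through the catalog, and elements live in the NewsML-G2 namespace, which the sample omits as the fragments above do. Since AFP output contains only plain text here, reading the text of the element is enough.

```python
import textwrap
import xml.etree.ElementTree as ET

# Assumed scheme alias; a real document declares it through its catalog.
SCHEME_ALIASES = {"afpdescRole": "http://cv.afp.com/descriptionRoles/"}


def description_text(content_meta, role_uri):
    """Return the de-indented text of the description with the given role, or None."""
    for desc in content_meta.iter("description"):
        alias, _, code = desc.get("role", "").partition(":")
        if SCHEME_ALIASES.get(alias, "") + code == role_uri:
            # Remove the indentation inherited from the XML layout.
            return textwrap.dedent(desc.text or "").strip()
    return None


SAMPLE = """<contentMeta>
  <description role="afpdescRole:script">
      Hundreds of art buyers and lovers from around the world came for the
      biggest private collection ever up for auction.
  </description>
  <description role="afpdescRole:shotList">
      -wide of auctioneer
      -painting on screen
  </description>
</contentMeta>"""

content_meta = ET.fromstring(SAMPLE)
script = description_text(content_meta, "http://cv.afp.com/descriptionRoles/script")
print(script)
```

The returned text is whatever the slot contains. As explained above, it may hold a suggested script or a shot list in addition to, or instead of, an actual transcript, which this sketch does not try to detect.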

Shot list

Video and animated graphic documents: a shot list may be provided in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <description role="afpdescRole:shotList">
                    -Member of Christie's staff walking in front of paintings
                    -Photographers
                    -Tilt of YSL poster
                    -VAR Christie's member of staff with metal art works
                    -VAR Theodore Gericault painting
                    -Thomas Seydoux, International Co-Head of Department, Christie’s Europe 
                    -PAN of photo of YSL's flat in Paris
                    -SOUNDBITE 2: Pierre Bergé, co-founder Yves Saint Laurent Couture house
                    -Paintings on wall
                    -VAR Ferdinand Leger painting
                    -Picasso painting
                    -Woman looking at painting
                    -VAR Frans Hals portrait
                    -SOUNDBITE 3: Jonathan Rendell, Deputy Chairman, Christie’s Americas 
                    -People walking through gallery
                    -Tilt to poster of YSL
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: a shot list may be provided in the content metadata section of each news item conveying video or animated graphic content.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- A shot list for the content of this item -->
                <description role="afpdescRole:shotList">
                    -Member of Christie's staff walking in front of paintings
                    -Photographers
                    -Tilt of YSL poster
                    ...
                    ...
               </description>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- A shot list for the content of this item -->
                <description role="afpdescRole:shotList">
                    -wide of auctioneer
                    -painting on screen
                    -Berge arriving at auction
                    -SOUNDBITE 1: Vox pop (woman) (english, 3 sec)
                    -SOUNDBITE 2: Vox pop (man) (English, 4 sec)
                    -close up of Matisse
                    ...
                    ...
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

A shot list, if present, provides a concise description of each sequence. These elements are provided in their order of occurrence in the video or animated graphic.

A shot list is provided by a description element whose role attribute, the QCode afpdescRole:shotList, resolves to http://cv.afp.com/descriptionRoles/shotList. It may appear there at most once per item.

In some documents, the shot list isn't provided in this way but appears concatenated to the script (see section "Script" for an example).

The exact format of a shot list may not be the same for all kinds of documents and may also vary according to local journalistic practices.

Note that while NewsML-G2 allows for rich text by using some markup in the content of a shot list, AFP's systems only output simple textual content not interspersed with markup.
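
For shot lists that follow the layout of the examples above (one entry per line, each introduced by a hyphen), the text can be split into entries with a heuristic such as the Python sketch below. This is only an assumption about one common layout, not a rule of the format: because the exact format varies, any such parsing should degrade gracefully. The function name is hypothetical.

```python
def shot_list_entries(text):
    """Split a hyphen-introduced shot list into entries (heuristic, layout-dependent)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("-") or not entries:
            # A new entry: drop the leading hyphen(s) and surrounding spaces.
            entries.append(line.lstrip("-").strip())
        else:
            # A line without a hyphen is treated as the continuation of the previous entry.
            entries[-1] += " " + line
    return entries


SAMPLE = """
    -wide of auctioneer
    -painting on screen
    -Berge arriving
     at auction
"""
print(shot_list_entries(SAMPLE))
```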

Speakers heard during audio or film recording (aka synthe)

Video and animated graphic documents: Speakers heard during audio or film recording may be described in the content metadata section of the news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <description role="afpdescRole:synthe">
                    -Thomas Seydoux (man), International Co-Head of Department,
                     Christie’s  Europe 
                    -Pierre Bergé (man), co-founder Yves Saint Laurent Couture house
                    -Jonathan Rendell (man), Deputy Chairman, Christie’s Americas                
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Multimedia documents: Speakers heard during audio or film recording may be described in the content metadata section of each news item conveying video or animated graphic content.

<newsMessage>
    <itemSet>
        <newsItem>
            <contentMeta>
                <!-- Speakers heard during recording the content of this item -->
                <description role="afpdescRole:synthe">
                    -Thomas Seydoux (man), International Co-Head of Department,
                     Christie’s  Europe 
                    -Pierre Bergé (man), co-founder Yves Saint Laurent Couture house
                    -Jonathan Rendell (man), Deputy Chairman, Christie’s Americas                
               </description>
            </contentMeta>
        </newsItem>
        <newsItem>
            <contentMeta>
                <!-- Speakers heard during recording the content of this item -->
                <description role="afpdescRole:synthe">
                    -Vox pop woman
                    -Vox pop man
                    -Pierre Berge (man), Yves Saint Laurent's partner
                </description>
            </contentMeta>
        </newsItem>
    </itemSet>
</newsMessage>

Specific information may be provided about speakers heard during audio or film recording when much of the value of a clip lies in what is said. In most clips these speakers appear in the images, but this is not always the case.

This information may be provided by a description element whose role attribute, the QCode afpdescRole:synthe, resolves to http://cv.afp.com/descriptionRoles/synthe. It may appear at most once per item. This information is provided in the order of occurrence of speakers in the video or animated graphic.

This information typically includes speakers' name and function. It can be used, for example, to add captions accompanying speakers' appearances in the video.

Note that while NewsML-G2 allows for rich text by using some markup in description elements, AFP's systems only output simple textual content not interspersed with markup.

Data specific to multimedia documents

Some data is specific to multimedia documents. This section details these data elements.

Number of non-main items by nature

Multimedia documents: the number of non-main items broken down by item natures may be provided in the item metadata section of the main news item.

<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:afp="http://www.afp.com/format/internal/">
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
                <afp:extension>
                    <afp:stats>
                        <afp:totalComponentsOfType qcode="ninat:graphic" total="1" />
                        <afp:totalComponentsOfType qcode="ninat:picture" total="3" />
                    </afp:stats>
                </afp:extension>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

As shown above, each totalComponentsOfType element provides the number of non-main items of a given nature present in the document. The qcode attribute specifies the nature, as described in the following table:

Natures of multimedia non-main items
Type	QCode	Concept URI
Picture	`ninat:picture`	`http://cv.iptc.org/newscodes/ninature/picture`
Video	`ninat:video`	`http://cv.iptc.org/newscodes/ninature/video`
Still graphic	`ninat:graphic`	`http://cv.iptc.org/newscodes/ninature/graphic`
Animated graphic	`ninat:animated`	`http://cv.iptc.org/newscodes/ninature/animated`

The total attribute provides the number of items of the given nature, as a strictly positive integer. If the stats element is present, the absence of a totalComponentsOfType element for a given nature means that no non-main item of that nature is present in the document.

The totalComponentsOfType elements appear inside a stats element inside an extension element in the item metadata section of the main news item. Note that the totalComponentsOfType, stats and extension elements are not standard NewsML-G2 vocabulary but part of an AFP-specific extension. They are defined in an XML namespace whose name is http://www.afp.com/format/internal/.

Therefore, here is how to interpret the example given at the beginning of this section:

The presence of <afp:totalComponentsOfType qcode="ninat:graphic" total="1" /> means that there is one still graphic item in the document.
The presence of <afp:totalComponentsOfType qcode="ninat:picture" total="3" /> means that there are three picture items in the document.
The absence of a totalComponentsOfType element for the other item natures means that there is no animated graphic or video item in the document.

The extension and stats elements are optional (i.e., they may or may not be present). When they are present, they appear at most once per document.
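
The interpretation rules above can be sketched in Python as follows. The function name and the NATURES tuple are hypothetical, and the code assumes that the ninat alias resolves to the IPTC item nature vocabulary listed in the table above. It returns None when no stats element is present, so that "no statistics delivered" is not confused with "zero items".

```python
import xml.etree.ElementTree as ET

AFP_NS = "http://www.afp.com/format/internal/"

# Codes of the item natures listed in the table above.
NATURES = ("picture", "video", "graphic", "animated")


def non_main_item_counts(item_meta):
    """Return {nature code: count} from the AFP stats extension, or None if absent."""
    stats = item_meta.find(f".//{{{AFP_NS}}}stats")
    if stats is None:
        return None
    # A nature with no totalComponentsOfType element has no non-main item.
    counts = dict.fromkeys(NATURES, 0)
    for total in stats.findall(f"{{{AFP_NS}}}totalComponentsOfType"):
        # Assumption: the alias part of the QCode resolves to the IPTC ninature scheme.
        code = total.get("qcode").partition(":")[2]
        counts[code] = int(total.get("total"))
    return counts


SAMPLE = """<itemMeta xmlns="http://iptc.org/std/nar/2006-10-01/"
          xmlns:afp="http://www.afp.com/format/internal/">
  <afp:extension>
    <afp:stats>
      <afp:totalComponentsOfType qcode="ninat:graphic" total="1" />
      <afp:totalComponentsOfType qcode="ninat:picture" total="3" />
    </afp:stats>
  </afp:extension>
</itemMeta>"""

counts = non_main_item_counts(ET.fromstring(SAMPLE))
print(counts)
```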

Multimedia content expressed using the XML syntax of HTML

Multimedia documents: the multimedia content is provided using the XML syntax of HTML in the content set of the main news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
            </itemMeta>
            <contentSet>
                <inlineXML contenttype="application/xhtml+xml">
                    <html xmlns="http://www.w3.org/1999/xhtml">
                        <head>
                            <title>
                                YSL-Bergé collection sets new world record at auction 
                                for a private collection
                            </title>
                        </head>
                        <body>
                            <p>
                                The Yves Saint Laurent and Pierre Bergé collection sets 
                                new world record at auction for a private collection. 
                                Hundreds of art treasures amassed by late fashion designer
                                Yves Saint Laurent and his companion Pierre Berge over half
                                a century are being auctioned.
                            </p>
                            <p>
                                <!-- Embedded content from a picture item -->
                                <span class="g2item g2picture">
                                    <a style="display: none" href="urn:newsml:afp.com:20100101:7a0846c9-e341-45dc-a3a2"></a>
                                    <img src="image1.jpeg" style="float: left;" 
                                         generator-unable-to-provide-required-alt="" height="163" width="245" />
                                </span>
                            </p>    
                            <p>
                                Bids hit 206 million euros (261 million dollars) on February
                                23, 2009 making it the biggest private collection ever 
                                auctioned with two days of sales still left to run.
                            </p>
                            <p>
                                <!--  Embedded content from a video item -->
                                <span class="g2item g2video">
                                    <a style="display: none" href="urn:newsml:afp.com:20100101:7633a15b-a990-4db6-9052"></a>
                                    <video style="float: right;" controls="controls" height="138" width="245"
                                           poster="keyframe1.jpeg">
                                        <source src="video1.mp4" type="video/mp4" />
                                    </video>
                                </span>
                            </p>
                            <p>
                                <!-- An hypertext link to an external resource -->
                                The <a class="ignorableTextFalse" href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">
                                wikipedia page about Yves Saint-Laurent</a> claims that ...
                            </p>
                        </body>
                    </html>
                </inlineXML>
            </contentSet>
        </newsItem>
        <newsItem guid="urn:newsml:afp.com:20100101:7a0846c9-e341-45dc-a3a2">
            ...
        </newsItem>
        <newsItem guid="urn:newsml:afp.com:20100101:7633a15b-a990-4db6-9052">
            ...
        </newsItem>
    </itemSet>
</newsMessage>

The multimedia content expressed using the XML syntax of HTML is the main journalistic content of the document. It is provided by an inlineXML element. A contenttype attribute with a value of application/xhtml+xml explicitly denotes the usage of the XML syntax of HTML.

The multimedia content contains the main textual content intermingled with links and audiovisual content. As shown in the example above, some parts of this content (e.g., pictures, videos, etc.) may be described by their own news items. These parts are referred to as "components". The news items describing them are themselves part of the NewsML-G2 document.

You can see in the example above that we use a microformat [Microformat] to denote a component and the reference to the news item that describes it. This allows us to provide displayable information (e.g., an img tag) along with semantic markup (e.g., the reference to the news item) which can be machine-processed by your system.

This microformat consists of a span element with a class attribute that contains "g2item". In addition, we provide another class name denoting the type of the referenced item (e.g., "g2picture", "g2video", etc.).

The first child element of such a span is always the reference to the news item that describes the component. It is represented as an a tag whose href attribute provides the GUID of the news item. This element is marked as non-displayable as it is not meant to be directly displayed. Following this element, additional HTML markup defines embedded content for displaying a default rendition of this component. For example, a document may contain an img element displaying a picture.

This microformat is called the g2item microformat. Another microformat called the g2document microformat is used to represent links to other NewsML-G2 documents. It is described in its dedicated section below.
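As an illustration of how the g2item microformat can be machine-processed, here is a minimal Python sketch that lists the components found in the HTML content. The function name and the abridged sample are assumptions made for this example. The sketch treats the first class name other than "g2item" as the type of the referenced item.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def find_components(html_root):
    """Return a list of (guid, type_class) pairs, one per g2item span."""
    components = []
    for span in html_root.iter(f"{{{XHTML_NS}}}span"):
        classes = span.get("class", "").split()
        if "g2item" not in classes or len(span) == 0:
            continue
        # The first child element is the reference to the describing news item.
        guid = span[0].get("href")
        type_classes = [name for name in classes if name != "g2item"]
        components.append((guid, type_classes[0] if type_classes else None))
    return components


# Abridged, hypothetical content used only to exercise the function.
SAMPLE = """<body xmlns="http://www.w3.org/1999/xhtml">
  <p><span class="g2item g2picture">
    <a style="display: none" href="urn:newsml:afp.com:20100101:7a0846c9-e341-45dc-a3a2"></a>
    <img src="image1.jpeg" height="163" width="245" />
  </span></p>
  <p><span class="g2item g2video">
    <a style="display: none" href="urn:newsml:afp.com:20100101:7633a15b-a990-4db6-9052"></a>
    <video controls="controls" poster="keyframe1.jpeg">
      <source src="video1.mp4" type="video/mp4" />
    </video>
  </span></p>
</body>"""

for guid, type_class in find_components(ET.fromstring(SAMPLE)):
    print(type_class, guid)
```

Each GUID obtained this way can then be matched against the guid attribute of the newsItem elements in the same document.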

The following sections detail how various types of components and links are represented.

Picture

The class name "g2item" signals that we use the g2item microformat: the span represents a component along with a reference to the associated news item. The class name "g2picture" denotes that the referenced news item provides picture content. Inside the span, the first element provides the guid of that news item. The second element defines embedded content for displaying a default rendition of the picture, using a standard HTML img tag. For example:

<span class="g2item g2picture">
    <a style="display: none" href="urn:newsml:afp.com:20100101:7a0846c9-e341-45dc-a3a2"></a>
    <img src="image1.jpeg" style="float: left;" 
         generator-unable-to-provide-required-alt="" height="163" width="245" />
</span>

Still graphic

An embedded still graphic is defined like an embedded picture, except that the span element uses the class name g2graphic instead of g2picture. For example:

<span class="g2item g2graphic">
    <a style="display: none" href="urn:newsml:afp.com:20100101:7a123456-a542-76fg-ab6a"></a>
    <img src="image1.jpeg" style="float: left;" 
         generator-unable-to-provide-required-alt="" height="163" width="245"/>
</span>

Video

For embedded video we also use the g2item microformat. The class name g2video denotes that the referenced news item provides video content. Inside the span, the first element provides the guid of that news item. The embedded video is then defined using a standard HTML video tag. An illustration image may be provided by a poster attribute, and additional attributes such as autoplay, loop, etc. may be used as well. For example:

<span class="g2item g2video">
    <a style="display: none" href="urn:newsml:afp.com:20100101:7633a15b-a990-4db6-9052"></a>
    <video style="float: right;" controls="controls" height="138" width="245"
           poster="keyframe1.jpeg">
        <source src="video1.mp4" type="video/mp4" />
    </video>
</span>

Hypertext links to other resources

The HTML can contain hypertext links to other resources such as Web pages. They may be provided by a elements. For example, here is a link to a Wikipedia page:

<a class="ignorableTextFalse"
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)" >wikipedia page about Yves Saint-Laurent</a>

ignorableTextFalse

ignorableTextFalse means that if you process the HTML in order to remove links, then keeping the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the hypertext links:

Pierre Bergé quoted the 
<a class="ignorableTextFalse" 
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">wikipedia page about Yves Saint-Laurent</a>
to illustrate...

After removing hypertext links the fragment should be:

Pierre Bergé quoted the wikipedia page about Yves Saint-Laurent to illustrate...

ignorableTextTrue

ignorableTextTrue means that if you process the HTML in order to remove links, then also removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the hypertext links:

Some text before.
<a class="ignorableTextTrue" 
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">
   This Web page provides additional information.
</a> 
 Some text after.

After removing hypertext links the fragment should be:

Some text before. Some text after.
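The two removal behaviours can be sketched as follows in Python. The function names and the samples are assumptions made for this example. The sketch only touches a elements that carry one of the two class names, so other a elements (such as the hidden references used by the microformats) are left alone, and it normalizes whitespace when producing plain text.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
A_TAG = f"{{{XHTML_NS}}}a"


def remove_hypertext_links(root):
    """Remove, in place, the a elements that carry an ignorableText* class."""
    for parent in list(root.iter()):
        # Walk backwards so that deletions do not shift the indexes still to visit.
        for index in range(len(parent) - 1, -1, -1):
            link = parent[index]
            if link.tag != A_TAG:
                continue
            classes = link.get("class", "").split()
            if "ignorableTextTrue" in classes:
                kept = ""  # drop the link together with its text
            elif "ignorableTextFalse" in classes:
                kept = "".join(link.itertext())  # keep the text, drop the markup
            else:
                continue
            replacement = kept + (link.tail or "")
            if index == 0:
                parent.text = (parent.text or "") + replacement
            else:
                previous = parent[index - 1]
                previous.tail = (previous.tail or "") + replacement
            del parent[index]


def text_without_links(xml_text):
    """Parse a fragment, remove its links and return whitespace-normalized text."""
    root = ET.fromstring(xml_text)
    remove_hypertext_links(root)
    return " ".join("".join(root.itertext()).split())


KEEP_SAMPLE = """<p xmlns="http://www.w3.org/1999/xhtml">Pierre Bergé quoted the
<a class="ignorableTextFalse"
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">wikipedia page about Yves Saint-Laurent</a>
to illustrate...</p>"""

DROP_SAMPLE = """<p xmlns="http://www.w3.org/1999/xhtml">Some text before.
<a class="ignorableTextTrue"
   href="http://en.wikipedia.org/wiki/Yves_Saint_Laurent_(designer)">
   This Web page provides additional information.
</a>
 Some text after.</p>"""

print(text_without_links(KEEP_SAMPLE))
print(text_without_links(DROP_SAMPLE))
```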

Links to other NewsML-G2 documents

<span class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    some text
</span>

The content of the span element is organized as follows:

The first child element of such a span is an a tag whose href attribute provides the GUID of the NewsML-G2 document. Note that while it may look like a dereferenceable URI, it actually isn't. This element is marked as non-displayable as it is not meant to be directly displayed.
Following this element, another non-displayable a tag may provide the dereferenceable URI reference of the NewsML-G2 document. Typically, this element will be present if the AFP delivery system determines that it has delivered the corresponding document to you and knows where to locate it in your delivery space.
Finally, we provide the part of the textual content the other NewsML-G2 document is associated with.

The following table lists the class names used to specify the type of a referenced NewsML-G2 document. See the overview section for a presentation of the various document types.

Types of referenced NewsML-G2 document
Class name	Type
g2text	Text
g2multimedia	Multimedia
g2picture	Picture
g2graphic	Still graphic
g2animated	Animated graphic
g2video	Video
g2liveReport	Live report index
g2interactive	Interactive graphic
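A minimal Python sketch of reading the g2document microformat is given below. The function name, the dictionary keys and the sample are assumptions made for this example. The sketch assumes that every a element directly inside the span is one of the non-displayable references described above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Class names from the table above.
DOCUMENT_TYPES = {
    "g2text": "Text",
    "g2multimedia": "Multimedia",
    "g2picture": "Picture",
    "g2graphic": "Still graphic",
    "g2animated": "Animated graphic",
    "g2video": "Video",
    "g2liveReport": "Live report index",
    "g2interactive": "Interactive graphic",
}


def find_document_links(html_root):
    """Return one dictionary per g2document span found under html_root."""
    links = []
    for span in html_root.iter(f"{{{XHTML_NS}}}span"):
        classes = span.get("class", "").split()
        if "g2document" not in classes:
            continue
        anchors = [child for child in span if child.tag == f"{{{XHTML_NS}}}a"]
        if not anchors:
            continue
        links.append({
            "guid": anchors[0].get("href"),
            # The second reference is optional.
            "uri": anchors[1].get("href") if len(anchors) > 1 else None,
            "type": next((DOCUMENT_TYPES[c] for c in classes if c in DOCUMENT_TYPES), None),
            "text": " ".join("".join(span.itertext()).split()),
        })
    return links


SAMPLE = """<p xmlns="http://www.w3.org/1999/xhtml">
<span class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    some text
</span>
</p>"""

print(find_document_links(ET.fromstring(SAMPLE)))
```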

ignorableTextFalse

ignorableTextFalse means that if you process the HTML in order to remove links, then keeping the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the links:

Pierre Bergé quoted  
<span class="g2document g2text ignorableTextFalse">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    a recent AFP news story
</span>
to illustrate...

After removing links the fragment should be:

Pierre Bergé quoted a recent AFP news story to illustrate...

ignorableTextTrue

ignorableTextTrue means that if you process the HTML in order to remove links, then also removing the text associated with this link will produce a better result.

For example, suppose that the HTML contains the following fragment before removing the links:

Some text before.
<span class="g2document g2text ignorableTextTrue">
    <a style="display: none" href="http://doc.afp.com/7W37U"></a>
    <a style="display: none" href="otherDocument.xml"></a>
    This AFP news story provides additional information.
</span>
Some text after.

After removing links the fragment should be:

Some text before. Some text after.

Data specific to Live report posts

Live report posts are represented by multimedia documents. They can contain additional dedicated metadata, as described in this section.

Live report intertitle

Live report posts: the indication that a post is an intertitle is provided in the item metadata section of the main news item.

<newsMessage>
    <itemSet>
        <newsItem>
            <itemMeta>
                <!-- This link element tells that this news item is the main item of the multimedia document  -->
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
                
                <!-- This link element tells that this multimedia document represents an intertitle in a live report  -->
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/liveReportIntertitle"/>
            </itemMeta>
        </newsItem>
     </itemSet>
</newsMessage>

While most posts carry a news bit about the ongoing event being reported, some differ as they represent intertitles. An intertitle typically provides some text describing a phase of the ongoing event, or another grouping of a subset of posts. An intertitle is identified by the presence of a specific element in the item metadata section of its main item: a link element whose rel attribute conveys the concept URI http://cv.iptc.org/newscodes/conceptrelation/isA (using the QCode crel:isA) and whose href attribute is the URI http://cv.afp.com/itemnatures/liveReportIntertitle.
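The detection of an intertitle can be sketched in Python as follows. The function name and the sample are assumptions made for this example. The sketch compares the rel QCode as a literal string instead of resolving it, and does so case-insensitively, which is a choice made for this example only. It locates the main item through its link to http://cv.afp.com/itemnatures/mmdMainComp.

```python
import xml.etree.ElementTree as ET

NAR_NS = "http://iptc.org/std/nar/2006-10-01/"
MAIN_ITEM_URI = "http://cv.afp.com/itemnatures/mmdMainComp"
INTERTITLE_URI = "http://cv.afp.com/itemnatures/liveReportIntertitle"


def is_intertitle(root):
    """Tell whether the main news item is flagged as a live report intertitle."""
    for item_meta in root.iter(f"{{{NAR_NS}}}itemMeta"):
        targets = {
            link.get("href")
            for link in item_meta.findall(f"{{{NAR_NS}}}link")
            if link.get("rel", "").lower() == "crel:isa"
        }
        if MAIN_ITEM_URI in targets:
            return INTERTITLE_URI in targets
    return False


# Abridged, hypothetical document used only to exercise the function.
SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet><newsItem><itemMeta>
    <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
    <link rel="crel:isa" href="http://cv.afp.com/itemnatures/liveReportIntertitle"/>
  </itemMeta></newsItem></itemSet>
</newsMessage>"""

print(is_intertitle(ET.fromstring(SAMPLE)))
```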

Timestamp in Live Report

Live report posts: the timestamp in live report is provided in the item metadata section of the main news item.

<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:afp="http://www.afp.com/format/internal/">
    <itemSet>
        <newsItem>
            <itemMeta>
                <link rel="crel:isa" href="http://cv.afp.com/itemnatures/mmdMainComp"/>
                <afp:extension>
                    <afp:timestampInLiveReport>
                        <afp:date>2016-07-09T15:30:33.928Z</afp:date>
                        <afp:label>15h30</afp:label>
                    </afp:timestampInLiveReport>
                </afp:extension>
            </itemMeta>
        </newsItem>
    </itemSet>
</newsMessage>

The timestamp in live report is provided for multimedia documents that represent posts in live reports. Each post is associated with a timestamp. This timestamp is provided by a timestampInLiveReport element in an extension element inside the item metadata section. It is made of:

a precise date/time that determines the chronological order of posts in the live report. It is provided as a W3C XML Schema 1.0 date/time by a date element.
a label that is meant to be displayed along with the content of the post, and is tailored to the context. For example, timestamp labels in a live report for a soccer match may be expressed in minutes since the beginning of the match: "Min 45", "Min 46", etc.

These extension, timestampInLiveReport, date and label elements are in the XML namespace http://www.afp.com/format/internal/.
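Here is a minimal Python sketch of reading this timestamp. The function name and the sample are assumptions made for this example. The date parsing covers values shaped like the one in the example above (UTC, denoted by a trailing "Z"). A complete implementation would need to accept every form allowed by the XML Schema date/time type.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

AFP_NS = "http://www.afp.com/format/internal/"


def read_timestamp(root):
    """Return (date, label) for a live report post, or None if there is no timestamp."""
    path = f".//{{{AFP_NS}}}extension/{{{AFP_NS}}}timestampInLiveReport"
    element = root.find(path)
    if element is None:
        return None
    date_text = element.findtext(f"{{{AFP_NS}}}date")
    if date_text is None:
        return None
    date_text = date_text.strip()
    if date_text.endswith("Z"):
        # Spell out the UTC offset so that older Python versions can parse it.
        date_text = date_text[:-1] + "+00:00"
    label = element.findtext(f"{{{AFP_NS}}}label")
    return datetime.fromisoformat(date_text), label


# Abridged, hypothetical document used only to exercise the function.
SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/"
                         xmlns:afp="http://www.afp.com/format/internal/">
  <itemSet><newsItem><itemMeta>
    <afp:extension>
      <afp:timestampInLiveReport>
        <afp:date>2016-07-09T15:30:33.928Z</afp:date>
        <afp:label>15h30</afp:label>
      </afp:timestampInLiveReport>
    </afp:extension>
  </itemMeta></newsItem></itemSet>
</newsMessage>"""

print(read_timestamp(ET.fromstring(SAMPLE)))
```

The date is suited to sorting posts, while the label is what a reader would see next to the post.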

Data specific to live report indexes

Some data is specific to live report indexes. This section details these data elements.

Lead

Live report indexes: a lead of the live report may be provided in the content metadata section of the package item.

<newsMessage>
    <itemSet>
        <packageItem>
            <contentMeta>
                <description role="afpdescRole:lead">
                    <html:html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
                        <head />
                        <body>
                            <p>Live inside Christie's auction of Yves Saint-Laurent/Bergé collection.</p>
                            <p>Auction sparks huge interest. Follow our report and analysis live.</p>
                        </body>
                    </html:html>
                </description>
            </contentMeta>
        </packageItem>
    </itemSet>
</newsMessage>

A "lead" for the live report may be provided by a description element whose role attribute, the QCode afpdescRole:lead, resolves to http://cv.afp.com/descriptionRoles/lead. Inside this element the lead is provided using the XML syntax of HTML in an html element in namespace http://www.w3.org/1999/xhtml.

When present, the lead contains a short description (typically around one hundred words) of what the live report is about.
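As an illustration, the lead can be extracted as a list of paragraphs with a short Python sketch. The function name and the sample are assumptions made for this example, and the role QCode is compared as a literal string rather than resolved to its concept URI.

```python
import xml.etree.ElementTree as ET

NAR_NS = "http://iptc.org/std/nar/2006-10-01/"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def read_lead(root):
    """Return the paragraphs of the lead as plain text, or None if there is no lead."""
    for description in root.iter(f"{{{NAR_NS}}}description"):
        if description.get("role") != "afpdescRole:lead":
            continue
        return [
            " ".join("".join(paragraph.itertext()).split())
            for paragraph in description.iter(f"{{{XHTML_NS}}}p")
        ]
    return None


# Abridged, hypothetical document used only to exercise the function.
SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <itemSet><packageItem><contentMeta>
    <description role="afpdescRole:lead">
      <html xmlns="http://www.w3.org/1999/xhtml">
        <head />
        <body>
          <p>Live inside Christie's auction of Yves Saint-Laurent/Bergé collection.</p>
          <p>Auction sparks huge interest. Follow our report and analysis live.</p>
        </body>
      </html>
    </description>
  </contentMeta></packageItem></itemSet>
</newsMessage>"""

print(read_lead(ET.fromstring(SAMPLE)))
```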

List of posts

Live report indexes: the list of posts of the live report is provided in the groupSet section of the package item.

<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:afp="http://www.afp.com/format/internal/">
    <itemSet>
        <packageItem>
            <groupSet>
                <group role="afpgroup:elements">
                    <!-- An example of a live report index with three posts.
                    As a story develops, real live reports can include tens or hundreds of posts. -->
                    <itemRef href="d-oc1ku.xml">
                        <afp:iteminfo>
                            <headline>Auction opens</headline>
                        </afp:iteminfo>
                    </itemRef>
                    <itemRef href="d-oc02w.xml">
                        <afp:iteminfo>
                            <headline>Christie's shows ten most intriguing pieces</headline>
                        </afp:iteminfo>
                    </itemRef>
                    <itemRef href="d-ob2p7.xml">
                        <afp:iteminfo>
                            <headline>Press conference scheduled at 7 PM</headline>
                        </afp:iteminfo>
                    </itemRef>                    
                </group>
            </groupSet>
        </packageItem>
    </itemSet>
</newsMessage>

The list of posts is provided as a list of links to the NewsML-G2 documents that represent individual posts. These links are provided inside the group set of the package item, in a group element whose role attribute, the QCode afpgroup:elements, resolves to http://cv.afp.com/grouproles/elements. Each link is provided by an itemRef element, through an href attribute (see the NewsML-G2 documentation [G2Doc] for more information about the itemRef construct).

Inside each itemRef, an itemInfo element in the XML namespace http://www.afp.com/format/internal/ may provide a title for the post in a headline element.

The list is chronologically ordered: the first itemRef links to the most recent post, the second itemRef links to the second most recent, etc.
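Reading the list of posts can be sketched in Python as follows. The function names and the sample are assumptions made for this example. The group role QCode is compared as a literal string. The element holding the headline is matched on its local name, ignoring letter case, and the headline itself is matched on its local name only. Both are choices made for this example.

```python
import xml.etree.ElementTree as ET

NAR_NS = "http://iptc.org/std/nar/2006-10-01/"
AFP_NS = "http://www.afp.com/format/internal/"


def local_name(element):
    """Return the tag of an element without its namespace part."""
    return element.tag.rsplit("}", 1)[-1]


def find_headline(item_ref):
    """Return the optional title attached to an itemRef, or None."""
    for info in item_ref:
        if info.tag.startswith(f"{{{AFP_NS}}}") and local_name(info).lower() == "iteminfo":
            for child in info:
                if local_name(child) == "headline":
                    return " ".join((child.text or "").split())
    return None


def list_posts(root):
    """Return (href, headline) pairs in document order."""
    posts = []
    for group in root.iter(f"{{{NAR_NS}}}group"):
        if group.get("role") != "afpgroup:elements":
            continue
        for item_ref in group.findall(f"{{{NAR_NS}}}itemRef"):
            posts.append((item_ref.get("href"), find_headline(item_ref)))
    return posts


# Abridged, hypothetical document used only to exercise the functions.
SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/"
                         xmlns:afp="http://www.afp.com/format/internal/">
  <itemSet><packageItem><groupSet>
    <group role="afpgroup:elements">
      <itemRef href="d-oc1ku.xml">
        <afp:iteminfo><headline>Auction opens</headline></afp:iteminfo>
      </itemRef>
      <itemRef href="d-oc02w.xml" />
    </group>
  </groupSet></packageItem></itemSet>
</newsMessage>"""

for href, headline in list_posts(ET.fromstring(SAMPLE)):
    print(href, headline)
```

Because the list is already ordered, the sketch preserves document order and does not sort.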

Accessing visual content through URI references

In a document, a number of elements provide links to actual visual content in formats such as JPEG, MPEG-4, etc. Some of these elements are defined by NewsML-G2 while others are defined by HTML, as AFP text and multimedia documents can contain HTML (in XML syntax) embedded right into NewsML-G2. For example, such links can be provided by:

href attributes in remoteContent and icon elements.
src attributes in img elements, video elements, etc.
poster attributes in video elements.
etc.

A link of this type is a URI reference as defined by [RFC3986]. This means it is either a URI or a relative-ref (colloquially referred to as a "relative URI").

At some point when dealing with a NewsML-G2 document, you'll typically want to retrieve the actual visual content, in order to process or display it.

If the link is a (non-relative) URI per [RFC3986], you can directly dereference it, using standard software components, to retrieve the actual visual content. Typically, the schemes used for such URIs depend on the specific delivery architecture established between you and AFP. Examples of commonly used schemes are: http, ftp and cid.

If the link is a relative-ref, then you need to resolve it to its target URI. You can then dereference the target URI to retrieve the actual visual content.

Note that with most standard libraries providing URI reference resolution, resolving a (non-relative) URI is the identity operation. That way, you don't have to determine whether you have been handed a (non-relative) URI or a relative-ref: you can just resolve the URI reference and then dereference it to retrieve the actual visual content.

Section 5 of [RFC3986] defines the process of resolving a URI reference. To carry out this process, you need the URI reference itself (as stated earlier, it is provided in the document, for example in an href attribute, src attribute, etc.) and a base URI. Typically the base URI is the URI that allows retrieving the NewsML-G2 document.

For example, if AFP delivers you a package that contains both an AFP NewsML-G2 document and data files for the associated visual content, the base URI is the URI that allows accessing the NewsML-G2 document after delivery. Suppose AFP delivers content in your file system in the directory "/deliverySpace/internet-journal/topnews/", producing the following file structure:

Sample delivery structure

In this context, the base URI is the URI that allows accessing the NewsML-G2 document after delivery. If your NewsML-G2 processor accesses the NewsML-G2 document at file:///deliverySpace/internet-journal/topnews/doc.afp.com-9719Z-2.xml, then this is the base URI. The URI references linking to the visual content can be resolved relative to this base URI. For example, the URI reference 5b9c11cbf6871cb93696bebab8bdbc2c16afc44b-highDef.jpg would resolve to file:///deliverySpace/internet-journal/topnews/5b9c11cbf6871cb93696bebab8bdbc2c16afc44b-highDef.jpg, which can then be dereferenced to access that particular visual content.

Several libraries provide URI reference resolution. For instance, in Java, one could use the resolve() method of the java.net.URI class.
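In Python, the urljoin() function of the standard urllib.parse module offers comparable resolution. The sketch below replays the example above. The resolve() wrapper is a name chosen for this example, and the http URI is hypothetical, used only to show that an already absolute URI is returned unchanged.

```python
from urllib.parse import urljoin

# Base URI from the example above: where the NewsML-G2 document is accessed.
BASE_URI = "file:///deliverySpace/internet-journal/topnews/doc.afp.com-9719Z-2.xml"


def resolve(reference, base_uri=BASE_URI):
    """Resolve a URI reference (absolute URI or relative-ref) against a base URI."""
    return urljoin(base_uri, reference)


# A relative-ref is resolved against the base URI.
print(resolve("5b9c11cbf6871cb93696bebab8bdbc2c16afc44b-highDef.jpg"))

# A (non-relative) URI comes back as is (hypothetical URI).
print(resolve("http://example.com/picture.jpg"))
```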

Release Notes

August 2021

The section Role in workflow has been enhanced to show that a flash can be followed by an urgent but not by an alert.

The section on caption has been thoroughly rewritten to explain that captions may be provided in two parts, the content description and the context description.

The new concept of renditions dedicated to cancelled documents has been documented in the section on publishing status.

The section on subjects has been completed to explain that some subjects are identified by an uri attribute. The section on locations that are subject matter of the document has been completed to show how a location can be specified using a geo URI.

In the section on locations from which the content originates, the entry about graphics has been corrected.

The section on mandatory processing has been enhanced.

The section on catchlines now states that multimedia documents may provide a catchline identified by the role http://cv.afp.com/headlineroles/introduction.

The section on subtitles now states that subtitles are only provided for text and multimedia documents and that usually there are at most two subtitles.

The XML syntax for HTML was formerly referred to as "XHTML". As the latest versions of the HTML living standard no longer use that term, this document no longer uses that term either.

This version also includes a number of editorial improvements.

July 2019

A section about mandatory processing has been added.

The sections about visual content rendition types and icon renditions types have been thoroughly updated.

A section about the copyright notice metadata has been added.

The section on content creation date now states that for photo combos, the content creation date we provide is the date of creation of the combo (instead of a shooting date).

A convergence effort between the metadata models of text and multimedia documents is underway in our production system. As a result the Related production and Role in workflow metadata may now be provided on multimedia documents. The documentation has been updated to reflect this change.

The section on publishing status, including information about cancelling documents, has been thoroughly rewritten to provide additional and more precise information.

Update about content warnings: our editorial system now makes use of the newly standardized content warning for "suffering". This documentation has been updated to reflect it.

The section about Visual Dimensions now states that the "millimeters" dimension unit may be used in AFP NewsML-G2 documents.

"Related interactive graphic" has been added to the section about related production.

This version also includes a number of editorial improvements.

March 2018

Major update for multimedia documents, including initial documentation of our HTML microformats.

The documentation now states that a location of origin of content can be a "point of interest", in addition to already documented types (city, country area, country). See section Locations From Which The Content Originates.

The documentation provides a more accurate description of the "synthe" metadata, now stating that it concerns speakers heard during audio or film recording where an important value of the clip consists of what is said. In previous versions it was described as applying only to visible speakers. See section Speakers heard during audio or film recording (aka synthe).

Tables listing the main languages used in AFP production and their corresponding BCP 47 codes are now provided. See sections Language of the content and Language of metadata.

Various editorial improvements.

August 2016

The documentation has been updated thoroughly to allow processing AFP NewsML-G2 documents without resolving QCodes.

The documentation now states that along with event identifiers, the names of the events may be provided.

The documentation now states that posts in live report indexes are ordered chronologically (therefore it is no longer your responsibility to sort them).

The description of the "Timestamp in live report" metadata has been improved to include documentation for the label element.

The documentation of live reports now covers the notion of intertitle.

A number of improvements and clarifications have been made.

July 2016

The documentation for live reports has been added.

This document is now entirely self contained in one file, which makes it easier to distribute and use.

An important correction has been made: in previous versions of this documentation the concept URI for the "forbyline" role (cf. section on creators and contributors) was incorrectly specified as http://cv.afp.com/creatorroles/forbyline . This has been corrected; the correct concept URI is: http://cv.afp.com/contributorroles/forbyline.

A section on mentions of related production has been added.

An example has been added to the section on textual content of text documents showing that the content can contain hypertext links.

A number of improvements and clarifications have been made.

February 2016

This documentation has been updated thoroughly for text documents.

February 2014

Documentation updated thoroughly in preparation of public delivery of NewsML-G2 documents.

January 2012

Initial version.

References

[G2Doc]	"NewsML-G2 Documentation". IPTC. Available from https://iptc.org/standards/newsml-g2/using-newsml-g2/
[MediaTypes]	MIME Media Types. Available at http://www.iana.org/assignments/media-types/index.html
[IPTCCPNatures]	The IPTC controlled vocabulary for basic natures of concepts. Available at http://cv.iptc.org/newscodes/cpnature/
[IPTCDimUnits]	The IPTC controlled vocabulary for dimension units. Available at http://cv.iptc.org/newscodes/dimensionunit/
[IPTCGenres]	The IPTC controlled vocabulary for genres. Available at http://cv.iptc.org/newscodes/genre/
[IPTCLocTypes]	The IPTC controlled vocabulary for location types. Available at http://cv.iptc.org/newscodes/location/
[IPTCMediaTopics]	The IPTC controlled vocabulary for media topics. Available at http://cv.iptc.org/newscodes/mediatopic/
[IPTCNProviders]	The IPTC controlled vocabulary for news providers. Available at http://cv.iptc.org/newscodes/newsprovider/
[IPTCTimeUnits]	The IPTC controlled vocabulary for time units. Available at http://cv.iptc.org/newscodes/timeunit/
[IPTCCWarn]	The IPTC controlled vocabulary for content warnings. Available at http://cv.iptc.org/newscodes/contentwarning/
[ISO3166]	ISO 3166 Maintenance Agency. Available at http://www.iso.org/iso/country_codes.htm
[HTTPURI]	"RFC 2616, section 3.2: Uniform Resource Identifiers". R. Fielding & al. June 1999. Available at http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2
[RFC3085bis]	"URN Namespace for news-related resources". M. Steidl and J. Lorenzen. July 2009. Draft available at http://tools.ietf.org/html/draft-steidl-newsml-urn-rfc3085bis-00
[RFC3986]	"Uniform Resource Identifier (URI): Generic Syntax". T. Berners-Lee, R. Fielding and L. Masinter. January 2005. Available at http://tools.ietf.org/html/rfc3986
[RFC3987]	"Internationalized Resource Identifiers (IRIs)". M. Duerst and M. Suignard. January 2005. Available at http://www.ietf.org/rfc/rfc3987
[RFC5646]	"Tags for Identifying Languages". A. Phillips and M. Davis. September 2009. Available at http://tools.ietf.org/html/rfc5646
[RFC5870]	"A Uniform Resource Identifier for Geographic Locations ('geo' URI)". A. Mayrhofer and C. Spanring. June 2010. Available at http://tools.ietf.org/html/rfc5870
[TagCloud]	Wikipedia article on tag cloud. Available at http://en.wikipedia.org/wiki/Tag_Cloud
[XMLSchemaDataTypes]	XML Schema Part 2: Datatypes. Available at http://www.w3.org/TR/xmlschema-2/
[XMLSpec]	"Extensible Markup Language (XML) 1.0". Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau. Available at http://www.w3.org/TR/xml/
[Microformat]	Wikipedia article on microformats. Available at http://en.wikipedia.org/wiki/Microformat
[HTMLSpec]	HTML Living Standard. Available at https://html.spec.whatwg.org

Prepared and written by Philippe Mougin

Table of Contents

Introduction

Mandatory processing

Undocumented features

Overview

Text documents

Picture and still graphic documents

Video and animated graphic documents

Multimedia documents

Live reports

Document walk-through

Controlled vocabularies and qualified codes

Concepts are identified with concept URIs

Concepts URIs might be represented by QCodes in NewsML-G2 documents

Why it is useful to resolve QCodes to concept URIs

What to do if you can't implement QCode resolution with your tool chain

How to perform QCode resolution

How to read the examples

Common data

Creators & Contributors

Content warning

Correction signal

Dates

Document transmission date

Document creation date

Document version creation date

Content creation date

Picture, still graphic, animated graphic and video documents

Multimedia documents

Text documents

Live report indexes

Embargo

Event identifiers

General editorial note

Genres

Identifier and version number

Information sources

Keywords

Language of the content

Language of metadata

Locations

Locations from which the content originates (aka datelines)

Locations that are subject matter of the document

Products the document belongs to

Provider

Publishing Status

Subjects

Titles & subtitles

Titles

Subtitles

Type of document

Urgency

Data specific to text and multimedia documents

Catchline

Number of hypertext links to external resources in textual or multimedia content

Related production

Role in workflow

Word count

Data specific to text documents

Textual content

Hypertext links to other resources

ignorableTextFalse

ignorableTextTrue

Links to other NewsML-G2 documents

ignorableTextFalse

ignorableTextTrue

Data specific to visual content

Caption

Copyright Notice

Visual content

Basic format

Additional properties of renditions

Rendition type

Media type and format

Visual dimensions

Size

Data specific to picture and still graphic content

Additional data about visual content

Orientation