The IPTC News Architecture Working Group is happy to announce the release of NewsML-G2 version 2.34.
This version, approved at the IPTC Standards Committee Meeting at the New York Times offices on Wednesday 17th April 2024, contains one small change and one additional feature:
Change Request 218, increase nesting of <related> tags: this allows for <related> items to contain child <related> items, up to three levels of nesting. This can be applied to many NewsML-G2 elements:
- pubHistory/published
- QualRelPropType (used in itemClass, action)
- schemeMeta
- ConceptRelationshipsGroup (used in concept, event, Flex1PropType, Flex1RolePropType, FlexPersonPropType, FlexOrganisationPropType, FlexGeoAreaPropType, FlexPOIPropType, FlexPartyPropType, FlexLocationPropType)
Note that we chose not to allow for recursive nesting because this caused problems with some XML code generators and XML editors.
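The three-level limit can be checked mechanically. The following is a minimal sketch, assuming a namespace-free fragment (real NewsML-G2 documents declare the NewsML-G2 namespace) and placeholder `ex:` QCodes that are not real NewsCodes; the helper name `related_depth` is hypothetical.

```python
import xml.etree.ElementTree as ET

MAX_RELATED_DEPTH = 3  # limit described for NewsML-G2 2.34

# Hypothetical fragment: the "ex:" QCodes are placeholders for illustration only.
SAMPLE = """
<concept>
  <related qcode="ex:a">
    <related qcode="ex:b">
      <related qcode="ex:c"/>
    </related>
  </related>
</concept>
"""

def related_depth(element):
    """Return the length of the deepest chain of directly nested <related> children."""
    depths = [1 + related_depth(child) for child in element.findall("related")]
    return max(depths, default=0)

root = ET.fromstring(SAMPLE)
depth = related_depth(root)
print(depth, depth <= MAX_RELATED_DEPTH)  # prints: 3 True
```

A schema-aware validator would catch a fourth level anyway; a check like this is only useful when documents are assembled programmatically before validation.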
Change Request 219, add dataMining element to rightsInfo: In line with other IPTC standards such as the IPTC Photo Metadata Standard and Video Metadata Hub, we have added a new element to the <rightsInfo> block to convey a content owner’s wishes regarding data mining of the content. We recommend the use of the PLUS Vocabulary that is also recommended for the other IPTC standards: https://ns.useplus.org/LDF/ldf-XMPSpecification#DataMining
Here are some examples of its use:
Denying all Generative AI / Machine Learning training using this content:
<rightsInfo>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING"/>
</rightsInfo>
A simple text-based constraint:
<rightsInfo>
  <usageTerms>
    Data mining allowed for academic and research purposes only.
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>
A simple text-based constraint, expressed using a QCode instead of a URI:
<rightsInfo>
  <usageTerms>
    Reprint rights excluded.
  </usageTerms>
  <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>
A text-based constraint expressed in both English and French:
<rightsInfo>
  <usageTerms xml:lang="en">
    Reprint rights excluded.
  </usageTerms>
  <usageTerms xml:lang="fr">
    droits de réimpression exclus
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>
Using the “see embedded rights expression” constraint to express a complex machine-readable rights expression in RightsML:
<rightsInfo>
  <rightsExpressionXML langid="http://www.w3.org/ns/odrl/2/">
    <!-- RightsML goes here... -->
  </rightsExpressionXML>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR"/>
</rightsInfo>
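A consumer of these documents might normalise the URI and QCode forms to a single value before acting on them. The sketch below is one possible approach, not an official IPTC tool: it assumes a namespace-free fragment, and it assumes that the `plusvocab` alias resolves to the `http://ns.useplus.org/ldf/vocab/` base URI, as the paired examples above suggest. In a real document that mapping would come from a catalog. The function name `data_mining_uris` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: alias-to-URI mapping inferred from the examples; normally declared in a catalog.
SCHEME_ALIASES = {"plusvocab": "http://ns.useplus.org/ldf/vocab/"}

SAMPLE = """
<rightsInfo>
  <usageTerms>Data mining allowed for academic and research purposes only.</usageTerms>
  <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT"/>
</rightsInfo>
"""

def data_mining_uris(rights_info):
    """Return the full URI of each <dataMining> child, given as either uri or qcode."""
    uris = []
    for node in rights_info.findall("dataMining"):
        if node.get("uri"):
            uris.append(node.get("uri"))
        elif node.get("qcode"):
            alias, _, code = node.get("qcode").partition(":")
            uris.append(SCHEME_ALIASES[alias] + code)
    return uris

rights = ET.fromstring(SAMPLE)
for uri in data_mining_uris(rights):
    print(uri)
```

When the resolved value ends in SEECONSTRAINT, the human-readable condition is in the accompanying usageTerms element, so an automated pipeline would typically need to hand that case to a person.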
For more information, contact the IPTC News Architecture Working Group via the public NewsML-G2 mailing list.
The IPTC News Architecture Working Group is happy to announce that the NewsML-G2 Guidelines and NewsML-G2 Specification documents have been updated to align with version 2.33 of NewsML-G2, which was approved in October 2023.
The changes include:
Specification changes:
- Adding authoritystatus and digitalSourceType, introduced in NewsML-G2 versions 2.32 and 2.33 respectively
- Clarification on how @uri, @qcode and @literal attributes should be treated throughout
- Clarification on how roles should be added to infosource element when an entity plays more than one role
- Clarifying and improving cross-references and links throughout the document
Guidelines changes:
- Documentation of the Authority Status attribute and its related vocabulary, added in version 2.32
- Documentation of the Digital Source Type element and its related vocabulary, added in version 2.33
- Clarification on how @uri, @qcode and @literal attributes should be treated throughout
- Clarifying and improving cross-references and links throughout the document
- Improved the additional resources section including links to related IPTC standards and added links to the SportsML-G2 Guidelines
- See the What’s New in NewsML-G2 2.32 and 2.33 section for full details.
We always welcome feedback on our specification and guideline documents: please use the Contact Us form to ask for clarifications or suggest changes.
The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.
Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already used in IPTC’s sister standards: the IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.
Examples of possible values for the Digital Source Type property, using the recommended IPTC Digital Source Type NewsCodes vocabulary, are:
ID (in QCode format) | Name and description | Example |
digsrctype:digitalCapture | Original digital capture sampled from real life: The digital media is captured from a real-life source using a digital camera or digital recording device | Digital video taken using a digital film, video or smartphone camera |
digsrctype:negativeFilm | Digitised from a negative on film: The digital image was digitised from a negative on film or any other transparent medium | Digital photo scanned from a photographic negative |
digsrctype:minorHumanEdits | Original media with minor human edits: Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine | Original audio with minor edits (e.g. to eliminate breaks) |
digsrctype:algorithmicallyEnhanced | Algorithmic enhancement: Minor augmentation or correction by algorithm | A photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature |
digsrctype:dataDrivenMedia | Data-driven media: Digital media representation of data via human programming or creativity | Textual weather report generated by code using readings from weather detection instruments |
digsrctype:trainedAlgorithmicMedia | Trained algorithmic media: Digital media created algorithmically using a model derived from sampled content | A “deepfake” video using a combination of a real actor and a trained model |
The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.
Guidance on using Digital Source Type
The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.
Where Digital Source Type can be used in NewsML-G2 documents
The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.
It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
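As a rough illustration of the two placements, the sketch below reads the digital source type of the whole item from contentMeta and of each part from partMeta. It assumes a namespace-free fragment, hypothetical partid values, and that the value is carried in a qcode attribute; a real document could equally use a uri attribute. The function name `source_types` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a video whose shots have different digital source types.
SAMPLE = """
<newsItem>
  <partMeta partid="shot1">
    <digitalSourceType qcode="digsrctype:digitalCapture"/>
  </partMeta>
  <partMeta partid="shot2">
    <digitalSourceType qcode="digsrctype:trainedAlgorithmicMedia"/>
  </partMeta>
</newsItem>
"""

def source_types(item):
    """Map "item" (from contentMeta) and each part id (from partMeta) to a QCode."""
    result = {}
    whole = item.find("contentMeta/digitalSourceType")
    if whole is not None:
        result["item"] = whole.get("qcode")
    for part in item.findall("partMeta"):
        declared = part.find("digitalSourceType")
        if declared is not None:
            result[part.get("partid")] = declared.get("qcode")
    return result

item = ET.fromstring(SAMPLE)
print(source_types(item))
```

A receiving system could use such a mapping to label only the AI-generated segments of a video rather than the whole item.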
Find out more about NewsML-G2 v2.33
All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.
The NewsML-G2 Specification document has been updated to cover the new version 2.33.
Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/.
Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/
XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.
The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.
For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.
This weekend at the IBC broadcast industry event in Amsterdam, the Arab States Broadcasting Union (ASBU) will launch its news exchange network, ASBU Cloud, with IPTC’s NewsML-G2 standard at its core.
Developed for ASBU by IPTC member Broadcast Solutions, a systems integrator in the broadcast industry, ASBU Cloud uses NewsML-G2 to distribute content to partners.
After evaluating several metadata formats, ASBU chose to implement NewsML-G2 as a metadata schema and worked with IPTC to implement the standard. This ensures that ASBU content can be used easily by other international organisations like the European Broadcasting Union (EBU) and Asia-Pacific Broadcasting Union (ABU).
Broadcast Solutions System Architect Jean-Christophe Liechti explained the use of NewsML-G2 in an interview with Broadcast Pro magazine: “This XML-based standard for news exchange was developed and is maintained by the International Press Telecommunications Council (IPTC). It’s a successor to the original NewsML format and it provides a comprehensive and flexible framework for distributing any type of media, including text, images, audio and video. This metadata standard is language-agnostic. You can use standard dictionaries or manage your own to structure your data. We reached out to IPTC to ensure that our implementation closely met the standard. ASBU exchanges are now available as a NewsML-G2 feed like partner organisations like the EBU or major news organisations like Reuters, AP or AFP.”
The project is also based on Amazon Web Services, the Dalet Flex media asset management system, and uses innovative systems like the AI video metadata extraction engine Newsbridge (IPTC’s newest member).
The project will be launched at the IBC event in Amsterdam this Sunday, 17 September at 12.00. The launch will take place at the Broadcast Solutions outdoor booth 0.A01 located across from Hall 13 of the RAI exhibition centre.
Read more about ASBU Cloud at Broadcast Pro Middle East or contact IPTC if you’re interested in using NewsML-G2 in your own projects.
The IPTC is proud to announce that after intense work by most of its Working Groups, we have published version 1.0 of our guidelines document: Expressing Trust and Credibility Information in IPTC Standards.
The culmination of a large amount of work over the past several years across many of IPTC’s Working Groups, the document is a guide for news providers on how to embed signals of trust, known as “Trust Indicators”, into their content.
Trust Indicators are ways that news organisations can signal to their readers and viewers that they should be considered as trustworthy publishers of news content. For example, one Trust Indicator is a news outlet’s corrections policy: whether the news outlet provides (and follows) a clear guideline regarding when and how it updates its news content.
The IPTC guideline does not define these trust indicators: they were taken from existing work by other groups, mainly the Journalism Trust Initiative (an initiative from Reporters Sans Frontières / Reporters Without Borders) and The Trust Project (a non-profit founded by Sally Lehrman of UC Santa Cruz).
The first part of the guideline document shows how trust indicators created by these standards can be embedded into IPTC-formatted news content, using IPTC’s NewsML-G2 and ninjs standards which are both widely used for storing and distributing news content.
The second part of the IPTC guidelines document describes how cryptographically verifiable metadata can be added to media content. This metadata may express trust indicators but also more traditional metadata such as copyright, licensing, description and accessibility information. This can be achieved using the C2PA specification, which implements the requirements of the news industry via Project Origin and of the wider creative industry via the Content Authenticity Initiative. The IPTC guidelines show how both IPTC Photo Metadata and IPTC Video Metadata Hub metadata can be included in a cryptographically signed “assertion”.
We expect these guidelines to evolve as trust and credibility standards and specifications change, particularly in light of recent developments in signalling content created by generative AI engines. We welcome feedback and will be happy to make changes and clarifications based on recommendations.
The IPTC sends its thanks to all IPTC Working Groups that were involved in creating the guidelines, and to all organisations who created the trust indicators and the frameworks upon which this work is based.
Feedback can be shared using the IPTC Contact Us form.
The IPTC is happy to announce that NewsML-G2 version 2.32 has been released.
All documentation relating to version 2.32 can be found at the NewsML-G2 2.32 documentation page.
The changes in 2.32 are:
- Added new attributes authoritystatus and authoritystatusuri to the scheme, schemeMeta and catalog elements. These attributes describe the status of the authority managing a resource such as a scheme or a catalog.
- Added a new NewsCodes vocabulary https://cv.iptc.org/newscodes/authoritystatus with the values “No current authority”, “No single authority” and “Country-specific authority”.
- Updated the IPTC catalog to version 38, including the new authoritystatus vocabulary, and added a “cvx.iptc.org” vocabulary, ticker. Added an authoritystatus attribute to the following schemes: isin, a1312cat, a1312prio, a1312svc, a1312vers. Also updated the note on the frmt vocabulary, removing the part that says it is only applicable to NewsML 1.
- Updated schema documentation for qcode, uri and literal throughout to be more accurate.
- Removed https from CV references in schema documentation.
- Updated the dev schema to use 2.32.
- The schema documentation for “creator” and “creatoruri” attributes is now correct and consistent across all instances.
All information related to NewsML-G2 2.32 is at https://iptc.org/std/NewsML-G2/2.32/.
Example instance documents are at https://iptc.org/std/NewsML-G2/2.32/examples/.
Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.32/specification/XML-Schema-Doc-Power/
The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.32 files using the version 38 catalog.
For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.
The IPTC’s flagship news exchange standard, NewsML-G2, is now updated to version 2.31. The change was approved at the IPTC Standards Committee Meeting at the IPTC Autumn Meeting 2022.
The full NewsML-G2 XML Schema, NewsML-G2 Guidelines document and NewsML-G2 specification document have all now been updated.
The only change (Change Request CR00215) is that we now allow the hasInstrument element on any concept or assert. Previously we required hasInstrument to be declared on organisations only, but we realised that not every financial instrument relates to an organisation: for example, an exchange-traded fund or the instrument for a commodity does not directly relate to a specific company.
Interestingly, hasInstrument elements in <assert>s did appear to work in previous versions, but that is because of NewsML-G2’s use of the xs:any construct, which allows asserts to be augmented with arbitrary elements. No validation took place on elements which were added in this way.
Examples
Example 1: hasInstrument as a child of concept
<concept>
  <conceptId qcode="P:18040196349" />
  <type qcode="cptType:97"/>
  <name>Invesco Capital Appreciation Fund;R6</name>
  <hasInstrument symbol="OPTFX.O" type="symType:RIC" symbolsrc="symSrc:RFT"/>
  <hasInstrument symbol="US00141G7328" symbolsrc="symSrc:ISO" type="symType:ISIN"/>
</concept>
Example 2: hasInstrument as a child of assert
<assert qcode="P:18040196349">
  <name>Invesco Capital Appreciation Fund;R6</name>
  <type qcode="cptType:97"/>
  <hasInstrument symbol="OPTFX.O" type="symType:RIC" symbolsrc="symSrc:RFT"/>
  <hasInstrument symbol="US00141G7328" symbolsrc="symSrc:ISO" type="symType:ISIN"/>
</assert>
Example 3: hasInstrument within assert/organisationDetails
This usage still works, but is now deprecated.
<assert qcode="P:18040196349">
  <name>Invesco Capital Appreciation Fund;R6</name>
  <type qcode="cptType:97"/>
  <organisationDetails>
    <hasInstrument symbol="OPTFX.O" type="symType:RIC" symbolsrc="symSrc:RFT"/>
    <hasInstrument symbol="US00141G7328" symbolsrc="symSrc:ISO" type="symType:ISIN"/>
    <rtr:anyOtherElement>
      Other elements in other namespaces allowed here due to xs:any other
    </rtr:anyOtherElement>
  </organisationDetails>
</assert>
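Because both placements can occur in the wild, a consumer may want to read instruments from either location. The following is a minimal sketch under stated assumptions: the fragment is namespace-free and has not been validated against the schema, it mixes both placements purely for illustration, and the function name `instruments` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment combining the 2.31 placement and the deprecated one.
SAMPLE = """
<assert qcode="P:18040196349">
  <name>Invesco Capital Appreciation Fund;R6</name>
  <hasInstrument symbol="OPTFX.O" type="symType:RIC" symbolsrc="symSrc:RFT"/>
  <organisationDetails>
    <hasInstrument symbol="US00141G7328" symbolsrc="symSrc:ISO" type="symType:ISIN"/>
  </organisationDetails>
</assert>
"""

def instruments(node):
    """Collect (type, symbol, deprecated_placement) from a concept or assert element."""
    found = []
    # Direct children: the placement allowed from NewsML-G2 2.31 onwards.
    for inst in node.findall("hasInstrument"):
        found.append((inst.get("type"), inst.get("symbol"), False))
    # Older placement inside organisationDetails, flagged as deprecated.
    for inst in node.findall("organisationDetails/hasInstrument"):
        found.append((inst.get("type"), inst.get("symbol"), True))
    return found

node = ET.fromstring(SAMPLE)
for symbol_type, symbol, deprecated in instruments(node):
    print(symbol_type, symbol, "(deprecated placement)" if deprecated else "")
```

The deprecated flag lets a pipeline log or migrate older documents without rejecting them.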
- The top-level folder of the NewsML-G2 v2.31 release is http://iptc.org/std/NewsML-G2/2.31/.
- The NewsML-G2 Implementation Guidelines document, updated to cover version 2.31 is available at https://www.iptc.org/std/NewsML-G2/guidelines
- The latest NewsML-G2 Specification document is available at https://www.iptc.org/std/NewsML-G2/specification/
- The XML Schema for NewsML-G2 v2.31 is at http://iptc.org/std/NewsML-G2/2.31/specification/NewsML-G2_2.31-spec-All-Power.xsd
XML Schema documentation of version 2.31 is available on GitHub and at http://iptc.org/std/NewsML-G2/2.31/specification/XML-Schema-Doc-Power/.
NewsML-G2 Generator updated
The NewsML-G2 Generator has been updated to use version 2.31. There are no substantive changes but the version number of generated files has been updated to 2.31.
Thanks to Dave Compton of Refinitiv (an LSE Group Company) and the NewsML-G2 Working Group for their work on the update, and to Kelvin Holland for his work on the documentation.
To follow our work on GitHub, please see the IPTC NewsML-G2 GitHub repository.
The full NewsML-G2 change log showing the Change Requests included in each new version is available at the dev.iptc.org site.
At the recent IPTC Standards Committee Meeting, NewsML-G2 version 2.30 was approved.
The full NewsML-G2 XML Schema, NewsML-G2 Guidelines document and NewsML-G2 specification document have all now been updated.
The biggest change (Change Request CR00211) is that <catalogRef/> and <catalog/> elements are now optional. This is so that users who choose to use full URIs instead of QCodes do not need to include an unnecessary element.
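To see why a catalog is only needed for QCodes, consider how a consumer turns a controlled value into a concept URI. This sketch reduces a catalog to a plain alias-to-URI dictionary, which is a simplification; the `medtop` alias and base URI follow the IPTC Media Topics scheme, and the function name `resolve` is hypothetical.

```python
# Assumption: a catalog reduced to a simple mapping of scheme alias to base URI.
CATALOG = {"medtop": "http://cv.iptc.org/newscodes/mediatopic/"}

def resolve(value, catalog=None):
    """Return the concept URI for a QCode, or the value itself if it is a full URI."""
    if value.startswith(("http://", "https://")):
        return value  # full URI: no catalog lookup is needed
    alias, separator, code = value.partition(":")
    if not separator or catalog is None or alias not in catalog:
        raise ValueError(f"cannot resolve QCode {value!r} without a catalog entry")
    return catalog[alias] + code

print(resolve("medtop:04000000", CATALOG))
print(resolve("http://cv.iptc.org/newscodes/mediatopic/04000000"))
```

Both calls print the same URI. The second needs no catalog at all, which is the case that optional catalogRef and catalog elements are meant to support.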
The other user-facing change is CR00212, which adds residrefformat and residrefformaturi attributes to the targetResourceAttributes attribute group, used in <link>, <icon> and <remoteContent>.
Other changes CR00213 and CR00214 aren’t visible to end users and don’t change any functionality, but make the XML Schema easier to read and maintain.
- The top-level folder of the NewsML-G2 v2.30 release is http://iptc.org/std/NewsML-G2/2.30/.
- The NewsML-G2 Implementation Guidelines document, updated to cover version 2.30 is available at https://www.iptc.org/std/NewsML-G2/guidelines
- The latest NewsML-G2 Specification document is available at https://www.iptc.org/std/NewsML-G2/specification/
- The XML Schema for NewsML-G2 v2.30 is at http://iptc.org/std/NewsML-G2/2.30/specification/NewsML-G2_2.30-spec-All-Power.xsd
XML Schema documentation of version 2.30 is available on GitHub and at http://iptc.org/std/NewsML-G2/2.30/specification/XML-Schema-Doc-Power/.
NewsML-G2 Generator updated
The NewsML-G2 Generator has been updated to use version 2.30. This means that catalogRef is only included if QCode mode is chosen. The Generator also uses the new layout which means that the target document is updated in real time as the form is completed.
To follow our work on GitHub, please see the IPTC NewsML-G2 GitHub repository.
The full NewsML-G2 change log showing the Change Requests included in each new version is available at the dev.iptc.org site.
IPTC members and our guests have just finished a very busy 2021 edition of our IPTC Autumn Meeting. Held online over three days, the meeting was a mix of IPTC Working Group presentations, members presenting recent projects, and invited guest speakers on important topics in the news and media world.
This year we heard member presentations from:
- Honor Craig-Bennett of the BBC reporting on the Images Digital Asset Management system, based on the Guardian’s open-source GRID system. We heard from Andy Read about this system
- Heather Edwards from Associated Press spoke about their project to replace their existing rules-based classification system
- Mark Milstein from Microstocksolutions spoke about a new project he is working on to create “synthetic media”: AI-generated images and videos based on textual descriptions and metadata
- DATAGROUP Consulting Group’s Robert Schmidt-Nia spoke about a project using AWS’s Comprehend text classification service to power a serverless news classification system using IPTC’s Media Topics vocabulary
- Frameright’s Marina Ekroos spoke about an EU stars4media project they are working on called “Artificial Intelligence in photojournalism: can it work?”
- Scott Yates from new Startup Member JournalList spoke about the trust.txt project, letting news providers state their affiliates and official social media channels in a simple way
- Bruce MacCormack from CBC / Radio Canada spoke about Project Origin, looking at authenticity for video and news media, passing requirements to the C2PA work
- The BBC’s Charlie Halford spoke about C2PA, updating members with a deep technical view on how the system is planned to work, as detailed in the recently-released draft specification.
In addition, we heard from guest speakers:
- Keesiu Wong of Design AI spoke about the Videre AI project, looking at “next-generation video understanding”. He was joined by project partner Javier Picazo from Associate Member Agencia EFE, Spain’s national news agency.
- Alex Lakatos of Interledger spoke about the distributed payments technology which is used by…
- Uchi Uchibeke of Coil who use Interledger to implement micropayments which can be implemented on publisher websites by adding one line of HTML.
New standard versions
The Working Group presentations were also packed with content, in particular three new standard versions that were proposed to the Standards Committee:
- NewsML-G2 v2.30 adds fields for “residrefformat” and “residrefformaturi” to enable publishers to describe the format of a resource ID reference, and makes catalog and catalogRef optional to support publishers who only use URIs for controlled values and therefore have no need for catalogs
- The News in JSON Working Group’s ninjs v2.0 is a non-backwards-compatible new release which changes the way repeating values are handled, moving from patternProperties fields with arbitrary names such as “body_text” and “body_html” to arrays with fixed names such as “bodies”. The objects within the array elements include properties “role” and “contenttype” which take the place of the arbitrary extension to the “body_” tag.
- The IPTC Photo Metadata Standard v2021.1 adds new properties to IPTC Core which are intended to be used for accessibility purposes: “Alt Text (Accessibility)” and “Extended Description (Accessibility)”. We have also added an Event Identifier property to align with other metadata ID properties, and modified the Description Writer field to include the writer of the accessibility fields.
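The ninjs 2.0 change from arbitrarily named “body_” properties to a fixed “bodies” array can be sketched as a small conversion. This is an illustration under several assumptions, not the Working Group’s migration procedure: the “value” property name, the mapping of name suffixes to content types, and the treatment of any remaining suffix as a role are all assumed here, and the function name `bodies_from_v1` is hypothetical.

```python
# Assumed mapping from the first part of a 1.x "body_" suffix to a content type.
CONTENT_TYPES = {
    "text": "text/plain",
    "html": "text/html",
    "xhtml": "application/xhtml+xml",
}

def bodies_from_v1(item):
    """Convert ninjs 1.x "body_*" properties into a ninjs 2.0-style "bodies" array."""
    bodies = []
    for key, value in item.items():
        if not key.startswith("body_"):
            continue
        suffix = key[len("body_"):]
        fmt, _, role = suffix.partition("_")
        entry = {"value": value}  # "value" as the property name is an assumption
        if fmt in CONTENT_TYPES:
            entry["contenttype"] = CONTENT_TYPES[fmt]
        else:
            role = suffix  # unknown format: keep the whole suffix as the role
        if role:
            entry["role"] = role
        bodies.append(entry)
    return bodies

old_item = {"headline": "Example", "body_text": "Hello", "body_html": "<p>Hello</p>"}
print(bodies_from_v1(old_item))
```

Fixed property names like these are easier to describe in a JSON Schema and to map to typed data structures than pattern-matched property names.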
New faces
We were very happy to welcome new members Frameright, JournalList, Spotlight Sports Group and Glide Publishing Platform to the meeting.
The Standards Committee was chaired for the first time by new Chair Paul Harman of Bloomberg.
The AGM was the first for new Treasurer, Gerald Innerwinkler of Austria Presse Agentur (APA).
And we congratulate Philippe Mougin of Agence France-Presse AFP for being voted on to the IPTC Board of Directors, along with the existing Board members who were all re-elected.
It was another great meeting with over 70 representatives from 42 organisations in 17 different countries! We’re hoping that the next IPTC member meeting will be back to face-to-face, and we have provisionally booked Tallinn, Estonia for 16 – 18 May, 2022. We will confirm this in January 2022.
We are pleased to announce the release of the NewsML-G2 Generator, a simple tool to help understand the structure and layout of NewsML-G2 files.
To see how easy it can be to create a valid NewsML-G2 file, simply visit https://iptc.org/std/NewsML-G2/generator/, fill in the form and press the button labelled “Show content as NewsML-G2 2.29”.
Then the box below the form will be filled in with a valid NewsML-G2 document.
The tool demonstrates several key features of NewsML-G2:
- Adding copyright and rights information through the <copyrightHolder/>, <copyrightNotice/> and <usageTerms/> elements
- Adding news-item metadata via the <itemMeta> container, such as <firstCreated/>, <versionCreated/>, item type (text, audio, video, graphic or composite, selected via a drop-down) and publication status (usable, cancelled or withheld, selected via a drop-down)
- Adding subject metadata using IPTC Media Topics, via a selection with all of the top-level categories enabled. Subjects are added using the <subject/> construct within the <contentMeta> container.
- Referring to the IPTC catalog that declares standard metadata vocabularies, using the <catalogRef/> tag
- Adding the body content using embedded NITF. In the future, we will add a radio button so users can select whether to embed the news content using NITF or XHTML, which is the other common format used by IPTC members to mark up news content.
Your test content is never saved and only exists within your browser.
The source code of the generator is available in the NewsML-G2 GitHub repository.
This is a simple 1.0 version, and only scratches the surface of the capabilities of NewsML-G2. It is based on the successful ninjs generator used to demonstrate our ninjs standard, which was launched along with ninjs 1.3 earlier this year.
In the future, we are thinking of adding features such as:
- Switch between NITF and XHTML for the content body
- Demonstrate referring to images and video files using <remoteContent/>
- Switch between using qcodes and URIs for metadata
- Demonstrate multiple language support in NewsML-G2
- Demonstrate usage of partMeta to show adding metadata to segments in files, such as audio and video
- Integrate the tool with the ninjs generator so users can switch between ninjs and NewsML-G2 with one click!
If you have any more ideas, please raise an issue on the GitHub repository, or contact us via the IPTC Contact Us form.
To learn more about NewsML-G2, the global standard used for distributing news content, see our introduction to NewsML-G2, or the NewsML-G2 Guidelines.