Paul Kelly speaking about IPTC Sport Schema at the Sports Video Group’s Content Management Forum in New York, July 2023.

The IPTC Sports Content Working Group is happy to announce the release of IPTC Sport Schema version 1.0.

The first new IPTC standard to be released in more than 10 years, IPTC Sport Schema is a comprehensive model for the storage, transmission and querying of sports data. It has been tested on real-world use cases that are common in any newsroom or sports organisation.

IPTC Sport Schema has evolved from its predecessor SportsML. In contrast to the document-oriented nature of SportsML, IPTC Sport Schema takes a data-centric approach which is better suited to systems dealing with large volumes of data and also helps with integration across data sets.

“We reached out to many companies dealing with sports content and built up a clear picture of their needs,” says IPTC Sports Content Working Group lead Paul Kelly. “They wanted up-to-date formats, easy querying, the ability to handle e-sports and the ability to cross-reference between different media and data silos. IPTC Sport Schema addresses those requirements with a new basic model at the abstract end, and adhering to common use cases to keep things grounded.”

Content in Sport Schema is represented in the W3C’s universal Resource Description Framework (RDF), which renders any kind of data as a triple in the form of subject->predicate->object. Each component of a Sport Schema triple has a reference to an ontology, which defines the model at the heart of the standard. Querying is done using the W3C’s SPARQL standard, a kind of SQL for RDF.
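To make the triple idea concrete, here is a minimal sketch in plain Python, with no RDF library involved. The `ex:` identifiers and predicate names are invented for illustration and are not terms from the Sport Schema ontology. The `match` helper treats `None` as a wildcard, roughly the role a variable plays in a SPARQL pattern.

```python
from typing import Iterable, Iterator, Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical sample data: identifiers and predicates are made up for this
# sketch and do not come from the Sport Schema ontology.
TRIPLES: list[Triple] = [
    ("ex:team/lions", "ex:name", "Lions"),
    ("ex:player/ana", "ex:memberOf", "ex:team/lions"),
    ("ex:player/ana", "ex:name", "Ana"),
    ("ex:player/ben", "ex:memberOf", "ex:team/lions"),
    ("ex:player/ben", "ex:name", "Ben"),
    ("ex:player/cy", "ex:memberOf", "ex:team/tigers"),
    ("ex:player/cy", "ex:name", "Cy"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None matches anything."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def roster(triples: list[Triple], team: str) -> list[str]:
    """Join two patterns: find members of the team, then look up their names."""
    names = []
    for member, _, _ in match(triples, p="ex:memberOf", o=team):
        for _, _, name in match(triples, s=member, p="ex:name"):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    print(roster(TRIPLES, "ex:team/lions"))
```

A real deployment would load the data into an RDF store and express the same join as a SPARQL query. The sketch only shows why a roster falls out naturally from two triple patterns that share a variable.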

The schema diagram for IPTC Sport Schema, showing the entities and the relationships between them. For more information see www.sportschema.org.

“The IPTC has been working on RDF and semantic web standards for more than 10 years, going back to rNews and RightsML,” said IPTC Managing Director Brendan Quinn. “So we are very happy to release another semantic standard that can help organisations to publish and share sports data in a vendor-neutral, interoperable way.”

Being RDF-based, IPTC Sport Schema can be rendered in XML, JSON and the simple Turtle format, and can be converted easily between all three formats using free tools such as Apache Jena.

“Those familiar with SportsML or SportsJS should recognise the basic components of Sport Schema,” says Kelly, “both in the ontology and in the sports vocabularies introduced with SportsML 3.0, which were designed specifically with semantic technologies in mind.”

To support take-up and share information about the new standard, the IPTC has created a dedicated website, sportschema.org. The site contains:

Those wishing to try out some SPARQL queries against some sports data should visit Sport Schema’s query endpoint. It includes example queries showing how to build a team roster, league standings and more from our sample data sets.

For more information on IPTC Sport Schema, see the IPTC’s landing pages on the IPTC Sport Schema standard, the standalone site sportschema.org, or the project’s GitHub repository.

If you are interested in joining those who are working on implementing IPTC Sport Schema in your project or your organisation, we would love to hear from you. Please contact us via IPTC’s contact form.

"A reel of film unspooling and transforming into a stream of binary digits"
Made with Bing Image Creator. Powered by DALL-E.

Following the IPTC’s recent announcement that rights holders can exclude images from generative AI with IPTC Photo Metadata Standard 2023.1, the IPTC Video Metadata Working Group is very happy to announce that the same capability now exists for video, through IPTC Video Metadata Hub version 1.5.

The “Data Mining” property has been added to this new version of IPTC Video Metadata Hub, which was approved by the IPTC Standards Committee on October 4th, 2023. Because it uses the same XMP identifier as the Photo Metadata Standard property, the existing support in the latest versions of ExifTool will also work for video files.

Therefore, adding metadata to a video file that says it should be excluded from Generative AI indexing is as simple as running this command in a terminal window:

exiftool -XMP-plus:DataMining="Prohibited for Generative AI/ML training" example-video.mp4

(Please note that this will only work in ExifTool version 12.67 and above, i.e. any version of ExifTool released after September 19, 2023)

The possible values of the Data Mining property are listed below:

PLUS URI | Description (use exactly this text with ExifTool)
http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED | Unspecified – no prohibition defined
http://ns.useplus.org/ldf/vocab/DMI-ALLOWED | Allowed
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING | Prohibited for AI/ML training
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING | Prohibited for Generative AI/ML training
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING | Prohibited except for search engine indexing
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED | Prohibited
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT | Prohibited, see plus:OtherConstraints
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR | Prohibited, see iptcExt:EmbdEncRightsExpr
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEELINKEDRIGHTSEXPR | Prohibited, see iptcExt:LinkedEncRightsExpr
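For batch workflows, the ExifTool command shown earlier can be assembled from a script. The following is a minimal Python sketch, not an official IPTC tool. The function names are hypothetical, the description strings are copied from the list of values above, and the version check assumes ExifTool reports versions in the usual `12.67` style. If ExifTool rejects a value, check its own tag documentation for the exact spelling.

```python
# Description strings copied from the list of Data Mining values above.
DATA_MINING_DESCRIPTIONS = {
    "Unspecified – no prohibition defined",
    "Allowed",
    "Prohibited for AI/ML training",
    "Prohibited for Generative AI/ML training",
    "Prohibited except for search engine indexing",
    "Prohibited",
    "Prohibited, see plus:OtherConstraints",
    "Prohibited, see iptcExt:EmbdEncRightsExpr",
    "Prohibited, see iptcExt:LinkedEncRightsExpr",
}


def build_command(value: str, path: str, exiftool: str = "exiftool") -> list[str]:
    """Return the argument list that sets the Data Mining property on one file."""
    if value not in DATA_MINING_DESCRIPTIONS:
        raise ValueError(f"not a known Data Mining value: {value!r}")
    return [exiftool, f"-XMP-plus:DataMining={value}", path]


def meets_minimum_version(version_text: str, minimum: tuple[int, int] = (12, 67)) -> bool:
    """Compare a version string such as '12.67' against the required minimum."""
    major, _, minor = version_text.strip().partition(".")
    return (int(major), int(minor or 0)) >= minimum


if __name__ == "__main__":
    # Printing rather than executing keeps the sketch safe to run anywhere.
    print(build_command("Prohibited for Generative AI/ML training", "example-video.mp4"))
```

To actually write the metadata, pass the returned list to `subprocess.run(..., check=True)` on a machine where ExifTool is installed. Passing a list rather than a shell string avoids quoting problems with values that contain spaces or slashes.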

A corresponding new property “Other Constraints” has also been added to Video Metadata Hub v1.5. This property allows plain-text human-readable constraints to be placed on the video when using the “Prohibited, see plus:OtherConstraints” value of the Data Mining property.

The Video Metadata Hub User Guide and Video Metadata Hub Generator have also been updated to include the new Data Mining property added in version 1.5.

We look forward to seeing video tools (and particularly crawling engines for generative AI training systems) implement the new properties.

Please feel free to discuss the new version of Video Metadata Hub on the public iptc-videometadata discussion group, or contact IPTC via the Contact us form.

The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.

Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already used in IPTC’s sister standards: IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.

Examples of other possible values for the digital source type property using the recommended IPTC Digital Source Type NewsCodes vocabulary are:

ID (in QCode format) | Name | Example
digsrctype:digitalCapture | Original digital capture sampled from real life: The digital media is captured from a real-life source using a digital camera or digital recording device | Digital video taken using a digital film, video or smartphone camera
digsrctype:negativeFilm | Digitised from a negative on film: The digital image was digitised from a negative on film or any other transparent medium | Digital photo scanned from a photographic negative
digsrctype:minorHumanEdits | Original media with minor human edits: Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine | Original audio with minor edits (e.g. to eliminate breaks)
digsrctype:algorithmicallyEnhanced | Algorithmic enhancement: Minor augmentation or correction by algorithm | A photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature
digsrctype:dataDrivenMedia | Data-driven media: Digital media representation of data via human programming or creativity | Textual weather report generated by code using readings from weather detection instruments
digsrctype:trainedAlgorithmicMedia | Trained algorithmic media: Digital media created algorithmically using a model derived from sampled content | A “deepfake” video using a combination of a real actor and a trained model

The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.

Guidance on using Digital Source Type

The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.

Where Digital Source Type can be used in NewsML-G2 documents

The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.

It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
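As a rough illustration of the two placements described above, the following Python sketch builds `contentMeta` and `partMeta` fragments with the standard library. It makes two assumptions that should be checked against the XML Schema documentation: that `<digitalSourceType>` carries its value in a `qcode` attribute like other NewsML-G2 qualifying properties, and that `partMeta` is identified by a `partid` attribute. The helper names are hypothetical, and the fragments are not complete, valid NewsItems.

```python
import xml.etree.ElementTree as ET

# NewsML-G2 (News Architecture) XML namespace.
NAR_NS = "http://iptc.org/std/nar/2006-10-01/"
ET.register_namespace("", NAR_NS)


def _tag(name: str) -> str:
    """Qualify an element name with the NewsML-G2 namespace."""
    return f"{{{NAR_NS}}}{name}"


def content_meta_fragment(qcode: str) -> ET.Element:
    """Describe the digital source type of the item as a whole."""
    content_meta = ET.Element(_tag("contentMeta"))
    # Assumption: the value is expressed through a qcode attribute.
    ET.SubElement(content_meta, _tag("digitalSourceType"), {"qcode": qcode})
    return content_meta


def part_meta_fragment(part_id: str, qcode: str) -> ET.Element:
    """Describe the digital source type of one part, e.g. a single shot."""
    # Assumption: parts are identified with a partid attribute.
    part_meta = ET.Element(_tag("partMeta"), {"partid": part_id})
    ET.SubElement(part_meta, _tag("digitalSourceType"), {"qcode": qcode})
    return part_meta


if __name__ == "__main__":
    # A video mixing captured and AI-generated shots, one partMeta per shot.
    shots = [
        part_meta_fragment("shot1", "digsrctype:digitalCapture"),
        part_meta_fragment("shot2", "digsrctype:trainedAlgorithmicMedia"),
    ]
    for shot in shots:
        print(ET.tostring(shot, encoding="unicode"))
```

The `digsrctype:` prefix only resolves if the document references a catalog that defines it, so a real NewsItem would also need the appropriate catalog reference.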

Find out more about NewsML-G2 v2.33

All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.

The NewsML-G2 Specification document has been updated to cover the new version 2.33.

Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/

Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/

XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.

The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.

For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.

Dynamic fountains out of the Drau river in Villach, Carinthia, Austria (Europe). This image contains the new Data Mining property. Clicking on the image will show the metadata as extracted by IPTC’s online Get Photo Metadata tool.

Updated in June 2024 to include an image containing the new metadata property

Many image rights owners noticed that their assets were being used as training data for generative AI image creators, and asked the IPTC for a way to express that such use is prohibited. The new version 2023.1 of the IPTC Photo Metadata Standard now provides means to do this: a field named “Data Mining” and a standardised list of values, adopted from the PLUS Coalition. These values can show that data mining is prohibited or allowed either in general, for AI or Machine Learning purposes or for generative AI/ML purposes. The standard was approved by IPTC members on 4th October 2023 and the specifications are now publicly available.

Because these data fields, like all IPTC Photo Metadata, are embedded in the file itself, the information will be retained even after an image is moved from one place to another, for example by syndicating an image or moving an image through a Digital Asset Management system or Content Management System used to publish a website. (Of course, this requires that the embedded metadata is not stripped out by such tools.)

Created in a close collaboration with PLUS Coalition, the publication of the new properties comes after the conclusion of a public draft review period earlier this year. The properties are defined as part of the PLUS schema and incorporated into the IPTC Photo Metadata Standard in the same way that other properties such as Copyright Owner have been specified.

The new properties are now finalised and published. Specifically, the new properties are as follows:

The IPTC and PLUS Coalition wish to draw users’ attention to the following notice included in the specification:

Regional laws applying to an asset may prohibit, constrain, or allow data mining for certain purposes (such as search indexing or research), and may overrule the value selected for this property. Similarly, the absence of a prohibition does not indicate that the asset owner grants permission for data mining or any other use of an asset.

The prohibition “Prohibited except for search engine indexing” only permits data mining by search engines available to the public to identify the URL for an asset and its associated data (for the purpose of assisting the public in navigating to the URL for the asset), and prohibits all other uses, such as AI/ML training.
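A crawler that collects training data could consult the property along the following lines. This Python sketch is one conservative reading of the values, not guidance from the specification: it treats every `DMI-PROHIBITED…` value as ruling out generative AI training, and it reports a missing or unspecified value as "no statement" rather than as permission, in line with the notice above. The enum and function names are hypothetical, and regional law may still overrule whatever the property says.

```python
from enum import Enum
from typing import Optional

PLUS_VOCAB = "http://ns.useplus.org/ldf/vocab/"


class Signal(Enum):
    PROHIBITED = "prohibited"
    ALLOWED = "allowed"
    NO_STATEMENT = "no statement"  # absence of a prohibition is not permission


def genai_training_signal(data_mining_uri: Optional[str]) -> Signal:
    """Classify a Data Mining value from the point of view of generative AI training."""
    if not data_mining_uri or not data_mining_uri.startswith(PLUS_VOCAB):
        return Signal.NO_STATEMENT
    code = data_mining_uri[len(PLUS_VOCAB):]
    if code == "DMI-ALLOWED":
        return Signal.ALLOWED
    # Assumption: all prohibition values, including the "see ..." variants and
    # the search-engine-indexing exception, exclude generative AI training.
    if code.startswith("DMI-PROHIBITED"):
        return Signal.PROHIBITED
    return Signal.NO_STATEMENT


if __name__ == "__main__":
    for uri in (
        PLUS_VOCAB + "DMI-PROHIBITED-GENAIMLTRAINING",
        PLUS_VOCAB + "DMI-UNSPECIFIED",
        None,
    ):
        print(uri, "->", genai_training_signal(uri).value)
```

Keeping "no statement" separate from "allowed" lets the calling code decide how to handle assets that carry no signal, instead of silently treating them as cleared for use.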

The IPTC encourages all photo metadata software vendors to incorporate the new properties into their tools as soon as possible, to support the needs of the photo industry.

ExifTool, the command-line tool for accessing and manipulating metadata in image files, already supports the new properties. Support was added in the ExifTool version 12.67 release, which is available for download on exiftool.org.

The new version of the specification can be accessed at https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata or from the navigation menu on iptc.org. The IPTC Get Photo Metadata tool and IPTC Photo Metadata Reference images have been updated to use the new properties.

The IPTC and PLUS Coalition wish to thank the many IPTC and PLUS member organisations and others who took part in the consultation process around these changes. For further information, please contact IPTC using the Contact Us form.

Daniel Lynch of Arqiva talks about the DPP Live Production Exchange project which will be based on the EventsML schema, now part of NewsML-G2, and the ninjs standard.

Last week the IPTC held another very successful member meeting. The 2023 IPTC Autumn Meeting, held virtually this time, had well over 50 attendees from over 30 organisations in at least 15 different countries.

Highlights included Standards Committee approval of a brand new standard, IPTC Sport Schema v1.0, plus the approval of three new versions of existing standards: NewsML-G2 v2.33, IPTC Photo Metadata Standard 2023.1, and Video Metadata Hub v1.5. We will be publishing more information about each of these updates over the coming weeks.

We also heard from two real-world projects in the broadcast industry. One has just been completed and the other is still in its planning stages, but both are based on IPTC standards. The ASBU Cloud system was presented by IPTC member Broadcast Solutions, and the DPP Live Production Exchange was presented by new IPTC member Arqiva.

We also heard a guest presentation from Will Kreth of the HAND talent identity platform and the latest on C2PA and Project Origin, and had a demo of IPTC Sport Schema using MarkLogic from Progress Software (also an IPTC member).

All Working Groups presented their recent work, which included interesting discussions about proposed new NewsCodes vocabularies and how to address the needs of Artificial Intelligence across all of our standards, work that has been going on for years but now acquires a new urgency.

At the IPTC Annual General Meeting 2023, we voted in the existing Board of Directors for another term and heard updates about all the great work that has taken place at IPTC this year.

It was a great event and we are already looking forward to the next member meeting, to be held in person in New York in April 2024!

 

Screenshot of the change to the Media Topic tree browser tool, showing information icons where terms have had notes added.

The IPTC NewsCodes Working Group has just released a new batch of changes to the IPTC NewsCodes family of controlled vocabularies.

Note that we skipped the Q2 update this year because there weren’t many changes, and also because there were already so many changes in Q1 of this year.

Media Topic changes

Here’s a summary of changes to the Media Topic vocabulary:

Change to Media Topic tree browser

We have made a small change to the Media Topic tree browser tool: we now display a small “i” icon next to the label name for terms that have notes defined.

The terms that have notes are usually retired terms, and the note gives the user information regarding which terms should be used instead of the retired term. But in other cases notes are used to help explain changes or clarify usage.

Changes to other vocabularies

Other vocabularies have also been updated:

  • Content Production Party Role sees two new terms, contentEditor and metadataEditor, that can be used to show changes made by humans or systems (such as AI engines)
  • Format had a small change to indicate that it is not just for NewsML 1 documents.
  • User Action Type had a small bug fix, changed references to Twitter / X and retired Google Plus as a term. More changes will be coming soon covering other social media platforms and ways to track user interactions with media content.
  • The rendition CV has been updated to make it more generic – renditions can apply to any type of media, not just images and video.
  • The digitalsourcetype CV had already been updated in July to handle inpainting and outpainting but we mention it again here as a reminder.

Thanks to the representatives from IPTC members AFP, NTB, Bonnier News, ABC Australia, Bloomberg, New York Times and Associated Press for their contributions to the changes this quarter via the NewsCodes Working Group.

We are still working on our regular review of Media Topics – currently we are in the middle of a review of the Economy branch. The review is not yet complete but we hope for it to be ready for the Q4 or Q1 update.

The IPTC is pleased to announce the final agenda for next week’s IPTC Autumn Meeting.

Held online via Zoom from 13.00 to 18.00 UTC from Monday to Wednesday, the meeting will include:

Monday 2 October

  • Working Group presentations:
    • Video Metadata Working Group who will be proposing a new version of Video Metadata Hub handling metadata specifying whether content can be included in AI training data and other forms of data mining
    • Sports Content Working Group who will be proposing the 1.0 version of the new IPTC Sport Schema standard
  • Member presentations:
    • Demo of how to implement IPTC Sport Schema in MarkLogic and Semaphore from Progress Software
  • Guest presentations:
    • Update from the HAND project: a new approach to identity in media and entertainment

Tuesday 3 October

  • Working Group presentations:
    • NewsML-G2 Working Group who will be proposing a new version of NewsML-G2 that includes Digital Source Type metadata to bring it in line with other IPTC standards
    • Photo Metadata Working Group who will be proposing a new version of the IPTC Photo Metadata Standard handling metadata specifying whether content can be included in AI training data and other forms of data mining
    • NewsCodes Working Group (including a discussion of a proposed new vocabulary on “editorial tone”)
  • Member presentations:
    • Broadcast Solutions presenting the ASBU Cloud project, based on NewsML-G2
    • New member Newsbridge presenting how their systems implement IPTC standards
  • Guest presentations:
    • Update on Project Origin and C2PA, featuring the CEO of Media City Bergen, Helge Svela

Wednesday 4 October

  • IPTC Annual General Meeting 2023
    • Election of IPTC Board of Directors and Chair
    • Approval of budget for 2023
    • Updates from the Managing Director and Chair of the Board of Directors
  • IPTC Autumn 2023 Standards Committee meeting
    • Votes on new versions of NewsML-G2, Photo Metadata Standard and Video Metadata Hub, and presentation of the proposed new standard IPTC Sport Schema
  • Working Group presentations:
    • News in JSON Working Group
  • Working Group and member presentations:
    • Discussion on a simple rights format
    • DPP Live Production Exchange (LPX) project from CNN and new member Arqiva

IPTC member representatives can view the full schedule on the IPTC Autumn Meeting 2023 event page in the Members-Only Zone.

Users working on the ASBU Cloud system in the office. Image credit: ASBU

This weekend at the IBC broadcast industry event in Amsterdam, the Arab States Broadcasting Union (ASBU) will launch its news exchange network, ASBU Cloud, with IPTC’s NewsML-G2 standard at its core.

Developed for ASBU by IPTC member Broadcast Solutions, a systems integrator in the broadcast industry, ASBU Cloud uses NewsML-G2 to distribute content to partners.

After evaluating several metadata formats, ASBU chose to implement NewsML-G2 as a metadata schema and worked with IPTC to implement the standard. This ensures that ASBU content can be used easily by other international organisations like the European Broadcasting Union (EBU) and Asia-Pacific Broadcasting Union (ABU).

Broadcast Solutions System Architect Jean-Christophe Liechti explained the use of NewsML-G2 in an interview with Broadcast Pro magazine: “This XML-based standard for news exchange was developed and is maintained by the International Press Telecommunications Council (IPTC). It’s a successor to the original NewsML format and it provides a comprehensive and flexible framework for distributing any type of media, including text, images, audio and video. This metadata standard is language-agnostic. You can use standard dictionaries or manage your own to structure your data. We reached out to IPTC to ensure that our implementation closely met the standard. ASBU exchanges are now available as a NewsML-G2 feed like partner organisations like the EBU or major news organisations like Reuters, AP or AFP.”

The project is also based on Amazon Web Services, the Dalet Flex media asset management system, and uses innovative systems like the AI video metadata extraction engine Newsbridge (IPTC’s newest member).

The project will be launched at the IBC event in Amsterdam this Sunday, 17 September at 12.00. The launch will take place at the Broadcast Solutions outdoor booth 0.A01 located across from Hall 13 of the RAI exhibition centre.

Read more about ASBU Cloud at Broadcast Pro Middle East or contact IPTC if you’re interested in using NewsML-G2 in your own projects.

Paul Kelly speaking at the Sports Video Group’s Content Management Forum in New York, July 2023

The video recording of IPTC’s presentation at the Sports Video Group Content Management Forum has now been released.

Paul Kelly, Lead of the IPTC Sports Content Working Group, gave a live presentation introducing the IPTC Sport Schema to participants at the event, held in New York City in July 2023.

Many of those participants have helped with the development and testing of the new model for sports data, including PGA Tour, NBA, NFL, NHL and more.

The full video is now available on SVG’s on-demand video platform, SVG Play.

In the presentation, Paul describes among other things:

  • the motivation for the new model
  • how it is different from IPTC’s existing sports standard SportsML
  • how it can handle sports from tennis to athletics to football to golf and more
  • how it might be used by broadcasters and sports data providers to attach sports data to video and other forms of media content

The Sports Content Working Group is now putting the final touches to the schema and its supporting documentation before it is put to the IPTC Standards Committee to be turned into an official IPTC standard.

Watch the full video here.

Screenshot of the IPTC wiki page showing how to read and write IPTC Photo Metadata in JavaScript.

We at IPTC receive many requests for help and advice regarding editing embedded photo and video metadata, and this has only increased with the recent news about the IPTC Digital Source Type property being used to identify content created by a generative AI engine.

In response, we have created some guidance: Developers’ and power users’ guide to reading and writing IPTC Photo Metadata 

This takes the form of a wiki, so that it can be easily maintained and extended with more information and examples.

In its initial form, the documentation focuses on:

In each guide, we advise on how to read and create DigitalSourceType metadata for generative AI images, and also how to read and write the Creator, Credit Line, Web Statement of Rights and Licensor information that is currently used by Google image search to expose copyright information alongside search results.
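As a flavour of what such a guide covers, the Python sketch below assembles ExifTool argument lists for writing and reading these properties. It is not taken from the wiki. The function names are hypothetical, the ExifTool tag names (`XMP-iptcExt:DigitalSourceType`, `XMP-dc:Creator`, `XMP-photoshop:Credit`, `XMP-xmpRights:WebStatement`) reflect the usual XMP mapping and should be verified against ExifTool's tag documentation, and Licensor is left out because it is a structured property.

```python
from typing import Optional

# Base URI of the IPTC Digital Source Type NewsCodes vocabulary.
DIGITAL_SOURCE_TYPE_BASE = "http://cv.iptc.org/newscodes/digitalsourcetype/"

READ_TAGS = [
    "-XMP-iptcExt:DigitalSourceType",
    "-XMP-dc:Creator",
    "-XMP-photoshop:Credit",
    "-XMP-xmpRights:WebStatement",
]


def write_args(
    path: str,
    *,
    source_type: Optional[str] = None,    # e.g. "trainedAlgorithmicMedia"
    creator: Optional[str] = None,
    credit_line: Optional[str] = None,
    web_statement: Optional[str] = None,  # URL of the rights statement page
) -> list[str]:
    """Build an ExifTool argument list that writes the given properties."""
    args = ["exiftool"]
    if source_type:
        args.append(f"-XMP-iptcExt:DigitalSourceType={DIGITAL_SOURCE_TYPE_BASE}{source_type}")
    if creator:
        args.append(f"-XMP-dc:Creator={creator}")
    if credit_line:
        args.append(f"-XMP-photoshop:Credit={credit_line}")
    if web_statement:
        args.append(f"-XMP-xmpRights:WebStatement={web_statement}")
    args.append(path)
    return args


def read_args(path: str) -> list[str]:
    """Build an ExifTool argument list that reads the same properties as JSON."""
    return ["exiftool", "-json", *READ_TAGS, path]


if __name__ == "__main__":
    print(write_args("image.jpg", source_type="trainedAlgorithmicMedia", credit_line="Example Agency"))
    print(read_args("image.jpg"))
```

Either list can be passed to `subprocess.run` where ExifTool is installed. The JSON output of the read command can then be parsed with `json.loads`.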

Showing how IPTC metadata properties are used in Google Images search results.

We hope that these guides will help to demystify image metadata and encourage more developers to include more metadata in their image editing and publishing workflows.

We will add more guidance over the coming months in more programming languages, libraries and frameworks. Of particular interest are guides to reading and writing IPTC Photo Metadata in PHP, C and Rust.

Contributions and feedback are welcome. Please contact us if you are interested in contributing.