(Most of) the IPTC Board of Directors gathering outside The New York Times offices for the IPTC Spring Meeting 2024.

Last week, the IPTC Spring Meeting 2024 brought media industry experts together for three days in New York City to discuss many topics including AI, archives and authenticity.

The meeting was hosted by both The New York Times and Associated Press. Over 50 attendees from 14 countries participated in person, with another 30+ delegates attending online.

As usual, the IPTC Working Group leads presented a summary of their most recent work, including a new release of NewsML-G2 (version 2.34, which will be released very soon); forthcoming work on ninjs to support events, planned news coverage and live streamed video; updates to NewsCodes vocabularies; more evangelism of IPTC Sport Schema; and further work on Video Metadata Hub, the IPTC Photo Metadata Standard and our emerging framework for a simple way to express common rights statements using RightsML.

We were very happy to hear many IPTC member organisations presenting at the Spring Meeting:

  • Anna Dickson of recently-joined member Google talked about their work with IPTC in the past and discussed areas where we could collaborate in the future
  • Aimee Rinehart of Associated Press presented AP’s recent report on the use of generative AI in local news
  • Scott Yates of JournalList gave an update on the trust.txt protocol
  • Andreas Mauczka, Chief Digital Officer at Austria Press Agency APA, presented on APA’s framework for use of generative AI in their newsroom
  • Drew Wanczowski of Progress Software gave a demonstration of how IPTC standards can be implemented in Progress’s tools such as Semaphore and MarkLogic
  • Vincent Nibart and Geert Meulenbelt of new IPTC Startup Member Kairntech presented on their recent work with AFP on news categorisation using IPTC Media Topics and other vocabularies
  • Mathieu Desoubeaux of IPTC Startup Member IMATAG presented their work, also with AFP, on watermarking images for tracking and metadata retrieval purposes

In addition we heard from guest speakers:

  • Jim Duran of the Vanderbilt TV News Archive spoke about how they are using AI to catalog and tag their extensive archive of decades of broadcast news content
  • John Levitt of Elvex spoke about their system which allows media organisations to present a common interface (web interface and developer API) to multiple generative AI models, including tracking, logging, cost monitoring, permissions and other governance features which are important to large organisations using AI models.
  • Toshit Panigrahi, co-founder of TollBit, spoke about their platform for “AI content licensing at scale”, allowing content owners to establish rules and monitoring around how their content should be licensed for both the training of AI models and for retrieval-augmented generation (RAG)-style on-demand content access by AI agents.
  • We also heard an update about the TEMS – Trusted European Media Data Space project.

We were also lucky enough to take tours of the Associated Press Corporate Archive on Tuesday and the New York Times archive on Wednesday. Valerie Komor of AP Corporate Archives and Jeff Roth of The New York Times Archival Library (known to staffers as “the morgue”) shared fascinating insights and stories about how the two archives preserve the legacy of these historically important news organisations.

Brendan Quinn, speaking for Judy Parnall of the BBC, also presented an update on the recent work of C2PA and Project Origin and introduced the new IPTC Media Provenance Committee, dedicated to bringing C2PA technology to the news and media industry.

On behalf of all attendees, we would like to thank The New York Times and Associated Press for hosting us, and especially Jennifer Parrucci of The New York Times and Heather Edwards of The Associated Press for their hard work in coordinating use of their venues for our meeting.

The next IPTC Member Meeting will be the 2024 Autumn Meeting, which will be held online from Monday September 30th to Wednesday October 2nd, and will include the 2024 IPTC Annual General Meeting. The Spring Meeting 2025 will be held in Western Europe at a location still to be determined.

Paul Kelly speaking at the Sports Video Group’s Content Management Forum in New York, July 2023

As we wrap up 2023, we thought it would be useful to give you an update on the IPTC’s work in 2023, including updates to most of our standards.

Two successful member meetings, one in person!

This year we finally held our first IPTC Member Meeting in person since 2019, in Tallinn, Estonia. Around 30 people attended in person and 50 attended online, from over 40 organisations. Presentations and discussions ranged from the e-Estonia digital citizen experience to building re-usable news content widgets with Web Components, and of course included generative AI, credibility and fact checking, and more. Here’s our report on the IPTC 2023 Spring Meeting.

For our Autumn Meeting we went back to an online format, with over 50 attendees, and more watching the recordings afterwards (which are available to all members). Along with discussions of generative AI and content licensing at this year’s meetings, it was great to hear the real-world implementation experience of the ASBU Cloud project from the Arab States Broadcasting Union. The system was created by IPTC members Broadcast Solutions, based on NewsML-G2. The DPP Live Production Exchange, led by new members Arqiva, will be another real-world implementation coming soon. We heard about the project’s first steps at the Autumn Meeting.

At this year’s Autumn Meeting we also heard from Will Kreth of the HAND Identity platform and saw a demo of IPTC Sport Schema from IPTC member Progress Software (previously MarkLogic). More on IPTC Sport Schema below! All news from the Autumn Meeting is summed up in our post AI, Video in the cloud, new standards and more: IPTC Autumn Meeting 2023.

We’re very happy to say that the IPTC Spring Meeting 2024 will be held in New York from April 15 – 17. All IPTC member delegates are welcome to attend the meeting at no cost. If you are not a member but would like to present your work at the meeting, please get in touch using our Contact Us form.

IPTC Photo Metadata Conference, 7 May 2024: save the date!

Due to several issues, we were not able to run a Photo Metadata Conference in 2023, but we will be back with an online Photo Metadata Conference on 7th May 2024. Please mark the date in your calendar!

As usual, the event will be free and open for anyone to attend.

If you would like to present to the people most interested in photo metadata from around the world, please let us know!

Presentations at other conferences and work with other organisations

IPTC was represented at the CEPIC Congress in France, the EBU DataTech Seminar in Geneva, the Sports Video Group Content Management Forum in New York and the DMLA’s International Digital Media Licensing Conference in San Francisco.

We also worked with CIPA, the organisation behind the Exif photo metadata standard, on aligning Exif with IPTC Photo Metadata, and supported them in their work towards Exif 3.0, which was announced in June.

The IPTC will be advising the TEMS project, an EU-funded initiative to build a “media data space” for Europe, and possibly beyond. See our post IPTC working with alliance to build a European Media Data Space.

IPTC’s work on Generative AI and media

Of course the big topic for media in 2023 has been Generative AI. We have been looking at this topic for several years, since it was known as “synthetic media” and back in 2022 we created a taxonomy of “digital source types” that can be used to describe various forms of machine-generated and machine-assisted content creation. This was a joint effort across our NewsCodes, Video Metadata and Photo Metadata Working Groups.

AI-generated image of a cute robot sitting at a garden table sketching on a notepad.
Image created by Brendan Quinn using Bing Image Creator. This image file contains digitalsourcetype metadata which was added manually using exiftool.

It turns out that this was very useful, and the IPTC Digital Source Type taxonomy has been adopted by Google, Midjourney, C2PA and others as a way to describe content. Here are some of our news posts from 2023 on this topic:

IPTC’s work on Trust and Credibility

IPTC’s guidance on implementing trust and credibility indicators across IPTC standards such as NewsML-G2, ninjs, the IPTC Photo Metadata Standard and IPTC Video Metadata Hub.

After a lot of drafting work over several years, we released the Guidelines for Expressing Trust and Credibility signals in IPTC standards, which show how to embed trust information in the form of “trust indicators”, such as those from The Trust Project, into content marked up using IPTC standards such as NewsML-G2 and ninjs. The guidelines also discuss how media can be signed using the C2PA specification.

We continue to work with C2PA on the underlying specification allowing signed metadata to be added to media content so that it becomes “tamper-evident”. However, the C2PA specification in its current form does not prescribe where the certificates used for signing should come from. To that end, we have been working with Microsoft, BBC, CBC / Radio Canada and The New York Times on the Steering Committee of Project Origin to create a trust ecosystem for the media industry. Stay tuned for more developments from Project Origin during 2024.

IPTC’s newest standard: IPTC Sport Schema

The Sport Schema website includes examples showing how typical sports results such as football/soccer, golf and Olympic events can be represented in the IPTC Sport Schema model.

After years of work, the IPTC Sports Content Working Group released version 1.0 of IPTC Sport Schema. IPTC Sport Schema takes the experience of IPTC’s 10+ years of maintaining the XML-based SportsML standard and applies it to the world of the semantic web, knowledge graphs and linked data.

Paul Kelly, Lead of the IPTC Sports Content Working Group, presented IPTC Sport Schema to the world’s top sports media technologists: IPTC Sport Schema launched at Sports Video Group Content Management Forum.

Take a look at our dedicated site https://sportschema.org/ to see how it works, look at some demonstration data and try out a query engine to explore the data.

If you’re interested in using IPTC Sport Schema as the basis for sports data at your organisation, please let us know. We would be very happy to help you to get started.

Standard and Working Group updates

  • Our IPTC NewsCodes vocabularies had two big updates, the NewsCodes 2023-Q1 update and the NewsCodes Q3 2023 update. For our main subject taxonomy Media Topics, over the year we added 12 new concepts, retired 73 under-used terms, and modified 158 terms to make their labels and/or descriptions easier to understand. We also added or updated vocabularies such as Digital Source Type and Authority Status.
  • The News in JSON Working Group released ninjs 2.1 and ninjs 1.5 in parallel, so that people who cannot move from the 1.x schema can still get the benefits of new additions. The group is currently working on adding events and planning items to ninjs based on requirements from the DPP Live Production Exchange project: expect to see something released in 2024.
  • NewsML-G2 2.32 and NewsML-G2 v2.33 were released this year, including support for Generative AI via the Digital Source Type vocabulary.
  • The IPTC Photo Metadata Standard 2023.1 allows rightsholders to express whether or not they are willing to allow their content to be indexed by search engines and data mining crawlers, and whether the content can be used as training data for Generative AI. This work was done in partnership with the PLUS Coalition. We also updated the IPTC Photo Metadata Mapping Guidelines to accommodate Exif 3.0.
  • Through discussions and workshops at our Member Meetings in 2022 and 2023, we have been working on making RightsML easier to use and easier to understand. Stay tuned for more news on RightsML in 2024.
  • Video Metadata Hub 1.5 adds the same properties to allow content to be excluded from generative AI training data sets. We have also updated the Video Metadata Hub Generator tool to generate C2PA-compliant metadata “assertions”.

New faces at IPTC

Ian Young of Alamy / PA Media Group stepped up to become the lead of the News in JSON Working Group, taking over from Johan Lindgren of TT who is winding down his duties but still contributes to the group.

We welcomed Bonnier News, Newsbridge, Arqiva, the Australian Broadcasting Corporation and Neuwo.ai as new IPTC members, plus a very well known name who will be joining at the start of 2024. We’re very happy to have you all as members!

We are always happy to work with more organisations in the media and related industries. If you would like to talk to us about joining IPTC, please complete our membership enquiry form.

Here’s to a great 2024!

Thanks to everyone who gave IPTC your support, and we look forward to working with you in the coming year.

If you have any questions or comments (and especially if you would like to speak at one of our events in 2024!), you can contact us via our contact form.

Best wishes,

Brendan Quinn
Managing Director, IPTC
and the IPTC Board of Directors: Dave Compton (LSE Group), Heather Edwards (The Associated Press), Paul Harman (Bloomberg LP), Gerald Innerwinkler (APA), Philippe Mougin (Agence France-Presse), Jennifer Parrucci (The New York Times), Robert Schmidt-Nia (DATAGROUP, Chair of the Board), Guowei Wu (Xinhua)

Paul Kelly speaking about IPTC Sport Schema at the Sports Video Group’s Content Management Forum in New York, July 2023.

The IPTC Sports Content Working Group is happy to announce the release of IPTC Sport Schema version 1.0.

The first new IPTC standard to be released in more than 10 years, IPTC Sport Schema is a comprehensive model for the storage, transmission and querying of sports data. It has been tested on real-world use cases that are common in any newsroom or sports organisation.

IPTC Sport Schema has evolved from its predecessor SportsML. In contrast to the document-oriented nature of SportsML, IPTC Sport Schema takes a data-centric approach which is better suited to systems dealing with large volumes of data and also helps with integration across data sets.

“We reached out to many companies dealing with sports content and built up a clear picture of their needs,” says IPTC Sports Content Working Group lead Paul Kelly. “They wanted up-to-date formats, easy querying, the ability to handle e-sports and the ability to cross-reference between different media and data silos. IPTC Sport Schema addresses those requirements with a new basic model at the abstract end, and adhering to common use cases to keep things grounded.”

Content in IPTC Sport Schema is represented in the W3C’s universal Resource Description Framework (RDF), which expresses any kind of data as triples in the form subject->predicate->object. Each component of a Sport Schema triple has a reference to an ontology, which defines the model at the heart of the standard. Querying is done using the W3C’s SPARQL standard, a kind of SQL for RDF.
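
To make the triple model concrete, here is a minimal sketch in plain Python (deliberately not using an RDF library) of data held as subject, predicate, object triples and queried by pattern matching, which is the basic operation behind a SPARQL query. All of the `ex:` identifiers and the sample names are hypothetical placeholders for illustration; they are not terms from the IPTC Sport Schema ontology.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical sample data: the "ex:" names are placeholders,
# NOT real IPTC Sport Schema ontology terms.
TRIPLES: List[Triple] = [
    ("ex:team1", "ex:name", "Example United"),
    ("ex:player1", "ex:name", "Alex Sample"),
    ("ex:player1", "ex:memberOf", "ex:team1"),
    ("ex:player2", "ex:name", "Sam Placeholder"),
    ("ex:player2", "ex:memberOf", "ex:team1"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def roster(triples: List[Triple], team: str) -> List[str]:
    """Join two patterns: who is a member of the team, and what is their name?"""
    names = []
    for member, _, _ in match(triples, p="ex:memberOf", o=team):
        for _, _, name in match(triples, s=member, p="ex:name"):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    print(roster(TRIPLES, "ex:team1"))
```

The `roster` function chains two triple patterns that share a variable (the member), which is roughly what a SPARQL query with two patterns in its `WHERE` clause expresses. A real deployment would run SPARQL against an RDF store rather than looping over a Python list.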

The schema diagram for IPTC Sport Schema, showing the entities and the relationships between them. For more information see www.sportschema.org.

“The IPTC has been working on RDF and semantic web standards for more than 10 years, going back to rNews and RightsML,” said IPTC Managing Director Brendan Quinn. “So we are very happy to release another semantic standard that can help organisations to publish and share sports data in a vendor-neutral, interoperable way.”

Being RDF-based, IPTC Sport Schema can be rendered in XML, JSON and the simple Turtle format, and can be converted easily between all three formats using free tools such as Apache Jena.
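
As a rough illustration of why one RDF graph can have several renderings, the sketch below writes the same two triples both as N-Triples lines (a simple line-based format that is also valid Turtle) and as a JSON-LD document in expanded form, using only the Python standard library. The `https://example.org/` IRIs and the sample values are hypothetical and not part of the IPTC Sport Schema ontology; in practice such conversions would normally be done with an RDF toolkit such as Apache Jena rather than hand-written code.

```python
import json
from typing import Dict, List, Tuple

EX = "https://example.org/"  # hypothetical namespace, for illustration only

# (subject IRI, predicate IRI, object, object_is_iri)
Statement = Tuple[str, str, str, bool]

TRIPLES: List[Statement] = [
    (EX + "player1", EX + "memberOf", EX + "team1", True),
    (EX + "player1", EX + "name", "Alex Sample", False),
]


def to_ntriples(triples: List[Statement]) -> str:
    """Render one '<s> <p> o .' line per triple."""
    lines = []
    for s, p, o, is_iri in triples:
        if is_iri:
            obj = f"<{o}>"
        else:
            # Minimal escaping for plain string literals: backslash,
            # double quote and newline only.
            escaped = (
                o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def to_jsonld(triples: List[Statement]) -> str:
    """Render the same triples as expanded JSON-LD: a list of node objects."""
    nodes: Dict[str, dict] = {}
    for s, p, o, is_iri in triples:
        node = nodes.setdefault(s, {"@id": s})
        value = {"@id": o} if is_iri else {"@value": o}
        node.setdefault(p, []).append(value)
    return json.dumps(list(nodes.values()), indent=2)


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    print(to_jsonld(TRIPLES))
```

Both outputs carry exactly the same statements; only the syntax differs, which is why tools can convert between the formats without losing information.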

“Those familiar with SportsML or SportsJS should recognise the basic components of Sport Schema,” says Kelly, “both in the ontology and in the sports vocabularies introduced with SportsML 3.0, which were designed specifically with semantic technologies in mind.”

To support take-up and share information about the new standard, the IPTC has created a dedicated website, sportschema.org. The site contains:

Those wishing to try out some SPARQL queries against some sports data should visit Sport Schema’s query endpoint. It includes example queries showing how to build a team roster, league standings and more from our sample data sets.

For more information on IPTC Sport Schema, see the IPTC’s landing pages on the IPTC Sport Schema standard, the standalone site sportschema.org, or the project’s GitHub repository.

If you are interested in joining those who are working on implementing IPTC Sport Schema in your project or your organisation, we would love to hear from you. Please contact us via IPTC’s contact form.

Paul Kelly speaking at the Sports Video Group’s Content Management Forum in New York, July 2023

The video recording of IPTC’s presentation at the Sports Video Group Content Management Forum has now been released.

Paul Kelly, Lead of the IPTC Sports Content Working Group, gave a live presentation introducing the IPTC Sport Schema to participants at the event, held in New York City in July 2023.

Many of those participants have helped with the development and testing of the new model for sports data, including PGA Tour, NBA, NFL, NHL and more.

The full video is now available on SVG’s on-demand video platform, SVG Play.

In the presentation, Paul describes among other things:

  • the motivation for the new model
  • how it is different from IPTC’s existing sports standard SportsML
  • how it can handle sports from tennis to athletics to football to golf and more
  • how it might be used by broadcasters and sports data providers to attach sports data to video and other forms of media content

The Sports Content Working Group is now putting the final touches to the schema and its supporting documentation before it is put to the IPTC Standards Committee to be turned into an official IPTC standard.

Watch the full video here.

A screenshot of an example page from sportschema.org showing how IPTC Sport Schema can be used to represent an Olympic Athletics event.

NEW YORK, NY, 26 JULY 2023: The IPTC today announced the beginning of a public feedback and review period of IPTC Sport Schema, which aims to be “the standard for the next generation of sports data.”

The announcement was made by Paul Kelly, Lead of the IPTC Sports Content Working Group, at the Sports Video Group’s Content Management Forum held at 230 Fifth Penthouse, New York.

“The SVG Content Management Forum is attended by senior tech experts from sports broadcasters and sports leagues from the US and around the world, so it is the perfect place to launch the IPTC Sport Schema,” said Kelly. “Many members of SVG have advised us on our work so far, including organisations such as Warner Bros Discovery, NBC Universal, PGA TOUR, Major League Baseball and Riot Games. Presenting our work at their event is a great way to say thanks for their help.”

While IPTC Sport Schema is not yet an official IPTC standard, the IPTC Sports Content Working Group feels that the schema is solid enough to be published for public feedback.

Sports data for the era of linked data and knowledge graphs

The purpose of the IPTC Sport Schema project is to create a new RDF-based sports data standard, while making the most of the experience the IPTC has gained from the last 20 years of maintaining SportsML, the open XML-based sports data standard used by news and sports organisations around the world.

Another screenshot from sportschema.org showing the full ontology diagram, a generic model that can be used to represent athletes and teams, various competition structures, results and statistics across many sports.

While XML served the industry well for many years, more recently developers and IPTC members have asked the Sports Content Working Group whether a standard would become available in a more modern serialisation format such as JSON, and whether knowledge graph protocols would be supported.

Because it is based on the W3C-standard RDF and OWL specifications, IPTC Sport Schema leverages the wide range of tools and expertise in the world of knowledge graphs, semantic web and linked open data, including the SPARQL query language, the JSON-LD serialisation into JSON format, inference using RDF Schema and OWL, and more.

“Using IPTC Sport Schema, sports leagues can choose to own their data,” said IPTC Managing Director Brendan Quinn. “Content publishers or sports leagues can publish open data on their website if they choose, in a way that can be re-mixed and re-used by others around the world.” IPTC Sport Schema can also be used for a more traditional model of aggregation and syndication by sports statistics providers who add value to the raw data being collected by sports leagues.

Like its ancestor SportsML, IPTC Sport Schema is created as a generic sports data model that can represent results, statistics, schedules and rosters across many sports. “Plugins” for specific sports extend the generic schema with specific statistics elements for 10 sports, including soccer, motor racing, tennis, rugby and esports. But the generic model can be used to handle any sports competition, whether team-based, head-to-head or individual.

As well as IPTC’s SportsML standard, the project is based on previous work by the BBC on its BBC Sport Ontology (some of its creators worked on this project). We have also consulted with and analysed related projects and formats such as OpenTrack and the IOC’s Olympics Data Feed format.

For more information on IPTC Sport Schema, please see the dedicated site sportschema.org or the project’s GitHub repository.

Those who are interested in the details can see an introduction to the IPTC Sport Schema ontology design, the full ontology diagram or the full RDF/OWL ontology documentation.

There may be significant changes to the schema between now and when it is released as a fully endorsed IPTC Standard, so we don’t recommend that it is implemented in production systems yet. But we welcome analysis and experimentation with the model, and look forward to seeing feedback from those who would like to implement it in the real world.

People and organisations who are not IPTC members can give feedback by posting to the IPTC SportsML public discussion group or by using the IPTC Contact Us form.

Brendan Quinn at the EBU DataTech Seminar 2023.

IPTC Managing Director Brendan Quinn presented at the European Broadcasting Union’s Data Technology Seminar last week.

The DataTech Seminar, known in previous years as the Metadata Developers Network, brought over 100 technologists together in person in Geneva to discuss topics related to managing data at broadcasters in Europe and around the world.

Brendan spoke on Tuesday 21st March on a panel discussing Artificial Intelligence and the Media. Brendan used the opportunity to discuss IPTC’s current work on “do not train” signals in metadata, and on establishing best practices for how AI tools can embed metadata indicating the origin of their media.

The work of C2PA, Project Origin and Content Authenticity Initiative on addressing content provenance and tamper-evident media was also highlighted by Brendan during the panel discussion, as this relates to the prevalence of “deepfake” content that can be created by generative AI engines.

On Wednesday 22nd March, Brendan spoke in lieu of Paul Kelly, lead of the IPTC Sports Content Working Group, about the IPTC Sport Schema project. The session was called “IPTC Sport Schema – the next generation of sports data.” An evolution of IPTC’s SportsML standard, IPTC Sport Schema brings our 20 years of experience in sports data markup to the world of Knowledge Graphs and the Semantic Web. The specification is coming close to a version 1, so we were very proud to present it to some of the world’s top broadcasters and industry players.

The IPTC Sport Schema site sportschema.org now includes comprehensive documentation of the ontology behind the sports data model, examples of how it can be queried using SPARQL, example data files and instance diagrams showing how it can be used to represent common sports such as athletics, soccer, golf and hockey.

We look forward to discussing IPTC Sport Schema much more over the coming months, as we draw close to its general release.

EBU members can watch the full presentation at the EBU.ch site.