Close-up screenshot of Pinterest’s label for AI-generated content.
As reported in Social Media Today, Pinterest has started using IPTC embedded Photo Metadata to signal when content in “Image Pins” has been generated by AI.
Reports that Pinterest had begun labelling AI-generated images first appeared in February. This has now been confirmed by an official update to Pinterest’s user documentation.
In the Pinterest documentation, a new section has recently been added that describes how it works:
Screenshot of Pinterest’s help pages showing how IPTC metadata is used to signal AI-generated content.
“Pinterest may display a label in the foreground of an image Pin when we detect that it has been generated or modified with AI. This is in accordance with IPTC standard for photo metadata. We’re working on ways to expand our capabilities to better identify GenAI content in the future through additional technologies.”
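As a rough illustration of the kind of check a platform could perform (Pinterest has not published its implementation, so this is a sketch and not a description of their system), the code below tests whether an image's IPTC Digital Source Type value points at one of the AI-related terms in the IPTC NewsCodes vocabulary. The `metadata` dictionary, its `DigitalSourceType` key and the function name are hypothetical. We assume the embedded metadata has already been read from the file by some other tool.

```python
# Base URI of the IPTC Digital Source Type NewsCodes vocabulary.
DST_BASE = "http://cv.iptc.org/newscodes/digitalsourcetype/"

# Terms this sketch treats as "generated or modified with AI".
# Which terms a given platform honours is an assumption here.
AI_TERMS = {
    "trainedAlgorithmicMedia",               # fully generated by a trained model
    "compositeWithTrainedAlgorithmicMedia",  # composite including generated elements
}


def signals_ai_content(metadata: dict) -> bool:
    """Return True if the (hypothetical) metadata dict carries an AI-related
    Digital Source Type value."""
    value = metadata.get("DigitalSourceType") or ""
    if not value.startswith(DST_BASE):
        return False
    term = value[len(DST_BASE):]
    return term in AI_TERMS


if __name__ == "__main__":
    generated = {"DigitalSourceType": DST_BASE + "trainedAlgorithmicMedia"}
    photo = {"DigitalSourceType": DST_BASE + "digitalCapture"}
    print(signals_ai_content(generated), signals_ai_content(photo))
```

A file with no Digital Source Type value simply yields `False` here, which is one reason a platform might want additional detection technologies alongside metadata.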
On behalf of our memberships, IPTC and PLUS respectfully suggest that existing UK copyright law is sufficient to enable licensing of content to AI platforms. There is no “fair use” provision in UK copyright law, and “fair dealing” does not cover commercial AI training. Existing copyright law should be enforced.
IPTC and PLUS Photo Metadata provide a technical means for expressing the creator’s intent as to whether their creations may be used in generative AI training data sets. This takes the form of metadata embedded in image and video files. This solution, in combination with other solutions such as the Text and Data Mining Reservation Protocol, could take the place of a formal licence agreement between parties, making an opt-in approach technically feasible and scalable.
It is true that our technical solutions would also be relevant if the UK government chooses to implement an “opt-out” approach similar to that adopted in the EU. However, an opt-out-based approach does not currently protect owners’ rights well, due to the routine activity of “metadata stripping” – removing important rights and accessibility metadata that is embedded in media files, in the misguided belief that it will improve site performance. Metadata stripping is performed by many publishers and publishing systems – often inadvertently. (See our research on metadata stripping by social media platforms from 2019; very little has changed since then.)
As a result, we can only recommend that the UK adopts an opt-in approach. We request that the UK ensures metadata embedded in media files be declared as a core part of any technical mechanism to declare content owner’s desire for content to be included or excluded from training data sets.
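To make the difference between the two approaches concrete, here is a minimal sketch of how a crawler might decide whether a file may be used for training, given the value of an embedded data mining property. It is an illustration only: the function and the `regime` parameter are hypothetical, and the vocabulary URIs are written from our reading of the PLUS Data Mining vocabulary and should be checked against the published specification before use.

```python
from typing import Optional

# Assumed base URI and term of the PLUS Data Mining vocabulary;
# verify against the PLUS specification before relying on them.
PLUS_VOCAB = "http://ns.useplus.org/ldf/vocab/"
ALLOWED = PLUS_VOCAB + "DMI-ALLOWED"


def may_train(data_mining_value: Optional[str], regime: str) -> bool:
    """Decide whether a file may be used as AI training data.

    data_mining_value is the embedded value, or None when the metadata
    is absent (for example because it was stripped on upload).
    """
    if regime == "opt-in":
        # Only an explicit permission counts; silence means "no".
        return data_mining_value == ALLOWED
    if regime == "opt-out":
        # Silence means "yes": only an explicit prohibition blocks use.
        if data_mining_value is None:
            return True
        term = data_mining_value.rsplit("/", 1)[-1]
        return not term.startswith("DMI-PROHIBITED")
    raise ValueError("regime must be 'opt-in' or 'opt-out'")


if __name__ == "__main__":
    stripped = None  # the owner's reservation was removed by a publishing system
    print(may_train(stripped, "opt-out"))  # stripped file is treated as usable
    print(may_train(stripped, "opt-in"))   # stripped file is treated as reserved
```

The stripped-metadata case shows why the regime matters: under opt-out, a file whose reservation has been removed is indistinguishable from one whose owner never objected.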
During the course of this consultation, it has become clear that content creators are a core part of the UK economy and have a strong voice. We agree with their position, but we don’t simply come with another voice of complaint: we bring a viable, ready-made technical solution that can be used today to implement true opt-in data mining permissions and reservations.
The Sport Schema schema diagram showing changes from version 1.0 in yellow. For full detail, see https://sportschema.org/schema-diagram/
The IPTC Sports Content Working Group is proud to release version 1.1 of IPTC Sport Schema.
Documented at the dedicated site sportschema.org, Sport Schema is IPTC’s semantic web (RDF) based ontology for describing sports listings, results, statistics and even play-by-play actions during any kind of sports event.
Version 1.1 adds the following new features:
We add a Club type, which can handle the organisation that hosts one or more teams of varying types, possibly across various sports. (Did you know that Bayern Munich has not just the famous men’s, women’s and junior football/soccer teams, but also basketball, handball, table tennis and even chess teams!)
We also add a TeamMembership type, so a Team can be a member of a Club (and could theoretically move from one Club to another).
We add support for the concept of “sports facets” that originated in SportsML (based on SportsML / NewsCodes facets), so we can now say that an event is “women’s 400 metres relay swimming”, not just “swimming”. The boxing example shows that the weight class of the event is “welterweight”.
Added the ability to link from Athlete to Team via a new teamParticipation property.
Added an AssociateMembership type so an Associate (such as a coach) can have a tenure relating to any Agent, including an Athlete or a Team. Previously, Associates were linked to Teams via Participation objects, which wasn’t satisfactory.
Expanded and added to examples including a Boxing example showing how the new AssociateMembership type can be used to represent a coach or manager of an individual athlete (in this case a boxer)
Added new golf ontology taking some properties from SportsML and some from the Golf vocabularies in IPTC NewsCodes. We have merged them together in Sport Schema for ease of use.
Many cleanups to the SHACL Shapes used for validation of data.
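The relationship between the new Club and TeamMembership types can be pictured with a small sketch. Sport Schema itself is an RDF ontology, so the plain Python classes below are only an analogy: the field names (`name`, `sport`, `ended`) and the helper function are hypothetical and are not taken from the ontology.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Club:
    name: str


@dataclass
class Team:
    name: str
    sport: str


@dataclass
class TeamMembership:
    """Links a Team to a Club. Modelling the link as its own object is what
    lets a team, in principle, move from one club to another."""
    team: Team
    club: Club
    ended: bool = False  # hypothetical flag marking a past membership


def current_teams(club: Club, memberships: List[TeamMembership]) -> List[Team]:
    """Return the teams whose membership of the given club is still open."""
    return [m.team for m in memberships if m.club == club and not m.ended]


if __name__ == "__main__":
    bayern = Club("Bayern Munich")
    memberships = [
        TeamMembership(Team("Bayern Munich men's football", "football"), bayern),
        TeamMembership(Team("Bayern Munich chess", "chess"), bayern),
    ]
    for team in current_teams(bayern, memberships):
        print(team.sport)
```

One club can therefore host teams across several sports without any of the teams needing to know about each other.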
Please take a look at Sport Schema and let us know what you think! We would love to hear about Sport Schema being implemented in real-world projects. Please contact IPTC using the Contact Us form or via the public discussion list at groups.io/g/iptc-sportsml/
2024 has flown by, but it was a great year for IPTC.
The highlight was definitely the creation of the IPTC’s first new Committee in many years – the Media Provenance Committee, with its working groups for Advocacy and Education, Best Practices and Implementation, and Governance.
Thanks to our Committee Chair Bruce MacCormack (and previous Chair Judy Parnall), and Working Group leads Helge O. Svela, Laura Ellis and Charlie Halford for all their hard work.
The Committee’s latest milestone is the updating of its governance guidelines to make it easier for Verified News Publishers to obtain certificates. More information will be announced very soon.
Our other Working Groups under the Standards Committee have not been idle – this year has seen new versions of NewsML-G2, ninjs 3.0, Photo Metadata 2024.1, several updates to our NewsCodes vocabularies, including updates to the widely-used Digital Source Type vocabulary, and a forthcoming update to Sport Schema.
Our standards are being used by many of the biggest organisations in the world. Google uses IPTC metadata for AI transparency (among other things). Axel Springer spoke at our Autumn Meeting about how they use Video Metadata Hub to manage their Video On Demand system. AFP and Kairntech talked about how they auto-classify content using Media Topics. And many more of course!
We submitted position papers to the IETF workshop on AI Control and to the EU’s consultation on the AI Act, making the case for respecting embedded metadata in determining whether media files can be used as training data for AI engines.
We had many new members join: Google, Finnish Broadcasting Company Yle, HAND (Human & Digital), China Association of Press Technicians, Kairntech, DW (Deutsche Welle), Factiverse, Media Cluster Norway, RNZ, John Simmons, European Broadcasting Union (upgraded to Voting Membership), and Trufo. We’re very happy to have you all onboard!
We had two amazing member meetings: one physical meeting in New York (thanks to The New York Times and The Associated Press for hosting us) and one virtual meeting in the Autumn. Our attendees especially loved the guided tours of the NYT and AP archives. We’re already planning the next Spring Meeting, which will be held in Juan-les-Pins, France in May 2025.
The IPTC Photo Metadata Conference in May saw Adobe, Camera Bits, Foto Forensics, Numbers Protocol and others present many aspects of photo metadata to celebrate 20 years of the IPTC Photo Metadata Standard.
We gave presentations about IPTC standards in the Netherlands, UK, Brazil, USA and France – and probably other places too!
At the IPTC Autumn Meeting, the IPTC Standards Committee voted on a change proposed by the Photo Metadata Working Group, which created version 2024.1 of the IPTC Photo Metadata Standard.
The change is minor but important to some: the definition of the Keywords property now includes the following text:
Keywords to express the subject and other aspects of the content of the image. Keywords may be free text and don’t have to be taken from a controlled vocabulary. Codes from the controlled vocabulary IPTC Subject NewsCodes must go to the “Subject Code” field.
This aligns the property definition with the way in which many photo agencies and photographers were already using the field: to convey aspects such as the lighting or lens effects used, “mood” of the image, dominant colour and more.
The IPTC has worked together with the DPP and stakeholders from Reuters, Arqiva and Warner Brothers Discovery to develop a pioneering new initiative called DPP Live Production Exchange (LPX). The LPX protocol covers API and a data schema for information related to news coverage of live events, including the ability for B2B event subscribers to be informed about upcoming news events and their coverage.
IPTC’s contribution to the project was to enhance and evolve our ninjs (News in JSON) standard to support news coverage of events and live streamed content. The News in JSON Working Group dedicated a lot of its time to this work over the past two years, including participating in the DPP LPX Hackathon in Spring 2024.
Screenshot of the DPP LPX website JSON Schema page
The underlying data model for the events and planning work in ninjs comes from the IPTC News Architecture and is based on EventsML-G2, a part of the NewsML-G2 family of standards which was created over 10 years ago.
Ian Young of PA Media, Lead of the IPTC News in JSON Working Group, said “Basing the work of ninjs 3.0 on the stable foundation of the IPTC News Architecture made our work much simpler. IPTC members have been syndicating news events for years using this model so we know that it works. That meant that we could focus on making ninjs 3.0 handle live events and streaming video in a way that is practical and simple, both for developers and for users.”
IPTC Managing Director Brendan Quinn said “we owe our thanks to our teammates and partners on this project: David Thompson from IPTC liaison partners the DPP, JJ Eynon from CNN / Warner Brothers Discovery, Tania Vivero and Ian McLaren from Reuters (IPTC Voting Member), and Daniel Lynch from Arqiva (IPTC Associate Member). Through a very friendly and collegial but also productive and results-driven collaboration, we have arrived at a solution that should make syndicated news events much easier to handle in all newsroom workflows.”
ninjs 3.0 and the LPX API are or will soon be supported by tools from Arqiva, Reuters and Wolftech (who were recently acquired by Avid). We hope that many more implementations will emerge in the coming months.
For more on ninjs 3.0, see the following resources:
The IPTC is happy to announce that EIDR and IPTC have signed a liaison agreement, committing to work together on projects of mutual interest including media metadata, content distribution technologies and work on provenance and authenticity for media.
The Entertainment Identifier Registry Association (EIDR) was established to provide a universal identifier registry that supports the full range of asset types and relationships between assets. Members of EIDR include Apple, Amazon MGM Studios, Fox, the Library of Congress, Netflix, Paramount, Sony Pictures, Walt Disney Studios and many more.
EIDR’s primary focus is managing globally unique, curated, and resolvable content identification (which applies equally to news and entertainment media), via the Emmy Award-winning EIDR Content ID, and content delivery services, via the EIDR Video Service ID. In support of this, EIDR is built upon and helps promulgate the MovieLabs Digital Distribution Framework (MDDF), a suite of standards and specifications that address core aspects of digital distribution, including identification, metadata, avails, asset delivery, and reporting.
IPTC’s Video Metadata Hub standard already provides a mapping to EIDR’s Data Fields and the MDDF fields from related organisation MovieLabs. The organisations will work together to keep these mappings up-to-date and to work on future initiatives including making C2PA metadata work for both the news and the entertainment sides of the media industry. IPTC members have already started working in this area via IPTC’s Media Provenance Committee.
“In the Venn diagram of media, there is significant overlap between news and entertainment interests in descriptive metadata standards, globally-unique content identification, and media provenance and authenticity,” said Richard W. Kroon, EIDR’s director of technical operations. “By working together, we each benefit from the other’s efforts and can bring forth useful standards and practices that span the entire commercial media industry.
“Our hope here is to find common ground that can align our respective metadata standards to support seamless metadata management across the commercial media landscape.”
Helge, Judy, Bruce and Charlie speaking at the Origin Media Provenance Summit in October 2024.
In October 2024, 70 people representing 30 organisations from 15 countries across four continents gathered at the BBC building in Salford to join the Origin Media Provenance Seminar. The seminar was organised by BBC R&D with partners from Media Cluster Norway (MCN) in Bergen.
Media provenance is a way to record digitally signed information about the provenance of imagery, video and audio – information (or signals) that shows where a piece of media has come from and how it’s been edited. Like an audit trail or a history, these signals are called ‘content credentials’, and are developed as an open standard by the C2PA (Coalition for Content Provenance and Authenticity). Content credentials have just been selected by Time magazine as one of their ‘Best Inventions of 2024’.
Attendees came from all over the world, including the US, Japan, all over Europe, and also sub-Saharan Africa.
According to the BBC blog post:
In order for news organisations to show their consumers that they really are looking at some content from the real “BBC”, content credentials use the same technology as websites – digital certificates – to prove who signed it. The International Press Telecommunications Council (IPTC) has created a programme called “Origin Verified News Publishers”, which allows news organisations to register to get their identity checked. Once their ID has been verified, they can get a certificate, which gives consumers assurance that the content certifiably comes from the organisation they have chosen to trust.
An image created with Google’s Gemini model. The image contains values for the IPTC Photo Metadata properties Digital Source Type (trainedAlgorithmicMedia) and Credit (“Made with Google AI”).
On Thursday, Google announced that it will be extending its usage of AI content labelled using the IPTC Digital Source Type vocabulary.
In a blog post published on Friday, John Fisher, Engineering Director for Google Photos and Google One posted that “[n]ow we’re taking it a step further, making this information visible alongside information like the file name, location and backup status in the Photos app.”
Hannes Schulz from Axel Springer showing their implementation of IPTC Video Metadata Hub for an internal video management system.
Last week we held the latest IPTC member meeting, the IPTC Autumn Meeting 2024. With over 80 attendees, it was a great success. We heard from IPTC members and invited guests on new developments in the world of media technology and metadata.
Speakers from Axel Springer, Reuters, Global Media Registry and the EBU
We heard from Axel Springer, who have implemented an internal video management system based on IPTC Video Metadata Hub as a metadata model; from Reuters, who are basing their live events streaming API on the forthcoming DPP Live Production Exchange (DPP LPX) protocol, itself based on the recently approved ninjs 3.0; from Global Media Registry, who are developing a “Unique Media ID” standard to identify media publishers; and from the EBU on the Trusted European Media Data Space (TEMS) project.
Committees and Working Groups
Our new Media Provenance Committee and its Working Groups are working very hard on the details for implementing C2PA in the media industry, including launching the IPTC Origin Verified News Publisher List. We also heard from Tessa Sproule of CBC / Radio Canada on their implementation of C2PA.
Our Standards Committee Working Groups are also working hard, with new developments on all of our standards and three new standard versions, as described below.
New Versions of Sport Schema, ninjs and Photo Metadata
The Standards Committee approved three new standard versions at its meeting on Wednesday 2nd October.
IPTC Sport Schema version 1.1 was approved, adding a Club class so we can model Clubs that contain multiple Teams, even teams that play different sports (who knew that Bayern Munich had a chess team?!). The update also added Associate relationships for individual athletes (such as a coach for a tennis player or a boxer), and facets for sports so we can now declare that an event was a women’s 200 metres breaststroke swimming event using IPTC facets metadata taken from our NewsCodes sports facet vocabulary.
IPTC Photo Metadata Standard version 2024.1 modifies the text of the Keywords property to broaden its scope, matching current industry usage.
IPTC’s JSON standard for news ninjs version 3.0 was also approved, adding requirements for the DPP LPX project including events and planning information, plus renditions support for live event streams. The 3.0 version of the standard also moves property names to “camelCase”, which is the de facto standard for GraphQL and many other JSON-based technologies.
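For implementers moving existing data to the new naming style, a key-renaming pass is one possible approach. The sketch below is an illustration only: the entries in `RENAMES` are assumed examples of old and new property names and must be checked against the published ninjs 3.0 schema, and the function name is hypothetical. Renaming keys is also only part of a migration, since 3.0 adds new structures as well.

```python
# Assumed examples of pre-3.0 names and their camelCase equivalents.
# Check every entry against the ninjs 3.0 JSON Schema before use.
RENAMES = {
    "versioncreated": "versionCreated",
    "firstcreated": "firstCreated",
    "copyrightholder": "copyrightHolder",
    "usageterms": "usageTerms",
}


def rename_keys(value):
    """Recursively rename dictionary keys listed in RENAMES.

    Keys that are not listed, and all values, are left unchanged.
    """
    if isinstance(value, dict):
        return {RENAMES.get(key, key): rename_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [rename_keys(item) for item in value]
    return value


if __name__ == "__main__":
    old_item = {
        "uri": "urn:example:item-1",  # hypothetical identifier
        "versioncreated": "2024-10-02T12:00:00Z",
        "associations": [{"copyrightholder": "Example Agency"}],
    }
    print(rename_keys(old_item))
```

An explicit mapping is used instead of an automatic case conversion because the word boundaries in an all-lowercase name cannot be recovered mechanically.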
All three updates, plus the NewsML-G2 v2.34_2 errata update, will be released in the coming weeks.