The IPTC has responded to a multi-stakeholder consultation on the recently-agreed European Union Artificial Intelligence Act (EU AI Act).

Although the IPTC is officially based in the UK, many of our members and staff operate from the European Union, and of course all of our members’ content is available in the EU, so it is very important to us that the EU regulates Artificial Intelligence providers in a way that is fair to all parts of the ecosystem, including content rightsholders, AI providers, AI application developers and end users.

In particular, we drew the EU AI Office’s attention to the IPTC Photo Metadata Data Mining property, which enables rightsholders to inform web crawlers and AI training systems whether or not their content may be used as part of a training data set for building AI models.

The points we made are the same as those in our submission to the IETF/IAB Workshop consultation: embedded data mining declarations should be part of the ecosystem of opt-outs, because robots.txt, W3C TDM, C2PA and other solutions are not sufficient for all use cases.

The full consultation text and all public responses will be published by the EU in due course via the consultation home page.

Tuesday’s IPTC Photo Metadata Conference was a great success. With 12 speakers from the media and software industries and over 200 people registered, it continues to be the largest gathering of photo and image metadata experts globally.

Introduction and welcome, 20 years of IPTC Photo Metadata, Recent work on Photo Metadata at IPTC

We started off with David Riecks and Michael Steidl, co-leads of the IPTC Photo Metadata Working Group, giving an update on what the IPTC has been working on in the areas of photo metadata since the last conference in 2022, along with Brendan Quinn, IPTC Managing Director.

A lot has been happening, including Meta announcing support for IPTC metadata for Generative AI, the launch of the IPTC Media Provenance Committee, and updates to the IPTC Photo Metadata User Guide, including our guidance on how to tag Generative AI content with metadata and how to use the DigitalSourceType field.

Panel 1: AI and Image Authenticity

The first panel featured Leonard Rosenthol of Adobe, Lead of the C2PA Technical Working Group; Dennis Walker of Camera Bits, creators of Photo Mechanic; Dr. Neal Krawetz, computer security specialist, forensic researcher and founder of FotoForensics; and Bofu Chen, Founder & CTO of Numbers Protocol. They spoke about image provenance and authenticity, covering the C2PA specification, the problem of fraudulent images, what it’s like to implement C2PA technology in existing software, and how blockchain-based systems could be built on top of C2PA to potentially extend its capabilities.

Session on Adobe’s Custom Metadata Panel

James Lockman, Group Manager, Digital Media Services at Adobe, demonstrated the Custom Metadata Panel plugin for some Adobe tools (Bridge, Illustrator, Photoshop and Premiere Pro) that allows the full range of IPTC Photo Metadata Standard and IPTC Video Metadata Hub properties, or any other metadata schema, to be edited directly in Adobe’s interface.

Panel 2: AI-powered asset management

Speakers Nancy Wolff, Partner at Cowan, DeBaets, Abrahams & Sheppard, LLP; Serguei Fomine, Founder and CEO of IQPlug; Jeff Nova, Chief Executive Officer at Colorhythm; and Mark Milstein, co-founder and Director of Business Development at vAIsual discussed the impact of AI on copyright, metadata and media asset management.

The full event recording is also available as a YouTube playlist.

Thanks to everyone for coming, and special thanks to our speakers. We’re already looking forward to next year!

The IPTC News Architecture Working Group is happy to announce the release of NewsML-G2 version 2.34.

This version, approved at the IPTC Standards Committee Meeting at the New York Times offices on Wednesday 17th April 2024, contains one small change and one additional feature:

Change Request 218, increase nesting of <related> tags: this allows for <related> items to contain child <related> items, up to three levels of nesting. This can be applied to many NewsML-G2 elements:

  • pubHistory/published
  • QualRelPropType (used in itemClass, action)
  • schemeMeta
  • ConceptRelationshipsGroup (used in concept, event, Flex1PropType, Flex1RolePropType, FlexPersonPropType, FlexOrganisationPropType, FlexGeoAreaPropType, FlexPOIPropType, FlexPartyPropType, FlexLocationPropType)

Note that we chose not to allow for recursive nesting because this caused problems with some XML code generators and XML editors.
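As an illustrative sketch of the new nesting (the scheme aliases and codes here are invented for the example), a concept can now carry related concepts nested up to three levels deep:

```xml
<concept>
  <conceptId qcode="example:paris"/>
  <name>Paris</name>
  <!-- first level of nesting -->
  <related rel="example:locatedIn" qcode="example:france">
    <!-- second level -->
    <related rel="example:locatedIn" qcode="example:europe">
      <!-- third and deepest permitted level -->
      <related rel="example:seeAlso" qcode="example:eu"/>
    </related>
  </related>
</concept>
```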

Change Request 219, add dataMining element to <rightsInfo>: In accordance with other IPTC standards such as the IPTC Photo Metadata Standard and Video Metadata Hub, we have added a new element to the <rightsInfo> block to convey a content owner’s wishes regarding data mining of the content. We recommend using the PLUS vocabulary that is also recommended for the other IPTC standards: https://ns.useplus.org/LDF/ldf-XMPSpecification#DataMining

Here are some examples of its use:

Denying all Generative AI / Machine Learning training using this content:

<rightsInfo>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING"/>
</rightsInfo>

A simple text-based constraint:

<rightsInfo>
  <usageTerms>
    Data mining allowed for academic and research purposes only.
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

A simple text-based constraint, expressed using a QCode instead of a URI:

<rightsInfo>
  <usageTerms>
    Reprint rights excluded.
  </usageTerms>
  <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

A text-based constraint expressed in both English and French:

<rightsInfo>
  <usageTerms xml:lang="en">
    Reprint rights excluded.
  </usageTerms>
  <usageTerms xml:lang="fr">
    droits de réimpression exclus
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

Using the “see embedded rights expression” constraint to express a complex machine-readable rights expression in RightsML:

<rightsInfo>
  <rightsExpressionXML langid="http://www.w3.org/ns/odrl/2/">
    <!-- RightsML goes here... -->
  </rightsExpressionXML>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR"/>
</rightsInfo>

For more information, contact the IPTC News Architecture Working Group via the public NewsML-G2 mailing list.

The 2024 IPTC Photo Metadata Conference takes place as a webinar on Tuesday 7th May from 1500 to 1800 UTC. Speakers hail from Adobe (makers of Photoshop), Camera Bits (makers of Photo Mechanic), Numbers Protocol, Colorhythm, vAIsual and more.

First off, IPTC Photo Metadata Working Group co-leads, David Riecks and Michael Steidl, will give an overview of what has been happening in the world of photo metadata since our last Conference in November 2022, including IPTC’s work on metadata for AI labelling, “do not train” signals, provenance, diversity and accessibility.

Next, a panel session on AI and Image Authenticity: Bringing trust back to photography? discusses approaches to the problem of verifying trust and credibility for online images. The panel features C2PA lead architect Leonard Rosenthol (Adobe), Dennis Walker (Camera Bits), Neal Krawetz (FotoForensics) and Bofu Chen (Numbers Protocol).

Next, James Lockman of Adobe presents the Custom Metadata Panel, which is a plugin for Photoshop, Premiere Pro and Bridge that allows for any XMP-based metadata schema to be used – including IPTC Photo Metadata and IPTC Video Metadata Hub. James will give a demo and talk about future ideas for the tool.

Finally, a panel on AI-Powered Asset Management: Where does metadata fit in? discusses the relevance of metadata in digital asset management systems in an age of AI. Speakers include Nancy Wolff (Cowan, DeBaets, Abrahams & Sheppard, LLP), Serguei Fomine (IQPlug), Jeff Nova (Colorhythm) and Mark Milstein (vAIsual).

The full agenda and links to register for the event are available at https://iptc.org/events/photo-metadata-conference-2024/

Registration is free and open to anyone who is interested.

See you there on Tuesday 7th May!

Screenshot of an article from SearchEngineLand entitled "Google wants you to label AI-generated images used in Merchant Center", with the subtitle: "Don't remove embedded metadata tags such as trainedAlgorithmicMedia from such images," Google wrote.
The new guidance has received some press in the Search Engine Optimisation (SEO) world, including this post on SearchEngineLand.

Google has added Digital Source Type support to Google Merchant Center, enabling images created by generative AI engines to be flagged as such across Google products such as Google Search, Google Maps, YouTube and Google Shopping.

In a new support post, Google reminds merchants who wish their products to be listed in Google search results and other products that they should not strip embedded metadata, particularly the Digital Source Type field which can be used to signal that content was created by generative AI.

We at the IPTC fully endorse this position. We have been saying for years that website publishers should not strip metadata from images. This should also include tools for maintaining online product inventories, such as Magento and WooCommerce. We welcome contact from developers who wish to learn more about how they can preserve metadata in their images.

Here’s the full text of Google’s recommendation:

Preserving metadata tags for AI-generated images in Merchant Center
February 2024
If you’re using AI-generated images in Merchant Center, Google requires that you preserve any metadata tags which indicate that the image was created using generative AI in the original image file.

Don't remove embedded metadata tags such as trainedAlgorithmicMedia from such images. All AI-generated images must contain the IPTC DigitalSourceType trainedAlgorithmicMedia tag. Learn more about IPTC photo metadata.

These requirements apply to the following image attributes in Merchant Center Classic and Merchant Center Next:

Image link [image_link]
Additional image link [additional_image_link]
Lifestyle image link [lifestyle_image_link]
Learn more about product data specifications.

An image generated by Imagine with Meta AI, using the prompt “A cute robot penguin painting a picture of itself using a canvas mounted on a wooden easel, in the countryside.” The image contains IPTC DigitalSourceType metadata showing that it was generated by AI.

Yesterday Nick Clegg, Meta’s President of Global Affairs, announced that Meta would be using IPTC embedded photo metadata to label AI-Generated Images on Facebook, Instagram and Threads.

Meta already uses the IPTC Photo Metadata Standard’s Digital Source Type property to label images generated by its platform. The image to the right was generated using Imagine with Meta AI, Meta’s image generation tool. Viewing the image’s metadata with the IPTC’s Photo Metadata Viewer tool shows that the Digital Source Type field is set to “trainedAlgorithmicMedia” as recommended in IPTC’s Guidance on metadata for AI-generated images.

Clegg said that “we do several things to make sure people know AI is involved, including putting visible markers that you can see on the images, and both invisible watermarks and metadata embedded within image files. Using both invisible watermarking and metadata in this way improves both the robustness of these invisible markers and helps other platforms identify them.”

This approach of both direct and indirect disclosure is in line with the Partnership on AI’s Best Practices on signalling the use of generative AI.

Also, Meta are building recognition of this metadata into their tools: “We’re building industry-leading tools that can identify invisible markers at scale – specifically, the “AI generated” information in the C2PA and IPTC technical standards – so we can label images from Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock as they implement their plans for adding metadata to images created by their tools.”

We have previously shared the news that Google, Microsoft, Adobe, Midjourney and Shutterstock will use IPTC metadata in their generated images, either directly in the IPTC Photo Metadata block or using the IPTC Digital Source Type vocabulary as part of a C2PA assertion. OpenAI has just announced that it has started using IPTC metadata via C2PA to signal that images from DALL-E are generated by AI.

A call for platforms to stop stripping image metadata

We at the IPTC agree that this is a great step towards end-to-end support of indirect disclosure of AI-generated content.

As the Meta and OpenAI posts point out, it is possible to strip out both IPTC and C2PA metadata, either intentionally or accidentally, so this is not a solution to all problems of content credibility.

Currently, one of the main ways metadata is stripped from images is when they are uploaded to Facebook or other social media platforms. So with this step, we hope that Meta’s platforms will stop stripping metadata from images when they are shared – not just the fields about generative AI, but also the fields regarding accessibility (alt text), copyright, creator’s rights and other information embedded in images by their creators.

Video next?

Meta’s post indicates that this type of metadata isn’t commonly used for video or audio files. We agree, but to be ahead of the curve, we have added Digital Source Type support to IPTC Video Metadata Hub so videos can be labelled in the same way.

We will be very happy to work with Meta and other platforms on making sure IPTC’s standards are implemented correctly in images, videos and other areas.

"A reel of film unspooling and transforming into a stream of binary digits"
Made with Bing Image Creator. Powered by DALL-E.

Following the IPTC’s recent announcement that rights holders can exclude images from generative AI with IPTC Photo Metadata Standard 2023.1, the IPTC Video Metadata Working Group is very happy to announce that the same capability now exists for video, through IPTC Video Metadata Hub version 1.5.

The “Data Mining” property has been added to this new version of IPTC Video Metadata Hub, which was approved by the IPTC Standards Committee on October 4th, 2023. Because it uses the same XMP identifier as the Photo Metadata Standard property, the existing support in the latest versions of ExifTool will also work for video files.

Therefore, adding metadata to a video file that says it should be excluded from Generative AI indexing is as simple as running this command in a terminal window:

exiftool -XMP-plus:DataMining="Prohibited for Generative AI/ML training" example-video.mp4

(Please note that this will only work in ExifTool version 12.67 and above, i.e. any version of ExifTool released after September 19, 2023)

The possible values of the Data Mining property are listed below:

Each PLUS URI is shown with its description (use exactly the description text with ExifTool):

  • http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED – “Unspecified – no prohibition defined”
  • http://ns.useplus.org/ldf/vocab/DMI-ALLOWED – “Allowed”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING – “Prohibited for AI/ML training”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING – “Prohibited for Generative AI/ML training”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING – “Prohibited except for search engine indexing”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED – “Prohibited”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT – “Prohibited, see plus:OtherConstraints”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR – “Prohibited, see iptcExt:EmbdEncRightsExpr”
  • http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEELINKEDRIGHTSEXPR – “Prohibited, see iptcExt:LinkedEncRightsExpr”

A corresponding new property “Other Constraints” has also been added to Video Metadata Hub v1.5. This property allows plain-text human-readable constraints to be placed on the video when using the “Prohibited, see plus:OtherConstraints” value of the Data Mining property.

The Video Metadata Hub User Guide and Video Metadata Hub Generator have also been updated to include the new Data Mining property added in version 1.5.

We look forward to seeing video tools (and particularly crawling engines for generative AI training systems) implement the new properties.

Please feel free to discuss the new version of Video Metadata Hub on the public iptc-videometadata discussion group, or contact IPTC via the Contact us form.

The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.

Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already used in IPTC’s sister standards: the IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.

Some of the possible values for the digital source type property, using the recommended IPTC Digital Source Type NewsCodes vocabulary, are:

  • digsrctype:digitalCapture – Original digital capture sampled from real life: the digital media is captured from a real-life source using a digital camera or digital recording device. Example: digital video taken using a digital film, video or smartphone camera.
  • digsrctype:negativeFilm – Digitised from a negative on film: the digital image was digitised from a negative on film or any other transparent medium. Example: digital photo scanned from a photographic negative.
  • digsrctype:minorHumanEdits – Original media with minor human edits: minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine. Example: original audio with minor edits (e.g. to eliminate breaks).
  • digsrctype:algorithmicallyEnhanced – Algorithmic enhancement: minor augmentation or correction by algorithm. Example: a photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature.
  • digsrctype:dataDrivenMedia – Data-driven media: digital media representation of data via human programming or creativity. Example: textual weather report generated by code using readings from weather detection instruments.
  • digsrctype:trainedAlgorithmicMedia – Trained algorithmic media: digital media created algorithmically using a model derived from sampled content. Example: a “deepfake” video using a combination of a real actor and a trained model.

The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.

Guidance on using Digital Source Type

The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.

Where Digital Source Type can be used in NewsML-G2 documents

The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.

It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
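As a sketch of what this might look like in an instance document (the part and content IDs here are invented for the example), an item-level declaration and a part-level declaration for an AI-generated shot would be:

```xml
<!-- Item level: declare the digital source type of the item as a whole -->
<contentMeta>
  <digitalSourceType qcode="digsrctype:digitalCapture"/>
</contentMeta>

<!-- Part level: one shot within the video was generated by a trained AI model -->
<partMeta partid="part_2" contentrefs="video_2">
  <digitalSourceType qcode="digsrctype:trainedAlgorithmicMedia"/>
</partMeta>
```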

Find out more about NewsML-G2 v2.33

All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.

The NewsML-G2 Specification document has been updated to cover the new version 2.33.

Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/

Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/

XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.

The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.

For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.

Dynamic fountains out of the Drau river in Villach, Carinthia, Austria (Europe). This image contains the new Data Mining property. Clicking on the image will show the metadata as extracted by IPTC’s online Get Photo Metadata tool.

Updated in June 2024 to include an image containing the new metadata property

Many image rights owners noticed that their assets were being used as training data for generative AI image creators, and asked the IPTC for a way to express that such use is prohibited. The new version 2023.1 of the IPTC Photo Metadata Standard now provides the means to do this: a field named “Data Mining” and a standardised list of values, adopted from the PLUS Coalition. These values can indicate that data mining is prohibited or allowed, either in general, for AI or Machine Learning purposes, or for generative AI/ML purposes. The standard was approved by IPTC members on 4th October 2023 and the specifications are now publicly available.

Because these data fields, like all IPTC Photo Metadata, are embedded in the file itself, the information will be retained even after an image is moved from one place to another, for example by syndicating an image or moving an image through a Digital Asset Management system or Content Management System used to publish a website. (Of course, this requires that the embedded metadata is not stripped out by such tools.)
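As a rough sketch of how the embedded property might appear inside an image file’s XMP packet (serialisation details vary by tool, and the PLUS XMP namespace URI shown here is our reading of the PLUS specification):

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:plus="http://ns.useplus.org/ldf/xmp/1.0/">
    <!-- Prohibit use of this image for generative AI/ML training -->
    <plus:DataMining>http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING</plus:DataMining>
  </rdf:Description>
</rdf:RDF>
```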

Created in a close collaboration with PLUS Coalition, the publication of the new properties comes after the conclusion of a public draft review period earlier this year. The properties are defined as part of the PLUS schema and incorporated into the IPTC Photo Metadata Standard in the same way that other properties such as Copyright Owner have been specified.

The new properties, “Data Mining” and “Other Constraints”, are now finalised and published.

The IPTC and PLUS Coalition wish to draw users’ attention to the following notice included in the specification:

Regional laws applying to an asset may prohibit, constrain, or allow data mining for certain purposes (such as search indexing or research), and may overrule the value selected for this property. Similarly, the absence of a prohibition does not indicate that the asset owner grants permission for data mining or any other use of an asset.

The prohibition “Prohibited except for search engine indexing” only permits data mining by search engines available to the public to identify the URL for an asset and its associated data (for the purpose of assisting the public in navigating to the URL for the asset), and prohibits all other uses, such as AI/ML training.

The IPTC encourages all photo metadata software vendors to incorporate the new properties into their tools as soon as possible, to support the needs of the photo industry.

ExifTool, the command-line tool for accessing and manipulating metadata in image files, already supports the new properties. Support was added in the ExifTool version 12.67 release, which is available for download on exiftool.org.

The new version of the specification can be accessed at https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata or from the navigation menu on iptc.org. The IPTC Get Photo Metadata tool and IPTC Photo Metadata Reference images have been updated to use the new properties.

The IPTC and PLUS Coalition wish to thank many IPTC and PLUS member organisations and others who took part in the consultation process around these changes. For further information, please contact IPTC using the Contact Us form.

The IPTC NewsCodes Working Group has approved an addition to the Digital Source Type NewsCodes vocabulary.

Image used by DALL-E to illustrate outpainting. OpenAI’s caption: “Illustration: August Kamp × DALL·E, outpainted from Girl with a Pearl Earring by Johannes Vermeer”

The new term, “Composite with Trained Algorithmic Media”, is intended to handle situations where the “synthetic composite” term is not specific enough, for example a composite that is specifically made using an AI engine’s “inpainting” or “outpainting” operations.

The full Digital Source Type vocabulary can be accessed from https://cv.iptc.org/newscodes/digitalsourcetype. It can be downloaded in NewsML-G2 (XML) or SKOS (RDF/XML, Turtle or JSON-LD) formats, to be integrated into content management and digital asset management systems.

The new term can be used immediately with any tool or standard that supports IPTC’s Digital Source Type vocabulary, including the C2PA specification, the IPTC Photo Metadata Standard and IPTC Video Metadata Hub.
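As a sketch, an image file’s XMP might reference the new term by its full CV URI (the term identifier shown here is our reading of the vocabulary; check the CV itself for the canonical ID):

```xml
<rdf:Description rdf:about=""
    xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">
  <!-- an inpainted/outpainted composite made with an AI engine -->
  <Iptc4xmpExt:DigitalSourceType>http://cv.iptc.org/newscodes/digitalsourcetype/compositeWithTrainedAlgorithmicMedia</Iptc4xmpExt:DigitalSourceType>
</rdf:Description>
```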

Information on the new term will soon be added to IPTC’s Guidance on using Digital Source Type in the IPTC Photo Metadata User Guide.