On 12 November 2023, a group of news, journalism and media organisations released what they call the “Paris Charter on AI and Journalism.” Created by 17 organisations brought together by Reporters sans frontières and chaired by journalist and Nobel Peace Prize laureate Maria Ressa, the Charter aims to give journalism organisations guidelines they can use to navigate the intersection of artificial intelligence systems and journalism.
The IPTC particularly welcomes the Charter because it aligns well with several of our ongoing initiatives and recent projects. IPTC technologies and standards give news organisations a way to implement the Charter simply and easily in their existing newsroom workflows.
In particular, we would like to offer comments on several of the Charter’s principles:
Principle 3: AI SYSTEMS USED IN JOURNALISM UNDERGO PRIOR, INDEPENDENT EVALUATION
“The AI systems used by the media and journalists should undergo an independent, comprehensive, and thorough evaluation involving journalism support groups. This evaluation must robustly demonstrate adherence to the core values of journalistic ethics. These systems must respect privacy, intellectual property and data protection laws.”
We particularly agree that AI systems must respect intellectual property laws. To support this, we have recently released the Data Mining property in the IPTC Photo Metadata Standard, which allows content owners to express permissions or restrictions on the use of their content for generative AI training or other data mining purposes. The Data Mining property is also supported in IPTC Video Metadata Hub.
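As a sketch of how a workflow tool might write this property, the fragment below builds a minimal XMP/RDF packet carrying a Data Mining value. The `plus` namespace URI and the exact property name are assumptions based on the PLUS draft discussed elsewhere in this article; check the published PLUS/IPTC specifications before relying on them.

```python
# Sketch: embedding the PLUS Data Mining property in an XMP/RDF fragment.
# Assumed: the PLUS XMP namespace URI and the "DataMining" property name.
import xml.etree.ElementTree as ET

PLUS_NS = "http://ns.useplus.org/ldf/xmp/1.0/"   # PLUS XMP namespace (assumed)
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def build_xmp_with_data_mining(dmi_term_uri: str) -> str:
    """Return a minimal XMP/RDF fragment carrying the Data Mining property."""
    ET.register_namespace("plus", PLUS_NS)
    ET.register_namespace("rdf", RDF_NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description")
    prop = ET.SubElement(desc, f"{{{PLUS_NS}}}DataMining")
    prop.text = dmi_term_uri  # a URI from the Data Mining controlled vocabulary
    return ET.tostring(rdf, encoding="unicode")

# Example value taken from the draft vocabulary discussed in this article
xmp = build_xmp_with_data_mining(
    "http://ns.useplus.org/ldf/vocab/DMI-ALLOWED-noaitraining")
print(xmp)
```

In a real workflow this fragment would be merged into the image file’s full XMP packet by a metadata library or a tool such as ExifTool, rather than written standalone.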
Principle 5: MEDIA OUTLETS MAINTAIN TRANSPARENCY IN THEIR USE OF AI SYSTEMS
“Any use of AI that has a significant impact on the production or distribution of journalistic content should be clearly disclosed and communicated to everyone receiving information alongside the relevant content. Media outlets should maintain a public record of the AI systems they use and have used, detailing their purposes, scopes, and conditions of use.”
To enable clear declaration of generated content, we have created extra terms in the Digital Source Type vocabulary to express content that was created or edited by AI. These values can be used in both IPTC Photo Metadata and IPTC Video Metadata Hub.
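The CV term URIs below are from the published IPTC “digitalsourcetype” NewsCodes vocabulary; the helper function and its grouping of terms are purely illustrative, showing how a newsroom system might flag AI-generated content for disclosure.

```python
# Sketch: checking an IPTC Digital Source Type value for generative-AI involvement.
# The URIs are real CV terms; the grouping and helper are illustrative only.
DST = "https://cv.iptc.org/newscodes/digitalsourcetype/"

AI_TERMS = {
    DST + "trainedAlgorithmicMedia",               # purely AI-generated media
    DST + "compositeWithTrainedAlgorithmicMedia",  # composite including AI elements
}

def involves_generative_ai(digital_source_type: str) -> bool:
    """True if the Digital Source Type URI signals generative-AI content."""
    return digital_source_type in AI_TERMS

print(involves_generative_ai(DST + "trainedAlgorithmicMedia"))  # True
print(involves_generative_ai(DST + "digitalCapture"))           # False
```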
Principle 6: MEDIA OUTLETS ENSURE CONTENT ORIGIN AND TRACEABILITY
“Media outlets should, whenever possible, use state-of-the-art tools that guarantee the authenticity and provenance of published content, providing reliable details about its origin and any subsequent changes it may have undergone. Any content not meeting these authenticity standards should be regarded as potentially misleading and should undergo thorough verification.”
Through IPTC’s work with Project Origin, C2PA and the Content Authenticity Initiative, we are pushing forward in making provenance and authenticity technology available and accessible to journalists and newsrooms around the world.
In conclusion, the Charter says: “In affirming these principles, we uphold the right to information, champion independent journalism, and commit to trustworthy news and media outlets in the era of AI.”
This weekend at the IBC broadcast industry event in Amsterdam, the Arab States Broadcasting Union (ASBU) will launch its news exchange network, ASBU Cloud, with IPTC’s NewsML-G2 standard at its core.
Developed for ASBU by IPTC member Broadcast Solutions, a systems integrator in the broadcast industry, ASBU Cloud uses NewsML-G2 to distribute content to partners.
After evaluating several metadata formats, ASBU chose to implement NewsML-G2 as a metadata schema and worked with IPTC to implement the standard. This ensures that ASBU content can be used easily by other international organisations like the European Broadcasting Union (EBU) and Asia-Pacific Broadcasting Union (ABU).
Broadcast Solutions System Architect Jean-Christophe Liechti explained the use of NewsML-G2 in an interview with Broadcast Pro magazine: “This XML-based standard for news exchange was developed and is maintained by the International Press Telecommunications Council (IPTC). It’s a successor to the original NewsML format and it provides a comprehensive and flexible framework for distributing any type of media, including text, images, audio and video. This metadata standard is language-agnostic. You can use standard dictionaries or manage your own to structure your data. We reached out to IPTC to ensure that our implementation closely met the standard. ASBU exchanges are now available as a NewsML-G2 feed like partner organisations like the EBU or major news organisations like Reuters, AP or AFP.”
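To make the framework Liechti describes concrete, the sketch below assembles a minimal NewsML-G2 news item with Python’s standard library. The element names and the NAR namespace come from the NewsML-G2 specification; the guid and qcode values are invented for illustration.

```python
# Sketch: a minimal NewsML-G2 news item built with ElementTree.
# Assumed/invented: the guid and the "ninat:video" item class qcode.
import xml.etree.ElementTree as ET

NAR = "http://iptc.org/std/nar/2006-10-01/"  # NewsML-G2 (NAR) namespace
ET.register_namespace("", NAR)

item = ET.Element(f"{{{NAR}}}newsItem", {
    "guid": "urn:newsml:example.org:20230917:demo-0001",  # invented guid
    "version": "1",
    "standard": "NewsML-G2",
})
item_meta = ET.SubElement(item, f"{{{NAR}}}itemMeta")
ET.SubElement(item_meta, f"{{{NAR}}}itemClass", {"qcode": "ninat:video"})
content_meta = ET.SubElement(item, f"{{{NAR}}}contentMeta")
headline = ET.SubElement(content_meta, f"{{{NAR}}}headline")
headline.text = "Example headline"

xml_str = ET.tostring(item, encoding="unicode")
print(xml_str)
```

A production feed would carry far richer itemMeta and contentMeta, typically drawing qcodes from standard IPTC controlled vocabularies or an organisation’s own dictionaries, as the interview notes.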
The project also runs on Amazon Web Services and the Dalet Flex media asset management system, and uses innovative tools such as Newsbridge, an AI video metadata extraction engine and IPTC’s newest member.
The project will be launched at the IBC event in Amsterdam this Sunday, 17 September at 12.00. The launch will take place at the Broadcast Solutions outdoor booth 0.A01 located across from Hall 13 of the RAI exhibition centre.
The video recording of IPTC’s presentation at the Sports Video Group Content Management Forum has now been released.
Paul Kelly, Lead of the IPTC Sports Content Working Group, gave a live presentation introducing the IPTC Sport Schema to participants at the event, held in New York City in July 2023.
Many of those participants have helped with the development and testing of the new model for sports data, including PGA Tour, NBA, NFL, NHL and more.
The full video is now available on SVG’s on-demand video platform, SVG Play.
In the presentation, Paul describes among other things:
- the motivation for the new model
- how it is different from IPTC’s existing sports standard SportsML
- how it can handle sports from tennis to athletics to football to golf and more
- how it might be used by broadcasters and sports data providers to attach sports data to video and other forms of media content
The Sports Content Working Group is now putting the final touches to the schema and its supporting documentation before it is put to the IPTC Standards Committee to be turned into an official IPTC standard.
In September 2022, IPTC Managing Director Brendan Quinn was invited to attend a workshop at the Royal Society in London, held in conjunction with the BBC. It was convened to discuss concerns about content provenance, threats to society from misinformation and disinformation, and the idea of a “public-service internet.”
A note summarising the outcomes of the meeting has now been published. The Royal Society says, “This note provides a summary of workshop discussions exploring the potential of digital content provenance and a ‘public service internet’.”
The workshop note gives a summary of key takeaways from the event:
- Digital content provenance is an imperfect and limited – yet still critically important – solution to the challenge of AI-generated misinformation.
- A provenance-establishing system that can account for the international and culturally diverse nature of misinformation is essential for its efficacy.
- Digital content provenance tools present significant technical and ethical challenges, including risks related to privacy, security and literacy.
- Understanding how best to embed ideas such as digital content provenance into counter-misinformation strategies may require revisiting the rules which dictate how information is transmitted over the internet.
- A ‘public service internet’ presents an interesting and new angle through which public service objectives can shape the information environment; however, the end state of such a system requires greater clarity and should include a wide range of voices, including historically excluded groups.
The IPTC is already participating in several projects looking at concrete responses to the problems of misinformation and disinformation in the media, via our work theme on Trust and Credibility in the Media. We are on the Steering Committee of Project Origin, and work closely with C2PA and the Content Authenticity Initiative.
The IPTC looks forward to further work in this area. We and our members will be happy to contribute to more workshops and studies by the Royal Society and other groups.
In partnership with the IPTC, the PLUS Coalition has published for public comment a draft of proposed revisions to the PLUS License Data Format standard. The changes cover a proposed standard for expressing image data mining permissions, constraints and prohibitions. This includes declaring in image files whether an image can be used as part of a training data set for a generative AI model.
Review the draft revisions in this read-only Google doc, which includes a link to a form for leaving comments.
Here is a summary of the new property:
XMP Value Type: URL
According to the PLUS proposal, the value of the property would be a value from the following controlled vocabulary:
| CV Term URI | Description |
| --- | --- |
| http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED | Neither allowed nor prohibited |
| http://ns.useplus.org/ldf/vocab/DMI-ALLOWED-noaitraining | Allowed except for AI/ML training |
| http://ns.useplus.org/ldf/vocab/DMI-ALLOWED-noaigentraining | Allowed except for AI/ML generative training |
| http://ns.useplus.org/ldf/vocab/DMI-SEE-constraint | Allowed with constraints expressed in Other Constraints property |
| http://ns.useplus.org/ldf/vocab/DMI-SEE-embeddedrightsexpr | Allowed with constraints expressed in IPTC Embedded Encoded Rights Expression property |
| | Allowed with constraints expressed in IPTC Linked Encoded Rights Expression property |
The public comment period will close on July 20, 2023.
If it is accepted by the PLUS membership and published as a PLUS property, the IPTC Photo Metadata Working Group plans to adopt this new property into a new version of the IPTC Photo Metadata Standard at the IPTC Autumn Meeting in October 2023.
CIPA, the Camera and Imaging Products Association based in Japan, has released version 3.0 of the Exif standard for camera data.
The new specification, “CIPA DC-008-Translation-2023 Exchangeable image file format for digital still cameras: Exif Version 3.0” can be downloaded from https://www.cipa.jp/std/documents/download_e.html?DC-008-Translation-2023-E.
Version 1.0 of Exif was released in 1995. The previous revision, 2.32, was released in 2019. The new version introduces some major changes, so the creators felt it was necessary to increment the major version number.
Fully internationalised text tags
In previous versions, text-based fields such as Copyright and Artist were required to be in ASCII format, meaning that it was impossible to express many non-English words in Exif tags. (In practice, many software packages simply ignored this restriction and used other character sets anyway, violating the specification.)
In Exif 3.0, a new datatype “UTF-8” is introduced, meaning that the same field can now support internationalised character sets, from Chinese to Arabic and Persian.
The definition of the ImageUniqueID tag has been updated to more clearly specify what type of ID can be used, when it should be updated (never!), and to suggest an algorithm:
This tag indicates an identifier assigned uniquely to each image. It shall be recorded as an ASCII string in hexadecimal notation equivalent to 128-bit fixed length UUID compliant with ISO/IEC 9834-8. The UUID shall be UUID Version 1 or Version 4, and UUID Version 4 is recommended. This ID shall be assigned at the time of shooting image, and the recorded ID shall not be updated or erased by any subsequent editing.
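The recommended UUID Version 4 in hexadecimal notation is straightforward to produce with a standard library; for example, in Python:

```python
# Sketch: generating an Exif 3.0 ImageUniqueID value.
# Exif 3.0 asks for a 128-bit UUID (Version 4 recommended) recorded as an
# ASCII string in hexadecimal notation; uuid4().hex yields exactly that.
import uuid

def new_image_unique_id() -> str:
    """Return a fresh 32-character hex UUIDv4 for ImageUniqueID."""
    # 128 bits -> 32 hex characters; assigned once at shooting time,
    # never updated or erased by subsequent editing.
    return uuid.uuid4().hex

image_unique_id = new_image_unique_id()
print(image_unique_id)
```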
Guidance on when and how tag values can be modified or removed
Exif 3.0 adds a new appendix, Annex H, “Guidelines for Handling Tag Information in Post-processing by Application Software”, which groups metadata into categories such as “structure-related metadata” and “shooting condition-related metadata”. It also classifies metadata in groups based on when they should be modified or deleted, if ever.
| Handling rule | Examples (list may not be exhaustive) |
| --- | --- |
| Shall be updated with image structure change | DateTime (should be updated with every edit), ImageWidth, Compression, BitsPerSample |
| Can be updated regardless of image structure change | ImageDescription, Software, Artist, Copyright, UserComment, ImageTitle, ImageEditor, ImageEditingSoftware, MetadataEditingSoftware |
| Shall not be deleted/updated at any time, but can be deleted in special cases | Make, Model, BodySerialNumber |
| Can be corrected (if wrong), added (if empty) or deleted (in special cases) | DateTimeOriginal, DateTimeDigitized, GPSLatitude, GPSLongitude, LensSpecification, Humidity |
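An editing application could encode these handling rules as a simple lookup. The tag grouping below mirrors the categories above; the function and its rule names are illustrative, not part of the Exif specification.

```python
# Sketch: applying Exif 3.0 Annex H-style handling rules in an editing tool.
# The groupings follow the guideline categories; the helper is illustrative.
STRUCTURE_DEPENDENT = {"DateTime", "ImageWidth", "Compression", "BitsPerSample"}
FREELY_EDITABLE = {"ImageDescription", "Software", "Artist", "Copyright",
                   "UserComment", "ImageTitle", "ImageEditor",
                   "ImageEditingSoftware", "MetadataEditingSoftware"}
PRESERVE = {"Make", "Model", "BodySerialNumber"}  # delete only in special cases
CORRECTABLE = {"DateTimeOriginal", "DateTimeDigitized", "GPSLatitude",
               "GPSLongitude", "LensSpecification", "Humidity"}

def may_update(tag: str, image_structure_changed: bool) -> bool:
    """Decide whether an editor may rewrite a tag after an edit."""
    if tag in PRESERVE:
        return False                    # keep camera identity intact
    if tag in STRUCTURE_DEPENDENT:
        return image_structure_changed  # must track structural edits
    return tag in FREELY_EDITABLE or tag in CORRECTABLE

print(may_update("Make", True))        # False
print(may_update("ImageWidth", True))  # True
```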
Collaboration between CIPA and IPTC
CIPA and IPTC representatives meet regularly to discuss issues that are relevant to both organisations. During these meetings IPTC has contributed suggestions to the Exif project, particularly around internationalised fields and unique IDs.
We are very happy for our friends at CIPA for reaching this milestone, and hope to continue collaborating in the future.
Developers of photo management software understand that values of Exif tags and IPTC Photo Metadata properties with a similar purpose should be synchronised, but it was not always clear exactly which properties should be aligned. IPTC and CIPA collaborated on a Mapping Guideline to help software developers implement these mappings properly. Most professional photo software now supports them.
Complete list of changes in Exif 3.0
The full set of changes in Exif 3.0 is as follows (taken from the history section of the PDF document):
- Added Tag Type of UTF-8 as Exif specific tag type.
- Enabled to select UTF-8 character string in existing ASCII-type tags
- Enabled APP11 Marker Segment to store a Box-structured data compliant with the JPEG System standard
- Added definition of Box-structured Annotation Data
- Added and changed the following tags:
- Added Title Tag
- Added Photographer Information related Tags (Photographer and ImageEditor)
- Added Software Information related Tags (CameraFirmware, RAWDevelopingSoftware, ImageEditingSoftware, and MetadataEditingSoftware)
- Changed Software, Artist, and ImageUniqueID
- Corrected incorrect definition of GPSAltitudeRef
- GPSMeasureMode tag now supports positioning information obtained from GNSS in addition to GPS
- Changed the description support levels of the following tags:
- Discarded Annex E.3 to specify Application Software Guidelines
- Added Annex H. (at the time of publication) to specify Guidelines for Handling Tag Information in Post-processing by Application Software
- Added Annex I. and J. (both at the time of publication) for supplemental information of Annotation Data
- Added Annex K. (at the time of publication) to specify Original Preservation Image
- Corrected errors, typos and omissions accumulated up to this edition
- Restructured and revised the entire document structure and style
The IPTC is very happy to announce that it has joined the Steering Committee of Project Origin, one of the industry’s key initiatives to fight misinformation online through the use of tamper-evident metadata embedded in media files.
After working with Project Origin over a number of years, and co-hosting a series of workshops during 2022, the organisation formally invited the IPTC to join the Steering Committee.
“We were very happy to co-host with Project Origin a productive series of webinars and workshops during 2022, introducing the details of C2PA technology to the news and media industry and discussing the remaining issues to drive wider adoption,” says Brendan Quinn, Managing Director of the IPTC.
C2PA, the Coalition for Content Provenance and Authenticity, took a set of requirements from both Project Origin and the Content Authenticity Initiative to create a technical means of associating media files with information on the origin and subsequent modifications of news stories and other media content.
“Project Origin’s aim is to take the ground-breaking technical specification created by C2PA and make it realistic and relevant for newsrooms around the world,” Quinn said. “This is very much in keeping with the IPTC’s mission to help media organisations to succeed by sharing best practices, creating open standards and facilitating collaboration between media and technology organisations.”
“The IPTC is a perfect partner for Project Origin as we work to connect newsrooms through secure metadata,” said Bruce MacCormack, the CBC/Radio-Canada Co-Lead.
The announcement was made at the Trusted News Initiative event held in London today, 30 March 2023, where representatives of the BBC, AFP, Microsoft, Meta and many others gathered to discuss trust, misinformation and authenticity in news media.
Learn more about Project Origin by contacting us or viewing the video below:
IPTC Managing Director Brendan Quinn presented at the European Broadcasting Union’s Data Technology Seminar last week.
The DataTech Seminar, known in previous years as the Metadata Developers Network, brought over 100 technologists together in person in Geneva to discuss topics related to managing data at broadcasters in Europe and around the world.
Brendan spoke on Tuesday 21st March on a panel discussing Artificial Intelligence and the Media. Brendan used the opportunity to discuss IPTC’s current work on “do not train” signals in metadata, and on establishing best practices for how AI tools can embed metadata indicating the origin of their media.
During the panel discussion, Brendan also highlighted the work of C2PA, Project Origin and the Content Authenticity Initiative on content provenance and tamper-evident media, as this relates to the prevalence of “deepfake” content that can be created by generative AI engines.
On Wednesday 22nd March, Brendan spoke on behalf of Paul Kelly, lead of the IPTC Sports Content Working Group, about the IPTC Sport Schema project. The session was called “IPTC Sport Schema – the next generation of sports data.” An evolution of IPTC’s SportsML standard, IPTC Sport Schema brings our 20 years of experience in sports data markup to the world of Knowledge Graphs and the Semantic Web. The specification is coming close to a version 1, so we were very proud to present it to some of the world’s top broadcasters and industry players.
The IPTC Sport Schema site sportschema.org now includes comprehensive documentation of the ontology behind the sports data model, examples of how it can be queried using SPARQL, example data files, and instance diagrams showing how it can be used to represent common sports such as athletics, soccer, golf and hockey.
We look forward to discussing IPTC Sport Schema much more over the coming months, as we draw close to its general release.
Our friends at CEPIC are running a webinar in conjunction with Google on the Licensable badge in search results. The webinar is TODAY, February 21st, so there are still a few hours left to join.
Google webinar: Image SEO and Licensable Badge
In this webinar, John Mueller, Google’s Search Advocate, will cover Image SEO Best Practices and Google’s Licensable Badge. For the Licensable Badge, John will give an overview of the product and implementation guidelines. There will also be time for a Q&A session.
One of the methods for enabling your licensing metadata to be surfaced in Google Search results is to embed the correct IPTC Photo Metadata directly into image files. The other is to use schema.org markup in the page hosting the image. We explain more in the Quick Guide to IPTC Photo Metadata on Google Images, but you can also learn about it by attending this webinar.
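For the schema.org route, the sketch below emits the JSON-LD for an ImageObject. The `license` and `acquireLicensePage` fields are the two that Google documents for the Licensable badge; the URLs are placeholders.

```python
# Sketch: schema.org markup for Google's Licensable badge, as JSON-LD.
# "license" and "acquireLicensePage" are Google's documented ImageObject
# fields for the badge; all URLs below are placeholders.
import json

image_object = {
    "@context": "https://schema.org",
    "@type": "ImageObject",
    "contentUrl": "https://example.com/photos/photo.jpg",        # placeholder
    "license": "https://example.com/license",                    # licence terms page
    "acquireLicensePage": "https://example.com/how-to-license",  # where to license the image
}

print(json.dumps(image_object, indent=2))
```

The equivalent embedded-metadata route sets the IPTC Web Statement of Rights and Licensor URL properties inside the image file itself, as described in our Quick Guide.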
Tuesday 21st February 2023, 4–5 PM Central European Time
Topics covered include:
• Image SEO best practices
• Licensable badge in Google image search results
This is a free webinar open to all those interested, not just CEPIC or IPTC members.
We are happy to announce that IPTC’s work with C2PA, the Coalition for Content Provenance and Authenticity, continues to bear fruit. The latest development is that C2PA assertions can now include properties from both the IPTC Photo Metadata Standard and our video metadata standard, IPTC Video Metadata Hub.
Version 1.2 of the C2PA Specification describes how metadata from either the photo or video standard can be added, using the XMP tag for each field in the JSON-LD markup for the assertion.
For IPTC Photo Metadata properties, the XMP tag name to be used is shown in the “XMP specs” row in the table describing each property in the Photo Metadata Standard specification. For Video Metadata Hub, the XMP tag can be found in the Video Metadata Hub properties table under the “XMP property” column.
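As a rough illustration, the sketch below serialises such an assertion as JSON-LD, keyed by XMP property names. The assertion label string, the context URIs and the example values are assumptions for illustration; check the C2PA specification for the exact structure.

```python
# Sketch: an IPTC metadata assertion for a C2PA manifest, as JSON-LD.
# Assumed: the assertion label, the @context URIs and the example values.
import json

assertion_data = {
    "@context": {
        "Iptc4xmpExt": "http://iptc.org/std/Iptc4xmpExt/2008-02-29/",
        "photoshop": "http://ns.adobe.com/photoshop/1.0/",
    },
    "photoshop:Credit": "Example News Agency",  # example value
    "Iptc4xmpExt:DigitalSourceType":
        "https://cv.iptc.org/newscodes/digitalsourcetype/digitalCapture",
}

# Label string is an assumption; verify it against the C2PA specification
manifest_assertion = {"label": "stds.iptc.photo-metadata", "data": assertion_data}
print(json.dumps(manifest_assertion, indent=2))
```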
We also show in the example assertion how the new accessibility properties can be added using the Alt Text (Accessibility) field which is available in Photo Metadata Standard and will soon be available in a new version of Video Metadata Hub.