ninjs 3.1 schema featuring the new Digital Source Type property

The IPTC is excited to announce the latest updates to ninjs, our JSON-based standard for representing news content metadata. Version 3.1 is now available, along with updated versions 2.2 and 1.6 for those using earlier schemas.

These releases reflect IPTC’s ongoing commitment to supporting structured, machine-readable news content across a variety of technical and editorial workflows.

What is ninjs?

ninjs (News in JSON) is a flexible, developer-friendly format for describing news items in a structured way. It allows publishers, aggregators, and news tech providers to encode rich metadata about articles, images, videos, and more, using a clean JSON format that fits naturally into modern content pipelines.

What’s new in ninjs 3.1, 2.2 and 1.6?

The new releases add a property for IPTC Digital Source Type, which was first used in the IPTC Photo Metadata Standard but is now used across the industry to declare the source of media content, including content generated or manipulated by a Generative AI engine.

The new property (called digitalSourceType in 3.1 and digitalsourcetype in 2.2 and 1.6 to match the case conventions of each standard version) has the following properties:

  • Name: the name of the digital source type, such as “Created using Generative AI”
  • URI: the official identifier of the digital source type from the IPTC Digital Source Type vocabulary or another vocabulary, such as http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia (the official ID for generative AI content)
  • Literal: an optional way to add new digital source types that are not part of a controlled vocabulary.
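As a rough illustration of the naming difference between versions, the sketch below builds the property for a given ninjs version. It assumes that the sub-properties are spelled in lower case (name, uri, literal) and that the value is a single JSON object; the helper name digital_source_type_entry is hypothetical, so check the published schemas before relying on it.

```python
import json

# Key spelling per ninjs major version, as described in the announcement.
KEY_BY_MAJOR_VERSION = {
    "3": "digitalSourceType",
    "2": "digitalsourcetype",
    "1": "digitalsourcetype",
}


def digital_source_type_entry(version, name=None, uri=None, literal=None):
    """Return (key, value) for the given ninjs version, omitting unset fields.

    Assumption: sub-property keys are lower case and the value is one object.
    """
    key = KEY_BY_MAJOR_VERSION[version.split(".")[0]]
    fields = {"name": name, "uri": uri, "literal": literal}
    value = {k: v for k, v in fields.items() if v is not None}
    return key, value


key, value = digital_source_type_entry(
    "3.1",
    name="Created using Generative AI",
    uri="http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia",
)
print(json.dumps({key: value}, indent=2))
```

Passing "2.2" or "1.6" instead of "3.1" yields the all-lower-case key.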

IPTC supports multiple versions of ninjs in parallel to ensure stability and continuity for publishers and platforms that depend on long-term schema support.

The new property is part of the general ninjs schema, and so can be used in the main body of a ninjs object to describe the main news item and can also be used in an “association” object which refers to an associated media item.
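To show how a consumer might read the property from both places, here is a minimal sketch that collects the digital source type URIs from an item and its associations. It assumes the ninjs 3.x layout in which associations is a list of nested objects; the item content and the urn:example identifiers are made up for illustration.

```python
BASE = "http://cv.iptc.org/newscodes/digitalsourcetype/"


def collect_digital_source_types(item, key="digitalSourceType"):
    """Return the digital source type URIs found on an item and its associations."""
    found = []
    own = item.get(key)
    if own and "uri" in own:
        found.append(own["uri"])
    # Assumption: associations is a list of nested ninjs objects (3.x style).
    for assoc in item.get("associations", []):
        found.extend(collect_digital_source_types(assoc, key))
    return found


# Hypothetical item: a captured photo with an AI-generated associated illustration.
item = {
    "uri": "urn:example:photo-1",
    "digitalSourceType": {"uri": BASE + "digitalCapture"},
    "associations": [
        {
            "name": "illustration",
            "uri": "urn:example:image-1",
            "digitalSourceType": {
                "name": "Created using Generative AI",
                "uri": BASE + "trainedAlgorithmicMedia",
            },
        }
    ],
}

print(collect_digital_source_types(item))
```

For 2.2 or 1.6 documents the key argument would be "digitalsourcetype", and the association container may be shaped differently, so the loop would need adjusting.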

Access the schemas

All versions are publicly available on the IPTC website:

ninjs generator and user guide

The ninjs Generator tool has been updated to cover the latest versions. Fill in the form fields and see what that content looks like in ninjs format. You can switch between the schema versions to see how the schema changes between 1.6, 2.2 and 3.1.

The ninjs User Guide has also been updated to reflect the newly added property.

Why it matters

As the news industry becomes increasingly reliant on metadata for content distribution, discoverability, and rights management, ninjs provides a modern, extensible foundation that supports both human and machine workflows. It’s trusted by major news agencies, technology platforms, and AI developers alike.

Get involved

We welcome feedback from the community and encourage you to share how you’re using ninjs in your own products or platforms. If you would like to discuss ninjs, you can join the public mailing list at https://groups.io/g/iptc-ninjs.

If you’re interested in contributing to the development of IPTC standards, join us!

China Daily’s article about the new Chinese guidelines.

The news outlet China Daily reported on Friday that China will require all AI-generated content to be labelled from September 1st, 2025.      

China Daily reports:

Chinese authorities issued guidelines on Friday requiring labels on all artificial intelligence-generated content circulated online, aiming to combat the misuse of AI and the spread of false information.    

The regulations, jointly issued by the Cyberspace Administration of China, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the National Radio and Television Administration, will take effect on Sept 1.

A spokesperson for the Cyberspace Administration said the move aims to “put an end to the misuse of AI generative technologies and the spread of false information.” 

According to China Daily, “[t]he guidelines stipulate that content generated or synthesized using AI technologies, including texts, images, audios, videos and virtual scenes, must be labeled both visibly and invisibly” (emphasis added by IPTC). This potentially means that IPTC or another form of embedded metadata must be used, in addition to a visible watermark. 

“Content identification numbers”

The article goes on to state that “[t]he guideline requires that implicit labels be added to the metadata of generated content files. These labels should include details about the content’s attributes, the service provider’s name or code, and content identification numbers.”

It is not clear from this article which particular identifiers should be used. There is currently no globally recognised mechanism for identifying individual pieces of content by identification number, although IPTC standards do allow identifiers to be embedded: IPTC Photo Metadata has the Digital Image GUID property, and IPTC Video Metadata Hub has the Video Identifier field, which is based on Dublin Core’s generic dc:identifier property.

IPTC Photo Metadata’s Digital Source Type property is the global standard for identifying AI-generated images and video files. It is used by Meta, Apple, Pinterest, Google and others, and has also been adopted by the C2PA specification for digitally-signed metadata embedded in media files.

According to the article, “Service providers that disseminate content online must verify that the metadata of the content files contain implicit AIGC labels, and that users have declared the content as AI-generated or synthesized. Prominent labels should also be added around the content to inform users.”

Spain’s equivalent legislation on labelling AI-generated content

This follows on from Spain’s legislation requiring labelling of AI-generated content, announced last week.

The Spanish proposal has been approved by the upper house of parliament but must still be approved by the lower house. The legislation will be enforced by the newly-created Spanish AI supervisory agency AESIA.

If companies do not comply with the proposed Spanish legislation, they could incur fines of up to 35 million euros ($38.2 million) or 7% of their global annual turnover.

An image created with Google’s Gemini model. The image contains values for the IPTC Photo Metadata properties Digital Source Type (trainedAlgorithmicMedia) and Credit (“Made with Google AI”).

On Thursday, Google announced that it will be extending its usage of AI content labelled using the IPTC Digital Source Type vocabulary.

We have previously shared that Google uses IPTC Photo Metadata to signal AI-generated and AI-edited media, for example labelling images edited with the Magic Eraser tool on Pixel phones. 

In a blog post published on Friday, John Fisher, Engineering Director for Google Photos and Google One, posted that “[n]ow we’re taking it a step further, making this information visible alongside information like the file name, location and backup status in the Photos app.”

This is based on IPTC’s Digital Source Type vocabulary, which was updated a few weeks ago to include new terms such as “Multi-frame computational capture sampled from real life” and “Screen capture”.

Google already surfaces Digital Source Type information in search results via the “About this image” feature.

Still taken from a demonstration of how AI-labelling information will look in Google Photos.

Also, the human-readable label for the term http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia was clarified to be “Created using Generative AI” and similarly the label for the term http://cv.iptc.org/newscodes/digitalsourcetype/compositeWithTrainedAlgorithmicMedia was clarified to be “Edited with Generative AI.” These terms are both used by Google.
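An application that wants to display these labels could keep a small lookup from term URI to label. The sketch below covers only the two terms and labels quoted above; the function name generative_ai_label is hypothetical, and a real implementation would more likely load labels from the full vocabulary.

```python
BASE = "http://cv.iptc.org/newscodes/digitalsourcetype/"

# Labels as quoted in the vocabulary update described above.
GENERATIVE_AI_LABELS = {
    BASE + "trainedAlgorithmicMedia": "Created using Generative AI",
    BASE + "compositeWithTrainedAlgorithmicMedia": "Edited with Generative AI",
}


def generative_ai_label(uri):
    """Return the human-readable label for a generative AI term, or None."""
    return GENERATIVE_AI_LABELS.get(uri.strip())


print(generative_ai_label(BASE + "trainedAlgorithmicMedia"))
print(generative_ai_label(BASE + "digitalCapture"))
```

Returning None for any other term lets the caller decide whether to show no label or fall back to another source.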

An extract of IPTC Media Topics vocabulary tree browser showing the new "show retired" button.

The IPTC NewsCodes Working Group is pleased to announce the latest release of the IPTC NewsCodes, our set of controlled vocabularies for the news industry.

Updates this time span many vocabularies, with the biggest updates to Media Topic and Digital Source Type.

Media Topic updates

Most of the recent work has been in the politics branch.

3 new concepts: by-election, recall election, coalition building

2 retired concepts: political campaigns, church elections

4 modified concept names (in English): voting system, referendum, fundamental rights, football (yes, we finally refer to the sport as “football” in en-GB and “soccer” in en-US!)

Modified concept definitions: 22 civil rights, election, voting system, intergovernmental elections, local elections, primary elections, referendum, regional elections, voting, fundamental rights, censorship and freedom of speech, freedom of religion, freedom of the press, human rights, football, political debates, privacy, women’s rights, breaking (breakdance)

1 hierarchy move: fundamental rights has been moved from politics to society.

Also, the Wikidata mapping URIs have all been changed to point to the http:// version of the URI instead of the https:// version. This follows the official Wikidata guidance.
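Consumers that store or compare these mappings may want to normalise any older https:// values in the same way. A minimal sketch, assuming the mappings are plain URI strings on the www.wikidata.org host (the function name is hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit


def to_canonical_wikidata_uri(uri):
    """Rewrite an https:// Wikidata URI to its http:// form; leave others alone."""
    parts = urlsplit(uri)
    if parts.scheme == "https" and parts.netloc == "www.wikidata.org":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


print(to_canonical_wikidata_uri("https://www.wikidata.org/entity/Q42"))
```

URIs on other hosts, and URIs that already use http://, pass through unchanged.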

See the official Media Topic vocabulary on the IPTC Controlled Vocabulary server, and an easier-to-navigate tree view. An Excel version of IPTC Media Topics is also available.

Digital Source Type updates

5 new concepts have been added:

2 concepts have been retired: Original media with minor human edits, and Digital art, as explained above.

8 concepts have had their names and definitions modified, while retaining the same machine-readable ID for backwards-compatibility purposes:

Our thanks go to IPTC representatives and experts from Partnership on AI, Google, Adobe, C2PA, CIPA and many others for making these updates to our vocabulary, which is now widely used to identify Generative AI content.

Updates to other NewsCodes vocabularies

Alternative Identifier Role (altidrole)

  • Vocabulary’s name changed to fix a spelling mistake.
  • New concept: IPTC Video Metadata Hub ID (altidrole:vmhVideoId)

Event Occur Status (eocstat)

  • Fix spelling mistake “occurence” -> “occurrence” throughout.

Golf Shot (spgolshot)

Rights Property (rightsprop)

Sports Concept (spct)

Screenshot of an article from SearchEngineLand entitled "Google wants you to label AI-generated images used in Merchant Center", with the subtitle: "Don't remove embedded metadata tags such as trainedAlgorithmicMedia from such images," Google wrote.
The new guidance has received some press in the Search Engine Optimisation (SEO) world, including this post on SearchEngineLand.

Google has added Digital Source Type support to Google Merchant Center, enabling images created by generative AI engines to be flagged as such in Google products such as Google Search, Maps, YouTube and Google Shopping.

In a new support post, Google reminds merchants who wish their products to be listed in Google search results and other products that they should not strip embedded metadata, particularly the Digital Source Type field which can be used to signal that content was created by generative AI.

We at the IPTC fully endorse this position. We have been saying for years that website publishers should not strip metadata from images. This should also include tools for maintaining online product inventories, such as Magento and WooCommerce. We welcome contact from developers who wish to learn more about how they can preserve metadata in their images.

Here’s the full text of Google’s recommendation:

Preserving metadata tags for AI-generated images in Merchant Center
February 2024
If you’re using AI-generated images in Merchant Center, Google requires that you preserve any metadata tags which indicate that the image was created using generative AI in the original image file.

Don't remove embedded metadata tags such as trainedAlgorithmicMedia from such images. All AI-generated images must contain the IPTC DigitalSourceType trainedAlgorithmicMedia tag. Learn more about IPTC photo metadata.

These requirements apply to the following image attributes in Merchant Center Classic and Merchant Center Next:

Image link [image_link]
Additional image link [additional_image_link]
Lifestyle image link [lifestyle_image_link]
Learn more about product data specifications.
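A pipeline that resizes or re-encodes product images could run a quick check that the tag survives processing. The sketch below is only a rough heuristic: it assumes the Digital Source Type value is stored as uncompressed text in the file's embedded XMP packet, which is typical for JPEG files, and simply searches the raw bytes for the vocabulary URI. It is not a metadata parser, and both function names are hypothetical.

```python
VOCAB = b"cv.iptc.org/newscodes/digitalsourcetype/"
TAG = b"trainedAlgorithmicMedia"


def mentions_trained_algorithmic_media(image_bytes):
    """Heuristic: does the raw file contain the trainedAlgorithmicMedia URI?"""
    return (VOCAB + TAG) in image_bytes


def metadata_was_stripped(original_bytes, processed_bytes):
    """True if the tag was present before processing but is missing afterwards."""
    return mentions_trained_algorithmic_media(
        original_bytes
    ) and not mentions_trained_algorithmic_media(processed_bytes)


# Synthetic stand-ins for file contents, for illustration only.
xmp = b"http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
original = b"\xff\xd8" + xmp + b"\xff\xd9"
processed = b"\xff\xd8\xff\xd9"
print(metadata_was_stripped(original, processed))
```

In practice the byte strings would come from reading the image files in binary mode, and a proper metadata library would be a more reliable way to confirm the value.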

The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.

Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already being used in IPTC’s sister standards IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.

Examples of other possible values for the digital source type property using the recommended IPTC Digital Source Type NewsCodes vocabulary are:

Each entry below gives the ID (in QCode format), followed by its name, definition and an example.

digsrctype:digitalCapture

  • Name: Original digital capture sampled from real life
  • Definition: The digital media is captured from a real-life source using a digital camera or digital recording device
  • Example: Digital video taken using a digital film, video or smartphone camera

digsrctype:negativeFilm

  • Name: Digitised from a negative on film
  • Definition: The digital image was digitised from a negative on film or any other transparent medium
  • Example: Digital photo scanned from a photographic negative

digsrctype:minorHumanEdits

  • Name: Original media with minor human edits
  • Definition: Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine
  • Example: Original audio with minor edits (e.g. to eliminate breaks)

digsrctype:algorithmicallyEnhanced

  • Name: Algorithmic enhancement
  • Definition: Minor augmentation or correction by algorithm
  • Example: A photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature

digsrctype:dataDrivenMedia

  • Name: Data-driven media
  • Definition: Digital media representation of data via human programming or creativity
  • Example: Textual weather report generated by code using readings from weather detection instruments

digsrctype:trainedAlgorithmicMedia

  • Name: Trained algorithmic media
  • Definition: Digital media created algorithmically using a model derived from sampled content
  • Example: A “deepfake” video using a combination of a real actor and a trained model

The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.
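The QCode IDs used here are a short form of the full vocabulary URIs. As a small sketch, the expansion can be done by mapping the digsrctype scheme alias to the vocabulary's base URI; the alias table is hard-coded for illustration, whereas a NewsML-G2 processor would normally resolve aliases through the catalog.

```python
# Assumption: hard-coded alias table; real processors read this from the catalog.
SCHEME_URIS = {
    "digsrctype": "http://cv.iptc.org/newscodes/digitalsourcetype/",
}


def qcode_to_uri(qcode):
    """Expand a QCode such as 'digsrctype:digitalCapture' to its full URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in SCHEME_URIS:
        raise ValueError(f"Unknown or malformed QCode: {qcode!r}")
    return SCHEME_URIS[alias] + code


print(qcode_to_uri("digsrctype:trainedAlgorithmicMedia"))
```

Unknown aliases raise an error rather than being passed through, so a typo in the scheme alias is caught early.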

Guidance on using Digital Source Type

The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.

Where Digital Source Type can be used in NewsML-G2 documents

The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.

It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
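The mixed-video case could be sketched as follows, using Python's standard library to build the XML. This is a skeleton only, not a schema-valid NewsItem: required elements such as itemMeta are omitted, the partid values are hypothetical, and carrying the value in a qcode attribute is an assumption based on the QCode IDs listed above. Consult the 2.33 schema documentation for the exact form.

```python
import xml.etree.ElementTree as ET

NAR = "http://iptc.org/std/nar/2006-10-01/"  # NewsML-G2 namespace
ET.register_namespace("", NAR)


def q(tag):
    """Qualify a tag name with the NewsML-G2 namespace."""
    return f"{{{NAR}}}{tag}"


def add_digital_source_type(parent, qcode):
    """Append a digitalSourceType element (assumed to take a qcode attribute)."""
    return ET.SubElement(parent, q("digitalSourceType"), qcode=qcode)


item = ET.Element(q("newsItem"))

# One partMeta per shot: a captured shot and an AI-generated shot.
captured = ET.SubElement(item, q("partMeta"), partid="shot1")
add_digital_source_type(captured, "digsrctype:digitalCapture")

generated = ET.SubElement(item, q("partMeta"), partid="shot2")
add_digital_source_type(generated, "digsrctype:trainedAlgorithmicMedia")

print(ET.tostring(item, encoding="unicode"))
```

To describe an item in its entirety, the same helper could be called on a contentMeta element instead of a partMeta element.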

Find out more about NewsML-G2 v2.33

All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.

The NewsML-G2 Specification document has been updated to cover the new version 2.33.

Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/. 

Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/

XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.

The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.

For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.
