
NEW YORK, NY, 26 JULY 2023: The IPTC today announced the beginning of a public feedback and review period of IPTC Sport Schema, which aims to be “the standard for the next generation of sports data.”
The announcement was made by Paul Kelly, Lead of the IPTC Sports Content Working Group, at the Sports Video Group’s Content Management Forum held at 230 Fifth Penthouse, New York.
“The SVG Content Management Forum is attended by senior tech experts from sports broadcasters and sports leagues from the US and around the world, so it is the perfect place to launch the IPTC Sport Schema,” said Kelly. “Many members of SVG have advised us on our work so far, including organisations such as Warner Bros Discovery, NBC Universal, PGA TOUR, Major League Baseball and Riot Games. Presenting our work at their event is a great way to say thanks for their help.”
While IPTC Sport Schema is not yet an official IPTC standard, the IPTC Sports Content Working Group feels that the schema is solid enough to be published for public feedback.
Sports data for the era of linked data and knowledge graphs
The purpose of the IPTC Sport Schema project is to create a new RDF-based sports data standard, while making the most of the experience the IPTC has gained from the last 20 years of maintaining SportsML, the open XML-based sports data standard used by news and sports organisations around the world.

While XML served the industry well for many years, more recently developers and IPTC members have asked the Sports Content Working Group whether a standard would become available in a more modern serialisation format such as JSON, and whether knowledge graph protocols would be supported.
Because it is based on the W3C-standard RDF and OWL specifications, IPTC Sport Schema leverages the wide range of tools and expertise in the world of knowledge graphs, semantic web and linked open data, including the SPARQL query language, the JSON-LD serialisation into JSON format, inference using RDF Schema and OWL, and more.
“Using IPTC Sport Schema, sports leagues can choose to own their data,” said IPTC Managing Director Brendan Quinn. “Content publishers or sports leagues can publish open data on their website if they choose, in a way that can be re-mixed and re-used by others around the world.” IPTC Sport Schema can also be used for a more traditional model of aggregation and syndication by sports statistics providers who add value to the raw data being collected by sports leagues.
Like its ancestor SportsML, IPTC Sport Schema is created as a generic sports data model that can represent results, statistics, schedules and rosters across many sports. “Plugins” for specific sports extend the generic schema with specific statistics elements for 10 sports such as soccer, motor racing, tennis, rugby and esports. But the generic model can be used to handle any sports competition, whether team-based, head-to-head or individual.
As well as IPTC’s SportsML standard, the project is based on previous work by the BBC on its BBC Sport Ontology (some of its creators worked on this project). We have also consulted with and analysed related projects and formats such as OpenTrack and the IOC’s Olympics Data Feed format.
For more information on IPTC Sport Schema, please see the dedicated site sportschema.org and the project’s GitHub repository.
Those who are interested in the details can see an introduction to the IPTC Sport Schema ontology design, the full ontology diagram or the full RDF/OWL ontology documentation.
There may be significant changes to the schema between now and when it is released as a fully endorsed IPTC Standard, so we don’t recommend that it is implemented in production systems yet. But we welcome analysis and experimentation with the model, and look forward to seeing feedback from those who would like to implement it in the real world.
People and organisations who are not IPTC members can give feedback by posting to the IPTC SportsML public discussion group or use the IPTC Contact Us form.

In September 2022, IPTC Managing Director Brendan Quinn was invited to attend a workshop at the Royal Society in London, held in conjunction with the BBC. It was convened to discuss concerns about content provenance, threats to society due to misinformation and disinformation, and the idea of a “public-service internet.”
A note summarising the outcomes of the meeting has now been published. The Royal Society says, “This note provides a summary of workshop discussions exploring the potential of digital content provenance and a ‘public service internet’.”
The workshop note gives a summary of key takeaways from the event:
- Digital content provenance is an imperfect and limited – yet still critically important – solution to the challenge of AI-generated misinformation.
- A provenance-establishing system that can account for the international and culturally diverse nature of misinformation is essential for its efficacy.
- Digital content provenance tools present significant technical and ethical challenges, including risks related to privacy, security and literacy.
- Understanding how best to embed ideas such as digital content provenance into counter-misinformation strategies may require revisiting the rules which dictate how information is transmitted over the internet.
- A ‘public service internet’ presents an interesting and new angle through which public service objectives can shape the information environment; however, the end state of such a system requires greater clarity and should include a wide range of voices, including historically excluded groups.
The IPTC is already participating in several projects looking at concrete responses to the problems of misinformation and disinformation in the media, via our work theme on Trust and Credibility in the Media. We are on the Steering Committee of Project Origin, and work closely with C2PA and the Content Authenticity Initiative.
The IPTC looks forward to further work in this area. The IPTC and its members will be happy to contribute to more workshops and studies by the Royal Society and other groups.

The IPTC Standards Committee is happy to announce that ninjs, IPTC’s schema for marking up news content in JSON, has been revised to versions 2.1 and 1.5.
The vote to approve the new versions was taken at the recent IPTC Spring Meeting in Tallinn, Estonia and online.
This is in keeping with IPTC’s decision to maintain two parallel versions of ninjs: one for those who can’t upgrade to the 2.x version for backwards compatibility reasons, and one for those who prefer the simpler structure of ninjs 2.x, which is easier to handle in some tools.
The ninjs User Guide has been updated to reflect the changes, which are summarised below.
ContactInfo added to ninjs 1.5 and 2.1
ninjs 2.1 and ninjs 1.5 both include the new contactinfo structure which can be used in the people, organisations, places and infosources properties (and their ninjs 1.x equivalents person, organisation, place and infosource).
The contactinfo structure can contain physical or online contact information such as a street address or postal address, a username on social media such as Twitter, Instagram or TikTok, or even a locator such as what3words.
Here are some examples of how the contactinfo property can be used:
"people": [
  {
    "name": "Jonas Svensson",
    "contactinfo": [
      {
        "type":"phone",
        "role": "work",
        "value": "+46 (0)8-7887500"
      }
    ]
  }
],
"organisations": [
  {
    "name": "International Committee of the Red Cross",
    "contactinfo": [
      {
        "type": "web",
        "value": "https://www.icrc.org/"
      },
      {
        "type": "address",
        "address": {
          "lines": [
            "19 Avenue de la paix",
            "1202 Geneva",
            "Switzerland"
          ]
        }
      },
      {
        "type": "telephone",
        "value": "+41 22 734 60 01"
      }
    ]
  }
]
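As a rough illustration of how a consumer might read these structures, the Python sketch below parses a fragment like the one above and collects one row per contact entry. The helper name `summarise_contacts` is hypothetical, and the sketch assumes only the fields shown in the example (`type`, `role`, `value` and `address.lines`); it is not taken from the ninjs User Guide.

```python
import json

# The fragment above, wrapped in an object so that it parses as JSON.
SAMPLE = """
{
  "people": [
    {"name": "Jonas Svensson",
     "contactinfo": [
       {"type": "phone", "role": "work", "value": "+46 (0)8-7887500"}
     ]}
  ],
  "organisations": [
    {"name": "International Committee of the Red Cross",
     "contactinfo": [
       {"type": "web", "value": "https://www.icrc.org/"},
       {"type": "address",
        "address": {"lines": ["19 Avenue de la paix", "1202 Geneva", "Switzerland"]}}
     ]}
  ]
}
"""

# Properties that may carry contactinfo in ninjs 2.x, per the text above.
PROPERTIES = ("people", "organisations", "places", "infosources")


def summarise_contacts(item):
    """Return (entity name, contact type, display value) tuples for an item."""
    rows = []
    for prop in PROPERTIES:
        for entity in item.get(prop, []):
            for contact in entity.get("contactinfo", []):
                if "address" in contact:
                    # Join the address lines into one display string.
                    value = ", ".join(contact["address"].get("lines", []))
                else:
                    value = contact.get("value", "")
                rows.append((entity.get("name", ""), contact.get("type", ""), value))
    return rows


ROWS = summarise_contacts(json.loads(SAMPLE))
for row in ROWS:
    print(row)
```

For ninjs 1.x items, the same loop would need the singular property names (person, organisation, place and infosource) instead.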
Better support for organisation identifiers such as tickers, ISIN etc
ninjs 2.1 and 1.5 also include the new symboltype and symbol properties under symbols. The symboltype property can hold any URI that describes the type of the symbol. The CV http://cv.iptc.org/newscodes/financialinstrumentsymboltype is recommended.
The ticker sub-property under symbols is now deprecated. This means that it can still be used if necessary, but use is not recommended.
We now recommend that ticker symbols are stored using symbol="TCKR" and symboltype="https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker".
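A minimal sketch of how existing data could be moved to the recommended form is shown below. The function name `migrate_symbols` is hypothetical, and the sample entry (including its `exchange` field) is an assumption about what an older symbols entry might look like, not an excerpt from the schema.

```python
# URI recommended above for identifying ticker symbols.
TICKER_TYPE = "https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker"


def migrate_symbols(symbols):
    """Return a copy of a symbols list with deprecated ticker entries rewritten."""
    migrated = []
    for entry in symbols:
        entry = dict(entry)  # shallow copy so the caller's data is left untouched
        if "ticker" in entry and "symbol" not in entry:
            entry["symbol"] = entry.pop("ticker")
            entry["symboltype"] = TICKER_TYPE
        migrated.append(entry)
    return migrated


# Hypothetical pre-2.1 entry; "exchange" is assumed for illustration.
OLD = [{"ticker": "TCKR", "exchange": "NYSE"}]
NEW = migrate_symbols(OLD)
print(NEW)
```

Entries that already carry a symbol value are left alone, so running the function twice has no further effect.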
Better support for machine classification
The subjects (ninjs 2.x) / subject (ninjs 1.x) properties now allow for the sub-properties creator, relevance and confidence. 
This allows organisations to more accurately use machine-generated subject tags in their content, while stating that each tag was created by a machine (using the creator property) and giving numerical values for the relevance and confidence scores that are reported by machine tagging engines. (Of course, these properties can also be used for human-created subject tags if necessary!)
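To illustrate how a consumer might use these sub-properties, the sketch below keeps only the subject tags that declare a creator and meet a confidence threshold. The helper `confident_subjects`, the creator string `example-tagger`, the `name` key and the 0 to 100 score scale are all assumptions made for this example; the ninjs schema itself should be checked for the exact value types.

```python
def confident_subjects(subjects, min_confidence):
    """Keep subject tags that declare a creator and meet a confidence threshold."""
    return [
        subject
        for subject in subjects
        if subject.get("creator") and subject.get("confidence", 0) >= min_confidence
    ]


# Hypothetical subjects array; scores are assumed to be on a 0-100 scale.
SUBJECTS = [
    {"name": "football", "creator": "example-tagger", "relevance": 90, "confidence": 85},
    {"name": "weather", "creator": "example-tagger", "relevance": 40, "confidence": 30},
    {"name": "sport"},  # no creator or scores recorded
]

KEPT = confident_subjects(SUBJECTS, min_confidence=50)
print([subject["name"] for subject in KEPT])
```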
In addition, some internal changes to the schema were made to fix a validation bug that existed in previous versions. In order to accommodate these changes, the ninjs 2.1 schema uses the https://json-schema.org/draft/2020-12/schema version of JSON Schema.
Thanks to Johan Lindgren, welcome Ian Young as Working Group Lead
At the Spring Meeting in Tallinn we said farewell to Johan Lindgren as Lead of the News in JSON Working Group.
Johan, of the TT news agency in Sweden, was instrumental in bringing the News in JSON Working Group back from its quiet period after the initial launch of ninjs. This directly led to the release of several new versions of ninjs over the past few years, and its adoption by many of the world’s top news providers.
The IPTC wishes to thank Johan for all his contributions, and wishes him well for his retirement.
Johan’s work will be taken over by Ian Young from PA Media Group / Alamy based in the UK. Ian steps up to the Lead role after participating in the Working Group for many years, since the earliest days of ninjs.
We thank Ian for being willing to take on the lead role, and we look forward to seeing what developments will emerge from the News in JSON Working Group in the future.
In partnership with the IPTC, the PLUS Coalition has published for public comment a draft on proposed revisions to the PLUS License Data Format standard. The changes cover a proposed standard for expressing image data mining permissions, constraints and prohibitions. This includes declaring in image files whether an image can be used as part of a training data set used to train a generative AI model.
Review the draft revisions in this read-only Google doc, which includes a link to a form for leaving comments.
Here is a summary of the new property:
| XMP Property | plus:DataMining | 
| XMP Value Type | URL | 
| XMP Category | External | 
| Namespace URI | http://ns.useplus.org/ldf/xmp/1.0/DataMining | 
| Comments | | 
| Cardinality | 0..1 | 
According to the PLUS proposal, the value of the property would be a value from the following controlled vocabulary:
| CV Term URI | Description | 
| http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED | Neither allowed nor prohibited | 
| http://ns.useplus.org/ldf/vocab/DMI-ALLOWED | Allowed | 
| http://ns.useplus.org/ldf/vocab/DMI-ALLOWED-noaitraining | Allowed except for AI/ML training | 
| http://ns.useplus.org/ldf/vocab/DMI-ALLOWED-noaigentraining | Allowed except for AI/ML generative training | 
| http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED | Prohibited | 
| http://ns.useplus.org/ldf/vocab/DMI-SEE-constraint | Allowed with constraints expressed in Other Constraints property | 
| http://ns.useplus.org/ldf/vocab/DMI-SEE-embeddedrightsexpr | Allowed with constraints expressed in IPTC Embedded Encoded Rights Expression property | 
| | Allowed with constraints expressed in IPTC Linked Encoded Rights Expression property | 
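As a sketch of how a data mining pipeline might act on this property, the Python below maps the draft vocabulary terms listed above to a coarse answer for generative AI training. The function name and the result labels are hypothetical, and the mapping is this example's own reading of the descriptions in the table, not part of the PLUS proposal. Treating an absent property like DMI-UNSPECIFIED is also an assumption.

```python
BASE = "http://ns.useplus.org/ldf/vocab/"

# Draft CV terms from the table above, mapped to this sketch's own labels.
GENERATIVE_TRAINING = {
    BASE + "DMI-UNSPECIFIED": "unspecified",
    BASE + "DMI-ALLOWED": "allowed",
    BASE + "DMI-ALLOWED-noaitraining": "prohibited",
    BASE + "DMI-ALLOWED-noaigentraining": "prohibited",
    BASE + "DMI-PROHIBITED": "prohibited",
    BASE + "DMI-SEE-constraint": "see-constraints",
    BASE + "DMI-SEE-embeddedrightsexpr": "see-constraints",
}


def generative_training_status(data_mining_value):
    """Classify a plus:DataMining value for generative AI training use."""
    if data_mining_value is None:
        # The property has cardinality 0..1, so it may be absent (assumed unspecified).
        return "unspecified"
    # Unrecognised values are flagged rather than guessed at.
    return GENERATIVE_TRAINING.get(data_mining_value, "unknown")


print(generative_training_status(BASE + "DMI-ALLOWED-noaigentraining"))
```

A cautious pipeline would likely treat every result other than "allowed" as a reason to skip the image or to route it for manual review.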
The public comment period will close on July 20, 2023.
If it is accepted by the PLUS membership and published as a PLUS property, the IPTC Photo Metadata Working Group plans to adopt this new property into a new version of the IPTC Photo Metadata Standard at the IPTC Autumn Meeting in October 2023.
The IPTC is happy to announce that NewsML-G2 version 2.32 has been released.
All documentation relating to version 2.32 can be found at the NewsML-G2 2.32 documentation page.
The changes in 2.32 are:
- Added new attributes authoritystatus and authoritystatusuri to the scheme, schemeMeta and catalog elements. These attributes describe the status of the authority managing a resource such as a scheme or a catalog.
- Added a new NewsCodes vocabulary https://cv.iptc.org/newscodes/authoritystatus with the values “No current authority”, “No single authority” and “Country-specific authority”.
- Updated the IPTC catalog to version 38, including the new authoritystatus vocabulary, and also added a “cvx.iptc.org” vocabulary, ticker. Added an authoritystatus attribute to the following schemes: isin, a1312cat, a1312prio, a1312svc, a1312vers. Also updated the note on the frmt vocabulary, removing the part that says it is only applicable to NewsML 1.
- Updated schema documentation for qcode, uri and literal throughout to be more accurate.
- Removed https from CV references in schema documentation.
- Updated dev schema to use 2.32.
- The schema documentation for “creator” and “creatoruri” attributes is now correct and consistent across all instances.
All information related to NewsML-G2 2.32 is at https://iptc.org/std/NewsML-G2/2.32/.
Example instance documents are at https://iptc.org/std/NewsML-G2/2.32/examples/.
Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.32/specification/XML-Schema-Doc-Power/
The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.32 files using the version 38 catalog.
For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.

CIPA, the Camera and Imaging Products Association based in Japan, has released version 3.0 of the Exif standard for camera data.
The new specification, “CIPA DC-008-Translation-2023 Exchangeable image file format for digital still cameras: Exif Version 3.0” can be downloaded from https://www.cipa.jp/std/documents/download_e.html?DC-008-Translation-2023-E.
Version 1.0 of Exif was released in 1995. The previous revision, 2.32, was released in 2019. The new version introduces some major changes so the creators felt it was necessary to increment the major version number.
Fully internationalised text tags
In previous versions, text-based fields such as Copyright and Artist were required to be in ASCII format, meaning that it was impossible to express many non-English words in Exif tags. (In practice, many software packages simply ignored this advice and used other character sets anyway, violating the specification.)
In Exif 3.0, a new datatype “UTF-8” is introduced, meaning that the same field can now support internationalised character sets, from Chinese to Arabic and Persian.
Unique IDs
The definition of the ImageUniqueID tag has been updated to more clearly specify what type of ID can be used, when it should be updated (never!), and to suggest an algorithm:
This tag indicates an identifier assigned uniquely to each image. It shall be recorded as an ASCII string in hexadecimal notation equivalent to 128-bit fixed length UUID compliant with ISO/IEC 9834-8. The UUID shall be UUID Version 1 or Version 4, and UUID Version 4 is recommended. This ID shall be assigned at the time of shooting image, and the recorded ID shall not be updated or erased by any subsequent editing.
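A minimal sketch of generating such an identifier with Python's standard library follows. It produces a Version 4 UUID as 32 hexadecimal ASCII characters; omitting the hyphens is this example's reading of "hexadecimal notation equivalent to 128-bit fixed length", so the exact string layout should be checked against the specification. The function name is hypothetical.

```python
import uuid


def new_image_unique_id():
    """Return a random (Version 4) UUID as 32 hexadecimal ASCII characters."""
    return uuid.uuid4().hex


# Assign once at capture time; later edits should leave the value untouched.
IMAGE_ID = new_image_unique_id()
print(IMAGE_ID)
```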
Guidance on when and how tag values can be modified or removed
Exif 3.0 adds a new appendix, Annex H, “Guidelines for Handling Tag Information in Post-processing by Application Software”, which groups metadata into categories such as “structure-related metadata” and “shooting condition-related metadata”. It also classifies metadata in groups based on when they should be modified or deleted, if ever.
| Category | Description | Examples (list may not be exhaustive) | 
| Update 0 | Shall be updated with image structure change | DateTime (should be updated with every edit), ImageWidth, Compression, BitsPerSample | 
| Update 1 | Can be updated regardless of image structure change | ImageDescription, Software, Artist, Copyright, UserComment, ImageTitle, ImageEditor, ImageEditingSoftware, MetadataEditingSoftware | 
| Freeze 0 | Shall not be deleted/updated at any time | ImageUniqueID | 
| Freeze 1 | Can be deleted in special cases | Make, Model, BodySerialNumber | 
| Freeze 2 | Can be corrected [if wrong], added [if empty] or deleted [in special cases] | DateTimeOriginal, DateTimeDigitized, GPSLatitude, GPSLongitude, LensSpecification, Humidity | 
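The categories in the table could be encoded in editing software roughly as follows. This is a sketch only: the operation names ("update", "correct", "add", "delete"), the `special_case` flag and the function name are this example's own, and only a few of the example tags from the table are included.

```python
# Category per tag, copied from the examples column of the table above.
TAG_CATEGORY = {
    "DateTime": "Update 0",
    "ImageWidth": "Update 0",
    "Artist": "Update 1",
    "Copyright": "Update 1",
    "ImageUniqueID": "Freeze 0",
    "Make": "Freeze 1",
    "Model": "Freeze 1",
    "DateTimeOriginal": "Freeze 2",
    "GPSLatitude": "Freeze 2",
}

# Operations permitted in ordinary editing, as this sketch reads the table.
ALLOWED = {
    "Update 0": {"update"},
    "Update 1": {"update"},
    "Freeze 0": set(),
    "Freeze 1": set(),               # deletion only in special cases
    "Freeze 2": {"correct", "add"},  # deletion only in special cases
}


def may_apply(tag, operation, special_case=False):
    """Return True/False for a known tag, or None if the tag is not listed here."""
    category = TAG_CATEGORY.get(tag)
    if category is None:
        return None
    if operation == "delete" and special_case and category in ("Freeze 1", "Freeze 2"):
        return True
    return operation in ALLOWED[category]


print(may_apply("ImageUniqueID", "update"))
print(may_apply("Make", "delete", special_case=True))
```

What counts as a "special case" is defined by Annex H itself; the flag here simply leaves that decision to the caller.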
Collaboration between CIPA and IPTC
CIPA and IPTC representatives meet regularly to discuss issues that are relevant to both organisations. During these meetings IPTC has contributed suggestions to the Exif project, particularly around internationalised fields and unique IDs.
We congratulate our friends at CIPA on reaching this milestone, and hope to continue collaborating in the future.
Developers of photo management software understand that values of Exif tags and IPTC Photo Metadata properties with a similar purpose should be synchronised, but sometimes it wasn’t clear exactly which properties should be aligned. IPTC and CIPA collaborated to create a Mapping Guideline to help software developers implement it properly. Most professional photo software now supports these mappings.
Complete list of changes in Exif 3.0
The full set of changes in Exif 3.0 is as follows (taken from the history section of the PDF document):
- Added Tag Type of UTF-8 as Exif specific tag type.
  - Enabled to select UTF-8 character string in existing ASCII-type tags
- Enabled APP11 Marker Segment to store a Box-structured data compliant with the JPEG System standard
- Added definition of Box-structured Annotation Data
- Added and changed the following tags:
  - Added Title Tag
  - Added Photographer Information related Tags (Photographer and ImageEditor)
  - Added Software Information related Tags (CameraFirmware, RAWDevelopingSoftware, ImageEditingSoftware, and MetadataEditingSoftware)
  - Changed Software, Artist, and ImageUniqueID
  - Corrected incorrect definition of GPSAltitudeRef
  - GPSMeasureMode tag became to support positioning information obtained from GNSS in addition to GPS
- Changed the description support levels of the following tags:
  - XResolution
  - YResolution
  - ResolutionUnit
  - FlashpixVersion
- Discarded Annex E.3 to specify Application Software Guidelines
- Added Annex H. (at the time of publication) to specify Guidelines for Handling Tag Information in Post-processing by Application Software
- Added Annex I. and J. (both at the time of publication) for supplemental information of Annotation Data
- Added Annex K. (at the time of publication) to specify Original Preservation Image
- Corrected errors, typos and omissions accumulated up to this edition
- Restructured and revised the entire document structure and style

Following Google’s recent announcement that it will signal generative AI content, and similar announcements from Midjourney and Shutterstock the day after, Microsoft has now announced that it will also be signalling the provenance of content created by Microsoft’s generative AI tools such as Bing Image Creator.
Microsoft’s efforts go one step beyond those of Google and Midjourney, because they are adding the image metadata in a way that can be verified using digital certificates. This means that not only is the signal added to the image metadata, but verifiable information is added on who added the metadata and when.
As TechCrunch puts it, “Using cryptographic methods, the capabilities, scheduled to roll out in the coming months, will mark and sign AI-generated content with metadata about the origin of the image or video.”
The system uses the specification created by the Coalition for Content Provenance and Authenticity (C2PA), a joint project of Project Origin and the Content Authenticity Initiative.
The 1.3 version of the C2PA Specification specifies how a C2PA Action can be used to signal provenance of Generative AI content. This uses the IPTC DigitalSourceType vocabulary – the same vocabulary used by the Google and Midjourney implementations.
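As a rough illustration, the data for such an action might be assembled as in the Python sketch below. The field names (`c2pa.actions`, `c2pa.created`, `softwareAgent`, `digitalSourceType`) reflect our reading of the C2PA 1.3 actions assertion and should be verified against the specification; the agent name is hypothetical. The sketch only builds unsigned data, whereas a real manifest must be created and signed with C2PA tooling and a digital certificate.

```python
import json

# IPTC DigitalSourceType term for media created by a trained algorithm.
TRAINED_ALGORITHMIC_MEDIA = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)


def generative_ai_action(software_agent):
    """Build one 'created' action that declares a generative AI source type."""
    return {
        "action": "c2pa.created",
        "softwareAgent": software_agent,
        "digitalSourceType": TRAINED_ALGORITHMIC_MEDIA,
    }


# "Example Image Generator" is a made-up agent name used for illustration.
ASSERTION = {
    "label": "c2pa.actions",
    "data": {"actions": [generative_ai_action("Example Image Generator")]},
}

print(json.dumps(ASSERTION, indent=2))
```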
This follows IPTC’s guidance on how to use the DigitalSourceType property, published earlier this month.

We have just finished the IPTC Spring Meeting in Tallinn, Estonia. It was our first face-to-face IPTC Member Meeting since 2019, and those who could attend in person were very happy to be back together, collaborating, sharing knowledge and building bonds across organisations in the media industry.
We were also joined by over 50 online attendees from IPTC member organisations, who braved sometimes difficult timezone differences to view many of the sessions in real time and participate in discussions. Other IPTC members who weren’t able to be there either physically or virtually will be able to watch recordings of the sessions soon.
Themes this time obviously included Generative AI, but also fact-checking and provenance, social media embedding and social stories.
Highlights of the Monday included a special briefing about digital citizenship and digital governance at the e-Estonia Briefing Centre, where members heard from an Estonian government representative who described Estonia’s electronic tax, medicine, administration and even e-voting system, all powered by the cryptographically-protected digital ID card and the X-Road system of interconnecting all of e-Estonia’s services, across both the private and public sector.
Also on the Monday we heard from Gerd Kamp (dpa) who explained how dpa are using Web Components technology to embed social media into their articles in a way that’s much easier for their customers to process. We also heard Working Group presentations and new standard proposals from the NewsML-G2 Working Group and the News in JSON Working Group, whose lead Johan Lindgren (TT) handed over the reins to Ian Young (PA Media / Alamy) who promises to be a fine leader of the group in the future. We say many thanks to Johan for all his contributions to IPTC over the past 25 years!
We also heard from Evi Varsou (ATC) who demonstrated some of ATC’s tools for fighting fake news and misinformation, used by some of the world’s top news organisations.
Day 2 saw Dave Compton (Refinitiv, an LSE Group Company) describe some of their work on augmenting news content in real time with analytics information. Then we heard invited speaker Maria Amelie (Factiverse) talk about her troubles with the Norwegian authorities, being deported, and eventually getting Norwegian law changed to support refugees like herself. She now runs the startup Factiverse which is looking at using AI to help promote fact checks as fast as possible, via their site Factisearch (among other projects).
After a discussion on rights and RightsML, we heard from Estonian startup Texta (who provide several tools for media organisations, including an automated comment feed moderator that works in many languages), and German startup Storifyme.com who have created a tool that lets media companies quickly and easily create social posts from news stories – still very relevant even as Google AMP is being wound down.
Tuesday was rounded off by Jennifer Parrucci (The New York Times) presenting the NewsCodes Working Group‘s update, and Paul Kelly (Individual Member) giving an update on the huge amount of work on IPTC Sport Schema from the Sports Content Working Group.
On Wednesday, after an EGM voting on an update to the Articles of Association, we heard from Charlie Halford (BBC) on Project Origin and C2PA, and Sebastian Posth of International Standard Content Code.
We also voted in updates to NewsML-G2 and ninjs, which will be announced here soon.
We’re already looking forward to the Autumn Meeting, held in October online, and Spring Meeting 2024, hopefully in New York City!

As a follow-up to yesterday’s news on Google using IPTC metadata to mark AI-generated content we are happy to announce that generative AI tools from Midjourney and Shutterstock will both be adopting the same guidelines.
According to a post on Google’s blog, Midjourney and Shutterstock will be using the same mechanism as Google – that is, using the IPTC “Digital Source Type” property to embed a marker that the content was created by a generative AI tool. Google will be detecting this metadata and using it to show a signal in search results that the content has been AI-generated.
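To show what detecting this marker could look like, the sketch below reads the Digital Source Type from an XMP packet using Python's standard library. It assumes the XMP has already been extracted from the image file, that the property is stored as Iptc4xmpExt:DigitalSourceType, and that generative AI output uses the `trainedAlgorithmicMedia` term from the IPTC vocabulary. The function names and the sample packet are made up for illustration, and other terms in the vocabulary are not handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IPTC_EXT_NS = "http://iptc.org/std/Iptc4xmpExt/2008-02-29/"
TRAINED_ALGORITHMIC_MEDIA = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)

# Hand-written sample packet; real packets contain many more properties.
SAMPLE_XMP = f"""<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="{RDF_NS}">
    <rdf:Description rdf:about=""
        xmlns:Iptc4xmpExt="{IPTC_EXT_NS}"
        Iptc4xmpExt:DigitalSourceType="{TRAINED_ALGORITHMIC_MEDIA}"/>
  </rdf:RDF>
</x:xmpmeta>"""


def digital_source_type(xmp_packet):
    """Return the DigitalSourceType value from an XMP packet, or None."""
    root = ET.fromstring(xmp_packet)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes ...
        value = description.get(f"{{{IPTC_EXT_NS}}}DigitalSourceType")
        if value:
            return value
        # ... or as child elements.
        child = description.find(f"{{{IPTC_EXT_NS}}}DigitalSourceType")
        if child is not None and child.text:
            return child.text.strip()
    return None


def is_ai_generated(xmp_packet):
    """True if the packet declares the trainedAlgorithmicMedia source type."""
    return digital_source_type(xmp_packet) == TRAINED_ALGORITHMIC_MEDIA


print(is_ai_generated(SAMPLE_XMP))
```

Because plain metadata like this can be edited or stripped, a missing value says nothing either way about how an image was made.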
A step towards implementing responsible practices for AI
We at IPTC are very excited to see this concrete implementation of our guidance on metadata for synthetic media.
We also see it as a real-world implementation of the guidelines on Responsible Practices for Synthetic Media from the Partnership on AI, and of the AI Ethical Guidelines for the Re-Use and Production of Visual Content from CEPIC, the alliance of European picture agencies. Both of these best practice guidelines emphasise the need for transparency in declaring content that was created using AI tools.
The phrase from the CEPIC transparency guidelines is “Inform users that the media or content is synthetic, through labelling or cryptographic means, when the media created includes synthetic elements.”
The equivalent recommendation from the Partnership on AI guidelines is called indirect disclosure:
“Indirect disclosure is embedded and includes, but is not limited to, applying cryptographic provenance to synthetic outputs (such as the C2PA standard), applying traceable elements to training data and outputs, synthetic media file metadata, synthetic media pixel composition, and single-frame disclosure statements in videos”
Here is a simple, concrete way of implementing these disclosure / transparency guidelines using existing metadata standards.
Moving towards a provenance ecosystem
IPTC is also involved in efforts to embed transparency and provenance metadata in a way that can be protected using cryptography: C2PA, the Content Authenticity Initiative, and Project Origin.
C2PA provides a way of declaring the same “Digital Source Type” information in a more robust way, that can provide mechanisms to retrieve metadata even after the image was manipulated or after the metadata was stripped from the file.
However, implementing C2PA technology is more complicated, and involves obtaining and managing digital certificates, among other things. Also, C2PA technology has not yet been implemented by platforms or search engines on the display side.
In the short term, AI content creation systems can use this simple mechanism to add disclosure information to their content.
The IPTC is happy to help any other parties to implement these metadata signals: please contact IPTC via the Contact Us form.

At today’s Google I/O event keynote, Sundar Pichai, CEO of Google, explained how Google will be using embedded IPTC image metadata to signal visual media created by generative AI models.
“Moving forward, we are building our models to include watermarking and other techniques from the start,” Pichai said. “If you look at a synthetic image, it’s impressive how real it looks, so you can imagine how important this is going to be in the future.
“Metadata allows content creators to associate additional context with original files, giving you more information whenever you encounter an image. We’ll ensure every one of our AI-generated images has that metadata.”
The IPTC Photo Metadata section of Google Images’ guidance on metadata has been updated with new guidance on the DigitalSourceType field.

This follows the guidance on IPTC Photo Metadata for Generative AI that was recently published by IPTC.
“AI-Generated” label on Google Images
The above guidance hints at an “AI-generated label” to be used on Google Images in the future. Google recommends that all creators of AI-generated images use the IPTC Digital Source Type property to signal AI-generated content. While Google says that “you may not see the label in Google Images right away”, it appears that it will soon be available in Google Images search results.