Anyone who has managed photo metadata can attest that it is often difficult to know which metadata properties to use for different purposes. It is especially tricky to know how to tag consistently across different metadata standards. For example, how should a copyright notice be expressed in Exif, IPTC Photo Metadata and schema.org metadata?
For software vendors wanting to build accurate mapping into their tools to make life easier for their customers, it’s no easier. For a while, a document created by a consortium of vendors known as the Metadata Working Group solved some of the problems, but the MWG Guidelines are no longer available online.
To solve this problem, the IPTC collaborated with Exif experts at CIPA, the camera products industry group that maintains the Exif standard. We also spoke with the team behind schema.org. Based on these conversations, we created a document that describes how to map properties between these formats. The aim is to remove any ambiguity regarding which IPTC Photo Metadata properties are semantically equivalent to Exif tags and schema.org properties.
Generally, Exif tags and IPTC Photo Metadata properties represent different things: Exif mainly represents the technical data around capturing an image, while IPTC focuses on describing the image and its administrative and rights metadata, and schema.org covers expressing metadata in a web page. However, quite a few properties are shared by all standards, such as who is the Creator of the image, the free-text description of what the image shows, or the date when the image was taken. Therefore it is highly recommended to have the same value in the corresponding fields of the different standards.
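As a rough illustration of keeping shared values in sync, the following minimal sketch checks whether the creator, description and copyright values agree across the three standards. The field names in the table are our assumptions for illustration and are not copied from the Mapping Guidelines document; the `find_mismatches` helper and the input layout are hypothetical. Dates are left out because each standard writes them in a different format and they would need normalising before comparison.

```python
# Illustrative mapping table. The field names are assumptions made for this
# sketch; consult the IPTC Photo Metadata Mapping Guidelines for the
# authoritative list of corresponding properties.
SHARED_FIELDS = {
    "creator": {"exif": "Artist", "iptc": "Creator", "schema_org": "creator"},
    "description": {
        "exif": "ImageDescription",
        "iptc": "Description",
        "schema_org": "description",
    },
    "copyright": {
        "exif": "Copyright",
        "iptc": "Copyright Notice",
        "schema_org": "copyrightNotice",
    },
}


def find_mismatches(metadata):
    """Return the shared concepts whose values differ between standards.

    `metadata` is a hypothetical structure of the form
    {"exif": {...}, "iptc": {...}, "schema_org": {...}} holding string values.
    Fields that are absent in a standard are ignored rather than flagged.
    """
    mismatches = {}
    for concept, names in SHARED_FIELDS.items():
        values = {}
        for standard, field in names.items():
            value = metadata.get(standard, {}).get(field)
            if value is not None:
                values[standard] = value
        # More than one distinct value means the standards disagree.
        if len(set(values.values())) > 1:
            mismatches[concept] = values
    return mismatches
```

Calling `find_mismatches` on the metadata read from an image returns an empty dictionary when every populated field agrees, and otherwise lists the conflicting values per standard so that they can be reconciled.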
The IPTC Photo Metadata Mapping Guidelines document outlines the 17 IPTC Photo Metadata Standard properties that have corresponding fields in Exif and/or schema.org. Short textual notes help implementers apply these mappings correctly.
The intended audience of the document is those managing the use of photo metadata in businesses and the makers of software that handles photo metadata.
The IPTC Photo Metadata Mapping Guidelines document can be accessed on the iptc.org website. We encourage IPTC members to provide feedback through the usual channels, and non-members to respond with feedback and questions on the public IPTC Photo Metadata email discussion group.
Next Thursday 10th March, IPTC members will be presenting a webinar on IPTC Media Topics and Wikidata. It will be held in association with the European Broadcasting Union as part of the EBU Wikidata Workshop.

The webinar is part of our series of “member-to-member” webinars, but as this is a special event in conjunction with EBU, attendance is open to the public.
The IPTC component of the workshop features Jennifer Parrucci of The New York Times, lead of the IPTC NewsCodes Working Group which manages the Media Topics vocabulary, and Managing Director of IPTC Brendan Quinn, introducing Media Topics and how they can be used with Wikidata. Then Tor Kristian Flage of Norwegian agency NTB and Gustav Carlberg of vendor and IPTC member iMatrics will present on their recent project to integrate IPTC Media Topics and Wikidata into their newsroom workflow.
Other speakers at the workshop on March 10th include France TV, RAI Italy, YLE Finland, Gruppo RES, Media Press and Perfect Memory.
Register to attend the full workshop (including the IPTC webinar) for free here.

The IPTC has an ongoing project to help the news and media industry deal with content credibility and provenance. As part of this, we have started working with Project Origin, a consortium of news and technology organisations who have come together to fight misinformation through the use of content provenance technologies.
On Tuesday 22nd February, Managing Director of IPTC Brendan Quinn spoke on a panel at an invite-only Executive Briefing event attended by leaders from news organisations around the world.
Other speakers at the event included Marc Lavallee, Head of R&D for The New York Times, Pascale Doucet of France Télévision, Eric Horvitz of Microsoft Research, Andy Parsons of Adobe, and Laura Ellis, Jamie Angus and Jatin Aythora of the BBC.
The event marks the beginning of the next phase of the industry’s work on content credibility. C2PA has now delivered the 1.0 version of its spec, so the next phase of the work is for the news industry to get together to create best practices around implementing it in news workflows.
IPTC and Project Origin will be working together with stakeholders from all parts of the news industry to establish guidelines for making provenance work in a practical way across the entire news ecosystem.
We have just released a small update to the Media Topics controlled vocabulary for news and media content. The changes support the Winter Olympics, which start this week.
The changes are:
- The definition of bobsleigh (medtop:20000854) was changed to reflect the fact that bobsleigh now offers a one-person version (which is incidentally referred to as “monobob”). The new definition is: One, two or four people racing down a course in a sled that consists of a main hull, a frame, two axles and sets of runners. The total time of all heats in a competition is added together to determine the winner.
- Similarly, the definition of freestyle skiing (medtop:20001058) was changed to reflect new events this year. The new definition is: Skiing competitions which, in contrast to alpine skiing, incorporate acrobatic moves and jumps. Events include aerials, halfpipe, slopestyle, ski cross, moguls and big air.
We also took the opportunity to add a term which was recently suggested by ABC Australia and Fourth Estate in the US:
- tsunami (medtop:20001353), child of medtop:20000151 natural disaster – High and powerful ocean waves caused by an underwater land disturbance, such as an earthquake or volcanic eruption, known to cause significant damage and loss when they hit land
We would like to thank all Media Topics users and maintainers for their feedback and support.
Welcome to 2022! We thought a good way to kick off the new year would be to share the text of the speech given by our new Standards Committee Chair, Paul Harman of Bloomberg, at the IPTC Standards Committee meeting on 20 October 2021. In this piece, Paul does a particularly good job of explaining IPTC’s mission and calls on all IPTC members to participate in our standards work.
The IPTC was founded to secure fair access to modern telecommunications infrastructure. Using satellite technology would enable news providers and distributors to report from conflict zones, or from the other side of the world, at greater speed and with less risk of disruption from regional disputes or actions which could affect landline alternatives.

Once such access was secured, they had to decide how to use it. News agencies required technical standards for information interchange, and that’s what IPTC set out to provide, in the name of interoperability. It’s a remit we continue to carry out today. Organisations both inside the news technology arena, and outside, look to IPTC for guidance on media metadata; IPTC is perhaps best known for the Photo Metadata standards that were incorporated into Adobe products, and from there across the photo ecosystem.
Today we face a different problem: not a lack of standards, but an over-abundance of them; and alongside that, regular misuse – or lack of use – of the standards we actually have. As the popular XKCD comic highlights, the solution isn’t to create “one new standard to rule them all”, as this just perpetuates the problem. Increasingly the activities of our Working Groups are about documenting how to use the standards – IPTC and external – that already exist, and how to map between them.

To do this, we need an understanding of what news is, and what each step in the workflow is trying to achieve. We must step away from the bits and bytes of transfer protocols, and instead examine the semantics of news – define an abstract data model representing the concepts in news collection, curation, distribution and feedback, and how those concepts inter-relate – separating the meaning of the metadata from the mechanics of how they are expressed. Only then can we successfully reflect that understanding back into whatever formats our members can use based on the constraints they are operating within.
New protocols and representations evolve all the time: SGML, XML, JSON, YAML, Turtle, Avro, protobuf… they are just serialisation formats. It shouldn’t have to matter whether you choose schema.org or rNews or RDFa or microdata or JSON-LD to embed metadata into HTML; what matters is a consistency of meaning, regardless of the mechanism.
Our Working Groups are already doing this, to a greater or lesser extent. The Video Metadata Hub is precisely an abstract model that defines serialisations into existing formats. The Photo Metadata Standard grew out of IIM and XMP work and describes the serialisations into, and necessary synchronisations between, current and future photo metadata standards. The News in JSON Working Group is attempting to map the same data model across JSON, Avro and Protocol Buffers, based on the News Architecture which was conceived as a data model but quickly became defined via its expression in XML, namely NewsML-G2. The Sports Content Working Group is currently working on taking the semantics from SportsML and SportsJS and re-expressing them in terms of RDF. For machine-readable rights, IPTC worked with the W3C on ODRL and used it as the basis for RightsML. And the NewsCodes Working Group is taking the Media Topics scheme and mapping it to Wikidata, which could be used as a lingua franca between any classification systems.
But this work is far from trivial, and requires continuous effort. IPTC is a member organisation, and it is through the time volunteered by delegates and their organisations that the work progresses. IPTC has but one member of staff – Brendan – who does a huge amount of work across all of our standards, but he also needs to run the business. Therefore we need your help to create and maintain our standards for the benefit of your businesses. Please join the working group sessions, or recommend somebody from your organisation to get involved, in the areas of interest to you and your organisation.
In particular, we have heard again at this meeting the need for machine-readable rights. The standard exists, in the form of RightsML. What it needs now is tooling to support the standard, a user guide with use cases, and potentially some how-tos or templates for typical use cases – similar maybe to Creative Commons licences – that cover the majority of our use cases. Most meetings, we hear from members on how crucial machine-readable rights are to effective workflows in their business, but the Working Group is currently without a lead. If you work at a member organisation who would benefit, please consider volunteering to participate in this group.
I would remind the Working Groups that IPTC has provision in the budget for technical authoring and software development – so I would encourage you to propose to the Board how you might use that. We can then decide where to spend, and also use this as input on future budgets. Let the Board know how we can help and support you.
I’d like to close by thanking the Working Group Leads, and their organisations, for so generously giving of their time: Dave, Jennifer, Johan, Paul, Michael and Pam. Special thanks to David Riecks for agreeing to co-chair the Photo Metadata group, and to Brendan for his support and development work on tools such as the Generators and Unit Testing frameworks. Thanks also to Kelvin Holland, our technical author, for his work on the NewsML-G2 Specification and User Guide. And thanks to the members of all of the working groups for their efforts on our standards which play such a crucial role in the newstech industry.
Thank you.
Paul Harman
Chair, IPTC Standards Committee
20 October 2021
The IPTC NewsCodes Working Group has now released the Q4 update to Media Topics, IPTC’s subject taxonomy used for classifying news content.
The main changes were made to the religion branch as part of our regular review cycle, and to sport events after discussing the terms with the Sports Content Working Group. This means that we have retired 29 terms and added 15 others. So in total we currently have 1,159 active terms in the vocabulary.
All new terms were created in en-GB and en-US versions, and have translations in Norwegian thanks to NTB. Other language translations will be added as they are contributed.
Below is a list of all of the changes.
New terms:
- medtop:20001338 education policy
- medtop:20001339 wellness
- medtop:20001340 mental wellbeing
- medtop:20001341 regular competition
- medtop:20001342 playoff championship
- medtop:20001343 final game
- medtop:20001344 Catholicism
- medtop:20001345 bar and bat mitzvah
- medtop:20001346 canonisation
- medtop:20001347 Shia Islam
- medtop:20001348 Sunni Islam
- medtop:20001349 atheism and agnosticism
- medtop:20001350 Eid al-Adha
- medtop:20001351 Hasidism
- medtop:20001352 Hanukkah
Retired terms:
- medtop:20000660 ecumenism
- medtop:20000662 Old Catholic
- medtop:20000665 Anglican
- medtop:20000666 Baptist
- medtop:20000667 Lutheran
- medtop:20000668 Mennonite
- medtop:20000669 Methodist
- medtop:20000670 Reformed
- medtop:20000671 Roman Catholic
- medtop:20000672 concordat
- medtop:20000675 Freemasonry
- medtop:20000687 interreligious dialogue
- medtop:20000689 religious event
- medtop:20000701 temple
- medtop:20001109 continental championship
- medtop:20001110 continental cup
- medtop:20001111 continental games
- medtop:20001112 international championship
- medtop:20001113 international cup
- medtop:20001114 international games
- medtop:20001115 national championship
- medtop:20001116 national cup
- medtop:20001117 national games
- medtop:20001118 regional championship
- medtop:20001119 regional cup
- medtop:20001120 regional games
- medtop:20001121 world championship
- medtop:20001122 world cup
- medtop:20001123 world games
Label changes:
- medtop:12000000 religion and belief -> religion
- medtop:20000128 international court or tribunal -> international court and tribunal
- medtop:20000423 environmental politics -> environmental policy
- medtop:20000458 mental health and disorders -> mental health
- medtop:20000657 religious belief -> belief systems
- medtop:20000661 Mormon -> Mormonism
- medtop:20000663 Orthodoxy -> Christian Orthodoxy
- medtop:20000664 Protestant -> Protestantism
- medtop:20000674 cult and sect -> cult
- medtop:20000690 religious festival or holiday -> religious festival and holiday
- medtop:20000697 religious facilities -> religious facility
- medtop:20000702 religious institutions and state relations -> relations between religion and government
Definition changes:
- medtop:12000000 religion
- medtop:20000117 arbitration and mediation
- medtop:20000423 environmental policy
- medtop:20000657 belief systems
- medtop:20000658 Buddhism
- medtop:20000659 Christianity
- medtop:20000661 Mormonism
- medtop:20000664 Protestantism
- medtop:20000673 Confucianism
- medtop:20000674 cult
- medtop:20000676 Hinduism
- medtop:20000677 Islam
- medtop:20000678 Jainism
- medtop:20000679 Judaism
- medtop:20000680 nature religion
- medtop:20000681 Zoroastrianism
- medtop:20000682 Scientology
- medtop:20000683 Shintoism
- medtop:20000684 Sikhism
- medtop:20000685 Taoism
- medtop:20000686 Unificationism
- medtop:20000690 religious festival and holiday
- medtop:20000691 Christmas
- medtop:20000692 Easter
- medtop:20000693 Pentecost
- medtop:20000694 Ramadan
- medtop:20000695 Yom Kippur
- medtop:20000696 religious ritual
- medtop:20000697 religious facility
- medtop:20000698 church
- medtop:20000699 mosque
- medtop:20000700 synagogue
- medtop:20000702 relations between religion and government
- medtop:20000704 pope
- medtop:20000705 religious text
- medtop:20000706 Bible
- medtop:20000708 Torah
- medtop:20001271 All Saints Day
- medtop:20001273 baptism
Hierarchy moves:
- medtop:20000423 environmental policy moved from medtop:06000000 environment to medtop:20000621 government policy
- medtop:20000479 healthcare policy moved from medtop:07000000 health to medtop:20000621 government policy
- medtop:20000480 government health care moved from medtop:20000479 healthcare policy to medtop:07000000 health
- medtop:20000483 health insurance moved from medtop:20000479 healthcare policy to medtop:07000000 health
- medtop:20000690 religious festival and holiday moved from medtop:20000689 religious event to medtop:12000000 religion
- medtop:20000696 religious ritual moved from medtop:20000689 religious event to medtop:12000000 religion
- medtop:20001177 Olympic Games moved from medtop:20001123 world games to medtop:20001108 sport event
- medtop:20001178 Paralympic Games moved from medtop:20001123 world games to medtop:20001108 sport event
- medtop:20001239 exercise and fitness moved from medtop:10000000 lifestyle and leisure to medtop:20001339 wellness
- medtop:20001293 streaming service moved from medtop:20000045 mass media to medtop:20000304 media
As always, the Media Topics vocabularies can be viewed in the following ways:
- In a collapsible tree view
- As a downloadable Excel spreadsheet
- On one page on the cv.iptc.org server
- In machine readable formats such as RDF/XML and Turtle using the SKOS vocabulary format: see the cv.iptc.org guidelines document for more detail.
For more information on IPTC NewsCodes in general, please see the IPTC NewsCodes Guidelines.
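For readers who work with the machine-readable versions, the identifiers used in the lists above are QCodes: a scheme prefix followed by a code. A minimal sketch of expanding one into a full concept URI follows. It assumes that the `medtop` prefix resolves to the base URI shown in the code; the `expand_qcode` helper is hypothetical and not part of any IPTC tool.

```python
# Assumption: the "medtop" scheme prefix resolves to this base URI on the
# cv.iptc.org server. Check the scheme declaration you actually use.
SCHEME_URIS = {"medtop": "http://cv.iptc.org/newscodes/mediatopic/"}


def expand_qcode(qcode, schemes=SCHEME_URIS):
    """Expand a QCode such as 'medtop:20001339' into a full concept URI."""
    prefix, separator, code = qcode.partition(":")
    if not separator or prefix not in schemes:
        raise ValueError(f"Unknown or malformed QCode: {qcode!r}")
    return schemes[prefix] + code
```

For example, `expand_qcode("medtop:20001339")` builds the URI for the new wellness term by appending the code to the scheme's base URI.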
At the recent IPTC Standards Committee Meeting, NewsML-G2 version 2.30 was approved.
The full NewsML-G2 XML Schema, NewsML-G2 Guidelines document and NewsML-G2 specification document have all now been updated.
The biggest change (Change Request CR00211) is that <catalogRef/> and <catalog/> elements are now optional. This is so that users who choose to use full URIs instead of QCodes do not need to include an unnecessary element.
The other user-facing change is CR00212 which adds residrefformat and residrefformaturi attributes to the targetResourceAttributes attribute group, used in <link>, <icon> and <remoteContent>.
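To make the effect of the optional catalog reference concrete, here is a minimal sketch that builds the same subject reference in both styles. It produces a fragment only, not a schema-valid NewsML-G2 item (required parts such as identifiers and item metadata are omitted). The `build_fragment` helper and the catalog URL are hypothetical placeholders, and the use of a `uri` attribute on the subject element is our assumption about how full URIs are expressed.

```python
import xml.etree.ElementTree as ET

NAR = "http://iptc.org/std/nar/2006-10-01/"  # NewsML-G2 XML namespace
ET.register_namespace("", NAR)  # serialise without a namespace prefix

# Assumption: base URI that the "medtop" QCode prefix resolves to.
MEDTOP_BASE = "http://cv.iptc.org/newscodes/mediatopic/"
# Placeholder only; a real item would point at an actual catalog file.
CATALOG_HREF = "http://example.org/catalog.xml"


def build_fragment(code, use_qcodes):
    """Build an illustrative (not schema-valid) newsItem fragment."""
    item = ET.Element(f"{{{NAR}}}newsItem")
    if use_qcodes:
        # QCode prefixes must be resolvable, so a catalog reference is needed.
        ET.SubElement(item, f"{{{NAR}}}catalogRef", {"href": CATALOG_HREF})
        attributes = {"qcode": f"medtop:{code}"}
    else:
        # With full URIs there is nothing to resolve, so no catalogRef.
        attributes = {"uri": MEDTOP_BASE + code}
    content_meta = ET.SubElement(item, f"{{{NAR}}}contentMeta")
    ET.SubElement(content_meta, f"{{{NAR}}}subject", attributes)
    return item


if __name__ == "__main__":
    for mode in (True, False):
        print(ET.tostring(build_fragment("20000854", mode), encoding="unicode"))
```

Running the module prints the two variants one after the other, which shows the catalog reference appearing only in the QCode version.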
The other changes, CR00213 and CR00214, aren’t visible to end users and don’t change any functionality, but make the XML Schema easier to read and maintain.
- The top-level folder of the NewsML-G2 v2.30 release is http://iptc.org/std/NewsML-G2/2.30/.
- The NewsML-G2 Implementation Guidelines document, updated to cover version 2.30, is available at https://www.iptc.org/std/NewsML-G2/guidelines
- The latest NewsML-G2 Specification document is available at https://www.iptc.org/std/NewsML-G2/specification/
- The XML Schema for NewsML-G2 v2.30 is at http://iptc.org/std/NewsML-G2/2.30/specification/NewsML-G2_2.30-spec-All-Power.xsd
XML Schema documentation of version 2.30 is available on GitHub and at http://iptc.org/std/NewsML-G2/2.30/specification/XML-Schema-Doc-Power/.
NewsML-G2 Generator updated
The NewsML-G2 Generator has been updated to use version 2.30. This means that catalogRef is only included if QCode mode is chosen. The Generator also uses the new layout which means that the target document is updated in real time as the form is completed.
To follow our work on GitHub, please see the IPTC NewsML-G2 GitHub repository.
The full NewsML-G2 change log showing the Change Requests included in each new version is available at the dev.iptc.org site.

Bill Kasdorf, IPTC Individual Member, has written about IPTC Photo Metadata in his latest column for Publishers Weekly.
In the article, a double-page spread in the printed version of the 11/22/2021 issue of Publishers Weekly and an extended article online, Bill references Caroline Desrosiers of IPTC Startup member Scribely saying “if publications are born accessible, then their images should be born accessible, as well.”
The article describes how the new Alt Text (Accessibility) and Extended Description (Accessibility) properties in IPTC Photo Metadata can be used to make EPUBs more accessible.
Bill goes on to provide an example, supplied by Caroline Desrosiers, of how an image’s caption, alt text and extended description fulfil very different purposes, and mentions that it’s perfectly fine to leave alt text blank in some cases! For more details, read the article here.

“Metadata is the wheel in the digital business model,” according to Carl-Gustav Linden of University of Bergen in Norway. “We can use it to combine the right content with the right readers, listeners and viewers. That’s why metadata is so essential.”
Professor Linden was speaking at the JournalismAI Festival taking place this week, hosted by the Polis think-tank at the London School of Economics and Political Science. The JournalismAI project is a collaboration between POLIS and newsrooms and institutes around the world, funded by the Google News Initiative.
We are very happy to see several mentions of IPTC standards and IPTC members, particularly the New York Times and iMatrics. The New York Times is seen as a forerunner in content classification, with Jennifer Parrucci (lead of the IPTC NewsCodes Working Group) giving presentations recently about their work. iMatrics supplies an automated content classification system based on IPTC Media Topics which can be used as part of editorial workflows.
One thing we would like to note is that Professor Linden mentions that the IPTC vocabularies are influenced by our background in US-based news organisations, citing an example of the schools terms being focussed on the US system. We are happy to say that in a recent update to IPTC Media Topics we clarified our terms around school systems, making the label names and descriptions much more generic and based on the international schools classifications.
This change was the result of many IPTC member organisations working together from different parts of the world, including Scandinavia, to come to a result that hopefully works for everyone (and of course, each user of Media Topics is welcome to extend the vocabulary for their own purposes if necessary). This is an example of the great work that takes place when our members work together.
The JournalismAI festival continues until Friday this week. All sessions from the festival are available on YouTube.
Thanks again to Polis and the JournalismAI team for giving us a mention!

Today, IPTC announces the release of version 2.0 of the news industry’s standard for exchanging content in JSON: ninjs.
The new version introduces a completely new way of declaring multiple headlines, body texts and description fields, which is compatible with binary data serialisation formats such as Avro and Protocol Buffers.
“We are very excited about releasing the 2.0-version of News in JSON (ninjs),” says Johan Lindgren (TT), lead of the working group responsible for developing the standard. “When working on improving the 1.3 version, we realised that a number of suggestions would mean breaking changes and after some consideration we took that step. Now we have a version of ninjs that is better suited for APIs, databases like Elastic and conversion to binary methods like Protocol Buffers.”
The IPTC News in JSON Working Group has kept the original focus on two main use cases: data in transit and data at rest.
In recent years, more systems have started to convert from JSON formats into binary data serialisation protocols such as Avro and Protocol Buffers for data in transit. However, ninjs 1.x couldn’t be converted into these protocols because of the dynamic way that keys could be defined, for example “headline_main” and “headline_subhead”. In ninjs 2.0, all properties are given well-defined names, so they can be converted into Protocol Buffers schemas. The GitHub repository for ninjs now includes a demonstration of how ninjs 2.0 can be used with Protocol Buffers.
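The difference between dynamic keys and well-defined names can be sketched as follows. This is a minimal illustration, assuming that ninjs 2.0 represents headlines as an array of objects with "role" and "value" properties; please check the ninjs 2.0 JSON Schema for the exact property names. The `headlines_to_v2` helper is hypothetical and is not part of the ninjs repository.

```python
def headlines_to_v2(item_v1):
    """Collect ninjs 1.x headline keys into a 2.0-style list of objects.

    ninjs 1.x allowed open-ended keys such as "headline_main", which a
    fixed binary schema cannot enumerate in advance. A list of objects
    with a fixed set of property names can be described once, up front.
    """
    headlines = []
    for key, value in item_v1.items():
        if key == "headline":
            # A plain headline carries no role.
            headlines.append({"value": value})
        elif key.startswith("headline_"):
            role = key[len("headline_"):]
            headlines.append({"role": role, "value": value})
    return headlines


# Hypothetical ninjs 1.x item used only for demonstration.
item_v1 = {
    "uri": "http://example.org/item/1",
    "headline_main": "Main headline",
    "headline_subhead": "Sub headline",
}
item_v2_headlines = headlines_to_v2(item_v1)
```

The same pattern would apply to the body text and description fields, which the 2.0 release also restructured.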
Other tools included in the repository are an example GraphQL server for ninjs and example XSLTs to convert from IPTC XML-based formats like NewsML and NITF.
The ninjs Generator tool has been updated to create ninjs 2.0. In fact, using the tool, users can switch between generating ninjs 1.3 and ninjs 2.0 output at the click of a radio button.
The official location of the ninjs 2.0 JSON Schema is https://iptc.org/std/ninjs/ninjs-schema_2.0.json.
A full list of the changes in ninjs 2.0 can be viewed in section 7.5 of the ninjs User Guide.