Marc Lavallee of The New York Times, Brendan Quinn of IPTC, Pascale Doucet of France Télévision and Scott Yates of JournalList spoke on a panel at the Project Origin event on February 22, 2022.

The IPTC has an ongoing project to help the news and media industry deal with content credibility and provenance. As part of this, we have started working with Project Origin, a consortium of news and technology organisations who have come together to fight misinformation through the use of content provenance technologies.

On Tuesday 22nd February, Managing Director of IPTC Brendan Quinn spoke on a panel at an invite-only Executive Briefing event attended by leaders from news organisations around the world.

Other speakers at the event included Marc Lavallee, Head of R&D for The New York Times, Pascale Doucet of France Télévision, Eric Horvitz of Microsoft Research, Andy Parsons of Adobe, and Laura Ellis, Jamie Angus and Jatin Aythora of the BBC.

The event marks the beginning of the next phase of the industry’s work on content credibility. C2PA has now delivered the 1.0 version of its spec, so the next phase of the work is for the news industry to get together to create best practices around implementing it in news workflows.

IPTC and Project Origin will be working together with stakeholders from all parts of the news industry to establish guidelines for making provenance work in a practical way across the entire news ecosystem.

We have just released a small update to the Media Topics controlled vocabulary for news and media content. The changes support the Winter Olympics, which start this week.

The changes are:

  • The definition of bobsleigh (medtop:20000854) was changed to reflect the fact that bobsleigh now offers a one-person version (which is incidentally referred to as “monobob”). The new definition is: One, two or four people racing down a course in a sled that consists of a main hull, a frame, two axles and sets of runners. The total time of all heats in a competition is added together to determine the winner.
  • Similarly, the definition of freestyle skiing (medtop:20001058) was changed to reflect new events this year. The new definition is: Skiing competitions which, in contrast to alpine skiing, incorporate acrobatic moves and jumps. Events include aerials, halfpipe, slopestyle, ski cross, moguls and big air.

We also took the opportunity to add a term which was recently suggested by ABC Australia and Fourth Estate in the US:

We would like to thank all Media Topics users and maintainers for their feedback and support.

Welcome to 2022! We thought a good way to kick off the new year would be to share the text of the speech given by our new Standards Committee Chair, Paul Harman of Bloomberg, at the IPTC Standards Committee meeting on 20 October 2021. In this piece, Paul does a particularly good job of explaining IPTC’s mission and calls on all IPTC members to participate in our standards work.

The IPTC was founded to secure fair access to modern telecommunications infrastructure. Using satellite technology would enable news providers and distributors to report from conflict zones, or from the other side of the world, at greater speed and with less risk of disruption from regional disputes or actions which could affect landline alternatives.

This extract from a 1967 IPTC newsletter illustrates the early work of the IPTC in securing access to telegraph and satellite lines for the news industry.

Once such access was secured, they had to decide how to use it. News agencies required technical standards for information interchange, and that’s what IPTC set out to provide, in the name of interoperability. It’s a remit we continue to carry out today. Organisations both inside the news technology arena, and outside, look to IPTC for guidance on media metadata; IPTC is perhaps best known for the Photo Metadata standards that were incorporated into Adobe products, and from there across the photo ecosystem.

Today we face a different problem: not a lack of standards, but an over-abundance of them; and alongside that, regular misuse – or lack of use – of the standards we actually have. As the popular XKCD comic highlights, the solution isn’t to create “one new standard to rule them all”, as this just perpetuates the problem. Increasingly the activities of our Working Groups are about documenting how to use the standards – IPTC and external – that already exist, and how to map between them. 

The classic XKCD comic illustrating the problem of trying to create “one new standard to rule them all” Source: https://xkcd.com/927/ Licensed under CC-BY-NC.

To do this, we need an understanding of what news is, and what each step in the workflow is trying to achieve. We must step away from the bits and bytes of transfer protocols, and instead examine the semantics of news – define an abstract data model representing the concepts in news collection, curation, distribution and feedback, and how those concepts inter-relate – separating the meaning of the metadata from the mechanics of how they are expressed. Only then can we successfully reflect that understanding back into whatever formats our members can use based on the constraints they are operating within.

New protocols and representations evolve all the time: SGML, XML, JSON, YAML, Turtle, Avro, protobuf… they are just serialisation formats. It shouldn’t have to matter whether you choose schema.org or rNews or RDFa or microdata or JSON-LD to embed metadata into HTML; what matters is a consistency of meaning, regardless of the mechanism.

Our Working Groups are already doing this, to a greater or lesser extent. The Video Metadata Hub is precisely an abstract model that defines serialisations into existing formats. The Photo Metadata Standard grew out of IIM and XMP work and describes the serialisations into, and necessary synchronisations between, current and future photo metadata standards. The News in JSON Working Group is attempting to map the same data model across JSON, Avro and Protocol Buffers, based on the News Architecture which was conceived as a data model but quickly became defined via its expression in XML, namely NewsML-G2. The Sports Content Working Group is currently working on taking the semantics from SportsML and SportsJS and re-expressing them in terms of RDF. For machine-readable rights, IPTC worked with the W3C on ODRL and used it as the basis for RightsML. And the NewsCodes Working Group is taking the Media Topics scheme and mapping it to Wikidata, which could be used as a lingua franca between any classification systems.

But this work is far from trivial, and requires continuous effort. IPTC is a member organisation, and it is through the time volunteered by delegates and their organisations that the work progresses. IPTC has but one member of staff – Brendan – who does a huge amount of work across all of our standards, but he also needs to run the business. Therefore we need your help to create and maintain our standards for the benefit of your businesses. Please join the working group sessions, or recommend somebody from your organisation to get involved, in the areas of interest to you and your organisation.

In particular, we have heard again at this meeting the need for machine-readable rights. The standard exists, in the form of RightsML. What it needs now is tooling to support the standard, a user guide with use cases, and potentially some how-tos or templates for typical use cases – similar maybe to Creative Commons licences – that cover the majority of our use cases. Most meetings, we hear from members on how crucial machine-readable rights are to effective workflows in their business, but the Working Group is currently without a lead. If you work at a member organisation who would benefit, please consider volunteering to participate in this group.

I would remind the Working Groups that IPTC has provision in the budget for technical authoring and software development – so I would encourage you to propose to the Board how you might use that. We can then decide where to spend, and also use this as input on future budgets. Let the Board know how we can help and support you.

I’d like to close by thanking the Working Group Leads, and their organisations, for so generously giving of their time: Dave, Jennifer, Johan, Paul, Michael and Pam. Special thanks to David Riecks for agreeing to co-chair the Photo Metadata group, and to Brendan for his support and development work on tools such as the Generators and Unit Testing frameworks. Thanks also to Kelvin Holland, our technical author, for his work on the NewsML-G2 Specification and User Guide. And thanks to the members of all of the working groups for their efforts on our standards which play such a crucial role in the newstech industry.

Thank you.

Paul Harman
Chair, IPTC Standards Committee
20 October 2021


The IPTC NewsCodes Working Group has now released the Q4 update to Media Topics, IPTC’s subject taxonomy used for classifying news content.

The main changes were made to the religion branch as part of our regular review cycle, and to sport events after discussing the terms with the Sports Content Working Group. This means that we have retired 29 terms and added 15 others. So in total we currently have 1,159 active terms in the vocabulary.

All new terms were created in en-GB and en-US versions, and have translations in Norwegian thanks to NTB. Other language translations will be added as they are contributed.

Below is a list of all of the changes.

New terms:

Retired terms:

Label changes:

Definition changes:

Hierarchy moves:

  • medtop:20000423 environmental policy moved from medtop:06000000 environment to medtop:20000621 government policy
  • medtop:20000479 healthcare policy moved from medtop:07000000 health to medtop:20000621 government policy
  • medtop:20000480 government health care moved from medtop:20000479 healthcare policy to medtop:07000000 health
  • medtop:20000483 health insurance moved from medtop:20000479 healthcare policy to medtop:07000000 health
  • medtop:20000690 religious festival and holiday moved from medtop:20000689 religious event to medtop:12000000 religion
  • medtop:20000696 religious ritual moved from medtop:20000689 religious event to medtop:12000000 religion
  • medtop:20001177 Olympic Games moved from medtop:20001123 world games to medtop:20001108 sport event
  • medtop:20001178 Paralympic Games moved from medtop:20001123 world games to medtop:20001108 sport event
  • medtop:20001239 exercise and fitness moved from medtop:10000000 lifestyle and leisure to medtop:20001339 wellness
  • medtop:20001293 streaming service moved from medtop:20000045 mass media to medtop:20000304 media
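For users who keep a local copy of the hierarchy, the moves listed above could be applied along the following lines. This is only a sketch: the `BROADER` mapping and the `apply_moves` helper are hypothetical names invented for illustration, and only three of the moves are shown. The check on the old parent is there so that a stale local copy is reported rather than silently overwritten.

```python
# (concept, old parent, new parent) for three of the moves in the list above.
MOVES = [
    ("medtop:20000423", "medtop:06000000", "medtop:20000621"),
    ("medtop:20001177", "medtop:20001123", "medtop:20001108"),
    ("medtop:20001178", "medtop:20001123", "medtop:20001108"),
]

# Hypothetical local copy of the hierarchy: concept -> broader concept.
BROADER = {
    "medtop:20000423": "medtop:06000000",  # environmental policy
    "medtop:20001177": "medtop:20001123",  # Olympic Games
    "medtop:20001178": "medtop:20001123",  # Paralympic Games
}


def apply_moves(broader: dict[str, str], moves: list[tuple[str, str, str]]) -> dict[str, str]:
    """Return a copy of the mapping with each concept re-parented."""
    updated = dict(broader)
    for concept, old_parent, new_parent in moves:
        if updated.get(concept) != old_parent:
            # The local copy does not match the expected starting point.
            raise ValueError(f"{concept} is not under {old_parent}")
        updated[concept] = new_parent
    return updated


if __name__ == "__main__":
    for concept, parent in apply_moves(BROADER, MOVES).items():
        print(concept, "->", parent)
```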

As always, the Media Topics vocabularies can be viewed in the following ways:

For more information on IPTC NewsCodes in general, please see the IPTC NewsCodes Guidelines.

At the recent IPTC Standards Committee Meeting, NewsML-G2 version 2.30 was approved.

The IPTC NewsML-G2 Generator has also been updated to produce NewsML-G2 2.30-compliant content.

The full NewsML-G2 XML Schema, NewsML-G2 Guidelines document and NewsML-G2 specification document have all now been updated.

The biggest change (Change Request CR00211) is that <catalogRef/> and <catalog/> elements are now optional. This is so that users who choose to use full URIs instead of QCodes do not need to include an unnecessary element.
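To illustrate why a catalog is only needed in QCode mode, here is a minimal sketch of how a QCode is resolved into a full concept URI. The `CATALOG` dictionary and the `expand_qcode` function are hypothetical names used for illustration; in a real NewsML-G2 document the alias-to-scheme mapping would come from the <catalog> or <catalogRef> element. The scheme URI shown for the medtop alias is assumed to be the usual Media Topics one and should be checked against the catalog you actually use.

```python
# Hypothetical in-memory catalog: scheme alias -> scheme URI.
# Assumption: medtop maps to the IPTC Media Topics scheme URI below.
CATALOG = {
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",
}


def expand_qcode(qcode: str, catalog: dict[str, str]) -> str:
    """Resolve 'alias:code' into a full concept URI using the catalog."""
    alias, sep, code = qcode.partition(":")
    if not sep or not alias or not code:
        raise ValueError(f"not a QCode: {qcode!r}")
    if alias not in catalog:
        raise KeyError(f"scheme alias {alias!r} is not in the catalog")
    # The full URI is the scheme URI followed by the code.
    return catalog[alias] + code


if __name__ == "__main__":
    print(expand_qcode("medtop:20000854", CATALOG))
```

A document that already carries the full URI needs no such lookup, which is why the catalog elements can be left out in that case.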

The other user-facing change is CR00212 which adds residrefformat and residrefformaturi attributes to the targetResourceAttributes attribute group, used in <link>, <icon> and <remoteContent>.

Other changes CR00213 and CR00214 aren’t visible to end users and don’t change any functionality, but make the XML Schema easier to read and maintain.

XML Schema documentation of version 2.30 is available on GitHub and at http://iptc.org/std/NewsML-G2/2.30/specification/XML-Schema-Doc-Power/.

NewsML-G2 Generator updated

The NewsML-G2 Generator has been updated to use version 2.30. This means that catalogRef is only included if QCode mode is chosen. The Generator also uses the new layout which means that the target document is updated in real time as the form is completed.

To follow our work on GitHub, please see the IPTC NewsML-G2 GitHub repository.

The full NewsML-G2 change log showing the Change Requests included in each new version is available at the dev.iptc.org site.

A screenshot of a browser showing Bill Kasdorf's latest column. Follow the link to read the full article.
Bill Kasdorf’s article on PublishersWeekly.com discusses the IPTC Photo Metadata Standard’s new properties, Alt Text (Accessibility) and Extended Description (Accessibility).

Bill Kasdorf, IPTC Individual Member, has written about IPTC Photo Metadata in his latest column for Publishers Weekly.

In the article, a double-page spread in the printed version of the 11/22/2021 issue of Publishers Weekly and an extended article online, Bill references Caroline Desrosiers of IPTC Startup member Scribely saying “if publications are born accessible, then their images should be born accessible, as well.”

The article describes how the new Alt Text (Accessibility) and Extended Description (Accessibility) properties in IPTC Photo Metadata can be used to make EPUBs more accessible.

Bill goes on to provide an example, supplied by Caroline Desrosiers, of how an image’s caption, alt text and extended description fulfil very different purposes, and mentions that it’s perfectly fine to leave alt text blank in some cases! For more details, read the article here.

Carl-Gustav Linden of University of Bergen on the use of IPTC metadata as a means of powering AI in journalism, speaking at the JournalismAI Festival on 30 November 2021.

“Metadata is the wheel in the digital business model,” according to Carl-Gustav Linden of University of Bergen in Norway. “We can use it to combine the right content with the right readers, listeners and viewers. That’s why metadata is so essential.”

Professor Linden was speaking at the JournalismAI Festival taking place this week, hosted by the Polis think-tank at the London School of Economics and Political Science. The JournalismAI project is a collaboration between POLIS and newsrooms and institutes around the world, funded by the Google News Initiative.

We are very happy to see several mentions of IPTC standards and IPTC members, particularly the New York Times and iMatrics. The New York Times is seen as a forerunner in content classification, with Jennifer Parrucci (lead of the IPTC NewsCodes Working Group) giving presentations recently about their work. iMatrics supplies an automated content classification system based on IPTC Media Topics which can be used as part of editorial workflows.

One thing we would like to note is that Professor Linden mentions that the IPTC vocabularies are influenced by our background in US-based news organisations, citing an example of the schools terms being focussed on the US system. We are happy to say that in a recent update to IPTC Media Topics we clarified our terms around school systems, making the label names and descriptions much more generic and based on the international schools classifications.

This change was the result of many IPTC member organisations working together from different parts of the world, including Scandinavia, to come to a result that hopefully works for everyone (and of course, each user of Media Topics is welcome to extend the vocabulary for their own purposes if necessary). This is an example of the great work that takes place when our members work together.

The JournalismAI festival continues until Friday this week. All sessions from the festival are available on YouTube.

Thanks again to Polis and the JournalismAI team for giving us a mention!

Screenshot of a code editor showing an extract of a news story in ninjs 2.0 format

Today, IPTC announces the release of version 2.0 of the news industry’s standard for exchanging content in JSON: ninjs.

The new version introduces a completely new way of declaring multiple headlines, body texts and description fields, which is compatible with binary data serialisation formats such as Avro and Protocol Buffers.

“We are very excited about releasing the 2.0-version of News in JSON (ninjs),” says Johan Lindgren (TT), lead of the working group responsible for developing the standard. “When working on improving the 1.3 version, we realised that a number of suggestions would mean breaking changes and after some consideration we took that step. Now we have a version of ninjs that is better suited for APIs, databases like Elastic and conversion to binary methods like Protocol Buffers.”

The IPTC News in JSON Working Group has kept the original focus on two main use cases: data in transit and data at rest.

In recent years, more systems have started to convert from JSON formats into binary data serialisation protocols such as Avro and Protocol Buffers for data in transit. However, ninjs 1.x couldn’t be converted into these protocols because keys could be defined dynamically, for example “headline_main” and “headline_subhead”. In ninjs 2.0, all properties are given well-defined names, so they can be converted into Protocol Buffers schemas. The GitHub repository for ninjs now includes a demonstration of how ninjs 2.0 can be used with Protocol Buffers.
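The difference between dynamic keys and fixed property names can be sketched roughly as follows. This is an illustration only: the `collect_roles` helper is a hypothetical name, and the `headlines`, `role` and `value` property names are assumptions about the 2.0 shape that should be checked against the published ninjs 2.0 JSON Schema before use.

```python
def collect_roles(item: dict, prefix: str) -> list[dict]:
    """Gather 1.x-style keys such as 'headline_main' into a list of
    objects whose property names are fixed ('role' and 'value')."""
    entries = []
    for key, value in item.items():
        if key == prefix:
            # Plain 'headline' with no role suffix.
            entries.append({"value": value})
        elif key.startswith(prefix + "_"):
            # The suffix after 'headline_' becomes the role.
            entries.append({"role": key[len(prefix) + 1:], "value": value})
    return entries


# A 1.x-style item: the set of keys is open-ended, so no fixed schema fits.
old_item = {
    "uri": "http://example.com/item/1",
    "headline_main": "Main headline",
    "headline_subhead": "A secondary headline",
}

# A 2.0-style item: one fixed property holding a list of uniform objects.
new_item = {
    "uri": old_item["uri"],
    "headlines": collect_roles(old_item, "headline"),
}

if __name__ == "__main__":
    print(new_item)
```

Because every object in the list has the same small set of property names, the structure can be described once in a schema, which is what binary formats such as Avro and Protocol Buffers require.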

Other tools included in the repository are an example GraphQL server for ninjs and example XSLTs to convert from IPTC XML-based formats like NewsML and NITF.

The ninjs Generator tool has been updated to create ninjs 2.0. In fact, using the tool, users can switch between generating ninjs 1.3 and ninjs 2.0 output at the click of a radio button.

The official location of the ninjs 2.0 JSON Schema is https://iptc.org/std/ninjs/ninjs-schema_2.0.json.

A full list of the changes in ninjs 2.0 can be viewed in section 7.5 of the ninjs User Guide.

We had a great IPTC Photo Metadata Conference last week, focussing on accessibility, interoperability and authenticity.

Videos of all sessions are embedded in this post. Videos are also available from the event page. All videos have subtitles available – just click the “CC” button in the YouTube toolbar at the bottom of each video.

We started off with an introduction from IPTC Managing Director, Brendan Quinn:

Accessibility and “Born Accessible Content”

We then went into the first session, where David Riecks, co-lead of the IPTC Photo Metadata Working Group, introduced the new accessibility properties in the IPTC Photo Metadata Standard:

Next up was Sam Joehl of Level Access, who gave a fascinating presentation showing how a screen-reader application deals with images on the web, showcasing the need for good alternative text and image descriptions:

Next was a panel moderated by Caroline Desrosiers of Scribely, entitled Making Images Accessible Across Industries: How Does it Work and What’s Next? Speakers included James Tiller, Cailin Meyer and Rebecca Snyder of the Smithsonian Institution, Rachel Comerford from Macmillan Learning and Jon Sasala from Morey Creative Studios. The subject matter ranged from Smithsonian’s image description guidelines for scientific research to Macmillan’s “Born Accessible Content” initiative to the problems with “overlay” software that attempts to write alt text automatically. View the session here:

The next panel was moderated by David Riecks, and focused on “Image Accessibility Behind the Scenes: Metadata, DAMs, and Workflows.” Speakers were Andrew Kirkpatrick, Director of Accessibility at Adobe, Margaret Warren, founder of ImageSnippets, and Janos Farkas, CEO of CLink Media. This session looked at the implementor’s view and covered issues around user interfaces, ensuring metadata stays with images throughout their lifecycles, and of course asked when the new accessibility properties would be available in Adobe products!

Interoperability with Michael Steidl

Next up we moved on from accessibility to the second theme of the day, Interoperability. Michael Steidl, the other co-lead of the IPTC Photo Metadata Working Group, demonstrated IPTC’s Photo Metadata Interoperability Tests, new tools to allow users and vendors to test the capabilities of image management software, and compare their metadata handling to the IPTC Photo Metadata Standard specification.

Authenticity, CAI and C2PA

The third theme of the day was authenticity. We invited Santiago Lyon, Head of Advocacy and Education for the Adobe-led Content Authenticity Initiative, to speak about the CAI and its sister project, C2PA (the Coalition for Content Provenance and Authenticity). We looked at some details around how C2PA technology will fulfil the requirements of CAI to provide tamper-evident images and videos.

Finally, Brendan gave some closing comments and discussed the details that we know so far about next year’s event. He also encouraged everyone to join the Friends of IPTC Newsletter, so that they can be the first to hear about next year’s event!

As part of our series highlighting speakers at this week’s Photo Metadata Conference, we are very happy to showcase the panellists who will be speaking at the second session: Making Images Accessible Across Industries: How Does it Work and What’s Next?

Don’t forget to register for the event which is less than two days away!

Headshot of Kate Sherwood of Smithsonian, panellist at this year's IPTC Photo Metadata Conference.

James Tiller

Photographer, Smithsonian Institution, National Museum of Natural History

James Tiller is a Biological Anthropologist and Photographer, which has led them on field expeditions around the world and on hundreds of photoshoots of human skeletal remains related to prehistoric and historic archaeological contexts and forensic cases for the Smithsonian’s National Museum of Natural History (NMNH). Since 2017, James has produced over ten thousand images documenting the Smithsonian’s collections, exhibitions, museum staff, and research, which have been featured on the front page of The Washington Post and appeared in The New York Times, NPR, and many other news outlets and scientific publications. As a disabled photographer, James strives to increase the accessibility of museum collections and research, especially for those who have been historically marginalised.

Cailin Meyer

Headshot of Cailin Meyer, panellist at the 2021 IPTC Photo Metadata Conference.

Museum Collections Technician, Smithsonian Institution, National Museum of Natural History

Cailin Meyer is a Collections Technician at the National Museum of Natural History. Working with all types of natural history collections, Cailin specialises in disaster response and training, biohazard concerns, and increasing digital accessibility for individuals in museum spaces. Cailin is a co-chair of the Smithsonian Institution’s DEAI working group, and actively works to increase resources for and understanding of digital accessibility concerns amongst Smithsonian staff. She has most recently tackled the unique issue of designing image description guidelines geared towards scientific and natural history specimens, working alongside James Tiller and Rebecca Snyder. Cailin’s background is in zooarchaeology, human and comparative anatomy, and dissection techniques. She earned her MA in Museum Studies from University of Kansas, her MA in Zooarchaeology from Illinois State University, and her BFA from Rhodes College.

Rebecca Snyder

Headshot of Rebecca Snyder, panellist at the 2021 IPTC Photo Metadata Conference.

Informatics Branch Chief, Smithsonian Institution, National Museum of Natural History

Rebecca Snyder is the Informatics Branch Chief at the National Museum of Natural History, Smithsonian Institution (SI). Rebecca is responsible for the digital stewardship and preservation of collections and research data. Recent projects include the application of persistent identifiers for SI collections data and media, the Smithsonian Open Access initiative where she was responsible for designing the data flows between all SI collections systems, assisting in the development of a system of record for 3D data, and data quality improvement projects adhering to FAIR data principles. She is also a member of the Audubon Core data standard maintenance group, focusing on creating standards for the sharing of 3D data.

Rachel Comerford, Macmillan Learning

Headshot of Rachel Comerford, panellist at the 2021 IPTC Photo Metadata Conference.

Senior Director of Accessibility Outreach and Communication, Macmillan Publishing

Rachel Comerford is the Senior Director of Accessibility Outreach and Communication at Macmillan Learning where she leads cross-functional efforts to ensure students of all abilities have access to their course materials. In 2020, BISG awarded Rachel the Industry Innovator award for her work helping Macmillan Learning to become the first Global Certified Accessible publisher by Benetech. Under her leadership, Macmillan was recognised by WIPO’s Accessible Book Consortium with the International Excellence Award for Accessible Publishing in 2020 for their work towards providing educational materials that any student can use. Rachel has over a decade of experience in the print and digital publishing world. Prior to coming to Macmillan Learning as an editor, she held a variety of editorial and sales positions at WW Norton and Pearson.

Jon Sasala

President, Morey Creative Studios

Jon Sasala is president of Morey Creative Studios, a New York-based HubSpot Partner Agency specialising in inbound B2B marketing, content development, web design, lead generation and sales support. He joined the Morey team in 2001 as a graphic designer, and has grown with the organisation throughout the last two decades to his current position. Jon heads the HubSpot User Group for New York City and hosts the ‘Inbound & Down’ podcast. In addition to his agency responsibilities, Jon co-founded InclusionHub.com, an online database, resource nexus, and community designed to help businesses make better decisions around web accessibility and digital inclusion. Closely connected with the design, programming and content side of web development—coupled with a comprehensive understanding of business operations—he has a unique perspective on the importance of web accessibility for companies operating in the digital world.

Caroline Desrosiers, moderator of a panel at the 2021 IPTC Photo Metadata Conference.

The panel will be moderated by Caroline Desrosiers, Founder and CEO of IPTC Startup Member Scribely. Scribely is a company on a mission to make images and videos more accessible to blind and visually-impaired people and more discoverable to search engines. Scribely’s team of expert writers specialise in writing alt text for images and audio description for videos, helping digital media providers create born-accessible visual content for a more inclusive, equitable, and sustainable world. Before starting Scribely, Caroline worked for a global academic digital publisher, SAGE Publishing, where she led a working group to improve the accessibility of interactive eBooks. Caroline is also the Co-Host of Say My Meme, a podcast that describes the internet’s best memes for people who cannot see them.

Accessibility features including live closed captions will be used at the event.

For those in timezones where the timing is inconvenient, please go ahead and register anyway – you will be sent a link to the event recordings afterwards.

There’s still time to register. Attendance is free of charge for anyone – there is no requirement to be an IPTC Member to join.

For full schedule details and a link to register, see the event page.