Welcome to 2022! We thought a good way to kick off the new year would be to share the text of the speech given by our new Standards Committee Chair, Paul Harman of Bloomberg, at the IPTC Standards Committee meeting on 20 October 2021. In this piece, Paul does a particularly good job of explaining IPTC’s mission and calls on all IPTC members to participate in our standards work.

The IPTC was founded to secure fair access to modern telecommunications infrastructure. Using satellite technology would enable news providers and distributors to report from conflict zones, or from the other side of the world, at greater speed and with less risk of disruption from regional disputes or actions which could affect landline alternatives.

This extract from a 1967 IPTC newsletter illustrates the early work of the IPTC in securing access to telegraph and satellite lines for the news industry.

Once such access was secured, they had to decide how to use it. News agencies required technical standards for information interchange, and that’s what IPTC set out to provide, in the name of interoperability. It’s a remit we continue to carry out today. Organisations both inside the news technology arena, and outside, look to IPTC for guidance on media metadata; IPTC is perhaps best known for the Photo Metadata standards that were incorporated into Adobe products, and from there across the photo ecosystem.

Today we face a different problem: not a lack of standards, but an over-abundance of them; and alongside that, regular misuse – or lack of use – of the standards we actually have. As the popular XKCD comic highlights, the solution isn’t to create “one new standard to rule them all”, as this just perpetuates the problem. Increasingly the activities of our Working Groups are about documenting how to use the standards – IPTC and external – that already exist, and how to map between them. 

The classic XKCD comic illustrating the problem of trying to create “one new standard to rule them all”. Source: https://xkcd.com/927/ (licensed under CC-BY-NC).

To do this, we need an understanding of what news is, and what each step in the workflow is trying to achieve. We must step away from the bits and bytes of transfer protocols, and instead examine the semantics of news – define an abstract data model representing the concepts in news collection, curation, distribution and feedback, and how those concepts inter-relate – separating the meaning of the metadata from the mechanics of how they are expressed. Only then can we successfully reflect that understanding back into whatever formats our members can use based on the constraints they are operating within.

New protocols and representations evolve all the time: SGML, XML, JSON, YAML, Turtle, Avro, protobuf… they are just serialisation formats. It shouldn’t have to matter whether you choose schema.org or rNews or RDFa or microdata or JSON-LD to embed metadata into HTML; what matters is a consistency of meaning, regardless of the mechanism.
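As a rough illustration of separating meaning from mechanics, here is a minimal sketch in Python. The `NewsItem` class, its three fields and the XML element names are hypothetical, invented only for this example; they are not taken from NewsML-G2, ninjs or any other IPTC standard. The sketch defines one abstract model, writes it out as JSON and as XML, and checks that both representations read back to the same meaning.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class NewsItem:
    """Hypothetical abstract model: it records meaning, not format details."""
    headline: str
    byline: str
    subjects: List[str]


def to_json(item: NewsItem) -> str:
    # One possible serialisation: a flat JSON object.
    return json.dumps(asdict(item), sort_keys=True)


def from_json(text: str) -> NewsItem:
    return NewsItem(**json.loads(text))


def to_xml(item: NewsItem) -> str:
    # Another possible serialisation: child elements under an assumed root name.
    root = ET.Element("newsItem")
    ET.SubElement(root, "headline").text = item.headline
    ET.SubElement(root, "byline").text = item.byline
    for subject in item.subjects:
        ET.SubElement(root, "subject").text = subject
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> NewsItem:
    root = ET.fromstring(text)
    return NewsItem(
        headline=root.findtext("headline"),
        byline=root.findtext("byline"),
        subjects=[el.text for el in root.findall("subject")],
    )


item = NewsItem("Standards meeting held", "A. Reporter", ["standards", "media"])

# Both round trips recover the same abstract item, whatever the wire format.
assert from_json(to_json(item)) == item
assert from_xml(to_xml(item)) == item
```

Adding a further format in this sketch would mean writing one more pair of functions, while the model itself, and therefore the meaning, stays untouched.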

Our Working Groups are already doing this, to a greater or lesser extent. The Video Metadata Hub is precisely an abstract model that defines serialisations into existing formats. The Photo Metadata Standard grew out of IIM and XMP work and describes the serialisations into, and necessary synchronisations between, current and future photo metadata standards. The News in JSON Working Group is attempting to map the same data model across JSON, Avro and Protocol Buffers, based on the News Architecture which was conceived as a data model but quickly became defined via its expression in XML, namely NewsML-G2. The Sports Content Working Group is currently working on taking the semantics from SportsML and SportsJS and re-expressing them in terms of RDF. For machine-readable rights, IPTC worked with the W3C on ODRL and used it as the basis for RightsML. And the NewsCodes Working Group is taking the Media Topics scheme and mapping it to Wikidata, which could be used as a lingua franca between any classification systems.

But this work is far from trivial, and requires continuous effort. IPTC is a member organisation, and it is through the time volunteered by delegates and their organisations that the work progresses. IPTC has but one member of staff – Brendan – who does a huge amount of work across all of our standards, but he also needs to run the business. Therefore we need your help to create and maintain our standards for the benefit of your businesses. Please join the working group sessions, or recommend somebody from your organisation to get involved, in the areas of interest to you and your organisation.

In particular, we have heard again at this meeting the need for machine-readable rights. The standard exists, in the form of RightsML. What it needs now is tooling to support the standard, a user guide with use cases, and potentially some how-tos or templates – similar maybe to Creative Commons licences – that cover the majority of typical scenarios. At most meetings, we hear from members how crucial machine-readable rights are to effective workflows in their business, but the Working Group is currently without a lead. If you work at a member organisation that would benefit, please consider volunteering to participate in this group.

I would remind the Working Groups that IPTC has provision in the budget for technical authoring and software development – so I would encourage you to propose to the Board how you might use that. We can then decide where to spend, and also use this as input on future budgets. Let the Board know how we can help and support you.

I’d like to close by thanking the Working Group Leads, and their organisations, for so generously giving of their time: Dave, Jennifer, Johan, Paul, Michael and Pam. Special thanks to David Riecks for agreeing to co-chair the Photo Metadata group, and to Brendan for his support and development work on tools such as the Generators and Unit Testing frameworks. Thanks also to Kelvin Holland, our technical author, for his work on the NewsML-G2 Specification and User Guide. And thanks to the members of all of the working groups for their efforts on our standards which play such a crucial role in the newstech industry.

Thank you.

Paul Harman
Chair, IPTC Standards Committee
20 October 2021

This is part of a series of posts about the IPTC Autumn Meeting 2019 in Ljubljana. See separate posts about Day 1 and Day 2.

Day 3 of the Ljubljana Meeting was where we got down to business: we hosted the 2019 IPTC Annual General Meeting, where Voting Members get to have their say on how the organisation is run, voting on the budget and the Board.

We are very happy to announce that Jennifer Parrucci of the New York Times and Paul Harman of Bloomberg were both voted as new IPTC Board members and Directors of the company, and that Robert Schmidt-Nia of DPA was voted as Chair of the IPTC Board.

On behalf of all IPTC members we would like to say thanks, and welcome!

Then we had the IPTC Standards Committee meeting where we approved the latest versions of two of our key standards: IPTC Photo Metadata Standard 2019.1 and ninjs 1.2. Stay tuned for more information on both of those very soon!

In the afternoon we had a visit to the Triglav Lab, where we were able to see some interesting developments in the Slovenian tech scene related to media. Thanks again to Aljoša Rehar from STA and Marko Grobelnik from Jožef Stefan Institute for their help in organising the afternoon.

We heard from:

  • Event Registry, a “news intelligence platform” that semantically tags content to identify meaning and disambiguate between potential meanings for the same words, allowing for cross-language tagging and clustering to create “news events” across articles in multiple languages
  • Embeddia, a JSI project in partnership with STT Finland and Ekspress meedia in Estonia, looking at using structured data to create “intelligence augmentation”, as opposed to AI, helping journalists to do their jobs better. It also focuses on less-popular languages such as Slovene, Finnish and Estonian where some standard AI tools don’t work as well.
  • Internal projects at STA including an article tracker which allows the agency to track usage of their content, even if it is modified; a tool called D4 which allows STA to see which stories they have been missing and how they can find new stories; and Newsmapper which provides analytics about STA’s own content – where was it popular, and with what types of readers?
  • Behaviour Exchange, a user analytics tracker that allows publishers to understand their users’ demographics and preferences, including their own ad network, rather than outsourcing their user analytics to the big platforms.
  • A representative from Blockchain Think Tank Slovenia spoke about media implications of blockchain and what it could do for the media industry.
  • Finspektor showed how they are using open data to analyse how the Slovenian government is spending their taxpayers’ money
  • and Parlameter showed how they are bringing politicians to account, showing their activity in parliament, on social media and in the media.

IPTC members visiting Postojna Cave, Slovenia.

It was a great way to end a fascinating three-day event!

And on the Thursday, some of us went on a networking day to Postojna Cave, where we saw the beautiful tunnels and some very impressive underground scenery. Some of us also went on to the stunning Lake Bled, which we would definitely recommend!

Thanks again to Aljoša, Nejc, Marjana, Marko and the team from STA and JSI for helping to organise the event.