Last week, the IPTC Spring Meeting 2024 brought media industry experts together for three days in New York City to discuss many topics including AI, archives and authenticity.
Hosted by both The New York Times and Associated Press, over 50 attendees from 14 countries participated in person, with another 30+ delegates attending online.
As usual, the IPTC Working Group leads presented a summary of their most recent work, including a new release of NewsML-G2 (version 2.34, which will be released very soon); forthcoming work on ninjs to support events, planned news coverage and live streamed video; updates to NewsCodes vocabularies; more evangelism of IPTC Sport Schema; and further work on Video Metadata Hub, the IPTC Photo Metadata Standard and our emerging framework for a simple way to express common rights statements using RightsML.
We were very happy to hear many IPTC member organisations presenting at the Spring Meeting. We heard from:
- Anna Dickson of recently-joined member Google talked about their work with IPTC in the past and discussed areas where we could collaborate in the future
- Aimee Rinehart of Associated Press presented AP’s recent report on the use of generative AI in local news
- Scott Yates of JournalList gave an update on the trust.txt protocol
- Andreas Mauczka, Chief Digital Officer at Austria Press Agency APA presented on APA’s framework for use of generative AI in their newsroom
- Drew Wanczowski of Progress Software gave a demonstration of how IPTC standards can be implemented in Progress’s tools such as Semaphore and MarkLogic
- Vincent Nibart and Geert Meulenbelt of new IPTC Startup Member Kairntech presented on their recent work with AFP on news categorisation using IPTC Media Topics and other vocabularies
- Mathieu Desoubeaux of IPTC Startup Member IMATAG presented their work, also with AFP, on watermarking images for tracking and metadata retrieval purposes
In addition we heard from guest speakers:
- Jim Duran of the Vanderbilt TV News Archive spoke about how they are using AI to catalog and tag their extensive archive of decades of broadcast news content
- John Levitt of Elvex spoke about their system which allows media organisations to present a common interface (web interface and developer API) to multiple generative AI models, including tracking, logging, cost monitoring, permissions and other governance features which are important to large organisations using AI models.
- Toshit Panigrahi, co-founder of TollBit spoke about their platform for “AI content licensing at scale”, allowing content owners to establish rules and monitoring around how their content should be licensed for both the training of AI models and for retrieval-augmented generation (RAG)-style on-demand content access by AI agents.
- We also heard an update about the TEMS – Trusted European Media Data Space project.
We were also lucky enough to take tours of the Associated Press Corporate Archive on Tuesday and the New York Times archive on Wednesday. Valerie Komor of AP Corporate Archives and Jeff Roth of The New York Times Archival Library (known to staffers as “the morgue”) both gave fascinating insights and stories about how both archives preserve the legacy of these historically important news organisations.
Brendan Quinn, speaking for Judy Parnall of the BBC, also presented an update of the recent work of C2PA and Project Origin and introduced the new IPTC Media Provenance Committee, dedicated to bringing C2PA technology to the news and media industry.
On behalf of all attendees, we would like to thank The New York Times and Associated Press for hosting us, and especially to thank Jennifer Parrucci of The New York Times and Heather Edwards of The Associated Press for their hard work in coordinating use of their venues for our meeting.
The next IPTC Member Meeting will be the 2024 Autumn Meeting, which will be held online from Monday September 30th to Wednesday October 2nd, and will include the 2024 IPTC Annual General Meeting. The Spring Meeting 2025 will be held in Western Europe at a location still to be determined.
As we wrap up 2023, we thought it would be useful to give you an update on the IPTC’s work in 2023, including updates to most of our standards.
Two successful member meetings, one in person!
This year we finally held our first IPTC Member meeting in person since 2019, in Tallinn, Estonia. Around 30 people attended in person and 50 attended online, from over 40 organisations. Presentations and discussions ranged from the e-Estonia digital citizen experience to building re-usable news content widgets with Web Components, and of course included generative AI, credibility and fact checking, and more. Here’s our report on the IPTC 2023 Spring Meeting.
For our Autumn Meeting we went back to an online format, with over 50 attendees, and more watching the recordings afterwards (which are available to all members). Along with discussions of generative AI and content licensing at this year’s meetings, it was great to hear the real-world implementation experience of the ASBU Cloud project from the Arab States Broadcasting Union. The system was created by IPTC members Broadcast Solutions, based on NewsML-G2. The DPP Live Production Exchange, led by new members Arqiva, will be another real-world implementation coming soon. We heard about the project’s first steps at the Autumn Meeting.
Also at this year’s Autumn Meeting we heard from Will Kreth of the HAND Identity platform and saw a demo of IPTC Sport Schema from IPTC member Progress Software (previously MarkLogic). More on IPTC Sport Schema below! All news from the Autumn Meeting is summed up in our post AI, Video in the cloud, new standards and more: IPTC Autumn Meeting 2023.
We’re very happy to say that the IPTC Spring Meeting 2024 will be held in New York from April 15 – 17. All IPTC member delegates are welcome to attend the meeting at no cost. If you are not a member but would like to present your work at the meeting, please get in touch using our Contact Us form.
IPTC Photo Metadata Conference, 7 May 2024: save the date!
Due to several issues, we were not able to run a Photo Metadata Conference in 2023, but we will be back with an online Photo Metadata Conference on 7th May 2024. Please mark the date in your calendar!
As usual, the event will be free and open for anyone to attend.
If you would like to present to the people most interested in photo metadata from around the world, please let us know!
Presentations at other conferences and work with other organisations
IPTC was represented at the CEPIC Congress in France, the EBU DataTech Seminar in Geneva, Sports Video Group Content Management Forum in New York and the DMLA’s International Digital Media Licensing Conference in San Francisco.
We also worked with CIPA, the organisation behind the Exif photo metadata standard, on aligning Exif with IPTC Photo Metadata, and supported them in their work towards Exif 3.0 which was announced in June.
The IPTC will be advising the TEMS project which is an EU-funded initiative to build a “media data space” for Europe, and possibly beyond: IPTC working with alliance to build a European Media Data Space.
IPTC’s work on Generative AI and media
Of course the big topic for media in 2023 has been Generative AI. We have been looking at this topic for several years, since it was known as “synthetic media” and back in 2022 we created a taxonomy of “digital source types” that can be used to describe various forms of machine-generated and machine-assisted content creation. This was a joint effort across our NewsCodes, Video Metadata and Photo Metadata Working Groups.
It turns out that this was very useful, and the IPTC Digital Source Type taxonomy has been adopted by Google, Midjourney, C2PA and others as a way to describe content. Here are some of our news posts from 2023 on this topic:
- IPTC publishes metadata guidance for AI-generated “synthetic media”
- Google announces use of IPTC metadata for generative AI images
- Midjourney and Shutterstock AI sign up to use of IPTC Digital Source Type to signal generated AI content
- Microsoft announces signalling of generative AI content using IPTC and C2PA metadata
- Royal Society/BBC workshop on Generative AI and content provenance
- New “digital source type” term added to support inpainting and outpainting in Generative AI
- IPTC releases technical guidance for creating and editing metadata, including DigitalSourceType
IPTC’s work on Trust and Credibility
After a lot of drafting work over several years, we released the Guidelines for Expressing Trust and Credibility signals in IPTC standards, which show how to embed trust information in the form of “trust indicators” such as those from The Trust Project into content marked up using IPTC standards such as NewsML-G2 and ninjs. The guideline also discusses how media can be signed using the C2PA specification.
We continue to work with C2PA on the underlying specification allowing signed metadata to be added to media content so that it becomes “tamper-evident”. However, the C2PA specification in its current form does not prescribe where the certificates used for signing should come from. To that end, we have been working with Microsoft, BBC, CBC / Radio Canada and The New York Times on the Steering Committee of Project Origin to create a trust ecosystem for the media industry. Stay tuned for more developments from Project Origin during 2024.
IPTC’s newest standard: IPTC Sport Schema
After years of work, the IPTC Sports Content Working Group released version 1.0 of IPTC Sport Schema. IPTC Sport Schema takes the experience of IPTC’s 10+ years of maintaining the XML-based SportsML standard and applies it to the world of the semantic web, knowledge graphs and linked data.
Paul Kelly, Lead of the IPTC Sports Content Working Group, presented IPTC Sport Schema to the world’s top sports media technologists: IPTC Sport Schema launched at Sports Video Group Content Management Forum.
Take a look at our dedicated site https://sportschema.org/ to see how it works, look at some demonstration data and try out a query engine to explore the data.
If you’re interested in using IPTC Sport Schema as the basis for sports data at your organisation, please let us know. We would be very happy to help you to get started.
Standard and Working Group updates
- Our IPTC NewsCodes vocabularies had two big updates, the NewsCodes 2023-Q1 update and the NewsCodes Q3 2023 update. For our main subject taxonomy Media Topics, over the year we added 12 new concepts, retired 73 under-used terms, and modified 158 terms to make their labels and/or descriptions easier to understand. We also added or updated vocabularies such as Digital Source Type and Authority Status.
- The News in JSON Working Group released ninjs 2.1 and ninjs 1.5 in parallel, so that people who cannot move from the 1.x schema can still get the benefits of new additions. The group is currently working on adding events and planning items to ninjs based on requirements from the DPP Live Production Exchange project: expect to see something released in 2024.
- NewsML-G2 2.32 and NewsML-G2 v2.33 were released this year, including support for Generative AI via the Digital Source Type vocabulary.
- The IPTC Photo Metadata Standard 2023.1 allows rightsholders to express whether or not they are willing to allow their content to be indexed by search engines and data mining crawlers, and whether the content can be used as training data for Generative AI. This work was done in partnership with the PLUS Coalition. We also updated the IPTC Photo Metadata Mapping Guidelines to accommodate Exif 3.0.
- Through discussions and workshops at our Member Meetings in 2022 and 2023, we have been working on making RightsML easier to use and easier to understand. Stay tuned for more news on RightsML in 2024.
- Video Metadata Hub 1.5 adds the same properties to allow content to be excluded from generative AI training data sets. We have also updated the Video Metadata Hub Generator tool to generate C2PA-compliant metadata “assertions”.
New faces at IPTC
Ian Young of Alamy / PA Media Group stepped up to become the lead of the News in JSON Working Group, taking over from Johan Lindgren of TT who is winding down his duties but still contributes to the group.
We welcomed Bonnier News, Newsbridge, Arqiva, the Australian Broadcasting Corporation and Neuwo.ai as new IPTC members, plus a very well known name who will be joining at the start of 2024. We’re very happy to have you all as members!
We are always happy to work with more organisations in the media and related industries. If you would like to talk to us about joining IPTC, please complete our membership enquiry form.
Here’s to a great 2024!
Thanks to everyone who gave IPTC your support, and we look forward to working with you in the coming year.
If you have any questions or comments (and especially if you would like to speak at one of our events in 2024!), you can contact us via our contact form.
Best wishes,
Brendan Quinn
Managing Director, IPTC
and the IPTC Board of Directors: Dave Compton (LSE Group), Heather Edwards (The Associated Press), Paul Harman (Bloomberg LP), Gerald Innerwinkler (APA), Philippe Mougin (Agence France-Presse), Jennifer Parrucci (The New York Times), Robert Schmidt-Nia of DATAGROUP (Chair of the Board), Guowei Wu (Xinhua)
The IPTC is proud to announce that after intense work by most of its Working Groups, we have published version 1.0 of our guidelines document: Expressing Trust and Credibility Information in IPTC Standards.
The culmination of a large amount of work over the past several years across many of IPTC’s Working Groups, the document represents a guide for news providers as to how to express signals of trust known as “Trust Indicators” into their content.
Trust Indicators are ways that news organisations can signal to their readers and viewers that they should be considered as trustworthy publishers of news content. For example, one Trust Indicator is a news outlet’s corrections policy: whether the news outlet provides (and follows) a clear guideline regarding when and how it updates its news content.
The IPTC guideline does not define these trust indicators: they were taken from existing work by other groups, mainly the Journalism Trust Initiative (an initiative from Reporters Sans Frontières / Reporters Without Borders) and The Trust Project (a non-profit founded by Sally Lehrman of UC Santa Cruz).
The first part of the guideline document shows how trust indicators created by these standards can be embedded into IPTC-formatted news content, using IPTC’s NewsML-G2 and ninjs standards which are both widely used for storing and distributing news content.
The second part of the IPTC guidelines document describes how cryptographically verifiable metadata can be added to media content. This metadata may express trust indicators but also more traditional metadata such as copyright, licensing, description and accessibility information. This can be achieved using the C2PA specification, which implements the requirements of the news industry via Project Origin and of the wider creative industry via the Content Authenticity Initiative. The IPTC guidelines show how both IPTC Photo Metadata and IPTC Video Metadata Hub metadata can be included in a cryptographically signed “assertion”.
We expect these guidelines to evolve as trust and credibility standards and specifications change, particularly in light of recent developments in signalling content created by generative AI engines. We welcome feedback and will be happy to make changes and clarifications based on recommendations.
The IPTC sends its thanks to all IPTC Working Groups that were involved in creating the guidelines, and to all organisations who created the trust indicators and the frameworks upon which this work is based.
Feedback can be shared using the IPTC Contact Us form.
The IPTC Standards Committee is happy to announce that ninjs, IPTC’s schema for marking up news content in JSON, has been revised to versions 2.1 and 1.5.
The vote to approve the new versions was taken at the recent IPTC Spring Meeting in Tallinn, Estonia and online.
This is in keeping with IPTC’s decision to maintain two parallel versions of ninjs: one for those who can’t upgrade to the 2.x version for backwards compatibility reasons, and one for those who prefer the simpler structure of ninjs 2.x that is easier to handle in some tools.
The ninjs User Guide has been updated to reflect the changes, which are summarised below.
ContactInfo added to ninjs 1.5 and 2.1
ninjs 2.1 and ninjs 1.5 both include the new contactinfo structure which can be used in the people, organisations, places and infosources properties (and their ninjs 1.x equivalents person, organisation, place and infosource).
The contactinfo structure can contain physical or online contact information such as a street address or postal address, a username on social media such as Twitter, Instagram or TikTok, or even a locator such as what3words.
Here are some examples of how the contactinfo property can be used:

"people": [
  {
    "name": "Jonas Svensson",
    "contactinfo": [
      {
        "type": "phone",
        "role": "work",
        "value": "+46 (0)8-7887500"
      }
    ]
  }
],
"organisations": [
  {
    "name": "International Committee of the Red Cross",
    "contactinfo": [
      {
        "type": "web",
        "value": "https://www.icrc.org/"
      },
      {
        "type": "address",
        "address": {
          "lines": [
            "19 Avenue de la paix",
            "1202 Geneva",
            "Switzerland"
          ]
        }
      },
      {
        "type": "telephone",
        "value": "+41 22 734 60 01"
      }
    ]
  }
]
Better support for organisation identifiers such as tickers, ISIN etc
ninjs 2.1 and 1.5 also include the new symboltype and symbol properties under symbols. The symboltype value can be any URI describing the type of the symbol. The CV http://cv.iptc.org/newscodes/financialinstrumentsymboltype is recommended.
The ticker sub-property under symbols is now deprecated. This means that it can still be used if necessary, but use is not recommended.
We now recommend that ticker symbols are stored using symbol="TCKR" and symboltype="https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker".
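The recommendation above can be sketched as a small migration step. This is a minimal sketch in Python and not part of any IPTC tooling: the function name migrate_ticker is hypothetical, and it assumes that each entry under symbols is a plain JSON object that may still carry the deprecated ticker key. Only the symbol and symboltype names and the Ticker URI are taken from the text above.

```python
# URI recommended in the text above for ticker symbols.
TICKER_SYMBOLTYPE = (
    "https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker"
)


def migrate_ticker(entry):
    """Return a copy of one `symbols` entry that uses symbol/symboltype.

    Hypothetical helper: an entry without a `ticker` key, or one that
    already carries a `symbol`, is returned as an unchanged copy.
    """
    migrated = dict(entry)  # never modify the caller's object
    if "ticker" in migrated and "symbol" not in migrated:
        migrated["symbol"] = migrated.pop("ticker")
        migrated["symboltype"] = TICKER_SYMBOLTYPE
    return migrated


old_entry = {"ticker": "TCKR"}  # "TCKR" is the placeholder used in the text
new_entry = migrate_ticker(old_entry)
print(new_entry)
```

Because ticker is deprecated rather than removed, a consumer may still meet both forms for some time, so the sketch leaves entries that already use symbol untouched.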
Better support for machine classification
The subjects (ninjs 2.x) / subject (ninjs 1.x) properties now allow for the sub-properties creator, relevance and confidence.
This allows organisations to more accurately use machine-generated subject tags in their content, while stating that it was created by a machine (using the creator property), and giving numerical values for the relevance and confidence scores that are reported by machine tagging engines. (Of course, these properties can also be used for human-created subject tags if necessary!)
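As an illustration of how a consumer might use these sub-properties, here is a minimal sketch in Python. The helper names, the creator value "example-tagger", the 0 to 100 score scale and the use of name and uri keys on each subject are all assumptions made for this example and are not taken from the ninjs schema text above. Only creator, relevance and confidence are named in this post.

```python
MEDIATOPIC = "http://cv.iptc.org/newscodes/mediatopic/"


def machine_subject(name, uri, creator, relevance, confidence):
    """Build one subject entry as a tagging engine might emit it (hypothetical)."""
    return {
        "name": name,
        "uri": uri,
        "creator": creator,        # who or what produced the tag
        "relevance": relevance,    # assumed scale: 0 to 100
        "confidence": confidence,  # assumed scale: 0 to 100
    }


def keep_confident(subjects, min_confidence):
    """Keep tags at or above the threshold, plus tags that carry no score."""
    return [
        s for s in subjects
        if "confidence" not in s or s["confidence"] >= min_confidence
    ]


subjects = [
    machine_subject("sport", MEDIATOPIC + "15000000", "example-tagger", 90, 85),
    machine_subject("economy, business and finance", MEDIATOPIC + "04000000",
                    "example-tagger", 40, 30),
    # A human-created tag without machine scores.
    {"name": "politics", "uri": MEDIATOPIC + "11000000"},
]

kept = keep_confident(subjects, 50)
print([s["name"] for s in kept])
```

Keeping tags that have no confidence value is a design choice for this sketch: it treats unscored tags as editorially assigned rather than discarding them.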
In addition, some internal changes to the schema were made to fix a validation bug that existed in previous versions. In order to accommodate these changes, the ninjs 2.1 schema uses the https://json-schema.org/draft/2020-12/schema version of JSON Schema.
Thanks to Johan Lindgren, welcome Ian Young as Working Group Lead
At the Spring Meeting in Tallinn we said farewell to Johan Lindgren as Lead of the News in JSON Working Group.
Johan, of the TT news agency in Sweden, was instrumental in bringing the News in JSON Working Group back from its quiet period after the initial launch of ninjs. This directly led to the release of several new versions of ninjs over the past few years, and its adoption by many of the world’s top news providers.
The IPTC wishes to thank Johan for all his contributions, and wishes him well for his retirement.
Johan’s work will be taken over by Ian Young from PA Media Group / Alamy based in the UK. Ian steps up to the Lead role after participating in the Working Group for many years, since the earliest days of ninjs.
We thank Ian for being willing to take on the lead role, and we look forward to seeing what developments will emerge from the News in JSON Working Group in the future.
Here is a wrap-up of what IPTC has been up to in 2022, covering our latest work, including updates to most of our key standards.
Two successful member meetings and five member webinars
This year we again held our member meetings online, in May and October. We had over 70 registered attendees each time, from over 40 organisations, which is well over half of our member organisations so it shows that the virtual format works well.
This year we had guests from United Robots, Kairntech, EDRLab, Axate, HAND Identity, RealityDefender.ai, synthetic media consultant Henrik de Gyor and metaverse expert Toby Allen, as well as member presentations from The New York Times, Agence France-Presse, Refinitiv (an LSE Group company), DATAGROUP Consulting, TT Sweden, iMatrics and more. And that’s not even counting our regular Working Group presentations! So we had a very busy three days in May and October.
We also had some very interesting members-only webinars including a deep dive into ninjs 2.0, JournalList and the trust.txt protocol, a joint webinar with the EBU on how Wikidata and IPTC Media Topics can be used together, and a great behind the scenes question-and-answer session with a product manager from Wikidata itself.
Recordings of all presentations and webinars are available to IPTC members in the Members-Only Zone.
A fascinating Photo Metadata Conference
This year’s IPTC Photo Metadata Conference was held online in November and we had over 150 registrants and 19 speakers from Microsoft, CBC Radio Canada, BBC, Adobe, Content Authenticity Initiative, the Smithsonian and more. The general theme was bringing the IPTC Photo Metadata Standard to the real world: focussing on adoption of the recently-introduced accessibility properties; looking at adoption and interoperability between different software tools, including a new comparison tool that we have introduced; and use of C2PA and Content Authenticity in newsroom workflows, with demos from the BBC and CBC (with Microsoft Azure).
We also had an interesting session discussing the future of AI-generated images and how metadata could help to identify which images are synthetic, the directions and algorithms used to create them, and whether or not the models were trained on copyrighted images.
Recordings of all sessions are available online.
Presentations at other conferences, work with other organisations
IPTC was represented at the CEPIC Congress in Spain, the DigiTIPS conference run by imaging.org, the Sports Video Group’s content management group, and several Project Origin events.
Our work with C2PA is progressing well. As of version 1.2 of the C2PA Specification, assertions can now include any property from IPTC Photo Metadata Standard and/or IPTC Video Metadata Hub. C2PA support is growing in tools and is now available in Adobe Photoshop.
IPTC is also working with Project Origin on enabling C2PA in the news industry.
We had an IPTC member meet-up at the NAB Show in Las Vegas in May.
We also meet regularly with Google, schema.org, CIPA (the camera-makers behind the Exif standard), ISO, CEPIC and more.
Standard and Working Group updates
- Our IPTC NewsCodes vocabularies had regular updates each quarter, including 12 new terms and at least 20 retired terms. See the details in our news posts about the September Update, July Update, May Update, and the February Update (in time for the Winter Olympics). We also extended the Digital Source Type vocabulary specifically to address “synthetic media” or AI-generated content.
- The News in JSON Working Group released ninjs 1.4, a parallel release for those who can’t upgrade to ninjs 2.0 which was released in 2021. We published a case study showing how Alamy uses ninjs 2.0 for its content API.
- NewsML-G2 v2.31 includes support for financial instruments without the need to attach them to organisations.
- Photo Metadata Standard 2022.1 includes a Contributor structure aligned with Video Metadata Hub which can handle people who worked on a photograph but did not press the shutter, such as make-up artists, stylists or set designers.
- The Sports Content Working Group is working on the IPTC Sport Schema, which is pre-release but we are showing it to various stakeholders before a wider release for feedback. If you are interested, please let me know!
- Video Metadata Hub 1.4 includes new properties for accessibility, content warnings, AI-generated content, and clarifies the meanings of many other properties.
New faces at IPTC
We waved farewell to Johan Lindgren of TT as a Board Member, after five years of service. Thankfully Johan is staying on as Lead of the News in JSON Working Group.
We welcomed long-time member Heather Edwards of The Associated Press as our newest board member.
We welcomed Activo, Data Language, Denise Kremer, MarkLogic, Truefy, Broadcast Solutions and Access Intelligence as new IPTC members, plus Swedish publisher Bonnier News who are joining at the start of 2023. We’re very happy to have you all as members!
If you are interested in joining, please fill out our membership enquiry form.
Web site updates
We launched a new, comprehensive navigation bar on this website, making it easier to find our most important content.
We have also just launched a new section highlighting the “themes” that IPTC is watching across all of our Working Groups:
We would love to hear what you think about the new sections, which hopefully bring the site to life.
Best wishes to all for a successful 2023!
Thanks to everyone who has supported IPTC this year, whether as members, speakers at our events, contributors to our standards development or software vendors implementing our standards. Thanks for all your support, and we look forward to working with you more in the coming year.
If you have any questions or comments, you can contact me directly at mdirector@iptc.org.
Best wishes,
Brendan Quinn
Managing Director, IPTC
Alamy, a stock photo agency offering a collection of over 300 million images along with millions of videos, has recently launched a new Partnerships API, and has chosen IPTC’s ninjs 2.0 standard as the main format behind the API.
Alamy is an IPTC member via its parent company PA Media, and Alamy staff have contributed to the development of ninjs in recent years, leading to the introduction of ninjs 2.0 in 2021.
“When looking at a response format, we sought to adopt an industry standard which would aid in the communication of the structure of the responses but also ease integration with partners who may already be familiar with the standard,” said Ian Young, Solutions Architect at Alamy.
“With this in mind, we chose IPTC’s news in JSON format, ninjs,” he said. “We selected version 2 specifically due to its structural improvements over version 1 as well as its support for rights expressions.”
Young continued: “ninjs allows us to convey the metadata for our content, links to the media itself and the various supporting renditions as well as conveying machine readable rights in a concise payload.”
“We’ve integrated with customers who are both familiar with IPTC standards and those who are not, and each have found the API equally easy to work with.”
Learn more about ninjs via IPTC’s ninjs overview pages, consult the ninjs User Guide, or try it out yourself using the ninjs generator tool.
At the IPTC Spring Meeting in May 2022, IPTC’s Standards Committee voted to approve ninjs 1.4, the latest version in the 1.x track of IPTC’s standard for news content in the JSON format.
Johan Lindgren of TT Nyhetsbyrån, Lead of the IPTC News in JSON Working Group, said:
“After the launch of ninjs 2.0 in the autumn of 2021, we received requests to add some of the new 2.0 features to the first generation of ninjs, so that those who are using the 1.x branch of ninjs can use the new features without making breaking changes. So we are excited to publish version 1.4 of ninjs, where these features are included.”
Those changes include:
- New property contentcreated, denoting the date and time when the content of this ninjs object was originally created (as opposed to the date and time when the ninjs object itself was created). For example, an old photo that is now handled as a ninjs object may have a firstcreated and versioncreated of “2022-06-02T12:00:00+00:00”, but a contentcreated value of “1933-04-03T00:00:00+00:00”. The contents must be a valid JSON Schema date-time object.
- New property expires, showing “the date and time after which the Item is no longer considered editorially relevant by its provider.” Note that this is not the same as a rights-related expiration, it simply conveys the desire of the content creator to highlight the content until a certain time. A good example might be a football match preview, which would no longer be editorially relevant after the game commences. The contents must be a valid JSON Schema date-time object.
- New property rightsinfo, which holds an expression of rights to be applied to the content. It contains sub-properties langid (a URI which specifies the language used to specify rights such as RightsML or ODRL), and one of either linkedrights (containing a link to a remotely-hosted declaration of the rights associated with the content) or encodedrights (which includes an embedded encoding of the rights statements within the ninjs object).
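The three additions above can be sketched as a rough consistency check. This is a minimal sketch in Python using only the standard library, and it is not a substitute for validating against the published JSON Schema. The function name check_ninjs_14_fields and the example.org URL are hypothetical, the check assumes rightsinfo is a single JSON object, and datetime.fromisoformat is more lenient than the JSON Schema date-time format (for example, it also accepts a bare date).

```python
from datetime import datetime


def check_ninjs_14_fields(item):
    """Return a list of problems found in the ninjs 1.4 additions (hypothetical)."""
    problems = []

    # contentcreated and expires must hold date-time values.
    for key in ("contentcreated", "expires"):
        if key in item:
            try:
                datetime.fromisoformat(item[key])
            except (TypeError, ValueError):
                problems.append(f"{key} is not a valid date-time")

    # rightsinfo carries one of linkedrights or encodedrights, not both.
    rights = item.get("rightsinfo")
    if rights is not None:
        has_linked = "linkedrights" in rights
        has_encoded = "encodedrights" in rights
        if has_linked == has_encoded:
            problems.append(
                "rightsinfo needs exactly one of linkedrights or encodedrights"
            )
    return problems


old_photo = {
    "firstcreated": "2022-06-02T12:00:00+00:00",
    "versioncreated": "2022-06-02T12:00:00+00:00",
    "contentcreated": "1933-04-03T00:00:00+00:00",
    "rightsinfo": {
        "langid": "http://www.w3.org/ns/odrl/2/",  # ODRL namespace URI
        "linkedrights": "https://example.org/rights/photo-123",  # placeholder
    },
}
print(check_ninjs_14_fields(old_photo))
```

An item with a free-text expires value, or a rightsinfo object that has neither linkedrights nor encodedrights, would come back with one problem string per issue.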
Which version of ninjs should I choose for my project?
There might be some confusion since we have released ninjs 1.4 after the release of ninjs 2.0. Please note that this is simply an update to the 1.x branch of ninjs to make it easier for users who cannot upgrade to the 2.x branch due to breaking changes.
If you are starting a new project that requires JSON-encoded news content, we recommend using ninjs 2.0. This version should be easiest for developers to work with.
If you are already using a 1.x version of ninjs, we recommend at least upgrading to version 1.4. This should be an easy change, because 1.4 is backwards-compatible with versions 1.0, 1.1, 1.2 and 1.3. We would also recommend upgrading to 2.0 if possible, but if not, 1.4 is the best version of the 1.x branch.
Supporting materials for ninjs 1.4 and ninjs 2.0 can be found at these locations:
- JSON Schema for ninjs 1.4: https://iptc.org/std/ninjs/ninjs-schema_1.4.json
- JSON Schema for ninjs 2.0: https://iptc.org/std/ninjs/ninjs-schema_2.0.json
- User Guide for ninjs (covering version 2.0 but mentioning 1.4): https://iptc.org/std/ninjs/userguide/
- The ninjs Generator has been updated to output 1.4-compatible JSON: https://www.iptc.org/std/ninjs/generator/
Thanks to Johan and the IPTC News in JSON Working Group for working on this release.
Where else can you hear about the difficulties of examining photo metadata in NFTs, see a lifelike image of a human being generated from pure data before your eyes, see how Wikidata can be used to take semantic fingerprints of news articles, and discover that an hour is nowhere near long enough to discuss simplifying machine-readable rights? Nowhere but the IPTC Meeting, of course! And this year’s Spring Meeting was the venue for all of this and much more.
We held the meeting virtually from Monday May 16 to Wednesday May 18th, and attending were over 70 people from at least 45 organisations across more than 20 countries.
Along with our usual Working Group updates and committee meetings, we invited speakers from several fascinating startups, services and projects at member companies. Here’s a quick summary of their sessions:
- We heard from Kairntech who are working on a classification system based on extracting entities from news stories and building a “semantic fingerprint” which can be used for cross-language classification, search and content enhancement
- The New York Times’ R&D Lab presented PaperTrail, a project to enhance the quality of the Times’ print archive through the use of machine learning to improve on basic OCR techniques (they’re looking for collaborators, more info coming soon!)
- Bria.ai showed us how an API can be used to enhance and create images and videos through the use of a custom GAN model trained in a “responsible AI” method
- Margaret Warren talked us through her efforts in creating and selling an NFT, looking at the process from the perspective of a photo metadata expert
- Consultant and author Henrik de Gyor talked us through the latest in synthetic media, which will help us to finalise our Digital Source Type vocabulary for synthetic media
- Laurent Le Meur from EDRLab presented his project’s recommendation on a Text and Data Mining Reservation Protocol, which can be used by publishers to restrict the rights of data miners in scraping any content for the purpose of analysis or building a model
- We heard from Dominic Young of Axate on his approach to offering pay-as-you-go payment options on paywalled news sites based on a simple pre-paid wallet mechanism.
We also had many announcements and discussions around IPTC standards, many of which we will be revealing in the coming months. One notable update is that the Standards Committee approved ninjs version 1.4, which we will release soon.
Thanks to all the IPTC members, Working Group leads, committee members and guests who made this member meeting one to remember.
Today, IPTC announces the release of version 2.0 of the news industry’s standard for exchanging content in JSON: ninjs.
The new version introduces a completely new way of declaring multiple headlines, body texts and description fields, which is compatible with binary data serialisation formats such as Avro and Protocol Buffers.
“We are very excited about releasing the 2.0-version of News in JSON (ninjs),” says Johan Lindgren (TT), lead of the working group responsible for developing the standard. “When working on improving the 1.3 version, we realised that a number of suggestions would mean breaking changes and after some consideration we took that step. Now we have a version of ninjs that is better suited for APIs, databases like Elastic and conversion to binary methods like Protocol Buffers.”
The IPTC News in JSON Working Group has kept the original focus on two main use cases: data in transit and data at rest.
In recent years, more systems have started to convert from JSON formats into binary data serialisation protocols such as Avro and Protocol Buffers for data in transit. However, ninjs 1.x couldn’t be converted into these protocols because of the dynamic way that keys could be defined, for example “headline_main” and “headline_subhead”. In ninjs 2.0, all properties are given well-defined names, so they can be converted into Protobuf schemas. The GitHub repository for ninjs now includes a demonstration of how ninjs 2.0 can be used with Protocol Buffers.
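To make the difference concrete, here is a minimal Python sketch of how dynamically named headline keys could be gathered into one fixed-name array. It is an illustration only, not the working group's conversion tool: the function name `headlines_to_array`, the sample item and the use of a “value” property for the headline text are assumptions made for this example, so please check the ninjs 2.0 JSON Schema for the exact property names.

```python
import re

# Assumption: 1.x-style headline keys look like "headline" or "headline_<role>".
HEADLINE_KEY = re.compile(r"^headline(?:_(?P<role>[A-Za-z0-9_]+))?$")


def headlines_to_array(item):
    """Return a copy of item with headline* keys collected into a 'headlines' list."""
    converted = {}
    headlines = []
    for key, value in item.items():
        match = HEADLINE_KEY.match(key)
        if match:
            # "value" as the text property name is an assumption for this sketch.
            entry = {"value": value}
            role = match.group("role")
            if role:
                entry["role"] = role
            headlines.append(entry)
        else:
            # Every other property is copied through unchanged.
            converted[key] = value
    if headlines:
        converted["headlines"] = headlines
    return converted


# Hypothetical 1.x-style item using the key names quoted above.
old_item = {
    "uri": "urn:example:item-1",
    "headline_main": "Main headline",
    "headline_subhead": "A secondary headline",
}
new_item = headlines_to_array(old_item)
```

Because the result only ever contains the single key “headlines”, whatever roles a publisher uses, a fixed schema for a binary format can describe it in advance.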
Other tools included in the repository are an example GraphQL server for ninjs and example XSLTs to convert from IPTC XML-based formats like NewsML and NITF.
The ninjs Generator tool has been updated to create ninjs 2.0. In fact, using the tool, users can switch between generating ninjs 1.3 and ninjs 2.0 output at the click of a radio button.
The official location of the ninjs 2.0 JSON Schema is https://iptc.org/std/ninjs/ninjs-schema_2.0.json.
A full list of the changes in ninjs 2.0 can be viewed in section 7.5 of the ninjs User Guide.
IPTC members and our guests have just finished a very busy 2021 edition of our IPTC Autumn Meeting. Held online over three days, the meeting was a mix of IPTC Working Group presentations, members presenting recent projects, and invited guest speakers on important topics in the news and media world.
This year we heard member presentations from:
- Honor Craig-Bennett of the BBC reported on the Images Digital Asset Management system, based on the Guardian’s open-source GRID system. We also heard from Andy Read about this system
- Heather Edwards from Associated Press spoke about their project to replace their existing rules-based classification system
- Mark Milstein from Microstocksolutions spoke about a new project he is working on to create “synthetic media” AI-generated images and videos based on textual descriptions and metadata
- DATAGROUP Consulting Group’s Robert Schmidt-Nia spoke about a project using AWS’s Comprehend text classification service to power a serverless news classification system using IPTC’s Media Topics vocabulary
- Frameright’s Marina Ekroos spoke about an EU stars4media project they are working on called “Artificial Intelligence in photojournalism: can it work?”
- Scott Yates from new Startup Member JournalList spoke about the trust.txt project, letting news providers state their affiliates and official social media channels in a simple way
- Bruce MacCormack from CBC / Radio Canada spoke about Project Origin, looking at authenticity for video and news media, passing requirements to the C2PA work
- The BBC’s Charlie Halford spoke about C2PA, updating members with a deep technical view on how the system is planned to work, as detailed in the recently-released draft specification.
In addition, we heard from guest speakers:
- Keesiu Wong of Design AI spoke about the Videre AI project, looking at “next-generation video understanding”. He was joined by project partner Javier Picazo from Associate Member Agencia EFE, Spain’s national news agency.
- Alex Lakatos of Interledger spoke about the distributed payments technology which is used by…
- Uchi Uchibeke of Coil who use Interledger to implement micropayments which can be implemented on publisher websites by adding one line of HTML.
New standard versions
The Working Group presentations were also packed with content, in particular three new standard versions that were proposed to the Standards Committee:
- NewsML-G2 v2.30 adds fields for “residrefformat” and “residrefformaturi” to enable publishers to describe the format of a resource ID reference, and makes catalog and catalogRef optional to support publishers who only use URIs for controlled values and therefore have no need for catalogs
- The News in JSON Working Group’s ninjs v2.0 is a non-backwards-compatible new release which changes the way repeating values are handled, moving from patternProperties fields with arbitrary names such as “body_text” and “body_html” to arrays with fixed names such as “bodies”. The objects within the array elements include properties “role” and “contenttype” which take the place of the arbitrary extension to the “body_” tag.
- The IPTC Photo Metadata Standard v2021.1 adds new properties to IPTC Core which are intended to be used for accessibility purposes: “Alt Text (Accessibility)” and “Extended Description (Accessibility)”. We have also added an Event Identifier property to align with other metadata ID properties, and modified the Description Writer field to include the writer of the accessibility fields.
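For the ninjs v2.0 change listed above, the following Python sketch shows how a consumer might pick one entry out of a “bodies” array using the “role” and “contenttype” properties. It is only a rough illustration: the helper `find_body`, the sample item, the role and content type values, and the “value” property holding the body content are assumptions for this example rather than something taken from the standard.

```python
def find_body(item, role=None, contenttype=None):
    """Return the first entry in item['bodies'] matching the given filters, or None."""
    for body in item.get("bodies", []):
        if role is not None and body.get("role") != role:
            continue
        if contenttype is not None and body.get("contenttype") != contenttype:
            continue
        return body
    return None


# Hypothetical 2.0-style item: one fixed array instead of body_text / body_html keys.
item = {
    "bodies": [
        {"role": "main", "contenttype": "text/plain", "value": "Plain text body"},
        {"role": "main", "contenttype": "text/html", "value": "<p>HTML body</p>"},
    ]
}

html_body = find_body(item, contenttype="text/html")
missing_body = find_body(item, role="summary")
```

The lookup works on property values rather than key names, so the same code handles any role a publisher chooses to use.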
New faces
We were very happy to welcome new members Frameright, JournalList, Spotlight Sports Group and Glide Publishing Platform to the meeting.
The Standards Committee was chaired for the first time by new Chair Paul Harman of Bloomberg.
The AGM was the first for new Treasurer, Gerald Innerwinkler of Austria Press Agentur APA.
And we congratulate Philippe Mougin of Agence France-Presse AFP for being voted on to the IPTC Board of Directors, along with the existing Board members who were all re-elected.
It was another great meeting with over 70 representatives from 42 organisations in 17 different countries! We’re hoping that the next IPTC member meeting will be back to face-to-face, and we have provisionally booked Tallinn, Estonia for 16 – 18 May, 2022. We will confirm this in January 2022.