The NewsCodes Working Group of IPTC has completed mapping of the top two levels of hierarchical terms of Media Topics to Wikidata.
Media Topics is an IPTC standard – a 1,100-term taxonomy with a focus on categorizing text. Released in 2010 as a development of the IPTC Subject Codes, Media Topics is free to use and available in different formats. The terms can be viewed on the IPTC Controlled Vocabulary server, or in a user-friendly tree hierarchy tool.
IPTC creates and maintains taxonomies and controlled vocabularies whose terms are assigned as metadata values to news objects such as text, photographs, graphics, audio and video files and streams. This allows for consistent coding of news metadata across news providers and over time.
“The idea of semantic mapping and being involved in a linked data initiative like Wikidata is a natural step for IPTC,” said Jennifer Parrucci, chair of the IPTC NewsCodes Working Group and senior taxonomist for The New York Times. “When linking an existing taxonomy to another, Wikidata serves as a central point of reference.”
Wikidata is a free, collaborative, multilingual knowledge base that can be read and edited by both humans and machines. It provides centralized storage of, and access to, structured data for all Wikimedia projects, as well as for use on external websites.
In total about 100 mappings from Media Topics to Wikidata have been manually applied. The mappings use SKOS mapping relationships.
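As a rough illustration of what one such mapping can look like, the sketch below builds a single N-Triples statement linking a Media Topic to a Wikidata entity with a SKOS mapping property. The helper name `mapping_triple` is hypothetical, and the pairing of Media Topic 15000000 with Wikidata item Q349 is an illustrative assumption, not an entry taken from the published IPTC mappings.

```python
# Namespaces for SKOS, IPTC Media Topics and Wikidata entities.
SKOS = "http://www.w3.org/2004/02/skos/core#"
MEDIATOPIC = "http://cv.iptc.org/newscodes/mediatopic/"
WIKIDATA = "http://www.wikidata.org/entity/"

# The five mapping properties defined by SKOS.
MAPPING_RELATIONS = {
    "exactMatch",
    "closeMatch",
    "broadMatch",
    "narrowMatch",
    "relatedMatch",
}


def mapping_triple(topic_code: str, relation: str, wikidata_id: str) -> str:
    """Return one N-Triples line mapping a Media Topic to a Wikidata entity.

    Hypothetical helper for illustration only.
    """
    if relation not in MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping property: {relation}")
    return (
        f"<{MEDIATOPIC}{topic_code}> "
        f"<{SKOS}{relation}> "
        f"<{WIKIDATA}{wikidata_id}> ."
    )


# Illustrative pairing only (an assumption, not from the IPTC mapping set).
triple = mapping_triple("15000000", "exactMatch", "Q349")
print(triple)
```

Which mapping property applies to a given pair (exact, close, broader, narrower or related) is an editorial decision, which is presumably why the mappings were applied manually.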
Media Topics began with the Subject Codes vocabulary, extending the tree from 3 to 5 levels while reusing the same 17 top-level terms. The lower-level terms have been revised and rearranged. Each Media Topic provides a mapping back to one of the Subject Codes.
More information:
Media Topics Page, IPTC.org
IPTC Controlled Vocabulary server
Guidelines
Tree Hierarchy Tool
NewsCodes
Subject Codes
Questions? Contact us.
Mittmedia and Journalism++ Stockholm, two news organizations in Sweden, are successfully developing and incorporating APIs, automation tools and robots into workflows to enhance the capabilities of newsrooms, as reported at the International Press Telecommunications Council’s (IPTC) Spring Meeting 2016.
News organizations continue to experiment with bots as part of a frontier in automation journalism, as publishers draw on the benefits of the massive amounts of data available to newsrooms, including information about their own audiences. Despite some apprehension, the benefits of automating parts of the publishing process are many: aiding journalists in storytelling with the ability to sift through big data, refining workflows and reducing workloads, and more precise and faster content delivery to customers.
Mittmedia began their automation efforts in 2015 with a weather forecast text bot, which pulls data from the Swedish Meteorological and Hydrological Institute.
Set up initially as a testing tool based on a simple minimum viable product (MVP), it now delivers daily forecasts for 42 municipalities, soon to be 63.
Mittmedia’s next project was Rosalinda, a sports robot that transforms data into text for immediate publishing. Data is pulled from the API of the Swedish website Everysport, giving developers access to information on 90,000 teams and 1,500,000 matches. Rosalinda now reports all football, ice hockey and floorball matches played in Sweden, which filled a need in the market. United Media, owned by Mittmedia and two other companies, developed the tool.
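The general idea of turning structured match data into publishable text can be sketched with a simple template approach. This is a minimal illustration under stated assumptions, not Rosalinda's actual implementation: the `MatchResult` fields, the `describe` function and the team names are all hypothetical, and the Everysport API is not called here.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """Hypothetical record for one finished match."""
    home: str
    away: str
    home_goals: int
    away_goals: int


def describe(match: MatchResult) -> str:
    """Render one match result as a single English sentence."""
    if match.home_goals == match.away_goals:
        return (
            f"{match.home} and {match.away} drew "
            f"{match.home_goals}-{match.away_goals}."
        )
    if match.home_goals > match.away_goals:
        winner, loser = match.home, match.away
    else:
        winner, loser = match.away, match.home
    high = max(match.home_goals, match.away_goals)
    low = min(match.home_goals, match.away_goals)
    return f"{winner} beat {loser} {high}-{low}."


# Made-up example data.
print(describe(MatchResult("Team A", "Team B", 3, 1)))
print(describe(MatchResult("Team A", "Team B", 2, 2)))
```

A production system would likely vary the phrasing and draw on more data (scorers, league standings, streaks), but the core step of mapping fields to sentence templates is the same.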
Mittmedia has adopted a data-driven mindset and work process to gain a competitive edge over other local news sources. “We aim to deliver more content – faster, and provide it to the right person, at the right time and at the right place,” said Mikael Tjernström, Mittmedia API Editor.
Faster publication and more personal, relevant content were also among the reasons for Journalism++’s development of the automated news service Marple, which focuses on story finding and investigation rather than text generation, according to Jens Finnäs, the organization’s founder.
One of four Swedish projects to receive funding from Google’s Digital News Initiative (DNI) this year, Marple is used for finding targeted local stories in public data. For example, Marple has analyzed monthly crime statistics and found a wave of bike thefts in Gothenburg and a record number of reported narcotics offences in Sollefteå.
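One plausible way to surface leads like these from a monthly time series is to check whether the latest value is a record or an unusual spike compared with earlier months. The sketch below is an assumption about the general technique, not a description of how Marple works; the function name `flag_latest`, the threshold and the numbers are all made up for illustration.

```python
from statistics import mean, stdev


def flag_latest(counts: list[int], threshold: float = 2.0) -> list[str]:
    """Flag the most recent monthly count as a 'record' and/or a 'spike'.

    counts: monthly totals in chronological order (at least three values).
    threshold: how many standard deviations above the historical mean
               the latest value must be to count as a spike.
    """
    *history, latest = counts
    flags = []
    # A record: higher than every earlier month in the series.
    if latest > max(history):
        flags.append("record")
    # A spike: far above the historical average, measured in standard deviations.
    spread = stdev(history)
    if spread > 0 and (latest - mean(history)) / spread > threshold:
        flags.append("spike")
    return flags


# Made-up monthly counts for one offence type in one municipality.
print(flag_latest([10, 12, 11, 9, 13, 30]))
print(flag_latest([10, 12, 11, 9, 13, 12]))
```

Flags like these would only be a starting point: a journalist still has to check whether the change reflects real events or, for example, a change in how the statistics are reported.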
“Open data has been a highly underutilized resource in journalism. We are hoping to change that,” Finnäs said. “We don’t think the robots will replace journalists, but we are positive that automation can make journalism smarter and more efficient, and that there are thousands of untold stories to be found.”
The grant from Google’s DNI gives Journalism++ a unique opportunity to test Marple and possibly turn it into a commercially viable product, Finnäs said.
More information:
Jens Finnäs: jens.finnas@gmail.com Twitter @jensfinnas
Mikael Tjernström: Twitter @micketjernstrom
Photo by CC/FLICKR/Peyri_Herrera.
Join us for IPTC’s Autumn Meeting 2016 in Berlin! Anyone interested in IPTC’s work can attend our face-to-face meetings, held three times a year, or take part in regular conference call sessions as a guest. Our meetings are the perfect opportunity to network with industry peers, learn about emerging industry topics from leading professionals, and simplify product development with technical standards.
The venue for the Autumn 2016 Meeting is dpa Headquarter Berlin, Markgrafenstraße 20, 10969 Berlin. Please contact us for hotel accommodations and conference registration information.
The agenda will include Video Day on 25 October: IPTC will release and introduce its new Video Metadata Hub Recommendation. Speakers from video makers, video suppliers, video publishers and system vendors will discuss how video workflows can be improved.
We’ll also hear the latest news about the progress of the EXTRA project, as well as Permission & Obligation Expressions under development by a W3C Working Group with contributions from IPTC.
Additionally, the IPTC membership will hold its Annual General Meeting. Locations for IPTC’s three face-to-face meetings per year are rotated worldwide, with at least one meeting held in Europe annually.
Interested in attending? Contact Us, please.
The specification part of the Annual Release for the new version 2.23 of NewsML-G2 is now available and can be downloaded from the NewsML-G2 Release Section of the IPTC Developer Site.
The NewsML-G2 standard provides state-of-the-art XML format metadata to combine rich functionality, ease of use, compactness and compatibility with the Semantic Web. It is a single format for exchanging text, images, video, audio news and event or sports data – and packages thereof.
This specification part of the Annual Release of NewsML-G2 v2.23 includes the XML Schemas and the Structure Matrix document. The updated Quick Start Guides, Implementation Guidelines and full specifications will be released in October. This is part of an ongoing incremental development of NewsML-G2, as providers expand their content use-cases.
Three changes/improvements in version 2.23 are:
- It allows the addition of further Rights Expression properties <rightsInfo> and now covers these Rights cases:
– Allows embedding or referencing a rights expression
– Allows use of XML or JSON as format for embedding
- It allows address properties (locality, area, country, etc.) to include a World Region.
- It allows the addition of facets to the Item Class property, which provides more flexibility.
Please visit the IPTC Standards Page for a full list of the available IPTC standards.
The International Press Telecommunications Council (IPTC) released SportsML 3.0, the recently approved comprehensive update of the open and highly flexible standard for the interchange of sports data.
Developed by the Sports Content Working Party of IPTC, which includes organisations from eight different countries, SportsML 3.0 is designed to be easy to understand and implement, and covers the full gamut of sports events. Sports Markup Language is the tech-industry standard XML vocabulary for Sports scores, lineups, schedules, standings and statistics.
SportsML has been adopted by many international news organizations, including the BBC (UK), NTB (Norway), TT (Sweden), APA (Austria), AP (USA), and more. It has been applied to results from the Olympics, European football competitions, as well as the major North American sports leagues, for team, individual and head-to-head sports.
“We’ve had 12 years of input from sports experts at news organizations since SportsML 1.0,” said Paul Kelly, Chairman of the Sports Content Working Party and Director of Software Development at XML Team Solutions, a sports-focused agency. “SportsML 3.0 addresses the requirements of anyone handling sports results and statistics and will save the time and cost of developing an in-house format. Companies can also defend against vendor lock-in caused by adopting proprietary formats.”
Highlights of the new SportsML version include the public release of 113 sports-related controlled vocabularies (CVs). “The most important thing we did was design SportsML to play well with the current generation of semantic technologies,” says Kelly. These CVs cover the statistical properties, player positions, on-field actions, infractions, etc., of 11 sports plus 37 CVs that cover all sports. These will be available publicly as a package.
These terms can be combined with SportsML 3’s new generic stat structure to incorporate both IPTC and external properties, such as those published by the IOC or any other vendor. “You can easily add new properties and continue to process gracefully using the powerful vocabulary-management the IPTC has devised,” says Kelly. “That’s usually missing from even the most prominent sports formats.”
The specification and documentation can be downloaded from https://iptc.org/standards/sportsml-g2/. Additionally, the IPTC Developer Site provides technical information about SportsML, and the SportsML Users Forum is used to share experiences and raise questions, and also connects companies, organizations and vendors.
When IPTC member Sourcefabric presents their flagship product Superdesk – an extensible end-to-end news production, curation and distribution platform – they always recognize the importance of IPTC standard NewsML-G2 as its backbone.
As Sourcefabric CTO Holman Romero explained at the IPTC Summer Meeting in Stockholm (13 – 15 June 2016), Superdesk was built on the principles of the News Architecture part of the NewsML-G2 specification by the IPTC. This is because Superdesk is not a traditional Web CMS, but rather a platform developed from the ground up for journalists to manage the numerous processes of a newsroom.
“Superdesk is more than a news management tool built for journalists by journalists, for creation, archiving, distribution, workflow structure, and editorial communications,” Romero said. “We at Sourcefabric also see it as the cornerstone of the new common open-source code base for quality, professional journalism.”
NewsML-G2 is a blueprint that provides all the concepts and business logic for a news architecture framework. It also standardises the handling of metadata that ultimately enables all types of content to be linked, searched, and understood by end users. NewsML-G2 metadata properties are designed to comply with RDF, the data model of the Semantic Web, enabling the development of new applications and opportunities for news organisations in evolving digital markets.
There were several important factors that led Sourcefabric, Europe’s largest developer of open source tools for news media, to the decision to use NewsML-G2:
- IPTC has established credibility as a consortium of the world’s leading news agencies and publishers. Additionally, NewsML-G2 has been adopted by some of the world’s major news agencies as the de facto standard for news distribution. “Why reinvent the wheel?” Romero said. “IPTC’s standards are based on years of experience of top news industry professionals.”
- NewsML-G2 met the requirements of Sourcefabric’s content model: granularity, structured data, flexibility, and reusability.
- NewsML-G2 met the requirements for Sourcefabric’s design principles:
- Every piece of content is a News Item.
- Content types are text, image, video, audio.
- Content profiles support the creation of story profiles and templates.
- It can format items for content packages and highlights.
- Content can be created once and used in many places. Sourcefabric refers to this as the COPE model: Create Once, Publish Everywhere. This structure enables and frees the content to be used seamlessly and automatically across multiple channels and devices, and in a variety of previously impossible contexts.
Sourcefabric also stresses the importance of metadata, the building blocks of “structured journalism.” As explained by Romero in a recent blog post: “The foundation of structured journalism is built on the ability to access and locate enormous amounts of data from all over the web and from within the system itself (i.e. content from previous articles). Without providing valuable metadata for each of your stories and subsequent pieces of visual collateral, finding key information located inside of them becomes infinitely more difficult.”
News organizations that use Superdesk include the Australian news agency AAP and the Norwegian news agency NTB. Other IPTC standards supported by the platform are NewsML 1, ninjs, NITF, Subject Codes and IPTC 7901.
Source code repositories are publicly available on GitHub: https://github.com/superdesk
About Sourcefabric: Sourcefabric’s mission is to make professional-grade technology available to all who believe that quality independent journalism has a fundamental role to play in any healthy society. They generate revenue through IT services – managed hosting, SaaS, custom development, integration into existing workflows – as well as project-by-project funding, grants and donations.