Mittmedia and Journalism++ Stockholm, two news organizations in Sweden, are successfully developing and incorporating APIs, automation tools and robots into workflows to enhance the capabilities of newsrooms, as reported at the International Press Telecommunications Council’s (IPTC) Summer Meeting 2016.
News organizations continue to experiment with bots as part of a frontier in automation journalism, as publishers draw on the benefits of the massive amounts of data available to newsrooms, including information about their own audiences. Despite some apprehension, the benefits of automating parts of the publishing process are many: helping journalists tell stories by sifting through big data, refining workflows and reducing workloads, and delivering content to customers faster and more precisely.
Mittmedia began their automation efforts in 2015 with a weather forecast text bot, which pulls data from the Swedish Meteorological and Hydrological Institute.
Set up initially as a testing tool based on a simple minimum viable product (MVP), it now delivers daily forecasts for 42 municipalities, soon to be 63.
Mittmedia’s next project was Rosalinda, a sports robot that transforms data into text for immediate publishing. Data is pulled from the API of the Swedish website Everysport, giving developers access to information on 90,000 teams and 1,500,000 matches. Rosalinda now reports on all football, ice hockey and floorball matches played in Sweden, which fills a need in the market. United Media, owned by Mittmedia and two other companies, developed the tool.
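To illustrate the general idea behind data-to-text robots such as the weather bot and Rosalinda, here is a minimal sketch of template-based text generation. It is not Mittmedia's implementation: the `MatchResult` fields and the sentence templates are assumptions made for this example, and the real Everysport API schema is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    # Hypothetical record shape, invented for this sketch.
    sport: str
    home_team: str
    away_team: str
    home_score: int
    away_score: int


def match_to_text(match: MatchResult) -> str:
    """Turn one structured result into a one-sentence report using fixed templates."""
    score = f"{match.home_score}-{match.away_score}"
    if match.home_score > match.away_score:
        return f"{match.home_team} beat {match.away_team} {score} in {match.sport}."
    if match.home_score < match.away_score:
        return (
            f"{match.away_team} won {match.away_score}-{match.home_score} "
            f"away against {match.home_team} in {match.sport}."
        )
    return f"{match.home_team} and {match.away_team} drew {score} in {match.sport}."


if __name__ == "__main__":
    print(match_to_text(MatchResult("floorball", "Team A", "Team B", 5, 3)))
```

A production robot would presumably use many more templates and vary its wording, but the core step is the same: structured fields in, publishable sentence out.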
Mittmedia has adopted a data-driven mindset and work process to gain a competitive edge over other local news sources. “We aim to deliver more content – faster, and provide it to the right person, at the right time and at the right place,” said Mikael Tjernström, Mittmedia API Editor.
Faster publication and more personal and relevant content were also among the reasons for Journalism++’s development of the automated news service Marple, which focuses on story finding and investigation rather than text generation, according to Jens Finnäs, the organization’s founder.
One of four Swedish projects to receive funding from Google’s Digital News Initiative (DNI) this year, Marple is used for finding targeted local stories in public data. For example, Marple has analyzed monthly crime statistics and found a wave of bike thefts in Gothenburg and a record number of reported narcotics offences in Sollefteå.
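As a rough illustration of how a story-finding tool might spot something like a wave of bike thefts in monthly statistics, the sketch below flags months that stand far above the preceding months. This is an assumed, simplified technique chosen for the example; the article does not describe Marple's actual method, and the function name and threshold are hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_months(counts, threshold=2.0):
    """Return indexes of months whose count is more than `threshold`
    standard deviations above the mean of all earlier months."""
    flagged = []
    for i in range(3, len(counts)):  # require at least three earlier months as a baseline
        history = counts[:i]
        spread = stdev(history)
        if spread > 0 and (counts[i] - mean(history)) / spread > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Invented monthly counts of reported thefts for one municipality.
    monthly_reports = [40, 42, 38, 41, 39, 95, 40]
    print(flag_unusual_months(monthly_reports))
```

A flagged month is only a lead: a journalist would still need to check the underlying data and find out what happened before publishing.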
“Open data has been a highly underutilized resource in journalism. We are hoping to change that,” Finnäs said. “We don’t think the robots will replace journalists, but we are positive that automation can make journalism smarter and more efficient, and that there are thousands of untold stories to be found.”
The grant from Google’s DNI gives Journalism++ a unique opportunity to test Marple and possibly turn it into a commercially viable product, Finnäs said.
Jens Finnäs: email@example.com Twitter @jensfinnas
Mikael Tjernström: Twitter @micketjernstrom
Photo by CC/FLICKR/Peyri_Herrera.
Join us for IPTC’s Autumn Meeting 2016 in Berlin! Anyone interested in IPTC’s work can attend our face-to-face meetings, held three times a year, or take part in regular conference call sessions as a guest. Our meetings are the perfect opportunity to network with industry peers, learn about emerging industry topics from leading professionals, and simplify product development with technical standards.
The venue for the Autumn 2016 Meeting is dpa Headquarter Berlin, Markgrafenstraße 20, 10969 Berlin. Please contact us for hotel accommodations and conference registration information.
The agenda will include Video Day on 25 October: IPTC will release and introduce its new Video Metadata Hub Recommendation. Speakers from video makers, video suppliers, video publishers and system vendors will discuss how video workflows can be improved.
Additionally, the IPTC membership will hold its Annual General Meeting. Locations for IPTC’s three face-to-face meetings per year are rotated worldwide, with at least one meeting held in Europe annually.
Interested in attending? Please contact us.
The specification part of the Annual Release of NewsML-G2 version 2.23 is now available and can be downloaded from the NewsML-G2 Release Section of the IPTC Developer Site.
The NewsML-G2 standard provides state-of-the-art XML format metadata to combine rich functionality, ease of use, compactness and compatibility with the Semantic Web. It is a single format for exchanging text, images, video, audio news and event or sports data – and packages thereof.
This specification part of the Annual Release of NewsML-G2 v2.23 includes the XML Schemas and the Structure Matrix document. The updated Quick Start Guides, Implementation Guidelines and full specifications will be released in October. This is part of an ongoing incremental development of NewsML-G2, as providers expand their content use-cases.
Three changes/improvements in version 2.23 are:
- It allows the addition of further Rights Expression properties <rightsInfo> and now covers these Rights cases:
– Allows embedding or referencing a rights expression
– Allows use of XML or JSON as format for embedding
- It allows address properties (locality, area, country, etc.) to include a World Region.
- It allows the addition of facets to the Item Class property, which provides more flexibility.
Please visit the IPTC Standards Page for a full list of the available IPTC standards.
The International Press Telecommunications Council (IPTC) released SportsML 3.0, the recently approved comprehensive update of the open and highly flexible standard for the interchange of sports data.
Developed by the Sports Content Working Party of IPTC, which includes organisations from eight different countries, SportsML 3.0 is designed to be easy to understand and implement, and covers the full gamut of sports events. Sports Markup Language (SportsML) is the tech-industry standard XML vocabulary for sports scores, lineups, schedules, standings and statistics.
SportsML has been adopted by many international news organizations, including the BBC (UK), NTB (Norway), TT (Sweden), APA (Austria), AP (USA), and more. It has been applied to results from the Olympics, European football competitions, as well as the major North American sports leagues, for team, individual and head-to-head sports.
“We’ve had 12 years of input from sports experts at news organizations since SportsML 1.0,” said Paul Kelly, Chairman of the Sports Content Working Party and Director of Software Development at XML Team Solutions, a sports-focused agency. “SportsML 3.0 addresses the requirements of anyone handling sports results and statistics and will save the time and cost of developing an in-house format. Companies can also defend against vendor lock-in caused by adopting proprietary formats.”
Highlights of the new SportsML version include the public release of 113 sports-related controlled vocabularies (CVs). “The most important thing we did was design SportsML to play well with the current generation of semantic technologies,” says Kelly. These CVs cover the statistical properties, player positions, on-field actions, infractions, etc., of 11 sports plus 37 CVs that cover all sports. These will be available publicly as a package.
These terms can be combined with SportsML 3’s new generic stat structure to incorporate both IPTC and external properties, such as those published by the IOC or any other vendor. “You can easily add new properties and continue to process gracefully using the powerful vocabulary-management the IPTC has devised,” says Kelly. “That’s usually missing from even the most prominent sports formats.”
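The idea of a generic stat structure that draws its meaning from controlled vocabularies can be sketched roughly as follows. This is not SportsML syntax: the `Stat` class, the prefixes and the vocabulary URIs are hypothetical placeholders invented for this example. It shows only the principle that a stat names its type with a vocabulary term, and that terms from unknown vocabularies are kept rather than causing a failure.

```python
from dataclasses import dataclass

# Hypothetical prefix-to-vocabulary catalog; these prefixes and URIs are
# placeholders, not entries from the published SportsML vocabularies.
VOCABULARIES = {
    "exstat": "http://example.org/cv/stats/",
    "vendor": "http://example.com/vendor-stats/",
}


@dataclass
class Stat:
    stat_type: str  # "prefix:term", for example "exstat:goals"
    value: float


def resolve(stat: Stat):
    """Expand a prefixed stat type to a full URI, or return None if the prefix is unknown."""
    prefix, _, term = stat.stat_type.partition(":")
    base = VOCABULARIES.get(prefix)
    return base + term if base else None


def process(stats):
    """Split stats into those with a known vocabulary and those kept as-is for later."""
    known, unknown = {}, []
    for stat in stats:
        uri = resolve(stat)
        if uri is None:
            unknown.append(stat)  # keep the stat instead of failing
        else:
            known[uri] = stat.value
    return known, unknown


if __name__ == "__main__":
    sample = [Stat("exstat:goals", 2), Stat("vendor:xg", 1.4), Stat("other:foo", 7)]
    print(process(sample))
```

Adding a new property then means adding a term to a vocabulary, not changing the structure that carries the value.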
The specification and documentation can be downloaded from https://iptc.org/standards/sportsml-g2/. Additionally, the IPTC Developer Site provides technical information about SportsML, and the SportsML Users Forum is used to share experiences and raise questions, and also connects companies, organizations and vendors.
When IPTC member Sourcefabric presents their flagship product Superdesk – an extensible end-to-end news production, curation and distribution platform – they always recognize the importance of IPTC standard NewsML-G2 as its backbone.
As Sourcefabric CTO Holman Romero explained at the IPTC Summer Meeting in Stockholm (13 – 15 June 2016), Superdesk was built on the principles of the News Architecture part of the NewsML-G2 specification by the IPTC. This is because Superdesk is not a traditional Web CMS, but rather a platform developed from the ground up for journalists to manage the numerous processes of a newsroom.
“Superdesk is more than a news management tool built for journalists by journalists, for creation, archiving, distribution, workflow structure, and editorial communications,” Romero said. “We at Sourcefabric also see it as the cornerstone of the new common open-source code base for quality, professional journalism.”
NewsML-G2 is a blueprint that provides all the concepts and business logic for a news architecture framework. It also standardises the handling of metadata that ultimately enables all types of content to be linked, searched, and understood by end users. NewsML-G2 metadata properties are designed to comply with RDF, the data model of the Semantic Web, enabling the development of new applications and opportunities for news organisations in evolving digital markets.
There were several important factors that led Sourcefabric, Europe’s largest developer of open source tools for news media, to the decision to use NewsML-G2:
- IPTC has established credibility as a consortium of the world’s leading news agencies and publishers. Additionally, NewsML-G2 has been adopted by some of the world’s major news agencies as the de facto standard for news distribution. “Why reinvent the wheel?” Romero said. “IPTC’s standards are based on years of experience of top news industry professionals.”
- NewsML-G2 met the requirements of Sourcefabric’s content model: granularity, structured data, flexibility, and reusability.
- NewsML-G2 met the requirements for Sourcefabric’s design principles:
– Every piece of content is a News Item.
– Content types are text, image, video, audio.
– Content profiles support the creation of story profiles and templates.
– It can format items for content packages and highlights.
– Content can be created once and used in many places. Sourcefabric refers to this as the COPE model: Create Once, Publish Everywhere. This structure enables and frees the content to be used seamlessly and automatically across multiple channels and devices, and in a variety of previously impossible contexts.
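The Create Once, Publish Everywhere principle can be sketched in a few lines. The example below is an illustration only: the `NewsItem` fields, the two renderers and the channel names are assumptions made for this sketch, not Superdesk's code or the NewsML-G2 schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from html import escape


class ContentType(Enum):
    TEXT = "text"
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"


@dataclass
class NewsItem:
    # Hypothetical fields for illustration only.
    headline: str
    body: str
    content_type: ContentType = ContentType.TEXT
    subjects: list = field(default_factory=list)


def render_web(item: NewsItem) -> str:
    """Render the item as a small HTML fragment."""
    return f"<article><h1>{escape(item.headline)}</h1><p>{escape(item.body)}</p></article>"


def render_push(item: NewsItem, limit: int = 60) -> str:
    """Render the item as a short notification, truncating long headlines."""
    text = item.headline
    return text if len(text) <= limit else text[: limit - 3] + "..."


def publish_everywhere(item, channels):
    """Render one stored item for every channel without editing the item."""
    return {name: render(item) for name, render in channels.items()}


if __name__ == "__main__":
    story = NewsItem("Bike thefts rise", "Reports doubled.")
    print(publish_everywhere(story, {"web": render_web, "push": render_push}))
```

The item is written once and stays untouched; each channel only adds its own rendering function.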
Sourcefabric also stresses the importance of metadata, the building blocks of “structured journalism.” As explained by Romero in a recent blog post: “The foundation of structured journalism is built on the ability to access and locate enormous amounts of data from all over the web and from within the system itself (i.e. content from previous articles). Without providing valuable metadata for each of your stories and subsequent pieces of visual collateral, finding key information located inside of them becomes infinitely more difficult.”
News organizations that use Superdesk include the Australian news agency AAP and the Norwegian news agency NTB. Other IPTC standards supported by the platform are NewsML 1, ninjs, NITF, Subject Codes and IPTC 7901.
Source code repositories are publicly available on GitHub: https://github.com/superdesk
About Sourcefabric: Sourcefabric’s mission is to make professional-grade technology available to all who believe that quality independent journalism has a fundamental role to play in any healthy society. They generate revenue through IT services – managed hosting, SaaS, custom development, integration into existing workflows – as well as project-by-project funding, grants and donations.
The International Press Telecommunications Council (IPTC) is close to finalizing a new recommendation for video standards: the IPTC Video Metadata Hub.
The Video Metadata Working Group, which is comprised of members worldwide from news organisations, vendors and experts in the metadata field, is planning to vote on a recommendation of the Video Metadata Hub (VMD Hub) at the IPTC Autumn Meeting, 24 – 26 October 2016, in Berlin. The final Draft #4 has been published for a last round of reviews: http://dev.iptc.org/Video-Metadata.
Because there are several different existing standards for video – for compressing video and audio, file formats and different schemas of metadata properties – IPTC is presenting a “hub” recommendation that covers many use cases and exchange of metadata over multiple standards.
The VMD Hub is comprised of a single set of video metadata properties, which can be expressed by multiple technical standards (namely XMP for metadata embedded into binary video files, and EBU Core for non-embedded metadata stored in sidecar files). These properties can be used for describing the visible and audible content, rights data, administrative details and technical characteristics of a video.
Likewise, the VMD Hub supports workflow, exchange of metadata, and search functions across other existing standards, and will include mapping to Apple Quicktime, PBCore, MPEG7 and Schema.org, and perhaps more in the future.
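The hub approach, one set of properties translated into the field names of several target formats, can be sketched as a lookup table. The property names and target field names below are illustrative assumptions, not entries copied from the IPTC mapping tables; the function and table names are likewise hypothetical.

```python
# Hub property name -> field name per target format. All names below are
# illustrative assumptions, not copied from the IPTC mapping tables.
HUB_MAPPINGS = {
    "title": {"xmp": "dc:title", "schema_org": "name"},
    "description": {"xmp": "dc:description", "schema_org": "description"},
    "copyright_notice": {"xmp": "dc:rights"},
}


def export(hub_record: dict, target: str):
    """Translate a hub record to one target's field names.

    Returns the translated record plus the hub properties that have no
    mapping for this target, so nothing is dropped silently.
    """
    translated, unmapped = {}, []
    for prop, value in hub_record.items():
        field_name = HUB_MAPPINGS.get(prop, {}).get(target)
        if field_name is None:
            unmapped.append(prop)
        else:
            translated[field_name] = value
    return translated, unmapped


if __name__ == "__main__":
    record = {"title": "Harbour at dawn", "copyright_notice": "(c) Example"}
    print(export(record, "xmp"))
    print(export(record, "schema_org"))
```

With a hub, supporting a new target format means adding one column of mappings instead of writing a converter between every pair of formats.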
“Users of videos of different standards told IPTC they need a common ground in metadata for efficient workflows,” said Michael Steidl, Managing Director of IPTC. “This is what we deliver now with the Video Metadata Hub.”
The IPTC Autumn Meeting will feature a Video Day on 25 October. In addition to the presentation about the VMD Hub, speakers from video makers, video suppliers, video content publishers and system vendors will discuss how video workflows can be improved.
For information about attending the IPTC Autumn Meeting and Video Day, contact us.
IPTC has secured funding and laid the foundation for the language and technical requirements of its EXTRA Project, a rules-based classification system, as reported at IPTC’s Summer Meeting 2016 by Stuart Myles, project lead and IPTC Chairman of the Board.
EXTRA is the EXTraction Rules Apparatus, a multilingual open-source platform for rules-based classification of news content. EXTRA will allow newsrooms to automatically annotate news content with high-quality metadata subjects using a predefined set of rules. IPTC was awarded a grant from the first round of Google’s Digital News Initiative Innovation Fund to build and freely distribute the initial version of EXTRA.
The EXTRA project team has delivered a road map for the project to Google’s Digital News Initiative, and is finalizing its plans for language requirements and rules, as well as technical requirements and licensing. IPTC will approach existing open source communities, linguists and programmers to facilitate development.
For easy adoption and consistency in the news industry, IPTC is creating rules for tagging documents with its industry standard Media Topics vocabulary, used widely by publishers. IPTC plans to provide example rules for at least two of the languages supported by Media Topics: Arabic, English, French, German and Spanish.
“For small to medium size publishers who are dissatisfied with hand-tagging their content or grappling with complex machine-learning tools, EXTRA is an open-source news classification engine that will let you easily apply rich metadata to breaking news content,” said Myles. “Unlike manual techniques, which can be slow and inconsistent, or traditional statistical methods, which aren’t suitable for breaking news, EXTRA’s rules-based classification will provide fast, consistent and relevant metadata to enrich search, advertising and content analytics.”
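To make the notion of rules-based classification concrete, here is a minimal sketch of a keyword rule engine. It does not reflect EXTRA's actual rule language or engine, which this article does not describe; the `Rule` structure, the matching logic and the topic labels are assumptions for illustration, and the labels are placeholders, not real Media Topics codes.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    topic: str           # label to apply; a placeholder, not a real Media Topics code
    all_of: tuple = ()   # every term must appear
    any_of: tuple = ()   # at least one term must appear, if any are given
    none_of: tuple = ()  # no term may appear


def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def classify(text, rules):
    """Return the topics of every rule the text satisfies, in rule order."""
    words = tokenize(text)
    topics = []
    for rule in rules:
        if not all(term in words for term in rule.all_of):
            continue
        if rule.any_of and not any(term in words for term in rule.any_of):
            continue
        if any(term in words for term in rule.none_of):
            continue
        topics.append(rule.topic)
    return topics


if __name__ == "__main__":
    rules = [
        Rule("example:crime", any_of=("theft", "burglary")),
        Rule("example:ice-hockey", all_of=("hockey",), none_of=("field",)),
    ]
    print(classify("Bike theft reports rise in Gothenburg", rules))
```

Because each rule is explicit, the same text always yields the same topics and an editor can see exactly why a tag was applied, which is the consistency argument made in the quote above.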
IPTC invites other parties to join the development of the EXTRA project. To get involved, contact Myles at firstname.lastname@example.org.
At our three-day summer meeting in Stockholm, 13-15 June 2016, about 30 IPTC member delegates and 10 invited experts networked and discussed emerging issues and challenges affecting technology and the news industry.
Thanks to the several news agencies and vendors who gave examples of IPTC standards as the backbone of their news exchange systems and products:
Mittmedia on its use of APIs, automated creation of text news by text robots, and data-driven journalism; Profium on the use of multicast; VG on integrating newsrooms with product and technology; Infomaker on its Newspilot publishing platform; Swedish news agency TT on their Toolbox and development of digital content, strategies, and new business; Sourcefabric on its Superdesk publishing platform; Journalism++ on robots – when, where and why to start; Fotoware on its digital asset management software FotoWeb. Special thanks to Johan Lindgren and TT for helping us navigate Stockholm, as well as coordinate these presentations.
We approved SportsML 3.0, a major upgrade of the premier open standard for sports data, and NewsML-G2 version 2.23 to further refine the most widely-used standard for representing news and events across all media types.
We also talked about ideas for marketing the IPTC and ways to grow our membership. We will increase our discussion of relevant news and events, as well as information about how the IPTC’s work is applied by news companies every day. We plan to produce more hands-on information about photo metadata as, judging from the traffic to our website, that’s something a lot of people are looking for.
A new and exciting way to get involved with the IPTC is EXTRA: The EXTraction Rules Apparatus. We received a grant from Google’s Digital News Initiative to build and freely distribute a multilingual open-source platform for rules-based classification of news content. If that sounds interesting, then get in touch to learn more.
Please consider joining us for our Autumn Meeting in Berlin (24 – 26 October 2016), which will feature a video workshop day, on 25 October. We plan to launch the Video Metadata Hub recommendation, a single set of video metadata properties covering the entire video workflow, including mappings and guidelines for many existing video standards.
Chairman, IPTC / Director of Information Management, Associated Press
Stockholm photo: Jill Laurinaitis
IPTC’s Summer Meeting takes place 13 June to 15 June in Stockholm. IPTC members, working groups and parties gather three times a year to discuss emerging industry topics, updates to standards and other IPTC projects.
The Summer Meeting will focus on SportsML 3.0, a new major version of the standard; a final draft of the Video Metadata Hub recommendation; and the EXTRA project, funded by a Google DNI grant.
Presentations in Stockholm will be given by IPTC members and invited guests.
The full list of topics and presenters can be found at: https://iptc.org/events/summer-meeting-2016/
The International Press Telecommunications Council’s (IPTC) Photo Metadata Conference 2016, on May 26 in Zagreb (Croatia), will focus on how to “Keep Metadata Alive and Intact” throughout the life cycle of images.
Held annually since 2007, this day-long event will address how information can be properly retained when images are moved from one person or system to the next, or through archiving processes. Speakers leading the sessions – among the industry’s most respected experts in image and data management, digital preservation, information architecture, and photography – will show how metadata is produced, used and preserved in new and innovative ways.
The morning session will cover two topics, each with multiple presentations. “Protecting Metadata While Using Social Media” will discuss results of the IPTC Social Media Photo Metadata Test 2016, including the finding that many popular social media platforms remove most photo metadata when images are uploaded or downloaded; some major media companies will discuss how they protect their metadata through this process. “Strongly Attached Metadata: What You Need to Know,” featuring system vendors and speakers from photo businesses as well as a university, will cover how to apply and organize metadata efficiently to keep it alive in distribution chains.
“Much time and money is spent to protect metadata in in-house systems,” said Michael Steidl, managing director of IPTC and lead of its photo metadata workstream. “Therefore it is a business requirement to protect the descriptive data and rights information from getting lost. This conference will raise awareness and share knowledge about how to keep metadata alive.”
During the afternoon sessions, Steidl will present a close-to-final draft of the much-anticipated “IPTC Video Metadata Hub,” a new technical recommendation which has been in development by IPTC since 2014 and is usable across many existing video standards. Sarah Saunders, expert in procurement and implementation of digital asset management systems, will also present the “Cultural Heritage Photo Metadata Panel” for Adobe software, which supports a rich set of metadata for cultural heritage objects shown in images. The panel can be downloaded for free and installed on a computer.
The Photo Metadata Conference 2016 will be held again in conjunction with the annual CEPIC Congress, in Zagreb (Croatia). Photographers, small and large photo agencies and libraries, and trade associations from the photo business are all encouraged to attend. Registration is required, either as a participant of the CEPIC Congress, or through the Conference’s registration form.
See detailed agenda and speakers.