IPTC has secured funding and the foundation for language and technical requirements for its EXTRA Project – a rules-based classification system, as reported at IPTC’s Summer Meeting 2016 by Stuart Myles, project lead and IPTC Chairman of the Board.
EXTRA is the EXTraction Rules Apparatus, a multilingual open-source platform for rules-based classification of news content. EXTRA will allow newsrooms to automatically annotate news content with high-quality metadata subjects using a predefined set of rules. IPTC was awarded a grant from the first round of Google’s Digital News Initiative Innovation Fund to build and freely distribute the initial version of EXTRA.
The EXTRA project team has delivered a road map for the project to Google’s Digital News Initiative, and is finalizing its plans for language requirements and rules, as well as technical requirements and licensing. IPTC will approach existing open source communities, linguists and programmers to facilitate development.
For easy adoption and consistency in the news industry, IPTC is creating rules for tagging documents with its industry standard Media Topics vocabulary, used widely by publishers. IPTC plans to provide example rules for at least two of the languages supported by Media Topics: Arabic, English, French, German and Spanish.
“For small to medium size publishers who are dissatisfied with hand-tagging their content or grappling with complex machine-learning tools, EXTRA is an open-source news classification engine that will let you easily apply rich metadata to breaking news content,” said Myles. “Unlike manual techniques, which can be slow and inconsistent, or traditional statistical methods, which aren’t suitable for breaking news, EXTRA’s rules-based classification will provide fast, consistent and relevant metadata to enrich search, advertising and content analytics.”
IPTC invites other parties to join the development of the EXTRA project. To get involved, contact Myles at email@example.com.
The International Press Telecommunications Council (IPTC) will use a grant from the first round of Google’s Digital News Initiative Innovation Fund to build and freely distribute an initial version of EXTRA: The EXTraction Rules Apparatus, a multilingual open-source platform for rules-based classification of news content.
EXTRA will be a classification system for annotating news documents with high-quality subject tags. Such tags will allow publishers to deliver a variety of valuable services including content recommendations, improved advertising targeting and subject-specific content streams, such as alerts and topic pages.
“By creating a freely available rules-based classification engine, IPTC will help publishers to enhance their content with all sorts of metadata services, including enriched search, intelligent recommendations and precise analytics,” said Stuart Myles, chairman of IPTC.
EXTRA will provide news publishers with several key capabilities: the ability to automatically categorize documents by subject (for example, terrorism, sports, names of celebrities); the ability to author classification rule sets tailored to existing taxonomies; and the ability to classify documents using the industry standard IPTC Media Topics taxonomy. Taxonomies are used by many news organizations to classify their content. Classification is used in various ways: to improve online news navigation by grouping and linking, to organize editorial workflows and to enrich search.
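To illustrate what rules-based classification means in general, here is a minimal sketch in Python. It is not EXTRA's rule language or implementation, neither of which is described here. The `Rule` structure, the `classify` function and the topic labels are hypothetical and are not actual Media Topics codes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: tag `topic` when all `all_of` terms appear, at least
    one `any_of` term appears, and no `none_of` term appears.
    Empty conditions are ignored."""
    topic: str
    any_of: frozenset = frozenset()
    all_of: frozenset = frozenset()
    none_of: frozenset = frozenset()


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def classify(text, rules):
    """Return the sorted list of topics whose rules match the text."""
    tokens = tokenize(text)
    topics = set()
    for rule in rules:
        if rule.all_of and not rule.all_of <= tokens:
            continue  # a required term is missing
        if rule.any_of and not (rule.any_of & tokens):
            continue  # none of the alternative terms is present
        if rule.none_of & tokens:
            continue  # an excluded term is present
        topics.add(rule.topic)
    return sorted(topics)


# Illustrative rules only; the topic labels are invented for this example.
RULES = [
    Rule("sport", any_of=frozenset({"match", "league", "goal", "tournament"})),
    Rule("politics", any_of=frozenset({"election", "parliament", "minister"})),
    Rule("sport/football", all_of=frozenset({"football"}),
         none_of=frozenset({"american"})),
]

if __name__ == "__main__":
    print(classify("The minister attended the football league final.", RULES))
```

Because each tag comes from an explicit rule, the same text always produces the same tags, and a rule set can be written for an existing taxonomy without training data. A production system would also need to handle matters such as word inflection, phrases and multiple languages, which this sketch ignores.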
So that EXTRA is immediately useful to the news publishing community, IPTC will create different suites of rules in two languages for classifying news documents into the IPTC Media Topics taxonomy, an industry-standard taxonomy used by several leading news providers.
“We hope that the EXTRA project will support a migration in the news publishing community towards a common industry-wide open source platform,” said Michael Steidl, managing director of IPTC. “We believe that a freely available document classification platform will provide great benefit to small-to-medium sized publishers.”
IPTC invites other parties to join the development of EXTRA.
Contact firstname.lastname@example.org to learn more, including how you can get involved.
Over €27m has been offered by Google to 128 projects, large and small, from 23 countries across Europe – each designed to advance innovation in the news industry. DNI is a collaboration between Google and news publishers in Europe to support high quality journalism and encourage a more sustainable news ecosystem through technology.
About IPTC: The IPTC, based in London, brings together the world’s leading news agencies, publishers and industry vendors. It develops and promotes efficient technical standards to improve the management and exchange of information between content providers, intermediaries and consumers. The standards enable easy, cost-effective and rapid innovation. Visit www.iptc.org and follow on Twitter: @IPTC
IPTC Releases Results of 2016 Social Media Sites Photo Metadata Test
Important image metadata is not retained in images after upload to some of the most popular social media sites, according to a study by the International Press Telecommunications Council (IPTC). The missing data includes key copyright and identification information as well as descriptive data about the image.
The IPTC, a consortium of over 50 news agencies and media companies, sets international technical standards for news exchange, including metadata embedded in image files. The recent Social Media Sites Photo Metadata Test repeats a survey conducted in 2013; while improvements are noted, some sites scored lower this time around.
The Social Media Sites Photo Metadata Test evaluated 15 top social media sites and checked whether embedded metadata was retained and displayed after upload to the sites and after download of various versions of the image. The results are displayed at www.embeddedmetadata.org/testresults.
Only one social media site, Behance, received favorable results for retaining and displaying embedded data. A few systems retained embedded metadata but failed to use it when displaying metadata on the web site. Ten sites removed at least some metadata when images were downloaded to a desktop environment.
“There are many important reasons to embed and preserve metadata – to protect copyrights, ensure proper licensing, track image use, smooth workflow, and make them searchable on- or offline,” said Michael Steidl, Managing Director of IPTC. “If users provide captions, dates, a copyright notice and the creator within their images, that data shouldn’t be removed when sharing them on social media websites without their knowledge.”
There may be several reasons social media services remove metadata – and some may not be intentional. Test results showed that in some cases, when images were downloaded to a desktop environment, the metadata was preserved if the size of the image remained unchanged. But if the image was rescaled, the metadata was stripped. “The quality assurance of these sites might not be aware that their software strips metadata inadvertently,” said Steidl.
“Because many of the social media sites are essentially free, users become the product, and not necessarily the customers,” said David Riecks, a photographer and metadata consultant who owns ControlledVocabulary.com and worked on the test. “Users are often not aware of these practices. There should be a sweet spot between these social sites preserving all metadata and removing it all. I’d like to see more engineers working together to find solutions.”
The Embedded Metadata Manifesto was launched by IPTC in 2011 to draw attention to the importance of retaining important data embedded in image files. The website, www.embeddedmetadata.org, also includes the Embedded Metadata Manifesto’s five guidelines for how metadata should be handled and preserved in digital media.
About IPTC: The IPTC, based in London, brings together the world’s leading news agencies, publishers and industry vendors. It develops and promotes efficient technical standards to improve the management and exchange of information between content providers, intermediaries and consumers. The standards enable easy, cost-effective and rapid innovation and include the Photo Metadata standard, the news exchange formats NewsML-G2, SportsML-G2 and NITF, rNews for marking up online news, the rights expression language RightsML, and NewsCodes taxonomies for categorizing news. Visit www.iptc.org and follow on Twitter: @IPTC
Extensis, a leading developer of software and services for creative professionals and workgroups, joins IPTC to extend the company’s commitment to advancing standards designed to make working with metadata easier.
“Extensis as system vendor has taken the essential role of enabling companies managing photos to make efficient use of IPTC’s widely used photo metadata standard”, said Michael Steidl, IPTC managing director and lead of the photo metadata work. “IPTC welcomes Extensis as new member of our organization; we will work jointly on improving professional photo workflows”, he added.
Read the Extensis press release.
IPTC develops and maintains a rich set of standards for media exchange. Now you can easily track the latest updates to all IPTC standards via the new Twitter feed @IPTCupdates. Updated standards are also flagged on the standards overview page.
The first tracked update is the latest modification of the Media Topic NewsCodes.
Welcome to our newly launched website! After a few months of development and content evaluation we can proudly present a faster, more lightweight, more business-focused and, most importantly, mobile-ready website.
The IPTC is a volunteer-driven membership organization and it’s important that we are easily approachable — both literally and figuratively. Over the last few years, many of you have told us that you found it too difficult to locate information about the IPTC, our standards or even how to get in touch with us. So we fixed all that.
Focus on business value.
We have the full roster of IPTC standards explained, in brief and with a clear focus on the business value of each standard. Rapid innovation and agile product development cycles at web scale are practically impossible without the underpinning of solid technical standards — the IPTC delivers those, for a variety of use cases in today’s publishing and media environment.
In short, our new website is aimed at those of you who are looking for those IPTC standards, for general information about the organization and for the benefits of being a member. (We maintain a separate website for developers with in-depth technical information.)
We’d also like to say thanks to Kevin and Jonathan, our design and development team at Iron to Iron; they really did a great job and are a pleasure to collaborate with.
Have a look around, let us know if something looks wrong or if you’re missing anything. And do consider joining us — we’d love to have you on board.
(Lead of the website project team)
Adding metadata to images costs money, but developments in the image industry indicate a real return for those who invest in their metadata workflow now. In an increasingly automated workflow, metadata drives distribution and management in all sectors. At the IPTC Photo Metadata Conference 2015, participants will hear from practitioners and game changers in rights management, software and user interfaces about how quality metadata improves business. The conference will be held on 4 June 2015 in Warsaw, Poland, in conjunction with the CEPIC Congress 2015. Find more at phmdc.org
IPTC announced a new version of its Photo Metadata Standard, the most widely used standard to describe photos. It allows users to add precise and reliable data about people, products, locations and artwork shown in an image, and provides an improved and flexible way to express rights associated with a picture. IPTC is the world’s leading standards body for the news media and aims to simplify the distribution of information. The specification of this standard can be downloaded from the Photo Metadata Standard section.