Last week brought IPTC members together for our twice-yearly Face-to-Face Meeting to discuss news credibility, taxonomies and controlled vocabularies, updates in sports standards and much more!
This year’s IPTC Spring Meeting was in Lisbon, Portugal, and over 40 IPTC member delegates, member experts and invited guests gathered for three days to discuss all the latest developments in news and media technology.
On Monday, IPTC Chair and Director of Information Management for Associated Press Stuart Myles gave a great introduction and overview of what was to come in the meeting. After everyone introduced themselves, Stuart discussed some changes that the IPTC Board has been thinking about, including looking at updating the Mission and Vision of the organisation to reflect how we operate in 2019.
Then Robert Schmidt-Nia from dpa Deutsche Presse-Agentur introduced their C-POP project (in collaboration with STT and the Sanoma group in Finland), which follows on from the Performing Content work we saw at the previous meeting in Toronto. It was interesting to hear about the agency’s shift in focus from a strict business-to-business model to a “B2B2C” model: thinking about what consumers need and how agencies can help publishers deliver on the needs of readers and subscribers, ideally using feedback from publishers to agencies on how well their content is performing against real metrics like loyalty and subscription revenue. IPTC will be involved in the C-POP project, so you can expect to hear more about this in the future.
On the same topic, Andy Read from BBC gave an overview of the “Telescope” internal measurement tool, showing how BBC staff can view in real time how their content is being consumed by region, topic or device.
James Logan from the BBC and Brendan Quinn of IPTC gave an overview of IPTC’s work with news trust and credibility projects The Trust Project and the Journalism Trust Initiative. We decided at the Autumn 2018 Meeting that IPTC wouldn’t create its own standard around news credibility, disinformation and “fake news”, but that we would work with existing groups and help them to incorporate their standards in IPTC’s work. With The Trust Project, that has been going well, and we are almost ready to publish some best practices on implementing the Trust Project’s Trust Indicators in NewsML-G2 content. Trust Project indicators are already used in schema.org markup by over 120 news providers so it’s great to see such strong uptake.
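To give a flavour of what that schema.org markup looks like: the Trust Project contributed properties such as `publishingPrinciples`, `correctionsPolicy` and `ethicsPolicy` to the schema.org `NewsMediaOrganization` type. Here is an illustrative (not normative) sketch of the JSON-LD a publisher might emit – the organisation name and all URLs below are invented for the example:

```python
import json

# Illustrative JSON-LD for a news organisation exposing Trust Project
# indicators via schema.org properties. All names and URLs are hypothetical.
org = {
    "@context": "https://schema.org",
    "@type": "NewsMediaOrganization",
    "name": "Example News",
    "publishingPrinciples": "https://example.com/standards",
    "correctionsPolicy": "https://example.com/corrections",
    "ethicsPolicy": "https://example.com/ethics",
    "unnamedSourcesPolicy": "https://example.com/sources",
}

markup = json.dumps(org, indent=2)
print(markup)
```

The forthcoming IPTC best practices will cover how to carry the same indicators in NewsML-G2 content.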
Separately we have been working with Reporters Sans Frontières’ Journalism Trust Initiative, which is at an earlier stage and is looking at documenting general standards for trustworthy and ethical journalism. IPTC is part of the JTI’s Technical Task Force, which is working with the drafting teams on making their statements specific enough to be answered with data and indexed by machines. Hopefully it will end up with indicators similar to the Trust Project’s.
With both news credibility projects, some questions still need to be addressed, such as assessing the credibility of claims (when a news organisation says it is trustworthy, how can you trust it?), and how these trust indicators work in a multi-provider workflow: if a news agency sends some content to a publisher who then merges it with original reportage, who determines the trust indicators that are attached to the final story? There is definitely a lot more work to do!
Joaquim Carreira from local agency Lusa showed us the “Combate Às Fake News” project focussing on media literacy and helping readers to know what to look for, including the idea of a “nutrition label” for news content looking at criteria such as factuality, readability and use of emotional language.
The day was rounded off with Johan Lindgren of Swedish agency TT presenting the recent work of IPTC’s Sports Content Working Group. The group has recently been tidying up the spec and incorporating suggestions for changes, plus looking at eSports and Chess as two non-traditional sports that are both seeing an increase in interest – in the case of eSports, it is becoming a huge industry. Our tests showed that in simple cases eSports results can be addressed with existing SportsML 3 structures, but to handle more detailed play-by-play results we may need to at least introduce a new controlled vocabulary. Please let us know if you would like to implement SportsML for eSports!
Johan also presented the draft of SportsML 3.1 to be voted on by the IPTC Standards Committee.
Stay tuned for an update on Days 2 and 3!
This report was presented by Stuart Myles, IPTC Chairman, at the IPTC Annual General Meeting in Toronto, Canada on 17 October 2018.
IPTC has had a good year – the 53rd year for the organization!
We’ve updated key standards, including NewsML-G2, the Video Metadata Hub and the Media Topics, as well as launching RightsML 2.0, a significant upgrade in the way to express machine processable rights for news and media.
Of course, IPTC standards are a means, not an end. The value of the standards is the easier exchange, consumption and handling of news and media by organizations large and small around the world. So it is important that we continue to focus on making our standards straightforward to use and have them adopted as widely as possible. I think we are making progress on the usability front, such as moving away from zip’d PDFs towards actual HTML web pages for documentation of NewsML-G2. Over the last year, we’ve continued to work with other organizations – W3C, Europeana and MINDS – to develop standards, increase adoption – and, perhaps most importantly, to open up IPTC to other perspectives. And we have had a huge win in the recognition of key photo metadata by Google Images. But we clearly need to do more for both usability and adoption. During the course of this meeting, we’ve had some good discussion about what more we can do in both areas and I encourage all members to help spread the word about IPTC standards, and suggest ways we can accelerate adoption.
Of course, the nature of news and media continues to evolve. On the one hand, new forms of storytelling are emerging, such as Augmented Reality and Virtual Reality. Equally, the use of data to power stories continues to increase, in both data-driven and data-supported stories. By data-driven stories, I mean journalists reviewing large databases of information and creating stories based on the trends they find. By data-supported stories, I mean content creators using visually interesting graphics to support their content. The automated production, curation and consumption of news and media is likely to increase for the foreseeable future, driven by both technological improvements and the seductive economics of replacing people with algorithms. And it is not only economics which are driving these changes and challenges, just as robot journalists are no longer limited to fill-in-the-blank text stories. Synthetic media – such as “deep fakes” – can produce increasingly convincing photo, video and audio stories that are indistinguishable from “real” media. Inevitably, the existence and debunking of these fakes will be used to deny legitimate reporting, with the implication of a continued erosion of trust in media. All of these trends – AR, VR, data-powered journalism and dealing with trust, credibility and misinformation – are topics which IPTC has discussed over the last few years, but we have not yet developed any tracks of work to address them. In part, this is because these are, by definition, outside the areas that our member organizations traditionally deal in, and so are quite difficult to tackle in terms of establishing standards.
However, even within the context of standards, IPTC is opening up to new forms of experimentation. As we heard on Monday, the joint project between IPTC and MINDS, to allow for the identification of audience and interest metadata, has led to the introduction of structures within NewsML-G2 to support rapid prototyping and experimentation. I see this as a positive move, with great potential to accelerate the work we do and to help keep it lightweight and relevant.
Of course, IPTC has had significant changes of its own over the last year. We bid goodbye to Michael Steidl as our Managing Director of 15 years, and welcomed Brendan Quinn as our new Managing Director this summer. We’re grateful that we continue to benefit from Michael’s skills and experience, as he has remained the Chairman of the Photo and Video Working Groups. And I think that Brendan has made a great start in his new role in helping us keep the IPTC moving forward.
As part of the handover from Michael to Brendan, we decided to scan a lot of the old paper documents (link available to members only), including various types of IPTC newsletter, dating back to 1967, two years after the organization was founded. I thought I would look back to what IPTC was up to in the year 2000, the year I became a delegate to the IPTC, back when I worked for Dow Jones.
And there I am in the photo at the top of the page. Or, at least, the back of my head. Some things are quite reminiscent of this week’s meeting – the birth of NewsML, a focus on improved communications, cooperation with other organizations e.g. MPEG-7.
Then I thought I would look back on IPTC in 1968, the year I was born:
Some things were similar to today – such as a focus on fine technical details like Alphabet Number 5, and a plan to go to Lisbon the following year for a meeting. However, the focus in those days was mainly on lobbying against tariffs and satellite monopolies.
So I think it is fair to say that the IPTC has never been just a standards body. It is also, more broadly, a community of practice. We are a group of people from around the world who have a common interest in news and media technology. The process of sharing information and experiences with the group, through these face to face meetings and the online development of standards, means that the members of IPTC learn from each other, and so have an opportunity to develop professionally and personally. I hope you will agree that yesterday’s discussion of news search and classification was an excellent example of exchange of experiences, both good and bad, which can help many of us avoid problems and seize opportunities, and so accelerate our work.
I think it is helpful for us to recognize that IPTC is a community which continues to evolve, as the interests, goals and membership of the organization change. I’m confident that – working together – we can continue to reshape the IPTC to better meet the needs of the membership and to move us further forward in support of solving the business and editorial needs of the news and media industry. I look forward to working with all of you on addressing the challenges in 2019 and beyond.
This is the report of Day 3 of the IPTC Autumn 2018 Meeting in Toronto. See the report from Day 1 and the report from Day 2. All the presentations are available to IPTC members in the IPTC Members Only Zone.
Day 3 of IPTC Autumn Meetings always includes the Annual General Meeting, where all Voting Members can have their say in the future of the organisation. This time new Managing Director Brendan Quinn gave his first MD’s report, alongside Stuart Myles’ Chairman’s Report (which will be posted to the IPTC blog soon). Materials from the AGM are available to members in the IPTC Members Only Zone.
Rounding out the discussions for the three days, we had some broad-ranging and future-facing conversations regarding News Credibility projects, where Stuart Myles took us on a tour of the wide range of projects and initiatives around misinformation, the credibility of news and news sources, and the perceived problems of “fake news.” IPTC or IPTC members are helping out several organisations in their efforts in this area, such as the W3C Credible Web Community Group and the Journalism Trust Initiative.
We also had a discussion on funding opportunities and potential IPTC projects, which is an internal discussion involving members only.
Lastly, speaking about the future, we had Michael Young from Civil Media speak to us about their plans to use blockchain technologies to power small newsrooms and fulfil their broad goal to “power sustainable journalism throughout the world.” A lot of focus has been on Civil’s Initial Coin Offering, which closed underfunded and will be returning investors’ money, but they have many other activities, including a suite of WordPress-based plugins allowing news providers to join the Civil ecosystem and pledge openness, fairness and transparency according to the Civil Foundation’s constitution. Mike explained how blockchain based voting and decisions mean that members can be rewarded for pointing out breaches of the constitution, and bad actors can be punished or even removed from the network entirely.
The event ended with a few of us attending the Canadian Journalism Foundation’s event with journalism pundits Vivian Schiller, Jeff Jarvis, Jay Rosen and Matthew Ingram, talking about misinformation and misuse of social media (video recording available via the above link), and ten of us went on a networking and team bonding trip to Niagara Falls and to a local winery on the Thursday.
Overall it was a great Autumn Meeting which set the scene and built the foundation for many more great IPTC meetings to come!
This is the report of Day 2 of the IPTC Autumn 2018 Meeting in Toronto. See the report from Day 1 and the report from Day 3. All the presentations are available to IPTC members in the IPTC Members Only Zone.
Day 2 of the IPTC Autumn 2018 Meeting in Toronto was a deep dive into search and classification. Many of our members are working hard to make their content accessible quickly and easily to their customers, and user expectations are higher than ever, so search is a key part of what they do.
First up we had Diego Ceccarelli from Bloomberg talking through their search architecture. Users of Bloomberg terminals have very high expectations that they will see stories straight away: the system handles 16m queries and 2m new stories and news items per day, with requirements for a median query response time of less than 200ms and for new items to be available in search results in less than 100ms. And as Diego says, “with huge flexibility comes huge complexity.” For example, because customers expect to see the freshest content straight away, the system has no caching at all!
To achieve this, the Bloomberg team use Apache Solr – in fact they have 3 members of staff dedicated to working on Solr full-time, and have contributed a huge amount of code back to the project, including their machine-learning-based “learning to rank” module which can be trained to rank a set of search results in a nuanced way. Bloomberg also worked with an agency to develop open source code used to monitor a stream of incoming stories against queries, used for alerting. Other topics Diego raised included clustering of search results, balancing relevance and timeliness, crowdsourcing data to train ranking systems, combining permissions into search results, and more – a great talk!
Our heads already reeling with all the information we learned from Bloomberg, we then heard from another search legend, Boerge Svingen, one of the founders of FAST Search in Norway and now Director of Engineering at the New York Times. He spoke about how NYT re-architected their search platform to be based around Apache Kafka, a “distributed log streaming” platform that keeps a record of every article ever published on the Times (since 1851!) and can replay all of them to feed a new search node in around half an hour. The platform is so successful that it is used to feed the “headless CMS” (see yesterday’s report) based on GraphQL which is used to render pages on nytimes.com for all types of devices. Boerge and his team use Protocol Buffers as their schema to keep everything light and fast. More information in Boerge’s slide deck, available to IPTC members.
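The log-replay idea at the heart of that architecture can be shown with a toy sketch – this is not NYT code, and all the names below are invented. An append-only log keeps every article ever published, and a brand-new search node simply consumes the log from offset zero to rebuild its index:

```python
# Toy append-only log: every published article is an immutable entry,
# and any consumer can replay the full history from offset 0 to build
# fresh state. (Illustrative only; a real deployment uses Apache Kafka.)

log = []  # the "distributed log", here just an in-memory list

def publish(article_id, body):
    log.append({"id": article_id, "body": body})

def rebuild_index(from_offset=0):
    """Replay the log to build a fresh inverted index for a new search node."""
    index = {}
    for entry in log[from_offset:]:
        for word in entry["body"].lower().split():
            index.setdefault(word, set()).add(entry["id"])
    return index

publish("a1", "Transatlantic cable laid")
publish("a2", "Cable service begins")
index = rebuild_index()
print(sorted(index["cable"]))  # → ['a1', 'a2']
```

Because the log is the source of truth, any number of downstream consumers (search, the headless CMS, analytics) can be fed from the same replay, which is the appeal of the approach.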
Next up was Chad Schorr talking about search at Associated Press, discussing their Elastic implementation on Amazon Web Services. Using a devops approach based on “immutable infrastructure” meant that the architecture is now very solid and well-tested. Chad was very open and spoke about issues and problems AP had while they were implementing the project and we had a great discussion about how other organisations can avoid the same problems.
Then Robert Schmidt-Nia from DPA talked about their implementation of a content repository (in effect another “headless CMS”!) based on serialising NewsML-G2 into JSON using a serverless architecture based on Amazon Lambda functions, AWS S3 for storage, SQS queues and Elasticsearch. Robert told of how the entire project was built in three months with one and a half developers, and ended up with only 500 lines of code! It can now be used to provide services to DPA customers that could not be provided before, including subsets of content based on metadata such as all Olympics content.
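A minimal sketch of that serialisation step is below. The element names are simplified stand-ins, not the full namespaced NewsML-G2 schema: walk the XML tree and emit a JSON-friendly structure.

```python
import json
import xml.etree.ElementTree as ET

# Simplified stand-in for a NewsML-G2 item; real documents are namespaced
# and far richer. Illustrative only.
XML = """
<newsItem guid="urn:example:1">
  <itemMeta><versionCreated>2018-10-16T09:00:00Z</versionCreated></itemMeta>
  <contentMeta><headline>Olympics opening ceremony</headline></contentMeta>
</newsItem>
"""

def element_to_dict(elem):
    """Recursively convert an element (attributes, children, text) to a dict."""
    node = dict(elem.attrib)
    for child in elem:
        node[child.tag] = element_to_dict(child)
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    return node

doc = element_to_dict(ET.fromstring(XML))
print(json.dumps(doc, indent=2))
```

Once the content is JSON, storing it in S3 and indexing it in Elasticsearch becomes straightforward, which is presumably part of why the project stayed so small.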
Next, Solveig Vikene and Roger Bystrøm from Norway’s news agency NTB spoke about and gave a live demo of their new image archive search product. They demonstrated how photographers can pre-enter metadata so that they can send their photos to the wire a few seconds after taking them on the camera. Some functions like global metadata search and replace and a feature-rich query builder made their system look very impressive.
Veronika Zielinska from Associated Press spoke about AP’s rule-based text classification systems, showing the complexity of auto-tagging content (down to disambiguating between two US Republican Congressmen both called Mike Rogers!) and the subtlety of AP’s terms (distinguishing between “violent crime” events and the social issue of “domestic violence”) – and therefore the necessity of manually creating, and maintaining, a rules-based system.
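The flavour of such a hand-built system can be shown with a toy sketch – the rules, tags and context terms here are invented for illustration, not AP’s actual rules. Each rule combines a trigger phrase with context terms, which is how a rules-based tagger can disambiguate similar surface forms:

```python
# Toy rules-based classifier: a rule fires only if its trigger phrase
# appears AND at least one context term appears, which lets hand-built
# rules disambiguate similar wording. All rules below are invented.
RULES = [
    {"tag": "violent-crime",     "trigger": "assault",
     "context": {"arrested", "charged", "police"}},
    {"tag": "domestic-violence", "trigger": "domestic violence",
     "context": {"awareness", "shelter", "survivors"}},
]

def classify(text):
    lowered = text.lower()
    words = set(lowered.split())
    tags = []
    for rule in RULES:
        if rule["trigger"] in lowered and words & rule["context"]:
            tags.append(rule["tag"])
    return tags

print(classify("Two men were arrested after an assault downtown"))
# → ['violent-crime']
```

The cost, of course, is exactly what Veronika described: every new ambiguity means another rule to write and maintain.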
Stuart Myles then took us on a tour through AP’s automated image classification activities, looking at whether commercial tools are yet up to the task of classifying news content, the value of assembling good training sets but the difficulties in doing so, and the benefits of starting with a relatively small taxonomy that is easier for machine learning systems to understand.
Dave Compton talked us through Thomson Reuters Knowledge Items used by the OpenCalais classifier and how they use the PermID system to unify concepts across their databases of people, organisations, financial instruments and much more. Dave described how Knowledge Items are represented as NewsML-G2 Knowledge Items, and are mapped to Media Topics where possible.
On that subject, Jennifer Parrucci of the New York Times, and chair of the IPTC NewsCodes Working Group, gave an update on the latest activities of the group, including the ongoing Media Topic definitions review, adding new Media Topic terms after suggestions by the Swedish media industry, and work with the schema.org team on mapping between schema.org and Media Topics terms.
As you can see, it was a very busy day!
This is the report of Day 1 of the IPTC Autumn 2018 Meeting in Toronto. See the report from Day 2 and the report from Day 3. All the presentations are available to IPTC members in the IPTC Members Only Zone.
This week we are in Toronto for the IPTC Autumn Meeting. Unfortunately the weather is not as warm as it was last week but we are still enjoying ourselves immensely and learning a lot from each other!
All presentations are available to members on the members-only event page.
After an introduction from Chair Stuart Myles, we heard an update from Michael Steidl, chair of the Video Metadata and Photo Metadata Working Groups. Michael updated us on work promoting the IPTC Video Metadata Hub standard, talking to manufacturers and software vendors at events like IBC in Amsterdam, and pulling together use cases and success stories from existing users of the standard.
On the IPTC Photo Metadata Standard, Michael shared the news that Google Images now displays IPTC Photo Metadata, and covered the press attention we have received since that announcement. We are also working on new technical features in the standard, such as metadata for regions within images. We’re looking for use cases and requirements for storing metadata against regions, so if you have any input, please let Michael, or IPTC Managing Director Brendan Quinn, know!
Dave Compton of Refinitiv, formerly the Financial & Risk business of Thomson Reuters, chair of the NewsML-G2 Working Group, gave an update on recent progress and work towards NewsML-G2 version 2.28 which will be released soon. It will incorporate features for the requirements of auto-tagging systems and a new experimental namespace to be used for potential new updates to NewsML-G2 that aren’t yet ready to be added to the full specification.
The experimental extension to NewsML-G2 is already being put to use by Gerald Innerwinkler of APA and Robert Schmidt-Nia of DPA, who presented an update on a current project between IPTC and MINDS International looking at metadata for suggesting news stories to users based on psychological and emotional characteristics, plus properties like the likely timeliness for different types of user. Based on the Limbic Map concept from marketing theory, the new proposals are in testing right now.
Chair of the Sports Content Working Group, Johan Lindgren of TT in Sweden, presented an update on SportsML and the work on SportsJS, which is nearing a final version now that JSON Schema will soon support some new properties that we need in order to validate sports content.
Stuart Myles appeared again in his role as chair of the Rights Working Group, updating us on RightsML and where we can take it in the future, including the potential to use RightsML as the basis of blockchain-based rights management systems.
Then we had a focus on “new-generation editorial systems” including a great presentation from Peter Marsh of new IPTC member NEWSCYCLE Solutions on the history and state of the art of content management systems from Tandem-based SII workstations in the 1980s, all the way through to the current wave of headless CMSs as illustrated by this project by The Economist.
Stephane Guerrilot of AFP finished day one presenting AFP’s new-generation system, Iris, which enables AFP customers and partners to search for stories, video and images.
Stay tuned for a report on Day Two!
After a successful IPTC Spring Meeting in Athens and IPTC Photo Metadata Conference in Berlin (see Sarah Saunders’ write-up of the event), we’re currently hard at work planning the IPTC Autumn 2018 Meeting. This year’s Autumn meeting and AGM will be held from 15-17 October in Toronto, Canada.
We’re currently considering topics for the agenda and organisations to invite, so we’re inviting suggestions from members and the industry. Recent topics discussed at IPTC Meetings include 360-degree images, rights management, blockchain and the media industry, non-XML news standards such as NinJS and much more. We want to focus on how we can use technology and technical standards to make the news and media industry function more smoothly.
Registration opens in August. We look forward to seeing as many members as possible attending.
If you’re interested in joining IPTC so that you can attend, or if you’re interested in presenting some relevant work to some of the top technical specialists in the media industry, please submit a short abstract of your presentation topic to Brendan Quinn, IPTC Managing Director at email@example.com.
… the image business in a changing environment
By Sarah Saunders
The web is a Wild West environment for images, with unauthorised uses on a massive scale, and a perception by many users that copyright is no longer relevant. So what is a Smart Photo in this environment? The IPTC Photo Metadata Conference 2018 addressed the challenges for the photo industry and looked at some of the solutions.
Isabel Doran, Chair of UK image library association BAPLA, kicked off the conference with some hard facts. The use of images – our images – has created multibillion dollar industries for social media platforms and search engines, while revenues for the creative industry are diminishing in an alarming way. It has long been said that creators are the last to benefit from use of their work; the reality now is that creators and their agents are in danger of being squeezed out altogether.
Take this real example of image use: An image library licenses an image of a home interior to a company for use on their website. The image is right-click downloaded from the company’s site, and uploaded to a social media platform. From there it is picked up by a commercial blog which licenses the image to a US real estate newsfeed – without permission. Businesses make money from online advertising, but the image library and photographer receive nothing. The image is not credited and there is no link to the site that licensed the image legitimately, or to the supplier agency, or to the photographer.
Social media platforms encourage sharing and deep linking (where an image is shown through a link back to the social media platform where the image is posted, so is not strictly copied). Many users believe they can use images found on the web for free in any way they choose. The link to the creator is lost, and infringements, where found, are hard to pursue with social media platforms.
Tracking and enforcement – a challenge
The standard procedure for tracking and enforcement involves upload of images to the site of a service provider, which maintains a ‘registry’ of identified images (often using invisible watermarks) and runs automated matches to images on the web to identify unauthorised uses. After licensed images have been identified, the image provider has to decide how to enforce their rights for unauthorised uses in what can only be called a hostile environment. How can the tracking and copyright enforcement processes be made affordable for challenged image businesses, and who is responsible for the cost?
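The automated matching step typically relies on a perceptual fingerprint rather than an exact byte comparison, so that resized or re-compressed copies still match. A toy average-hash over a grey-scale pixel grid shows the idea – real tracking services use far more robust perceptual hashes and invisible watermarks, and the pixel values below are invented:

```python
# Toy "average hash": fingerprint an image by thresholding each pixel
# against the mean brightness, so small changes from re-compression
# leave the fingerprint mostly intact. Illustrative only; real services
# use far more robust perceptual hashes and invisible watermarks.

def average_hash(pixels):
    """pixels: 2D list of grey-scale values (0-255); returns a bit string."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def distance(h1, h2):
    """Hamming distance: how many fingerprint bits differ."""
    return sum(a != b for a, b in zip(h1, h2))

original = [[10, 200], [220, 30]]
recompressed = [[12, 198], [219, 28]]   # slightly altered copy
unrelated = [[200, 10], [30, 220]]

print(distance(average_hash(original), average_hash(recompressed)))  # → 0
print(distance(average_hash(original), average_hash(unrelated)))     # → 4
```

A registry stores fingerprints of licensed images and flags any crawled image whose fingerprint falls within a small distance threshold – which is cheap to run at scale, but leaves the hard (and costly) enforcement step to the rights holder.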
The Copyright Hub was created by the UK Government and now creates enabling technologies to protect copyright and encourage easier content licensing in the digital environment. Caroline Boyd from the Copyright Hub demonstrated the use of the Hub copyright icon for online images. Using the icon promotes copyright awareness, and the user can click on it for more information on image use and links back to the creator. Creating the icon involves adding a Hub Key to the image metadata. Abbie Enock, CEO of software company Capture and a board member of the Copyright Hub, showed how image management software can incorporate this process seamlessly into the workflow. The cost to the user should be minimal, depending on the software they are using.
Publishers can display the icon on images licensed for their web site, allowing users to find the creator without the involvement of – and risk to – the publisher.
Meanwhile, suppliers are working hard to create tracking and enforcement systems. We heard from Imatag, Copytrack, PIXRAY and Stockfood who produce solutions that include tracking and watermarking, legal enforcement and follow up.
Design follows devices
Images are increasingly viewed on phones and tablets as well as computers. Karl Csoknyay from Keystone-SDA spoke about responsive design and the challenges of designing interfaces for all environments. He argued that it is better to work from simple to complex, starting with design for the smartphone interface, and offering the same (simple) feature set for all environments.
Smart search engines and smart photos
Use of images in search engines was one of the big topics of the day, with Google running its own workshop as well as appearing in the IPTC afternoon workshop along with the French search engine QWANT.
Image search engines ‘scrape’ images from web sites for use in their image searches and display them in preview sizes. Sharing is encouraged, and original links are soon lost as images pass from one web site to the next.
CEPIC has been in discussion with Google for some time, and some improvements have been made, with general copyright notices more prominently placed, but there is still a way to go. The IPTC conference and Google workshop were useful, with comments from the floor stressing the damage done to photo businesses by use of images in search engines.
Attendees asked if IPTC metadata could be picked up and displayed by search engines. We at IPTC know the technology is possible, so the issue is one of will. Google appears to be taking the issue seriously. By their own admission, it is now in their interest to do so.
Google uses imagery to direct users to other non-image results, searching through images rather than for images. Users searching for ‘best Indian restaurant’ for example are more likely to be attracted to click through by sumptuous images than by dry text. Google wants to ‘drive high quality traffic to the web ecosystem’ and visual search plays an important part in that. Their aim is to operate in a ‘healthy image ecosystem’ which recognises the rights of creators. More dialogue is planned.
Search engines could drive the use of rights metadata
The fact that so few images on the web have embedded metadata (3% have copyright metadata according to a survey by Imatag) is sad but understandable. If search engines were to display the data, there is no doubt that creators and agents would press their software providers and customers to retain the data rather than stripping it, which again would encourage greater uptake. Professional photographers generally supply images with IPTC metadata; to strip or ignore copyright data of this kind is the greatest folly. Google, despite initial scepticism, has agreed to look at the possibilities offered by IPTC data, together with CEPIC and IPTC. That could represent a huge step forward for the industry.
As Isabel Doran pointed out, there is no one single solution which can stand on its own. For creators to benefit from their work, a network of affordable solutions needs to be built up; awareness of copyright needs support from governments and legal systems; social media platforms and search engines need to play their part in upholding rights.
Blueprints for the Smart Photo are out there; the Smart Photo will be easy to use and license, and will discourage freeloaders. Now’s the time to push for change.
Photo credit: Jill Laurinaitis
By Stuart Myles
Chairman of the Board of Directors, IPTC
IPTC holds face-to-face meetings in several locations throughout the year, although most of the detailed work of the IPTC is now conducted via teleconferences and email discussions. Our Annual General Meeting for 2017 was held in Barcelona in November. As well as being the time for formal votes and elections, the AGM is a chance for the IPTC to look back over the last year and to look ahead at what is in store. What follows is a slightly edited version of my remarks at IPTC’s AGM 2017 in Barcelona.
IPTC has had a good year – the 52nd year for the organization!
We’re continuing to work in partnership with other organizations, to maximize the reach and benefits of our work for the news and media industry. In coordination with CEPIC we organized the 10th annual Photo Metadata Conference, looking to the future of auto tagging and search, examining advanced AI techniques – and considering both their benefits and their drawbacks for publishers. With the W3C we have crafted the ODRL rights standard and are launching plans to create RightsML as the official profile of the ODRL standard, endorsed by both the IPTC and W3C.
We’ve also tackled problems that matter to the media industry with technology solutions which are founded on standards, but go beyond them. The Video Metadata Hub is a comprehensive solution for video metadata management that allows exchange of metadata over multiple existing standards. The EXTRA engine is a Google DNI sponsored project to create an open source rules based classification engine for news.
We’ve had some changes in the make-up of IPTC. Johan Lindgren of TT joined the Board. Bill Kasdorf has taken over as the PR Chair. And we were thrilled to add Adobe as a voting member of IPTC, after many years of working together on photo metadata standards. Of course, with more mixed emotions, we have also learnt that Michael Steidl, the IPTC Managing Director for 15 years, will retire next summer. As has been clear throughout this meeting and, indeed, every day between the meetings on numerous emails and phone calls, Michael is the backbone of the work of the IPTC. Once again, I ask you to join me in acknowledging the amazing contributions and dedication that Michael has shown towards the IPTC.
Later today, we will discuss in detail our plans to recruit a successor for the crucial role of Managing Director. And this is not the only challenge that the IPTC faces. We describe ourselves as “the global standards body of the news media”, saying that “we provide the technical foundation for the news ecosystem”. As such, just as the wider news industry is facing a challenging business and technical environment, so is the IPTC.
During this meeting, we’ve talked about some of the technical challenges – including the continuing evolution of file formats and supporting technologies, whilst many of us are still working to adopt the technologies from 5 or 10 years ago. We’ve also talked about the erosion of trust in media organizations and whether a combination of editorial and technical solutions can help.
But I thought I would focus on a particular shift in the business and technical environment for news that may well have a bigger impact than all of those. That shift can be traced back to 2014 which, by coincidence, is when I became Chairman of the IPTC. Last week, Andre Staltz published an interesting and detailed article called “The Web Began Dying in 2014, Here’s How”. If you haven’t read it, I recommend it. The article makes a number of interesting points and backs them up with numerous charts and statistics. I will not attempt to summarize the whole thing, but a few key points are worth highlighting.
Staltz points out that, prior to 2014, Google and Facebook accounted for less than 50% of all of the traffic to news publisher websites. Now those two companies alone account for over 75% of referral traffic. Also, through various acquisitions, Google and Facebook properties now share the top ten websites with news publishers – in the USA 6 of the 10 most popular websites are media properties. In Brazil it is also 6 out of 10. In the UK it is 5 out of 10. The rest all belong to Facebook and Google.
Both Facebook and Google reorganized themselves in 2014, to better focus on their core strengths. In 2014, Facebook bought WhatsApp and terminated its search relationship with Bing, effectively relinquishing search to Google and doubling down on social. Also in 2014, Google bought DeepMind and shut down Orkut, its most successful social product. This, along with the reorganization into Alphabet, meant that Google relinquished social to Facebook, allowing it to focus on search and – even more – artificial intelligence. Thus, each company seems happy to dominate its own massive part of the web.
But … does that matter to media companies? Well, Facebook tells publishers that if they want optimal performance on its platform, they must adopt Instant Articles. Meanwhile, Google requires publishers to use its Accelerated Mobile Pages or “AMP” format for better performance on mobile devices. And, worldwide, Internet traffic is shifting from the desktop to mobile devices.
Then, if you add in Amazon, Apple and Microsoft, it is clear that another huge shift is going on. All of the Frightful Five are turning away from the Web as a source of growth and instead turning to building brand loyalty via high-end devices. Following the successful strategy of Apple, they are all becoming hardware manufacturers with walled gardens. Already we have Siri, Cortana, Alexa and Google Home. But also think about the investments being made by these companies in AR and VR as ways to dominate social interactions, e-commerce and machine learning over the Internet.
So, just as news companies must confront these shifts in the global business and technology environment, so must the IPTC. During this meeting, we’ve talked about our initial efforts to grapple with metadata for AR, VR and 360-degree imagery. We’ve also discussed techniques which are relevant to news taxonomy and classification, including machine learning and artificial intelligence. At the same time, Facebook, Google and others are not totally in control, as they – along with Twitter – have found themselves having to explain the spread of disinformation on their platforms and coming under increased government scrutiny, particularly in the EU. So, all of us, whether we describe ourselves as news publishers or not, are dealing with a rapidly changing and turbulent information, technical and business environment.
What does this mean for IPTC? IPTC is a news technology standards organization. But it is also unique in that we are composed of news companies from around the world. We know from the membership survey that both of these factors – influence over technical solutions and access to technology peers from competitors, partners, diverse organizations large and small – are very important to current members. In order to prosper as an organization, IPTC needs to preserve these unique benefits to members, but also scale them up. This means that we need to find ways to open up the organization in ways that preserve the value of the IPTC and fit with the mission, but also in ways that are more flexible. We need to continue to move beyond saying that the only thing we work on is standards and instead use standards as a component of the technical solutions we develop, as we are doing with EXTRA and the Video Metadata Hub. We need to work with diverse groups focused on solving specific business and journalistic problems – such as trust in the media – and in helping news companies learn the best ways to work with emerging technologies, whether it is voice assistants, artificial intelligence or virtual reality.
I’m confident that – working together – we can continue to reshape the IPTC to better meet the needs of the membership and to move us further forward in support of solving the business and editorial needs of the news and media industry. I look forward to working with all of you on addressing the challenges in 2018 and beyond.
Stuart Myles is the Director of Information Management at Associated Press.
IPTC holds its Autumn Meeting this year in Barcelona, Spain, from Monday, 6 November, through Wednesday, 8 November.
A team of IPTC Leads is working on the topics of the Autumn Meeting and we already have an exciting list:
- Discussion topic: The Art and Science of Practical Metadata
- Discussion topic: System Vendors and News Exchange
- Discussion topic: Automating News – Speed is nothing without Accuracy
- Discussion topic: Trust in the Media
- Discussion topic: JSON – Man for all Seasons?
- Sessions of the NewsML-G2, ninjs, NewsCodes, Video/Photo Metadata and Rights Expression Working Groups; and of the Public Relations and the Standards Committee
- Annual General Meeting
By Sarah Saunders
Ten years ago, the very first IPTC Photo Metadata Conference in Florence was packed with photographers and picture libraries eager to discuss ways of protecting their work in the digital environment. The image industry has expanded enormously since then, and now faces the additional challenge of vast numbers of images crowding the web, and the difficulty of finding the relevant picture, as well as the metadata relating to it.
The IPTC Photo Metadata Conference 2017 was designed to look into the future, with a focus on image search using artificial intelligence (AI). The ability of computer systems to learn from humans has increased enormously in recent years, and the necessary computing power has now become available. The question for the conference was: how far can these systems help the professional image industry sharpen up its search capability and make gains in productivity?
Solution for Auto Tagging in the Image Industry
Kai-Uwe Barthel, professor of visual computing at HTW Berlin University, gave a clear exposition of the history of the field and of the pressing need to create solutions for the picture industry. There are now too many images for classical search systems to handle, but using neural networks – a branch of AI – computer systems can be taught to tag and recognise images in a fraction of the time it takes to tag manually. As most online images are untagged, a combination of human tagging and visual similarity search presents a viable way forward. But Barthel and his team have also researched new methods of presenting images, using three-dimensional structures to dig into results with large numbers of images visible at one time.
Speakers on the use of AI came from across the industry, presenting solutions which can be put into practice now. The key to success in this area is to have enough content for the computers to learn from, and this can be achieved in a number of different ways. General AI systems produce good results for skies and beaches and general themes because they’ve had the data to learn from. But users can set a system to learn from their own content so that more specialist content can be tagged if the conditions are right. Computers are learning faster, and need fewer images to learn from than before.
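The visual-similarity approach described above can be illustrated with a minimal sketch: each image is reduced to a feature vector (here tiny hand-made toy vectors; in practice the high-dimensional output of a neural network), and an untagged image borrows tags from its most similar tagged neighbours. All file names, vectors and tags below are invented for illustration.

```python
import math

# Toy feature vectors for already-tagged images; in practice these would
# come from a neural network and have hundreds of dimensions.
tagged = {
    "beach_001": ([0.9, 0.1, 0.0], {"beach", "sea"}),
    "city_004":  ([0.1, 0.9, 0.2], {"city", "street"}),
    "sunset_02": ([0.7, 0.2, 0.6], {"beach", "sunset"}),
}

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def suggest_tags(vector, k=2):
    """Borrow tags from the k most visually similar tagged images."""
    ranked = sorted(tagged.items(),
                    key=lambda item: cosine(vector, item[1][0]),
                    reverse=True)
    tags = set()
    for _, (_, image_tags) in ranked[:k]:
        tags |= image_tags
    return tags

# An untagged image whose vector resembles the beach photos:
print(suggest_tags([0.8, 0.15, 0.3]))
```

Real systems combine this nearest-neighbour lookup with direct classifier output, but the principle – untagged images inheriting metadata from visually similar tagged ones – is the same.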
AI systems can be trained to recognise faces, text, colours, composition, scenes, and objects, but can also be trained in the aesthetics of image selection, with one speaker maintaining that twenty images are enough to train a system in a particular brand aesthetic. Speakers also admitted, however, that attempts to determine the precise location shown in an image from its content alone have been tested but do not yet work reliably.
Speakers stressed that the important element in computer learning is the understanding of the nature of the material to be tagged, an attribute which is currently not about to be taken over by artificial intelligence. The benefits in speed and productivity will be enormous but we’re not yet talking about doing away with human skills altogether!
IPTC Video Metadata and Easier Cross Media Distribution
The first afternoon session, by IPTC Managing Director Michael Steidl, was about the IPTC Video Metadata Hub (VMH), published in 2016 to provide a standard set of fields for use across the varied technologies used in video. Many of the Video Metadata Hub fields are equivalent to those in the IPTC Photo Metadata Standard, which helps streamline cross-media distribution. The VMH can be applied down to the level of video clips, which makes it a useful metadata tool for production, archive and distribution.
Technology That Protects Rights Information in Google Image Search
The IPTC conference was held a day after a CEPIC seminar on Google. Google image search scrapes images from their original sites and displays them in its own environment. This is bad for rightsholders, as images can be saved and downloaded directly from Google without reference to the original site. Picture libraries and agencies lose significant traffic to their sites as a result, with a survey of German agencies indicating a drop of 50 percent in traffic. The recent fine levied by the EU on Google for anti-competitive behaviour in comparison shopping is encouraging and has paved the way for scrutiny of Google’s actions with online images.
The second presentation of the afternoon offered a solution to the problems raised in the Google seminar. SmartFrame technology allows images to be presented online without the danger of being scraped by Google, as this is disabled by technical means. Most of the mechanisms people use to download images – like right-click – are disabled too. Images can be shared as links, so social media sharing doesn’t lead to an image becoming orphaned and lost in the websphere. And when an image is viewed as a thumbnail in Google, there is a clear indication that it is a copyrighted image, and a link back to the originating site. Rob Sewel, Pixelrights CEO, demonstrated how product items within an image could be linked back to a brand website, suggesting a future funding model for photography in which images link to a paid-for advertising service. The technology could be put to all sorts of uses in both commercial and non-commercial fields, and gives control back to creators and their agents.
The success of this kind of technology, as with all solutions to image grabbing and orphaned images, lies in the uptake of the technology. To be truly protective of copyright, client websites would need to implement a technology like SmartFrame.
The IPTC Photo Metadata Conference 2017 was fascinating from start to finish for the roughly 60 attendees on location, the level of presentations was extremely high, and the presentations and videos are all available on the website at https://iptc.org/events/photo-metadata-conference-2017/.
Sarah Saunders runs Electric Lane, an independent DAM consultancy specialising in workflow planning, asset retrieval, data management and DAM project management. She works with IPTC’s Photo Metadata Working Group.