What are the IPTC NewsCodes?
The news and media industry has come together through the IPTC to create shared sets of terms that can be used to describe news and media content in a standardised way. Such a set of terms, created in a managed way, is known as a controlled vocabulary or CV.
We have created many different sets of terms to describe different aspects of news content, from the subject of a news story (Media Topics) through the job role of a worker in the media industry (Content Production Party Role) or the technical codec used to create a video file (Video Codec).
Collectively, we call our set of controlled vocabularies the IPTC NewsCodes.
Some of the NewsCodes CVs are very simple lists containing only a few terms. Others are larger and are arranged in a hierarchical form known as a taxonomy, such as the Media Topics.
All IPTC NewsCodes share some key properties:
- they all have a URL-based ID which will never change (such as “http://cv.iptc.org/newscodes/mediatopic/20000022”, the NewsCode for "classical music")
- the URL-based ID can always be shortened into a "QCode" such as “mediatopic:20000022”, the QCode for the same concept of "classical music"
- they can all be accessed via IPTC’s CV Server at http://cv.iptc.org/newscodes/
- they are all described in English but some are described in multiple languages, all using the same underlying ID
- they can all be accessed in machine-readable form if desired - see the separate cv.iptc.org Guidelines (cv.iptc.org-guidelines.html) for more details.
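The relationship between the URL-based ID and the QCode can be sketched in a few lines of Python. This is a minimal illustration rather than an official IPTC tool: the function name `uri_to_qcode` is hypothetical, and it assumes that the QCode prefix is simply the last path segment of the vocabulary URL, as in the "classical music" example above. In practice the prefix is whatever alias your system has agreed for that vocabulary.

```python
def uri_to_qcode(concept_uri: str) -> str:
    """Shorten a NewsCodes concept URI into a QCode of the form 'prefix:code'.

    Assumption: the prefix is the last path segment of the vocabulary URL.
    """
    # Split off the final path segment, which is the code of the concept.
    scheme_uri, _, code = concept_uri.rstrip("/").rpartition("/")
    # The last segment of what remains names the vocabulary.
    prefix = scheme_uri.rpartition("/")[2]
    if not prefix or not code:
        raise ValueError(f"Not a concept URI: {concept_uri!r}")
    return f"{prefix}:{code}"


print(uri_to_qcode("http://cv.iptc.org/newscodes/mediatopic/20000022"))
```

Because the code part is copied unchanged, the QCode identifies exactly the same concept as the full URL.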
Why use IPTC NewsCodes?
Using controlled vocabularies rather than simple keywords allows for a consistent coding of news metadata across news providers and over the course of time. NewsCodes IDs can apply not only to text content but to photographs, graphics, interactives, audio and video files and streams.
All IPTC Controlled Vocabularies are available on our CV server (cv.iptc.org) in multiple formats, both machine-readable and human readable.
What can NewsCodes be used for?
The news industry, and many others beyond it, need to be able to state something about the content of a news item, that is, to apply metadata. This can be achieved either by free-text human language (e.g. a headline or a caption) or by codes that represent concepts. Codes have the advantage that they can be shared across providers, and that each code can have a consistent and comprehensive definition, avoiding ambiguity and misunderstandings.
NewsCodes are also language agnostic: the code is the same for describing content in different languages, and only the label and definition of the code need to be translated to help with understanding its meaning.
Why so many different sets of NewsCodes?
We have codes representing subjects of news stories, aspects of news production such as contributor type and coverage status, technical terms such as video encoding types and layout orientations, and many vocabularies relating to sports results such as team type, player statistics and position names for many different sports.
A single vocabulary containing all types of possible codes would be unwieldy and inefficient.
When and how are updates to NewsCodes released?
Most NewsCodes vocabularies change rarely. The Media Topics vocabulary, however, is regularly updated as we improve labels and definitions.
We don’t expect users to update their systems as soon as a new version is released. We don’t make "breaking changes" such as changing the meaning of a term or re-using a term ID; we only make corrections and clarifications such as improving the label or definition of a term, moving it to a more appropriate parent, or adding a new translation or mapping to an external source such as Wikidata. We also add new terms regularly based on suggestions from members and other interested parties - if you have suggestions for new terms, please contact us!
If we decide to stop using a term, we never remove it from the CV completely. Instead, we mark it as "retired", usually with a "note" describing which term or terms should be used instead.
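A consuming system can take advantage of this policy by following the replacement recorded for a retired term rather than failing on it. The Python sketch below is purely illustrative: the dictionary layout, the `retired` and `replaced_by` field names and the sample codes are all hypothetical and do not reflect the structure of the files on the CV server, where the replacement is usually given as a human-readable note.

```python
# Hypothetical local copy of a vocabulary; the codes and field names are invented.
VOCABULARY = {
    "examplecv:001": {"label": "old term", "retired": True, "replaced_by": "examplecv:002"},
    "examplecv:002": {"label": "new term", "retired": False, "replaced_by": None},
}


def current_term(qcode: str) -> str:
    """Follow replacements until a term that is not retired is reached."""
    seen = set()
    while VOCABULARY[qcode]["retired"]:
        replacement = VOCABULARY[qcode]["replaced_by"]
        # Stop if there is no single replacement or the chain loops back.
        if replacement is None or replacement in seen:
            break
        seen.add(qcode)
        qcode = replacement
    return qcode


print(current_term("examplecv:001"))  # prints examplecv:002
```

Because retired terms stay in the vocabulary, content tagged with an old code remains interpretable even if no replacement is ever looked up.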
Updates to the Media Topics (and to all NewsCodes) are announced on the public iptc-newscodes@groups.io discussion list. Joining the public group is free: simply visit https://groups.io/g/iptc-newscodes and sign up.
Can NewsCodes be used free of charge?
Yes, they can. Any NewsCode provided by the IPTC can be used at any stage of a news workflow without any royalty fee. However, any application that includes IPTC NewsCodes must explicitly acknowledge the intellectual property and the copyright of the IPTC.
About the IPTC NewsCodes Guidelines
This document is designed to familiarise taxonomists, journalists, developers and architects with the IPTC NewsCodes controlled vocabularies.
It contains both guidance for how to use IPTC NewsCodes in your organisation, and also an explanation of how we make decisions regarding maintaining the vocabularies.
The IPTC welcomes your feedback on how to improve IPTC NewsCodes via the public NewsCodes discussion group.
Staff from IPTC Member organisations are welcome to join the members-only discussion list for the NewsCodes Working Group.
Copyright
Copyright © 2020 IPTC, International Press Telecommunications Council.
The IPTC NewsCodes Guidelines document is published under the Creative Commons Attribution 4.0 license - see the full license agreement at http://creativecommons.org/licenses/by/4.0/
By obtaining, using and/or copying this document, you (the licensee) agree that you have read, understood, and will comply with the terms and conditions of the license.
Materials used in this guide are either in the public domain or are available with the permission of their respective copyright holders. All materials of this IPTC standard covered by copyright shall be licensable at no charge.
Acknowledgements
This document is the result of a team effort by members of the IPTC NewsCodes Working Group of the International Press Telecommunications Council (IPTC), with input and assistance from other contributors.
Contributors to these guidelines are (in alphabetical order):
- Linda Burman (L. A. Burman Associates Inc.)
- Dave Compton (Refinitiv)
- Paul Harman (Bloomberg LP)
- Paul Kelly (polvo.ca)
- Johan Lindgren (TT)
- Philippe Mougin (AFP)
- Jennifer Parrucci (The New York Times)
- Brendan Quinn (IPTC)
- Michael Steidl (IPTC)
- Andrew Wang (Visual China Group)
- Veronika Zielinska (Associated Press)
How to contact IPTC
Join the public IPTC NewsCodes discussion group: https://groups.io/g/iptc-newscodes
Submit a message on our website: https://iptc.org/about-iptc/contact-us/
Visit IPTC’s website: https://iptc.org/standards/newscodes/
Follow IPTC on Twitter: @IPTC
About IPTC
The IPTC, based in London, brings together the world’s leading news agencies, publishers and industry vendors. It develops and promotes efficient technical standards to improve the management and exchange of information between content providers, intermediaries and consumers. The standards enable easy, cost-effective and rapid innovation and include the Photo Metadata standard, the Video Metadata Hub, the news exchange formats NewsML-G2, ninjs, SportsML-G2 and NITF, rNews for marking up online news, the rights expression language RightsML, and NewsCodes taxonomies for categorising news.
IPTC is a not-for-profit membership organisation registered in England - find more about membership.
Our contact address is:
IPTC International Press Telecommunications Council
25 Southampton Buildings
London WC2A 1AL
United Kingdom
1. Groups of NewsCodes
IPTC groups the NewsCodes by their purpose:
- Descriptive NewsCodes are used to editorially categorise news content (text, photos, video, audio, etc.) by properties such as subject and genre.
- Administrative NewsCodes are terms used in the production and distribution of news content, such as technical aspects of media (for example the colorspace of an image or the codecs of audio and video files), urgency and the code representing a particular news provider.
- NewsML-G2 NewsCodes are used to support common functionalities of NewsML-G2, EventsML-G2 and SportsML-G2.
- EventsML-G2 specific NewsCodes support specific functionalities of EventsML-G2, which is used to manage the news planning process before news items are created.
- NewsML 1.x NewsCodes support specific functionalities of IPTC NewsML version 1.x.
- Photo Metadata specific NewsCodes support the IPTC Photo Metadata Standard, such as Digital Source Type and Image Region Type.
- Transmission NewsCodes are used for the transmission of news items; currently this group contains only the Priority CV.
1.1. Descriptive NewsCodes
Sets of terms used to describe the contents of news items and other media, such as subject (Media Topics and the older Subject Codes), genre to describe types of news item such as obituary, feature, review and sports report, scene to describe types of video shots, and worldregion for regions of the world (for countries and continents, we recommend using the ISO Country Codes).
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Genre | Indicates a nature, journalistic or intellectual characteristic of an item, such as obituary, biography, review, feature, birth announcement. | | |
Media Topics | Media Topics is IPTC’s main subject taxonomy with a focus on text. Media Topics describe the subject of a news item. | | |
Product Genre | Indicates product genres for media objects, such as factual, entertainment, news and sport. | | |
Scene | Indicates a type of scene covered by an item, such as headshot, general view, night scene and satellite. | | |
Subject Code | IPTC’s original subject taxonomy, indicating the subject of an item. | | |
Subject Qualifier | Indicates a narrower attribute-like context for a Subject Code, e.g. for sports: the gender of participants, indoor/outdoor competition etc. | | |
World Region | Indicates a region of the world. | | |
1.2. Administrative NewsCodes
Terms used in the production and distribution of news content, such as technical aspects of media (colorspace of an image or codecs of audio and video files), urgency and the code representing a particular news provider.
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Audiocodec | Indicates a name of an audio-encoder/decoder. | | |
Colorspace | Indicates the colorspace of a digital image. | | |
Digital Source Type | Indicates from which source a digital image was created. | | |
 | Indicates a product of a News Provider. | | |
 | Indicates a News Provider registered with the IPTC. | | |
Of Interest To | Indicates a target audience for an item. | | |
Urgency | Indicates the editorial urgency of an item. | | |
Videocodec | Indicates a name of a video-encoder/decoder. | | |
1.2.1. Deprecated Administrative NewsCodes taxonomies
Do not assign these NewsCodes anymore.
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Provider (deprecated) | Indicates a company, publication or service provider. | n/a | n/a |
1.3. Transmission NewsCodes
The transmission NewsCodes are used in the transmission of news content from a news provider (such as a news agency or photo library) to a consumer (such as a newspaper, broadcaster or archive).
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Priority | Indicates the relative priority of an item for a distribution process. | | |
1.4. NewsML-G2 NewsCodes
For all G2 standards, a mapping of Scheme URIs to aliases is required: we call this a Catalog. The IPTC provides a Catalog file for all the NewsML-G2 controlled vocabularies listed below. Find more information on this file on a special Catalog page.
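Conceptually, a Catalog is a lookup table from alias to Scheme URI. The Python sketch below illustrates that idea only: it does not read the IPTC Catalog file, and the `CATALOG` dictionary and `qcode_to_uri` function are hypothetical names. The single entry is an assumption based on the "medtop" alias and the Media Topics URL used elsewhere in this document.

```python
# Illustrative catalog: alias -> Scheme URI (one assumed entry, not the full IPTC Catalog).
CATALOG = {
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",
}


def qcode_to_uri(qcode: str) -> str:
    """Expand a QCode such as 'medtop:20000500' into a full concept URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in CATALOG:
        raise KeyError(f"No catalog entry for QCode {qcode!r}")
    return CATALOG[alias] + code


print(qcode_to_uri("medtop:20000500"))
```

A QCode whose alias is not in the catalog cannot be resolved, which is why sender and receiver need to share the same Catalog.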
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Nature of a News Item | Indicates the type or "nature" of news in a News Item or Package Item. Examples include "Text Item(s)", "Video Item(s)", "Picture Item(s)", "Data Item(s)" and "Composite Item(s)". | | |
Nature of a Catalog Item | Indicates the nature of the content of a Catalog Item. | | |
Nature of a Concept Item | Indicates the nature of the concept in a Concept Item or Knowledge Item. | | |
Nature of a Planning Item | Indicates the nature of the content of a Planning Item. | | |
Nature of a Concept | Indicates the basic nature of a concept. | | |
Application of Metadata Values | Indicates how the metadata value was applied (i.e. manual or automatic). | | |
Colour Indicator | Indicates the basic colouring of an image. | | |
Communication Technology | Indicates the communication technology used for this part of the contact info. | | |
Content Production Party Role | The role of a party (person or organisation) which created, originated, supplied, enhanced, distributed or contributed in another way to the content, or the role of a party which supported these activities. | | |
Content Rendition | Indicates a kind of content rendition, such as "Thumbnail", "Content for print", "High resolution" and "Preview". | | |
Content Warning | Indicates why the content of the item should be reviewed as it may be perceived as being offensive. | | |
Description Role | Indicates the role that a specific instance of a description takes among all descriptions. Examples include "Summary", "Teaser" and "Caption". | | |
Dimension Unit | Indicates the unit used for measuring a dimension of content. | | |
Editorial Role | Indicates the role this item takes in an editorial workflow. | | |
How extracted | Indicates how a metadata value was extracted from content. | | |
Hash Type | Indicates the hash function by which a hash value was generated. Values include "MD5", "SHA-1" and "SHA-2". | | |
Hash Scope | Indicates the scope of a hash value. | | |
Hop Action | Indicates the action taken at a hop of the hop history. | | |
Hop Action Target | Indicates the target to which an action taken at a hop of the hop history applies. | | |
Infosource Role | Indicates the role of the information source. | | |
Item Relation | Indicates the relationship between the current item and the target resource. | | |
Item Representation | Indicates the way the target item is represented at this location. | | |
Layout Orientation | Indicates whether the human interpretation of the top of the image is aligned to its short or long side. | | |
Language Role | Indicates the role of the language among all languages used by the content. | | |
Media Type (G2) | Indicates a basic media type of news content. | | |
Name Role | Indicates the role this specific instance of a name takes among all names. | | |
Name Part | Indicates the part of a name. | | |
News Coverage Status | Indicates the intention of the news provider to cover an event. | | |
News Message Signal | Indicates how the items conveyed by a News Message should be processed. | | |
Packagegroup Mode | Indicates whether the members of a Package Item group are complementary or alternative and whether their order is relevant. | | |
Publishing Status | Indicates the publishing status of a G2 item. | | |
Rights Info Aspect | Indicates the aspect of intellectual property rights that are expressed by a rights info property in a G2 item. | | |
Rights Info Scope | Indicates the scope of a rights info property in a G2 item. | | |
Role of a Part of the Contact Info | Indicates the role of a part of a contact info component. | | |
Severity | Indicates the importance of a signal. | | |
Signal | Indicates a specific instruction to the processor of a G2 item that the content requires special handling. | | |
Time Unit | Indicates the unit of time used for measuring the length of audio or video content. | | |
Value Format | Indicates the format of an alternative identifier. | | |
Video Definition | Indicates the video definition using broad terms. | | |
Video Scaling | Indicates the scaling of video content. | | |
Why Present | Indicates why the metadata value has been included. | | |
1.5. EventsML-G2 specific NewsCodes
For G2-Standards a mapping of Scheme-URIs to aliases is required – a so-called Catalog. The IPTC provides a Catalog file for all the NewsML-G2 controlled vocabularies listed below. Find more information on this file on a special Catalog page.
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Event Date Confirmation | Indicates whether start and/or end dates and, optionally, times are confirmed. | | |
Event Occurrence Status | Indicates how certain the occurrence of the event is. | | |
Event Registration Role | Indicates the class of a registration. | | |
Event Participant Role | Indicates the role a participant takes in the event. | | |
Event Organiser Role | Indicates the role an organiser takes in the event. | | |
Event Contact Info Role | Indicates the class of the contact information. | | |
1.6. NewsML 1.x NewsCodes
These NewsCodes support specific functionalities of IPTC NewsML version 1.x, the predecessor to NewsML-G2.
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
Characteristics Property | Defines the name and the semantics (not the value) of a physical characteristic of the content of a NewsML 1 news item. | | |
Confidence | Indicates the degree of certainty that an applied metadata value is correct. | | |
Encoding | Indicates an encoding scheme used to transform data. | | |
Format | Indicates the technical format of content. Examples are JPEG for a picture, MP3 for audio or NITF or PDF for text. | | |
How Present | Indicates how a metadata property relates to the content of a NewsML 1 news item. | | |
Importance | Indicates the importance of an item of metadata applied to a news item. | | |
Label Type | Indicates the type of a label. Labels are portions of human readable text, unlike most other metadata which are considered to be primarily machine readable only. | | |
Location Type | Indicates the type of a location. | | |
Media Type | Indicates a basic media type of news content. | | |
News Item Type | Indicates, in a very broad way, the type of ContentItem carried by a NewsML 1 news item. | | |
Notation | Indicates the notation of a ContentItem. | | |
Property Type | Indicates the type of a NewsML 1 Property element. | | |
Relevance | Indicates the relevance of a NewsML 1 news item to the target audience specified by “OfInterestTo”. | | |
Role in Package | Indicates the role of a NewsML 1 news item within a package of several news items. Examples are “Main” (content), “Supporting”, or “Caption”. | | |
Status (NewsML 1) | Indicates the current usability of a NewsML 1 news item. | | |
Topic Type | Indicates the type of a concept represented by a topic. | | |
1.7. Photo Metadata specific NewsCodes
The IPTC metadata schemas for photos “IPTC Core” and “IPTC Extension” require a few specific controlled vocabularies:
NewsCodes vocabulary | Description | Human-readable formats | Machine-readable formats |
---|---|---|---|
IPTC Subject NewsCodes | see above | | |
IPTC Scene NewsCodes | see above | | |
Digital Source Type | Indicates from which source a digital image was created. | | |
Image Region Role | Role of an image region among other image regions of the same image or other images | | |
Image Region Type | Type of thing(s) depicted by a region in an image | | |
2. Guides and help for users of NewsCodes
2.1. When and how to use Media Topics
As the Media Topics are simply codes, labels and definitions in a tree structure (or hierarchy), they can be used in many applications. Most commonly, Media Topics are built in to newsroom content management systems, library systems or syndication tools. By building IPTC Media Topics into editorial systems, vendors and system implementers create a future-proof system for maintaining content descriptions and annotations that can evolve over time as IPTC Media Topics are maintained and extended.
In addition, content tags can be automatically mapped to Wikidata and translated to other languages based on the built-in Media Topics language translations. Currently, IPTC Media Topics are available in Arabic, British English, French, German, Portuguese (Brazilian and for Portugal), Spanish and Swedish.
2.2. When and how to use other IPTC NewsCodes
Generally the more technical NewsCodes CVs are only used when transmitting news content from one news organisation to another using a format such as NewsML-G2 or ninjs. These standards require various aspects of the news content (such as codecs, priorities or editorial staff roles) to be expressed in terms of codes from a controlled vocabulary. This is often built into newsroom software so individual journalists and editors can simply choose values from drop-down lists and the correct codes will be used automatically.
2.3. Using Media Topics and Facets in your content
In 2017, IPTC extended Media Topics through the use of facets, which can express multiple aspects of a concept. For example, to represent all Olympic sports medals in the Media Topics CV we would need hundreds of separate terms covering each weight category, age group, gender mix etc. for different sports. But by using facets, we can create one set of terms for gender with the values "male", "female" and "mixed", which can then be used in conjunction with other terms to identify different sports.
Facet values can be either from a CV (such as the NewsCodes CV "asportfacetvalue") or could be literal strings of text such as "200m".
So for example, the women’s 200 metre breaststroke swimming event can be described with:
- Media Topic: swimming (medtop:20001071)
- Facet swimming type with facet value breaststroke
- Facet distance with facet value "200m"
- Facet team size with facet value "1"
For an example of how this can be expressed in NewsML-G2, see the example in the NewsML-G2 example documents repository.
More information on how to represent facets in NewsML-G2 documents is available in the Faceted Concepts section of the NewsML-G2 Guidelines.
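As a rough illustration of the data model, the breaststroke example from the list above can be held in a small Python structure. This is only a sketch: the `FacetedSubject` class is a hypothetical name, and the facets are written as plain labels, whereas real data would identify each facet and CV-based facet value by its code.

```python
from dataclasses import dataclass, field


@dataclass
class FacetedSubject:
    """A Media Topic plus the facet values that narrow it down."""
    media_topic: str  # QCode of the Media Topic
    facets: dict[str, str] = field(default_factory=dict)  # facet name -> facet value


# The breaststroke example; facet names are plain labels for readability.
event = FacetedSubject(
    media_topic="medtop:20001071",
    facets={"swimming type": "breaststroke", "distance": "200m", "team size": "1"},
)

for name, value in event.facets.items():
    print(f"{event.media_topic} [{name} = {value}]")
```

The subject itself stays a single Media Topic; the facets only add detail, so a system that ignores them still sees the item as being about swimming.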
2.3.1. Which facets can be used with which Media Topics?
The Media Topics CV contains IKOS hasFacet properties which link to the relevant facets that apply to a given term.
The hasFacet declaration cascades (or inherits) down the tree, so the fact that the term competition discipline under sport has the property hasFacet aspfacet:distance means that all of the children of competition discipline can also have that facet.
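The cascading rule can be sketched as a walk from a term up through its broader terms, collecting every hasFacet declaration on the way. The Python below is a simplified illustration under stated assumptions: the three-term tree fragment, the use of labels instead of term IDs, and the names `BROADER`, `HAS_FACET` and `allowed_facets` are all hypothetical.

```python
# Assumed fragment of the tree: term -> broader term (None for a top-level term).
BROADER = {
    "sport": None,
    "competition discipline": "sport",
    "swimming": "competition discipline",
}

# hasFacet declarations made directly on a term.
HAS_FACET = {
    "competition discipline": {"aspfacet:distance"},
}


def allowed_facets(term: str) -> set[str]:
    """Collect facets declared on the term itself or on any of its ancestors."""
    facets: set[str] = set()
    current = term
    while current is not None:
        facets |= HAS_FACET.get(current, set())
        current = BROADER[current]
    return facets


print(allowed_facets("swimming"))  # {'aspfacet:distance'}
print(allowed_facets("sport"))     # set()
```

The declaration is inherited downwards only: "sport" sits above "competition discipline", so it does not gain the distance facet.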
A summary of some facets and their association to Media Topics is shown here:
MediaTopics term | Possible Facets | Sample values |
---|---|---|
Sports competition discipline and its children | aspfacet:distance | "100 metres" |
 | | "featherweight" |
 | | "15 metres" |
 | | n/a |
 | | aspfacetvalue:giantslalom-snowboarding |
2.4. Looking under the hood: the structure of IPTC NewsCodes and Media Topics
All IPTC NewsCodes, including Media Topics, are defined using the W3C’s SKOS vocabulary, which is commonly used to describe taxonomies. We have extended SKOS slightly to add some extra properties to handle retiring concepts and specifying which concepts can have facets. We call our extensions IKOS (IPTC Knowledge Organisation System).
The vocabularies are created using NewsML-G2 Knowledge Items and their built-in concept, conceptId, related and broader properties. Because these are generic concepts, we can convert them into SKOS/IKOS in RDF/XML, RDF/Turtle and JSON-LD formats.
For more information on how concept definitions work in NewsML-G2, see the Concepts and Concept Items section of the NewsML-G2 Guidelines.
2.5. Extending Media Topics for your own use
The IPTC NewsCodes Working Group often receives requests from agencies and publishers who like the Media Topics vocabulary but also want to extend it for their own use. For example, they might want to add some sports under "competition discipline" which are only played in their country, or some specific festivals and holidays that are not celebrated worldwide and thus wouldn’t pass the guidelines for adding a new term to the global Media Topics vocabulary.
For example, a news organisation specialising in animal stories would have a hard time using Media Topics because we only define one term for "animal" (medtop:20000500).
Using NewsML-G2 Knowledge Items, this organisation could create its own "scheme" that extends Media Topics using its own controlled vocabulary definitions.
It is important not to put new terms in the "medtop:" scheme, as that is reserved for the terms defined by the NewsCodes Working Group and served from http://cv.iptc.org/newscodes/mediatopic/.
The extension scheme uses the SKOS "broadMatch" property, which allows for broader and narrower mappings across "schemes".
As shown in the example below, the definitions of new concepts can also include mappings to Wikidata entities or other external systems.
In NewsML-G2, this is represented as:
<knowledgeItem ... >
  <concept id="exampleanimalcv-100" modified="2020-02-01T12:00:00+00:00">
    <conceptId qcode="exampleanimalcv:100" created="2020-02-01T12:00:00+00:00"/>
    <type qcode="cpnat:abstract"/>
    <name xml:lang="en-GB">lion</name>
    <definition xml:lang="en-GB">Species of big cat often found in Africa and Asia</definition>
    <!-- the ID for "animal" in Media Topics is 20000500 -->
    <broader qcode="medtop:20000500"/>
    <related qcode="medtop:20000500" rel="skos:broadMatch"/>
    <!-- "lion" is entity Q140 in Wikidata -->
    <related uri="https://www.wikidata.org/entity/Q140" rel="skos:exactMatch"/>
    <related uri="http://myorg.org/exampleanimalcv/" rel="skos:inScheme"/>
  </concept>
</knowledgeItem>
In SKOS RDF/Turtle this would be represented as:
exampleanimalcv:100 rdf:type skos:Concept ;
    skos:prefLabel "lion"@en-GB ;
    skos:definition "Species of big cat often found in Africa and Asia"@en-GB ;
    skos:inScheme <http://myorg.org/exampleanimalcv/> ;
    ikos:created "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    dct:created "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    ikos:modified "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    dct:modified "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    skos:exactMatch <https://www.wikidata.org/entity/Q140> ;
    skos:broadMatch medtop:20000500 .
Which format you choose will depend on the software you are using to manage your controlled vocabularies.
Please do remember that if you have a suggestion for a term that would apply in multiple countries and passes our criteria for new terms, you can request that the NewsCodes Working Group adds it to the Media Topics vocabulary. All requests are considered.
2.6. Mapping Media Topics to your organisation’s taxonomy
If your company has its own taxonomy, you might want to create mappings from your terms to the IPTC Media Topics and vice versa, so your customers and partners can index their content in a standard taxonomy and your content will become available to a wider audience.
This is straightforward to do because Media Topics are defined in the W3C’s SKOS format (plus a few extension properties that we call IKOS, the IPTC Knowledge Organisation System). SKOS has well-defined relation properties: skos:closeMatch, skos:exactMatch, skos:broadMatch and skos:relatedMatch.
You should choose the relation type that best suits the particular semantic mapping. That means that you should only use skos:exactMatch if the concepts are indeed exactly equivalent to each other; otherwise skos:closeMatch or skos:relatedMatch may be better.
In SKOS RDF/Turtle, this would be represented as:
examplenewscv:term625 rdf:type skos:Concept ;
    skos:prefLabel "Architecture"@en ;
    skos:definition "Concerning the design of buildings and public spaces"@en ;
    skos:inScheme <http://myorg.org/examplenewscv/> ;
    dct:created "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    dct:modified "2020-02-01T12:00:00+00:00"^^xsd:dateTime ;
    skos:exactMatch <http://cv.iptc.org/newscodes/mediatopic/20000032> ;
    skos:exactMatch <https://www.wikidata.org/entity/Q12271> ;
    skos:broadMatch examplenewscv:term726 .
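Once such mappings exist, an application can decide how strict a match it is willing to accept. The Python sketch below shows one possible approach, assuming the mappings have already been loaded into a simple list of triples. The `MAPPINGS` list and `targets` function are hypothetical, and only the first row is taken from the example above; the second row is invented for illustration.

```python
# (own term, SKOS relation, target) triples. The second row is an invented example.
MAPPINGS = [
    ("examplenewscv:term625", "skos:exactMatch",
     "http://cv.iptc.org/newscodes/mediatopic/20000032"),
    ("examplenewscv:term900", "skos:closeMatch",
     "http://cv.iptc.org/newscodes/mediatopic/20000500"),
]


def targets(term: str, relations=("skos:exactMatch",)) -> list[str]:
    """Return the mapped targets of a term, restricted to the given relations."""
    return [target for source, rel, target in MAPPINGS
            if source == term and rel in relations]


print(targets("examplenewscv:term625"))
print(targets("examplenewscv:term900"))  # empty: closeMatch is not accepted by default
print(targets("examplenewscv:term900", relations=("skos:exactMatch", "skos:closeMatch")))
```

Keeping the relation type alongside each mapping lets a strict consumer use only exact matches while a more tolerant one also accepts close matches.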
3. How IPTC maintains its NewsCodes
This section describes guidelines and best practices for creating and/or editing entries in the IPTC Media Topics controlled vocabulary. The NewsCodes Working Group considers each change request using these guidelines and other industry best practices. We share them here to make the development process more transparent.
3.1. When should a term be added to Media Topics?
The Media Topics vocabulary is by necessity very high level and avoids detail in many areas. The NewsCodes Working Group decided to create some criteria that can be used to decide whether a suggested term should be added to the vocabulary or not. The general guide is that the "granularity", or level of detail, should be roughly even across all branches of the vocabulary.
IPTC delegates discussed the best approach to a policy on level of granularity in Media Topics entries, as change requests are submitted for a variety of granularity levels. Media Topics were created with the intent of well-balanced granularity across the whole vocabulary.
When considering any proposed change, the NewsCodes development group will consider whether the change meets the following criteria:
- Balance with granularity in other branches of the Media Topics taxonomy: Will adding the proposed Media Topic(s) keep the level of granularity fairly even across the vocabulary?
- Threshold of coverage: Is the proposed Topic likely to be covered with reliably regular or seasonal frequency by participating member media organisations?
- Not covered by another term or combination of terms: Does the proposed Topic have distinct semantics and scope from all existing Media Topics, or combination of Media Topics?
- Supported and/or validated by other general taxonomies: Does the proposed Topic have representation in external widely available general subject taxonomies, such as Wikidata, Library of Congress etc.?
- Does not belong to any highly specialised taxonomies: The proposed Topic should not have representation in specialised domain taxonomies such as Medical Subject Headings (MeSH).
3.2. Term style guide for Media Topics in English
The NewsCodes Working Group uses these guidelines as a way of agreeing on a writing style for Media Topics NewsCodes. Different NewsCodes vocabularies may not conform to this guide, and Media Topics in other languages don’t conform to all of these recommendations.
Overall rules:
- Use British English spelling.
- Avoid the use of brand names or specific companies.
Rules for term labels:
- Labels always start with lower case letters except for acronyms (e.g. LGBT) and proper names (e.g. Gaelic football)
- Acronyms don’t have full stops / periods (e.g. LGBT not L.G.B.T.)
- We use singular rather than plural, except where it would make no sense (e.g. we use "prison" not "prisons", but "cosmetics" and "arts" are exceptions)
- Labels should not be too country specific. They should be applicable to users from different countries and cultures.
Rules for term definitions:
- Try to keep the definition to one sentence if practical. Only include periods at the end of sentences if there is more than one sentence.
- Try not to repeat the label name in the definition.
- It’s okay to use plural in a definition even when the label is singular, e.g. the definition for "armed conflict" is "Disputes between opposing groups involving the use of weapons, but not necessarily formally declared wars".
- Try to avoid naming specific countries in definitions unless it’s key to the definition (e.g. Australian rules football).
- Definitions should not be too country specific. They should be understandable by users from different countries and cultures.
- Definitions should be limited to 250 characters (but this is not always the case).
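A few of the mechanical rules above lend themselves to an automated check. The Python sketch below is not a tool used by the Working Group; `check_term` is a hypothetical helper that covers only three rules (the 250-character limit, repeating the label, and a full stop on a single-sentence definition), and its sentence detection is a crude heuristic. Rules that need editorial judgement, such as country specificity, are left to human reviewers.

```python
MAX_DEFINITION_LENGTH = 250


def check_term(label: str, definition: str) -> list[str]:
    """Return a list of style warnings for one term (empty if none were found)."""
    warnings = []
    if len(definition) > MAX_DEFINITION_LENGTH:
        warnings.append("definition is longer than 250 characters")
    if label.lower() in definition.lower():
        warnings.append("definition repeats the label")
    # Heuristic: no ". " inside the text means it is a single sentence,
    # which should not end with a full stop.
    if definition.endswith(".") and ". " not in definition.rstrip("."):
        warnings.append("single-sentence definition ends with a full stop")
    return warnings


print(check_term(
    "armed conflict",
    "Disputes between opposing groups involving the use of weapons, "
    "but not necessarily formally declared wars",
))  # prints []
```

Since the guide itself notes that the length limit is not always met, results like these are better treated as prompts for review than as hard failures.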
3.3. Translating Media Topics into your own language
The Media Topics have been translated from English into French, German, Spanish, Swedish, Arabic, Chinese, Danish, Portuguese and Brazilian Portuguese.
We welcome offers of help to create translations into other languages.
We have some scripts that can help to get you started with a translation project, and it is always worth getting in touch before you start because we usually have internal working versions of the MediaTopics vocabulary that are not yet released, and it is better to start with the latest working version so that your translations don’t become out of date.
If you are interested in providing a translation into your language, please contact the NewsCodes Working Group via the IPTC Contact Us form.
3.4. History of Media Topics and Subject Codes
The first IPTC taxonomy for categorising news was the Subject Codes, which has about 1,400 terms (more than 1,000 not counting sport competitions) in a hierarchy limited to three levels. In 2007 the basic requirements of a successor taxonomy - now the Media Topics - were defined:
- It should no longer need to be limited to three levels. This would help to add granularity at a 4th level or even further down.
- The narrower terms of a term should be reviewed: they should include the most widely used terms under this broader term. This may require the Working Group to add narrower terms. In a case of a high count of narrower terms, removing all narrower terms should be considered.
- The overall number of terms, excluding sport competitions, should not exceed 800. The main reason for this was that it should be easy for a journalist to look over all Media Topics and to quickly pick the appropriate ones for a news item.
The first version of the Media Topics taxonomy in 2010 met these requirements at a high level. Since that time, the set of Media Topics has grown and adapted, language has been changed to clarify meaning, make terms more global and update to modern usage, and many language translations have been added.