The page on PyPI, Python's module repository, for the IPTC NewsML-G2 library.

IPTC’s Python library for creating, manipulating and managing NewsML-G2 documents, python-newsmlg2, has reached version 1.0.

The earliest versions of the library were created back in 2021. The code has changed significantly since then, and we are happy to endorse the latest version as a production-ready 1.0 release.

Created as a free, open-source library that can be integrated into any Python code, the library supports all parts of the NewsML-G2 specification:

  • multi-media news stories (NewsItem)
  • packages of news content (PackageItem)
  • planned news coverage and information about upcoming and past events (PlanningItem and EventsML-G2)
  • news content classification concepts and sets of concepts (knowledge graphs) (ConceptItem, KnowledgeItem and CatalogItem)
  • syndicated news content transactions (NewsMessage)

Relationship of NewsML-G2 main entities

The 1.0 version has 98% unit test coverage, which can give users confidence that future changes will not introduce regression bugs.

The code can also handle non-NewsML-G2 content embedded within NewsML-G2 files using XML Schema’s “xs:any” construct. This is a feature of NewsML-G2 that allows any type of markup, such as XHTML, NITF or RightsML, to be carried as the payload in a NewsML-G2 NewsItem. The 1.0 version adds “round-trip” support for all xs:any constructs, so that additional markup is captured, retained and output verbatim, without any loss of fidelity.
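To make the idea of round-tripping concrete, here is a minimal sketch using only Python’s standard library. It does not use python-newsmlg2 and is not how the library is implemented: the InlineXML class and the same_tree helper are hypothetical names invented for illustration. The sketch keeps any child element outside the NewsML-G2 namespace untouched while parsing, writes it back out, and checks that the re-parsed tree matches the original. Note that ElementTree may rename namespace prefixes on output, so this shows equivalence of the parsed structure rather than byte-for-byte identity.

```python
import xml.etree.ElementTree as ET

G2_NS = "http://iptc.org/std/nar/2006-10-01/"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small NewsML-G2 fragment carrying an XHTML payload.
SOURCE = (
    f'<inlineXML xmlns="{G2_NS}" contenttype="application/xhtml+xml">'
    f'<html xmlns="{XHTML_NS}"><body><p>Hello <em>world</em></p></body></html>'
    "</inlineXML>"
)


class InlineXML:
    """Hypothetical wrapper: models one known attribute, keeps foreign children as-is."""

    def __init__(self, element):
        self.contenttype = element.get("contenttype")
        # Children outside the NewsML-G2 namespace are the xs:any payload.
        self.foreign = [
            child for child in element
            if not child.tag.startswith("{" + G2_NS + "}")
        ]

    def to_element(self):
        out = ET.Element("{" + G2_NS + "}inlineXML", {"contenttype": self.contenttype})
        out.extend(self.foreign)  # re-attach the payload unchanged
        return out


def same_tree(a, b):
    """Compare two elements recursively by tag, attributes, text and children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


original = ET.fromstring(SOURCE)
model = InlineXML(original)
serialised = ET.tostring(model.to_element(), encoding="unicode")
round_tripped = ET.fromstring(serialised)

print(same_tree(original, round_tripped))
```

The key design choice in this sketch is that the payload is never interpreted: it is stored as parsed elements and handed back to the serialiser, so markup the code knows nothing about cannot be dropped or altered.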

The library’s documentation also gives examples of how the library can be used to create, process, manipulate and output NewsML-G2 documents.

The code offers some “helper functions” that make working with NewsML-G2 easier, such as:

  • Automatic resolution between QCodes and URIs, two equivalent formats for controlled vocabulary terms, which can now be used interchangeably. The code uses NewsML-G2 Catalogs to look up QCode prefixes and resolve them to URI format.
  • Automatic handling of repeatable items and traversal of the NewsML-G2 element structure to provide easy access to child elements, such as “digsrctype = newsitem.contentmeta.digitalsourcetype.uri”.
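
The QCode-to-URI resolution described above can be sketched in a few lines of plain Python. This is an illustration of the general technique, not the library’s own code: the CATALOG dictionary stands in for a real NewsML-G2 Catalog (which is an XML document), and the function names qcode_to_uri and uri_to_qcode are assumptions made for this example.

```python
# Illustrative stand-in for a NewsML-G2 Catalog: scheme alias -> scheme URI.
CATALOG = {
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",
    "digsrctype": "http://cv.iptc.org/newscodes/digitalsourcetype/",
}


def qcode_to_uri(qcode, catalog=CATALOG):
    """Expand 'alias:code' to a full URI using the catalog's scheme URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or not alias or not code:
        raise ValueError(f"not a QCode: {qcode!r}")
    if alias not in catalog:
        raise KeyError(f"unknown scheme alias: {alias!r}")
    return catalog[alias] + code


def uri_to_qcode(uri, catalog=CATALOG):
    """Shorten a URI to 'alias:code' if a catalog scheme URI is a prefix of it."""
    matches = [(alias, base) for alias, base in catalog.items() if uri.startswith(base)]
    if not matches:
        raise ValueError(f"no catalog scheme matches: {uri!r}")
    # Prefer the longest matching scheme URI in case schemes are nested.
    alias, base = max(matches, key=lambda match: len(match[1]))
    return f"{alias}:{uri[len(base):]}"


uri = qcode_to_uri("medtop:04000000")
print(uri)
print(uri_to_qcode(uri))
```

Because both directions rely on the same catalog lookup, a term can be stored in either form and converted on demand, which is what makes the two formats interchangeable.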

The library can be installed by any Python user using PyPI: pip install newsmlg2.

The source code of the library is freely available, licensed under the open-source MIT licence, at https://github.com/iptc/python-newsmlg2.

Feedback on the library is very welcome. Please let us know what you think on the IPTC Contact Us page or the public NewsML-G2 discussion list.