The publishing industry across the world is facing challenges. Rapidly changing technology trends demand that publishers keep evolving, from traditional publishing houses into digital media companies. Progress has been made both in content platforms, i.e. the move from purely print to a variety of audio-visual avenues (such as television and online news portals, among others), and in the technology used to gather and publish information. One of the most visible of these technologies has been the use of artificial intelligence (AI) and Natural Language Processing (NLP) in the process of collecting and analysing data.
NLP is an area of AI which aims to produce effective human-computer dialogue by teaching computers to understand, organise, analyse and reproduce coherent sentences. NLP software uses a variety of approaches to achieve this, such as analysing large amounts of data to ‘learn’ a language and its rules, or relying on manual input of those rules. Currently, NLP algorithms enable large amounts of data to be absorbed and processed so that users receive the latest stories before they would be discovered manually. Mobile news apps like Inshorts, NDTV and others use this process to display snippets in the notification bar, allowing users to be informed of trending stories with minimal effort. The ultimate aim is to allow computers not only to create sentences but also to understand the direct and indirect meaning of the sentences they generate. Generated articles could then contain perspectives as well as facts, which would show that the computer actually understands the language.
In the media industry, the concept of language can be extended beyond written words to images and videos, as content types have become more diverse over the years. AI and NLP therefore now have a larger pool of data to work with and automate in order to generate news reports and other required forms of content. They can do so in a variety of ways:
Data collection & Analysis:
For a media company, there are two facets of data collection. The first concerns the content itself, which the media industry generates in every available format (text, image, video and audio). Text mining and NLP play an important role in gathering semantic information about text, audio and video content. Relevant keywords, sentiments and entities, along with topic classifications, are computed through NLP. Image classification is used for face detection or for finding the important parts of an image. All this data is then used to build a knowledge graph that eases content search, recommendations and relevant content syndication. The second facet is the collection of time series data on what content the end user consumes. This helps the algorithm match user interest with content metadata to serve relevant content.
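The matching of user interest with content metadata described above can be sketched in a few lines. This is only an illustration under stated assumptions: keywords are taken to be already extracted by an NLP step, cosine similarity over keyword counts is one common choice rather than the method any particular publisher uses, and the names build_interest_profile, cosine_similarity and recommend, as well as the sample data, are hypothetical.

```python
from collections import Counter
from math import sqrt


def build_interest_profile(consumed_keywords):
    """Aggregate the keywords of articles a user has read into weighted counts."""
    profile = Counter()
    for keywords in consumed_keywords:
        profile.update(keywords)
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-count mappings (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, catalogue, top_n=3):
    """Rank catalogue articles by similarity to the profile, dropping zero matches."""
    scored = []
    for article_id, keywords in catalogue.items():
        score = cosine_similarity(profile, Counter(keywords))
        if score > 0:
            scored.append((score, article_id))
    # Highest score first; ties broken by article id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for _, article_id in scored[:top_n]]


# Hypothetical sample data: keywords per consumed article and per candidate article.
history = [["election", "budget"], ["election", "parliament"]]
catalogue = {
    "a1": ["election", "results"],
    "a2": ["cricket", "final"],
    "a3": ["budget", "tax"],
}
print(recommend(build_interest_profile(history), catalogue))  # ['a1', 'a3']
```

A production system would more likely weight keywords (for example by recency or rarity) and draw on the knowledge graph rather than flat keyword lists; the sketch only shows the basic matching step.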
Real-time serving data from web and mobile content helps the publisher identify trending content, which is attractive from an advertising perspective. Additionally, it enables investigative reporters to obtain all known mentions of a topic they are working on within a span of minutes. The data obtained from text mining also serves as training material for the computer, helping it develop its language skills for future endeavours. For a media company, this requires significant investment in developing the infrastructure to collect and analyse large volumes of data.
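As a rough illustration of how trending content might be derived from real-time serving data, the sketch below counts article views inside a recent time window. The event format (a timestamp in seconds paired with an article id), the 15-minute default window and the function name trending are assumptions made for this example, not details of any specific publisher's pipeline.

```python
from collections import Counter


def trending(events, now, window_seconds=900, top_n=3):
    """Return the most-viewed articles within the last `window_seconds`.

    `events` is an iterable of (timestamp_seconds, article_id) view records.
    """
    cutoff = now - window_seconds
    counts = Counter(
        article_id for timestamp, article_id in events if cutoff < timestamp <= now
    )
    # Most views first; ties broken by article id for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


# Hypothetical view log: the first event falls outside the 100-second window.
views = [(100, "a"), (950, "b"), (960, "b"), (970, "a"), (990, "c")]
print(trending(views, now=1000, window_seconds=100, top_n=2))  # [('b', 2), ('a', 1)]
```

At real traffic volumes this counting would normally run on a streaming or time-series platform, which is part of the infrastructure investment mentioned above.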
The insights obtained from text mining allow trends to be identified and leveraged in order to define a target audience and appeal to its interests. While this applies to new content, feedback on already published content allows that content to be improved by adapting its keywords. Historical analysis of trending content helps editors focus on the topics they want to write about and distribute across social media. This means that when journalists are assigned specific beats, the algorithms can immediately surface all the stories that relate to the beat, based on past data. Social listening tools equipped with NLP algorithms are the preferred means of gathering the necessary data, as they can provide insights into advertising as well as long-form content. By serving relevant ads to a target audience, the ROI for advertisers can be improved, resulting in increased revenue for the publisher. High-performing advertising campaigns, for example, then become templates for future campaigns, and low-performing ones can be analysed for their defects.
Along with improving organisational content, NLP algorithms can also help formulate responses to questions posed by both customers and journalists. The system could then also track the effects of each response across media platforms to gauge the state of the industry. A direct outcome of this tracking would be the ability to weed out incorrect or ‘fake’ news by tracing published stories back to their sources and exposing their inaccuracies. Websites that publish spoofs and parodies, for example, can be identified and the information they disseminate fact-checked before it goes viral. Smart AI programs would be able to perform these semiotic analyses to determine accuracy.
The media industry today is becoming more dynamic and is embracing technology at a faster rate so that it can harness that technology's progress. AI and NLP enable computers to understand semantics and to determine the best course of action for content dissemination based on the analysis of industry feedback.