Analysing World Events Using the GDELT Dataset and Google BigQuery


The Global Database of Events, Language and Tone (GDELT) aims to capture world events by continuously monitoring global news, analysing the text, and identifying the people, organisations, locations and themes involved, as well as the tone and emotions of the articles. Because of its global, near real-time coverage and its inclusion of news publishers in over 100 languages, it can provide insight into the frequency and impact of social, economic and political incidents across the world. For example, it is possible to filter for strikes or armed conflicts in different locations, or to determine the relationships between organisations and people. As the dataset is quite large, big data platforms such as Google BigQuery are well suited to the analysis. BigQuery is a fully managed, serverless data warehouse with a familiar SQL-based query language, which makes it convenient to join various data sources. We will demonstrate the application of BigQuery to the GDELT data with a few examples, including estimating the probability of strikes in international ports.
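As a rough illustration of the kind of query involved, the sketch below builds a parameterised BigQuery SQL statement that counts the days on which GDELT recorded at least one strike event near a named place, and turns that count into a simple share of days. It is not the method used in the talk. The table name `gdelt-bq.gdeltv2.events`, the column names, the use of CAMEO base code 143 ("conduct strike or boycott"), and the helper names `build_strike_days_query` and `strike_day_share` are all assumptions made for illustration.

```python
from datetime import date

# Assumption: the public GDELT 2.0 events table in BigQuery.
EVENTS_TABLE = "gdelt-bq.gdeltv2.events"
# Assumption: CAMEO base code 143 is used as a proxy for strikes.
STRIKE_BASE_CODE = "143"


def build_strike_days_query(place: str, start: date, end: date):
    """Return (sql, params) counting distinct days with a strike event.

    The SQL uses named parameters (@name) so that user-supplied values
    are never interpolated into the query text.
    """
    sql = f"""
    SELECT COUNT(DISTINCT SQLDATE) AS strike_days
    FROM `{EVENTS_TABLE}`
    WHERE EventBaseCode = @event_code
      AND ActionGeo_FullName LIKE CONCAT('%', @place, '%')
      AND SQLDATE BETWEEN @start AND @end
    """
    params = {
        "event_code": STRIKE_BASE_CODE,
        "place": place,
        # Assumption: SQLDATE is stored as an integer in YYYYMMDD form.
        "start": int(start.strftime("%Y%m%d")),
        "end": int(end.strftime("%Y%m%d")),
    }
    return sql, params


def strike_day_share(strike_days: int, start: date, end: date) -> float:
    """Share of days in [start, end] with at least one strike event."""
    total_days = (end - start).days + 1
    if total_days <= 0:
        raise ValueError("end must not be earlier than start")
    return strike_days / total_days


if __name__ == "__main__":
    sql, params = build_strike_days_query(
        "Rotterdam", date(2019, 1, 1), date(2019, 12, 31)
    )
    print(sql)
    print(params)
```

To run the query, the parameters could be converted to `bigquery.ScalarQueryParameter` objects and passed through a `QueryJobConfig` to `google.cloud.bigquery.Client.query`. The share of strike days is only a crude estimate: news coverage is uneven, and a place-name match on a free-text location field will pick up events that have nothing to do with the port itself.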

*UK Research Software Engineering Conference*