The GLOCON database is a multi-country, automated dataset of contentious politics events built during the Emerging Markets Welfare (EMW) Project, which investigates the effects of contentious politics on welfare state programs in countries of the Global South, initially focusing on Argentina, Brazil, China, India, South Africa, and Turkey. GLOCON was designed to help EMW research map the dynamics of social contention comparatively, so that they can be studied from a global perspective that still takes into account the case-specific features of contentious politics. Because variability and generalizability were built in from the outset, the database can extend its coverage beyond the initial focus countries and serve as a reliable research tool for any project that studies contentious politics or the way it influences other social and political phenomena.
The GLOCON dataset owes its success in delivering complete, accurate, and varied information on contentious politics to its use of advanced natural language processing and machine learning tools, which allow it to tap into the big data universe of online news sources. It is the first fully automated, multilingual contentious politics event database that gathers event information from local news sources of the countries in focus. It covers all kinds of protest events that make up contentious politics from the 1990s to the present day, depending on the availability of usable online sources in each country. For each event, time, place, participant, and organizer type information is available. For events from South Africa and India, information on whether the event is rural or urban, and violent or non-violent, is also available and can be filtered in the ⮕ Dashboard.
As of January 2023, the GLOCON dataset contains protest event data from India, South Africa, Argentina, Brazil, and Turkey. The data processed during its preparation were collected from sources in three languages: English for India and South Africa, Spanish for Argentina, and Portuguese for Brazil. ⮕ Turkey data were manually collected and processed from sources in Turkish. The dataset is updated annually to cover new contentious politics event data from all countries currently included and, as of the inauguration of this website, contains 621,290 events from 9,509,193 news documents (you can find more statistics in the ⮕ Methodology section). With future updates, the GLOCON Dataset research team plans to incorporate many other countries from around the world and aims to produce a truly global contentious politics database.
You can view a mapped and visualized version of the data, with interactive event feature filters and color coding, in the ⮕ Dashboard.
For more information on the Dashboard map, as well as the methodology that guides the creation of the dataset, see the ⮕ Methodology section. The most detailed information on event feature definitions and the training data creation process is in our ⮕ Annotation Manual.
To request access to the raw data, please follow this link to the ⮕ Download section.