Global Database of Events, Language, and Tone
The Global Database of Events, Language, and Tone (GDELT) is a project that records world events in real time from news coverage.
Summary
Item | Value |
---|---|
Start date | Around 2011 |
Data versioning | |
Focus | World events in real time |
Glossary
Various acronyms and terms are used in the GDELT documentation. Since there seems to be no unified place for these on the GDELT website, we collect them here.
Term | Expansion | Meaning | Examples |
---|---|---|---|
GDELT | Global Database of Events, Language, and Tone | The project name | N/A |
GKG | Global Knowledge Graph | One of the tables in GDELT | |
CAMEO | Conflict and Mediation Event Observations | The coding scheme that defines the event (verb) and actor codes used in GDELT | |
CAMEO code | Conflict and Mediation Event Observations code | A defined code used in CAMEO | Example verbs: Express intent to institute political reform, not specified below (CAMEO 034), Allow humanitarian access (CAMEO 0863). Example actors: IGOUNO (the United Nations), COP (Police forces, officers, criminal investigative units, protective agencies). Example religions: ATH (Atheism/Agnosticism), CHR (Christianity).[1]:22, 42, 93, 107 |
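GCAM | GDELT Global Content Analysis Measures | Scores from content-analysis dictionaries (such as sentiment lexicons) computed for each article and stored in the GKG | |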
Actor | N/A | An entity mentioned in an event | Barack Obama, Russia, Microsoft, United Church of Christ in Japan |
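Event | N/A | A real-world occurrence recorded as one row of the Events table | |
Mention | N/A | An article's reference to an event, recorded as one row of the Mentions table | |
Codebook | N/A | Documentation of the table schemas | |
LIWC | Linguistic Inquiry and Word Count | | |
RID | ?? mentioned in [1] | | |
GNS | GEOnet Names Server; mentioned in [2] | | |
GNIS | Geographic Names Information System; mentioned in [3] | | |
TABARI | Text Analysis By Augmented Replacement Instructions[2] | N/A | |
IGO | Intergovernmental organization | | |
NGO | Non-governmental organization | | |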
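The CAMEO examples in the glossary follow two conventions that a short sketch can make concrete. Actor codes are built by concatenating three-character codes (IGOUNO is IGO followed by UNO), and event codes are hierarchical, with the first two digits giving the root category. The Python below is only an illustration under those assumptions, not GDELT's own code; the function names `split_actor_code` and `event_code_levels` are hypothetical.

```python
def split_actor_code(code: str) -> list[str]:
    """Split a CAMEO actor code into its three-character components."""
    if not code or len(code) % 3 != 0:
        raise ValueError(f"actor code length is not a multiple of 3: {code!r}")
    return [code[i:i + 3] for i in range(0, len(code), 3)]


def event_code_levels(code: str) -> dict[str, str]:
    """Return the root (2-digit), base (up to 3-digit), and full event code."""
    if not code.isdigit() or not 2 <= len(code) <= 4:
        raise ValueError(f"expected a 2- to 4-digit event code: {code!r}")
    return {"root": code[:2], "base": code[:3], "full": code}


# Codes are kept as strings so that leading zeros are preserved.
print(split_actor_code("IGOUNO"))
print(event_code_levels("0863"))
```

Treating the codes as strings rather than integers matters here: converting 0863 to a number would drop the leading zero and break the prefix relationship between the levels.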
Versions
The GDELT versioning system is somewhat unclear. In one sense, there are three versions: 1.0, 2.0, and 3.0. However, within each release, the main dataset, the API, the Visual Knowledge Graph, and the GEO API are published at separate times (it is not clear whether the GEO API is distinct from the API). In another sense, therefore, each component carries its own version number; for example, the GDELT Visual Knowledge Graph 2.0 is part of GDELT 3.0.
Release | Part of | Formats available | Year of publication | Month of publication | Years of coverage | Dimensions (inputs) | Metrics (outputs) |
---|---|---|---|---|---|---|---|
GDELT 1.0 | GDELT 1.0 | Raw data files, Google BigQuery | | | 1979–present | | |
GDELT 2.0 | GDELT 2.0 | Raw data files, Google BigQuery | | | | | |
GDELT Visual Knowledge Graph 1.0 | | Raw data files, Google BigQuery | 2016 | February[3] | | Images | |
GDELT Visual Knowledge Graph 2.0 | GDELT 3.0 | Google Cloud Vision API | 2016 | October[4] | | Images | |
GDELT 3.0 | GDELT 3.0 | | 2016/2017[4][5] | | | | |
GDELT GEO 2.0 API | | | 2017 | April[6] | | | |
Data description
GDELT 2.0 contains three tables: Events, Mentions, and the Global Knowledge Graph (GKG). In Events, each row is an event, and only the first article that mentions the event is stored. In both Mentions and the GKG, each row is an article about an event; the difference seems to be that the GKG contains more columns.
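The relationship between Events and Mentions can be sketched with a few rows. The Python below is a minimal illustration that assumes the two tables share an event identifier column, written here as `GlobalEventID` (the exact spelling may differ between the raw files and BigQuery). All row values and URLs are made up for the example, and the helper `mentions_by_event` is hypothetical.

```python
from collections import defaultdict

# Made-up rows for illustration only; real rows carry many more columns.
events = [
    {"GlobalEventID": 1, "SOURCEURL": "https://example.com/first-report"},
    {"GlobalEventID": 2, "SOURCEURL": "https://example.org/other-story"},
]
mentions = [
    {"GlobalEventID": 1, "MentionIdentifier": "https://example.com/first-report"},
    {"GlobalEventID": 1, "MentionIdentifier": "https://example.net/follow-up"},
    {"GlobalEventID": 2, "MentionIdentifier": "https://example.org/other-story"},
]


def mentions_by_event(mention_rows):
    """Group mention identifiers by the event they refer to."""
    grouped = defaultdict(list)
    for row in mention_rows:
        grouped[row["GlobalEventID"]].append(row["MentionIdentifier"])
    return dict(grouped)


grouped = mentions_by_event(mentions)
for event in events:
    urls = grouped.get(event["GlobalEventID"], [])
    # One Events row (with a single stored URL) can map to several Mentions rows.
    print(event["GlobalEventID"], event["SOURCEURL"], len(urls))
```

In this toy data, event 1 keeps one URL in its Events row but has two rows in Mentions, one per article.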
Data dimensions and metrics
The GKG 2.0 fields below are taken from [7]; the GDELT 2.0 Events and Mentions fields are taken from [8].
Table name | General field type | Field names giving information about the general field type |
---|---|---|
GKG 2.0 | Date | DATE, Dates |
GKG 2.0 | Location | Locations, V2Locations |
GKG 2.0 | Entities | Persons, V2Persons, Organizations, V2Organizations |
GKG 2.0 | Topic | Themes, V2Themes |
GKG 2.0 | Sentiment | V2Tone |
GKG 2.0 | Source text | Quotations |
GDELT 2.0 Mentions | Date | EventTimeDate, MentionTimeDate |
GDELT 2.0 Mentions | Source text | MentionSourceName, MentionIdentifier, MentionDocLen |
GDELT 2.0 Mentions | Entities | Actor1CharOffset, Actor2CharOffset |
GDELT 2.0 Mentions | Sentiment | MentionDocTone |
GDELT 2.0 Events | Date | Day, MonthYear, Year, FractionDate |
GDELT 2.0 Events | Entities | Actor1Code, Actor1Name, Actor1CountryCode, Actor1KnownGroupCode, Actor1EthnicCode, Actor1Religion1Code, Actor1Religion2Code, Actor1Type1Code, Actor1Type2Code, Actor1Type3Code (repeated for Actor2) |
GDELT 2.0 Events | Sentiment | AvgTone |
GDELT 2.0 Events | Source text | NumArticles, NumSources, NumMentions, DATEADDED, SOURCEURL |
GDELT 2.0 Events | Location | Actor1Geo_Type, Actor1Geo_Fullname, Actor1Geo_CountryCode, Actor1Geo_ADM1Code, Actor1Geo_ADM2Code, Actor1Geo_Lat, Actor1Geo_Long, Actor1Geo_FeatureID (repeated for Actor2 and Action) |
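The "repeated for Actor2" and "repeated for Actor2 and Action" notes in the Events rows can be spelled out programmatically. The sketch below expands them into full field names, assuming that every geographic field uses the `Geo_` infix and that the Actor2 and Action names differ from the Actor1 names only in their prefix. The constants and function names are hypothetical and not part of any GDELT tooling.

```python
# Suffixes copied from the Actor1 entries of the Events table.
ACTOR_SUFFIXES = [
    "Code", "Name", "CountryCode", "KnownGroupCode", "EthnicCode",
    "Religion1Code", "Religion2Code", "Type1Code", "Type2Code", "Type3Code",
]
GEO_SUFFIXES = [
    "Type", "Fullname", "CountryCode", "ADM1Code", "ADM2Code",
    "Lat", "Long", "FeatureID",
]


def actor_fields() -> list[str]:
    """Entity fields for both actors, e.g. Actor2Religion1Code."""
    return [f"{actor}{suffix}"
            for actor in ("Actor1", "Actor2")
            for suffix in ACTOR_SUFFIXES]


def geo_fields() -> list[str]:
    """Location fields for both actors and the action, e.g. ActionGeo_Lat."""
    return [f"{prefix}Geo_{suffix}"
            for prefix in ("Actor1", "Actor2", "Action")
            for suffix in GEO_SUFFIXES]


print(len(actor_fields()), "entity fields")
print(len(geo_fields()), "location fields")
```

Generated names should still be checked against the codebook before use, since this only mirrors the pattern described in the table.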
Data sources
The process by which GDELT discovers news articles is not clearly documented.
From the about page:[9]
Today GDELT relies on hundreds of thousands of broadcast, print, and online news sources from every corner of the globe in more than 100 languages and its source list grows daily. In addition to worldwide translated news material, the historical backfile of GDELT stretching back to 1979 makes extensive use of AfricaNews, Agence France Presse, Associated Press, Associated Press Online, Associated Press Worldstream, BBC Monitoring, Christian Science Monitor, Facts on File, Foreign Broadcast Information Service, The New York Times, United Press International and The Washington Post.
Auxiliary
In addition to the actual news articles, some auxiliary data sources are used for sentiment analysis and location identification.[10]
Methods of estimation
People
Kalev Leetaru and Philip Schrodt are the co-creators. Who else is involved in the project? How many people work on this? How many person-hours are spent on GDELT per year?
Reception
Usage in debates
See also
External links
References
1. "CAMEO.Manual.1.1b3.pdf" (PDF). Retrieved October 17, 2017.
2. "TABARI: Text Analysis By Augmented Replacement Instructions". Retrieved October 18, 2017.
3. "GDELT Visual Knowledge Graph (VGKG) V1.0 Available". GDELT Blog. February 13, 2016. Retrieved October 18, 2017.
4. "VGKG 2.0 Released - GDELT Blog". GDELT Blog. October 16, 2016. Retrieved October 18, 2017. Quote: "Finally, as the first GDELT 3.0 release, the VGKG 2.0 data stream updates every 60 seconds, allowing you near-realtime access to a codified view of global visual narratives."
5. "GDELT 3.0 And Using BigQuery And Streaming Google Cloud Storage For Logging - GDELT Blog". GDELT Blog. March 24, 2017. Retrieved October 18, 2017.
6. "GDELT GEO 2.0 API Debuts! - GDELT Blog". GDELT Blog. April 26, 2017. Retrieved October 18, 2017.
7. "Google BigQuery". Retrieved October 17, 2017.
8. "The GDELT event database data format codebook v2.0" (PDF). February 19, 2015. Retrieved October 17, 2017.
9. "The GDELT Story: About the GDELT Project". Retrieved October 18, 2017.
10. "GCAM Codebook". Retrieved October 17, 2017.