Global Database of Events, Language, and Tone


The Global Database of Events, Language, and Tone (GDELT) is a project that tracks world events in real time.

Summary

| Item | Value |
| --- | --- |
| Start date | Around 2011 |
| Data versioning | |
| Focus | World events in real time |

Glossary

Various acronyms and terms are used in the GDELT documentation. Since there seems to be no unified place for these on the GDELT website, we collect them here.

| Term | Expansion | Meaning | Examples |
| --- | --- | --- | --- |
| GDELT | Global Database of Events, Language, and Tone | The project name | N/A |
| GKG | Global Knowledge Graph | One of the tables in GDELT | |
| CAMEO | Conflict and Mediation Event Observations | ???? | |
| CAMEO code | Conflict and Mediation Event Observations code | A defined code used in CAMEO | Example verbs: "Express intent to institute political reform, not specified below" (CAMEO 034); "Allow humanitarian access" (CAMEO 0863). Example actors: IGOUNO (the United Nations); COP (Police officers, officers, criminal investigative units, protective agencies). Example religions: ATH (Atheism/Agnosticism); CHR (Christianity).[1]:22, 42, 93, 107 |
| GCAM | GDELT Global Content Analysis Measures | ???? | |
| Actor | N/A | An entity mentioned in an event | Barack Obama, Russia, Microsoft, United Church of Christ in Japan |
| Event | N/A | | |
| Mention | N/A | | |
| Codebook | N/A | Documentation of the table schemas | |
| LIWC | Linguistic Inquiry and Word Count | | |
| RID | ?? | mentioned in [1] | |
| GNS | GEOnet Names Server | mentioned in [2] | |
| GNIS | Geographic Names Information System | mentioned in [3] | |
| TABARI | Text Analysis By Augmented Replacement Instructions[2] | N/A | |
| IGO | Intergovernmental organization | | |
| NGO | Non-governmental organization | | |
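
The CAMEO examples in the glossary can be taken apart mechanically. The sketch below is illustrative only and rests on two assumptions that should be checked against the CAMEO manual: that actor codes are built from three-letter segments (so IGOUNO is IGO followed by UNO), and that event codes are hierarchical, with the first two digits as a root category and the first three as a base code. The function names are hypothetical and are not part of any GDELT or CAMEO tooling.

```python
from __future__ import annotations


def split_actor_code(code: str) -> list[str]:
    """Split a CAMEO-style actor code into three-letter segments.

    Assumption: every segment is exactly three characters long.
    """
    if not code or len(code) % 3 != 0:
        raise ValueError(f"actor code length must be a multiple of 3: {code!r}")
    return [code[i:i + 3] for i in range(0, len(code), 3)]


def event_code_levels(code: str) -> dict[str, str]:
    """Return the assumed root (2-digit) and base (3-digit) prefixes of an event code."""
    if not code.isdigit() or not 2 <= len(code) <= 4:
        raise ValueError(f"expected a 2- to 4-digit event code: {code!r}")
    return {"root": code[:2], "base": code[:3], "full": code}


if __name__ == "__main__":
    # Examples taken from the glossary table.
    print(split_actor_code("IGOUNO"))
    print(event_code_levels("0863"))
    print(event_code_levels("034"))
```

Codes are kept as strings throughout so that leading zeros, as in 034 and 0863, are preserved.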

Versions

The GDELT versioning system is a little unclear. In one sense, there are three versions: 1.0, 2.0, and 3.0. However, within each release the main dataset, the API, the Visual Knowledge Graph, and the GEO API (is this distinct from the API?) are published at separate times, so in another sense each component has its own version number. For example, the GDELT Visual Knowledge Graph 2.0 is part of GDELT 3.0.

| Release | Part of | Formats available | Year of publication | Month of publication | Years of coverage | Dimensions (inputs) | Metrics (outputs) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GDELT 1.0 | GDELT 1.0 | Raw data files, Google BigQuery | | | 1979–present | | |
| GDELT 2.0 | GDELT 2.0 | Raw data files, Google BigQuery | | | | | |
| GDELT Visual Knowledge Graph 1.0 | | Raw data files, Google BigQuery | 2016 | February[3] | | Images | |
| GDELT Visual Knowledge Graph 2.0 | GDELT 3.0 | Google Cloud Vision API | 2016 | October[4] | | Images | |
| GDELT 3.0 | GDELT 3.0 | | 2016/2017[4][5] | | | | |
| GDELT GEO 2.0 API | | | 2017 | April[6] | | | |

Data description

GDELT 2.0 contains three tables: Events, Mentions, and the Global Knowledge Graph (GKG). In both Mentions and the GKG, each row is an article about an event; the difference seems to be that the GKG contains more columns. In Events, by contrast, each row is an event, and only the first article that mentions the event is stored.
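
The relationship between Mentions and Events described above can be illustrated with a small sketch: given several mention rows, keep only the earliest one per event, which is roughly what "only the first article is stored" amounts to. This is not GDELT's actual pipeline. The join key `global_event_id` and the integer timestamp format (YYYYMMDDHHMMSS) are assumptions not stated on this page, and the URLs are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    global_event_id: int     # assumed key linking a mention to its event
    mention_time_date: int   # MentionTimeDate, assumed to be YYYYMMDDHHMMSS
    mention_identifier: str  # MentionIdentifier, e.g. the article URL


def first_mentions(mentions: list[Mention]) -> dict[int, Mention]:
    """Return, for each event ID, the mention with the earliest timestamp."""
    first: dict[int, Mention] = {}
    for m in mentions:
        current = first.get(m.global_event_id)
        if current is None or m.mention_time_date < current.mention_time_date:
            first[m.global_event_id] = m
    return first


if __name__ == "__main__":
    rows = [
        Mention(1, 20171018120000, "https://example.com/b"),
        Mention(1, 20171018093000, "https://example.com/a"),
        Mention(2, 20171018100000, "https://example.com/c"),
    ]
    for event_id, mention in sorted(first_mentions(rows).items()):
        print(event_id, mention.mention_identifier)
```

In this toy model the Mentions table holds three rows, while an Events-style view holds two, one per event, each pointing at its earliest article.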

Data dimensions and metrics

The GKG 2.0 fields are taken from [7]; the GDELT 2.0 Events and Mentions fields are taken from [8].

| Table name | General field type | Field names giving information about the general field type |
| --- | --- | --- |
| GKG 2.0 | Date | DATE, Dates |
| GKG 2.0 | Location | Locations, V2Locations |
| GKG 2.0 | Entities | Persons, V2Persons, Organizations, V2Organizations |
| GKG 2.0 | Topic | Themes, V2Themes |
| GKG 2.0 | Sentiment | V2Tone |
| GKG 2.0 | Source text | Quotations |
| GDELT 2.0 Mentions | Date | EventTimeDate, MentionTimeDate |
| GDELT 2.0 Mentions | Source text | MentionSourceName, MentionIdentifier, MentionDocLen |
| GDELT 2.0 Mentions | Entities | Actor1CharOffset, Actor2CharOffset |
| GDELT 2.0 Mentions | Sentiment | MentionDocTone |
| GDELT 2.0 Events | Date | Day, MonthYear, Year, FractionDate |
| GDELT 2.0 Events | Entities | Actor1Code, Actor1Name, Actor1CountryCode, Actor1KnownGroupCode, Actor1EthnicCode, Actor1Religion1Code, Actor1Religion2Code, Actor1Type1Code, Actor1Type2Code, Actor1Type3Code (repeated for Actor2) |
| GDELT 2.0 Events | Sentiment | AvgTone |
| GDELT 2.0 Events | Source text | NumArticles, NumSources, NumMentions, DATEADDED, SOURCEURL |
| GDELT 2.0 Events | Location | Actor1Geo_Type, Actor1Geo_Fullname, Actor1Geo_CountryCode, Actor1Geo_ADM1Code, Actor1Geo_ADM2Code, Actor1Geo_Lat, Actor1Geo_Long, Actor1Geo_FeatureID (repeated for Actor2 and Action) |
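
The "repeated for Actor2" and "repeated for Actor2 and Action" notes in the Events rows can be expanded into full column lists. The sketch below does this by combining prefixes with the suffixes listed in the table. It assumes that every location field uses the `Geo_` infix and that the Action fields follow the same pattern as the actor fields; the helper names are hypothetical, and the resulting lists should be checked against the codebook [8] before use.

```python
from __future__ import annotations

# Suffixes taken from the Actor1 entries in the Events rows of the table.
ENTITY_SUFFIXES = [
    "Code", "Name", "CountryCode", "KnownGroupCode", "EthnicCode",
    "Religion1Code", "Religion2Code", "Type1Code", "Type2Code", "Type3Code",
]
GEO_SUFFIXES = [
    "Type", "Fullname", "CountryCode", "ADM1Code",
    "ADM2Code", "Lat", "Long", "FeatureID",
]


def entity_fields() -> list[str]:
    """Entity columns for both actors, e.g. Actor2Religion1Code."""
    return [f"{actor}{suffix}"
            for actor in ("Actor1", "Actor2")
            for suffix in ENTITY_SUFFIXES]


def location_fields() -> list[str]:
    """Location columns for both actors and the action, e.g. ActionGeo_Lat."""
    return [f"{prefix}Geo_{suffix}"
            for prefix in ("Actor1", "Actor2", "Action")
            for suffix in GEO_SUFFIXES]


if __name__ == "__main__":
    print(len(entity_fields()), "entity columns")
    print(len(location_fields()), "location columns")
    print(location_fields()[:3])
```

Generating the names from prefixes and suffixes keeps the Actor1, Actor2, and Action groups in step, which helps when building a column list for a query or a CSV reader.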

Data sources

It is not clear how GDELT discovers the news articles it processes.

From the about page:[9]

Today GDELT relies on hundreds of thousands of broadcast, print, and online news sources from every corner of the globe in more than 100 languages and its source list grows daily. In addition to worldwide translated news material, the historical backfile of GDELT stretching back to 1979 makes extensive use of AfricaNews, Agence France Presse, Associated Press, Associated Press Online, Associated Press Worldstream, BBC Monitoring, Christian Science Monitor, Facts on File, Foreign Broadcast Information Service, The New York Times, United Press International and The Washington Post.

Auxiliary

In addition to the actual news articles, some auxiliary data sources are used for sentiment analysis and location identification.[10]

Methods of estimation

People

Kalev Leetaru and Philip Schrodt are the co-creators. Who else is involved in the project? How many people work on this? How many person-hours are spent on GDELT per year?

Reception

[4]

Usage in debates

See also

External links

References

  1. "CAMEO.Manual.1.1b3.pdf" (PDF). Retrieved October 17, 2017. 
  2. "TABARI: Text Analysis By Augmented Replacement Instructions". Retrieved October 18, 2017. 
  3. "GDELT Visual Knowledge Graph (VGKG) V1.0 Available". GDELT Blog. February 13, 2016. Retrieved October 18, 2017. 
  4. "VGKG 2.0 Released - GDELT Blog". GDELT Blog. October 16, 2016. Retrieved October 18, 2017. "Finally, as the first GDELT 3.0 release, the VGKG 2.0 data stream updates every 60 seconds, allowing you near-realtime access to a codified view of global visual narratives."
  5. "GDELT 3.0 And Using BigQuery And Streaming Google Cloud Storage For Logging - GDELT Blog". GDELT Blog. March 24, 2017. Retrieved October 18, 2017. 
  6. "GDELT GEO 2.0 API Debuts! - GDELT Blog". GDELT Blog. April 26, 2017. Retrieved October 18, 2017. 
  7. "Google BigQuery". Retrieved October 17, 2017. 
  8. "The GDELT event database data format codebook v2.0" (PDF). February 19, 2015. Retrieved October 17, 2017. 
  9. "The GDELT Story: About the GDELT Project". Retrieved October 18, 2017. 
  10. "GCAM Codebook". Retrieved October 17, 2017.