Secretary Clinton's Email (Source: Wikileaks)


This application lets you interactively filter 32,795 emails sent during Hillary Clinton's tenure as United States Secretary of State and displays features of the selected subset. The data is extracted from HTML representations of the official State Department release, as provided in a Wikileaks database.

Analysis

Peak Email

Email Gaps

Email Times

This service is not meant to be a stand-alone means of analyzing this controversial data set. It is most powerful when used alongside internet searches and either the Wikileaks database or the official State Department site. The latter two provide indispensable context and precision that the primarily metadata-driven displays here cannot. The intended use of this application is instead to explore patterns present in the data in order to generate and examine different hypotheses.

It is hoped that this service will do more than provide a means of exploring this particular data set, and will also demonstrate how much can be discovered about an individual by using visual analytics of seemingly uninformative metadata to motivate searches of public data. The dissemination of data in the modern world is a topic of heated discussion. Hopefully, firsthand experience of how data can be leveraged will prove informative and help you form your own opinion on the subject.

Peak Email


In the email volume plot, an obvious peak can be seen near the centre of the time series. Using the date selection slider, the day of highest email volume can be identified as August 21, 2011, at the start of the Battle of Tripoli in the Libyan Civil War. Inspecting the term frequency and tf-idf plots for this day reveals a host of terms related to this conflict. The United States and NATO were both heavily involved, so the peak is unsurprising. In fact, many other local maxima correspond to events related to the Arab Spring and the countries affected by this revolutionary wave.
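The peak-finding and tf-idf steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual code: the `emails` list is made-up toy data, each day is treated as one "document", and the function names `peak_day` and `tfidf_for_day` are hypothetical.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical toy data: (send date, subject text). These are NOT real emails.
emails = [
    (date(2011, 8, 20), "schedule for monday"),
    (date(2011, 8, 21), "tripoli update"),
    (date(2011, 8, 21), "tripoli rebels advance"),
    (date(2011, 8, 21), "call schedule"),
    (date(2011, 8, 22), "schedule change"),
]

def peak_day(emails):
    """Return (day, count) for the day with the most emails."""
    counts = Counter(day for day, _ in emails)
    return counts.most_common(1)[0]

def tfidf_for_day(emails, target):
    """Score the terms used on `target`, treating each day as one document.

    tf  = term count on the day / total terms on the day
    idf = log(number of days / number of days containing the term)
    """
    docs = {}
    for day, text in emails:
        docs.setdefault(day, Counter()).update(text.lower().split())
    tf = docs[target]
    total = sum(tf.values())
    n_docs = len(docs)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for terms in docs.values() if term in terms)
        scores[term] = (count / total) * math.log(n_docs / df)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

day, count = peak_day(emails)
ranked = tfidf_for_day(emails, day)
```

In this toy data, a word that appears on every day ("schedule") scores zero, while a word concentrated on the peak day ("tripoli") ranks first. That is the property that makes tf-idf more revealing than raw term frequency for a selected date range.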

Email Gaps


There are a number of conspicuous time periods in which no emails are recorded in this data set. The most obvious of these occurs in early November 2012. This period marks the beginning of much of the increased controversy surrounding the 2012 attack on the US diplomatic compound in Benghazi, and also includes the 2012 US Presidential Election.

The time slider can be used to select a period surrounding this gap that includes the Benghazi attack, for example September 11 to November 23, 2012. In this selection, mentions of terms related to the attack, such as Benghazi and Ansar al-Sharia, can be seen. The network plot also reveals one of the contentious points of interest in Clinton's emails: the nature and frequency of her contact with Sidney Blumenthal during the Benghazi attack and shortly thereafter. We can also see contact with an account of unidentifiable domain labelled 'aclb.' Using internet searches and inspecting the emails, this account can be identified as that of Tony Blair, with the four-letter string likely standing for his full initials (Anthony Charles Lynton Blair). Finally, many of the emails surrounding this gap contain some FOIA redaction, as is clearly visible in the barplot of FOIA redaction codes.

Other gaps in the data can be found by narrowing the slider range, selecting the centre bar, and dragging this small window across the whole time range with the 'Show Emails' filter set to show only mail from Clinton. Doing this reveals a number of periods with no email. By selecting the foreign travel tickbox, some of these can be identified as corresponding to official state visits. Other gaps occur near less typical events, such as a gap in mid-June 2009, likely due to Clinton fracturing her elbow. Another gap in December 2012 coincides with the resignation of four State Department officials following the results of the Benghazi investigation.
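The sliding-window search described above amounts to looking for runs of consecutive days with no sent mail. A minimal sketch of that idea follows, assuming only a list of send dates is available. The function name `find_gaps`, the `min_days` threshold, and the sample dates are all hypothetical and are not taken from the data set.

```python
from datetime import date, timedelta

def find_gaps(send_dates, min_days=3):
    """Return (first_silent_day, last_silent_day, length_in_days) for each
    run of at least `min_days` consecutive days with no email."""
    days = sorted(set(send_dates))
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        silent = (nxt - prev).days - 1  # days strictly between two send days
        if silent >= min_days:
            gaps.append((prev + timedelta(days=1),
                         nxt - timedelta(days=1),
                         silent))
    return gaps

# Hypothetical send dates for illustration only.
sent = [date(2009, 6, 15), date(2009, 6, 16),
        date(2009, 6, 22), date(2009, 6, 23)]
gaps = find_gaps(sent)
```

Each gap found this way is only a starting point: it still has to be compared against travel records, news searches, and the emails themselves before any explanation is accepted.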

A number of other gaps of possible interest are not discussed here, and you are encouraged to investigate for yourself any period you find interesting. However, you should always be mindful of the tendency we all have to seek information that confirms our preconceptions, and try as much as possible to be honest and unbiased in your investigations.

Email Times


Filtering to the emails sent by Clinton during her tenure, we can glimpse her sending patterns by looking at the email time plot. These patterns are fairly regular, with most activity occurring at night and a gap between roughly 6 and 11 pm. There are also some strikingly consistent sending times visible in the data, alternating between 2 and 3 am. The switching between 2 and 3 am matches exactly the switching of daylight saving time in North America, suggesting that these emails represent some server function performed every 24 hours.
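One way such a 2 am / 3 am alternation can arise is if an automated job fires at a fixed UTC time and the timestamps are displayed in a zone that observes daylight saving time. The sketch below illustrates only that mechanism. The 07:00 UTC firing time and the choice of `America/New_York` as the display zone are assumptions made for the example, not facts established from the data.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Assumption: timestamps are displayed in US Eastern time.
EASTERN = ZoneInfo("America/New_York")

def local_hour(utc_dt):
    """Hour of day shown in Eastern time for a UTC timestamp."""
    return utc_dt.astimezone(EASTERN).hour

# Hypothetical job that fires at 07:00 UTC every day.
winter = datetime(2011, 1, 15, 7, 0, tzinfo=timezone.utc)  # standard time, UTC-5
summer = datetime(2011, 7, 15, 7, 0, tzinfo=timezone.utc)  # daylight time, UTC-4

winter_hour = local_hour(winter)
summer_hour = local_hour(summer)
```

Under these assumptions the same daily job appears at 2 am in winter and 3 am in summer, which is the pattern of hour-long shifts described above.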

[Plots: 20 Highest TF-IDF Terms; 20 Highest Frequency Terms]