Sampling

By Gabor M. Toth and Brendan Dooley

Any way you look at it, our analysis of the documents in the Euronews Project involves sampling.  To understand early modern news culture we need  to take soundings within a vast amount of material.  But the problem doesn’t end there, as we will try to show in this post; and in future posts we will present our solutions.

In short, the task of sampling can be defined as “obtaining an independent, or almost independent, set of observations for use with classical statistical procedures.” [Peter A. Rogerson, in Encyclopedia of Social Measurement, 2005] A key aspect of sampling is obtaining not only an independent set of observations but also observations that are representative of the domain they describe. Another definition highlights this aspect: “Sampling is the technique of getting a clear statistical picture of a whole universe..., from a small representative sample of the whole. The sample must be drawn with mathematical precision.” [From an unpaginated brochure, Census USA/A Thumbnail History of the Nation’s Factfinder, Bureau of the Census, 1974, cited in Kruskal, William, and Frederick Mosteller. “Representative Sampling, III: The Current Statistical Literature.” International Statistical Review / Revue Internationale de Statistique, vol. 47, no. 3, Wiley, International Statistical Institute (ISI), 1979, pp. 245–65, https://doi.org/10.2307/1402647.]

There are also linguistic peculiarities, as Kruskal and Mosteller pointed out in their seminal article. How has the concept been expressed in different languages? “The Italian word campione means both champion and sample – thus embodying the Emersonian sense of the superior specimen.” They go on to explain how this peculiarity arose. “One possible etymology is that campione (from Latin campus) at first meant fighter, gladiator, etc; from there it extended to the sense of champion, and then extended further to the sense of a sample of merchandise presumably as champion of the whole.” But isn’t a champion exactly the opposite of a representative sample? Or to put it another way, can a champion be representative and superior at the same time? “We are reminded,” say Kruskal and Mosteller, “of the everpresent tension between a sample that typifies and a sample that glorifies, a tension that is with us in daily life.” A rather homely example follows, of how the most visible may not necessarily be the most representative, as “whenever we buy a box of strawberries.”
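The tension between a sample that typifies and a sample that glorifies can be made concrete with a toy simulation. The Python sketch below is only an illustration under invented assumptions: the strawberry quality scores are randomly generated, and the names box and compare_samples are hypothetical, not part of any Euronews tooling. It compares the average of a simple random sample with the average of the "champions", the best berries placed on top of the box.

```python
import random
import statistics


def compare_samples(population, k, seed=0):
    """Return (mean of a simple random sample, mean of the top-k 'champions')."""
    rng = random.Random(seed)
    # A simple random sample: every item has the same chance of being drawn.
    typical = rng.sample(population, k)
    # A "champion" sample: only the k best items, like the berries on top.
    champions = sorted(population, reverse=True)[:k]
    return statistics.mean(typical), statistics.mean(champions)


# Assumption: 1000 synthetic quality scores, invented purely for illustration.
rng = random.Random(42)
box = [rng.gauss(5.0, 1.5) for _ in range(1000)]

true_mean = statistics.mean(box)
random_mean, champion_mean = compare_samples(box, k=50)

print(f"whole box:       {true_mean:.2f}")
print(f"random sample:   {random_mean:.2f}")
print(f"champion sample: {champion_mean:.2f}")
```

In this toy setting the random sample should land close to the average of the whole box, while the champion sample overstates it, which is the sense in which a champion glorifies rather than typifies.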

 

Now let us see how sampling can be understood in the context of the Euronews Project. 

In the last two years our team has transcribed a large number of handwritten news letters (or news sheets) that reached Florence in the early modern period. As a result, we have a digitised corpus of these documents, the Euronews Corpus. At first glance, our corpus may seem to be a representative sample of all news sheets that circulated in early modern Europe.

The Florentine State Archive is one of the most intact archives in the world. Presumably, most of the Medici collection has survived to the present day. Hence it is a place where we could expect to find a representative sample of the news sheets that circulated in the early modern period. Representative here means that by studying the news sheets in Florence we can analyse not only the Florentine context but also early modern news culture as a whole. In other words, by studying the news sheets in the Florentine State Archive we can gain insights into how news culture worked in the early modern period.

However, much happened between the time the news sheets circulated and the time the archive was formed, and much material would have been lost along the way. In the Introduction to the Inventario sommario, Antonio Panella explains how Cosimo I de’ Medici began setting up the archive after moving the family from the Medici Palace in Via Larga over to Palazzo della Signoria. “But it was a casual formation, just as the formative process of that State in gestation, depending on chance and casual circumstances more than on a predetermined plan. The material was accumulated without any order, even after the Court had moved its seat to the Palazzo Pitti, purchased by Cosimo I for his wife Eleonora. The transfer led to a doubling of the archive, with the main core remaining where it was, while another part, that is, documents that seemed to be either more important or of immediate interest or of a family-related character, passed to the Pitti Palace.” [Archivio mediceo del Principato. Inventario sommario a cura di Marcello Del Piazzo. Introduzione di Antonio Panella, Roma, Ministero dell’Interno, 1966] Things only started to become more organised under Grand Duke Pietro Leopoldo in the eighteenth century, with help from Riguccio Galluzzi and others. This paved the way, so to speak, for the Archivio di Stato, constituted officially the following century, shortly before Italian unification, under the last Grand Duke of Tuscany, Leopoldo II.

As matters now stand, the archive itself poses numerous challenges. Over the centuries the Florentine State Archive was often reorganised and restructured, and the news letters that were sent to Florence are now dispersed across various sub-collections. This makes the systematic gathering of all news sheets from the archive a difficult, if not impossible, task, so we could not hope to collect every news letter preserved in Florence. Furthermore, we had to face another difficulty: the COVID-19 pandemic. The Florentine State Archive was closed to researchers for 2 years.

At the same time, we also need to be sceptical about the whole of the news items preserved in the Florentine State Archive. We can never be sure that this whole is actually a representative sample of all news letters that circulated in the early modern period. No one knows exactly what the original body of material might have looked like had it survived intact (although, as we will point out in a later post, we can make an educated guess).
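Why a carefully drawn sample of the archive may still misrepresent what originally circulated can be sketched with a small simulation. Everything in it is an assumption made for illustration: the two origins "A" and "B", the equal numbers of sheets, and the survival probabilities are invented, and they say nothing about the actual survival of news sheets in Florence.

```python
import random
from collections import Counter


def shares(items):
    """Return the proportion of each label in a list of labels."""
    counts = Counter(items)
    total = len(items)
    return {label: counts[label] / total for label in sorted(counts)}


rng = random.Random(1)

# Assumption: 10,000 circulated sheets, half from each hypothetical origin.
circulated = ["A"] * 5000 + ["B"] * 5000

# Assumption: sheets from origin A were three times as likely to be kept.
survival_prob = {"A": 0.6, "B": 0.2}

# The "archive" holds only the sheets that happened to survive.
archive = [origin for origin in circulated if rng.random() < survival_prob[origin]]

# Even a perfectly random sample of the archive inherits the archive's bias.
corpus = rng.sample(archive, 500)

print("circulated:", shares(circulated))
print("archive:   ", shares(archive))
print("corpus:    ", shares(corpus))
```

In this sketch the corpus is representative of the archive, but not of the circulated whole: origin A accounts for half of what circulated and for a clearly larger share of what survives. Correcting for this would require knowing, or estimating, how survival varied.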

To summarise, although we have attempted to construct a data set that is as representative of early modern news letters as possible, doing so remains an ongoing challenge. Despite this challenge, we continue to ask how, by drawing on the Euronews Corpus, we can gain insight into the lost whole of all news letters that circulated in the early modern period.

Gabor Toth
