Outliers and anomalies

By Gabor M. Toth and Brendan Dooley

When studying a data set, the very first step is a descriptive analysis, which gives a quantitative overview of the data. A descriptive statistical analysis shows how skewed or balanced a data set is; ultimately it informs more complex analysis because it reveals the limitations of the data set. As part of a descriptive analysis, we first count the individual data points and the different features assigned to each. Second, we visualize their distributions. For instance, in the Euronews Corpus we studied the distribution of the number of news items over time (per month, per decade, per 25 years, etc.). Below you can see the plot showing the total number of news items in each year between 1540 and 1730. The plot helps us identify gaps and shows for which years and decades we need to collect more data.
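
The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample years are hypothetical, the list holds one year per news item, and none of it is real corpus data or the project's actual code.

```python
from collections import Counter


def items_per_year(years):
    """Count how many news items fall in each year (one list entry per item)."""
    return Counter(years)


def gap_years(counts, start, end):
    """Return the years in [start, end] for which no news item was counted."""
    # A Counter returns 0 for keys it has never seen, so missing years are safe.
    return [year for year in range(start, end + 1) if counts[year] == 0]


# Hypothetical years, one entry per news item (not real corpus data).
sample_years = [1540, 1540, 1541, 1543, 1543, 1543, 1545]
counts = items_per_year(sample_years)
print(gap_years(counts, 1540, 1545))  # [1542, 1544]
```

The same counts can be grouped by decade or by 25-year period by mapping each year to the start of its period before counting.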

As part of descriptive statistical analysis, we also try to detect outliers in a data set. Outliers are anomalies: extreme values that differ markedly from the rest of the data points. It is important to find them because they can indicate errors in the data collection process, or they can be genuine novelties. They also matter because they can introduce noise into a data set and increase its skewness; they can eventually mislead interpretations and confuse models meant for more complex analysis. As a good practice, outliers are detected and eliminated from the data set.

Throughout the descriptive analysis of the Euronews Corpus, we investigated the distribution of news items per news sheet. The present analysis is based on a sample of 1888 news sheets. On average, a news sheet contains 5.86 news items. The box plot below shows the distribution of news items per news sheet. The green line in the box represents the median, which is five: approximately half of the news sheets contain fewer than five news items, and half contain more than five. We can also see that 75% of news sheets contain fewer than 8 news items (the upper edge of the box) and 25% contain fewer than 2 (the lower edge of the box). The box plot also shows the maximum number of news items excluding outliers, which is 17. Finally, the box plot renders the outliers, marked with black dots in the upper part of the plot: each news sheet containing more than 17 news items is an outlier. In the second box plot below, you can see that once we remove news sheets containing more than 17 news items, no outliers remain. (To identify outliers we also calculated the z-score of each news sheet.)
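
The cut-off of 17 is consistent with the common 1.5 × IQR convention for box-plot whiskers: with the lower quartile at 2 and the upper quartile at 8, the interquartile range is 6, so the upper fence is 8 + 1.5 × 6 = 17. The Python sketch below reproduces that arithmetic and shows one way to compute z-scores. It assumes, rather than knows, that the plot was drawn with the 1.5 × IQR convention; the function names and the sample counts are hypothetical and are not corpus data.

```python
from statistics import mean, pstdev


def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences; values outside them are flagged as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def z_scores(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


# Quartiles reported in the text: lower quartile 2, upper quartile 8.
lower, upper = tukey_fences(2, 8)
print(lower, upper)  # -7.0 17.0

# Hypothetical news-item counts for ten news sheets (not real corpus data).
sample_counts = [1, 2, 2, 3, 5, 5, 6, 8, 9, 32]
flagged = [c for c in sample_counts if c < lower or c > upper]
print(flagged)  # [32]

# The sheet with 32 items also has the largest z-score in this sample.
scores = z_scores(sample_counts)
```

The two measures do not always agree: the fences depend only on the quartiles, while z-scores depend on the mean and standard deviation, which extreme values themselves pull upward.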

Next we will examine an outlier we randomly selected from the data set. This is a news sheet from Rome containing 32 individual news items.

Why such an outlier? Further research suggests that MapDocId#54512 (Mediceo del Principato, Volume 3462, ff. nn.) is an instructive rarity from the standpoint of news research. It comprises eight unnumbered pages and a particularly long and rich collection of news items, all assembled in a single handwritten newsletter under the heading “Roma, li 21 Aprile 1736.” The first story is about the Roman popular uprising of 1736, famously discussed in the opening chapter of Franco Venturi’s Settecento riformatore, vol. 1. It reads:

“Two couriers from Spain appeared, the first of which went on to Naples, the other brought the answers of that Court [i.e. Spain] concerning the sensational representations made by these royal agents [in Rome] about the recent popular uprising. For this reason, a meeting was immediately held in the Palazzo regio of Spain [i.e. Palazzo Farnese] with the Most Eminent Belluga, the two Spanish Auditors of the Rota, General Marcillach, the Cavalry Captain Viavil, and other servants of the Crown.”

[Comparvero due corrieri da Spagna, il primo de quali passò a Napoli, l’altro portò le risposte di quella Corte sulle strepitose rappresentanze fatte da questi agenti Regij intorno alla passata sollevazione popolare. Per il che nel Palazzo Regio di Spagna fù subito tenuto un congresso coll’Eminentissimo Belluga, li due Auditori di Ruota spagnoli, il generale Marcillach, il Capitano della Cavalleria Viavil, et altri dependenti della Corona.]

Other stories follow. But the narrative dynamic is unlike that of many other newsletters, where miscellaneous material is joined in a concatenation of separate stories by phrases like “it is said that” and “and that”, as in the following example:

“In letters from Augusta, the 24th of February [1566] […]

That money was being sent to various parts of Germany to make up the 4 regiments of Germans in the name of the Catholic king which His Imperial Majesty has allowed to be created by those leaders who were previously written about.

That the show [of arms] will take place in Frissene et Weingart.

That by way of Count Alberigo di Londrone who came from Spain by coach it is understood that His Catholic Majesty will send ten thousand Spaniards to Italy.

That the king of Sweden had sent to thank His Imperial Majesty for the good affection that can be seen leading to peace […]

That His Imperial Majesty had been promised considerable help by several private princes.

That His Majesty let it be understood that if the Turk himself comes to Hungary, as they say, he himself also wants to go there in safety.

That as far as could be understood the Turk wanted to divide his forces into two parts, one [going] towards Transylvania and the other towards Buda, to prevent the Christians [deleted: who] from being able to help against Transylvania.

That it was understood that the Turk had sent 40,000 scudi to the Transylvanian to recruit people [i.e., soldiers].”

[In lettere di Augusta, il xxiv febbraio 1566 […]

Che si dava danari in diverse parti della Alemagna per fare li iiii reggimenti di todeschi a nome del re cattolico quali Sua Maestà Cesarea ha concessi di potere fare sotto quei capi che già si scrisse.

Che la mostra si farà a Frissene et Weingart.

Che per via del conte Alberigo di Londrone il quale viene di Spagna in poste s’intende che Sua Maestà Cattolica manderà in Italia Xmila spagnoli.

Che il re di Svezia haveva mandato a ringraziare Sua Maestà Cesarea della buona affezione che si vede che porta alla pace […]

Che a Sua Maestà Cesarea erano stati promessi grandi aiuti da diversi principi privati.

Che Sua Maestà si lasciava intendere che se il Turco viene in persona in Ungheria, come si dice, vuole ancor esso andarvi al sicuro.

Che per quanto si poteva intendere il Turco voleva dividere le sue forze in due parti, una verso la Transilvania et l’altra verso Buda per impedire che i christiani [deleted: che] non possino soccorrere contro il Transilvano.

Che si era inteso il Turco havere mandato 40 mila scudi al Transilvano per fare genti.]

By contrast, the 1736 reports are joined rhetorically by references back and forth. For instance:

“Monday the 23rd. In the morning at the Palace a Congregation was convened of the above mentioned cardinals to examine the said three Spanish petitions. Meanwhile, since the Spanish petitions were being publicized throughout the city every class of persons…”

[Lunedì 23. Si tenne la mattina a Palazzo la congregazione de suddetti Cardinali per esaminare le accennate tre petizioni spagnole. Intanto essendosi incominciate a pubblicare per la Città le petizioni spagnole, ogni ceto di persone….]

Divisions between stories are modulated by such references.

“A courier came from Velletri with the news that the people, refusing to receive the Spaniards, had taken up arms to the number of 4 thousand and more, and that, having incited the neighboring lands, the number of armed men was growing: that the fury of the people was great, and they had even raised the Papal standard, and were beating the military drum, and digging up earth and making palisades to protect the territory in the least secure places, shouting Long Live the Pope, and Cardinal Barbarini, who is the bishop of the city.

This very true news caused the Most Eminent [Cardinals] Corsini and Passani to hold a meeting in the evening with the said Most Eminent Barbarini.”

[Venne staffetta da Velletri coll’aviso, che quel popolo rifiutando ricevere i spagnoli avevano prese le Armi fino al numero di 4mila, e più, e che avendo concitate le terre vicine il numero degl’armati cresceva: che il furore del Popolo era grande, e che aveva sino in alzato lo stendardo Pontificio, e batteva la cassa militare, e cavava terra e faceva Palizzate per guardare il territorio ne luoghii meno sicuri gridando viva il Papa, e il Cardinal Barbarini, che è il vescovo della Città. 

Questa notizia verissima diede motivo agl’Eminentissimi Corsini, e Passani di tenere la sera congresso col predetto Eminentissimo Barbarini.]

Here is another impressive succession of events joined together by the writer:

“Cardinal Acquaviva was in conversation with this Cardinal Corsini, but so far without fruit, with Cardinal Acquaviva insisting that the requested satisfactions be given, without reflecting how much it matters to the interests of Spain and Naples, in the present situation, to maintain the goodwill of the people of the Ecclesiastical State, and the consequences that may come from the unrest against the Spaniards occurring in the lands of the Church near the border with the Kingdom [of Naples].

Not that these meetings have failed to arouse in the plebs a greater, ill-founded suspicion of the Corsini house, which grows because curious observers have seen Don Bartolo also going every evening to a less frequented quarter near Trinità de Monti to hold private conversations with the Spanish officer Viavill, who is perhaps the one who most encourages Cardinal Acquaviva to persist in the commitment already made.

In this state of affairs, it is considered a good idea to introduce intermediaries.

The previously mentioned Cardinal Aldovrandi was believed to be a possible candidate, but having sampled, on a tour of various houses, the bad wind blowing from all sides, he now seems to prefer to stay home and let the fire be extinguished by the same bellows that lit it, whereupon the Most Eminent Corradini was chosen instead, who in the evening had a long conference with the Most Eminent Belluga, who, under the influence of others, is known to have written to Spain against this Court with the most vehemence of all, and although this cardinal had already taken to his bed on account of his ailments, he allowed himself to be persuaded to go without further delay to see Cardinal Acquaviva, with whom he had a half hour of talks, after which he went to give his reply to the said Cardinal Corradini, and although the result has not been divulged, from many clues it is judged to have come to little, the Spaniards demanding that the said leaders of the tumult be hanged without further ado, on the specious argument that the Pope could forgive them the injury done to himself, but not the one done to the two Kings.”

[Il Signor Cardinal Acquaviva fù in abboccamento con questo Cardinal Corsini, ma sin’ora senza frutto, insistendo lo stesso Cardinal Acquaviva, che si diano le richieste soddisfazioni, senza riflettere quanto importi alle convenienze della Spagna e di Napoli, nella presente situazione lo avere benevoli li Popoli dello Stato ecclesiastico, e le conseguenze che possono produrre le commozioni, che contro de Spagnoli si fanno nelle terre della Chiesa verso il confine del Regno. 

Non è per questo che li detti congressi non abbiano eccitato nella plebe una maggior mal fondata sospicione della casa Corsini, la quale si accresce essendovi stati de curiosi, che anno osservato il Signor Don Bartolo anco andare ogni sera in una parte meno pratticata verso la Trinità de Monti a fare solitarij colloquij coll’officiale spagnolo Viavill, che è forse quello che da maggior animo al Cardinal Acquaviva d’essere pertinace nell’impegno già preso.

In questo stato di cose si è stimato bene fraporre de mezzani.  

Il suddetto Signor Cardinal Aldovrandi fu creduto a proposito, ma avendo egli con un giro fatto in varie case sentita la mala qualità del vento, che soffia da tutte le bande, parè che voglia starsene a casa sua, e lasciare, che il fuoco si spenga da quei stessi Mantici, che lo anno acceso, onde fu assunto l’opera efficace dell’Eminentissimo Corradini, il quale la sera ebbe longa conferenza coll’Eminentissimo Belluga, che sedotto dagli altri si sa avere scritto in Spagna contro questa Corte con più veemenza di tutti, e quantunque questo porporato per le sue indisposizioni si fusse già posto in letto, si lasciò persuadere di andare senza altra dilazione a trovare il Signor Cardinal Acquaviva, con cui ebbe mezz’ora di colloqui, dopo il quale fa a dare risposta al detto Cardinal Corradini, restando questa ancora occulta, ma da molti indizi si giudica di poca conclusione, pretendendosi da spagnoli, che i suddetti capi del tumulto siano senza altro appiccati con un’soffistico argomento che il Papa poteva rimettere loro l’ingiuria fatta a se, ma non quella fatta ai due Rè.]

Further analysis of the last paragraph suggests an attempt at literary expressiveness beyond what the delivery of information requires. The extended metaphor of “fire” standing for popular discontent uses the atmospheric and sensory effects of a nearby conflagration to illustrate the severity and rapid development of the event, as well as the diffusion throughout the city of opinions and assorted notions, at times amounting to received conclusions on which to base a course of action. With this reference to a possible connection between newsletter stylistics and historiographical stylistics in the age of Ludovico Antonio Muratori, we close these brief remarks and promise more along these lines in further posts.


Franco Venturi, “Gli anni trenta del Settecento,” in his Settecento riformatore, vol. 1 (Torino: Einaudi, 1969)

Raffaele Ajello, “Carlo di Borbone,” in Dizionario Biografico degli Italiani, vol. 20 (1977)

Girolamo Imbruglia, Naples in the Eighteenth Century: The Birth and Death of a Nation State (Cambridge: Cambridge University Press, 2000)
