Cinergie – Il cinema e le altre arti. N.22 (2022), 173–187
ISSN 2280-9481

Applying Automatic Text Analysis Methodologies to Audiovisual Serial Product

Marta Rocchi – University of Bologna (Italy)

Marta Rocchi is a junior lecturer at the Department of Arts at the University of Bologna. Her research interests concern data-driven approaches for the study of narrative ecosystems and gender inequalities in the audiovisual industry. Her publications include several papers in peer-reviewed journals and a book chapter. She was curator of a special section of SERIES journal on Making Models of Contemporary Serial Media Products.

Submitted: 2021-08-31 – Revised version: 2022-06-30 – Accepted: 2022-06-30 – Published: 2022-12-22

Abstract

The goal of this paper is to explore the application of automatic text analysis methodologies to contemporary audiovisual serial narratives. As a case study we use the Apple TV+ series Servant (Apple TV+, 2019-). We first focused on the primary text (the English dialogue used in the TV series) to examine the role of the dialogue and the characters’ interactions through an exploratory application of Social Network Analysis. In particular, tagging the dialogue in XML format allowed the identification and quantification of scenes, characters and speaker-receiver pairs, which were then used to build the character network. Secondly, we collected tweets as the secondary text (the text produced by the Twitter audience), and analysed users’ behaviours and preferences from both semantic and quantitative points of view. We underlined how analyses conducted on tweet sentiment can help to monitor this social engagement mechanism and how it may evolve over time. The paper is highly experimental in that, in addition to findings related to the narrative structure of the serial product (thanks to the primary text) and analysis of the relationship with the audience over time (thanks to the secondary text), it aims to test a shared analytical framework that can enable large-scale comparative investigations of contemporary TV series.

Keywords: TV series; Text analysis; Twitter; Sentiment analysis; Servant.

This paper provides an exploratory analysis aimed at testing the use of automatic text analysis methodologies, which are still underrepresented in the field of media studies in the Italian context. Therefore, the nature of this work is highly experimental. In particular, we are interested in understanding whether and to what extent automated tools can provide meaningful guidance in the field of media studies and thus can be implemented more broadly. Because we are particularly interested in the market entry of a new platform from both the production and reception perspectives, we considered as a case study the Apple TV+ series Servant (Apple TV+, 2019-). We applied automatic text analysis tools that considered textual aspects related to two different levels of the audiovisual product: one associated with the primary text (the dialogue of the TV series) and the other associated with the text produced by the audience, specifically by Twitter users (tweets as secondary texts).

The paper is divided into two sections which consider two different textual aspects and methods. The first section considers the audiovisual text. We analysed the primary text, focusing on the English dialogue of the first two seasons of Servant. The series lends itself to this type of analysis because it has a very limited number of main characters, who interact primarily through dialogue, in a context (a house) that remains constant. This makes coding easier and makes the dialogue meaningful even while isolated from the visual aspect. The psychological evolution of the characters and their interaction occurs mainly through dialogue, precisely because of the ‘theatrical’ structure of the series, and this, unlike in other series, makes the visual aspect of the show less relevant. Therefore, the main purpose of the first section is to examine the role of dialogue and characterization in terms of narrative development. In addition, identification of the characters (‘speakers’ and ‘receivers’ of any speech act) will also allow us to implement an exploratory application of Social Network Analysis (SNA). Several articles have already highlighted the benefits of SNA for investigating the plot content of fictional works, from literary drama to audiovisual products (Moretti 2011; Agarwal et al. 2012; Weng et al. 2007, 2009; Ercolessi et al. 2012).

The second section of the paper deals with the analysis of the secondary text produced by Servant’s Twitter community (the Twitter users who share a common interest in this TV series). The main objective of this section is to analyse tweets related to the TV series from two different points of view: a semantic one, through the methodology of sentiment analysis (one of the most important research areas on Twitter, Antonakaki et al. 2021: 5), and a quantitative one, through a survey of user activity metrics. Social media analysis is becoming an important tool to monitor users’ behaviours and preferences and to grasp useful insights for both production and marketing. Considering the production side, even if social media audiences do not equate to the full spectrum of TV viewers, social platforms – particularly Twitter – have become a useful tool in their ability to encourage public discussions and monitor audience engagement (Wakamiya et al. 2011; Napoli 2013). The Twitter platform is particularly important (Antonakaki et al. 2021: 1) in that it allows for dynamic, real-time engagement and interaction with viewers, and it thus plays a key role in fan engagement. In relation to marketing, Molteni and Ponce De Leon (2016: 228) point out how “Twitter TV activity and reach data can help networks and agencies to make superior, data-driven advertising and program marketing decisions”. The statistical analysis in both sections was performed using R Statistical Software (version 4.0.3).

1 Apple TV+ and Servant: a Case Study

The Apple Inc. video-on-demand service, Apple TV+, was launched on 1 November 2019 with new series including For All Mankind, The Morning Show and See (Apple TV+, 2019-). Debuting a little later, on 28 November, Servant has among its executive producers M. Night Shyamalan, an Indian-born, naturalized American director “who fascinates and frustrates in equal measure” (Crewe 2019). He is known for audiovisual products (e.g., The Sixth Sense 1999; Unbreakable 2000; Split 2016; Glass 2019) characterized by particular aesthetic and content choices, and Servant represented his debut in the serial format. If we also consider the high budget1 of See, we can identify two points that underscore Apple TV+’s need and determination to produce TV series that stand out within the current abundance of TV serial dramas: a high production budget and a high authorial level.

Servant tells the story of Dorothy and Sean Turner, a seemingly quiet couple from Philadelphia. A reborn doll is recommended to the Turner family as a form of transitional therapy to soothe Dorothy’s pain following the grief of losing her son Jericho. In the first episode, they hire a young nanny, Leanne, to take care of the reborn doll. What was supposed to be a temporary therapy turns into a new, disturbing reality within which we see, through a path of psychological insight, the different ways in which the protagonists face pain. Servant is a sort of Kammerspiel with very few characters, shot almost entirely within the elegant Turner home, where Dorothy, excited about her impending return to work (as a television presenter), warmly welcomes nanny Leanne. The latter is a disturbing, almost ghostly character, and is characterized by a deep religiousness2 that, together with the fact that she takes care of the doll both in Dorothy’s presence and absence, contributes to Sean’s anxieties, which result in investigations when the first wail of the reborn Jericho is heard. At this point, Sean, who initially seemed to be the only one to maintain a rational side, becomes alarmed and engages his brother-in-law Julian and a private investigator to get more information about the nanny and the possible origin of the baby.

2 Audiovisual Text Analysis

Biber (1989) distinguished between ‘external’ and ‘internal’ criteria for classifying texts to construct a corpus for linguistic analysis. External criteria are essentially non-linguistic, such as purpose, audience and activity type, while internal criteria are defined linguistically. Once the text is captured and subjected to analysis there will be a range of linguistic features that can contribute to its characterisation in terms of internal evidence, such as the distribution of words, and the lexical or grammatical features throughout the corpus. The analysis of the audiovisual text that we conducted on Servant aims to consider linguistic choices (internal criteria). However, it is not intended to be a complete linguistic examination, only an exploratory analysis to understand the potential of these tools for future investigations.

To evaluate the characterization of the dialogue in Servant we encoded an XML file with information about scenes, speech, speakers and receivers for all episodes of the first two seasons. We did this because “computation provides access to information in texts that we simply cannot gather using our traditionally qualitative methods of close reading and human synthesis” (Jockers and Thalken 2016: ix). Indeed, in the XML file we divided the text (the English dialogue) into scenes using ‘SCENE’ tags and used the ‘SPEAKER’ and ‘RECEIVER’ tags to mark who was talking (‘SPEAKER’) and who was being talked to (‘RECEIVER’). Each exchange between characters is encoded inside a ‘SPEECH’ tag (Figure 1). Because each speech act was encoded in this way, it is easy to see who talks to whom most often, and because it is possible to count and aggregate information about words, the content of the speeches can also be analysed.

Fig. 1. Example of tagging from the XML file.
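As an illustration of this step, the following minimal R sketch (using the xml2 and dplyr packages) shows how speaker-receiver pairs and scene counts can be extracted from a file tagged in this way; the file name and the exact nesting of the SPEAKER and RECEIVER elements inside each SPEECH are assumptions made for the sake of the example, not the authors’ actual encoding.

```r
# Minimal sketch: extracting speaker-receiver pairs and scene counts
# from the tagged dialogue (hypothetical file name and element nesting).
library(xml2)
library(dplyr)

doc <- read_xml("servant_season1.xml")

# One row per speech act, with the speaker and the receiver of each exchange
speeches <- xml_find_all(doc, "//SPEECH")
pairs <- data.frame(
  speaker  = xml_text(xml_find_first(speeches, "./SPEAKER")),
  receiver = xml_text(xml_find_first(speeches, "./RECEIVER"))
)

# Raw counts per character and relative frequencies of speech acts (cf. Table 1)
pairs %>% count(speaker, sort = TRUE) %>% mutate(rel_freq = 100 * n / sum(n))

# Weighted speaker-receiver edge list (cf. Figure 2), reused below for the SNA
edge_list <- pairs %>% count(speaker, receiver, name = "weight", sort = TRUE)

# Scenes with and without speech
scenes <- xml_find_all(doc, "//SCENE")
n_speeches <- xml_find_num(scenes, "count(./SPEECH)")
c(with_speech = sum(n_speeches > 0), without_speech = sum(n_speeches == 0))
```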

We used the XML file and R Statistical Software to identify some main characteristics of the audiovisual text. We found that there were 45 speakers in the first season of Servant and 37 in the second. The characters speak a total of 1,829 and 1,825 times, respectively. The number of scenes with speech totals 259 for the first season and 272 for the second, while the number of scenes without speech equals 169 and 122, respectively. It would be interesting to amplify the corpus and compare these data with those of other audiovisual products. For example, if we consider the number of times that the characters speak in See, we can already highlight a worthwhile result. The number is very similar to Servant (in See, 58 characters speak a total of 1,878 times in 262 scenes), although the two series have very different formats: Servant has 30-minute episodes, while See’s episodes run to about 60 minutes each. In other words, although See’s episodes are twice as long as Servant’s, the number of times the characters speak is very similar.

Since it might be interesting to compare the two seasons of Servant in terms of the dominance of one character versus many others, we explored the relative frequencies by dividing each character’s raw count (i.e., the number of times each character spoke) by the total number of speech acts (Table 1).

Tab. 1. Who speaks most? Relative frequencies of the top seven characters in Servant.
Season 1: Dorothy 27.39%, Sean 24.16%, Julian 14.27%, Leanne 14.05%, Natalie 2.84%, Uncle George 2.78%, Tobe 2.67%
Season 2: Sean 25.91%, Dorothy 24.98%, Julian 15.39%, Leanne 9.31%, Uncle George 3.50%, Tobe 3.17%, Officer Reyes 2.95%
Overall: Dorothy 26.19%, Sean 25.04%, Julian 14.83%, Leanne 11.68%, Uncle George 3.14%, Tobe 2.92%, Natalie 2.51%

Small differences between the two seasons emerge in Table 1. The overall figures (across both seasons) reveal that Dorothy accounts for 26.19% of the speech acts in the series, followed by Sean (25.04%) and Julian (14.83%). Which character talks most often is one point of interest, but we might also ask to whom that character talks, and how often. For this we need to know not just who the speaker is, but also who is the receiver of the specific speech act. We put all the speeches into a node set list object and reviewed the list, extracting the speaker-receiver pairs from each item (Figure 2). The strong concentration of the speeches among a few characters confirms the atypical, Kammerspiel-like character of the TV series, compared with contemporary serial dramas that are more often organized chorally or around a main character and a group of supporting characters. Considering Servant and See, one might speculate that Apple TV+ wanted to characterize its debut with some strongly unconventional products.

Fig. 2. Speaker-receiver pairs extracted from the XML file of the first and second seasons.

Using the speaker-receiver pairs’ data (Figure 2), we performed a Social Network Analysis (SNA) that considers the speakers as source nodes and the receivers as target nodes. We performed visualization and basic SNA (calculating three indices: betweenness centrality, degree centrality, and modularity) in Gephi (Figure 3). The weighted link between two nodes (source and target) is given by the number of times a dialogue takes place between them (Figure 2). Considering the overall data extracted from the two seasons, we have 90 nodes and 244 links. Figure 3 shows the social network of Servant. The four nodes with the highest betweenness centrality, and therefore the greatest ability to control the flow of information in the network, are Dorothy, Sean, Leanne and ‘TV’. While we are not surprised by the centrality of Dorothy, Sean and Leanne, it is interesting to note that the television represents a crucial point in conveying the flow of information. In the various phases of the story it is often used to give updates and to advance the development of the narrative. The four nodes with the highest degree centrality (an index that identifies well-connected nodes that can directly reach many nodes in the network) are Dorothy, Sean, Leanne and Julian (Figure 3). We also evaluated modularity, which specifies the strength of the division of the network and helps us to understand the structural connectivity among individual characters. Eight communities were identified, shown in different colours in Figure 3: the largest community (34.44% of nodes, in pink) is led by two female characters (Dorothy and Leanne), while the second largest (30% of nodes, in green) predominantly gathers television-mediated narratives. Finally, the third most significant community (23.33% of nodes, in light blue) is led by the main male characters (Sean and Julian). Servant’s anomalous structure is further confirmed by SNA, since the narrative revolves around two very small groups of characters with different narrative perspectives, marked by a gender difference. The Kammerspiel, theatrical aspect is evident in the strong compactness of the network and in the fact that ‘TV’ is the medium through which a situation completely closed within the domestic space opens up and draws information from the outside world. The resulting structure is an extremely closed interaction between two main micro-groups and a connection to the outside world through the ‘TV’ node. Thus, the comparative use of SNA could provide many insights into the dramatic structure of a product.

Fig. 3. The social network of Servant. Nodes represent the characters of the TV series, while the links between them identify the number of times each pair of nodes (characters) has a dialogue. The colour of a node indicates the community to which it belongs based on the modularity index. The size of a node, like the size of its label, corresponds to its degree value.
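Although the analysis above was carried out in Gephi, the same indices can be approximated in R with the igraph package. The sketch below assumes the weighted edge_list object built in the previous sketch, and it approximates Gephi’s modularity classes with the Louvain algorithm on an undirected projection of the graph; it is an illustrative equivalent, not the workflow actually used for Figure 3.

```r
# Rough igraph equivalent of the Gephi workflow, using the edge_list above
library(igraph)

g <- graph_from_data_frame(edge_list, directed = TRUE)

# Betweenness treats weights as distances, so invert the interaction counts
# (characters who talk to each other often are "closer")
sort(betweenness(g, weights = 1 / E(g)$weight), decreasing = TRUE)[1:4]

# Degree centrality: well-connected nodes that reach many others directly
sort(degree(g, mode = "all"), decreasing = TRUE)[1:4]

# Community detection comparable to Gephi's modularity classes (Louvain),
# computed on an undirected projection with summed edge weights
g_und <- as.undirected(g, mode = "collapse", edge.attr.comb = "sum")
communities <- cluster_louvain(g_und)
membership(communities)
modularity(communities)
```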

3 Reception Text Analysis

This section deals with the analysis of the secondary text (audiences’ tweets) produced by Servant’s Twitter community. Twitter and online social networks more broadly provide the opportunity to generate and collect a huge amount of structured and unstructured data that can be used to extract useful information in many areas3 (Salampasis et al. 2011; Jin et al. 2013; Varol et al. 2014; Bruns et al. 2016; Mohapatra et al. 2019; Pano and Kashef 2020), including social science research (Watts 2007; Manovich 2018; Salganik 2019). The global Twitter platform is widely recognized as a particularly important one for public communication4 (Kwak et al. 2010; Osborne and Dredze 2014; Barberá et al. 2015; Bruns and Weller 2016; Jackson and Foucault Welles 2016). Many studies have dealt with the analysis of politics on Twitter (for an updated review see Antonakaki et al. 2021). Recently, for example, Van Vliet and colleagues (2020) introduced the Twitter Parliamentarian Database to conduct a systematic and rigorous comparative and transnational analysis to understand how parliamentarians engage in politics on the social media platform. Our aim here is not to provide a comprehensive review of the major research themes and strategies for data analysis on Twitter (Antonakaki and colleagues (2021) have recently done excellent work in this direction). However, we would like to outline some of the Twitter research activities related to media studies.

Considering media studies, the collection and analysis of Twitter data related to users’ social activity allows, for example, the examination of the reception of audiovisual products by the audience (Giglietto 2013; Scharl et al. 2016; Hecking et al. 2017; Crespo-Pereira and Juanatey-Boga 2017; Williams and Gonlin 2017; Antelmi 2018; López et al. 2018; Pugsee et al. 2021), of the relationship between fan activists and producers (Guerrero-Pico 2017), of the role of Twitter use in media events (Clavio et al. 2013) and, more generally, of social TV (Buschow et al. 2014); it provides insight into the niche of celebrity studies (Usher 2015; Özdemir Çakır 2017); it allows the study of the marketing strategies of over-the-top platforms (Fernández Gómez and Martín Quevedo 2018) and linear broadcasters (Navarro et al. 2021); and it supports user profiling analyses and TV audience forecasting (Wakamiya et al. 2011; Hsieh 2013; Nielsen Media Research 2015; Molteni and Ponce De Leon 2016; Crisci et al. 2018). Despite the excitement about the affordances of Twitter,5 only a few studies have begun to address the formidable challenge of systematically collecting valid and representative data (Volkens et al. 2013; Tufekci 2014; Cihon and Yasseri 2016; Jungherr and Theocharis 2017; Antelmi et al. 2018).

Antelmi and colleagues (2018) propose a framework to characterise Twitter communities that is essentially composed of two parts: a semantic part that allows for an investigation of the content produced by a given community, developed on three levels (topic modelling, sentiment analysis, and cognitive analysis), and a quantitative part that provides insights into the behaviour and interaction patterns of users and is based on the identification of three metrics (activity, visibility, and metadata). In this work we focused on two aspects: from the semantic point of view, we considered sentiment analysis and from the quantitative point of view, user activity. We first consider the way we collected the data.

Twitter has made it increasingly difficult for researchers to gather large-scale datasets on user activity through the platform’s standard Application Programming Interface (API; Crisci et al. 2018: 12206; Antonakaki et al. 2021), pointing inquiries instead to its commercial data reseller, whose pricing is likely to be unaffordable for ordinary research projects (Burgess and Bruns 2015). Twitter has, however, recently introduced a product track for academic research.6 Since it was not yet available when we carried out this research, we used a script in the Python programming language that interacted with the Twitter Streaming API to collect tweets about the selected TV series; the API returns data in JSON format. We collected data from the opening of the official Servant Twitter account, thereby obtaining a representative set of Servant tweets from 2019-07-01 to 2021-06-29. We collected tweets containing the mention “@Servant”7 and tweets containing specific keywords. The selected keywords were the official hashtag of the TV series, as well as some possible hashtag derivatives. We gathered tweets containing the following pairs of keywords: “#Servant” and “AppleTV”, “#Servant” and “episode(s)”, “#Servant” and “series”, “#Servant” and “season(s)”, “#Servant” and “Philadelphia”, “#Servant” and “MNightShyamalan”, “#Servant” and “Julian”, and “#Servant” and “watching”. In total, we collected 18,115 tweets from 9,414 different users over this period. We stored the data in MongoDB, a NoSQL database, selected only the 15,341 English-language tweets, and called this selection “ServantDB”.
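The collection itself was performed with a Python script, but to keep the code examples in a single language the sketch below illustrates an equivalent query in R with the rtweet package. The query strings, the n parameter and the column names (status_id, lang, as in rtweet versions prior to 1.0) are illustrative assumptions, not the authors’ actual collection code.

```r
# Illustrative R alternative to the authors' Python collection script
library(rtweet)   # assumes a Twitter API token has already been configured
library(dplyr)

# Keyword pairs used to retrieve Servant-related tweets
queries <- c("#Servant AppleTV", "#Servant episode", "#Servant series",
             "#Servant season", "#Servant Philadelphia",
             "#Servant MNightShyamalan", "#Servant Julian",
             "#Servant watching", "@Servant")

raw <- bind_rows(lapply(queries, function(q)
  search_tweets(q, n = 5000, retryonratelimit = TRUE)))

# De-duplicate and keep English-language tweets only ("ServantDB")
servant_db <- raw %>%
  distinct(status_id, .keep_all = TRUE) %>%   # column names as in rtweet < 1.0
  filter(lang == "en")
```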

Our first goal was to analyse user activity. Assessing the number of daily activities is useful for identifying spikes in the interaction pattern and their potential causes, and for detecting and evaluating possible information propagation strategies. Figure 4a shows an increase in Twitter usage over the years. Usage is concentrated in the period from November to March (Figure 4b), as we expected, since this period corresponds with the releases of the two seasons (see Table 2), and particularly on Fridays and Saturdays (Figure 4c). Figure 4d shows engagement in the social discussion in terms of the number of retweets.

Fig. 4. Distribution of ServantDB’s tweets across a) years; b) months; c) days; d) the number of retweets.
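A minimal sketch of the aggregation behind Figure 4, assuming a data frame servant_db with one row per tweet, a POSIXct created_at column and a retweet_count column (column names are assumptions carried over from the previous sketch):

```r
library(dplyr)
library(lubridate)

# Tweets per year, per month and per day of the week (cf. Figure 4a-c)
activity <- servant_db %>%
  mutate(year  = year(created_at),
         month = month(created_at, label = TRUE),
         wday  = wday(created_at, label = TRUE, week_start = 1))

activity %>% count(year)
activity %>% count(month)
activity %>% count(wday)

# Distribution of retweet counts (cf. Figure 4d)
activity %>% count(retweet_count)
```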

Our second goal was to analyse the tweets’ text from a semantic point of view through sentiment analysis.8 Antonakaki and colleagues (2021: 11) recently summarized two main research areas in sentiment analysis: “the first is to incorporate knowledge from external resources and the second is to measure the public opinion towards specific entities like persons, events and products”. This technique refers to a family of tools that are useful for detecting the semantic orientation of individual opinions and comments expressed in written texts. According to Pang and Lee (2008), sentiment analysis is a discipline at the crossroads of statistics, natural language processing, and computational linguistics with several practical applications in numerous domains (Liu 2012; Medhat et al. 2014). Its main goal is to classify texts written in natural language, considering their semantic polarity and distinguishing positive and negative forms through machine learning-based and lexicon-based approaches. In this paper we chose a lexicon-based approach.

Before implementing sentiment extraction, we performed classic pre-processing steps (tokenization, expansion of abbreviations, removal of stop words and of other elements without lexical value, like URLs and mentions; see Pano and Kashef 2020). In Figure 5, we show the sentiment evaluation of the ServantDB tweets using the syuzhet package (Jockers 2017) and the get_sentiment function, which assesses the polarity of each word or sentence. This function takes two arguments: a character vector (of sentences or words) and a ‘method’. There are four available sentiment extraction methods (‘syuzhet’, ‘bing’, ‘afinn’, ‘nrc’), all based on unigrams (single words). We selected the nrc lexicon,9 which was developed by Mohammad and Turney (2010).
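A minimal sketch of this step, assuming the servant_db data frame with a text column; the cleaning regular expressions are illustrative, and get_nrc_sentiment is shown only to illustrate the emotion categories mentioned in note 9:

```r
library(syuzhet)

# Light pre-processing: strip URLs and mentions before scoring
clean <- gsub("https?://\\S+|@\\w+", "", servant_db$text)
clean <- tolower(trimws(clean))

# Document-level polarity: one score per tweet, using the NRC lexicon
servant_db$sentiment <- get_sentiment(clean, method = "nrc")

# The NRC lexicon also returns the eight emotion categories listed in note 9
emotions <- get_nrc_sentiment(clean)
colSums(emotions)
```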

To grasp possible differences in audience reception, we divided the ServantDB tweets into five groups: before, during and after the release of each of the two seasons (Table 2). It turns out that during the release of the second season, Servant’s impact increased in terms of social discussion on Twitter, measured by the number of tweets. Within this context we are guided by two main questions: (i) are there any differences in Twitter user activity across the five groups? and (ii) what are the sentiment scores, and have they changed?

Tab. 2. The five groups of tweets identified for ServantDB from 2019-07-01 to 2021-06-29.
Group            from          to            number of tweets
Pre-release      2019-07-01    2019-11-27    2,046
Season 1         2019-11-28    2020-01-17    3,672
After Season 1   2020-01-17    2021-01-15    3,731
Season 2         2021-01-15    2021-03-19    4,592
After Season 2   2021-03-19    2021-06-29    1,300
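A minimal sketch of how each tweet can be assigned to one of the five groups, using the boundary dates of Table 2; the exact cut-offs at the period boundaries are an illustrative choice, not necessarily the rule applied by the authors:

```r
# Assign each tweet to one of the five periods of Table 2
breaks <- as.Date(c("2019-07-01", "2019-11-28", "2020-01-18",
                    "2021-01-15", "2021-03-20", "2021-06-30"))
labels <- c("Pre-release", "Season 1", "After Season 1",
            "Season 2", "After Season 2")

servant_db$period <- cut(as.Date(servant_db$created_at),
                         breaks = breaks, labels = labels, right = FALSE)

table(servant_db$period)   # number of tweets per group (cf. Table 2)
```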

Considering daily user activity, Figure 5 shows that during the release of the two seasons Twitter user activity was not only greater in numerical terms than during the other phases, but also concentrated in the final days of the week. In the other phases, by contrast, user activity was lower and more evenly distributed throughout the week. If we break down the survey period and consider the sentiment scores, a positive sentiment seems to prevail in all groups except during the release of the first season. However, we must reflect on some limits of sentiment analysis. For example, it is not clear whether a negative evaluation refers to the narrative and/or the characters (one tweet reads: “Rare is the day I see someone who’s so disgusting that I think they’ll be taking their intergalactic personal grossness with them into the afterlife, but Uncle George off #servant fits the greasy-grimy stinklines bill to a T”) or to the series itself (“For those who don’t already know, #Servant is a horrible show where nothing happens. What a travesty”). In addition, the series is often described as creepy in a positive way (author Stephen King tweeted: “SERVANT, on Apple+: Extremely creepy and totally involving. Two episodes and I’m hooked”; another user wrote: “A creepy nanny is hired to care for a creepy doll in the full trailer for M. Night Shyamalan's Apple TV+ series”). In general, Servant seems to demonstrate good persistence, given the high number of tweets between the first and second seasons.

Fig. 5. Each of the five groups is associated with two images. The first row shows the volume of Twitter user activity over the week in each period. The second row shows the graphs with sentiment scores based on the tweet content.

4 Sentiment Trajectory

Important insights into what is happening in the real world, and into what people think about a given event, can emerge from analyses conducted on the sentiment of tweets. Bollen and colleagues (2011) found that events in the social, political, cultural, and economic spheres have a significant, immediate, and highly specific effect on various dimensions of public mood, suggesting that large-scale mood analyses can provide a robust platform for modelling collective emotional tendencies in terms of their predictive value relative to existing social and economic indicators. Considering the progression of Servant’s social discourse on Twitter over time (from 2019-07-01 to 2021-06-29) and applying sentiment analysis to each individual tweet, we focus on sentiment analysis at the document level: each tweet is treated as a single document and we determine its sentiment score (polarity) by identifying its semantic orientation. If the average semantic orientation of the tweet is above a predefined threshold the document is classified as positive; otherwise it is judged to be negative. We visualize10 the variation of public sentiment in Figure 6, which shows the emotional arc related to the progression of tweets over time using three different superimposed smoothing techniques to extract a meaningful underlying signal from noisy data: LOESS (LOcal regrESSion), Rolling Mean and DCT (Discrete Cosine Transform).11 After an increase in positive sentiment scores in the pre-release phase, there was a decrease in the level of liking during the first season; in the period between the first and second seasons there was an initial recovery followed by a further decline that continued into the initial release phase of the second season, after which sentiment grew steadily and remained positive. The strong negative sentiment that accompanied the first season was probably related to narrative dissatisfaction with a story whose end point was difficult to guess and which frustrated viewer expectations – a dissatisfaction that was overcome by the narrative developments of the second season.

Fig. 6. Plot of the sentiment trajectory of the ServantDB.
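A minimal sketch of the three smoothing techniques, assuming the chronologically ordered per-tweet scores computed earlier; the window size, LOESS span and low-pass size are illustrative choices, not the parameters used for Figure 6:

```r
library(syuzhet)
library(zoo)

# Per-tweet NRC scores in chronological order
s   <- servant_db$sentiment[order(servant_db$created_at)]
idx <- seq_along(s)

# Rolling mean over an illustrative window of 100 tweets
roll <- rollmean(s, k = 100, fill = NA)

# LOESS regression of the scores against tweet order
lo <- predict(loess(s ~ idx, span = 0.3))

# Low-pass filtered Discrete Cosine Transform, as implemented in syuzhet
dct <- get_dct_transform(s, low_pass_size = 5, x_reverse_len = length(s))

plot(dct, type = "l", xlab = "Tweets over time", ylab = "Scaled sentiment")
```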

5 Conclusions

The paper proposed an exploratory analysis to advance and guide the application of automatic tools for the study of audiovisual serial texts. In Servant, the narrative develops at a calm, apparently reassuring rhythm, characterized by a peculiar use of dialogue (including a high number of scenes without dialogue) and of images. In this context, the viewer is drawn into the story by a sense of insecurity that sometimes translates into pure discomfort. The series reveals the peculiar traits of Shyamalan’s work. The analyses of the English dialogue (the primary text) of the first two seasons of Servant allowed us to highlight the potential of the methods used in two ways. Firstly, the XML tagging process allowed for the identification and quantification of scenes, characters, and speaker-receiver pairs, and potentially opens the way for in-depth linguistic analysis – which was not explored in this study. These aspects would be particularly interesting in a comparative analysis of audiovisual products belonging to the same genre (such as medical, legal or teen dramas). Secondly, the speaker-receiver pairs enabled the building of a social network. Social Network Analysis also suggested an application concerning the identification of gender bias in the interaction between characters, in terms of dialogue frequencies. This tool will potentially allow us to see how relationships among characters within audiovisual narratives are articulated and how they evolve. In this way, if characters are organized in clusters by gender (that is, characters tend to interact more with other characters of the same gender), we can determine whether this organization evolves during the narrative. In our case study, the modularity index showed a clear gender partition between the leading characters. Considering two of the eight communities detected (Figure 3), we identified the cluster defined by the main female characters (Dorothy and Leanne) and that defined by the main male characters (Sean and Julian). Alongside these two clusters, the green cluster played a crucial role, related to the ability of television to mediate and advance the narrative. It would be interesting to carry out these analyses from a comparative point of view with serial products that have different formats and larger casts, to fully exploit the potential of SNA.

In addition, the paper proposed an analysis of the secondary text (tweets) produced through Twitter. Indeed, “by downloading huge numbers of tweets and using appropriate natural language and sentiment analysis techniques, it is possible to get an idea of the general mood about a specific topic of interest, in a given place and time” (Molteni and Ponce De Leon 2016: 221). Twitter research provides a starting point for studying the communication and propagation of discourses related to TV series and their social engagement mechanisms. In particular, the quantitative analysis of user behaviours, in terms of interaction patterns and typology of content posted, clearly illustrates the levels of activity among Servant’s Twitter community. Monitoring the tweets allowed us to determine how users’ interactions evolved over the period under investigation. Considering the audience and the secondary text it produces, more work needs to be done to understand the distinction between series-level and story-level engagement. The semantic evaluation of the trajectory of the sentiment score used the release periods of the two seasons as its timeframe. We conducted an exploratory investigation of Servant and proposed a simplified framework for the analysis of an audiovisual serial’s Twitter communities that might be useful for future comparative investigations across a range of different case studies. In future work, we aim to develop a cross-media analysis that combines the internal organization of content in audiovisual products with their reception across large datasets drawn from social platforms.

References

Agarwal, Apoorv, Augusto Corvalan, Jacob Jensen and Owen Rambow (2012). “Social network analysis of Alice in Wonderland.” In Proceedings of the NAACL-HLT, workshop on computational linguistics for literature, 88-96.

Antelmi, Alessia, Delfina Malandrino and Vittorio Scarano (2019, May). “Characterizing the behavioral evolution of Twitter users and the truth behind the 90-9-1 rule.” In Companion Proceedings of The 2019 World Wide Web Conference, 1035-1038. https://doi.org/10.1145/3308560.3316705.

Antelmi, Alessia, John Breslin and Karen Young (2018, August). “Understanding user engagement with entertainment media: a case study of the twitter behaviour of Game of Thrones (GoT) fans.” In 2018 IEEE Games, Entertainment, Media Conference (GEM), 1-9. IEEE, 2018. https://doi.org/10.1109/GEM.2018.8516505.

Antonakaki, Despoina, Paraskevi Fragopoulou and Sotiris Ioannidis (2021). “A survey of Twitter research: Data model, graph structure, sentiment analysis and attacks.” Expert Systems with Applications 164: 114006. https://doi.org/10.1016/j.eswa.2020.114006.

Barberá, Pablo, Ning Wang, Richard Bonneau, John T. Jost, Jonathan Nagler, Joshua Tucker and Sandra González-Bailón (2015). “The critical periphery in the growth of social protests.” PLoS one 10(11): 1-15. https://doi.org/10.1371/journal.pone.0143611.

Bernardelli, Andrea (2021). “L’immaginario religioso nelle serie tv. Generi e generazioni.” Ocula 22(25): 197-212. https://doi.org/10.12977/ocula2021-14.

Biber, Douglas (1989). “A typology of English texts.” Linguistics 27: 3-43.

Bollen, Johan, Huina Mao and Alberto Pepe (2011). “Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena.” In Proceedings of the international AAAI conference on web and social media.

Bruns, Axel, Brenda Moon, Felix Münch and Troy Sadkowsky (2017). “The Australian Twittersphere in 2016: Mapping the follower/followee network.” Social Media + Society 3(4): 1-15. https://doi.org/10.1177/2056305117748162.

Bruns, Axel and Katrin Weller (2016). “Twitter as a first draft of the present: and the challenges of preserving it for the future.” In Proceedings of the 8th ACM Conference on Web Science, edited by Nejdl, W. et al., 183-189. Hannover, Germany: ACM Press. https://doi.org/10.1145/2908131.2908174.

Burgess, Jean and Axel Bruns (2015). “Easy data, hard data: The politics and pragmatics of Twitter research after the computational turn.” In Compromised Data: From Social Media to Big Data, edited by Langlois, G. et al., 93-111. New York, NY: Bloomsbury Academic.

Buschow, Christopher, Beate Schneider and Simon Ueberheide (2014). “Tweeting television: Exploring communication activities on Twitter while watching TV.” Communications 39(2): 129-149. https://doi.org/10.1515/commun-2014-0009.

Cihon, Peter and Taha Yasseri (2016). “A biased review of biases in Twitter studies on political collective action.” Frontiers in Physics 4: 1-34. https://doi.org/10.3389/fphy.2016.00034.

Clavio, Galen, Patrick Walsh and Ryan Vooris (2013). “The utilization of Twitter by drivers in a major racing series.” International Journal of Motorsport Management 2(1).

Crespo-Pereira, Verónica and Óscar Juanatey-Boga (2017). “Spanish TV Series on Twitter: What Social Media Audiences Say.” In Media and Metamedia Management, edited by Campos Freire, F. et al., 435-440. Cham: Springer.

Crewe, Dave (2019). “M Night Shyamalan.” Screen Education (95): 86-99. https://search.informit.org/doi/10.3316/informit.927592279526637 (last accessed 30-08-21).

Crisci, Alfonso, Valentina Grasso, Paolo Nesi, Gianni Pantaleo, Irene Paoli, and Imad Zaza (2018). “Predicting TV programme audience by using twitter based metrics.” Multimedia Tools and Applications 77(10): 12203-12232. https://doi.org/10.1007/s11042-017-4880-x.

Elkins, Katherine and Jon Chun (2019). “Can Sentiment Analysis Reveal Structure in a Plotless Novel?” arXiv preprint arXiv:1910.01441

Ercolessi, Philippe, Christine Sénac and Hervé Bredin (2012). “Toward plot de-interlacing in tv series using scenes clustering.” In 2012 10th international workshop on content-based multimedia indexing (CBMI), 1-6.

Fernández Gómez, Erika and Juan Martín Quevedo (2018). “Connecting with audiences in new markets: Netflix’s Twitter strategy in Spain.” Journal of media business studies 15(2): 127-146. https://doi.org/10.1080/16522354.2018.1481711.

Giachanou, Anastasia and Fabio Crestani (2016). “Like it or not: A survey of twitter sentiment analysis methods.” ACM Computing Surveys (CSUR) 49(2): 1-41. https://doi.org/10.1145/2938640.

Giglietto, Fabio (2013). “Exploring correlations between TV viewership and twitter conversations in Italian political talk shows.” http://dx.doi.org/10.2139/ssrn.2306512.

Guerrero-Pico, Mar (2017). “#Fringe, audiences and fan labor: Twitter activism to save a TV show from cancellation.” International Journal of Communication 11: 2071–2092.

Hecking, Tobias, Vania Dimitrova, Antonija Mitrovic and H. Ulrich Hoppe (2017). “Using network-text analysis to characterise learner engagement in active video watching.” In Proceedings of the 25th International Conference on Computers in Education, edited by Chen, W. et al., 326-335.

Hsieh, Wen-Tai, Seng-cho T. Chou, Yu-Hsuan Cheng and Chen-Ming Wu (2013, October). “Predicting tv audience rating with social media.” Workshop on Natural Language Processing for Social Media (SocialNLP), 1-15.

Jackson, Sarah J. and Brooke Foucault Welles (2016). “#Ferguson is everywhere: Initiators in emerging counterpublic networks.” Information, Communication & Society 19(3): 397-418. https://doi.org/10.1080/1369118X.2015.1106571.

Jin, Long, Yang Chen, Tianyi Wang, Pan Hui and Athanasios V. Vasilakos (2013). “Understanding user behavior in online social networks: A survey.” IEEE Communications Magazine 51(9): 144-150.

Jockers, Matthew (2017). “Syuzhet: Extracts Sentiment and Sentiment-Derived Plot Arcs from Text (Version 1.0. 1).” https://cran.r-project.org/web/packages/syuzhet/index.html (last accessed 30-08-21).

Jockers, Matthew Lee and Rosamond Thalken (2014). Text analysis with R for students of literature. New York: Springer.

Jungherr, Andreas and Yannis Theocharis (2017). “The empiricist’s challenge: Asking meaningful questions in political science in the age of big data.” Journal of Information Technology & Politics 14: 97-109. https://doi.org/10.1080/19331681.2017.1312187.

Kucher, Kostiantyn, Carita Paradis and Andreas Kerren (2018, February). “The state of the art in sentiment visualization.” Computer Graphics Forum 37(1): 71-96. https://doi.org/10.1111/cgf.13217.

Kwak, Haewoon, Changhyun Lee, Hosung Park and Sue Moon (2010). “What is Twitter, a social network or a news media?” In Proceedings of the 19th International Conference on World Wide Web, 591-600. https://doi.org/10.1145/1772690.1772751.

Liu, Bing (2012). “Sentiment analysis and opinion mining.” Synthesis lectures on human language technologies 5(1): 1-167. https://doi.org/10.2200/S00416ED1V01Y201204HLT016.

López, Soledad Ruano, Javier Trabadela Robles and M. Rosario Fernández Falero (2018). “Research Methodology To Study The Audiences Of Television Series Analyzed By Facebook And Twitter.” In Congreso universitario internacional sobre la comunicación en la profesión y en la universidad (CUICIID), 165-167.

Manovich, Lev (2018). “Digital traces in context| 100 billion data rows per second: Media analytics in the early 21st century.” International journal of communication 12: 473-488.

Medhat, Walaa, Ahmed Hassan and Hoda Korashy (2014). “Sentiment analysis algorithms and applications: A survey.” Ain Shams engineering journal 5(4): 1093-1113. https://doi.org/10.1016/j.asej.2014.04.011.

Mohammad, Saif and Peter Turney (2010). “Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon.” In Proceedings of the NAACL-HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, 26-34. Los Angeles: Association for Computational Linguistics. https://aclanthology.org/W10-0204/ (last accessed 30-08-21).

Mohapatra, Shubhankar, Nauman Ahmed and Paulo Alencar (2019, December). “KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments.” In 2019 IEEE International Conference on Big Data (Big Data), 5544-5551. https://doi.org/10.1109/BigData47090.2019.9006554.

Molteni, Luca and J. Ponce De Leon (2016). “Forecasting with twitter data: an application to Usa Tv series audience.” International Journal of Design & Nature and Ecodynamics 11(3): 220-229. https://doi.org/10.2495/DNE-V11-N3-220-229.

Moretti, Franco (2011). “Literary Lab Pamphlet 2: Network Theory, Plot Analysis.” http://litlab.stanford.edu/LiteraryLabPamphlet2.pdf (last accessed 27-05-22).

Napoli, Philip M. (2013). “Social TV engagement metrics: An exploratory comparative analysis of competing (Aspiring) market information regimes.” Association for Education in Journalism & Mass Communication, Washington, DC, August. http://dx.doi.org/10.2139/ssrn.2307484.

Navarro, Celina, Matilde Delgrado, Elisa Paz, Nuria Garcia-Muñoz and Alba Mendoza (2021). “Comparative analysis of the broadcaster’s Twitter strategies of the highest-rated British and Spanish TV series.” Catalan Journal of Communication & Cultural Studies 13(1): 101-119. https://doi.org/10.1386/cjcs_00041_1.

Nielsen Media Research (2015). “Must see TV: how twitter activity ahead of fall season premieres could indicate success.” http://www.nielsen.com/us/en/insights/news/2015/must-see-tv-how-twitter-activity-ahead-of-fall-season-premieres-could-indicate-success.html (last accessed 30-08-21).

Osborne, Miles and Mark Dredze (2014). “Facebook, Twitter and Google Plus for breaking news: Is there a winner?” In Proceedings of the 8th International AAAI Conference on Web and Social Media, 611-614. Palo Alto, CA: AAAI Press. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8072 (last accessed 30-08-21).

Özdemir Çakır, Hilal (2017). “Interpersonal relationships of celebrities in social media: a content analysis of Turkish tv series leading actors’ instagram and twitter messages.” https://hdl.handle.net/11467/1599 (last accessed 30-08-21).

Pang, Bo and Lillian Lee (2008). “Opinion Mining and Sentiment Analysis.” Foundations and Trends in Information Retrieval 2: 1-135.

Pano, Toni and Rasha Kashef (2020). “A Complete VADER-Based Sentiment Analysis of Bitcoin (BTC) Tweets during the Era of COVID-19.” Big Data and Cognitive Computing 4(4): 1-33. https://doi.org/10.3390/bdcc4040033.

Pugsee, Pakawan, Tanasit Rengsomboonsuk and Kawintida Saiyot (2021). “Sentiment Analysis for Thai dramas on Twitter.” Naresuan University Journal: Science And Technology (NUJST) 30(1): 18-29. http://www.journal.nu.ac.th/NUJST/article/view/Vol-30-No-1-2022-18-29 (last accessed 30-08-21).

Salampasis, Michail, Georgios Paltoglou and Anastasia Giachanou (2014). “Using social media for continuous monitoring and mining of consumer behaviour.” International Journal of Electronic Business 11(1): 85-96. https://doi.org/10.1504/IJEB.2014.057905.

Salganik, Matthew J. (2019). Bit by bit: Social research in the digital age. Princeton: Princeton University Press.

Scharl, Arno, Alexander Hubmann-Haidvogel, Alistair Jones, Daniel Fischl, Ruslan Kamolov, Albert Weichselbraun and Walter Rafelsberger (2016). “Analyzing the public discourse on works of fiction–Detection and visualization of emotion in online coverage about HBO’s Game of Thrones.” Information processing & management 52(1): 129-138. https://doi.org/10.1016/j.ipm.2015.02.003.

Schwartzel, Erich (2019, July). “Coming to a Streaming Service Near You: Shows Costing as Much as Big-Budget Movies.” The Wall Street Journal. https://www.wsj.com/articles/coming-to-a-streaming-service-near-you-shows-costing-as-much-as-big-budget-movies-11562751000 (last accessed 30-08-21).

Tufekci, Zeynep (2014). “Big questions for social media big data: Representativeness, validity and other methodological pitfalls.” In Eighth international AAAI conference on weblogs and social media, 505-514.

Usher, Bethany (2015). “Twitter and the celebrity interview.” Celebrity studies 6(3): 306-321. https://doi.org/10.1080/19392397.2015.1062641.

van Vliet, Livia, Petter Törnberg and Justus Uitermark (2020). “The Twitter parliamentarian database: Analyzing Twitter politics across 26 countries.” PLoS one 15(9). https://doi.org/10.1371/journal.pone.0237073.

Varol, Onur, Emilio Ferrara, Christine L. Ogan, Filippo Menczer and Alessandro Flammini (2014). “Evolution of online user behavior during a social upheaval.” In Proceedings of the 2014 ACM conference on Web science, 81-90. https://doi.org/10.1145/2615569.2615699.

Volkens, Andrea, Pola Lehmann, Nicolas Merz, Sven Regel, Annika Werner (2013). “The Manifesto Data Collection. Manifesto Project (MRG/CMP/MARPOR).” Version 2013b. Berlin: Wissenschaftszentrum Berlin für Sozialforschung (WZB). https://doi.org/10.25522/manifesto.mpds.2013b.

Wakamiya, Shoko, Ryong Lee and Kazutoshi Sumiya (2011). “Towards better TV viewing rates: exploiting crowd's media life logs over twitter for TV rating.” In Proceedings of the 5th international conference on ubiquitous information management and communication. https://doi.org/10.1145/1968613.1968661.

Watts, Duncan J. (2007). “A twenty-first century science.” Nature 445(7127): 489. https://doi.org/10.1038/445489a.

Weng, Chung-Yi, Wei-Ta Chu and Ja-Ling Wu (2007). “Movie analysis based on roles' social network.” In 2007 IEEE International Conference on Multimedia and Expo, 1403-1406.

Weng, Chung-Yi, Wei-Ta Chu and Ja-Ling Wu (2009). “Rolenet: Movie analysis from the perspective of social networks.” IEEE Transactions on Multimedia 11(2): 256-271.

Williams, Apryl and Vanessa Gonlin (2017). “I got all my sisters with me (on Black Twitter): second screening of How to Get Away with Murder as a discourse on Black Womanhood.” Information, Communication & Society 20(7): 984-1004. https://doi.org/10.1080/1369118X.2017.1303077.

Zimbra, David, Ahmed Abbasi, Daniel Zeng and Hsinchun Chen (2018). “The state-of-the-art in Twitter sentiment analysis: A review and benchmark evaluation.” ACM Transactions on Management Information Systems (TMIS) 9(2): 1-29. https://doi.org/10.1145/3185045.


  1. According to a report by the Wall Street Journal (Schwartzel 2019), each episode cost nearly $15 million, making it one of the biggest budgets ever for a TV series. Game of Thrones (HBO, 2011-2019) only had a similar budget for its final season.↩︎

  2. Bernardelli (2021) highlights how religious themes have recently emerged in the production of contemporary TV series.↩︎

  3. The Web of Science Core Collection contains 3,805 articles with “Twitter” as a topic produced in 2020. See Antonakaki et al. (2021) for a comparison of the number of scientific publications containing the words “Twitter” and “Facebook” in their title from 2006 until 2020, as stated on Google Scholar.↩︎

  4. Bruns et al. (2017: 1) identify the three key factors that allow Twitter to be a widely recognized global platform for public communication.↩︎

  5. According to Antonakaki et al. (2021: 18), “although 30% of the world population is on Facebook, researchers most of the time use Twitter, in order to quickly assess the public opinion, sentiment, trend or belief regarding a subject of interest”.↩︎

  6. See https://developer.twitter.com/en/products/twitter-api/academic-research↩︎

  7. Mentions are tweets in which users use the ‘@’ symbol to tag the official Servant Twitter account.↩︎

  8. See Giachanou and Crestani (2016) and Zimbra et al. (2018) for a review of techniques and algorithms that have been proposed for sentiment analysis on Twitter.↩︎

  9. A list of English terms (with the possibility of selecting different languages) associated with a positive or negative polarity as well as one of eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust).↩︎

  10. See Kucher et al. (2018) for a discussion about insights and opportunities in sentiment visualization.↩︎

  11. See Elkins and Chun (2019) for a detailed discussion about the three different superimposed smoothing techniques.↩︎