Cinergie – Il cinema e le altre arti. N.13 (2018)
ISSN 2280-9481

“Commutation Tricks” and “Forced Marriages.” Videographic Manipulation as a Tool for the Analysis of Music in Films

Emilio Audissino, University of Southampton (UK)

A film scholar and a film musicologist, Emilio Audissino (University of Southampton) holds one PhD in History of Visual and Performing Arts from the University of Pisa, Italy, and one PhD in Film Studies from the University of Southampton, UK. He specialises in Hollywood and Italian cinema, and his interests are film analysis, film style and technique, comedy, horror, and film sound and music. He has published journal articles, book chapters, and encyclopedia entries on the history and analysis of films from the silent era to contemporary cinema. He is the author of the monograph John Williams’s Film Music: ‘Jaws,’ ‘Star Wars,’ ‘Raiders of the Lost Ark’ and the Return of the Classical Hollywood Music Style (University of Wisconsin Press, 2014), the first book-length study in English on the composer. His book Film/Music Analysis. A Film Studies Approach (Palgrave MacMillan, 2017) concerns a method to analyse music in films that blends Neoformalism and Gestalt Psychology.

Submitted: 2018-03-01 – Accepted: 2018-06-12 – Published: 2018-07-12


The audiovisual object is like a product of visuals and sound, not a summation. Film analysis should trace it back to its separate factors and show how they multiply each other when combined. Music, in particular, can be an elusive element, especially for non-specialists. Yet music can perform a variety of crucial functions in films, and its agency should therefore be given proper consideration in film analysis. After a preliminary discussion of some recent theory on how visuals and sound fuse into an audiovisual whole, the article presents the manipulations made possible by the videographic approach as a helpful analytical tool for drawing attention to the music. As an example, I provide my own videographic manipulations: an annotated audiovisual clip of the opening sequence of The Shining, and the same clip with the original music replaced with John Williams’s ‘Main Theme’ from Jurassic Park.

Keywords: film analysis; film music; audiovisual studies; The Shining; Gestalt Theory.

Examinations of videography or audiovisual technology applied to Film Studies often take their cue from, or cite at some point, Raymond Bellour’s 1975 essay “The Unattainable Text” (Bellour 1975).1 There, Bellour draws attention to the fact that, if it is ‘obvious enough’ that the film is a text in the Barthesian sense, said text is nonetheless “unattainable,” not so much because of the difficulties that the analyst faces in retrieving a good-quality copy to work on as because of the very nature of the film medium (Bellour 1975: 19). Unlike literary texts, the film text cannot be quoted. A fragment of poetry, say, can be literally lifted from the analyzed work and included in the critical essay; when a film is ‘quoted,’ by contrast, the result is not really a quotation but a ‘transmediation’: the film must be transformed from an audiovisual medium into a written one. In the subsequent “Analysis in Flames” Bellour further elaborates on the conundrum, to the point of stating that “film analysis has finally become an art without a future. The fact is that has never been, in itself, anything more than an illusory object,” and the film is “an elusive body” (Bellour 1985: 54). The home-video systems that, at the time of writing, might have seemed to have overcome the 1975 difficulty of accessing a study copy have actually killed the analysis. When one pauses the film, s/he is working on a frame, an image, not on the film: “the freeze frame which moves the film closer to the book, is a turning of pages. But struggling against the ‘natural’ procession of the images, it is also more: a game, a permutation, a diversion [derive]… A creation set adrift” (Bellour 1985: 55). What remains of so-called ‘film analysis,’ then, is actually a creative re-elaboration of the filmic materials: “free gestures […] a creation set adrift.” By citing the works of Thierry Kuntzel or some educational television shows, Bellour envisions the possibility that, as literature has its analyses expressed through its own medium, so will cinema, at some point:

Theory has not really been able to arrive at the image — to speak, to hold to, to live by the image; infinitely less has it been able to retain the image in its words. Perhaps this union of theory and image is an impossible marriage. Yet I continue to believe in the surprises that could arise, at this level, from encounters of the word and the image. (Bellour 1985: 56)

Videographic Film Studies seem to be one of those ‘surprises’ that Bellour was hoping for. Technology has now reached such a level not only of sophistication and flexibility but also of user-friendliness and low cost that it is actually possible to produce scholarship not in written but in audiovisual form.2 The text has finally become ‘attainable’ (Parker and Parker 2011; Newman 2016).

But what is a film text for Bellour? It is mostly images. His wording betrays that the visual is the main, if not the only, concern: “iconicity”… “Theory has not really been able to arrive at the image [emphasis mine]”… But the film text is more than image; film is an audiovisual composite, and sound played an important part in it even at the time when cinema was said to be ‘silent’ (see Altman 2004). Bellour’s inattentiveness to the aural aspects has been pointed out by Peter Larsen with regard to Bellour’s famous segmentation of the car dialogue in The Big Sleep (1946) (Bellour 1974), which is flawed by its lack of consideration for the music: the audiovisual construction of the scene suggests a different segmentation from the visually restricted one proposed by the French analyst (Larsen 2005: 110–14). This disregard for sound is quite common amongst film scholars, as I have detailed elsewhere (Audissino 2017b: 2–8), and is still ongoing, despite the audiovisual paradigm in Film Studies having been launched in the early 1980s.3 For example, the recent survey of international film theory by D’Aloia and Eugeni (D’Aloia and Eugeni 2017) is, again, typically visual-centred: not one chapter is devoted to audiovisual theory, even if ‘Sound Studies’ constitutes a lively branch of the discipline, with recent contributions including Chion 2009, Donnelly 2014, and Kulezic-Wilson 2015, to name a few. The neurosciences and cognitivism, which appear to be a strong interest in D’Aloia and Eugeni’s volume, have invested considerable effort in the study of how image and sound combine into the audiovisual percept (for example, Tan et al. 2013). Why are film scholars so often neglectful of sound, and of music in particular? 
Besides a cultural/historical visual bias and disciplinary boundaries that are implicitly enforced (‘music is a musicologist’s concern, not ours’), and despite the much advertized benefits of ‘interdisciplinarity,’ probably the most potent hindrance to a wholly audiovisual take on film has been the volatile nature of sound (see Audissino 2017b: 3–4).

If the visual text is unattainable and its body elusive, even more so is the sound text. Apart from dialogue — which can be naturally transcribed into words and poses few issues of quotability — music and sound effects are frustratingly immaterial. If one stops a film to analyze a segment, visually s/he obtains a freeze frame, which is not the film but it is still something; aurally, s/he obtains nothing. One can freeze images but cannot freeze sound. There have been attempts to ‘transmediate’ the sound components on the page of an essay. For reasons of space, hereafter I shall focus on music, because music — unlike sound effects — has a tradition and a rich set of conventions to transmediate the aural phenomena into written forms, and hence it may seem to come with a ready-made solution. Here is an example, by the music theorist Frank Lehman, of a musical piece transmediated into words, Hans Zimmer’s chorale from Gladiator (2000):

This sense of collapse is reinforced by the trajectory of the bass line […]. Together with the other voices, it articulates three sinking gestures. In the first of these, it drops from ^5 to ^1 to form a weak imperfect authentic cadence in D minor. It then sinks further from ^1 to ^5 for a Phrygian half cadence. The final ^3 to ^5 to ^1 motion in the last gesture [forms] a perfect authentic cadence that sums up the affect of the cue. (Lehman 2017: 39)

The transmediation issue is, actually, not that easily sorted out. As can be seen, this language is rather specialized. It conveys a rather precise description of the musical structure and development to those able to decipher it, although it still fails to convey the full range of expressive musical qualities, or “secondary parameters” (Meyer 1996: 209, 322), such as agogics, dynamics, and timbral colour; to those lacking a musical education, however, it sounds rather obscure. Even more importantly, this and other descriptions typically examine music in isolation from the film and do not say what music does within it, how it combines with the visuals. There have also been attempts to provide a more comprehensive transmediation, showing how the visual and sound tracks interact. The classic example is Sergei Eisenstein’s Alexander Nevsky diagram (Eisenstein 1957: 175–216, 283), in which a series of stills from the film is paired with a reproduction of the musical score and with a graph converting the audiovisual dynamics into a wave line. This diagram has been amply criticized (see for example Adorno and Eisler 2007: 104–07; Prendergast 1977: 211–14; Donnelly 2014: 52), its flaw basically being a visually biased treatment of music: Eisenstein sees music as signs on the score, not as the sound resulting from its being played. A more recent attempt is Rick Altman’s “mise-en-bande” diagram (Altman 2000), which strives more systematically to incorporate not only music but all the other elements of the soundtrack as well, but it is equally ineffective in rendering on the page the rich aural qualities of music, which resist transmediation.

Videography is perhaps the long-awaited solution: it allows analysts to quote not only the (visual) ‘film text’ without a transmediation, but music too. In videographic scholarship, we do not need a musical education to decode from a highly technical jargon what type of music we are talking about; we can listen to the music itself. We do not need inventive diagrams and tables to visualize how sound elements interact with the visuals; we can appreciate it directly. Videographic scholarship can facilitate an approach to film that is, finally, more audiovisual than simply visual, and in particular one in which film music can be handled with more confidence even by those who are not full-fledged musicologists. Examples of videographic scholarship applied to film music abound online: for example, Nicholas Kmet’s Structure in Film & Film Music, which sets out precisely to show the importance of music in the structuring of the film, against the visual bias (last accessed 25-05-2018); or Dan Golding’s A Theory of Film Music, which discusses why the music for recent blockbuster films sounds increasingly derivative and unimpressive (last accessed 25-05-2018). An entire channel on the video-sharing platform, ‘FilmScoreAnalysis,’ is devoted to this format of analysis. In the final part of this essay I shall provide my own example of videographic manipulation aimed at highlighting the impact of the music on the audiovisual whole. Beforehand, though, it is important to lay some groundwork on how to conceptualize the music/visual interaction.

Recent film scholarship that pays attention to music and sound emphasizes the fusion and reciprocal transformation of the aural and visual elements within the unified audiovisual object. K. J. Donnelly, in his study on synchronization, has employed the McGurk effect to explain this audiovisual fusion (Donnelly 2014: 17–24). In the McGurk experiment a close up of a mouth uttering the sound /ga-ga/ is coupled with the sound /ba-ba/: the resulting effect is that we see/hear the mouth emitting the sound /da-da/ (McGurk and MacDonald 1976). The experiment demonstrates that perception is a matter of fusion and reciprocal modification of elements, and the final percept is something different from the elements in their isolation.4 Building upon Donnelly’s Gestalt-based theory, I have offered my own theorization.5 To eschew what I call a “separatist conception” in which music and visuals are studied separately or are not studied as to their close interaction within the film’s system, I conceptualize the film segment to be analyzed as a ‘macro-configuration’ that is the product of the fusion of two or more ‘micro-configurations.’ I use the word ‘configuration’ to evoke the term ‘Gestalt,’ often translated as ‘form,’ to which I prefer ‘configuration’ in order to avoid any confusion between the Gestalt concept and the one from stylistic/formalist analysis — the film form, a film’s organized system. The visual-biassed separatist conception sees image as the key element, whereas music only ‘adds’ something extra to a meaning that, basically, is already in the images. An example is Noël Carroll’s view of film music functioning as adverbs do (Carroll 1996). 
Even Michel Chion — a pre-eminent scholar of the audiovisual phenomenon who theorized “synchresis,” a synthesis of image and sound that happens through synchronization — sometimes shows the tendency to still assign pre-eminence to the image, as when he writes that “sound enriches a given image” or mentions the “added value” that music projects onto images (Chion 1994: 5). Instead of addition and sum, I prefer the words fusion and product. In Gestalt theory a configuration is the stabilized organization of components that produce a ‘whole,’ which is something else than the single elements and their mere summation (Koffka 1935: 176). Our brain seeks patterns and stabilization until it organizes the separate stimuli into a unitary experience that makes sense perceptively, emotionally, and cognitively (see Audissino 2017a). As viewers/listeners, we experience the film as an audiovisual whole (macro-configuration), and each of its components (micro-configurations) is transformed by and, in turn, transforms the others as they interact. When music is coupled with the images, it attaches some of its qualities onto the visuals, and in turn it absorbs some of the qualities of the visuals (see Audissino 2017b: 110–118). Both transform one another in the process, and both concur in producing the whole.

Noticing the music is the first step in studying what the music does. This may sound obvious, but music does largely go unnoticed. Film scholars might ignore it for the reasons I have sketched above, but the general audience too is mostly unaware of the music’s presence (see Cohen 2000: 366). The titles of famous studies (Unheard Melodies, The Soul of Cinema, The Invisible Art of Film Music…) signal the almost subliminal agency of most mainstream film music, particularly Hollywood’s (see Gorbman 1987). In cognitivist parlance, when music is “associated” with the visuals and “congruent,” we do not take notice of its presence (Cohen 2013); in Gestalt parlance, when music is “isomorphic” with the visuals, we perceive the macro-configuration, not its separate components (micro-configurations). A good way to take notice of the music is to detach it from the visuals, or to replace the original music with different music over the same visuals, in order to reveal what exactly the musical factor brings to the audiovisual product. Analysis, then, is a sort of reverse-engineering operation, in which we take apart the elements of the macro-configuration, retrieve the factors that originated the product, and study how they interact and why they produce that specific outcome. Videographic manipulations of this kind have been carried out, and they demonstrate music’s powerful contribution. The 50th anniversary DVD edition of Psycho (1960) includes a featurette that shows the notorious shower scene without its iconic Bernard Herrmann score (last accessed 20-05-2018). Without music, the scene presents a starker realism, and the sound effects jump to the foreground, with the noise of the knife piercing the flesh, Marion’s moans, and the “anempathetic” (Chion 1994: 8–9) shower water that keeps running. 
To me, as such, the scene suggests stronger sexual innuendos, the moaning and sighs being aurally prominent as the knife penetrates the female body, and the slow collapse of Marion, accompanied only by the liquid sound of the running water, acquires a static (almost ecstatic) quality reminiscent of a post-orgasmic ‘resolution phase.’ The music, if analyzed separately, is exceedingly strident, with high-register violins playing marcato dissonances during the first part, and then dramatic descending gestures by the basses and cellos until the piece ‘dies.’ Coupled with the images, the shrieking stabs of the violins coalesce with the knife hits of the killer, not only to materialize the sense of acute pain conveyed by their ‘painful’ high-register dissonance (a sense of pain that is not really conveyed by Marion’s moans) but also to express a sense of murderous fury arising from the encounter of the rampaging conduct of the music and the hectic editing pace; then, the descending gestures of the strings make us perceive Marion’s life fading away, as she slowly collapses. A similar example of ‘music silencing’ that is worth at least mentioning is one that removes John Williams’s music from the finale of Star Wars (1977) to show how much slower and duller the whole scene looks without the ‘pomp and circumstance’ contributed by the driving and uplifting score (last accessed 20-05-2018).

A different type of manipulation is the one in which the original music is replaced with other music of the opposite sign, to observe how the product changes if one of the factors is modified. This type of experiment has been described by both Michel Chion and Philip Tagg, music scholars equally engaged in ways to analyze music that distance themselves from the formalistic, jargon-heavy traditional approaches — Tagg, for example, specializes in the semiotics of popular music for “non musos,” a term he employs to indicate people who consume music without having an expert’s musical education (Tagg 2012). Tagg calls this substitution “commutation trick” and thus describes it and its rationale:

I have attempted to focus attention, as tangibly as possible, on music’s ability to bring about radical changes in our interpretation of the images it accompanies. […] That power is both manifest and elusive, and it is necessary to identify this contradiction if we wish to address the question of manipulation in relation to music and the moving image […] [T]wo different music scores create two completely different narratives from the same visual sequence. (Tagg 1999: online)

Chion calls the operation “forced marriage,” performed to demonstrate what the “expressive and informative […] added value” is that music can pass onto the images, “so as to create the definitive impression, in the immediate or remembered experience one has of it, that this information or expression ‘naturally’ comes from what is seen, and is already contained in the image itself” (Chion 1994: 5). Chion continues: “Changing music over the same image dramatically illustrates the phenomena of added value, synchresis [synthesis through synchronization], sound-image association, and so forth. By observing the kinds of music the image ‘resists’ and the kinds of music cues it yields to, we begin to see the image in all its potential signification and expression” (Chion 1994: 188–89). A hilarious application of this musical permutation can be found online, in a ‘mashup’ (see van den Berg and Kiss 2016: chapter II) of the door-axing sequence from The Shining (1980). Being a mashup, the modification of the original materials goes beyond the simple musical substitution and also involves an intervention on the image track. Specifically, the climactic axe-wielding confrontation between the deranged Jack Torrance and his terrified wife Wendy is turned into a silent-comedy skit: the colour is turned to black and white, the projection speed is accelerated, title cards are interpolated, and cartoonish sound effects and canned laughter are added. The music is the key ingredient in subverting the original: the sequence is accompanied by ‘Spider’ Rich and Boots Randolph’s 1958 Yakety Sax. The piece is a polka-like, breakneck tour de force for tenor saxophone and pop orchestra. Its lively pace and carefree mood fuse with the accelerated actions and Jack Nicholson’s emphatic performance, and turn the horrifying attack into a clownish gag. 
Moreover — for those who grasp the reference6 — the music also contributes an intertextual association that carries an additional comic quality: it was the signature theme of The Benny Hill Show (1969–1989), specifically used at the end of each episode over an accelerated chase sequence that revived those of the silent cinema of Mack Sennett and the Keystone Cops.7 Yakety Sax completely changes the final result, de-emphasising the threatening micro-configurations of the original and, on the contrary, infusing a comic-chase quality over the entire piece.8

Coming to my own “commutation trick,” I remain in the territory of The Shining but focus on the opening sequence. This segment presents interesting advantages. First, music is the only aural element, which makes it easier for the analyst’s attention to focus on it without distractions such as dialogue or sound effects. Second, the music is very noticeable, not only because it is foregrounded and free of any competing aural element, but above all because it is (seemingly) non-isomorphic with the image track: there is a contrast between what we see and what we hear. Finally, given that the music is not intertwined with other sound elements, its replacement is technically easy. Elsewhere, I claimed that the Burkian Sublime at the base of the Romantic aesthetic (a contradictory mix of fascination and dread for something that is both spectacular and life-threatening, like Nature’s powerful phenomena) is pervasive in The Shining and, in particular, is presented as a theme from the outset, during the opening sequence (Audissino 2018). This, specifically, is made possible by the music, which, coupled with the images, results in a product that expresses the contradictory feeling typical of the Sublime. The visuals show a car trip in a mountainous landscape on a sunny day: the green of the trees, the blue of the serene sky, the pure white of the snow, the glowing flares of the sun rays hitting the camera lens, gliding aerial views following the travelling vehicle, the arrival at an elegant hotel spectacularly lodged amongst the mountains. Nature, in this isolated micro-configuration, appears majestic, imposing, immense, compared to the minute car traversing the landscape. Below is the image-only sequence.

Clip 1. The Shining, opening sequence without sound, uploaded by the author on Critical Commons.

Wendy Carlos and Rachel Elkind’s original music, in isolation, features a number of elements that characterize it as ominous and threatening. The piece opens with a marked statement of the Dies Irae, the medieval sequentia describing the Day of Wrath and used in the Requiem Mass, which carries a long-standing association, both in film and in concert music, with doom, evil, and tragic outcomes (see Chase 2003 and Rosar 2001). The Dies Irae is written in the ecclesiastical Dorian mode, a precursor of the modern diatonic minor mode, and the minor mode is typically associated with sadness or tragedy (Boltz 2001). The timbral quality of the piece is dark, with low-register synthesisers playing the medieval theme and bass pedal points (sustained notes typically used to build suspense). Then noise-like voices are introduced, resembling ghostly cries and whispers, to which the distant sound of a harpsichord is added, another film-music trademark of the horror genre: think of the vampire ball in The Fearless Vampire Killers (1967), to name just one.9 The macro-configuration that results from the fusion of this piece of music and these images is one in which Nature appears majestically threatening, imposingly oppressive, immensely dangerous. The flyover camera movement towards the car, coupled with the ‘ghost cry,’ looks like an attempted attack by some menacing supernatural presence that is stalking the vehicle. The type of Nature presented here is not plainly beautiful; it is Sublime. It generates a feeling of “astonishment” and “terror” (Burke 1909: 38–39). And it looks as if the car were going deeper and deeper into a deadly trap (Dies Irae) populated by supernatural threats (ghostly voices). In David J. Code’s words: “there is little visual information in the whole, sunlit title scene to perturb the clichéd […] expensively produced car commercial. 
It is only the music […] that clearly tells us what kind of film we are watching, and signals the crescendo of horrors lying ahead.” (Code 2010: 133–34) Below is the original sequence, with my annotations superimposed.

Clip 2. The Shining, opening sequence with Wendy Carlos and Rachel Elkind's original music and annotated commentary by Emilio Audissino, available on Critical Commons.

To demonstrate how important the agency of the musical factor is in this product, the “commutation trick” can be performed by replacing Carlos and Elkind’s music with John Williams’s Main Theme from Jurassic Park (1993). The music is serene, reverential, hymn-like, nobly majestic, soaring, in the major mode (identifiable with positivity, as opposed to the minor mode), with a predominance of bright timbres and high registers (violins, trumpets, flutes…) and characterized, harmonically, by an alternation of the I and IV degrees, the constituents of the ‘Amen cadence’ of liturgical music. Coupled with the images, the music produces a macro-configuration in which Nature appears majestically serene, imposingly awesome, immensely beautiful. Now, the flyover camera movement towards the car has no menacing connotation but looks like a spiritual soaring movement, as the music soars. We do not have the Burkian Sublime here but rather a quiet, harmless Beauty. With this piece of music, the car journey looks like the beginning of an uplifting adventure, an enriching discovery.10

Clip 3. The Shining, opening sequence: ‘Commutation Trick’ using John Williams's ‘Main Theme’ from Jurassic Park + annotated commentary, by Emilio Audissino, available on Critical Commons.

Videographic scholarship can help incorporate the study of film music more firmly and confidently within the Film Studies agenda. It offers an immediate solution to the problem of quoting and describing the music and, moreover, of accounting for how it interacts with the visuals. Videographic manipulation, through the silencing of the music or its substitution, makes it possible to experiment with novel combinations that can unveil aspects we might not have been able to spot from the analysis of the macro-configuration alone; it makes it possible to separate the factors of the audiovisual product, and thus to better discern their contribution to the whole and the way in which they interact.
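For readers who wish to try the two manipulations themselves, the silencing of the music and its substitution can be sketched as command lines for the free ffmpeg tool. What follows is only an illustrative sketch under stated assumptions, not the procedure used to produce the clips discussed in this essay: the function names and file names are hypothetical, and the Python code merely assembles the argument lists without running ffmpeg. The options used are `-an` (drop the audio), `-map` (select which streams to keep), `-c:v copy` (leave the picture untouched), and `-shortest` (stop at the end of the shorter input).

```python
from typing import List


def silence_command(video: str, output: str) -> List[str]:
    """Build an ffmpeg call that keeps the picture and drops all audio."""
    return ["ffmpeg", "-i", video, "-an", "-c:v", "copy", output]


def commutation_command(video: str, new_music: str, output: str) -> List[str]:
    """Build an ffmpeg call that pairs the original picture with new music.

    The video stream is taken from the first input (index 0) and the audio
    stream from the second input (index 1); the original audio is discarded.
    """
    return [
        "ffmpeg",
        "-i", video,        # input 0: the original sequence
        "-i", new_music,    # input 1: the substitute music
        "-map", "0:v:0",    # keep only the picture from input 0
        "-map", "1:a:0",    # keep only the sound from input 1
        "-c:v", "copy",     # do not re-encode the picture
        "-shortest",        # end when the shorter of the two inputs ends
        output,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(silence_command("opening.mp4", "opening_silent.mp4")))
    print(" ".join(commutation_command("opening.mp4", "new_theme.wav",
                                       "opening_commuted.mp4")))
```

The substitution works cleanly only when, as in the opening sequence examined here, music is the sole aural element; where dialogue and sound effects are mixed with the score, replacing the whole audio track would remove them as well.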


Adorno, Theodor W. and Hanns Eisler (2007 [1947]). Composing for the Films. London and New York: Continuum.

Altman, Rick (2000). “Inventing the Cinema Soundtrack. Hollywood’s Multiplane Sound System.” In Music and Cinema, edited by James Buhler, Caryl Flinn, and David Neumeyer, 339–359. Hanover NH: Wesleyan University Press.

Altman, Rick, ed. (1980). Yale French Studies 60.

Altman, Rick, ed. (1992). Sound Theory. Sound Practice. London and New York: Routledge.

Altman, Rick (2004). Silent Film Sound. New York: Columbia University Press.

Audissino, Emilio (2017a). “A Gestalt Approach to the Analysis of Music in Films.” Musicology Research 2 “Music on Screen. From Cinema Screens to Touchscreens” (Spring): 69–88.

Audissino, Emilio (2017b). Film/Music Analysis. A Film Studies Approach. Basingstoke: Palgrave MacMillan.

Audissino, Emilio (2018). “Paura e Natura. Una lettura burkiana della sequenza d'apertura di Shining” [Fear and Nature. A Burkian Reading of the Opening Sequence of The Shining]. Unpublished.

Audissino, Emilio (Forthcoming). “The Sound of The Uncanny: The Narrative Role of Music in Profondo Rosso.” In Scoring Italian Cinema. Patterns of Collaboration, edited by Giorgio Biancorosso and Roberto Calabretto. New York: Routledge.

Bellour, Raymond (1974). “The Obvious and the Code.” Screen 15(4): 7–17.

Bellour, Raymond (1975). “The Unattainable Text.” Screen 16(3): 19–28.

Bellour, Raymond (1985). “Analysis in Flames.” Diacritics 15(1): 52–56.

Boltz, Marilyn G. (2001). “Musical Soundtracks as a Schematic Influence on the Cognitive Processing of Filmed Events.” Music Perception: An Interdisciplinary Journal 18(4): 427–454. Last accessed 27-02-2018.

Bordwell, David (2008). Poetics of Cinema. New York-London: Routledge.

Bordwell, David, and Kristin Thompson (2010). Film Art. An Introduction. New York, McGraw-Hill, 9th International ed.

Burke, Edmund (1909 [1757]). A Philosophical Enquiry into the Origin of Our Ideas of the Sublime and Beautiful. New York, P.F. Collier & Son Company.

Carroll, Noël (1996). “Notes on Movie Music.” In Id., Theorizing the Moving Image, 139–145. Cambridge: Cambridge University Press.

Chase, Robert (2003). Dies Irae: A Guide to Requiem Music. Lanham MD: Scarecrow Press.

Chion, Michel (1994 [1990]). Audio-Vision. Sound on Screen, trans. Claudia Gorbman. New York: Columbia University Press.

Chion, Michel (2009). Film, A Sound Art, trans. Claudia Gorbman. New York: Columbia University Press.

Cohen, Annabel (2000). “Film Music: Perspectives from Cognitive Psychology.” In Music and Cinema, edited by James Buhler, Caryl Flinn, and David Neumeyer, 360–377. Hanover, NH: Wesleyan University Press.

Cohen, Annabel (2013). “Congruence-Association Model of Music and Multimedia: Origin and Evolution.” In The Psychology of Music in Multimedia, edited by Siu-Lan Tan, Annabel J. Cohen, Scott D. Lipscomb, and Roger A. Kendall, 17–47. Oxford: Oxford University Press.

D’Aloia, Adriano and Ruggero Eugeni, eds. (2017). Teorie del cinema. Il dibattito contemporaneo. Milan: Raffaello Cortina.

Code, David J. (2010). “Rehearing The Shining: Musical Undercurrents in the Overlook Hotel.” In Music in the Horror Film. Listening to Fear, edited by Neil Lerner, 133–51. New York: Routledge.

Donnelly, K. J. (2014). Occult Aesthetics: Synchronization in Sound Film. Oxford and New York: Oxford University Press.

Eisenstein, Sergei (1957 [1942]). The Film Sense, trans. and ed. Jay Leyda. New York: Meridian Books.

Gorbman, Claudia (1987). Unheard Melodies. Narrative Film Music. London and Bloomington: BFI/Indiana University Press.

Kiss, Miklós and Thomas van den Berg (2016). Film Studies in Motion: From Audiovisual Essay to Academic Research Video. Scalar. Last accessed 25/02/2018.

Koffka, Kurt (1935). Principles of Gestalt Psychology. New York: Harcourt, Brace.

Kulezic-Wilson, Danijela (2015). The Musicality of Narrative Film. Basingstoke: Palgrave MacMillan.

Larsen, Peter (2005). Film Music, trans. John Irons. London: Reaktion Books.

Lehman, Frank (2017). “Manufacturing the Epic Score: Hans Zimmer and the Sound of Significance.” In Music in Epic Film: Listening to Spectacle, edited by Stephen C. Meyer, 27–56. New York: Routledge.

McGurk, Harry and John MacDonald (1976). “Hearing Lips and Seeing Voices.” Nature 264 (December): 746–748.

Meyer, Leonard B. (1996 [1989]). Style and Music: Theory, History, and Ideology. Chicago and London: University of Chicago Press.

Newman, Michael Z. (2016). “GIFs: The Attainable Text.” Film Criticism 40(1): R1–R7.

Parker, Mark and Deborah Parker (2011). The DVD and the Study of Film: The Attainable Text. New York: Palgrave MacMillan.

Prendergast, Roy M. (1977). Film Music: A Neglected Art: A Critical Study of Music in Films. New York: W. W. Norton.

Hidalgo, Santiago, ed. (2018). Technology and Film Scholarship: Experience, Study, Theory. Amsterdam: Amsterdam University Press.

Tagg, Philip (1999). “Music, moving image, semiotics and the democratic right to know.” In Philip,, last accessed 27/02/2018.

Tagg, Philip (2012). Music’s Meanings. A Modern Musicology for Non-Musos. New York/Huddersfield: The Mass Media Music Scholars’ Press.

Tan, Siu-Lan, Annabel J. Cohen, Scott D. Lipscomb, and Roger A. Kendall, eds. (2013). The Psychology of Music in Multimedia. Oxford: Oxford University Press.

Thompson, Kristin (1988). Breaking the Glass Armor. Neoformalist Film Analysis. Princeton NJ: Princeton University Press.

Weis, Elisabeth, and John Belton, eds. (1985). Film Sound: Theory and Practice. New York: Columbia University Press.

Rosar, William H. (2001). “The Dies Irae in Citizen Kane: Musical Hermeneutics Applied to Film Music.” In Film Music: Critical Approaches, edited by K. J. Donnelly, 103–16. New York: Continuum International Publishing.

  1. For example, van den Berg and Kiss 2016 or Hidalgo 2018.

  2. The main formats are illustrated in van den Berg and Kiss 2016: chapter II.

  3. The main texts that launched the ‘audiovisual paradigm’ were Altman 1980; Weis and Belton 1985; Altman 1992; Chion 1994. See Audissino 2017b: 28–29.

  4. The experiment is reproduced here:, last accessed 20-05-2018.

  5. See Audissino 2017a and Audissino 2017b.

  6. An anonymous reviewer has raised the concern that this article is too focussed on textual analysis, with the innuendo that such an approach is outmoded. It is indeed focussed on textual analysis, which I prefer to call ‘formal/stylistic analysis’ to avoid the over-interpretive approach of ‘reading films’ that typically comes when films are thought of as ‘texts’ (see Thompson 1988: 34 n25). My approach is deliberately centred on the film (the ‘text’) as a counterbalance to the majority of today’s more fashionable approaches, which are primarily concerned with the socio-cultural and contextual ‘reading’ of films (see Audissino 2017b: 225–228). Yet it would be simplistic to think that, just because one is more interested in the ‘text’ than in the ‘context,’ the ‘context’ goes completely ignored. In fact, neoformalist analysis, seemingly a heavily text-based approach, is also attentive to the “transtextual motivation” of the film devices (Thompson 1988: 18) and to the “backgrounds” in which films are produced and viewed (Thompson 1988: 21). More often, it is the other way around: scholars interested in the interpretation of the ‘context’ are likely to neglect the proper analysis of the ‘text’ (Audissino 2017b: 6–8). So, I openly confess that the approach I use is indeed ‘textual,’ but it does not ignore the ‘context.’ The reference of Yakety Sax to The Benny Hill Show is a device motivated ‘compositionally,’ ‘artistically,’ and ‘transtextually’: by contributing to a fusion between the fast tempo of the music and the fast pace of the accelerated images, Yakety Sax is a key device for the overall consistency of the sequence construction; by adding a comic tone to a sequence that was originally designed as a dramatic one, it operates an artistic, parodic manipulation of the original; by referencing The Benny Hill Show, it adds, for those viewers who are familiar with the British TV show, a connotation that further boosts the parodic intent.
Being familiar with The Benny Hill Show helps, but even if one did not know what The Benny Hill Show is, and the ‘transtextual’ motivation were therefore not grasped, the overall effect would still be comic because of the intrinsic qualities of the micro-configurations. Hence, ‘textual’ analysis alone is sufficient to explain the comic effect of the sequence.

  7. Other versions of Yakety Sax Shining can be found online. They present fewer interventions, limiting themselves to recuts plus speed acceleration, or simply speed acceleration (, last accessed 20-05-2018). One version has no image tampering at all, only a musical substitution, but the fast tempo of the music is inconsistent with the unaccelerated pace of the visuals, which undermines the comic effect (, last accessed 20-05-2018).

  8. In the more ‘legitimate’ repertoire of art films, Chris Marker’s Letter from Siberia (1958) is also based on the “commutation trick.” The film presents the same footage (a bus traversing a city and some workers paving a street) three times, each time with a different audio commentary (music + voice over). The result is that the three sequences are perceived differently because a different audio factor changes the final audiovisual product. The film is described as a demonstration of the “power of sound to alter our understanding of images” in Bordwell and Thompson 2010: 270–271. Thanks to the other anonymous reviewer for reminding me of this example.

  9. On the harpsichord in horror film music, see Audissino forthcoming.

  10. Another note on the ‘textual’ vs. ‘contextual’ approach, in reply to another concern raised by the anonymous reviewer. I am not considering the Jurassic Park music here as a reference to the film Jurassic Park; I am considering its musical qualities, in terms of ‘compositional motivation’ (isomorphism of the musical and the visual qualities). I chose the piece because it perfectly fits the duration and the flow of the sequence. There are probably other, less transtextually marked pieces that would have fit equally well, but this is the one I came across. Surely a viewer who recognises it as the main theme from Jurassic Park might also interpret it as a ‘transtextually motivated’ device, and hence perhaps expect a T-Rex to come out at any moment from the next street corner to attack the car. If we wish to push the contextual hypotheses even further, an American viewer who happened to hear the Jurassic Park music at the funeral of a dear one might project her/his own idiosyncratic associations onto the visuals and perceive the entire sequence as infused with a profound, painful sadness; or if s/he happened to have heard that music (strangely enough) during a striptease, s/he might find, from such a weird association, the sequence to have arousing erotic overtones. Any viewer can bring to the film her/his own contextual experience and baggage of associations and knowledge, what Bordwell calls “appropriation” of a film (Bordwell 2008: 46–49). Here, again, is an example of a ‘text-focussed’ scholar who acknowledges the influence of context on film viewing. I have provided a ‘text’-based analysis here because this is the approach I prefer and, as I said before, because I claim we need more works along these lines to counterbalance the dominant position in today’s Film Studies.