by Scot McPhee.
"We are yet but young in deed" - Macbeth III.iv
This essay consists of some fragments of a prosthesis for aiding movement across a landscape. It is too trite to say that the structure of this essay somehow mirrors its object of study; rhizomatic, fragmentary, incomplete, and allowing the reader some control over its interpretation and linking. On some superficial levels this may be true. However, this essay has a purpose which draws together the myriad strands running through it and gives it its final direction. This is not an attempt to make Theory without any purpose to it, lacking a centre which no longer exerts a hold on the elements in orbit around it and so flying off on the inertial path of least resistance. The purpose of this essay, bluntly put, is to ascertain a general structure that describes the audio-visual logic of multimedia in a way that draws the question of sound, regarded here as including sonic as well as poetic logic, to the centre of the entire apparatus.
Recently, interest in this newish (but certainly faddish) computer multimedia apparatus has been dominated by newer (and more faddish) interest in the possibilities of the global communication network that is the Internet. But regardless of the bit-delivery mechanisms at play in these models, the ultimate viewing configurations remain mostly the same in the continuum of possibilities from CD-ROM to high-speed worldwide network. For the purposes of much of this essay I choose to restrict myself to mostly talking about CD-ROM multimedia. The other forms are too technically primitive at present to have as rich a variety of media interplay and sophistication. The World Wide Web for example is still too much like an animated, hyperlinked magazine. Sound, being a "real-time" medium requiring considerable bandwidth, is not well-suited to the current web form, although there are (currently inferior) methods for delivering less than telephone quality sound to the viewer over the network. It can be assumed that the CD-ROM paradigms will be the dominant ones for rich multimedia experiences once the technical problems with networks are solved. The user (Browser? Reader? Viewer?) visits this mediated space almost exclusively by force of the mouse, the keyboard, and the computer monitor, as a kind of expanded television (one might point out that one is forced to sit closer to this monitor than one's mother ever allowed in relation to the TV set whilst watching Saturday morning cartoons). Added to this picture of a domesticated interface is the question of sound that is always seemingly a poor cousin to the visual elements, provided so frequently as it is through small tinny speakers hardly worthy of a cheap transistor radio.
We are so used to dealing with this interface without sound in our "normal" range of computerised activities (such as the wordprocessing I am performing now, where the computer is mute and only the sound of its cooling fan and my fingers on the keys interrupt the acoustic environment), that this technically inferior sound reproduction system is rarely given much thought in its relationship to the screen, often even by critics who should know better. Usually their arguments are reduced to musicological examination and valorisation of "technical perfection" in systems without much reference to anything but emotive effects and physics, the same types of attention to sonic detail as found in hi-fi boutiques or popular film criticism.
That this is not surprising goes without saying. We had the "sound" film (or more precisely the "talking" picture) for some sixty years before a large range of critical theory engaged with it and interrogated its (now central) role in constructing cinematic experience. This can be contrasted with the much larger body of criticism invoking narrative and visual effects. That much multimedia is produced with an overwhelming priority on (usually slick) visual design is not likely to stupefy many familiar with the field of sound studies.
But this essay's purpose is not to simply assert the predominance of sound over all other elements of the media. Its real purpose is to explore the ways that multimedia construct the subject into the media's frame of reference, and the media into the subject. It is a study of how the different elements work together to produce an inventive (associative) logic that is evolving away from the Enlightenment's system of epistemological organisation (whose inheritance extends back to Ancient Greece). In such a system all individual media act in concert to produce a particular effect, just as in modern cinema the inclusion of sound, and later Dolby and THX surround-sound, altered the perception of screen space in dramatic ways that are still being explored. It is my intention that this essay provide at least a partially completed map which can be used to explore the fertile fields of multimedia production. The last section of this piece begins to flesh out ways of taking the observations in this essay and applying them to thinking about a multimedia creation that can highlight at least one way of evolving an aesthetics that draws out the poetic potentiality (and a politic) of the completed work.
In this first part of the essay I shall deal with multimedia sound and its relation to the whole using Michel Chion's book Audio-Vision [1] as an exemplar of what needs to be considered in thinking about the role of sound and its relationship to the image. It is quite true to say, however, that Chion's ideas, formulated as they are for the cinema, are not always immediately and obviously applicable to multimedia, and this is compounded by current multimedia production practices. There are also a number of technical limitations - the moving image in multimedia is laughably primitive compared to even art-school super-8 film or video production, and it is never "projected" theatrically like films are.
To confront or interact with an interactive multimedia piece the viewer is usually seated close, less than a metre, in front of a computer screen. There are speakers to the side of or below the screen. To operate the work the viewer usually has to use the computer's mouse, and perhaps also the keyboard. With artworks, the images on the screen are not usually "realistic", that is they do not purport to represent some point of view (and audition) into a supposedly-existing universe that is parallel to ours. Imagery is most frequently created (or manipulated) with computer imaging technology and this is overwhelmingly its "look", although there are many individual exceptions to this rule. Overall the screen design is similar to page design in glossy full-colour magazines - there are elements of "realism", for example, photographs or small snippets of digitised video called Quicktime movies, but the overall effect is one of a surreal dreamscape of seemingly unconnected graphics which the viewer can move through or manipulate as the author allows. The sound quality is below that of a home hi-fi, frequently much more like a very cheap radio. Perhaps the entire experience can often be summed up as like reading a televisual book. In this new media there are several striking poetic resonances which resound through its entire form, and it is in these that I hope to find an understanding of the form which will enable us in turn to produce new, vibrant works that consciously resonate with a rich poetic structure.
To illustrate this essay throughout, I have chosen some works from the first major exhibition of CD-ROM art in this country - Burning the interface curated by Mike Leggett and Linda Michael at the MCA, Sydney.
In his catalogue essay [2] for Burning the interface, Murray documents the increasing influence of "ambience", first in daily life:
"If it didn't sound paradoxical, we might say that ambience is the anthem of the late millennium." [3]
And from that, multimedia art:
"It is not just art CD-ROMs which draw on ambience. Commercial works titles rely heavily on aural fields to enmesh their players. This can be incidental (casual walks down ancient Greek paths in Wrath of the Gods), a highlight (sounds evoking rainforest environment that is the central concern in SimIsle) or part of the logic of the work itself (noises like cricket chants from the desert sands of East Africa used as a repertoire of geographical difference in Encarta '96 World Atlas). So why is ambience everywhere?" [4]
Indeed, why is it? Murray fails to see past the musicological categorisation of the "ambient" genre and grasp the deep affinity that "ambience" in all its forms has within multimedia. In the first case I mean ambient "genre" to mean a practical approach to making electro-acoustic music, as well as a description of the resulting music [5]; by the second - "ambience" - I mean the process of mapping, marking or indicating, an exterior terrain or space with sounds in order to produce an interior, as well as exterior, sense of that space - the deep reverb of the cathedral, for example, not only materialises the physical space of the church, it also invokes within the individual a psychological sense of the space. So it is not just a matter of nature being a set of easily coopted recurrent sounds that drives the contemporary dominance of ambience (of the first kind) in multimedia [6] but a relation of subjects to the medium that is created by such within the bounds of the second sense of the term. The soothing features of a quiet natural environment are but one possible set of aesthetic (and political) approaches to a particular conceptualisation of space that a producer can make to their work.
Murray offers up "ambience" by first excluding one poetic model, offering a version of what is excommunicated in his model of contemporary multimedia art. Perhaps this is to highlight the political choices many producers have made in their works. At any rate, for him, multimedia is:
"Not Kant:
Bold, overhanging, and as it were, threatening rocks, thunderclouds piled up the vault of heaven, borne along with flashes and peals, volcanoes in all their violence of destruction, hurricanes leaving desolation in their track, the boundless ocean rising with rebellious force, the high waterfall of some mighty river, and the like, make our power of resistance of trifling moment in comparison with their might." [7]
Instead it is best expressed in Keats;
"Hedge-crickets sing; and now with treble soft The red-breast whistles from a garden-croft; And gathering swallows twitter in the skies." [8]
His model explains the predilection of producers for choosing the sounds which best fit the technical requirements of the medium ("reality ... is already looped" [9]), thereby framing or rewriting nature with our cultural devices:
"This contemporary attunement to ambience is quite different from our experience in traditional theatres of nature, such as the zoological gardens we have inherited from the Nineteenth century. Its symbolic habitat is the wilderness where nature is largely hidden from sight, providing visitors with a soothing aural field of insect hum, pure light and enveloping odours." [9a]
However this explanation fails to recognise the intimate relationship that "ambient" sound has in either of Kant's or Keats' visions. This ambient sound expresses both passages by utilising 'iconic' notions of sounds, ie certain sounds stand in for bold and overhanging, desolate landscapes, and others for an English garden in the summer. Sound and poetics are bound with the power of sonic representation as metonymy, and through that or because of or perhaps in spite of, as an archetype.
Of course multimedia producers select the model (often unconsciously) which best fits their work's intentions, but if one was committed to engender Kant's Critique of Judgement into a multimedia form, it would not at all be surprising that sound must necessarily be an integral part of the representation of such a fierce, overwhelming and uncompromising nature as much as it is part of a soothing botanical garden. For comparison to a cinematic model, see, for example, Chion's description of the sound in the film The Bear - "the crew of The Bear knew that you can't just film shots of a bear and thereby automatically convey the bear's strength, its odor, weight, and animality: and they knew to draw on sound to aid in rendering all these qualities." [10] Really one could conceive and construct a multimedia work which conjures Kant's world using only audio elements, freeing the work's "story" to be situated in such a world. You would still require a certain graphic look (black, grey and muted, fiery reds, destructive and overwhelming dark blues and greens), but the primary poetic relationship, the one that 'propels' the work (even if to an abrupt stop), is often then one between acoustics, and "hypertext" [11] (ie simply, narrative).
Armed with the right "soundtrack" [12] and hypertext it is perfectly possible to create aesthetically pleasing interfaces with nothing but a one-bit colour system (ie digital black & white). Even if we were restricted to a silent work, it must evoke this sound environment in the mind's ear or else fail to communicate as effectively as it could. If a silent work doesn't do this successfully it seems to be a mere slideshow or computer-screen book which is unfulfilling compared to the "real thing".
Ambience doesn't permeate the work just as "evidence of the durability of the natural form in the workings of the human soul" [13] as much as a new form of it underpins a certain relationship in the way that multimedia poetics are constructed. Murray perhaps means "ambience" as a marker for a particular style of music, and from that he relates it to a certain style of poetry, and to a style of relating the "human condition".
But if one takes a broader view of "ambience" to mean a certain acoustic field which permeates a work, then we arrive at understanding that writing an 'ambient nature' can be expressed in both example forms. Murray rightly expresses that multimedia is really about the ambience within (a certain subjectivity) as much as it is the ambient field without. But how do the two relate? Can the ambience within be bold, threatening and overhanging as much as it might be sweetness, light twittering and fluttering? And how does this relate to the hypertextual and the visual?
It is not just Keats who is "hardly a sexy reference point" [14] in helping us understand this inner logic, but entire schools of thought and institutions which at first glance may appear to be entirely unconnected with Romantic poetry (or multimedia). It is apt that Chion, an electro-acoustic music producer of some note, can provide initial direction for an exploration of the terrain of the multimedia ambience within. Our starting point for this exploratory journey will be the new acoustic environment, the ambience without, of the interactive multimedia work. First we shall consider the voice.
Quite early in his book, almost as his introduction, Chion establishes how the text (or voice) structures what you 'see' in cinema. His example is quite straightforward: an announcer's voice-over simply directs the viewer's attention to the image (by calling forth a particular fact of what you see, in selection over other facts or possible facts regarding the image presented) [15]. Cinema, and television, set up a pivotal relationship between image and voice; this relationship is central to cinematic functioning. However, one of the first things that one notices about much multimedia is just how much this relationship is pushed into the background. As well we can note that it is text, rather than voice, that is dominant in multimedia.
Text is frequently used to communicate available choices directly to the user, as well as to form a demarcation of the 'visual' (graphic) element. It is also sometimes used as a conveyor of the principal narrative content of the work. Multimedia works, particularly 'utility' works like magazines, can often really be nothing more than an "expanded book", or even just an expanded coffee table book - utilising certain graphical elements to stylise it. Where text is necessary, it usually conveys narrative with either large sections of words printed on the screen, smaller inter-titles or captions (often blended into the graphics), or voiceover, or by a combination of these.
Talking heads by themselves usually do give the impression that multimedia is but a poor cousin to television. This is often because of the intrusion of technical elements into this traditionally TV style of presentation. Spoken narration by itself or over text also invites comparison to a "talking book". These are two comparisons that many multimedia practitioners seem particularly keen to avoid.
It is of course in this circumstance misdirected to blindly regard Chion's model as necessarily applicable in the same way as it is in cinema, even if it is true to say that multimedia has many of the same elements - it is also true to say that this simple model is made considerably different by massive extension. However we always can adapt his model and in turn construct a new model.
Our new model has to allow visual elements that invest new meaning into the text (eg simply via graphic design) instead of narration focussing attention on some element of the image or, in an inversion of the silent movie's inter-title interrupting the visual stream, the short Quicktime movie interrupts, interferes with, or recontextualises the reading of the screen's text. For example, in Brad Miller's work A Digital Rhizome, a black South African journalist's story is recontextualised completely by association with the Gulf War via small Quicktime movies of a smart bomb's "combat camera" - the two together becoming a powerful motif for the nature of modern representation. A different type of recontextualisation also takes place in John Collette's 30 Words for the City, a complex interplay of texts, photographs and sounds positions the "30 words" in question.
Then there is the simple matter of 'text' in the form of the program's controls (either on-screen or via a menu). These structure the experience of the viewer by directing their anticipation. If you select a button marked 'Home' or 'Contents', any user accustomed to multimedia convention expects to find a screen which functions as the central switching point or main access page, or the first page they encountered. Even if the page appears to be obscure or non-obvious, the user still has certain expectations of its functions and structures their use of it accordingly (such as clicking on parts of the image displayed in the hope of finding "hot spots" which activate the functions of the work).
Text is also used to impart information to the viewer as a narration does in cinema. Sound in this role is often relegated to being a reinforcement medium, and some narrator reads out the exact text presented on the screen. Occasionally it plays a sole part in imparting a text, that is, it operates in the bounds of conventional screen narration and forms a relationship to the screen as described by Chion. Voice also plays the part of the mystical voice, the "ghost in the machine". At other times there is a sound/text dialogue, with conventional actor's dialogue on one side and the viewer playing the part of another actor but inserting "dialogue" into the action in the form of menu or other command choices or direct text input. Luc Courchesne's work Portrait One is an example of this. The viewer propels the conversational narrative (an "interaction dialogue" [16]) along by selecting from an on-screen list of possible responses to the screen actor's dialogue.
Rarely does sound contradict or interrupt what can be read in the text on the screen. Doors of perception by Mediamatic is the CD-ROM documentation of a conference hosted in Amsterdam in 1994. The conference, whose topic is itself "the cultural and economic challenges of interactivity", is organised into bite-sized chunks of infotainment through which the reader can navigate by means of an agreeability index - a sliding scale which varies from "violently disagree" (represented by the binary 0) through to "definitely agree" (represented by the binary 1). This is a familiar form to anyone used to answering basic surveys and questionnaires. You are presented with an opening statement or aphorism (the speaker's words are shown to you on the screen as well as heard in the audio excerpt) and you can take it from there, being in turn presented with other conference speaker excerpts which can be multiple-choice surveyed, with your answer determining what you are presented with next.
Apart from the misapplication of a binary oppositional logic to a continuum of possibilities, the user soon finds that choosing the 'right' answer (in the sense of the one that actually aligns with your feeling toward the subject at hand) is not the best way to get through the information. This strategy soon leaves you with a small cluster of statements that you mostly agree with and quickly grow tired of viewing again and again. The best stratagem for viewing the maximum number of interesting statements is to choose the mid-point 'don't know' position, which forces the selection of a new statement (seemingly) at random. This is the ultimate 'postmodernist' position: taking a minimalist position in order to cover the richest variety of territory.
The capability of 'hypertextuality' is one of the often vaunted differences (advantages) of multimedia; it is touted as the medium's defining unique characteristic. One gets the impression that although textual structure (ie the circuits between meanings) is capable of being hypertextual, 'non-linear' etc, the nodes themselves often are not.
The work, A digital rhizome, as one might expect from its title, shows elements of intertextuality within the boundaries of a single node. Within a node, Miller draws on a juxtaposition of sound elements (narration about technology, music, radio, news excerpts) with (often competing) Quicktime video snippets and text labels which announce a new possibility of connection with a new nodal space. The user is left with a sense of connections and circuits triggered by the mimetic devices left scattered through the luscious graphics like mines scattered in a field.
Chion asserts that, for film, "there is no soundtrack" [17]. By this he means not that there isn't any sound 'channel' running through the film but that "the sounds of a film, taken separately from the image, do not form an internally coherent entity on equal footing with the image track" [18]. One of his chief determinants here is the relation of onscreen to offscreen sound - a distinction which is of course determined by their relationship to a sound-emitting object in the visual field. This distinction, as I shall outline later, may not be all that useful in multimedia because of the peculiar nature of its onscreen "space" and the way sound is used to delineate it. But we cannot dismiss Chion's bold statement entirely - for at least some works, the soundtrack is indeed completely referent to "on screen" action and devoid of this, it is not sensible to a logic that is readily discernible, for example the work Haiku Dada by Felix Hude. This work is full of interesting "emanation sounds" [19] made by the characters and cartoon sound effects and music. Stripped of its cartoon visuals the sounds suddenly lose all force. It is only when the audio-visual element is combined with the poetic, a strange Haiku self-constructor set, that the incongruity of the work comes through. You are given some sense of the author's desire to express how alien he found Japanese culture by the very alien-ness of such a dramatic collision of styles.
However the media form can and does exhibit many features which call into question the notion of a soundtrack which is by definition bound to its image for a life of its own, and operating that way in the work. Multimedia opens a channel whereby the soundtrack escapes and drifts into space free of its visual referent. The relationship can also be inverted and the image in fact anchored steadfastly to the sound, with the sound free to float in the aether for the duration, the form's inner logic has set the soundtrack adrift, towing along the image with it. A strange possibility occurs, for now it is possible to demonstrate it as media which is based on listening - "A film deprived of its image and transformed into an audio track proves altogether strange - provided you listen and refrain from imposing the images from your memory onto the sounds you hear. Only at this point can we talk about a soundtrack." [20] In this context the soundtrack in multimedia has become akin to a work of musique concrete, an electro-acoustic accompaniment, but one that is more than just mere decoration [21].
Each sound in this model soundtrack also becomes a container for meaning in a way that is not found in Chion. In Chion's cinema model, and indeed radio too, there is no analogue of the cinematic shot, nor has any other idea of the "auditory shot" been identified and accepted amongst a wide range of practitioners. [22] While it is convenient to break down visuals by each shot (multimedia's screen or node is a symbiosis of this along with the book's page), sound is not so easily categorised in this manner. But here too we find multimedia resisting this traditional model, for it now carries with it a notion of the "earcon" or auditory icon, that is, a distinct unit of audio meaning which is usually associated with some particular control action. Normally these are arbitrary with no common "language" across works (a problem human interface engineers seem bent on solving). [23] Although this is not quite an analogue of the shot, especially since nowhere near all of the auditory material is classifiable as separate elements thusly, the "sound unit" is a new feature not found in cinema and brings new possibilities for sonic representation (although I would hasten to add these are often hardly of a 'realistic' nature).
Added to this scheme of a literal sound unit, "ambience" as I shall expand on later, also brings with it the notion of the poetic earcon, a unit of meaning that accompanies the sonic and is deployed to situate the poetics in some specific subjective internal state or physical place. The summer garden, then, is not just a generic summer garden, but a specific garden which is evoked in the mind of the viewer. For example the Keats passage evokes for me a memory of a specific section in a specific garden - the Royal Botanic Gardens in Sydney. This idea of expanded ambience [24] is of central importance to the way multimedia functions as a poetic device.
These new features have a significant effect on the nature of sonic flow and editing within multimedia, as well as the idea of the perceived space that the media projects. In multimedia, sound does not (usually) propel the images forward in time as much as it situates them in space. Although John Collette's 30 Words for the City does represent, on one level, Chion's model of a cinematic sound-image relation, in that it takes fractured city landscapes and layers them with a temporal flow, on other levels the image, and text, are provided a space, a specific location, from which the work sounds. However this space is significantly different to the usual associations of sound and image space in cinema, particularly those coming from tendencies to realism.
Scrutiny in the Great Round, a work by ScruTiny Associates (Tennessee Rice Dixon, Jim Gasperini and Charlie Morrow) provides a clearer example of this process. The work, which "has a dreamlike sensibility, full of symbols and metaphors comprising a landscape of the imagination" [25], is richly illustrated with images recalling the glories of Renaissance Humanism in both its rational and spiritual guises. Here sound positions this work as a mystical apparatus, and working with the graphic design clearly delineates its division into "Sun" and "Moon" levels and invests the work into, and with, a hidden, magickal "space". It positions the work in an acoustic field that allows the work to operate successfully as a kind of ambient image-sound symbolic generator. You get the impression that you are viewing a parallel universe which is designed to reflect certain aspects of your own unconscious world back to you in a "sensual and personalised audio-visual concert" [26].
Sound, in this work and in many others, operates almost entirely in the realm of what Chion identifies as en creux, or phantom sound. [27] It's frequently sounding "from the other side of the film" (or screen as it is in this case), like Tarkovsky's Swedish songs in The Sacrifice. Many multimedia works also use natural sounds as 'rhythmic' markers of ambient space, not so much marking time as timelessness. Multimedia also has the forms of cartoon sound, general television, and talking book or presentation. For the most part works belonging to those categories are sonically constructed to be as close to their precursor forms as the new form allows. Except where they are imitated as closely as possible, many cinematic modes of audio-vision seem to be largely left out, or rather, there is such a different logic operating in it that the mode has considerably mutated once combined in symbiosis with literary form, broadcast media, and experimental art and video.
Nowhere is the absence of cinema more noticeable than in the acoustic field in which you experience the work. The actual acoustic space in which the work is experienced has radically altered, as has the way in which it is listened to. Cinema sound operates, particularly in modern forms with Dolby surround-sound systems, as "a space with fluid borders, a sort of superscreen enveloping the screen - the superfield" [28] (emphasis mine). Despite this new(ish) development, the image still "magnetizes" the sound in space - the superfield's origin, its zero-coordinate, is still the screen and everything is heard relative to this space.
Currently, multimedia contains an almost binary approach to the concept of onscreen "space". Mostly it is either "there" in-the-screen (eg, a Quicktime movie with its soundtrack, a control feature/alert dialogue noise, text being read out, synch fx with animation, mouse pointer "rollovers" [29] and other earcons), or it's not - except, as I shall extend later, by poetic association. The major exceptions are (some) role-play and (all) shoot-'em-up games and (most of) virtual reality (as they are concerned with immersive, physically modelled worlds) and this is an important distinction in current forms of multimedia to which I will return later. However for the current CD-ROM type, which is not usually used with overly-specialised interface hardware (eg "eyephones" the curious VR headset), the sound very much remains in nearfield mode.[30] The speakers are situated just centimetres away from the listener, under or immediately to the side of the image, as in television. There is no twenty or thirty metre wide space in which to position and move sound emitting objects, relativised into, out of, and through, the screen. Furthermore, the need to create the illusion that sound 'follows' objects around the screen, is greatly modified. To be sure, visual elements appear and their soundtrack starts, or mouse movement over certain objects creates a corresponding sound but as the entire system is less than a cubic metre in volume (screen, speakers and observer being the points which define its volumetric space) there is hardly any need for the mind to invent the illusion as it does with monophonic cinema, or much scope for positional acoustics to be simulated as with surround-sound systems.
The reliance on nearfield magnetisation of this sort means that stereo is used to simulate a physical space, an ambience or territory field, rather than liberating screen objects into a sort of 3-dimensional environment. Objects do appear in this tiny three dimensional space but quite often the logic of doing so is set free from having the screen as its physical zero-coordinate. The relationship is defined primarily as one that creates a psychological environment between the two senses, rather than a physical one. Despite a restricted sense of physical, acoustic space, multimedia has an enormously rich sense of an interconnected, subjective space. Restricted enormously in one set of (physical) dimensions, it expands rapidly in another, as if to compensate, giving a sense of a flatness, which has depth.
Thus sonics is central to the structure of subjectivity in multimedia. The ambience in many works acts as the 'interior' of the work (for example Thirty Words for the City). In this work, we are presented with a series of 'inter-titles', diegetic moments which define the particular rhizomatic island the viewer is 'in'. Where the text is read out, it is not read to us with our internal voice - the acoustic presence of the narrator audibly precludes, locks out, this possibility. We do not hold the perspective of this narrator; it is not the viewer who is the narrator; the narrator is 'in' the frame (or at least in the frame if we are to use the work's own logic in placement). This is unlike Chion's construction of the narrator belonging outside the diegesis of the work (outside the temporal mechanics of the actual picture that comprises the frame). This work does not use the familiar cinematic system of presenting a window onto 'reality'. The 'camera' (that is, the frame of the screen) doesn't carry the subjectivity either (by expressing a visual reference that can be used to construct a possible point-of-view), whether on a literal level (here is a shot where you or an on-screen protagonist are walking through) or, often, on any other level (even from a 'fly on the wall' or omniscient perspective).
We are cut off, adrift from the visual world of the work. We can operate the mechanism to view the world but what we are seeing is a collection of rhizomes, which have a conductive logic, and which are viewed through the small portal of the computer monitor.
What then 'writes' the viewer into the piece? How are we, striding hopelessly through this empty landscape of signposts, sutured into its disjointed memory? The element which is most at work here is the imaginary landscape conjured up by the non-diegetic sound of the soundtrack. It is the ambience of the work which represents the viewer in it. That is, the ambient, acousmatic music of the piece does not mark out the 'territory' of the logically represented on-screen space but it marks out the inner subjectivity of the viewer. The ambience is the accumulated memory of the reader, in the process of 'becoming' - being modified by the circulatory flow in the work we observe. The ambiences of different regions become earcons for, that is they are iconic of, subjective experiences and memories of the viewer, particularly where these need to be related to a particular space, or a direct emotional feeling (as in music).
Here we find a major departure from the ontologically privileged position given to the gaze in regular cinematics. The screen is no longer the reference point, the origin, for a physically "real" world into which the viewer is given a view and "point of audition" [31], either omniscient-to or subjective-within. The whole system acts as a portal to another universe, an interior world. The defining moment for multimedia, what makes the inter-textualisation between sense-data possible in this context and produces the interior contemplation required, is the flow of sounds that supposedly originate from the collective unconscious of the spectator, and it is this acoustic field which links the subjective moment in the work to the subjective state of the viewer.
How does the acoustic relate to the metaphysical structure of a multimedia work? Thus far we have seen some territory which shows how the media elements act in concert to produce a definite effect, or subjective state, in the viewer. But how does this apparatus then affect the possibilities for poetic interpretation of the resulting art?
If hypermedia is really so mechanically different to cinema, what is it most related to? What logic does it reflect or incorporate? There are a number of media which qualify, and in truth it is like all of them and yet like none the most. However, we may wish to start with its comparison to television.
In his book, Heuretics, Gregory Ulmer writes extensively about exploring a new method of invention, which is to say a new logic with which to organise poetics. In this system he envisages our electronic media to be a writing system, a system which instead of being a literate aide to mimesis, is instead a literal recording of "pure memory". [32] Ulmer finds that video is "an alternative means of gathering data into sets, for the purpose not of proving or testing an idea, but of having a thought, of inventing both in the rhetorical sense of finding something to say and in the creative sense of innovation" [33]. Thus on one level multimedia can be thought of not as a rhetorically logical writing to aid one's remembrance but as a direct writing of the mimetic structures themselves. These memory systems are the modern equivalents to methods of mimesis long rendered obsolete by print, where "formalised logic replaced associational reasoning" [34]. Video, and by extension multimedia and the Internet, reassert associational reasoning in the form of a conductive logic (conduction being an analogue of induction, deduction, and 'abduction').
A multimedia piece can be thought of as a symbolic generator, a motor, for generating a conductive logic, which forms circuits of reasoning moving from thing to thing. A truly "electric" machine, electric not just in its operation but in its reasoning also. Ulmer labels this associational logic Heuretics. Such a logic may sound very silent (indeed conduction mostly seems to flow among images ("things") in Ulmer's writing), but the technique of inventing by association also lends itself to acoustics. This acoustic basis, manifest in the features I outlined earlier, can be used as one basis with which to invent a multimedia form. Into the conductive flow of association between things multimedia splices sounds. Where video uses the sonic flow to smooth and temporalise, multimedia can use sound to roughen and interrupt, or it can (sometimes simultaneously) create a "space" in which the image and text occur. This process conjures new conduction between what is seen and what is heard and imagined. To the Twinkies and Milk of Jameson's reading of alienNATION [35], we can add any number of new potentialities. While video may lend itself to this in some way, multimedia opens itself more widely because it is already built upon the notion of the associational node (screen/card/window) rather than the smooth flow of ever-changing images that usually comprises video. Here, at this node, the viewer can associate (or conduct) a definite set of visual elements together with its accompanying text and sounds, with links to further nodes in a rhizomal complexity only limited by the amount of time and resources it takes to create each (a limit which is not usually acknowledged openly by many producers). How formless (or otherwise) such a complexity may be is still under the control of the artist; however the authoring function has changed, it is just as surely there.
This system is metonymic, where interrelated parts stand for yet more others. Multimedia (electronic media in general) is really a metonymic code. Each program, however, has its individual set of decoder rings.
But this logic associates also with space, the sense of surroundings, of place. Into this space creeps the ambient field, because (apart from the Cartesian representation of 3-D computer modelling), in multimedia it is the ambient sound that usually signifies it. Ulmer says:
"the suggestion that television is pure memory is based on the grammatological analogy with the invention of other information-storage technologies, such as writing or print, which constitute prostheses for memory. The history of writing shows that print favoured a style of logical representation that finally replaced and exceeded the hermetic tradition of the memory theatre - the mnemonics of places and active strong) images derived from ancient rhetoric. What began in ancient oratorical training as a method for memorising quantities of information by associating it in the imagination with a series of images distributed through the rooms of one's home, or along the street of one's community, had evolved by the time of the Renaissance into a theatre, a building, designed as an encyclopaedia of total knowledge" [36]
But what sort of "memory theatre" do we find ourselves in?
Multimedia objects contain a metonymic code, and meaning is transformed by the viewing with the logic of conduction. It can also be said that media objects within the form are in fact ciphers, that is "a secret method of writing, as by a specially formed set of symbols" [37].
These ciphers are often used as just empty vessels (cipher is derived from an Arabic word (via Latin) that means empty and it is still used this way), carriers of a metonymic association that the viewer reconstructs with a version of Ulmer's conductive logic. The media objects carry their 'texts' into the field - a field or space which is constructed with the judicious use of territory or ambient (electro-acoustic) sound. Even if territory sound is necessary to create a formally 'realist' construction of a physical place, the outside is invoked as metonymy in order to create a 'place' within, an internal zone in the subject viewing the work.
This also helps explain why many works create or invoke ancestral mythology. There is, perhaps, a strong link between mythology, internal zones, ambience, and multimedia which makes this understandable. It may be that the link also extends bilaterally, that is, that our "specially formed set of symbols" conducts mythology to its own logic as much as it carries itself towards mythology.
This fragmentary analysis isn't meant to be a theorising as much as it is meant to be a description of a DIY multimedia mechanism. More usefully, I am interested in the evolution of the new media - it seems to me there are a number of impasses and blockages (insulation?) in the conduction that are yet to be worked around.
A general thrust lies in my text which seeks to assert that the multimedia form is quite unlike the cinema, especially in its sound-image relationships, its faithfulness to "realism" and the way in which its poetic meaning is constructed. For if:
"Unlike painting or writing, it is commonly supposed, cinema uses motion picture photography and sound recording to fix and retain in memory a physical image of the pro-filmic scene. Whereas representational painting is based largely on iconic resemblances, and writing is built around symbolic relationships, cinema is thought to depend especially strongly on indexical connections" [38],
then it can be seen in my essay that multimedia art is much closer to literature and painting. This is made especially so by its unique relationship to sound-as-ambience and the logic of "iconic" sound. This movement toward the painting and the book is intensified by the fact that even while cinema has ceased to be entirely indexical because of its increasingly electronic nature, having lost the aspect of "recording its object by sacred contact" [39], multimedia has (mostly) always been this way, intensifying this trend towards the iconic representation.
Iconic representation, associational logic, etc, bring with them the idea that multimedia already carries with it the logic of the network, that of deeply interconnected and yet distinct entities or islands of meaning (and conduction). And so the rapid development of computer-networked forms of multimedia finds a parallel in its very own internal construction; it is perhaps none too surprising that the formal logic and the transmission medium are developed more-or-less in concert.
This concept of networked virtuality can be equated also with notions of aurality, which for Fran Dyson at least are related by their "phenomenal invisibility, intangibility, multiplicity, and existential flux" [40]. Sound is "an agent of destabilisation" [41] which "challenges an understanding of the real based on the physical, visible and enduring object." [42] She also identifies a strong link between the history of radio and the unfolding rhetoric of virtuality - in that radio offers "another example of presence recuperated via the often spiritualised notion of embodiment" [43], and both were/are used to idealise a form of "supra" subjectivity, a mass subjectivity popularly thought of as a type of collective unconscious or mystical being. In this respect at least, the history of early radio offers us many insights into the political discourse which surrounds such networks. However, especially given the latent impulse that "virtual" sound has towards rendering physical space, it also provides interesting pointers for constructing an aesthetics, an ambient radio poetics, in some ways perhaps reviving early utopian models of radio broadcasting which were largely pushed aside in the fight over defining the "citizen-consumer" [44].
For those of us sitting back in front of a multimedia work, peering at the flickering monitor, clicking with the mouse like a rat pressing a lever, reading our televisual book and inhabiting our internal space, the most pressing need of multimedia seems too frequently to be the need of real, physical, space rather than any virtual one. The exhibition gallery space for example does not really appear to be the ideal venue for the new media, even those made by artists. And even as a private form used in the home it certainly requires further "domestication" for it to claim the "market penetration" of television (for which the future multimedia Internet is often touted as a replacement).
That this issue of bodily ergonomics is a key future question is not questioned here. But it's not just an ergonomics of the body that computer-based multimedia requires, but also an ergonomics, or energy distribution, of meaning or poetics, in the form itself. Ultimately this is what this essay is really concerned with, and I hope that it has at least started, in its own fashion, to provide glimpses of this fragmented territory.
Scot McPhee June 1996
[1] Chion, M., Audio-Vision, trans. by C. Gorbman, Columbia University Press, New York.
[2] Murray, K, 'Mouse, where is thy sting?', in Burning the Interface [International Artists' CD-ROM], exhibition catalogue, Museum of Contemporary Art, Sydney 1996.
[3] ibid p 14
[4] loc cit
[5] see for example, the liner notes of, Brian Eno, "Music for Airports / Ambient 1", EG records, 1978. Or any of the definitions found at the URL http://hyperreal.com/music/misc/ambient/faq/
[6] see Murray p 14: "Being practical minded, we might begin with the medium itself. ... it is easier to loop recurrent sounds than compose an extended linear piece of background sound. Nature is readily supplied with sound patterns made for this purpose. ... reality outside is already looped."
[7] Kant E, Critique of Judgement, in Murray K, p 15.
[8] Keats J, To Autumn, in Murray K, p 15.
[9] Murray, p 14
[9a] loc cit
[10] Chion p 119
[11] By "hypertext" I mean the semantic content, eg the words and pictures and diegetic sounds, together with the structure and organisation of that material. Therefore it encompasses all possible "narratives" ie paths, through the material.
[12] Similarly here I mean "soundtrack" to mean the collection of sounds and the methods and sequences with which it is possible to play them. For some works this may be exceedingly complex, others might offer a single, fixed soundtrack (usually music). Perhaps the right word is sound-sequence, but I shall stick to the more conventional soundtrack.
[13] Murray, p 16
[14] loc cit
[15] Chion, in the section titled "Text Structures Vision", p 6-7
[16] Luc Courchesne, Portrait One, in Burning the Interface [International Artists' CD-ROM], exhibition catalogue, Museum of Contemporary Art, Sydney 1996, p 52.
[17] Chion, p 39
[18] loc cit
[19] see Chion, p 177-183
[20] ibid, p 40
[21] Of course, this doesn't stop the less imaginative from merely decorating decorative graphics with their equally banal choice of music.
[22] Chion p 41
[23] eg see Stephen Brewster, The Earcon Home Page, URL http://www.dcs.gla.ac.uk/~stephen/
[24] "Expanded ambience" incorporates the ambient "bed" which locates a physical place, with the acoustic environment of the work's viewing situation, the notion of an "ambient" or electro-acoustic music, and the poetic associations (the ambience within) which these evoke.
[25] Scrutiny Associates, Scrutiny in the great round, in Burning the Interface [International Artists' CD-ROM], exhibition catalogue, Museum of Contemporary Art, Sydney 1996, p 90.
[26] loc cit
[27] Chion p 123-137
[28] ibid p 69
[29] a rollover is an onscreen region (image, button, etc) that makes a short sound and/or changes shape, colour and so on, when the mouse pointer is moved to it without pressing the button. They are usually used to alert the viewer to the function's purpose or button's destination, or even just simply to let you know there is a link or other function actually present in this part of the screen.
[30] taken from the concept of nearfield monitoring found in audio engineering - speakers placed close to the engineer so as to minimise the effects of control room acoustics, eg wall reflections etc.
[31] Chion, p 89-92
[32] Ulmer, G., 'One Video Theory (some assembly required)' in Critical Issues in Electronic Media, ed. Penny, S., SUNY Press Albany, 1995. p 269
[33] ibid p. 268
[34] ibid p 270
[35] Jameson, F, Postmodernism Or, the cultural logic of late capitalism, Verso, London, 1991 p 79-94. Also quoted by Ulmer passim.
[36] Ulmer p 270
[37] Macquarie Dictionary, Second Edition, ed Arthur Delbridge, Macquarie Library 1991 p 328 (definition 6).
[38] Altman, R., 'Four and a Half Film Fallacies', in Sound Theory, Sound Practice ed by Altman, R., Routledge, New York, 1992, p 42
[39] ibid, p 44
[40] Dyson, F, 'In/Quest of Presence' in Critical Issues in Electronic Media, ed. Penny, S., SUNY Press Albany, 1995. p 29
[41] loc cit
[42] loc cit
[43] ibid, p 34
[44] see Spinelli, M, 'Radio Lessons for the Internet', in Postmodern Culture v.6 #2, January 1996 Oxford University Press. URL - http://jefferson.village.virginia.edu/pmc/issue.196/pop-cult.196.html