Social media as a research method


Communication Research and Practice

ISSN: 2204-1451 (Print) 2206-3374 (Online) Journal homepage: http://www.tandfonline.com/loi/rcrp20

To cite this article: Tarrin Wills (2016) Social media as a research method, Communication Research and Practice, 2:1, 7–19, DOI: 10.1080/22041451.2016.1155312

Published online: 25 Apr 2016.


Tarrin Wills
Department of English, University of Sydney, Sydney, Australia


ABSTRACT

Major developments in information technology in the digital society are eventually realised in the way in which research is conducted, particularly in the field of digital humanities (DH). Through a brief historical survey, this paper observes that the adoption of new technologies in DH occurs with some delay from the wide-scale adoption of the same technologies in other areas of society. This delay allows for a prediction about what technologies may be adopted in the near future in DH. In particular, the rise of social media in recent years provides a potential model for future DH research, particularly as it differs greatly from previous technologies in its capacity to engage end-users in digital methods. This paper argues that the techniques by which users interact with data in social media, particularly categorisation and semantic tagging, can be applied to a broad range of humanities research methodologies using similar interfaces to those of social media platforms. It then discusses some research tools developed by the author as a way of facilitating the interaction between researchers and primary sources using digital methods. Although much more limited than social media tools, these show a way forward for implementing social media methods in the field of humanities research.

ARTICLE HISTORY

Received 27 November 2015
Accepted 8 February 2016

KEYWORDS

Digital humanities; social media; hashtag; medieval studies; digital society

Introduction

This paper is based on a presentation at the conference Digging the Data at the University of Sydney (17 April 2015). It incorporates some reflections on the insights gained through the conference on how those working particularly in the discipline of media studies use and analyse the outputs of new media. It addresses a cultural and methodological problem specifically in the field of digital humanities (DH), focusing on humanities research which deals with non-contemporary primary materials, particularly those which do not originate in digital processes. Through a brief historical survey of DH in medieval studies, it identifies a significant delay in the take-up of otherwise mature and widely adopted information technologies in traditional disciplines in the humanities. The consequence of this delay is that we can predict what changes in information technology and culture are likely to influence future research in such fields. One movement that has dominated the period leading up to this paper is that of social media, and the latter part will address the question of how social media can be applied to traditional humanities disciplines as a research methodology in its own right. This paper develops some general principles by which broad types of humanities research could be advanced using methods deriving from and compatible with the ways in which social media users interact with textual and non-textual data.

This paper avoids reference to works which attempt to define DH and digital editing in particular (but see e.g. Robinson, 2013 for a review of these), as well as 'where are we now/going'-type collections and publications in DH, as they often obfuscate the long-term picture and do not necessarily reflect broader trends. Although the author has been working in digital methods for humanities research since the late 1990s, the research has been largely focused on research questions related to the interdisciplinary study of early Scandinavia, rather than DH as a research field in itself.

The adoption of mainstream information technologies in DH

In 1974, Speculum, the highest-impact journal in the field of medieval studies, included a paper entitled 'Report: Computers and the Medievalist' (Bullough, Lusignan, & Ohlgren, 1974). The vast majority of the paper is devoted to describing concordance-generating projects, with music, archaeology, and social data in small sections of their own. What is apparent from this stage of computing in the field is that these methods could only process relatively simple structures (words, numeric data). The report concludes:

But we feel that we are slowly leaving this barbarian age, and moving toward our Carolingian Renaissance. To do more, many of us will have to gain more training, or perhaps demand that our students obtain training. In fact, it is significant that a high proportion of studies reported in this brief review were undertaken by assistant professors or fairly recent Ph.D. graduates. Perhaps they are the coming generation. We should also provide ourselves with the necessary tools to achieve this progress. Data preparation remains the heaviest burden of text processing and here the setting up of a medieval data bank would bring an important relief to the scholar. Medieval text processing calls for its own Irish monastic libraries! (Bullough et al., 1974, p. 402)

What is remarkable about this summary is how much of it remains true to this day. It does point to a fundamental problem in how the field works in comparison with other fields such as new media: to begin with, the texts need to be digitised, and digitised in a way that supports analysis. For this reason, a large amount of energy in the field is devoted to data preparation.

Some 10 years later, a publication, Computer Applications to Medieval Studies (Gilmour-Bryson, 1984), dealt with a few projects in the field. The contributions are notable for their largely descriptive content, rather than addressing specific research questions. The data formats discussed reflect the heyday of mainframe computing: non-relational fixed-length records (i.e. a single table with up to 8 columns). Few projects had the capacity to deal with a character set that encompassed case sensitivity, let alone non-ASCII characters. This technology could still produce simple text concordances, usually distributed in print, and at least one project attempted to use computer processing to collate manuscript versions of texts.


Notably, the complexities of relational databases were largely not available to those who contributed to the 1984 volume, apart from one paper (Bächler). The relational database model had been conceived in 1970 (Codd, 1970), and its language (SQL) not long after (Chamberlin & Boyce, 1974), and by 1984 systems implementing the model were available from IBM and Oracle. There is no mention of marked-up text in the volume – SGML had been under development for several years but only became a public standard in 1985. The projects described in the 1984 volume are still very much restricted to the techniques already seen in 1974, but with a small minority making use of the developments of the previous decade to do more advanced work. The PC revolution was still a long way from impacting on research in the field, and the work of these projects was directed towards dissemination in print, a fundamental restriction on the use that could be made of these projects.

The early 1990s saw the emergence of the internet and in particular the World Wide Web. The development of the web was enabled by the platform-independent and easy-to-develop programming languages that had emerged around the same time, such as Perl 5 (late 1994), Java (1994–1995), and JavaScript (1995). This was a period when computing software was dominated by Microsoft, whose main market was large workplaces. In such enterprises, the employer provides the software and employees use it for their work, and there are consequently few compatibility problems within the workplace. Whole industries tended to use the same software for this reason, that is, to allow exchange between enterprises. The most obvious problem raised by the emergence of the internet was that if people were going to collaborate (either for work or recreation), they would need to be using tools that could 'talk' over the internet. Proprietary file formats – most notably Microsoft Word's .doc format – could cause compatibility problems. The other issue was that, as the internet connected different types of computer and different types of people, when information was exchanged, it would need to be in a format that reflected the meaning and use of the underlying information. This would allow for different applications to process the same information.

In the humanities, word processors, Microsoft Word in particular, did what humanities scholars needed, that is, produce documents that could be published in print publications. Print publications, if expanded to include electronic representations of print (e.g. PDF), are still the dominant medium for not just disseminating results but the process of humanities research itself. This is the fundamental challenge for increasing the take-up of digital methods in the humanities: most projects start and end with a Word document, which has almost no semantic information in its electronic form, apart from possibly the basic structure of the text.

The solution to the problem of semantic structure in the humanities was largely provided through the Text Encoding Initiative (TEI), a standard for the digital representation of texts. Originally a Standard Generalised Markup Language (SGML) application, the first full and public version was published in 1994 (P3 – see http://www.tei-c.org/About/history.xml). It was later updated to be compatible with the emerging Extensible Markup Language (XML) generalised markup standard (XML is a subset of SGML which is easier to process).
TEI provided a comprehensive solution to a number of the early problems: it was an open standard format that encompassed a huge range of humanities projects including textual scholarship (prose, drama, primary sources, critical editions, etc.), dictionaries, and language corpora. TEI is now the de facto standard for producing digitally structured materials in traditional textually focused humanities disciplines.

The process of standardisation of data representation in DH was based on the premise that personal computers, rather than networked servers, would process digital research materials. The internet, in this thinking, exists as a means of convenient file sharing. The presumption remains with TEI that scholars hand-code the XML, possibly with the help of an XML editor. In the digital society more broadly, XML is now ubiquitous – but usually at the front end of platforms and methodologies rather than at the back end (with some exceptions, including TEI and, perhaps ironically, file formats such as Microsoft's .docx). It is telling that the TEI project still does not have a set of publication tools that can reflect anything like the complexity that its semantic model allows.

Many types of DH projects do not use TEI. Dictionaries, for example (such as the Dictionary of Old English and the Dictionary of Old Norse Prose), normally use relational databases as their underlying technology. They require the encoding of complex relationships and normally involve teams of scholars working concurrently on small pieces of text. Networked relational databases allow those scholars, through web or desktop interfaces, to produce and publish their work collaboratively. Although published through web interfaces, in such projects most of the collaboration and interaction occurs through desktop computers and local networks.

These differences between the two basic data structures – XML and relational data – are relevant to this paper because they determine how users interact with data. XML-based projects tend to involve one person at a time on one device producing a single file or set of files, which are then processed. Database projects can involve multiple people on different devices interacting with diverse media. They are also far more scalable, with well-defined mechanisms for the exponential expansion of users and data. For these reasons, the rise of new media has relied on a technological foundation not of XML but of relational data (a brief sketch of this contrast appears below). Most social media platforms (e.g. Facebook, Twitter, Wikipedia) started with, or continue to be developed on, a variation of the LAMP (Linux, Apache, MySQL, PHP/Perl) platform, combining an operating system (normally Linux), a web server (Apache or similar), a database server (especially MySQL), and an application programming interface (API) to connect them and generate web pages or interfaces to mobile apps. There are some DH projects which use something close to this model, but they tend to be in the minority.

This brief survey of major developments in the field of DH consistently shows a 5–15-year delay in the take-up of basic digital technologies such as mainframe computing, generalised markup languages, and relational databases. This delay is not counted from when the technologies are first developed, but rather from when those technologies achieve widespread adoption in the digital society. In other words, there is a considerable time gap in DH between when a new technology is mature, widely available and inexpensive and when the systems and processes are implemented in humanities research. The time lag allows us to make a prediction about the future development of DH based on (relatively) recent trends in digital media.
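To make the XML/relational contrast discussed above concrete, the following is a minimal sketch of the same toy record in the two data models, using only Python's standard library. All element, table, and column names are invented for illustration and do not represent any project mentioned in this paper.

```python
import sqlite3
import xml.etree.ElementTree as ET

# 1. XML: a single self-contained document, typically produced by one
#    person at a time and processed as a whole file.
doc = ET.fromstring(
    "<stanza n='1'><line>First line of verse</line>"
    "<line>Second line of verse</line></stanza>"
)
print([line.text for line in doc.findall("line")])

# 2. Relational: the same information as rows, which many users can
#    insert and update concurrently, and which scales by adding rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line (stanza INTEGER, n INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO line VALUES (?, ?, ?)",
    [(1, 1, "First line of verse"), (1, 2, "Second line of verse")],
)
print(conn.execute(
    "SELECT text FROM line WHERE stanza = 1 ORDER BY n").fetchall())
```

The XML document is convenient to exchange and archive as a file; the relational rows are convenient to update concurrently and query at scale, which is the property the LAMP-style platforms described above depend on.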
In terms of hardware technologies, the popularity of networked hand-held touch-screen devices, especially phones and tablets, has the potential to improve and develop the way researchers and end-users interact with data and research resources. This paper, however, will focus on software technologies, in particular, social media. The rise of social media in recent years provides an obvious contender for future developments in DH. The following section will explore ways in which social media techniques can be used in a research context.


Social media tools as semantic analysis

Social media can be defined as 'a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content' (Kaplan & Haenlein, 2010, p. 61). This definition applies to the main social media platforms that I will refer to here, such as Facebook, Twitter, and Wikipedia. The success of social media derives from its capacity, as the name suggests, to connect people, and to do so in a way that provides a counterpart to direct, traditional social interactions – sharing experiences, opinions, and jokes; organising events; and so on. In order to do this, social media platforms rely on the ability of users to categorise information according to audience and content. The audience may be the general public (tweets, public posts on other platforms, blogs, etc.) or limited to those known or categorised by the content creator (friends, circles, groups). Importantly for the present study, these platforms provide numerous ways of organising the content itself according to semantic and social fields (hashtags, fan pages), as well as temporal and spatial dimensions (universal timestamps, geotagging), subject (hashtags, URLs, handles), group or community (hashtags, groups), event (calendar-type events, photo albums), and disposition towards the thing referenced (likes, plus-ones, shares, hashtags). Categorisation is an important but perhaps under-recognised tool in managing and analysing data in research. Some social media platforms use categorisation as a foundational principle, notably Pinterest.

Perhaps the most powerful of the techniques for semantic analysis is the hashtag, which was popularised by Twitter and is now used by all major social media platforms (apart from Wikipedia, which has its own in-text referencing system; cf. https://en.wikipedia.org/wiki/Help:Link). The ontology or taxonomical study of hashtags is still in its infancy, although the work of Bruns et al. in this special issue foregrounds a developing typology for hashtags, and Caleffi (2015) offers a linguistic analysis of the phenomenon. Yang et al. identify two broad categories of hashtag (described as '[an] organisational object of information'): the first and most obvious is the 'bookmark of content', but the second is as a marker of a virtual community (Yang, Sun, Zhang, & Mei, 2012, p. 261). What defines that community may be membership of an organisation, attendance at an event, or more commonly a particular disposition towards the content of the information itself. The two roles in fact overlap: the first role of the hashtag labels the content, the second makes it available to those interested in such content. The resulting information structure is sometimes referred to as a folksonomy. Unlike an ontology, it is not structured, but various tools can be used to create an ontology on the basis of the folksonomy (e.g. Christiaens, 2006).


Hashtags allow end-users to reference concepts that are not part of the data structure of the platform. They implement, in a very simple way, a methodology used in DH projects, that is, markup, as well as one used in cataloguing and data curation, that is, restricted vocabularies. Handles also provide a very simple linking mechanism within the platform compared with traditional web methods (i.e. URIs in HTML links – although most platforms allow for the automatic generation of links from URIs). In many platforms, hashtag information is used to tailor content, particularly advertising – but the tools which enable this semantic categorisation can be seen as analogous to many of the analytic techniques which underpin current humanities research.
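As a rough illustration of how little markup this demands of the user, the following Python sketch extracts hashtags from a handful of invented posts and counts them into a flat folksonomy. The sample posts and the simple hashtag pattern are assumptions for the example, not any platform's actual tokenisation rules.

```python
import re
from collections import Counter

# Hypothetical sample posts; real platforms tokenise more elaborately.
posts = [
    "Reading Old Norse poetry for the conference #skaldic #DH",
    "New manuscript images online! #DH #manuscripts",
    "Crowdsourced transcription drive this weekend #manuscripts",
]

# A deliberately simple pattern: '#' followed by letters/digits/underscores.
HASHTAG = re.compile(r"#(\w+)")

# The 'folksonomy' is just the flat, user-generated vocabulary with its
# frequencies: no hierarchy and no predefined ontology.
folksonomy = Counter(
    tag.lower() for post in posts for tag in HASHTAG.findall(post)
)
print(folksonomy.most_common())
# e.g. [('dh', 2), ('manuscripts', 2), ('skaldic', 1)]
```

Frequency lists like this are the raw material from which ontology-building tools of the kind cited above (e.g. Christiaens, 2006) can work.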

Semantic classification and humanities research

Surveys of scholars in medieval studies in 2002 and 2011 show that while there has been a very quick adoption in the field of digital publication of secondary materials, the use of digital editions barely changed in the period (Porter, 2013, p. 9). Part of the problem may be with the ways in which scholars interact with the primary materials, that is, how they are read and analysed. DH researchers often make the distinction between 'close reading' and 'distant reading' as analytical approaches in the field. Much research assumes that the latter is the goal of DH research, whereas close reading is generally thought to require a discursive method, which can only be realised in the process of writing longer texts such as theses, papers, and monographs. However, a good proportion of humanities research requires as its foundation a systematic identification and analysis of social or semantic fields across a corpus or text, which may be done with or without the use of digital techniques. In this section, I will show that such an approach can cut across the categories of close and distant reading and be applied to digital research.

PhD projects can be viewed as an indicator of the future of research. A very large proportion of present and recent PhD projects do not use digital methods but at the same time seek to explore semantic or social fields in a body of cultural products appropriate to their discipline. If we look at the titles of recent projects, we see a remarkably consistent pattern across disciplines of how research is done in the humanities by those who will be the future of the disciplines. For example, a quick browse of some of the more common keywords in the University of Sydney's online repository of PhD theses brings up titles such as: 'Repetition, revision, appropriation and the Western' (Robards, 2014); 'From footnotes to narrative: Welsh noblewomen in the thirteenth century' (Richards, 2005); 'God's Comics: Religious Humour in Contemporary Evangelical Christian and Mormon Comedy' (McIntyre, 2013); and 'Sisterly Subjects: Brother-sister relationships in female-authored domestic novels, 1750–1820' (Clifford, 2013). All of these projects attempt to analyse diverse sociocultural fields (film/narrative techniques, gender and socioeconomic categories, literary techniques, and social fields, respectively) in a particular corpus (film and literature genres, historical-geographical periods). Either or both the sociocultural fields and corpus are appropriate to the discipline of the researcher, but the approach can be generalised: it involves the identification and analysis of examples of the fields realised within the corpus.


The sociocultural fields usually involve an implicit ontology where different aspects of a field are discussed in relation to the corpus. This is a common approach to a large proportion of humanities research, although it may not always be recognised as such. Traditional digital projects rarely belong to such a framework, instead addressing problems of preservation, description and formal analysis rather than social or semantic fields. They are often focused primarily on the text or cultural product itself. This is not necessarily the case in the field of media and communications. The papers presented in the Digging the Data conference at which the present work originated, for example, show an emphasis on the sociocultural fields as the starting point. Such a method can be effectively applied to a digitised body of material when it is in a form that can be electronically analysed.

The difference in approach – between DH focused on traditional disciplines, and digital media and communication studies – may have to do with the primary materials themselves, which are not normally originally created in digital form for traditional humanities fields. In a discipline that deals with media created for digital platforms, the analysis of sociocultural fields is relatively straightforward. Other disciplines, which deal in particular with pre-twenty-first-century media, face the problem that the materials may not be digitised, or may not be digitised in a way that is apparently useful to the researcher. However, a large proportion of the primary materials of a number of disciplines have in fact already been digitised. This is certainly the case for pre-twentieth-century printed works (especially through Project Gutenberg, Internet Archive, Google Books and similar), but now also extends to a very large amount of material available through public institutions (especially Semantic Web/metadata projects) and public social media platforms (Twitter API, etc.), as well as more specialised academic repositories such as the Oxford Text Archive. Copyrighted work remains an obstacle, particularly for pre-digital sources, but much of this can be easily digitised, and for the purpose of semantic analysis can be privately digitised (scanning and OCR), processed and referenced without requiring the reproduction of the copyrighted work for publication.

In order for this type of research to take place, mature technology is required for robust, user-friendly semantic markup and/or ontological analysis of texts on a large scale. Such technology exists: GATE (General Architecture for Text Engineering) has been in development since 1995 and is used by a large number of projects for manual and automatic semantic markup and analysis of large corpora. Originally developed for computing science projects, it is rarely used in DH, despite the relevance of its capabilities. A search for the name of the tool in the journal Digital Scholarship in the Humanities (formerly Literary and Linguistic Computing), for example, reveals only one project using the tool in the principal DH journal (Odat, Groza, & Hunter, 2015). GATE could be used, for example, to import a novel, web pages, or other resources and then either analyse them automatically against an ontology or mark them up manually with semantic labels. This would facilitate finding relevant materials in the corpus and help develop analyses based on ontologies and other relationships between terms and features.
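The following is a hedged sketch, in plain Python rather than GATE itself, of the workflow just described: matching text fragments against a small, invented ontology of semantic fields. It illustrates the general idea only and is not GATE's API.

```python
# A toy ontology mapping semantic fields to invented vocabularies.
ontology = {
    "kinship": {"sister", "brother", "mother", "daughter"},
    "religion": {"god", "temple", "sacrifice", "ritual"},
}

def tag_fragment(fragment):
    """Return the ontology fields whose terms occur in the fragment."""
    words = set(fragment.lower().split())
    return {
        field: sorted(words & terms)
        for field, terms in ontology.items()
        if words & terms
    }

sample = "Her brother offered a sacrifice at the temple"
print(tag_fragment(sample))
# {'kinship': ['brother'], 'religion': ['sacrifice', 'temple']}
```

A production system would of course handle lemmatisation, context, and ambiguity; the point here is only that the analysis of a sociocultural field can be expressed as labels applied systematically across a corpus.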
Non-textual corpora would require manual tagging of relevant features in images, video, and so on. Tools likewise exist for these purposes.

One of the main advantages of a digital method for this approach to humanities research is that the materials could be made available to other researchers, either working on a similar corpus or exploring similar sociocultural fields in related corpora. Using the PhD projects above as an example, Richards' analysis of medieval noblewomen in Welsh sources could be extended to English or Irish sources by other researchers; or the investigations of female-authored domestic novels by Clifford could be extended to other types of relationships, refining the ontologies that arise from the analysis.

Importantly, this approach is not incompatible with the work of those producing scholarly digital editions. In cases where such editions exist, researchers can build on that work by using those editions as the basis of their semantic analysis, producing results based on more reliable and exchangeable corpora than other kinds of digitised works. Even where analyses are based on other digitised work, subsequent digital editing projects can align the semantic analyses with the new editions, provided the metadata are clearly defined. The prerequisite is a clearly defined set of metadata for each corpus, defining individual texts and their parts so that analyses of different versions can be aligned (a sketch of such a metadata record follows below).

A further prerequisite for these methodologies is a set of tools akin to social media interfaces, which allow researchers, and potentially others interested in a field, to interact with the primary materials in order to produce semantic tagging and categorisation. Crowdsourcing projects such as those hosted by Zooniverse (www.zooniverse.org) allow for such work, but within the strict semantic confines of the particular project, with most humanities projects involving transcription. What is needed is a more generalised interface which allows users to interact with text, images, and other creative products which can be easily digitised but may not be originally digital. Social media breaks down the distinction between researchers as end-users and as developers: end-users are the content creators and often drive development, as we have seen in the use of hashtags on platforms such as Facebook before the platform had implemented them.
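As a sketch of what such a clearly defined set of metadata might look like, the following hypothetical Python records identify a tagged fragment by corpus, text, and an edition-independent reference, so that an analysis made against one edition could later be aligned with another. All field names and values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FragmentRef:
    corpus: str    # identifier for the defined corpus
    text: str      # the individual text within the corpus
    ref: str       # edition-independent reference to a part of the text
    edition: str   # the digitisation/edition the tag was made against

@dataclass(frozen=True)
class SemanticTag:
    fragment: FragmentRef
    field: str     # the sociocultural field being analysed
    label: str     # the specific label within that field

# Because 'ref' is defined against the corpus rather than one edition's
# page layout, this hypothetical tag could be realigned with a new edition.
tag = SemanticTag(
    FragmentRef(corpus="corpus-01", text="Text A", ref="ch. 3, l. 12",
                edition="edition-1"),
    field="gender",
    label="noblewoman",
)
print(tag)
```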

A case study: the skaldic project

One of the reasons for the delay in the adoption of mature technologies in DH may be the relatively short-term and restricted nature of funding for many DH projects. With a longer time frame and consistent support, a project may be able to develop in a way that takes advantage of new techniques and technologies in the digital society. At this point I will share my direct experience of a project, 'Skaldic Poetry of the Scandinavian Middle Ages' (http://abdn.ac.uk/skaldic), which may be held up as such an example, with a view to showing the development of a relationship between a platform, its users, and its content.

I have had a long-standing involvement in the project, starting in 1999, during my doctoral research, when I began work with Margaret Clunies Ross on the project. It aims to edit the corpus of skaldic poetry, a complex poetic form composed in Old Norse mainly by Icelanders and Norwegians from the ninth to the fourteenth centuries. The resulting output was originally envisaged as two print volumes with five contributors. (As is typical of these projects, the scope has increased by an order of magnitude, with now around 50 contributors and at least 16 large volumes, published digitally and in print. In some 18 years, only about half the corpus has been finalised.) For my PhD project (1997–2000), I was working on an interactive TEI-based edition of an Old Norse text, and I was invited to advise on how digital methods might be applied to the new project. In 2001, following the submission of my PhD, I began full-time work on the project, and in 2007, I became a member of the editorial board.

TEI seemed like an obvious solution for this type of project, but a few challenges quickly became apparent. The project was very diverse in terms of its metadata and media, incorporating different types of text and image, necessitating a move away from the purely text-encoding approach as originally envisaged. Contributors needed information about and access to the hundreds of manuscripts containing the poetry, as well as a reference point in the existing standard editions of the corpus (in particular Finnur Jónsson, 1912–1915, which defined the organisation of the corpus used by subsequent major editions). The solution was a relational database linking the corpus structure of Finnur Jónsson's edition to its various contexts: manuscript pages (with images where available), prose works in which the poetry is recorded, and the new project and its editorial team. The web seemed an obvious platform for interacting with this database, as contributors were working on three different continents. By 2003, this database had a web interface for editing its contents, built on the LAMP configuration. The project was from a very early stage built on and for the web, although the major research outputs continue to be print publications. (Metadata and material not in the printed volumes are publicly available, and the content of the volumes themselves is available after a 3-year embargo.)

The scope of the project meant that the content of the edition would need to be concurrently updated by numerous contributors and assistants, requiring the data structure to be divided in some way in order to facilitate this process. At this stage TEI was retained for the encoding of the text of individual stanzas of poetry, with the rest of the structure represented by the relational data model. However, the relational database platform proved to be flexible and scalable enough to avoid mixing XML and relational data in this way, particularly with improvements to the open-source MySQL server software. After the first volume was produced in 2007, I converted the remaining TEI to a relational model. The process of conversion did not involve great challenges or compromises. Fundamentally, the tree structure of XML/SGML is compatible with a relational model, allowing for bidirectional linking of data, but requiring additional information and processing to retain the syntagmatic structure of language (see Wills, 2013).

The purpose of moving to a largely database solution (there is some XML tagging within short sections of text) was to enable scalability and to allow for further analysis of the material, which had already expanded into related fields. It also allowed the development of interfaces for the editing of data by the dozen or so editors and assistants involved in this process. The semantic and analytical information in the project – that which is encoded electronically – is largely represented through the linking of various entities (texts, words, dictionaries, manuscripts, editors, bibliographic items, etc.) through the relational links within the database. These are represented to the end-user, as is typical of such projects, by HTML pages linking the entities together.
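A minimal sketch of this kind of linking database follows, again in Python with sqlite3. The tables, identifiers, and values are invented to illustrate the approach and are not the Skaldic Project's actual schema.

```python
import sqlite3

# Invented linking schema: a stanza, ordered by a standard edition, is
# linked to the manuscripts and prose works in which it is recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stanza     (id INTEGER PRIMARY KEY, siglum TEXT NOT NULL);
CREATE TABLE manuscript (id INTEGER PRIMARY KEY, shelfmark TEXT NOT NULL);
CREATE TABLE prose_work (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Link tables: one stanza may appear in many manuscripts and prose works.
CREATE TABLE stanza_in_ms   (stanza_id INTEGER, ms_id INTEGER, page TEXT);
CREATE TABLE stanza_in_work (stanza_id INTEGER, work_id INTEGER);

INSERT INTO stanza      VALUES (1, 'Stanza 1');
INSERT INTO manuscript  VALUES (1, 'MS A'), (2, 'MS B');
INSERT INTO prose_work  VALUES (1, 'Saga X');
INSERT INTO stanza_in_ms   VALUES (1, 1, '12r'), (1, 2, '33v');
INSERT INTO stanza_in_work VALUES (1, 1);
""")

# All manuscript witnesses of a stanza, resolved through the link table.
for row in conn.execute("""
    SELECT s.siglum, m.shelfmark, l.page
    FROM stanza s
    JOIN stanza_in_ms l ON l.stanza_id = s.id
    JOIN manuscript m   ON m.id = l.ms_id"""):
    print(row)
```

The same pattern of named link tables extends to editors, dictionaries, and bibliographic items, which is what allows a single database to drive both the editorial workflow and the published HTML pages.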
The design, like that of similar projects, conforms to Semantic Web principles in that each relationship has a defined 'triple' representing the subject, predicate, and object of the link (where the predicate is defined by the named link itself).

In many ways the most impressive success of this approach has been to engage scholars, particularly senior scholars who do not see themselves as doing digital research, in the process of electronic editing and analysis. Just two of the retired or soon-to-retire contributors (the editors of the most recently published volumes) have between them made over 70,000 edits to the database. The key here has been to make the digital interface closely mirror the traditional processes of scholarship. In the case of textual editing, this involves gathering the manuscript and secondary sources, transcribing and collating the primary sources, normalising the text and recording variants, providing a close translation and commentary, and, in the case of skaldic poetry, analysing the complex diction and word order. The database exports the information in a format that is used to directly produce traditional print publications. The added value of an electronic approach is that all of the information produced and processed is digitally linked to related information, such as the text, translation, and commentary on individual words, and pieces of text to manuscript images. These links can be represented in a variety of interactive ways through web and similar interfaces.

The processes are implemented digitally through a web interface to the database using a series of interactive forms. The online forms have been gradually modified over the years to almost eliminate technical markup, while maintaining the electronic encoding of semantic structures. The result is a resource which can be exported as TEI and preserves the intricacies of TEI, but which does not require knowledge of the markup language, nor does it permit errors in the markup language itself.

The basic model for incorporating textual editing processes with primary and secondary sources and analysis proved to be generalisable to related fields. In 2012, I was invited to join the international project Pre-Christian Religions of the North (PCRN), and have taken on the role of developing a database of the sources of Northern paganism (see http://abdn.ac.uk/pcrn). This project builds on the corpus-based work of the Skaldic Project, but adds a further semantic dimension to it: the process of amassing a body of evidence for such a broad and diverse phenomenon as religious belief and practice is not very useful unless there is some way of navigating it for specific information regarding those practices. My original conception of the PCRN database was to create a series of complex structures linking the different source types (written, onomastic, archaeological, visual, epigraphic, etc.) with the mythological-religious phenomena they may shed light on (gods, supernatural beings, cultic practices, etc.). The development process has involved recruiting PhD students to incorporate analyses of mythological-religious material into the database, experimenting with different models by adapting their own analytical approaches. The complexity of dealing with the source types as structures requires extensive training and still leaves compromises in the digital representation of the interpretations (see Wills, 2014). The most successful approach so far with this project has been semantic tagging of text fragments (supported in particular by the University of Iceland).
Semantic tagging is a much more accessible technique for the postgraduate-level assistants working on this project, and this practical discovery has informed the above proposal for semantic tagging in this and other humanities fields.
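To illustrate the 'triple' reading of relational links described in this case study, here is a short hedged sketch: rows of named links rendered as subject-predicate-object statements, with the name of the link supplying the predicate. The identifiers and namespace are invented for the example.

```python
# Invented link rows of the kind a relational edition database might hold.
links = [
    ("stanza/17", "recordedIn", "manuscript/MS-A"),
    ("stanza/17", "editedBy", "editor/42"),
    ("fragment/903", "taggedAs", "concept/sacrifice"),
]

BASE = "https://example.org/"  # hypothetical namespace

def as_triples(rows):
    """Render link rows as N-Triples-style statements."""
    for subj, pred, obj in rows:
        yield f"<{BASE}{subj}> <{BASE}{pred}> <{BASE}{obj}> ."

print("\n".join(as_triples(links)))
```

An export along these lines is one way such a database could interoperate with the Semantic Web and metadata projects mentioned earlier, without changing its internal relational design.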


Conclusion

Social media engages end-users on a large scale in processes of semantic analysis and categorisation of data, using interfaces that rely on either no technical markup or very simple markup, such as hashtags and handles. A survey of diverse research outputs in a range of fields, as well as a critical understanding of social media tools for categorisation, tagging and linking, shows that these digital methods can be applied to a wide range of traditional research methodologies. These methodologies include both close and distant readings of primary materials that have been digitised.

The history of DH in traditional fields shows a significant delay of 5–15 years in the adoption of the dominant information technology movements into its research practices, which is perhaps not surprising given the constraints of funding and tradition on the disciplines it serves. This delay suggests that the emerging trends in the digital society of the last 5–10 years should now be ripe for integration into DH methodologies. Crowdsourcing projects, such as those hosted on Zooniverse, show that this approach can be applied to simple methodologies, such as transcription and semantic analysis within narrow categories and primary source types. The emerging challenge is to generalise these approaches to much larger corpora and highly complex ontologies. This will allow for methodologies and analyses to be compared across research projects.

The author's experience in developing digital research interfaces for both senior and emerging researchers demonstrates that researchers can effectively engage with methodologies comparable to those in social media without technical training. The key in all cases is to provide interfaces which allow researchers to pursue traditional readings and analyses using digital methods. The added value of the digital methods is that they can be scaled to much larger corpora, media and semantic fields.

There are a number of challenges if this vision is to be realised. The corpora themselves must be referenced consistently, despite a great deal of variation in editions, reproductions, and digitisations of the primary materials in many fields. Perhaps more importantly, the emerging ontologies must not only be compatible across media, disciplines, and corpora, they must also allow for the ongoing conflicts and debates about the very semantic and social fields that they encompass.

Disclosure statement

No potential conflict of interest was reported by the author.

ORCID

Tarrin Wills http://orcid.org/0000-0001-5360-3495


Notes on contributor

Tarrin Wills (PhD Sydney) is lecturer in English at the University of Sydney, on secondment from the Centre for Scandinavian Studies at the University of Aberdeen. He has made numerous contributions to the fields of Digital Humanities and Old Norse studies, including extensive involvement in the projects Skaldic Poetry of the Scandinavian Middle Ages, Pre-Christian Religions of the North, The Medieval Nordic Text Archive (Menota) and the Medieval Unicode Font Initiative (MUFI). In 2016, he will be taking up a Horizon 2020 Marie Curie fellowship at the University of Copenhagen.


References

Bullough, V. L., Lusignan, S., & Ohlgren, T. H. (1974). Report: Computers and the medievalist. Speculum: A Journal of Mediaeval Studies, 392–402. doi:10.2307/2856091
Caleffi, P.-M. (2015). The 'hashtag': A new word or a new rule? SKASE Journal of Theoretical Linguistics, 12(2), 46–70.
Chamberlin, D. D., & Boyce, R. F. (1974). SEQUEL: A structured English query language. In Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on data description, access and control. New York, NY: ACM.
Christiaens, S. (2006). Metadata mechanisms: From ontology to folksonomy… and back. In R. Meersman, Z. Tari, & P. Herrero, et al. (Eds.), On the move to meaningful internet systems 2006: OTM 2006 workshops. Berlin Heidelberg: Springer.
Clifford, K. (2013). Sisterly subjects: Brother-sister relationships in female-authored domestic novels, 1750–1820 (PhD thesis). University of Sydney, Sydney.
Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of the ACM, 13(6), 377–387. doi:10.1145/362384.362685
Gilmour-Bryson, A. (Ed.). (1984). Computer applications to medieval studies. Kalamazoo: Western Michigan University.
Jónsson, F. (Ed.). (1912–1915). Den norsk-islandske skjaldedigtning. Copenhagen: Villadsen & Christensen.
Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of social media. Business Horizons, 53(1), 59–68. doi:10.1016/j.bushor.2009.09.003
McIntyre, E. (2013). God's comics: Religious humour in contemporary evangelical Christian and Mormon comedy (PhD thesis). University of Sydney, Sydney.
Odat, S., Groza, T., & Hunter, J. (2015). Extracting structured data from publications in the art conservation domain. Literary and Linguistic Computing, 30(2), 225–245.
Porter, D. (2013). Medievalists and the scholarly digital edition. Scholarly Editing, 34, 1–26.
Richards, G. (2005). From footnotes to narrative: Welsh noblewomen in the thirteenth century (PhD thesis). University of Sydney, Sydney.
Robards, A. (2014). Repetition, revision, appropriation and the Western (PhD thesis). University of Sydney, Sydney.
Robinson, P. (2013). Towards a theory of digital editions. Variants: The Journal of the European Society for Textual Scholarship, 10, 105–131.
Wills, T. (2013). Relational data modelling of textual corpora: The skaldic project and its extensions. Literary and Linguistic Computing. doi:10.1093/llc/fqt045
Wills, T. (2014). Semantic modelling of the Pre-Christian Religions of the North. Digital Medievalist, 9.
Yang, L., Sun, T., Zhang, M., & Mei, Q. (2012). We know what @you #tag: Does the dual role affect hashtag adoption? In Proceedings of the 21st international conference on World Wide Web. New York, NY: ACM.


Digital Resources

Internet Archive
Facebook
General Architecture for Text Engineering (GATE)
Google Books
Oxford Text Archive
Pinterest
Pre-Christian Religions of the North
Project Gutenberg
The Skaldic Project
The Text Encoding Initiative
Twitter
Wikipedia
Zooniverse

