Qualitative Text Analysis Supported by Conceptual Data Systems


Quality & Quantity 33: 135–156, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands.


KARSTEN MACKENSEN¹ and UTA WILLE²*

¹ Institut für Musikwissenschaft, Humboldt-Universität, 10099 Berlin, Germany
² Zentrum für Umfragen, Methoden und Analysen (ZUMA), Postfach 122155, D-68072 Mannheim, Germany

* Author for correspondence.

Abstract. Content analysis as a method in social sciences is used to systematically explore textual data. Data resulting from content analysis can be made transparent by saving it in a conceptual data system. This supports its interpretation and reexamination and the process of interpretative theory building. By means of an example of a conceptual data system from musicology, the possibilities and restrictions of this new approach in computer-aided qualitative text research are analyzed. Finally, the approach is discussed as a general method of qualitative formal theory building in the context of content analysis.

Key words: computer-aided text analysis, Formal Concept Analysis, conceptual data system, data-oriented theory building.

1. Introduction

Computer-aided techniques for the management, coding, retrieval, and analysis of textual data are of increasing importance in social science text research. Computer systems especially designed for text analysis allow the researchers to handle and to systematically organize huge amounts of textual data. They provide enhanced coding and retrieval techniques, and include various statistical tools for hypothesis testing. But the implemented systems usually do not support researchers coming from an interpretative, hermeneutic or inductive tradition of text analysis to a sufficient extent in their process of developing qualitative theory (cf. Kelle, 1996). Therefore, we propose an approach to computer-aided text analysis based on ‘Formal Concept Analysis’ which especially tries to support the iterative process of formulating categories and the process of data-oriented theory building typical for qualitative text analysis.

It should be emphasized that, using the term ‘qualitative’, we refer to a rather inductive and interpretative approach to text analysis which develops theory from the data. A more hypothetico-deductive approach focusing on hypothesis testing can be supported as well by exploring gathered data with the help of ‘conceptual data systems’; but a principal strength of our approach lies in its possibilities for supporting the iterative process of developing theory from empirical data.

In Section 2 the basic notions and ideas of Formal Concept Analysis are introduced from the perspective of the requirements arising from text analysis. How qualitative text analysis can be supported by using Formal Concept Analysis is the leading question addressed in Section 3. We explain in detail how to proceed while building up a conceptual data system in text research. The individual steps are demonstrated and discussed by means of a musicological study. In Section 4 we point out the specific properties of the introduced approach based on Formal Concept Analysis by sketching the state of the art in computer-aided text analysis. Finally, we discuss to what extent conceptual data systems combined with implemented systems for computer-aided text analysis provide a flexible tool to support data-oriented theory building.

2. Conceptual Data Systems

A data-oriented approach to text analysis as sketched in the previous section presupposes a high level of transparency in the coding of the textual data and the process of building theories. In particular, the iterative process of developing categories requires a comprehensive and differentiated conceptual structuring of content. The consequences of certain preliminary definitions of categories with reference to the categorization of the text passages have to be revealed in order to support the researcher in her/his thinking, but also to make the process of theory generation intersubjectively comprehensible. By building up a conceptual data system using content abstracted from the data under study, a flexible tool can be provided to support the development of theories by unfolding conceptual structure and differentiated relationships inherent in the data.

Conceptual data systems have been developed in the field of Formal Concept Analysis to represent conceptual knowledge arising from data and to support their interpretation. In connection with the management system ‘TOSCANA’ they provide a flexible tool of knowledge communication which is supposed to support ‘human communication and argumentation to establish intersubjectively assured knowledge’ (R. Wille, 1997: 6). In the following paragraphs, the basic ideas of Formal Concept Analysis and conceptual data systems in qualitative text analysis are outlined (see R. Wille and Ganter, 1996; R. Wille, 1982).

Formal Concept Analysis is a mathematical theory based on a set-theoretical formalization of concept and conceptual hierarchy reflecting the philosophical understanding of a concept as a unit of thought constituted by its extension and its intension. The extension comprises all objects belonging to the concept while the intension consists of all attributes valid for all those objects. Since the extension and intension of concepts often cannot be comprehensively reported, Formal Concept Analysis with respect to concrete applications always starts from a restricted ‘(formal) context’ consisting of ‘objects’ and ‘attributes’.
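These notions can be illustrated by a minimal Python sketch under stated assumptions: the toy context below (three text passages `t1` to `t3` and three categories) is invented for illustration and does not come from the study, and the function names `common_attributes`, `common_objects`, and `all_concepts` are hypothetical. The sketch derives the attributes valid for a set of objects, the objects having a set of attributes, and, by brute force over all attribute subsets, every pair of an object set and an attribute set that determine each other in this way.

```python
from itertools import combinations

# Hypothetical toy context: which text passage is coded with which category.
CONTEXT = {
    "t1": {"simplicity", "immediacy"},
    "t2": {"simplicity"},
    "t3": {"simplicity", "immediacy", "sublime"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Attributes valid for all given objects (all attributes if none given)."""
    result = set(ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return result


def common_objects(attributes):
    """Objects that have every one of the given attributes."""
    return {obj for obj, attrs in CONTEXT.items() if set(attributes) <= attrs}


def all_concepts():
    """Brute force: close every attribute subset; suitable for tiny contexts only."""
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            extent = common_objects(subset)
            intent = common_attributes(extent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


for extent, intent in sorted(all_concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

The brute-force enumeration grows exponentially with the number of attributes; it is meant only to make the definitions concrete, not to suggest how a conceptual data system computes its lattices.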

[Figure 1: cross table of the formal context. Attributes (columns): Peter, Andrew, James, John, Philip, Bartholomew, Thomas, Matthew, James Alphaeus, Thadaeus, Simon, Judas. Objects (rows):
1. Four fishermen called as disciples
2. The miraculous draught of fish
3. A sinful woman forgiven
4. Feeding the five thousand
5. Peter confesses Jesus as Christ
6. The death of Lazarus
7. The fruitful grain of wheat
8. The way, the truth and the life
9. Jesus promises another helper
10. Seeing and believing
11. The anointing at Bethany
12. Jesus announces the betrayal of Judas
13. Jesus predicts Peter's denial
14. Betrayal and arrest in Gethsemane
15. Jesus faces the Sanhedrin
16. Peter denies Jesus, and weeps bitterly
17. He is risen
18. Breakfast by the sea
19. Jesus restores Peter]

Figure 1. A formal context about the Twelve Disciples in John’s Gospel.

In qualitative text analysis conceptual patterns unfolding the theoretical structuring of a collection of categories are obtained from formal contexts having the studied texts or text passages as objects and as attributes the categories considered; an object ‘text’ is said to have the attribute ‘category’ if the text is linked to the category by the coding. Now, the extent of a ‘formal concept’ of such a formal context consists of a set of texts or text passages and the intent of a set of categories. The extent comprises exactly the texts of the context linked to all categories of the intent and the intent contains exactly the categories of the context linked to all texts of the extent of the considered formal concept.

For example, Figure 1 shows a data context whose objects are passages in John’s Gospel from the New Testament and whose attributes are the Twelve Disciples of Christ; the crosses represent the object-attribute relation and indicate which disciple is mentioned explicitly in which passage according to evangelist John (this data context was derived from Peisker, 1979; see also R. Wille, 1996). With respect to this context the formal concept characterized by the explicit occurrence of the disciples Peter and Thomas has the extent {2, 18} and the intent {Peter, Thomas, John, James}; i.e., Peter and Thomas are explicitly mentioned in the text passages 2 and 18, where Peter, Thomas, John, and James are the disciples mentioned in both text passages.

Let us emphasize again that formal concepts are defined formally and always with reference to a previously determined formal context. Furthermore, the formal concepts of a context are structured hierarchically by the subconcept-superconcept relation defined by: the extent of a subconcept is contained in the extent of the superconcept. The formal concepts of a context can be effectively computed. Together with the subconcept-superconcept relation they form the mathematical structure of a lattice, called the ‘concept lattice’ of the underlying context. This definition turned out to be very fruitful because concept lattices of contexts can be visualized by ‘line diagrams’, which support the interpretation and communication of the underlying data.

Figure 2 shows the line diagram of the concept lattice of our John’s Gospel context. The circles represent the 13 formal concepts of the gospel context, while the line segments ascend from subconcepts to superconcepts. Furthermore, the name of an object always is attached to the smallest concept having the object in its extent, and the name of an attribute is attached to the largest concept having the attribute in its intent. Then the extents and intents of the individual formal concepts can be read from the line diagram as follows: the extent (intent) of a formal concept consists of all objects (attributes) whose names are attached to the circle of the concept or to a circle derivable by a descending (ascending) path from the circle of the concept (e.g., the circle at the left-hand side of the diagram represents the concept with the extent {12, 14} and the intent {Judas, Peter}).

[Figure 2: line diagram whose circles are labeled with the names of the disciples and the passage numbers 1 to 19.]

Figure 2. The concept lattice of the context in Figure 1.

Additionally, this representation enables us to reconstruct from it the complete underlying context: namely, an object has an attribute if and only if the names of the object and the attribute are attached to the same circle or if there exists an ascending path from the circle labeled with the name of the object to the circle labeled with the name of the attribute. For example, the object ‘8. The way, the truth and the life’ has the attributes ‘Thomas’ and ‘Philip’. Furthermore, from the line diagram one can read that the passage ‘9. Jesus promises another helper’ is the only one in which the disciple Thadaeus is mentioned and that the occurrence of James is always accompanied by that of John, Peter, and Thomas. The disciples James Alphaeus, Simon, Bartholomew, and Matthew are nowhere explicitly mentioned in John’s Gospel.

This example based on the Twelve Disciples of Christ in John’s Gospel should serve merely to explain the basic notions of Formal Concept Analysis. How Formal Concept Analysis can support the process of qualitative text analysis in concrete hermeneutic research, and the kind of procedure we propose, are demonstrated in the following by means of a study from musicology.

3. The Procedure of Qualitative Theory Building

Briefly summarized, qualitative text analysis consists of a systematic process of formulating and exploring classificatory categories, and an analysis and interpretation of the relationships between the defined categories. In view of the support for this process by Formal Concept Analysis we propose to start with an abstraction of content from the textual data under study by formulating ‘indexing attributes’ and coding the texts with respect to the established attributes. Subsequently, the actual process of interpretative theory development can be strongly supported by storing the gathered data in a conceptual data system. The coding with respect to the attributes can be made transparent, the categories can be defined as combinations of those attributes, and their relationships can be studied by looking at conceptual networks of categories with respect to the underlying text passages. Thus, the basic idea of our approach is to break down the procedure of qualitative text analysis into the following four steps, where a back-and-forth between the individual steps is always possible:

(1) Formulation of relevant attributes;
(2) Coding of the textual data with respect to the attributes;
(3) Process of formulating and exploring categories;
(4) Analysis of relationships among categories.

To demonstrate our approach concretely, we first introduce a study from the aesthetics and sociology of music having a research question that is typically hermeneutic in character. Then the individual steps of the proposed procedure are described more comprehensively; they are discussed and illustrated by applying them to the introduced study.

3.1. A STUDY IN MUSICOLOGY

In music and music aesthetics of the 18th century a phenomenon can be observed that may be described as a general tendency to a certain simplicity or simplification. This tendency can be found in different forms in various fields connected with the aesthetics of music. The investigation ‘Simplicity as an Aesthetic Category in Music and Literature about Music in 18th Century Germany’ deals with a variety of aspects, for example,

− In what way is the phenomenon discussed in printed texts?
− How are certain central terms like ‘noble simplicity’ used?
− Does the term’s content change during the period we are dealing with?
− How are ‘Einfalt’ (‘simplicity’), ‘edle Einfalt’ (‘noble simplicity’), and ‘Naivität’ (‘naivety’) related to each other, and in which contexts are they used?

These questions led to considerations concerning the discussion of the concrete simplification of music and the specific form of simplification, and its technical realization in compositions of that time. Which kind of music is ‘simplicity’ applied to? Another point of interest is a certain simplification both of music and of the way it is taught: for instance, one can observe a rapid increase of printed instructions for playing instruments at that time. Additionally, there is a connection between simplicity and the popularization of music for a growing group of ‘dilettanti’, that is, for amateurs of music.

To approach this broad field of aspects and phenomena, we decided to develop a ‘conceptual history’ (‘Begriffsgeschichte’; see Richter, 1987) that shows principal classificatory categories, their changes and progress, and the development of the content of specific terms. For this purpose, it seemed methodologically appropriate to develop a theory about the role of ‘simplicity’ and related phenomena in 18th century aesthetics by carrying out a systematic lexical-semantic analysis of contemporary texts.

3.2. FORMULATION OF RELEVANT ATTRIBUTES

By activating Formal Concept Analysis one may obtain a flexible tool to extensively explore and retrieve data by revealing its structure in line diagrams of conceptual networks. In order to utilize these possibilities of Formal Concept Analysis for the process of data-driven theory building in text research, one has to decide how to access the underlying textual data. A ready-made scheme of categories is usually the result of the research process and not available in advance. We propose to start with an abstraction of content from the underlying documents by coding them with respect to ‘attributes’ relevant for the phenomena under study (in order to avoid confusion we sometimes also call them ‘indexing attributes’). Thereby attributes can be catchwords, extended keywords, little summaries, or anything which serves to make the data accessible if the coding has the function of assigning indices.

Attributes are formulated from the researcher’s theoretical preconception and a first reading of a reduced number of texts. Compared with categories they do not have to be rich in content or substantial in their theoretical meaning with reference to the research question. They should enable a differentiated and fine-grained abstraction of the content and make the data better accessible, while categories are the basic theoretical concepts developed in the process of generating a qualitative theory about phenomena under study. Furthermore, the attributes should be the elements that come to constitute the categories defined during an iterative process of theory development.

In this study of music aesthetics, 192 texts from 18th century journals and books have been explored with respect to the occurrence of twenty rather general fields of interest, divided up into about 400 attributes in the explained sense. Most of the contributions considered are short essays on topics of aesthetic relevance, reviews of both music and books, or translations from the French, Italian, or English. They provide a sort of survey of the 18th century state of the art in aesthetics and reflect all the important contemporary discussions. In some of the debates the authors or editors were personally involved. Many of them were both composers and journalists, some of them highly specialized music-theorists.

The attributes ordered by fields of interest were defined after a first careful reading of a reduced number of texts, with the objective of getting familiar with the subject and developing a certain feeling for relevant and irrelevant topics, keywords, aspects, opinions, statements, etc. Of course, this took place with an intimate knowledge of the 18th century in general and of certain theories about developments and trends well established in the field of musicology as theoretical background. Some of the formulated attributes are standard; for instance, year, type, and place of publication were recorded. Other attributes are keywords which are interesting with respect to their literal occurrence. In order to abstract relevant content from the data, many attributes are catchwords or short summaries of certain statements.
The following example will provide an impression of the character of the attributes relevant for the topic ‘statements concerning immediacy’:

(5) Äußerungen zur Unmittelbarkeit
(a) Das Bezeichnete soll stärker und deutlicher als das Zeichen sein, um Illusion zu ermöglichen
(b) Statt Beschreibung direkter Ausdruck des (musikalisch-affektiven) Gegenstandes gefordert
(c) Ergießung/Sprache des Herzens und vergleichbare Formulierungen positiv
(d) Unmittelbarkeit der Töne (Musik) selbst statt Ausdruck eines Affektes oder Gegenstandes
(e) Positive Bewertung spontaner Gedanken in Komposition

There is, for instance, the summarizing attribute 5(a): ‘the signified should be stronger and more distinct than the sign to make illusion possible’. The passage from which this attribute is derived reads: ‘If the artist wants to evoke illusion he should strive to draw clear pictures into our imagination in such a way that the signified comes along more vividly than the sign. [...] so that we believe not to see an illusion, but the thing itself.’ (‘Will also der Artist durch seine Werke Illusionen hervorbringen, so darf er sich nur bemühen, anschauende Bilder in unsere Phantasie zu mahlen und so zu mahlen, daß das Bezeichnete sinnlicher und lebhafter, als das Zeichen, gedacht wird. [...] daß wir nicht glauben, die Vorstellung, sondern die Sache selbst zu sehen’; Riedel, 1767: 152).

The formulation of such attributes with summaries of important statements is an act of interpretation; but let us emphasize again that, unlike categories, the attributes do not have to be of strong theoretical meaning or substance. They mainly serve to abstract content from the underlying data and to establish indices. Categories based on combinations of attributes are developed and defined later after the data have been coded with respect to the attributes.

3.3. CODING OF THE TEXTUAL DATA WITH RESPECT TO THE ATTRIBUTES

After establishing the relevant attributes, all texts are coded with respect to the attributes; while reading the documents it is recorded which text passage has which attribute. On this level the researcher already can be supported by building up a conceptual data system based on the coding data. Namely, the coding can be made transparent by considering data contexts that consist of the investigated text passages as objects and the previously formulated ‘indexing attributes’ as attributes while the object-attribute relation is determined by the coding which links texts to attributes. The line diagrams we obtain from the concept lattices of those contexts unfold the structure of the coding; they can reveal possible inconsistencies of the coding, and furthermore they make the coding criticizable by others.

Especially with a data-oriented research design where hypotheses and theory are developed from the data, a flexible, transparent, well structured, and efficient retrieval of the raw data is of great importance. That can be provided nicely if the associated conceptual data system is built on top of some computer program designed for computer-aided content analysis like NUD•IST, Atlas/ti, or Textpack (see Weitzmann and Miles, 1995). Unfortunately, a link between conceptual data systems and a system for text analysis is not yet technically implemented. For that reason the established links between texts and attributes have to be stored in some database management system. Then a conceptual data system can be established using this database; that was done in the musicology study. The 192 texts were coded in the classical way with respect to the formulated attributes; for every text, the attributes recorded in the text were stored in an MS Access database. Using this database, a conceptual data system was established.
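The storage step can be sketched as follows. This is an illustrative sketch only: Python's built-in `sqlite3` module stands in for the MS Access database used in the study, and the table name `coding`, its columns, the text identifiers, and the sample rows are assumptions made for the example. Each row records one link between a text and an attribute; the formal context is then the set of attributes grouped by text.

```python
import sqlite3
from collections import defaultdict

# In-memory database standing in for the study's MS Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coding ("
    " text_id TEXT NOT NULL,"
    " attribute TEXT NOT NULL,"
    " PRIMARY KEY (text_id, attribute))"
)

# Hypothetical coding results: (text, attribute label) pairs.
rows = [
    ("text_001", "4,a"),
    ("text_001", "4,b"),
    ("text_002", "4,b"),
    ("text_003", "5,a"),
]
conn.executemany("INSERT INTO coding VALUES (?, ?)", rows)


def load_context(connection):
    """Return the formal context as {text_id: set of attribute labels}."""
    context = defaultdict(set)
    for text_id, attribute in connection.execute(
        "SELECT text_id, attribute FROM coding"
    ):
        context[text_id].add(attribute)
    return dict(context)


context = load_context(conn)
print(context["text_001"])
```

The composite primary key is a design choice of the sketch: it prevents the same link between a text and an attribute from being recorded twice.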
The line diagram in Figure 3 visualizes the coding with reference to a reduced choice of five attributes from the fields of ‘Einfalt’ (‘simplicity’) and ‘Leichtigkeit’ (which is a rather ambiguous term meaning easiness, lightness, and/or simplicity). In this visualization of a concept lattice, only the number of texts having certain attributes is recorded.

[Figure 3: line diagram titled ‘Merkmale zu Einfalt und Leichtigkeit’ with the attribute labels ‘Leichtigkeit pos. rezeptionsästhetisch unabhängig von realer Schwierigkeit der Hervorbringung’, ‘Einfalt pos. bedeutet Reduktion harmonischer und polyphoner Komplexität’, ‘Einfalt pos. verlangt einfache musikalische Strukturen’, ‘Leichtigkeit pos. als Ausführbarkeit’, and ‘Einfalt dient dem musikalischen Ausdruck (pos.)’; each circle carries the number of texts.]

Figure 3. Coding of five attributes concerning ‘Einfalt’ or ‘Leichtigkeit’.

For example, at the left-hand side of the diagram we find the attribute ‘‘Leichtigkeit’, positively connotated; independent from real difficulties’ (‘Leichtigkeit pos. rezeptionsästhetisch unabhängig von realen Schwierigkeiten der Hervorbringung’). From the diagram it can be seen that there are 17 documents containing this attribute; namely the 10 + 4 + 2 + 1 texts derivable by a descending path from the circle the attribute is attached to. Using the computer, the authors and titles of the individual texts can be obtained by clicking on the number in question; it is also possible to implement the system such that one can also retrieve the underlying texts or text passages.

In the middle of the diagram we observe that every document in which it is claimed that ‘simplicity, positively connotated, means reduction of harmonic and polyphonic complexity’ (‘Einfalt pos. bedeutet Reduktion harmonischer und polyphoner Komplexität’) also has the attribute ‘simplicity, positively connotated, requires simple musical structures’ (‘Einfalt pos. verlangt einfache musikalische Strukturen’). Such implications of attributes with respect to the data can easily be read from the diagram and every implication raises the question whether it is indeed plausible. Implications should be explainable and the texts having the second but not the first attribute can be of special interest because they separate the two considered attributes. In the above diagram, for instance, there are 4 texts asserting that ‘simplicity, positively connotated, requires simple musical structures’ which do not also claim that ‘simplicity, positively connotated, means reduction of harmonic and polyphonic complexity’. In this case, the implication appears to be plausible. It indicates that the second attribute is more general; the 4 mentioned documents claim the need for simple musical structures in general but they do not explicitly demand the reduction of harmonic and polyphonic complexity.
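Reading an implication from the diagram amounts to a subset test on the coded data. The sketch below assumes a context stored as a dictionary from text identifiers to attribute sets; the identifiers, the shortened attribute names `reduction`, `simple_structures`, and `executable`, and the data are hypothetical. It checks whether every text with the first attribute also has the second and lists the texts that separate the two attributes.

```python
# Hypothetical coded data: text id -> set of attributes found in the text.
context = {
    "a": {"reduction", "simple_structures"},
    "b": {"reduction", "simple_structures"},
    "c": {"simple_structures"},
    "d": {"executable"},
}


def texts_with(attribute):
    """All texts coded with the given attribute."""
    return {text for text, attrs in context.items() if attribute in attrs}


def implies(premise, conclusion):
    """True if every text having `premise` also has `conclusion`."""
    return texts_with(premise) <= texts_with(conclusion)


def separating_texts(premise, conclusion):
    """Texts having `conclusion` but not `premise`."""
    return texts_with(conclusion) - texts_with(premise)


print(implies("reduction", "simple_structures"))
print(sorted(separating_texts("reduction", "simple_structures")))
```

An implication found this way holds only with respect to the coded data; whether it is plausible remains a matter of interpretation, as described above.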


The above example demonstrates how the coding of the data can be made transparent by activating conceptual data systems, and how questions concerning the coding, its correctness, and the formulation of the attributes arise from considering concept lattices of the coding data. Often these questions are already related to theoretical questions that emerge in this way. For example, there are many texts dealing with a positive connotation of ‘Leichtigkeit’ in the sense of ‘easily executable’ (‘Leichtigkeit pos. als Ausführbarkeit’). Now, it is interesting that there seem to be only a few connections between such a positive evaluation of easiness of execution and a demand for decreasing harmonic or melodic difficulty leading to simplicity (‘Einfalt pos. bedeutet Reduktion harmonischer und polyphoner Komplexität’). Twelve texts plead for a certain simplification of harmonic or melodic structures and 37 documents contain positive remarks on ‘‘Leichtigkeit’, positively connotated, means easily executable’. But there are just five texts that contain positive remarks on both attributes. Of course, the numbers of co-occurrences should be interpreted carefully but at least they force us to have a closer look at the phenomenon. One has to go back to the raw data in order to investigate the theoretical meaning of this small number of co-occurrences or to check whether an error was made during the coding or the formulation of the attributes. This kind of data exploration can be managed in a more comfortable way if the conceptual data system is built on top of a system for computer-aided text analysis, which is explained in further detail in Section 3.5.
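Co-occurrence counts of this kind can be obtained directly from the coded data. The following is a minimal sketch with hypothetical text identifiers and shortened attribute names; the numbers it produces are not those of the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded data: text id -> attributes recorded for the text.
context = {
    "t1": {"leichtigkeit_executable"},
    "t2": {"leichtigkeit_executable", "einfalt_reduction"},
    "t3": {"einfalt_reduction"},
    "t4": {"leichtigkeit_executable"},
}

support = Counter()        # number of texts per attribute
co_occurrence = Counter()  # number of texts per unordered attribute pair

for attrs in context.values():
    support.update(attrs)
    # Sorting makes each unordered pair appear under one canonical key.
    for pair in combinations(sorted(attrs), 2):
        co_occurrence[pair] += 1

pair = ("einfalt_reduction", "leichtigkeit_executable")
print(support["leichtigkeit_executable"], support["einfalt_reduction"],
      co_occurrence[pair])
```

A small co-occurrence count is only a pointer back to the raw texts, not a result in itself.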

3.4. PROCESS OF FORMULATING AND EXPLORING CATEGORIES

A key part of interpretative text analysis is the process of formulating and exploring classificatory categories. In contrast to the differentiated attributes used for the coding, categories are fundamental theoretical concepts of the generated theory. Usually they are developed in an iterative process of theorizing on the one hand and analyzing the data on the other. That part of qualitative theory building can be extensively supported by utilizing methods of Formal Concept Analysis. As mentioned above, we assume that the categories can be understood as combinations of attributes. More specifically, we consider categories to be defined as combinations of attributes and attribute values which can be obtained in Formal Concept Analysis by conceptual and logical scaling (see Ganter and R. Wille, 1989; Prediger, 1997). Practically, this is more or less everything that can be generated from the attributes using the Structured Query Language (SQL) which is usually available in relational database management systems. In order to work out substantive categories, in a first attempt preliminary categories are defined from the attributes. Then, using methods of Formal Concept Analysis, the described process of category development can be supported by studying systematically the consequences and impacts of those provisional categorizations. The preliminary definitions can be revealed, explored, modified, and adapted with the help of the associated conceptual data system.
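How a preliminary category might be generated from attributes with SQL can be sketched as follows. The sketch assumes that a category is the disjunction of its defining attributes, which is how the study defines its ‘Einfalt’ categories; the category name ‘Einfalt des Erhabenen’ and its attribute labels ‘4,n’ and ‘4,o’ are taken from the study, whereas the use of Python's `sqlite3` module in place of MS Access, the table layout, the text identifiers, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coding (text_id TEXT, attribute TEXT)")
# Hypothetical coding rows; only the attribute labels follow the study.
conn.executemany(
    "INSERT INTO coding VALUES (?, ?)",
    [
        ("text_001", "4,n"),
        ("text_002", "4,a"),
        ("text_003", "4,o"),
        ("text_003", "4,n"),
    ],
)

# Category as disjunction: a text belongs if it has at least one attribute.
CATEGORY_DEFINITIONS = {
    "Einfalt des Erhabenen": ["4,n", "4,o"],
}


def texts_in_category(connection, category):
    """Select the texts coded with any of the category's defining attributes."""
    attributes = CATEGORY_DEFINITIONS[category]
    placeholders = ", ".join("?" for _ in attributes)
    query = (
        "SELECT DISTINCT text_id FROM coding "
        f"WHERE attribute IN ({placeholders}) ORDER BY text_id"
    )
    return [row[0] for row in connection.execute(query, attributes)]


print(texts_in_category(conn, "Einfalt des Erhabenen"))
```

Keeping the definitions in one dictionary makes a provisional category easy to revise: changing its attribute list and rerunning the query shows the consequences for the assignment of texts.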

[Figure 4: line diagram titled ‘Merkmale von Einfaltskategorien’ showing the categories ‘Einfalt positiv allgemein’, ‘E. zugunsten Unmittelbarkeit’, ‘Einfalt des Erhabenen’, and ‘Forderung edler Einfalt allgemein’ together with the labels of their defining attributes, grouped as follows: 4,n and 4,o; 11,h, 4,a, 4,e, and 4,g; 4,b, 4,f, and 4,h; 10,f, 4,ab, 4,ae, 4,an, 4,av, 4,i, 4,w, and 4,d.]

Figure 4. Definitions of four categories concerning ‘Einfalt’.

The structure of the category definitions can be revealed by visualizing the concept lattices of formal contexts obtained as follows: choose some categories (of related content) as attributes of the formal context and all ‘indexing attributes’ defining the considered categories as context objects. Then the object-attribute relation can be defined depending on the specific types of logical combination used to define the categories from the attributes. Using the derived line diagrams, the relationships between the individual definitions of the preliminary categories can be discussed, and it can be seen how the attributes are included. In the musicology study, for example, the following four categories concerning ‘simplicity’ seemed to be of special interest:

‘Einfalt positiv allgemein’ – positive remarks on simplicity in general
‘Einfalt zugunsten Unmittelbarkeit’ – simplicity in favor of immediacy
‘Einfalt des Erhabenen’ – sublime simplicity
‘Forderung edler Einfalt allgemein’ – demand for noble simplicity in general.

Thereby, each category is defined by a set of attributes such that a text is in the category if and only if it has at least one of the attributes defining the category (i.e., the category can be considered as the disjunction of its defining set of attributes). Figure 4 visualizes the assignment of the attributes to the four ‘simplicity’ categories, where the attributes are represented by labels. The category ‘Einfalt des Erhabenen’, for instance, is defined by the attributes ‘4,o’ and ‘4,n’and comprises positive remarks on this specific concept which emerges – as ‘simplicité du sublime’ – from the aesthetic debate in 17th century France. The category ‘Einfalt zugunsten Unmittelbarkeit’ is defined by the attributes ‘4,a’, ‘4,e’,‘4,g’, ‘11,h’, ‘4,b’, ‘4,f’, and ‘4,h’. Texts having the attribute

146

KARSTEN MACKENSEN AND UTA WILLE Einfalt positiv

95

E. positiv allgemein 32 E. zugunsten Unmittelbarkeit E. des Erhabenen

Reichardt: An junge Künstler

Hiller: Ueber die Musik Krause: Schreiben an den Herrn Marquis von B.

Forderung edler E. 20

19

17 Zx.: The celebrated Stabat Mater [Rez.]

Hiller (Übers.): Ueber die Musik, ihre Gewalt, Grundsätze, Endzweck u.s.w. Hiller: [ohne Titel. Über das Dictionnaire von Rousseau] Lecerf: Der Französische Anwald Riedel: Natur, Simplicität und Naivete [Anon.]: Etwas über Gluckische Musik

Figure 5. Categories concerning ‘Einfalt’ with reference to the data.

‘4,b’, ‘4,f’, or ‘4,h’ also belong to the categories ‘Forderung edler Einfalt allgemein’ and ‘Einfalt positiv allgemein’. Note that the hierarchical structure of the defined categories can also be read from such a diagram. It can be seen that the category ‘Einfalt positiv allgemein’ is a super-category of the three other ones, because the categories are defined as disjunctions of attributes and every attribute belonging to one of the three subcategories also defines ‘Einfalt positiv allgemein’. This hierarchy therefore holds by definition of the categories, independently of the data. Furthermore, hierarchies of categories with respect to the data can arise from the coding, like the implications of attributes discussed in the previous section. Those implications of categories have to be interpreted because they might not be intended and can be due to inappropriate definitions of the categories. Therefore, within the process of category development, the provisional definitions of categories should also be explored with respect to the underlying textual data. The impact of such preliminary definitions can be studied by looking at the concept lattices of formal contexts that have the text passages as objects and, as attributes, sets of categories grouped together by themes. This conceptual unfolding of the consequences of the category definitions strongly supports theory building. Figure 5 shows the consequences of the above definitions for the four categories concerning the aesthetic term ‘Einfalt’. This line diagram facilitates scrutiny as to
the relevance of the proposed categories. Surprisingly, only a few texts deal with the concept of sublime simplicity, which is, according to Henn (1974), the most favored argument in Germany from the French aesthetic debate about simplicité and naiveté. Perhaps, then, this specific category has to be modified, refined, or eliminated, as described before, in an iterative process of closely and carefully rereading the texts, modifying the category, going back to the texts, and so on. From the mere numbers we cannot infer the significance of the individual documents within the aesthetic and historic process, nor their reception, distribution, or influence. Again, we have to take a close look at the sources. Even their titles show the French impact: three out of nine (Lecerf, Hiller: ‘ohne Titel’, and Riedel) are translations from the French or give an account of French texts. The influence and impact of the texts vary manifestly: a review of Joseph Haydn’s ‘Stabat Mater’ by an anonymous ‘Zx’ could not have had the same impact as the essay ‘Natur, Simplicität und Naivete’ (‘Nature, Simplicity and Naivety’) by Friedrich Just Riedel, a professor of aesthetics in Erfurt, or the translation of parts of Jean-Jacques Rousseau’s ‘Dictionnaire de musique’ by Johann Adam Hiller. This ‘Dictionnaire’ was one of the standard works in musical lexicography in the second half of the 18th century. Hiller translated important articles from it immediately after its publication in the year 1768. By re-reading the texts we find a more specific differentiation of sublime simplicity. In some texts there is a demand for it without further detailed explanation; the term then seems to mean, rather vaguely, anything simple that is not ‘common’ or related to the music of people of lower social status. Other statements refer to the paradigm of the ‘fiat lux’ and to the author Nicolas Boileau, who was an advocate of the position of the ‘ancients’ in the ‘querelle des anciens et des modernes’ in the 17th century.
His specific concept means expressing a sublime subject in very simple words, as with ‘Then God said, “Let there be light”; and there was light’ (Genesis 1.3). It was in fact, like the ‘querelle’ in general, a concept with far-reaching consequences for aesthetics in Europe. So despite the fact that only a few texts refer to this category, it should not be eliminated. On the contrary, it turned out that the category has to be refined by dividing it into a demand for sublime simplicity either with or without regard to Boileau’s ideas. This shows especially well how the iterative process of formulating categories can be strongly supported by activating concept lattices in different ways.
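
The difference between a hierarchy that holds by definition and an implication that merely arises from the coding can be made concrete with a small sketch. The Python fragment below is illustrative only: the passages, the attribute codes ‘9,z’, ‘7,q’ and ‘1,a’, the third category, and all attribute sets assigned to the categories are invented for the example and are not the definitions used in the study.

```python
from typing import Dict, Set

# Toy coding data (invented): text passage -> attributes coded in it.
coding: Dict[str, Set[str]] = {
    "passage_1": {"4,b"},
    "passage_2": {"4,f", "9,z"},
    "passage_3": {"9,z"},
    "passage_4": {"4,h", "7,q"},
    "passage_5": {"1,a"},
}

# Categories as disjunctions of attributes (attribute sets are assumptions).
categories: Dict[str, Set[str]] = {
    "Forderung edler Einfalt allgemein": {"4,b", "4,f", "4,h"},
    "Einfalt positiv allgemein": {"4,b", "4,f", "4,h", "9,z"},
    "Hypothetical category X": {"7,q"},
}


def extent(category: str) -> Set[str]:
    """Passages carrying at least one attribute of the category."""
    wanted = categories[category]
    return {p for p, attrs in coding.items() if attrs & wanted}


def is_subcategory_by_definition(sub: str, sup: str) -> bool:
    """True if every defining attribute of `sub` also defines `sup`."""
    return categories[sub] <= categories[sup]


def implies_in_data(premise: str, conclusion: str) -> bool:
    """True if every passage in `premise` also falls under `conclusion`."""
    return extent(premise) <= extent(conclusion)


if __name__ == "__main__":
    for a in categories:
        for b in categories:
            if a != b and implies_in_data(a, b):
                kind = "by definition" if is_subcategory_by_definition(a, b) else "data only"
                print(f"{a} -> {b} ({kind})")
```

An implication reported as ‘data only’ is the kind that, following the argument above, has to be interpreted by going back to the texts, since it may be unintended or may point to an inappropriate category definition.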

3.5. ANALYSIS OF RELATIONSHIPS AMONG CATEGORIES

Of course, the process of formulating and exploring categories and the process of analyzing relationships among categories actually cannot be broken down into two wholly distinct and successive steps. Nevertheless, the process of defining, exploring, and adapting preliminary categories converges at some point to a rather stable definition of those categories. Then the formulation of the actual theory can be further supported by ‘navigating’ in the established conceptual data system
using the management system TOSCANA which has been developed at the TH Darmstadt (see Kollewe et al., 1994; Vogt and R. Wille, 1995). It is also fruitful to extend the established conceptual data system and to investigate the defined categories in different combinations under various aspects. Often it is already sufficient to view a suitable choice of categories together in order to obtain a line diagram which expresses the kernel of a substantial part of theory. This line diagram then is interpreted verbally and extensively by the researcher. Furthermore, it serves to make the developed theory transparent by making it comprehensible and by enabling other researchers to retrieve and to compare the textual data underlying the theory. Therefore, a main aim of this section is to introduce, to demonstrate, and to discuss the possibilities of the management system TOSCANA in supporting the development of theory. A conceptual data system in text research is always based on a database storing the data obtained by coding. In this study from music aesthetics, the coding information is stored in the relational management system MS-Access; for every text or text passage, there are recorded in the database those attributes which were coded in the individual texts. The data which are now accessible via the database management system then can be conceptually structured and explored by means of TOSCANA. For the design of the management system TOSCANA a fundamental observation was that, in data analysis and knowledge processing, one usually is concerned with questions that relate only a small number of attributes. That justified the strategy of considering concept lattices derived from a few attributes and attribute values, and presenting only combinations of a few line diagrams of such concept lattices. Figures 3 and 5 are examples of line diagrams of concept lattices derived from a few attributes by conceptual and logical scaling (cf. Ganter and R. Wille, 1996; Prediger, 1997). 
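
To illustrate how coding information of this kind can be held in a relational table and retrieved for a small number of attributes, the following sketch uses Python's built-in sqlite3 module in place of MS-Access. The table layout, column names, text identifiers, and attribute names are assumptions made for the example; the sketch does not reproduce the queries that TOSCANA itself generates.

```python
import sqlite3
from typing import Iterable, List


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory table with one row per (text, coded attribute)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coding (text_id TEXT NOT NULL, attribute TEXT NOT NULL)"
    )
    # Invented rows; they do not come from the musicology study.
    conn.executemany(
        "INSERT INTO coding (text_id, attribute) VALUES (?, ?)",
        [
            ("text_1", "literal_einfalt"),
            ("text_1", "refers_to_opera"),
            ("text_2", "literal_einfalt"),
            ("text_3", "refers_to_opera"),
            ("text_3", "refers_to_church_music"),
        ],
    )
    return conn


def texts_with_all(conn: sqlite3.Connection, attributes: Iterable[str]) -> List[str]:
    """Return the texts in which every one of the given attributes was coded."""
    wanted = sorted(set(attributes))
    if not wanted:
        # No restriction: every coded text qualifies.
        rows = conn.execute("SELECT DISTINCT text_id FROM coding ORDER BY text_id")
        return [row[0] for row in rows]
    placeholders = ", ".join("?" for _ in wanted)
    query = (
        "SELECT text_id FROM coding "
        f"WHERE attribute IN ({placeholders}) "
        "GROUP BY text_id "
        "HAVING COUNT(DISTINCT attribute) = ? "
        "ORDER BY text_id"
    )
    rows = conn.execute(query, (*wanted, len(wanted)))
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = build_demo_database()
    print(texts_with_all(connection, ["literal_einfalt", "refers_to_opera"]))
```

Because each question relates only a few attributes, one small query of this shape per node of a line diagram is enough to list the texts behind that node.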
Combinations of such diagrams can be presented using TOSCANA, which employs the Structured Query Language (SQL) to query the database. In general, conceptual data systems consist of a database and a collection of line diagrams of concept lattices derived by conceptual and logical scaling. Those line diagrams of the conceptual data system are called conceptual scales and can be prepared with the software ANACONDA, which has been developed by Frank Vogt (see also Vogt, 1996). The management system TOSCANA enables the exploration of the underlying data by providing a flexible tool for navigating and browsing within the conceptual scales of the conceptual data system. First, a sequence of conceptual scales has to be chosen. Then one can represent combinations of these conceptual scales by nested line diagrams, or one can refine present line diagrams by zooming into them. To zoom into a part of a present line diagram means to restrict the view to a part of it by activating the next diagram in the chosen sequence. Following the landscape paradigm of knowledge, ‘the basic idea of this navigation method lies in the conviction that each serious navigation is connected with a learning process of the user, who increasingly understands better how to specify what he is looking for.’ (R. Wille, 1997: 6) And that is more or less what we usually do while developing theory from empirical data. In the following paragraphs the basic tasks of conceptual data analysis with TOSCANA are further discussed and demonstrated using the musicology study.

[Figure 6: nested line diagram not reproduced. Outer scale ‘Erscheinungsjahr’ with attributes ‘vor 1700’ to ‘bis 1802’ and ‘von 1700 an’ to ‘nach 1802’; inner scale ‘Woertliches Vorkommen Einfalt interpretiert’ with attributes ‘wörtl. Vork. Einfalt’, ‘wörtl. Vork. positive attr. Einfalt’, ‘wörtl. Vork. edler Einfalt’, and ‘kein wörtl. Vork. Einfalt’; labeled texts by Mattheson and Hudemann.]

Figure 6. Nested line diagram of the literal occurrence of ‘Einfalt’ in different time periods.

For a ‘conceptual history’ it is necessary to look at the occurrence of specific terms or categories in different time periods. Therefore, one has to represent in one diagram information about the year of publication and the occurrence of the specific terms under discussion. Figure 6 shows a nested line diagram combining two conceptual scales that consist of texts and categories; the first scale represents the year of publication by periods of 20 years and the second one comprises the literal occurrence of terms concerning ‘Einfalt’, ‘positiv attr. Einfalt’ (i.e., ‘positively attributed simplicity’), and ‘edle Einfalt’. The line diagram representing the concept lattice of the year of publication constitutes the outer part of the nested line diagram, while the inner line diagram located in each ellipse is obtained from the conceptual scale concerning the literal occurrence of terms concerning ‘Einfalt’. The circles of the first line diagram are enlarged to ellipses, and in each ellipse a copy of the second line diagram is drawn. The nested line diagram can be read like an ordinary one if we replace all lines between ellipses by parallel lines between the corresponding circles of the inner diagrams. Only the black filled circles represent concepts of the combined concept lattice; the remaining circles merely help in reading the diagram. For example, we can read from the diagram that among the texts in the sample we are examining there are exactly 4 published between 1740 and 1759 which literally talk about ‘edle Einfalt’. Namely, there are ascending outer lines from the ellipse containing the circle corresponding to the number 4 up to the ellipses to which the attributes ‘von 1740 an’ and ‘vor 1760’ are attached.
Furthermore, there is an ascending inner line from the considered circle to the circle which is labeled with ‘wörtl. Vork. edler Einfalt’. We also observe that, within our chosen set of texts, the term ‘edle Einfalt’ does not occur at all before 1720, and that one out of four documents from the time between 1700 and 1719, a foreword to ‘Das Neu=Eröffnete Orchestre’ (‘The New “Orchestre”’) by Johann Mattheson from the year 1713, contains terms concerning ‘Einfalt’ but not ‘edle’ or any other positively attributed simplicity. That seems to be of importance if one considers the fact that in 1739 Mattheson uses the term in ‘Der vollkommene Capellmeister’ (‘The Perfect Master of Music’) quite often and as a matter of course. Since Mattheson was a Hamburg connoisseur of music and an important ‘homme de lettres’ well known all over Germany, his specific use of terms is a hint as to their character as established ‘topoi’ or aesthetic categories. In the following time periods one can observe an increasing use of ‘edle Einfalt’; for example, between 1780 and 1799 there are 19 out of 79 texts in our sample using the term. That would support the hypothesis about its establishment in the German discourse on the aesthetics of music of that era. Summarizing, one may say that there is a certain category defined by a literal occurrence of the term ‘edle Einfalt’, frequent from the 1740s on, in texts important for the aesthetic discussion. Further differentiations of course can be made, and have to be made, by looking at each single document with respect to other categories.

[Figure 7: two line diagrams not reproduced, each titled ‘Einfalt und Musikgattung’ with the attributes ‘bezogen auf Oper’, ‘bezogen auf Lied und KaM’, and ‘bezogen auf Kirchenmusik’.]

Figure 7. Zoom into the genre of music ‘Einfalt’ is explicitly referring to (1720–1759 and 1760–1799).

From each of the formal concepts it is possible to zoom deeper into the structures built by the iterative process of formulating categories. The question about the literal occurrence of terms concerning ‘Einfalt’ during different time periods can be refined by looking at the particular genres of music the demands for simplicity are explicitly referring to in the individual texts. Zooming into the formal concept characterized by the time period 1720–1759 and the literal occurrence of terms concerning ‘Einfalt’, one obtains the first line diagram of Figure 7. It can be seen that in the period from 1720 to 1759 only one text literally refers to ‘Einfalt’ with respect to sacred music (‘Kirchenmusik’); most remarks concerning ‘Einfalt’ refer to the opera. This changes completely in the subsequent forty years. The second diagram of Figure 7 shows the literal occurrence of terms concerning ‘Einfalt’ in the period from 1760 to 1799. By then most literal demands for ‘Einfalt’ explicitly refer to sacred music.
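
Nesting and zooming can be imitated on a toy data set to show the bookkeeping behind the two figures. The sketch below is an assumption-laden illustration in Python, not a description of TOSCANA: the texts, years, and codings are invented and do not reproduce the counts in Figures 6 and 7, and the function names are hypothetical. Nesting is reduced here to cross-classifying texts by 20-year period and literal term; zooming is reduced to restricting the texts to one such combination before tabulating the next scale (genre).

```python
from collections import Counter
from typing import Dict, List

# Invented sample: each text has a year, literal terms, and genres referred to.
texts: List[Dict] = [
    {"id": "t1", "year": 1713, "terms": {"Einfalt"}, "genres": set()},
    {"id": "t2", "year": 1745, "terms": {"Einfalt", "edle Einfalt"}, "genres": {"Oper"}},
    {"id": "t3", "year": 1752, "terms": {"Einfalt"}, "genres": {"Oper"}},
    {"id": "t4", "year": 1758, "terms": {"Einfalt"}, "genres": {"Kirchenmusik"}},
    {"id": "t5", "year": 1785, "terms": {"Einfalt", "edle Einfalt"}, "genres": {"Kirchenmusik"}},
    {"id": "t6", "year": 1790, "terms": {"Einfalt"}, "genres": {"Kirchenmusik", "Lied"}},
    {"id": "t7", "year": 1770, "terms": set(), "genres": {"Oper"}},
]


def period_start(year: int, width: int = 20, origin: int = 1700) -> int:
    """First year of the `width`-year period that contains `year`."""
    return origin + ((year - origin) // width) * width


def nested_counts(sample: List[Dict]) -> Counter:
    """Count texts per (period, literal term): outer scale x inner scale."""
    counts: Counter = Counter()
    for text in sample:
        for term in text["terms"]:
            counts[(period_start(text["year"]), term)] += 1
    return counts


def zoom(sample: List[Dict], first_year: int, last_year: int, term: str) -> List[Dict]:
    """Restrict the sample to texts of a time span that literally use `term`."""
    return [
        t for t in sample
        if first_year <= t["year"] <= last_year and term in t["terms"]
    ]


def genre_counts(sample: List[Dict]) -> Counter:
    """Tabulate the next scale (genre) on the restricted sample."""
    counts: Counter = Counter()
    for text in sample:
        for genre in text["genres"]:
            counts[genre] += 1
    return counts


if __name__ == "__main__":
    print(nested_counts(texts))
    print(genre_counts(zoom(texts, 1720, 1759, "Einfalt")))
    print(genre_counts(zoom(texts, 1760, 1799, "Einfalt")))
```

Keeping the restricted list of texts, rather than only the counts, is what allows a return from any number in a diagram to the documents behind it.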

Using the techniques of zooming and nested line diagram representations one can comprehensively explore the underlying data. In this process of navigating and learning, theory can be developed in interaction with the empirical data. Especially in text research it is important to be able to go back to the raw data at any point in this research process. In our TOSCANA system it is currently only possible to click the numbers in the diagrams in order to obtain the list of texts the number is referring to; the system is not yet able to directly retrieve the coding in the texts. Of course, a link from this list of documents to the individual texts and codings should be established as well. Therefore, one should implement a link between conceptual data systems and some program for computer-aided text analysis; in principle that is no problem. In the following concluding section, the approach we have introduced, based on Formal Concept Analysis, is discussed in the context of computer-aided qualitative data analysis.

4. A Fourth Generation of Computer-assistance?

A starting point for the present paper was the observation that available computer systems designed for text analysis do not sufficiently support researchers following an interpretative, hermeneutic paradigm in developing theory from the data. Therefore, in the Introduction it was stressed that the approach introduced in this paper focuses on supporting an inductive and data-oriented strategy of text analysis, which we refer to when we use the term ‘qualitative’. A methodology developed and applied in the context of the Chicago School is just one example of such an inductive research approach (see Lindesmith, 1968; Cressey, 1971; Glaser and Strauss, 1967).
Basic properties of an inductive research design and the resulting requirements for an appropriate computer system for qualitative text analysis already have been mentioned during the previous sections and can be summarized as follows: (1) A central method in the hermeneutic, interpretative research tradition is a systematic comparison of texts or text passages. Therefore, an effective organization and management of the underlying textual data should be guaranteed. (2) The coding has the function of assigning indices because, for a systematic text comparison, one has to make sure that the relevant data can be drawn together at any point. Thus it is important that the coding can be made transparent by comfortable coding and retrieval techniques. (3) Category schemes are developed during the research process by formulating, modifying, and refining definitions of categories in continuous recourse to the raw data. This process has to be supported by providing possibilities for revealing the structures and consequences of different definitions of categories. (4) Theory building is carried out in a process of learning, understanding, and theorizing in interaction with the data. Therefore, a flexible tool is needed which

allows the researcher to navigate within the data and which makes the procedure of theory development transparent, comprehensible, communicable, and criticizable. There already exists a substantial amount of methodological literature about computer-aided qualitative data analysis (see, for example, Kelle, 1995; a comprehensive overview of available software can be found in Weitzmann and Miles, 1995 or in Prein et al., 1995). Thus we only briefly sketch the major advances in computer-aided text analysis made during recent decades. That is done with regard to the properties and requirements of qualitative text analysis summarized above, in order to clarify the contribution the present approach based on Formal Concept Analysis can make. The development of computer-aided text analysis might be described by distinguishing three generations of implemented computer systems (cf. Kelle, 1996; Mangabeira, 1995). In regard to qualitative research, a differentiated analysis of the relevance of the innovations made by the individual generations is given in Kelle (1996) (see also Kelle, 1995). According to Kelle, the first programs used in text analysis were word-processors and database management systems. This first generation of programs could facilitate certain techniques like cut-and-paste but did not exceed very basic requirements concerning the management of the data. In the early eighties, the development of specific programs for text analysis represented a significant step forward. These programs constitute the second generation and can be characterized as ‘coding-and-retrieval’ programs. The main advances of this new generation were the possibility of managing unstructured textual data and the computerization of cut-and-paste and indexing techniques.
Thus it became possible without any further expenditure to create files containing all text passages assigned to a certain category in connection with ‘memos’ or other additional information one had written down during the reading of the material. Consequently, the most important aspect of that generation of computer programs lies in the fact that they guarantee a transparency of the processes both of the coding and of the retrieval of the textual data. Finally, a third generation of programs has been provided with certain facilities to support the process of theory building and hypothesis testing. They offer enhanced retrieval techniques for the search of co-occurring codes and contain different methods for the construction of complex networks linking categories, codes, memos, and text segments. Programs like NUD•IST or Atlas/ti further support the construction of such networks by visualizing them graphically. On the other hand, some of the new features provided by the programs are not appropriate for use in interpretative research. In particular, modules for hypothesis testing are tied to a hypothetico-deductive paradigm and therefore require an a priori formulated scheme of categories and a style of coding which is not adequate in hermeneutic research (see Kelle, 1996: 51). Summarizing, one can say that the programs of the first two generations support the interpretative researcher with respect to the

requirements (1) and (2), while the programs of the third generation show first attempts at being helpful in the process of formulating categories (3) and developing theory (4). To provide more comprehensive support for the process of qualitative theory building, we proposed to combine the administrative facilities of an implemented system designed for text analysis with the strengths of conceptual data systems as flexible tools of data and knowledge communication. On a basic level this approach can contribute transparency with respect to the coding of the data, as shown in Section 3.3. In view of the requirements (3) and (4), the advances obtained by activating conceptual data systems are discussed in the following. First of all, it is an important property of the introduced approach that a scheme of categories does not have to be constituted in advance, which is a necessary condition for a computer system to be helpful in qualitative research. The formulation of the attributes used for the coding only presupposes rather vague theoretical preconceptions of the phenomena under study, whereas the actual process of theory building is carried out after the coding of the data has been performed. Therefore, the researcher can easily play with and test different definitions of categories, and does not have to commit her- or himself to some fixed scheme of categories early in the research process. It was demonstrated in Section 3.4 that this process of formulating categories can be especially supported by methods of Formal Concept Analysis. The structure of certain definitions of categories can be revealed, and it is possible to study the consequences of certain definitions with respect to the data.
At this point two particular strengths of conceptual data systems with respect to data-oriented research should be emphasized: (1) concept lattices keep the unity of extension and intension of the conceptual structure, which enables, for instance, a simultaneous unfolding of hierarchies of categories and consequences of the categorization with reference to the underlying coded data; (2) the close interconnection of formal parts and content enables a formal treatment of the data with a strong reference to human interpretation. That supports the development of theoretical concepts in interconnection with the interpretation of the empirical data. With regard to requirements (3) and (4), we summarize that, by building up a conceptual data system, the definitions of (preliminary) categories are made transparent and can be explored with respect to their theoretical impacts. This especially supports the empirically oriented researcher in her or his work and makes the theoretical decisions during the theory development comprehensible, communicable, and criticizable. Finally this enables a scientific discourse about relevant aspects of the process of inductive theory building if, in addition, concept lattices representing relationships between the created categories link the resulting theories to the data they are based on. According to Strauss (1987: 10), the ‘basic question facing us is how to capture the complexity of reality (phenomena) we study’. To make sense of complex data meant to him, among other things, that a theory, as a final product of successively

evolving interpretations made during the research process, ‘must be conceptually dense – there are many concepts, and many linkages among them.’ At this point, concept lattices representing significant patterns of theoretical categories can help to capture and to unfold such complexity. During the research process, researchers can try to cope with the complexity of their data by using the methods of navigation provided by TOSCANA. Usually the navigation process turns out to be a process of exploring, learning, and understanding from which theories might emerge. Different theoretical views can be examined and dependencies among categories are revealed. Since that is always done in close interconnection with the empirical material, it supports a comparative analysis, wherein the researcher notices empirical counterexamples and possible mistakes made during the previous work. Finally, line diagrams of concept lattices communicate resulting theoretical knowledge because they may represent kernels of conceptually dense theories that link various theoretical categories. By integrating and revealing the underlying empirical material, they keep the relevant data accessible, which enables a theoretical re-examination and a discursive validation of the developed theories. All in all we hope that the introduced approach can provide a tool which supports an inductive process of theory development more comprehensively, and that the approach proves to be useful in further concrete applications.

Acknowledgements

The authors thank the Studienstiftung des deutschen Volkes and the GESIS (research grant ‘Hochschulsonderprogramm III’) for financial support during the work on this paper. Furthermore, we are grateful to S. Sedelow, W. Sedelow, and R. Wille for valuable comments and suggestions.

References

Cressey, D.R. (1971). Other People’s Money. A Study in the Social Psychology of Embezzlement. Belmont: Wadsworth (appeared first in 1953).
Ganter, B. and Wille, R. (1989). Conceptual scaling. In: F.S. Roberts (ed.), Applications of Combinatorics and Graph Theory to the Biological and Social Sciences. New York: Springer, pp. 139–167.
Ganter, B. and Wille, R. (1996). Formale Begriffsanalyse: Mathematische Grundlagen. Berlin, Heidelberg: Springer-Verlag.
Glaser, B.G. and Strauss, A.L. (1967). The Discovery of Grounded Theory: Strategies for Qualitative Research. New York: Aldine de Gruyter.
Henn, C. (1974). Simplizität, Naivetät, Einfalt. Studien zur ästhetischen Terminologie in Frankreich und in Deutschland 1674–1771. Zürich: Juris.
Kelle, U. (ed.) (1995). Computer-aided Qualitative Data Analysis. Theory, Methods and Practice. London: Sage.
Kelle, U. (1996). Computer-aided qualitative data analysis: an overview. In: C. Züll, J. Harkness, J.H.P. Hoffmeyer-Zlotnik (eds), ZUMA-Nachrichten Spezial: Text Analysis and Computers. Mannheim: ZUMA, pp. 33–63.
Kollewe, W., Skorsky, M., Vogt, F. and Wille, R. (1994). TOSCANA – ein Werkzeug zur begrifflichen Analyse und Erkundung von Daten. In: R. Wille and M. Zickwolff (eds), Begriffliche Wissensverarbeitung: Grundfragen und Aufgaben. Mannheim: B.I.-Wissenschaftsverlag, pp. 267–288.
Lindesmith, A.R. (1968). Addiction and Opiates. Chicago: Aldine (appeared first in 1947).
Mangabeira, W. (1995). Computer assistance, qualitative analysis and model building. In: R.M. Lee (ed.), Information Technology for the Social Scientist. London: UCL Press, pp. 129–146.
Prediger, S. (1997). Logical scaling in formal concept analysis. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J.F. Sowa (eds), Conceptual Structures: Fulfilling Peirce’s Dream. LNAI 1257. Berlin–Heidelberg: Springer, pp. 332–341.
Prein, G., Kelle, U., and Bird, K. (1995). An overview of software. In: U. Kelle (ed.), Computer-aided Qualitative Data Analysis. Theory, Methods and Practice. London: Sage, pp. 190–210.
Peisker, C.H. (1979). Evangelien-Synopse der Einheitsübersetzung. Stuttgart: Katholische Bibelanstalt.
Richter, M. (1987). Begriffsgeschichte and the history of ideas. Journal of the History of Ideas 48: 247–263.
Riedel, F.J. (1767). Theorie der schönen Künste und Wissenschaften, ein Auszug aus den Werken verschiedener Schriftsteller. Jena: Cuno.
Strauss, A.L. (1987). Qualitative Analysis for Social Scientists. New York: Cambridge University Press.
Vogt, F. (1996). Formale Begriffsanalyse: Datenstrukturen und Algorithmen in C++. Berlin–Heidelberg: Springer.
Vogt, F. and Wille, R. (1995). TOSCANA – a graphical tool for analyzing and exploring data. In: R. Tamassia and I.G. Tollis (eds), Graph Drawing ’94. Lecture Notes in Computer Science 894. Berlin–Heidelberg: Springer, pp. 226–233.
Weitzmann, E.A. and Miles, M.B. (1995). Computer Programs for Qualitative Data Analysis. Thousand Oaks: Sage.
Wille, R. (1982). Restructuring lattice theory: an approach based on hierarchies of concepts. In: I. Rival (ed.), Ordered Sets. Boston–Dordrecht: Reidel, pp. 445–470.
Wille, R. (1996). Conceptual structures of multicontexts. In: P.W. Eklund, G. Ellis, G. Mann (eds), Conceptual Structures: Representation as Interlingua. Berlin–Heidelberg–NY: Springer, pp. 23–39.
Wille, R. (1997). Conceptual landscapes of knowledge: A pragmatic paradigm for knowledge processing. In: G. Mineau, A. Fall (eds), Proceedings of the International Conference on Knowledge Representation, Use, and Storage for Efficiency, 11–13 August 1997, Vancouver, pp. 2–13.
