SemWebDL: A privacy-preserving Semantic Web infrastructure for digital libraries

June 9, 2017 | Author: A. Rezgui | Category: Web Accessibility, Semantic Web, Information Flow, Digital Library, Library and Information Studies, Internet technology, Web Service, Privacy Preservation


Int J Digit Libr (2004) 00: 1–14 / Digital Object Identifier (DOI) 10.1007/s00799-004-0081-0

SemWebDL: A privacy-preserving Semantic Web infrastructure for digital libraries∗

Abdelmounaam Rezgui, Athman Bouguettaya, Mohamed Eltoweissy

Department of Computer Science, Virginia Tech, 7054 Haycock Road, Falls Church, VA 22043, USA
e-mail: {rezgui, athman, eltoweissy}@vt.edu

Published online: 2004 – © Springer-Verlag 2004

Abstract. Recent advances in digital libraries have been closely intertwined with advances in Internet technologies. With the advent of the Web, digital libraries have been able to reach constituencies previously unanticipated. Because of the wide deployability of Web-accessible digital libraries, the potential for privacy violations has also grown tremendously. The much touted Semantic Web, with its agent, service, and ontology technologies, is slated to take the Web to another qualitative level in advances. Unfortunately, these advances may also open doors for privacy violations in ways never seen before. We propose a Semantic Web infrastructure, called SemWebDL, that enables the dynamic composition of disparate and autonomous digital libraries while preserving user privacy. In the proposed infrastructure, users will be able to pose more qualitative queries that may require the ad hoc collaboration of multiple digital libraries. In addition to the Semantic Web-based infrastructure, the quality of the response would rest on extraneous information in the form of a profile. We introduce the concept of communities to enable subject-based cooperation and search speedup. Further, digital libraries’ heterogeneity and autonomy are transcended by a layered Web-service-based infrastructure. Semantic Web-based digital library providers would advertise to Web services, which in turn are organized in communities accessed by users. For the purpose of privacy preservation, we devise a three-tier privacy model consisting of user privacy, Web service privacy, and digital library privacy that offers autonomy of perspectives for privacy definition and violation. We propose an approach that seamlessly interoperates with potentially conflicting privacy definitions and policies at the different levels of the Semantic Web-based infrastructure. A key aspect in the approach is the use of reputations for outsourcing Web services. A Web service reputation is associated with its behavior with regard to privacy preservation. We developed a technique that uses attribute ontologies and information flow difference to collect, evaluate, and disseminate the reputation of Web services.

Keywords: Digital libraries – Privacy – Reputation – Semantic Web – Web services

∗ This research is supported by the National Institutes of Health’s NLM Grant 1-R03-LM008140-01 and by grant No. 437869 from the Commonwealth Information Security Center (CISC).

1 Introduction

According to a recent study, the world produces between one and two exabytes of information every year [19]. This is roughly 250 MB for every man, woman, and child on Earth. The study also revealed that printed documents account for only 0.003% of the total. More than 90% of this enormous annual output is now stored in digital forms. In its 2001 report, the US President’s Information Technology Advisory Committee (PITAC) concluded that “because digital information is being produced so much more rapidly than other forms, libraries of the future will perforce increasingly be libraries of digital content”, i.e., digital libraries [27].

A digital library (DL) is a large collection of electronic documents and services that enable their use [1]. More formally, a DL may be defined as a 2-tuple (D, S), where D is a set of digital documents and S is a set of services that may be used to access the set of documents D. The motivation behind DLs is to provide large information collections that (i) can be searched for any phrase, (ii) can be accessed anywhere, (iii) can be copied without error, and (iv) do not decay over time [18]. Functionally, a DL may be viewed as a database, i.e., a data store, to which

MS ID: JODL081

29 September 2004 18:40 CET


A. Rezgui et al.: SemWebDL: A privacy preserving Semantic Web infrastructure for digital libraries

is added a data management system supporting functions such as insertion, deletion, search, and sort. However, unlike databases, DLs are inherently public. They are accessible by any user or by large groups of users (e.g., a DL accessible to all the researchers at a university). Also, unlike databases, DLs are, by nature, information providers, i.e., they disseminate information far more often than they collect information. However, for a variety of purposes (e.g., planning, performance improvement, statistics generation), DLs may also collect information. In particular, they may record detailed information about the transactions with their users. In general, this information may easily be used to discover users’ access history. In addition to this dynamic access information, DLs may also record static actual personal information about their users. For example, in a DL with fee-based access, the DL system may need to know at least some actual financial information about the user (e.g., credit card number, checking account number). Combining dynamic access information and static actual personal information may reveal extremely private information about users. For example, using simple data mining techniques, DLs may infer rich hidden private information (e.g., personal habits, beliefs, political orientation, and medical condition) about their users from less critical information such as a “benign” access log. This illustrates the potential for violating users’ privacy when accessing DL systems. This also raises the challenge of preserving users’ privacy in DLs.

Preserving the privacy of library users has been a sensitive issue for decades. The Library Bill of Rights, adopted by the American Library Association in 1948 [3], states that “the privacy of library users is and must be inviolable”.
The bill also requires that “policies should be in place that maintain confidentiality of library borrowing records and of other information relating to personal use of library information and services”. The “digital” prefix has not changed the essence of the problem. In fact, with the emergence of the Web as a means for global connectivity and the widespread deployment of Web-accessible DLs, the privacy problem has become even more complex.¹ In the previously mentioned US PITAC report, the panel on DLs recommended that privacy technologies be developed to protect the rights of both content owner and DL users [27].

¹ Throughout this paper, we will use the terms Web-accessible DLs and Web DLs interchangeably.

The inception of DLs may be traced back to the early 1960s, when the first attempts were made to store library information on computers [2]. These early efforts faced two major obstacles: the high cost of computers and storage devices and the lack of a networking infrastructure. Advances in storage technologies have made the first issue almost irrelevant. In parallel, the advent of the Web has enabled an unprecedented global connectivity and provided a definite answer to the issue of networking infrastructure. However, the Web has not reached its full potential. A recent NSF report concluded that the “emerging vision is to use cyberinfrastructure to build more ubiquitous, comprehensive digital environments that become interactive and functionally complete for research communities . . . ” [23]. This is particularly the case for DLs, where the impact of Web technologies, access and computing models, and standards is yet to be seen. A noticeable effort to answer this need was the NSF Post Digital Library Futures Workshop [22]. A promising research direction that was particularly highlighted was to explore ways in which the emerging Semantic Web may support the next generation of DLs. The premise for this conclusion is twofold. First, the Semantic Web enables the “global sharing of commercial, scientific and cultural data” [21]. Second, DLs aim at providing individual users and collaborative research groups with a uniform and integrated access to a large collection of distributed, heterogeneous data [11]. The rationale is that the Semantic Web, with its agent, service, and ontology technologies, offers the ideal combination of tools to support uniform, seamless access to Web-based autonomous, heterogeneous DLs.

The Semantic Web is steadily becoming a realistic vision where “machines become much better able to process and understand the data that they merely display at present” [4]. The Semantic Web introduces fundamental changes in how information is organized, retrieved, and exchanged on the Web. In particular, it introduces a novel information retrieval paradigm based on the interaction of two types of entities: Web services and Web agents. The former are applications that expose interfaces through which they may be automatically invoked by Web clients. Semantic Web agents are “intelligent” software agents that carry out sophisticated tasks on behalf of their users.
In addition to its agent and service technologies, the Semantic Web provides a key capability that will have a significant impact on DLs, namely, semantic interoperability. An important prerequisite in tomorrow’s DLs is to enable communication among different groups using different vocabularies to represent, organize, and manage their information space [10]. To create some common ground of understanding, the Semantic Web introduces ontologies [8, 12]. Semantic Web ontologies express both the context and structure of information as well as relationships among bits of information in a form that supports machine processing [21].

Preserving privacy in Semantic Web DL environments is a particularly complex instance of the Web privacy problem [25]. Although solutions to the more general Web privacy problem may have some impact on preserving privacy in Web DLs, their deployment in the current Web is likely to be of little help in the context of the Semantic Web. Semantic Web agents will be able to explore the Web more aggressively than humans. This will introduce a significant amount of automation into most of today’s Web applications. In the particular case of DLs, the Semantic Web will provide users with effective means for the automatic retrieval of higher-quality information from Web DLs. While this increased automation improves users’ experience, it also introduces greater risks as Semantic Web agents would also be able to manipulate sensitive private information. Preserving privacy in this context will introduce a set of challenges fundamentally different from those encountered in the “original” Web. A particular challenge is to enable software agents to reason about users’ privacy. This means enabling agents to make decisions about (i) what information sources to access when retrieving information for users and (ii) which Web entities sensitive information may be disclosed to in the course of that process.

Privacy must also be considered from the perspective of DLs. In a Semantic Web setting, these DLs must also be able to answer Semantic Web agents’ requests while preserving the privacy of their content. Consequently, the need exists for a Semantic Web infrastructure supporting DLs where agents and DLs may confidently trust each other. Web agents and DLs need mechanisms and tools to automatically evaluate the degree to which agents may trust DLs’ Web services with users’ sensitive information such as their access history. Our approach is to build trust through reputation. The definition of service reputation in our work is adapted from the broader one where reputation means “a common belief of a peer’s capabilities, honesty and reliability” [28]. In the context of our work, the reputation is associated with a peer’s behavior with regard to privacy preservation.

Preserving user privacy in the context of Semantic Web DLs is not only complex but also urgent. Indeed, considering key aspects such as privacy was often an afterthought during the early stages of the design of the Web. The consequences have proven to be significant in the subsequent stages.
In this work, we propose a Semantic Web infrastructure that supports reputation-based, privacy-preserving access to DLs. We first present a three-tier privacy model for DLs whose components are user privacy, service privacy, and library privacy. The model clearly delineates the privacy requirements associated with each type of entity. This enables an elegant formulation of the privacy problem at each level. It also allows the implementation of effective, scalable, and modular solutions with little interdependency between the mechanisms deployed at each level of the three-level hierarchy. Based on the proposed three-tier model, we introduce a reputation-based service layer through which users access DLs. This layer consists of a collection of Web services that users or their agents invoke to retrieve information from DLs. We propose a reputation management mechanism that provides estimations for services’ reputations with regard to handling users’ sensitive information. The basic idea is that agents will invoke Web services based on their own privacy requirements and on the services’ privacy policies and reputation. The reputation of a service determines the degree to which an agent can actually trust the privacy policy that a Web service “promises” to its users. We also propose the concept of the reputation-based composition of DLs. The basic idea is to build communities of several DLs that (i) users can access as if they were a single DL and (ii) are built with the requirement of a reputation threshold. A community of DLs is a virtual view of a collection of DLs that not only share some common domain of interest, but also achieve, as a group, a minimum level of reputation for their users.

The paper is organized as follows. In Sect. 2, we present a scenario that demonstrates the challenge of preserving privacy in Semantic Web DLs and identifies a set of requirements for addressing it. In Sect. 3, we propose a privacy model for Semantic Web DLs. Based on this model, we describe, in Sect. 4, an architecture for Semantic Web DLs. In Sect. 5, we introduce a mechanism for reputation management and present the architecture of a reputation management system. In Sect. 6, we provide an overview of the ongoing implementation of our SemWebDL prototype system. Section 7 reports on some previous related work, and Sect. 8 concludes the paper.

2 Privacy in Semantic Web digital libraries: a scenario

In this section, we use a scenario to demonstrate the rationale for a Semantic Web infrastructure to support the next generation of DLs. The scenario will also illustrate the main aspects that need to be addressed to preserve privacy in such an environment. In addition, we will use this scenario as the basis for a running example employed throughout the paper. In this scenario, users access a collection of health-care-related DLs. These DLs include:

– Academic DLs: DLs provided by schools of medicine at several universities. Such DLs would typically contain information such as technical reports, research papers, theses, and case studies.
– Hospital DLs: DLs provided by hospitals and other health care centers. They contain, in particular, deidentified health records of past/current patients. The deidentification process is assumed to be privacy preserving (i.e., no means exists that enables the linking of a given deidentified record to a given known patient).
– Public health DLs: DLs that offer access to large repositories of online general medical information destined for use by lay users (e.g., tutorials on common pathologies, statistics, survey results). These DLs may be provided by governments [e.g., Centers for Disease Control (CDC)], nonprofit institutions, or international organizations (e.g., World Health Organization).
– Research DLs: DLs owned by research institutions. They provide advanced research results destined for use by medical researchers and professionals. They may also provide research results accessible to and intelligible by the general public.
– Pharmaceutical DLs: DLs provided by pharmaceutical companies. In addition to advertising pharmaceutical products, these DLs may contain other health-related information such as detailed descriptions of drug ingredients, studies comparing the effects of different drugs, etc.
– Private DLs: These may be thousands of small private DLs contributed to society by physicians, professors of medicine, and health professionals.

Consider a person, Emile, who has a chronic disease. Emile wants to learn about his disease. Traditionally, he would have manually and iteratively accessed some of these DLs until he retrieved the desired information. In the process, Emile would have accessed many DLs that do not actually contain relevant information. He also would have most likely missed many of the DLs where more relevant information may be found. This lengthy, error-prone, collect-then-filter approach to information retrieval calls for new paradigms enabling automatic and faster access to more accurate and more succinct information. This clearly means that intelligent software agents must replace users in carrying out information retrieval tasks with greater effectiveness and efficiency. These agents must be able to explore the Web-accessible distributed information space and extract information relevant to their respective users. This obviously requires that these agents be able to (i) understand the semantics of their users’ requests, (ii) understand the semantics of the content of the different DLs, and (iii) properly identify when and to what extent a given DL content matches a user’s request.

Now, let us assume that Emile accesses the previous set of DLs through a Semantic Web infrastructure as shown in Fig. 1. In this figure, Semantic Web agents submit user requests to DL Web services. These services may potentially interact while answering the user requests. Services access DLs through DL Web interfaces provided by the respective DL management system. Each DL deploys a Web-accessible service that exposes the DL’s functionalities to the users. Users have Semantic Web agents that explore the Semantic Web infrastructure to retrieve information.

Fig. 1. A scenario for accessing Semantic Web digital libraries

In such an environment, Emile may instruct his Semantic Web agent to explore the distributed DL information space and interact with the Web services of the respective DLs to retrieve information relevant to his disease. In this automatic information retrieval process, DLs may require Emile’s agent to disclose some of Emile’s personal information. For example, for marketing purposes, a pharmaceutical DL may ask the agent for information such as Emile’s name, address, date of birth, etc. This illustrates an important consequence of the increased automation introduced in the envisioned Semantic Web, namely, the loss of user control over privacy. Figure 1 also shows that services may have to interact in the process of answering Emile’s request. This raises the issue of determining to which services Emile’s sensitive information may be disclosed and from which services it must be concealed.

Another aspect to consider is the trustworthiness of services and DLs. In many cases, requests for information may not be answered unless sensitive information is disclosed to parties whose trustworthiness is unknown. Going back to our example, Emile may subscribe to several private and public health DLs to access valuable material related to his disease. Emile’s Web agent explores these DLs, searching for relevant information. Some of the DLs may require that Emile’s agent present actual personal information before they deliver the requested information. The agent must then decide whether those DLs are sufficiently trustworthy to disclose Emile’s personal data.

Emile and his physician, Robert, invoke the same Web service S to access information related to Emile’s disease.
Emile and Robert may have different backgrounds, education levels, skills, etc. A major advantage of the Semantic Web is that it enables context-aware processing of requests to access DLs, i.e., services become aware of a user’s profile and adapt their answers to that profile. When a user or user’s agent submits a request to a Web service, the service may need some information on the user’s profile to deliver answers that best suit that specific user. Knowing the user’s profile, the service S may deliver different results by accessing different DLs. For example, it may deliver Emile’s results by accessing public health DLs and deliver Robert’s results by accessing a collection of hospital, research, and private DLs. In a Semantic Web setting for accessing DLs, this translates to the challenge of deploying agents that are not only able to “understand” their users’ profiles, but also capable of selective information disclosure.

The scenario illustrated in Fig. 1 justifies the need for a user privacy profile, a service privacy policy, and a library privacy policy. A user privacy profile has two components: a static privacy profile describing the user’s requirements on his/her static information and a service access profile describing the user’s requirements on how services use his/her behavioral information, i.e., DL accesses. The service privacy policy has two components: a user interaction policy describing the service’s privacy policy toward its users and a service interaction policy describing its privacy policy when interacting with other services. The library privacy policy also has two components: an access policy describing the DL’s privacy policy toward the information collected from services’ access and a data privacy policy describing the privacy requirements on the DL’s content.
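As a rough illustration of the three artifacts identified above, the following Python sketch models the user privacy profile, the service privacy policy, and the library privacy policy as plain records, each with the two components named in the text. Representing each component as a set of rule strings, and the rule names themselves, are assumptions made for illustration only; the text does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field


@dataclass
class UserPrivacyProfile:
    # Requirements on the user's static information (e.g., name, address).
    static_privacy_profile: set = field(default_factory=set)
    # Requirements on how services may use behavioral information (DL accesses).
    service_access_profile: set = field(default_factory=set)


@dataclass
class ServicePrivacyPolicy:
    # Policy of the service toward its users.
    user_interaction_policy: set = field(default_factory=set)
    # Policy of the service when interacting with other services.
    service_interaction_policy: set = field(default_factory=set)


@dataclass
class LibraryPrivacyPolicy:
    # Policy toward information collected from services' accesses.
    access_policy: set = field(default_factory=set)
    # Privacy requirements on the DL's own content.
    data_privacy_policy: set = field(default_factory=set)


# Hypothetical profile for the running example; the rule strings are invented.
emile = UserPrivacyProfile(
    static_privacy_profile={"no-disclosure:name", "no-disclosure:address"},
    service_access_profile={"no-logging:access-history"},
)
```

Keeping the two components of each record separate mirrors the distinction the scenario draws between static personal information and dynamic access information.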

3 A privacy model for Semantic Web digital libraries

Solutions that preserve privacy in Semantic Web DL transactions may be ad hoc or layered. In the former solutions, a request is viewed as an indivisible sequence of steps. Consequently, mechanisms deployed to preserve privacy at one step depend on the privacy-preserving mechanisms deployed at other steps. This assumes a closed, static environment where all requests have a static, predefined execution flow. On the Web, the assumption of a closed environment is obviously not valid. For example, under the closed environment assumption, new Web services may be deployed that implement privacy mechanisms that do not necessarily interoperate with existing Semantic Web agents. As our objective is a modular, evolutionary solution, we propose a layered, three-tier privacy model for a Semantic Web DL infrastructure that distinguishes three classes of entities, namely, users, Web services, and digital libraries. In the proposed model, users (or their Semantic Web agents) access DLs through Web services. A Web service may provide an interface to retrieve information from one or several DLs, i.e., a DL community. Also, multiple Web services may provide different access capabilities to the same DL. To capture the privacy requirements and behavior at the three layers, the model defines three types of privacy, namely, user privacy, service privacy, and library privacy:

– User privacy: Users and their agents access DLs through Web services. They expect/require different levels of privacy according to their perception of information sensitivity. These differences are captured in a user privacy profile. This profile is used, in particular, when a user’s Semantic Web agent invokes a Web service to access a DL. In our DL example, for instance, Emile may specify that only anonymous accesses are permissible. In this case, the agent will not invoke services that require identification information.
– Service privacy: DLs are accessed through Web services that are not necessarily provided by the DLs’ providers. Each Web service has its own service privacy policy. This policy specifies a set of rules that determine how sensitive information that the service receives from its clients must be handled. An important component of this policy is the service interoperability policy, which determines the set of privacy rules applicable when user requests require cooperation among several services. Such cooperation must, for example, avoid the exchange of sensitive access information between services with conflicting privacy policies.
– Library privacy: DLs may store two types of sensitive information about users: static and dynamic. Static user information includes information that is intrinsic to the user (e.g., name, address). Dynamic user information describes the access behavior of the user, i.e., items accessed. Each DL has a library privacy policy that describes how it handles users’ static and dynamic information.

A DL may have a membership in one or more communities. A community is a dynamic collection of DLs that is, functionally, indistinguishable from an ordinary DL. For example, a DL on the history of medicine may be a member of a community of history DLs and another community of medicine DLs. The privacy policy of a community is derived from the combination of the privacy policies of its DL components. DLs that are members of the same community may have conflicting privacy policies. For example, a DL may state that all user accesses are recorded and may be used for DL management purposes, while another DL in the same community may not record user accesses to information items. Access to a DL community is a conservative process, i.e., user requests are submitted only to DL members with privacy policies that are compatible with the user’s privacy requirements.
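The conservative access rule for communities can be sketched as a simple filter over community members. In this Python sketch, a DL's privacy policy is reduced to the set of practices the DL engages in, and a user's privacy requirements to the set of practices the user forbids. This set-based compatibility test, the practice labels, and all names are assumptions for illustration; the text does not define how compatibility is actually decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalLibrary:
    name: str
    practices: frozenset  # hypothetical: practices the DL's privacy policy allows


def compatible(dl, forbidden_practices):
    """Assumed test: the DL engages in no practice that the user forbids."""
    return dl.practices.isdisjoint(forbidden_practices)


def route_request(community, forbidden_practices):
    """Return only the community members a user request may be submitted to."""
    return [dl for dl in community if compatible(dl, forbidden_practices)]


# Two members of the same community with conflicting policies, as in the text:
# one records user accesses, the other does not.
community = [
    DigitalLibrary("history-of-medicine", frozenset({"record-accesses"})),
    DigitalLibrary("public-health", frozenset()),
]
targets = route_request(community, forbidden_practices={"record-accesses"})
print([dl.name for dl in targets])  # ['public-health']
```

A user with no stated restrictions would have the request submitted to both members, since an empty set of forbidden practices is compatible with every policy.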

To enable privacy-preserving interaction among users, services, and DLs, we augment our three-tier privacy model with a reputation management system. This system uses reputation as a criterion to trigger and validate interaction between any entities in the system. Our solution is based on two principles. First, using attribute ontologies and information flow differences, reputation of an entity is quantified such that reputation is attributed to an entity that is not the source of any “leakage” of private information. Second, the traditional invocation scheme of Web services (discovery-selection-invocation) is extended into a reputation-based invocation scheme where reputation is also a parameter in the discovery-selection processes.

4 The Semantic Web digital library architecture

In this section, we present the general architecture of the proposed Semantic Web DL, SemWebDL, focusing on the privacy aspects. Figure 2 describes this architecture in terms of three layers and three modules. The architecture’s layers are the digital library layer, the Web services layer, and the community layer. The architecture also includes three modules, namely, the Reputation Manager, the Community Manager, and the Request Processor:

– Library layer: This layer contains the actual DLs. Each DL has a library privacy policy that describes its policy with regard to users’ behavioral information.
– Service layer: The service layer is a collection of Web services deployed by individual DLs. Each service is associated with a service privacy policy that describes its behavior with regard to users’ access information.
– Community layer: This layer provides a virtual view of the underlying actual DLs. This layer is a collection of communities of DLs and Web services giving access to these communities. These services exist only while their respective communities exist.
– Reputation manager: The reputation manager (RM) is the core of the reputation system. It is a unanimously trusted party responsible for (i) collecting, (ii) evaluating, (iii) updating, and (iv) disseminating reputation information.
– Community manager: The community manager (CM) provides all the functionalities related to the management and invocation of DL communities. It provides Web-accessible functionalities that enable DL administrators to create and manage communities of DLs. Examples of these functionalities include:
  – CreateCommunity(), which DL administrators use to create a new community of DLs.
  – JoinCommunity(), which DL administrators use to make a DL a member of a given community.
  – LeaveCommunity(), which DL administrators use to withdraw a DL from a given community.
  – MergeCommunities(), which DL administrators use to merge two existing communities.
  – PropertyInquiry(), which DL administrators invoke to inquire about the type (i.e., domain of interest), members of a given community, etc.
  The CM maintains a list of community profiles that describe the characteristics of the different communities. A community profile describes the reputation of each member of the community in accordance with each criterion of the given set of reputation assessment criteria. It also gives an aggregated view of the members’ reputation in accordance with the assessment criteria. When a DL community is created, its administrator specifies a reputation threshold that the community must globally maintain during its lifecycle. When processing requests that change the community’s membership (e.g., join() requests), the CM ensures that the resulting community preserves the specified reputation threshold. For example, the CM may reject a join request issued by a service if that service’s reputation would lower the overall reputation of the community to a level below the given threshold.
– DL request processor: The DL request processor (RP) is responsible for the initial processing of user requests. In particular, it parses user requests, generates appropriate execution plans to answer these requests, and orchestrates the execution of these plans through a reputation-based invocation of services and communities. An appropriate execution plan is a sequence of service invocations that the RP constructs based, in particular, on the user’s reputation/privacy requirements and on the services’ reputation information provided by the RM.

Fig. 2. A reputation-based DL architecture
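The threshold check that the CM performs on membership changes can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the SemWebDL implementation: the method names only loosely mirror CreateCommunity(), JoinCommunity(), and LeaveCommunity(), their signatures are hypothetical, reputation is a single number rather than a value per assessment criterion, and the community's overall reputation is assumed to be the arithmetic mean of its members' reputations (the text does not say how the aggregate is computed).

```python
from statistics import mean


class CommunityManager:
    """Toy CM that enforces a reputation threshold on membership changes."""

    def __init__(self):
        # community name -> {"threshold": float, "members": {member: reputation}}
        self.communities = {}

    @staticmethod
    def _meets_threshold(members, threshold):
        # Assumption: overall reputation is the mean of member reputations.
        # An empty community trivially satisfies its threshold.
        return not members or mean(members.values()) >= threshold

    def create_community(self, name, threshold):
        self.communities[name] = {"threshold": threshold, "members": {}}

    def join_community(self, name, member, reputation):
        community = self.communities[name]
        candidate = {**community["members"], member: reputation}
        if not self._meets_threshold(candidate, community["threshold"]):
            return False  # joining would push the community below its threshold
        community["members"] = candidate
        return True

    def leave_community(self, name, member):
        community = self.communities[name]
        candidate = {m: r for m, r in community["members"].items() if m != member}
        if not self._meets_threshold(candidate, community["threshold"]):
            return False  # leaving would push the community below its threshold
        community["members"] = candidate
        return True


cm = CommunityManager()
cm.create_community("medicine", threshold=0.7)
accepted_hospital = cm.join_community("medicine", "hospital-dl", 0.9)  # mean 0.9
accepted_pharma = cm.join_community("medicine", "pharma-dl", 0.3)      # mean 0.6
accepted_research = cm.join_community("medicine", "research-dl", 0.6)  # mean 0.75
```

In this run the second join is rejected because it would lower the mean to 0.6, below the 0.7 threshold, while the third is accepted. Applying the same check to departures follows the statement that the CM preserves the threshold for any request that changes membership.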

5 Reputation management

Figure 3 provides an overview of SemWebDL’s Reputation Manager (RM). Three components are at the core of the RM: the reputation evaluation engine, the attribute ontology, and the reputation repository. Web services have service wrappers that implement the new functionalities necessary for reputation management. Also, a set of probing agents, called ProbAg in the figure, is deployed to implement the service monitoring function of the RM. This section describes our approach to reputation management. We consider a dynamic set S of Web services that Web clients invoke to access DLs and communities of DLs. Two concepts are key in the proposed approach, namely, attribute ontology and information flow difference.

Fig. 3. Architecture of the Semantic Web DL reputation manager

Attribute ontology

An operation of a Web service may be viewed as a “processing” unit that consumes input parameters and generates output parameters. The invocation of a given operation may potentially result in privacy violation when one or more of the output parameters correspond to private attributes (e.g., last name, address, phone number). A requirement for automating privacy preservation is to formally capture any possible leakage of sensitive information that may result from service invocation. Our approach is based on a concept called information flow difference that provides an estimate of services’ potential to release private information. The definition of this concept is based on a scaled attribute ontology that captures two important characteristics of attributes, namely, synonymy and privacy significance order.

Synonymy: Consider the scenario of Sect. 2. Emile may access a public health library by invoking two fee-based Web services. Both expect their Web clients to disclose the user’s actual family name and home phone number. The description of the first service names the parameters FamilyName and PhoneNumber, while the description of the second service names these parameters LastName and PhoneNumber. Clearly, from a privacy perspective these two services are equivalent. This is due to the semantic equivalence of FamilyName and LastName. To capture this equivalence among attributes, the proposed attribute ontology defines sets of synonymous attributes. The following are examples of sets of synonymous attributes:

T1 = {FamilyName, LastName, Surname, Name}
T2 = {PhoneNumber, HomePhoneNumber, ContactNumber, Telephone, Phone}
T3 = {Address, HomeAddress, Location}

Privacy significance order: Private attributes do not have the same sensitivity. For example, most DL users consider their Social Security number as being more sensitive than their phone number. To capture the difference in attributes’ sensitivity, we define the privacy significance level as a function over the set of attributes that associates each attribute a with a number PSL(a) ∈ N reflecting a’s significance from a privacy perspective. For any two given attributes a and b, a is said to be of higher privacy significance if its privacy significance level, PSL(a), is greater than b’s privacy significance level, PSL(b). This establishes a privacy significance order between any pair of attributes. Of course, this order may not be universally valid. For example, two different people may rank two attributes in different orders. Our approach assumes that, statistically, any two attributes are ranked consistently by a majority of individuals. However, the solution may readily be extended to employ user-defined attribute ontologies where users specify the sensitivity of the different attributes.

Information flow difference

Let si be a Web service in the set S and Opj an operation of the service si that has p input attributes and q output attributes. Let Input(Opj) denote the set of input attributes for operation Opj and Output(Opj) denote the set of output attributes for operation Opj.

Definition 1. The information flow difference IFD of operation Opj is defined by

$$\mathrm{IFD}(Op_j) = \sum_{a \in \mathrm{Input}(Op_j)} \mathrm{PSL}(a) \; - \sum_{a \in \mathrm{Output}(Op_j)} \mathrm{PSL}(a).$$
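As a rough illustration of Definition 1, the sketch below computes the IFD of a single operation from a small attribute ontology. The synonym sets for FamilyName, PhoneNumber, and Address follow T1 to T3; the SSN synonym set, the PSL values for FamilyName and Address, and all function names are assumptions made only for this example (the values 6 for SSN and 4 for PhoneNumber are the ones the paper uses in its own example).

```python
# Synonym sets keyed by a canonical attribute name. The first three mirror
# T1-T3 from the text; the SSN set is an illustrative assumption.
SYNONYM_SETS = {
    "FamilyName": {"FamilyName", "LastName", "Surname", "Name"},
    "PhoneNumber": {"PhoneNumber", "HomePhoneNumber", "ContactNumber",
                    "Telephone", "Phone"},
    "Address": {"Address", "HomeAddress", "Location"},
    "SSN": {"SSN", "SocialSecurityNumber"},
}

# Privacy significance levels. SSN = 6 and PhoneNumber = 4 come from the
# paper's example; the other two values are made up for illustration.
PSL = {"SSN": 6, "PhoneNumber": 4, "Address": 5, "FamilyName": 3}


def canonical(attribute):
    """Map an attribute name to the canonical name of its synonym set."""
    for name, synonyms in SYNONYM_SETS.items():
        if attribute in synonyms:
            return name
    raise KeyError(f"attribute not in ontology: {attribute}")


def psl(attribute):
    """Privacy significance level of an attribute, resolved via synonymy."""
    return PSL[canonical(attribute)]


def ifd_operation(inputs, outputs):
    """IFD(Op) = sum of PSL over inputs minus sum of PSL over outputs."""
    return sum(psl(a) for a in inputs) - sum(psl(a) for a in outputs)


if __name__ == "__main__":
    print(ifd_operation(["SSN"], ["PhoneNumber"]))  # 6 - 4 = 2
    print(ifd_operation(["LastName"], ["SSN"]))     # 3 - 6 = -3
```

Resolving names through the synonym sets first means that two services naming the same attribute differently (FamilyName versus LastName) receive the same IFD, which is the equivalence the ontology is meant to capture.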

Example 1. Assume that Opj has as its input the attribute SSN and as its output the attribute PhoneNumber. The values of the PSL function for attributes SSN and PhoneNumber are, respectively, 6 and 4. In this example, IFD(Opj) = 6 − 4 = 2. The meaning of this (positive) value is that an invocation of operation Opj must provide information (SSN) that is more sensitive than the returned information (PhoneNumber).

Intuitively, the information flow difference captures the degree of “permeability” of a given operation, i.e., the difference (in the privacy significance level) between what it gets (i.e., input attributes) and what it discloses (i.e., output attributes). In general, positive values for the function IFD do not necessarily indicate that invocations of the corresponding operation actually preserve privacy. In the previous example, a service invoking Opj may still be unauthorized to access the phone number, although it already knows a more sensitive piece of information (i.e., the Social Security number). However, invocations of operations with negative values of the function IFD necessarily disclose information that is more sensitive than their input attributes. They must, therefore, be considered as cases of privacy violation.

Definition 2. The information flow difference of a Web service s is the sum of the IFDs of all its operations.

We now present a general model for a Semantic Web environment where users and their agents access DLs based on the reputation of the services used to access those DLs. The objective of the proposed model is to enable Web services and agents to interact in an environment where the decision to disclose private sensitive information becomes an automatic, reputation-driven process that does not require the intervention of human users. Our approach mimics the real-life business and social environments where (good) reputation is a prerequisite (or, sometimes, the reason) for any transaction. The basic idea is to deploy a reputation management system that continuously monitors Web services, assesses their reputation, and disseminates information about services’ reputation to other services and agents. To each Web service we associate a reputation that reflects a common perception of other Web services toward that service. In practice, different criteria may be important in determining the reputation of a Web service. For example, the reputation of a service that searches for “best” airline fares clearly depends on whether or not it actually delivers the best fares. In this paper, services’ reputation depends only on how effectively they enforce the privacy of their users. To simplify our discussion, we will use “services” instead of “services and agents” in any context where “services” is an active entity (e.g., “service” s invokes operation Op). We propose five criteria as the basis for reputation assessment.

Criteria for reputation assessment

To automate the assessment of services’ reputation, we identified a set of criteria that (i) reflect the “conduct” of services with regard to how they protect private information that they collect from users and (ii) may be automatically and objectively assessed.

Degree of permeability: We previously introduced the function IFD, which determines services’ degree of permeability (DoP), i.e., their proneness to the disclosure of sensitive information. We also use this function to rank Web services according to their DoP. Let s1 and s2 be two Web services. If IFD(s1) < IFD(s2) < 0, then s2 is said to be less permeable than service s1. Consider the scenario of Sect. 2. To answer a given request, Emile may use two fee-based services having the same set of output parameters. The first requires as its input Emile’s family name and credit card number. In addition to these two attributes, the second service also requires Emile’s Social Security number. If the two services have the same set of output variables, the second service’s DoP is higher than the first.

Authentication-based disclosure of information: Web services use different approaches to authenticate the senders of the received requests. The reputation of a service clearly depends on the strength of the mechanism used to authenticate clients. For example, the reputation-based infrastructure may adopt the rule that services using Kerberos-based authentication schemes have better reputation than services that use schemes based on user/password authentication. In the previous scenario, if Emile has the possibility of invoking a service s1 that uses a Kerberos-based authentication scheme and another service s2 that uses a user/password authentication scheme, the reputation manager will rank s1 better than s2 with regard to this criterion.
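To make the service-level information flow difference and the permeability ranking more concrete, here is a minimal sketch under stated assumptions. The PSL table, the three example services, and every identifier are hypothetical; only the rules themselves (a service's IFD is the sum of its operations' IFDs, a lower IFD means a more permeable service, and a negative operation IFD counts as a privacy violation) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical privacy significance levels, for illustration only.
PSL: Dict[str, int] = {
    "FamilyName": 3, "PhoneNumber": 4, "CreditCardNumber": 5, "SSN": 6,
}


@dataclass(frozen=True)
class Operation:
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def ifd_operation(op: Operation) -> int:
    """Sum of input PSLs minus sum of output PSLs for one operation."""
    return sum(PSL[a] for a in op.inputs) - sum(PSL[a] for a in op.outputs)


def ifd_service(operations: List[Operation]) -> int:
    """Service-level IFD: the sum of the IFDs of all its operations."""
    return sum(ifd_operation(op) for op in operations)


def rank_most_permeable_first(services: Dict[str, List[Operation]]) -> List[str]:
    """Order service names by ascending IFD (lowest IFD = most permeable)."""
    return sorted(services, key=lambda name: ifd_service(services[name]))


def violating_operations(operations: List[Operation]) -> List[str]:
    """Operations with negative IFD disclose more than they receive."""
    return [op.name for op in operations if ifd_operation(op) < 0]


# Three made-up services; s1 and s2 share outputs, s2 also asks for the SSN.
SERVICES: Dict[str, List[Operation]] = {
    "s1": [Operation("search", ("FamilyName", "CreditCardNumber"), ("PhoneNumber",))],
    "s2": [Operation("search", ("FamilyName", "CreditCardNumber", "SSN"), ("PhoneNumber",))],
    "s3": [Operation("lookup", ("FamilyName",), ("SSN",))],
}

if __name__ == "__main__":
    for name, ops in SERVICES.items():
        print(name, ifd_service(ops), violating_operations(ops))
    print(rank_most_permeable_first(SERVICES))
```

How a raw IFD value is mapped onto a normalized degree-of-permeability score is not specified in the text, so the sketch stops at the ordering.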


Use of encryption mechanisms: This criterion captures the strength of the encryption mechanisms used by Web services. For example, Emile may use a service whose messages are encrypted using a 128-bit encryption scheme or another service using a 64-bit encryption scheme. Obviously, the first service will be ranked higher than the second with regard to the encryption criterion.

Seniority: This criterion reflects the simple “fact” that, like businesses in the real world, trust in Web services increases with the length of their “lifetime”. If the dates of deployment, d1 and d2, of two services, s1 and s2, are known, then the reputation of s1 may be considered better than that of s2 if d1 precedes d2. For example, two services may let Emile access the same DL. However, while one is an established service, the second has been advertised only recently. Obviously, the first service is ranked higher than the second one with regard to the seniority criterion.

A weighted definition of reputation

We now present a formal definition of the reputation of Web services. Let R be the set of m criteria used in the process of reputation assessment (m = 5 in the proposed list of criteria) and $c_j^i$ the value of criterion $c_j$ for service $s_i$. The values of these m criteria are normalized such that ∀si ∈ S, ∀cj ∈ R, 0 ≤ $c_j^i$ ≤ 1. In practice, the criteria used in reputation assessment are not equally important or relevant to privacy enforcement. For example, the seniority criterion is clearly less important than the degree of permeability. To each criterion cj ∈ R we associate a weight wj that is proportional to its relative importance as compared to the other criteria in R. A Web service’s reputation may then be defined as follows:

Definition 3. For a given service si ∈ S, the reputation function is defined by

$$\mathrm{Reputation}(s_i) = \sum_{k=1}^{m} w_k \, c_k^i. \qquad (1)$$
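The weighted sum in Eq. 1 is simple to sketch. The example below is illustrative only: the criterion names, weights, and scores are invented, and dividing by the total weight is an added assumption (equivalent to using weights that sum to 1) so that the result stays in the interval [0, 1].

```python
from typing import Dict


def reputation(criteria: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of normalized criterion values, as in Eq. 1.

    Assumption: the sum is divided by the total weight, which has no effect
    when the weights already sum to 1 and keeps the result within [0, 1].
    """
    if criteria.keys() != weights.keys():
        raise ValueError("criteria and weights must name the same criteria")
    for name, value in criteria.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"criterion {name} is not normalized: {value}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(weights[k] * criteria[k] for k in criteria) / total_weight


# Hypothetical weights: permeability matters most, seniority least.
WEIGHTS = {"permeability": 0.4, "authentication": 0.25,
           "encryption": 0.25, "seniority": 0.1}

# Hypothetical normalized scores for one service.
SCORES = {"permeability": 0.8, "authentication": 1.0,
          "encryption": 0.5, "seniority": 0.2}

if __name__ == "__main__":
    # 0.4*0.8 + 0.25*1.0 + 0.25*0.5 + 0.1*0.2, approximately 0.715
    print(round(reputation(SCORES, WEIGHTS), 3))
```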

The intuitive meaning of Eq. 1 is that the reputation of service si is the weighted sum of its performances along each of the considered reputation criteria. We now describe the three main components: the reputation evaluation engine, the probing agents, and service wrappers (Fig. 3).

Reputation evaluation engine

The evaluation of most of these criteria is based on the syntactic description of service sj. For example, to evaluate the degree of permeability of sj, the RM reads sj’s description by accessing the appropriate service registry, computes IFD(sj) (sj’s information flow difference), and maps that value to the corresponding degree of permeability. The RM maintains an attribute ontology that is used in computing the degree of permeability of the different Web services. Once the DoP of service sj is evaluated, it is stored in the local reputation repository. The other criteria may be obtained using similar processes. Once all the criteria are evaluated, the RM computes sj’s reputation (as described in Sect. 5), stores the obtained value in the reputation repository, and sends it to service si.

Anonymous probing agents

Service reputations are not static values. Different types of updates may affect a service’s reputation. The reputation manager (RM) must continuously maintain an accurate perception of services’ reputations. Two alternatives are possible for a continuous monitoring of Web services. In the first, the RM continuously retrieves services’ descriptions and issues requests to services to collect the information necessary to evaluate the different criteria for reputation assessment. This approach is clearly inadequate. First, it leads to heavy traffic at the RM. Second, malicious Web services may easily identify requests originating at the RM and reply with messages that do not reflect their actual behavior. To overcome these drawbacks, our solution deploys a set of η probing agents (or probers) that collect information necessary to the process of reputation assessment and share it with the RM. These agents are responsible for continuously monitoring the services and reporting the collected information to the RM. These agents are not colocated with the RM and are, a priori, anonymous to the Web services being monitored, i.e., services may not distinguish probing requests from ordinary requests. Probing Web services may be explicit or implicit. In explicit probing, an agent invokes a service’s operation only for monitoring purposes.
For example, a Web service may deploy an operation ping that enables its clients to determine whether it is alive [24]. A probing agent may then invoke this operation to determine the availability of that service. In implicit probing, an agent invokes the actual functionality of a Web service. In the process, the agent collects information about criteria such as availability, response time, reliability, etc.

Services with a low reputation present a greater potential for unauthorized information disclosure. Therefore, the process of monitoring Web services must be distributed such that services with low reputations get probed more aggressively and more frequently. To meet this requirement, services are partitioned into δ (δ ∈ N) clusters C0, C1, ..., Cδ−1 such that services in the same cluster have “comparable” reputations. δ is called the clustering factor. Formally,

$$\forall C_j,\quad s_i \in C_j \implies \frac{j}{\delta} < \mathrm{Reputation}(s_i) \le \frac{j+1}{\delta}. \qquad (2)$$
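The partition in Eq. 2, together with the prober distribution rule Ri · ηi = α that the paper attaches to these clusters, can be sketched numerically. This is a minimal sketch under stated assumptions: reputations are taken to lie in [0, 1], a reputation of exactly 0 (which Eq. 2 does not cover) is placed in cluster 0, the prober shares are left as real numbers rather than rounded to whole agents, and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def cluster_index(reputation: float, delta: int) -> int:
    """Return j such that j/delta < reputation <= (j+1)/delta (Eq. 2)."""
    if delta < 1:
        raise ValueError("the clustering factor must be at least 1")
    if not 0.0 <= reputation <= 1.0:
        raise ValueError("reputation is assumed to lie in [0, 1]")
    if reputation == 0.0:
        return 0  # assumption: Eq. 2 leaves this boundary case open
    # Rounding absorbs floating-point noise such as 0.7 * 10 slightly above 7.
    scaled = round(reputation * delta, 9)
    return math.ceil(scaled) - 1


def ideal_prober_shares(avg_reputation: List[float],
                        eta: float) -> Tuple[float, List[float]]:
    """Solve R_i * eta_i = alpha for all clusters, with sum(eta_i) = eta.

    Returns alpha and the real-valued share eta_i = alpha / R_i per cluster.
    Rounding the shares to whole agents is deliberately left out.
    """
    if any(r <= 0 for r in avg_reputation):
        raise ValueError("average reputations must be positive")
    inverse_sum = sum(1.0 / r for r in avg_reputation)
    alpha = eta / inverse_sum
    return alpha, [alpha / r for r in avg_reputation]


if __name__ == "__main__":
    print([cluster_index(r, 4) for r in (0.1, 0.25, 0.26, 1.0)])
    # Three clusters with average reputations 0.25, 0.5, 1.0 and 14 probers:
    # the lowest-reputation cluster receives the largest share.
    print(ideal_prober_shares([0.25, 0.5, 1.0], 14))
```

Because each share is inversely proportional to the cluster's average reputation, the lowest-reputation cluster is probed most heavily, which is the behaviour the clustering is meant to enable.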

To enable a variable probing policy, we associate ηi probing agents with each cluster Ci, where

$$\delta \le \eta = \sum_{k=0}^{\delta-1} \eta_k.$$

A reasonable distribution of the η probers on the δ clusters is one in which ∀i, 0 ≤ i < δ, Ri · ηi = α, where α is a constant and Ri is the average reputation of services in cluster Ci, i.e.,

$$R_i = \frac{\sum_{s_k \in C_i} \mathrm{Reputation}(s_k)}{\delta_i},$$

where δi is the size of cluster Ci. Probing agents associated with cluster Ci continuously and randomly invoke services in Ci to determine the values of the different criteria in the set R. In the monitoring process, they may also have to access service registries and retrieve service descriptions. An advantage of this cluster-based monitoring approach is that it is flexible and may be easily “tuned” to accommodate loose and strict privacy enforcement. For example, the parameter α may be set higher (for all clusters) to achieve stricter privacy control. Also, if a specific cluster or set of clusters (e.g., corresponding to businesses with low reputations) turns out to be more prone to information disclosure than others, only their probing agents may be instructed to switch to a more aggressive monitoring mode.

Service wrappers

A significant challenge in deploying the proposed approach is to introduce a posteriori a privacy-preserving mechanism to existing Web services that are already built without a mechanism for privacy enforcement. The solution clearly requires adapting service invocation schemes to accommodate the added mechanism. To achieve this transition to privacy-preserving services, we introduce components called service wrappers. A service wrapper associated with a service si is a software module that is colocated with si and that handles all messages received or sent by the service si. To send a privacy-sensitive message M to a service sj, the service si first submits the message to its wrapper. If necessary, the wrapper sends a request to the RM to inquire about sj’s reputation.
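A wrapper of this kind might look roughly like the sketch below. It is not the authors' implementation: the class name, the fixed reputation threshold used to decide whether to forward a message, the time-to-live on cached values, and the callback signatures are all assumptions introduced for illustration. The sketch only shows the general pattern of consulting the RM before releasing a privacy-sensitive message and keeping a local cache of reputations to limit traffic at the RM.

```python
import time
from typing import Callable, Dict, Tuple


class ServiceWrapper:
    """Hypothetical wrapper that gates outgoing privacy-sensitive messages."""

    def __init__(self,
                 query_rm: Callable[[str], float],
                 min_reputation: float,
                 cache_ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query_rm = query_rm              # asks the RM for a reputation
        self._min_reputation = min_reputation  # assumed local policy threshold
        self._cache_ttl = cache_ttl            # seconds a cached value stays valid
        self._clock = clock
        self._cache: Dict[str, Tuple[float, float]] = {}

    def reputation_of(self, service_id: str) -> float:
        """Return a cached reputation if still fresh, else ask the RM."""
        now = self._clock()
        cached = self._cache.get(service_id)
        if cached is not None and now - cached[1] < self._cache_ttl:
            return cached[0]
        value = self._query_rm(service_id)
        self._cache[service_id] = (value, now)
        return value

    def send(self, service_id: str, message: str,
             deliver: Callable[[str, str], None]) -> bool:
        """Forward the message only if the target's reputation is acceptable."""
        if self.reputation_of(service_id) >= self._min_reputation:
            deliver(service_id, message)
            return True
        return False  # sending is cancelled


if __name__ == "__main__":
    reputations = {"s_good": 0.9, "s_bad": 0.2}  # stand-in for the RM
    wrapper = ServiceWrapper(lambda sid: reputations[sid], min_reputation=0.5)
    outbox = []
    print(wrapper.send("s_good", "M", lambda s, m: outbox.append((s, m))))
    print(wrapper.send("s_bad", "M", lambda s, m: outbox.append((s, m))))
    print(outbox)
```

Injecting the RM query and the clock as callables keeps the sketch testable without a network or real delays; a deployed wrapper would replace them with an actual call to the RM's reputation inquiry service.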
Based on the answer received from the RM, si’s wrapper may then forward the message M to sj or decide to cancel sending M to sj. To avoid excessive traffic at the RM, the wrapper may locally maintain a reputation cache that contains information about the reputation of the most frequently invoked services.

6 Implementation

In this section, we describe the implementation of the SemWebDL system. The implementation centers around the use of a variety of database and Web service technologies. The system is implemented across a network of Solaris workstations. Figure 4 provides an overview of the SemWebDL system. Users may access SemWebDL via a graphical user interface (GUI) implemented using HTML/Servlet. A user’s request may involve one or more DL systems. Prototypes of several DL management systems are implemented in Java (JDK 1.3). The underlying DL databases are Oracle 8.0.5 and Informix 7.0 databases. These databases are populated with different types of digital content including text, image, audio, and video documents. The DL systems are wrapped by WSDL descriptions. To automatically generate these WSDL descriptions, we use the Axis Java2WSDL utility in IBM’s Web Services Toolkit. This utility generates WSDL descriptions from Java class files. WSDL service descriptions are published into a UDDI registry. The DL Reputation Manager exposes its ReputationInquiry capability through a Web service that it advertises in the UDDI registry. We adopt Systinet’s WASP UDDI Standard 3.1 as our UDDI toolkit. The Cloudscape (4.0) database is used as a UDDI registry. SemWebDL uses the service management client provided within Apache SOAP (2.2) to deploy services giving access to the different DLs. Apache SOAP provides a server-side infrastructure for deploying and managing services. It also provides a client-side API for invoking these services. Each service has a deployment descriptor. This descriptor includes the unique identifier of the Java class to be invoked, session scope of the class, and operations in the class available for the clients. Each service is deployed using its descriptor and the URL of the Apache SOAP servlet rpcrouter as input arguments.

Fig. 4. Overview of the SemWebDL prototype

SemWebDL’s request processor (RP) is responsible for the initial processing of user requests. Currently, the RP has three main components:

– Request parser: This component compiles user requests and determines their semantics. In particular, it decides whether single DLs or a community must be invoked to answer the request. For example, the user may specify a particular DL as the only source to be considered in answering his/her request. Users may also submit fuzzy information retrieval requests that do not specify any particular source.

– Service locator: This component receives compiled requests from the request parser. To answer this request, several services may be invoked. The service locator (SL) discovers WSDL descriptions by accessing the UDDI registry. It implements a UDDI inquiry client using the WASP UDDI API. In the process of generating an execution plan to answer the request, the service locator submits to the RM a request asking for the reputation of different services that may be invoked. For each service sj, if a value of Reputation(sj) is available and accurate (i.e., reasonably recent), the RM answers the SL’s request with a message containing that value. If the reputation of sj is outdated or not available (i.e., has not been previously assessed), the RM initiates a reputation assessment process that involves one of a set of probing agents. Based on the received reputation values and its local policy, the SL builds an information retrieval plan and submits it to the information retrieval engine.

– Information retrieval engine: This module is responsible for invoking Web services giving access to DLs and DL communities. Services are invoked through a SOAP binding stub, which is implemented using the Apache SOAP API. The information retrieval engine orchestrates the process of information retrieval according to the plan submitted by the service locator module.

A typical execution flow generated by one of Emile’s requests is as follows (Fig. 4). He first submits a request for information through SemWebDL’s GUI. In coordination with the RM, the Request Processor then generates the appropriate information retrieval plan for Emile’s request. A typical plan is a sequence of service invocations. Services invoked may be associated with single DLs or DL communities. In the latter case, the request is sent to the community’s invocation service. The service then accesses the appropriate DL management system(s), collects the desired information, and forwards it to SemWebDL’s RP. During this process, the system enforces the privacy requirements specified in Emile’s privacy profile, the service’s privacy policy, and the DL’s privacy policy.

7 Related work

Preserving privacy on the Web has recently become an active research topic. Most existing solutions such as anonymizers, personal firewalls, remailers, trace removers, etc. primarily apply to the “conventional” Web [25].
Solutions for preserving user privacy in the context of the Semantic Web are still in their infancy. In this section, rather than surveying research addressing privacy preservation on the Semantic Web in a generic sense, we focus on the core issue of using the concept of reputation as a solution to enabling privacy-preserving Semantic Web infrastructures for DLs. We discuss some related work that illustrates the general trends in reputation management.

The concept of reputation on the Web tends to be associated with commercial Web sites that online consumers access and use based on their “reputation”. Determining the reputation of a Web-accessible information source or a service provider poses significant challenges as different criteria may be relevant for different types of entities and different categories/preferences of users. Several solutions have been proposed. In [14], the authors determine reputation based on approaches to solving the credibility problem, which reduces to a problem of quantifying parameters on the nodes of a graph. Nodes represent information items and directed arcs link pairs of nodes (a, b), where node b’s credibility depends on node a. The credibility is a numerical quantity attached to the node. Nodes with a large number of incoming arcs correspond to information items with high credibility. This formulation has been adopted in some Web search engines (e.g., Google) to determine data reputation and, consequently, rank search results. In [17], the authors address the issue of measuring the reputation of Web sites. Their work aimed at answering the question of whether certain types of search tools yield sites that are perceived to be more reputable, i.e., authoritative and trustworthy, than others.

Reputation has also been studied in the context of P2P networks. Examples of applications include P2P anonymity systems (e.g., anonymous remailers [6]) and P2P resource-sharing networks. The basic idea in these networks is to deploy reputation systems that provide reliable reputation information about peers. A peer can then use this information in decision making, e.g., who to download a file from [5, 13, 16]. In [30], the authors address the problem of deception in testimony propagation and aggregation. The objective is to protect against spurious ratings generated by malicious agents. In [29], the authors consider reputation in the context of peer-to-peer e-commerce communities. They show that models based solely on feedback from other peers in the community are inaccurate and ineffective. Their solution extends this model into one where two additional criteria are used to determine reputation of peers, namely, the total number of transactions a peer performs with other peers and the credibility of the feedback source.
In [15], the authors consider reputation among peers within the same community. They present a decentralized reputation mechanism that is incentive compatible, i.e., in which a peer does not decrease its own reputation by reporting positive ratings of other peers.

Despite the abundance of reputation-related literature, little research has focused on the reputation of Web services. In [20], the authors describe an agent-based system where agents act as proxies that collect information on a Web service, build its reputation, and then disburse endorsements of the service. The authors present an approach that provides a conceptual model for reputation that captures the semantics of attributes. The semantics includes characteristics, which describe how a given attribute contributes to the overall rating of a service provider and how its contribution decays over time. The approach thus applies both to reputations and to explicit endorsements of a service provider by another party. In [9], a principal might trust an object if that object is trusted by a third party that is trusted by the given principal. This is similar to the notion of endorsement proposed in [20]. A key difference between the two approaches is that [9] captures policies for endorsement and delegation, whereas [20] seeks to capture service attributes and how they can be combined to support various policies. In [26], the authors investigate the use of service reputations to enable the composition of trustworthy services. In [7], the authors investigate the impact of incorporating source credibility theory on the process of evaluating service providers.

Current reputation systems are based, for the most part, on evaluating the reputation of a service using feedback collected from customers that, in the past, have acquired similar or comparable services or products. Customers may then make informed decisions in light of past experiences. This approach is obviously not feasible in Semantic Web-enabled DL environments where reputation must be automatically evaluated. The reputation system proposed in this paper differs from other systems in three main aspects. First, contrary to most existing systems that target Web sites or Web databases, our system targets a Semantic Web environment hosting Web services and software agents supporting the inherently public DLs. Second, reputation in our system is not based directly on business criteria (e.g., quality of a service or product) but, rather, it reflects the “quality of conduct” of Web services with regard to preserving the privacy of (personal) information that they may divulge to other services, agents, or DLs. Finally, reputation management in our system is fully automated and neither uses nor necessitates potentially subjective human recommendations or feedback.

8 Conclusion

The advent of the Web has enabled a new generation of DLs with easier access and richer content. It also has introduced new challenges related to protecting the privacy of DL users. Preserving user privacy becomes even more challenging in the context of the emerging Semantic Web. We believe that dimensions such as privacy and security must be considered at this early stage of the development of Semantic Web DLs. In this work, we proposed a privacy-preserving Semantic Web infrastructure for DLs. The proposed infrastructure, SemWebDL, enables a high degree of automation where software agents and Web services seamlessly carry out tasks involving an extensive exchange of information, including private information. Our solution is based on a three-tier privacy model and a reputation-based composition of DL communities. A “good” reputation is attributed to an entity if it is not the source of any “leakage” of private information. The reputation of an entity is quantified using the concepts of attribute ontology and information flow difference. A reputation management system is proposed to support reputation-related activities.


Deploying the next generation of Semantic Web DLs will likely involve providing Semantic Web support for the integration of database and DL systems. We plan to investigate and develop comprehensive privacy solutions for Semantic Web infrastructures enabling integrated database and public and personal DL systems. We also plan to extend our research to address the privacy issues that arise from incorporating digital rights management in DL environments. For example, recent solutions to digital rights management explore the idea of shipping digital objects along with their associated digital rights. Objects will then report to their “original” owner activities performed on these objects. This has the potential to reveal private information about the current users of the object.

References

1. Anderson WL (1997) Digital libraries: a brief introduction. SIGGROUP Bull 18(2):4–5
2. Arms WY (2001) Digital libraries and electronic publishing, 2nd edn. MIT Press, Cambridge, MA
3. American Library Association (1948) Library Bill of Rights, June 1948
4. Berners-Lee T, Hendler J, Lassila O (2001) The Semantic Web. Sci Am 284(5):34–43, May
5. Damiani E, De Capitani di Vimercati S, Paraboschi S, Samarati P (2003) Managing and sharing servents’ reputations in P2P systems. IEEE Trans Knowl Data Eng 15(4):840–854
6. Dingledine R, Mathewson N, Syverson P (2003) Reputation in P2P anonymity systems. In: Workshop on economics of peer-to-peer systems, June 2003
7. Ekstrom MA, Bjornsson HC, Nass CI (2002) A reputation mechanism for B2B electronic commerce that accounts for rater credibility. J Electron Commerce Organizat Comput (in press)
8. Fensel D, van Harmelen F, Horrocks I, McGuinness D, Patel-Schneider P (2001) OIL: An ontology infrastructure for the Semantic Web. IEEE Intell Syst 16(2):38–45
9. Finin TW, Joshi A (2002) Agents, trust, and information access on the Semantic Web. ACM SIGMOD Rec 31(4):30–35
10. Fox EA, Marchionini G (1998) Toward a worldwide digital library. Commun ACM 41(4):29–32
11. Gertz M (2000) Achieving semantic interoperability through controlled annotations. In: Position paper, US-Korea joint workshop on digital libraries, San Diego, 10–11 August 2000
12. Gruber TR (1993) A translation approach to portable ontology specifications. Knowl Acquisit 5:199–220
13. Gupta M, Judge P, Ammar M (2003) A reputation system for P2P networks. In: Proc. 13th international workshop on network and operating systems support for digital audio and video, Monterey, CA, pp 144–152
14. Huhns MN, Buell DA (2002) Trusted autonomy. IEEE Internet Comput 6(3):92–95
15. Jurca R, Faltings B (2003) An incentive compatible reputation mechanism. In: ACM AAMAS’03, 14–18 July 2003
16. Kamvar SD, Schlosser MT, Garcia-Molina H (2003) The EigenTrust algorithm for reputation management in P2P networks. In: Proc. 12th international World Wide Web conference (WWW)
17. Keast G, Toms EG, Cherry J (2001) Measuring the reputation of Web sites: a preliminary exploration. In: Proc. 1st ACM/IEEE-CS joint conference on digital libraries, 24–28 June 2001, pp 77–78
18. Lesk M (1997) Practical digital libraries: books, bytes and bucks. Morgan Kaufmann, San Francisco
19. Lyman P, Varian H (2000) How much storage is enough? ACM Queue 1(4)
20. Maximilien EM, Singh MP (2002) Conceptual model of Web service reputation. ACM SIGMOD Rec 31(4):36–41
21. Miller E (2003) Enabling the Semantic Web for scientific research and collaboration. In: Proc. NSF Post Digital Library Futures Workshop, 15–17 June 2003
22. NSF (2003) In: Proc. NSF workshop on digital library futures, 15–17 June 2003
23. NSF Blue Ribbon Advisory Panel on Cyberinfrastructure (2003) Revolutionizing science and engineering through cyberinfrastructure. January 2003
24. Ouzzani M (2004) Efficient delivery of Web services. PhD thesis, Computer Science Department, Virginia Tech, Blacksburg, VA
25. Rezgui A, Bouguettaya A, Eltoweissy M (2003) Preserving privacy in the Web: facts, challenges and solutions. IEEE Secur Privacy 1(6):40–49
26. Singh MP (2002) Trustworthy service composition: challenges and research questions. In: Proc. Autonomous Agents and Multi-Agent Systems, workshop on deception, fraud and trust in agent societies. Springer, Berlin Heidelberg New York
27. Panel on Digital Libraries US President’s Information Technology Advisory Committee (2001) Digital libraries: universal access to human knowledge. Report to the President, February 2001
28. Wang Y, Vassileva J (2003) Trust and reputation model for P2P networks. In: 3rd international conference on peer-to-peer computing (P2P’03), 1–3 September 2003
29. Xiong L, Liu L (2003) A reputation-based trust model for peer-to-peer ecommerce communities. In: Proc. 2003 IEEE conference on e-commerce (CEC’03)
30. Yu B, Singh MP (2003) Detecting deception in reputation management. In: ACM AAMAS’03, 14–18 July 2003
