ACO-GA approach to paper-reviewer assignment problem in CMS

June 24, 2017 | Autor: Dariusz Król | Categoria: Multi Agent Systems and Technologies
Share Embed


Descrição do Produto

ACO-GA Approach to Paper-Reviewer Assignment Problem in CMS Tomasz Kolasa1 and Dariusz Kr´ol2 1

Faculty of Computer Science and Management, Wroclaw University of Technology, Poland [email protected] 2 Institute of Informatics, Wroclaw University of Technology, Poland [email protected]

Abstract. Conference management, which requires proper coordination and international communication, is a complex task. There are many Conference Management Systems (CMS) which can be used to carry out the conference. Nevertheless, a convenient to use, free and effective module for automatic paper-reviewer assignment is still not available. Searching for best assignments relying only on common paper-reviewer topics not always will give good solutions. This paper proposes an approach, that uses the reviewers responses information, to tune the solution up. Proposed algorithm combines genetic algorithms (GA) and ant colony optimization (ACO), to quickly find good solutions. The experiment results confirm the superiority of proposed algorithm.

1

Introduction and Motivation

An intelligent system is needed to carry out an international conference, which requires the cooperation of several different people, see [1] and [3]. Ensuring proper coordination and communication requires the system to support several distinct paper revision phases: submission, evaluation, decision, and final paper collection. After the deadline for authors to submit their papers has passed, the evaluation phase begins and reviewers are asked by the PC chairs to review a number of papers. After the reviews are ready, a decision to accept or reject each paper is made and authors are asked to send a final version of their paper if it was accepted. After all final versions of accepted papers are sent, the papers are collected and printed in the conference proceedings. The quality of the published papers strongly depends on the evaluation phase. Selecting a proper reviewer to review a particular paper is not an simple task. This assignment process, done manually, can be a tedious and vulnerable job. There are many aspects that must be taken into account when proposing an automated solution for paper-reviewer assignment task, such as: reviewers tastes, their capacity constraints, adequate numbers of reviews for papers, conflicts of interest [2]. There is a lot of conference management systems that propose an automated solution for paper-reviewer assignment. Many of them, like EasyChair, present

2

Tomasz Kolasa and Dariusz Kr´ ol

an explicit ,,bidding” phase and utilize the collected data for the optimization of paper-reviewer assignment. In this phase the reviewers are asked to declare their interests in reviewing particular papers (rank the papers from low to high interest). However, it is important to remember that the system ought to support the PC chair in their task and not to hand over a great part of their work to the reviewers. Many of them do not tend to have a good opinion on the ,,bidding” phase. Nevertheless, they should have a possibility to express their preferences over particular papers. But the motivation is to design and present the robust algorithm that will produce good solutions over topical information distribution, without much exploitation of the reviewers time. The topical or keywords data are considered more accurate to predict well-matched paper-reviewer pairs [2]. The GRAPE system [4] uses topical information predominantly, but also supports the ,,bidding” phase. The distribution of reviewers and papers over topics is not always predictable. So the preferences supplied by reviewers are used for getting the wrinkles out of topical assignments. However, this inconvenient and time consuming ,,bidding” phase often gives inaccurate feedback. The algorithm proposed in this paper tries to fit a conference management business processes in a more natural way. Paper-reviewer assignments are done based only on topical information (paper keywords and reviewer topics of interests are matched). The results are presented to the PC chair. After the PC chair decides the assignments are ready, the reviewers are informed about the proposed reviews. Each reviewer has the possibility to accept or decline the proposed review. There is additional information about the decline response that are supplied by reviewer (conflict of interest, inadequate domain of paper, not enough time). That kind of feedback from reviewers is minimal but significant. The response information is then utilized to reevaluate the first solution and propose a better one. The general idea of the proposed solution is to combine the advantages of GA and ACO, the ability to cooperatively explore the search space and to avoid premature convergence, to quickly find good solutions within a small region of the search space. Both GA and ACO perform very well in assignment problems [8] [11]. The ACO-GA hybrid algorithms have shown better efficiency then ordinary ACO-based algorithm and a classical genetic algorithm [7] [10]. Solution representation in ACO and GA algorithms was based on binary matrix [12].

2

The Paper-Reviewer Assignment Problem

Properly assigning reviewers to corresponding paper can be considered as Maximum Generalized Assignment Problem (GAP3 ). In the conference management domain there is a set R of reviewers (of size m) and a set A of papers (of size n). Any reviewer can by assigned to review any paper, with the profit based on assignment. There must be a defined number of reviewers assigned to review 3

GAP is defined as follows: given a set of box, with capacity constraint and a set of items with different size and value for each box; pack a maximum-valued subset of items into the boxes [6].

ACO-GA Approach to Paper-Reviewer Assignment Problem in CMS

3

a paper. Furthermore, the capacity constraint is given for each reviewer. It is desired to find an assignment in which all reviewers do not exceed their capacity, all papers have requisite number of reviewers and total profit of the assignment is maximized. Moreover, a conflicts of interest and reviewer response are also important constraints. After the paper-reviewer assignment process is completed, all reviewers registered in CMS are notified about reviews assigned to them. The reviewer response ϕ is a reason of reviewer decision to refuse the request for a review. The ϕ response to ensure a necessary feedback describes two kind of answers: ϕ1 — the assigned paper does not fit to reviewer’s area of interests, ϕ2 — the reviewer does not have a time to review assigned paper. The problem is to find a paper-reviewers assignments set, that satisfies a set of constraints Θ: θ1 — the number k > 0 of different reviewers assigned to review a paper is constant for all papers and defined by the user (common values are k = 2 and k = 3). θ2 — the reviewer r capacity lr > 0 is a maximum number of papers that can be assigned to reviewer r. It can be defined for all reviewers separately, the reasonable initial value can be defined as follows: ∀r∈R • lr = ⌈ k·n m ⌉. The capacity lr1 of a reviewer r1 , who gave a response ϕ2 is decreased by 1. To preserve an overall capacity of reviewers, the capacity lr2 of a random reviewer r2 , who did not gave a response ϕ2 is increased by 1. θ3 — the value bar = true (bar ∈ {true, f alse}) indicates that as assignment of reviewer r to paper a is possible, due to lack of conflicts of interest and ϕ1 response of reviewer r to paper a. To have a clearer view on the paper-reviewers assignment problem in multiagent based conference management system, the conference itself can be described in terms of organization. Which consist of set of agents who play roles within a structure that defines the relationships between the various roles. The goal of organization is achieved by cooperating agents trying to achieve their own subgoals. The work [3] defines a metamodel of organization by goals, roles, agents, capabilities, assignments, policies, and a domain model. Capabilities are central to the process of determining which agents can play which roles and how well they can play them, while policies constrain the assignment of agents to roles thus controlling the allowable states of the organization. The conference organization model can be defined by: Goals — the organization tries to achieve a set of goals, which can be assigned to individual agents. The goals which are crucial for paper-reviewers assignment is a specific paper evaluation. These goals can be achieved by agents playing the reviewer role. Roles — conference management system specific roles like: author, reviewer, PC member.

4

Tomasz Kolasa and Dariusz Kr´ ol

Capabilities — capabilities of agent (in reviewer role) of being assigned to specific goal. Policies — constrains on assignments, the representation of previously presented Θ set.

3

Proposed ACO-GA Algorithm

The proposed solution for paper-review assignment problem combines the advantages of GA and ACO. This two algorithms complement each other in order to find good assignments and avoid premature convergence. In each iteration of ACO-GA algorithm, the best solution of the two algorithms is selected. The search by ACO and GA algorithms continues based on that collaboratively chosen solution. The fitness function represents the overall quality of a solution as mean of assignments qualities. The assignment quality degree bases on a concept of searching for reviewer-paper pairs that are the most coincide in their keywords based characteristics and maximizing the coverage degree, that is a number of paper keywords that are covered by assigned reviewers keywords. The adjustment function U and the coverage function V is presented. Both functions define a quality of assignment {a, r}4 . The adjustment function U implies a degree of keywords coverage between the reviewer r and paper a: U (a, r) =

|K(r) ∩ K(a)| |K(a)|

(1)

where KC — a keywords set defined in the conference system, K(a) — returns a keywords set of paper a (K(a) ⊂ KC ), K(r) — returns a keywords set of reviewer r (K(r) ⊂ KC ). The coverage function V defines a paper a coverage intensity, that is proportional to a number of paper a keywords that are covered by at least one assigned reviewers keyword: ∪ |K(a) ∩ r∈Ra •K(r)| V (a) = (2) |K(a)| where Ra — a set of reviewers assigned to review a paper a. Than the assignment {i, j} quality degree can be defined as suitability function f : f (i, j) = γ · U (i, j) + (1 − γ) · V (i) (3) where γ — a algorithm parameter that determine the relative importance of adjustment function information and the coverage function value. 4

{a, r}: reviewer r is assigned to review a paper a

ACO-GA Approach to Paper-Reviewer Assignment Problem in CMS

Respectively the fitness function F for solution S is defined as follows: ∑ {i,j}∈S f (i, j) F (S) = |S|

4

5

(4)

Ant Colony Optimization Algorithm

The Ant Colony Optimization can be applied to optimization problems that can be described by graphs [5]. The idea is to use nature inspired artificial ants to search for a minimum cost path in a weighted graph. Here, in the paper-reviewers assignment problem graph, the nodes represent papers and reviewers, with the edges between them indicating the assignment. In contradiction to traveling salesman problem (TSP) there are two kind of nodes: the reviewers and the papers, with edges constrained to paper-reviewer connections. The paper-reviewers assignment problem graph can be defined as G := (V, E) with nodes V = A ∪ R and edges E ⊆ {{u, v} : u ∈ A ∧ v ∈ R}. In this paper an ACO algorithm that go through a not fully connected construction graph is proposed. The search for the best paper-reviewers assignments is done by an ant traversal through the graph, where all Θ constraints must be met and the traversal stopping criterion satisfied (e.g. all papers A have exactly k reviewers assigned). Each ant h has a memory of traversed edges implemented as tabu list. The tabu list contains edges that the ant h can not traverse. The knowledge about traversed edges is essential to get away with repeated assignments. The fact that an ant will not traverse the edges specified in the tabu list can be utilized to ensure θ3 constraint. To do so, the tabu list is initialized with all assignments {a, r} that fulfill bar = f alse. The ant population of size g traverse the paper nodes and reviewer nodes interchangeably. The probability that the hth ant assigns reviewer node to paper node is given by:  β α τij (t)·ηij (t)   ∑ if i ∈ allowedhj (t) ∧ j ∈ R  α (t)·η β (t)  τ h aj aj a∈allowed (t)  j     β α h τij (t)·ηij (t) pij (t) = ∑ (5) if i ∈ A ∧ j ∈ allowedhi (t)  α (t)·η β (t)  τ h ir ir  r∈allowedi (t)       0 otherwise where τij — the pheromone value associated with assignment (or graph edge) {i, j}, ηij — he heuristic value associated with assignment {i, j}, in principle ηij = f (i, j) where f is suitability function defined in in Eq. 3, allowedi — set of feasible nodes currently not assigned to i node, defined in Eq. 6, α,β — positive real parameters whose values determine the relative importance of pheromone versus heuristic information.

6

Tomasz Kolasa and Dariusz Kr´ ol

allowedhi (t) =

 / tabuh ∧ |Ra | < k} if i ∈ R  {a ∈ A : {a, r} ∈ 

(6) {r ∈ R : {a, r} ∈ / tabuh ∧ |Ar | < lr } if i ∈ R

where Ra — a set of reviewers assigned to review a paper a, Ar — a set of papers assigned to reviewer r. The ant in search for optimal assignments traverses only feasible nodes to find a good solution that satisfies the Θ constraints. In other words, the ant h placed in node i can traverse to some node j when j ∈ allowedi . In cases when the are no such nodes available∑(allowedhi ⊆ ∅) the ant gets stuck. It is fine when the stop criterion is met ( a∈A |Rah | = k), but otherwise the reviewers for remaining papers must be found. If an ant got stuck in some node i, it is moved to different node u that u ∈ allowedi . The action of moving the ant from one node v1 ∈ V1 to other node v2 ∈ V2 when V1 = V2 , has been called the jump. There are possible two kind of jumps: ω1 — when the ant h is in node i ∈ A and |Rih | = k, the jump to node u ∈ argmina∈A |Rah | occurs, ω2 — when the ant h is in node j ∈ R and |Ahj | = lj , the jump to node v ∈ argminr∈R |Ahr | occurs. The pheromone evaporation is essential for exploration of ants around the optimal solution in the following iterations. The quantity of pheromone, deposited by ants, is big if the generated solution is feasible. On the other hand, this value h is set to low if solution is infeasible. Therefore, the quantity of pheromone ∆τij deposit by ant h depends on the solution quality:

h ∆τij (t) =

  F (S h (t)) if {i, j} ∈ S h (t) 

(7) 0

otherwise

where S h (t) — a set of assignments found by ant h in time t, F (S) — the fitness function value for solution S, Eq. 4. After all ants have completed their assignments the global rule of pheromone update from Eq. 8 is applied to all the edges. The ants updates the pheromone based on their solution quality, but the best ant deposits additional pheromone on edges of the best solution. τij (t + 1) = (1 − ρ) · τij (t) +

m ∑

h b ∆τij (t) + ∆τij (t)

h=1

where ρ — ρ ∈ (0, 1] is parameter called evaporation rate, b — indicates the ant with best solution at each iteration.

(8)

ACO-GA Approach to Paper-Reviewer Assignment Problem in CMS

5

7

Genetic Algorithm

The GA implementation is using two-dimensional binary chromosome representing the paper-reviewer assignments. The initial population is randomly generated and keeps the Θ constraints. The selection of the individuals is done according to the roulette wheel method [9]. The offsprings that keeps the Θ constraints are generated using special two-argument crossing operator. The parental population is replaced by the new offsprings. Afterwards, the one-argument mutation operator is applied to some randomly picked offsprings. The best individual within the population is maintained over populations. Each chromosome is encoded as a binary two-dimensional matrix C of n×m. Each row of this matrix corresponds to one paper, and each column to one reviewer. Cij = 1 represents an assignment of reviewer Rj to a paper Ai . Accordingly, Cij = 0 indicates that reviewer with column number j is not assigned to paper with row number i. The encoding method of two-dimensional binary chromosome fits well to ACO algorithm in implementation basis. So using the two-dimensional binary matrix also as ACO graph representation fits well to ACO-GA combination, as there is no need for any solution re-representation. The initial population is partly random and partly generated to keep the constraints Θ. The first step is to randomly generate chromosomes that keep the θ1 constraint. Next, some part of this generated population is improved to keep also the θ2 and θ3 constraints. The initial population that keep all the constraints helps to find a good solution quickly and the randomness enables to avoid premature convergence. The procedure to generate random chromosome C that keeps the θ1 constraint is defined as follows: forAll a ∈ A forAll i ∈ [1..k] get r ∈ R \ Ra assign: Car = 1 The fix procedure that improves a chromosome to keep the θ2 and θ3 constraints is presented. It is also a part of two-argument and one-argument operators. This enforces the GA algorithm to seek the solution in area of search space that keeps all the constraints. The fix procedure unassigns papers from reviewers who have the greatest number of assigned papers and then assigns papers to reviewers with the least number of papers. The procedure for chromosome C is defined as follows: while θ2 not met get rmin ∈ arg minr ∈ R |Ar | get rmax ∈ arg maxr ∈ R |Ar | get a ∈ Armax get e ∈ {p : p ∈ A \ Armin ∧ bprmin = true} unassign: Carmax = 0 assign: Cermin = 1

8

Tomasz Kolasa and Dariusz Kr´ ol

The two-argument crossing operator invoked on parental individuals P1 and P2 takes following steps to produce the offspring: Step 1: based on P1 i P2 individuals, make a temporary chromosome T = P1 ∨ P2 Step 2: initialize empty offspring O1 and O2 , and proceed: for i = 1 to n R1 = random reviewers(k, Ti )5 R2 = random reviewers(k, Ti ) O1i ← R1 6 O2i ← R2 Step 3: execute fixing procedure 5 on offspring The one-argument mutation operator invoked on individual C takes fallowing steps to produce the new, changed chromosome: Step Step Step Step Step

6

1: 2: 3: 4: 5:

get random paper a get random reviewer r assigned to paper a get random reviewer e not assigned to paper a switch reviewers: Car = 0, Cae = 1 execute fixing procedure 5 on individual C

Evaluation and Conclusion

The system was evaluated on real-world conference data (KN S) and automatically generated datasets (G1 and G2 ). The real-world data comes from the academic conference KNS held in 2009 at Wroclaw University of Technology. Characteristics of these datasets are presented in Tab. 1. Table 1. Characteristics of evaluation data |A| |R| KN S 179 64 G1 180 40 G2 200 70

|KC | min|K(a)| max|K(a)| min|K(r)| max|K(r)| FB 80 1 7 1 15 0.560 30 2 4 2 4 0.901 35 3 11 3 11 0.973

|A| - number of papers; |R| - number of reviewers; |KC | - number of keywords; min|K(a)| - minimal number of keywords per paper; max|K(a)| - maximal number of keywords per paper; min|K(r)| - minimal number of keywords per reviewer; max|K(r)| - maximal number of keywords per reviewer; FB - best possible fitness

To evaluate the proposed algorithm capabilities, the computational experiments were carried out. The ACO-GA, ACO and GA algorithms were repeated for 10 times each. The real-world assignments done by the PC chair and mean results of algorithms processing are presented in Tab. 2. 5

6

select k random reviewers assigned to paper i in chromosome T and memorize as set R1 assign all reviewers from set R1 to paper i in offspring chromosome O1

ACO-GA Approach to Paper-Reviewer Assignment Problem in CMS

9

Table 2. Results of ACO-GA, ACO and GA algorithms evaluation Ω manual ACO

0.548 0.832 ± 0.001 0.743 ± 0.003 GA 0.828 ± 0.002 0.850 ± 0.001 ACO-GA 0.893 ± 0.001 0.905 ± 0.001 0.748 ± 0.005 GA 0.881 ± 0.003 0.926 ± 0.001 ACO-GA 0.945 ± 0.001 ACO

0.869 ± 0.001 0.806 ± 0.002 GA 0.874 ± 0.001 0.883 ± 0.001 ACO-GA 0.909 ± 0.001 ACO

F

U KN S 0.307 0.251 0.466 ± 0.001 0.415 ± 0.001 0.416 ± 0.003 0.325 ± 0.004 0.464 ± 0.002 0.382 ± 0.002 0.476 ± 0.001 0.422 ± 0.001 0.500 ± 0.001 0.440 ± 0.001 G1 0.849 ± 0.001 0.757 ± 0.002 0.702 ± 0.005 0.518 ± 0.006 0.798 ± 0.003 0.630 ± 0.004 0.844 ± 0.001 0.746 ± 0.002 0.886 ± 0.001 0.784 ± 0.002 G2 0.858 ± 0.001 0.776 ± 0.002 0.796 ± 0.002 0.647 ± 0.003 0.842 ± 0.001 0.716 ± 0.002 0.860 ± 0.001 0.769 ± 0.002 0.897 ± 0.001 0.814 ± 0.002

V

time[s]

iter

0.254 0.517 ± 0.002 0.507 ± 0.003 0.546 ± 0.003 0.531 ± 0.001 0.559 ± 0.002

86.0 ± 0.4 100.7 ± 0.6 189.0 ± 1.8 12.3 ± 0.1 176.8 ± 0.4

500 500 1000 50 500

0.941 ± 0.001 0.885 ± 0.004 0.965 ± 0.003 0.943 ± 0.002 0.988 ± 0.001

76.2 ± 0.2 65.7 ± 0.6 126.9 ± 0.3 9.2 ± 0.1 133.9 ± 0.5

500 500 1000 50 500

0.940 ± 0.002 0.945 ± 0.002 0.968 ± 0.001 0.951 ± 0.001 0.981 ± 0.001

108.6 ± 0.1 84.5 ± 0.8 162.5 ± 0.7 13.1 ± 0.1 197.6 ± 0.6

500 500 1000 50 500

To ensure understandable evaluation, a best possible fitness value FB and measurement Ω of the accuracy rate of the proposed solution are given as follows: Ω(S) =

F (S) FB

(9)

where F (S) — fitness of solution S, FB — a fitness value of naive solution B, to construct such solution, each paper has k best fitting reviewers (considering only the adjustment U defined in Eq. 1) iteratively assigned, the B solution satisfies only θ1 constraint. ACO and GA parameter values used in experiments, are defined as follows: number of reviewers per paper k = 2; adjustment relative importance γ = 0.5; GA population size: 200 crossing probability: 0.8; mutation probability: 0.2; ACO ants number: 100; pheromone importance α = 4; heuristic information importance β = 1; pheromone evaporation rate ρ = 0.1. All parameter values were determined empirically. Parameter optimization can be considered as a topic for future research. The experiment results presented in Tab. 2 confirm the superiority of proposed ACO-GA algorithm. All tested algorithms have determined good solutions. However, the combination of two population based algorithm is faster and finds better results that ACO and GA used separately.

10

Tomasz Kolasa and Dariusz Kr´ ol

The ACO-GA algorithm, searching for optimal paper-reviewer assignments was proposed and evaluated on real-world conference dataset. For future research, the proposed solution will be evaluated on different conference datasets and embedded in Web conference management system to check not only algorithm robustness, but also end-user satisfaction when will be used during a conference. The embedded version of algorithm will be reimplemented to perform better in multiprocessor environment. Also, proper algorithm reactions for reviewers responses need to be evaluated.

References 1. Benferhat, S., Lang, S.: Conference Paper Assignment. International Journal of Intelligent Systems 16(10), 1183–1192 (2001) 2. Conry, D., Koren, Y., Ramakrishnan, N.: Recommender Systems for the Conference Paper Assignment Problem. ACM Conference On Recommender Systems, 357–360 (2009) 3. DeLoach, S.A., Oyenan, W.H., Matson, E.T.: A Capabilities-based Model for Adaptive Organizations. Autonomous Agents and Multi-Agent Systems 16(1), 13–56 (2008) 4. Di Mauro, N., Basile, T.M.A., Ferilli, S.: GRAPE: An Expert Review Assignment Component for Scientific Conference Management Systems. Innovations in Applied Artificial Intelligence, 789–798 (2005) 5. Dorigo, M., St¨ utzle, T.: Ant Colony Optimization. MIT Press, Cambridge (2004) 6. Fleischer, L., Goemansy, M.X., Mirrokniy, V.S., Sviridenko, M.: Tight Approximation Algorithms for Maximum General Assignment Problems. Symposium on Discrete Algorithms, 611–620 (2006) 7. Lee, Z.J., Su, S.F., Chuang, C.C., Liu, K.H.: Genetic Algorithm with Ant Colony Optimization (GA-ACO) for Multiple Sequence Alignment. Applied Soft Computing 8(1), 55–78 (2008) 8. Maniezzo, V., Colorni, A.: The Ant System Applied to the Quadratic Assignment Problem. IEEE Transactions on Knowledge and Data Engineering 11(5), 769–778 (1999) 9. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer (1999) 10. Nemati, S., Basiri, M.E., Ghasem-Aghaee, N., Aghdam, M.H.: A Novel ACO-GA Hybrid Algorithm for Feature Selection in Protein Function Prediction. Expert Systems with Applications: An International Journal 36(10), 12086–12094 (2009) 11. Silva, C.A., Sousa, J.M.C., Runkler, T.A.: Rescheduling and Optimization of Logistic Processes Using GA and ACO. Engineering Applications of Artificial Intelligence 21(3), 343–352 (2008) 12. Yang, J., Luo, Z.: Coalition Formation Mechanism in Multi-agent Systems Based on Genetic Algorithms. Applied Soft Computing 7(2), 561–568 (2007)

Lihat lebih banyak...

Comentários

Copyright © 2017 DADOSPDF Inc.