A system for automatic HPV typing via PCR-RFLP gel electrophoresis


2011 IEEE Conference on Automation Science and Engineering Trieste, Italy - August 24-27, 2011

FrBB.1

A System for Automatic HPV Typing via PCR-RFLP Gel Electrophoresis

Christos F. Maramis, Student Member, IEEE, Anastasios N. Delopoulos, Member, IEEE, Alexandros F. Lambropoulos, and Sokratis P. Katafigiotis

Abstract: The identification of the types of the human papillomavirus (HPV) that have infected a female patient provides valuable information regarding her risk of developing cervical cancer. A widely used method for performing this task (namely HPV typing) is PCR-RFLP gel electrophoresis. However, the conventional HPV typing protocol is error-prone and resource-ineffective due to the lack of interaction between the phases involved in it. In order to treat these shortcomings, we introduce a novel HPV typing system that can be built upon widely available laboratory equipment. The proposed workflow of the system automates the task of HPV typing via PCR-RFLP gel electrophoresis. The proposed methodology is evaluated as a proof of concept via an experiment that emulates the operation of the introduced system on a set of real HPV data.

C. Maramis and A. Delopoulos are with the Information Processing Laboratory – Multimedia Understanding Group, Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, 54124 GREECE (e-mail: [email protected], [email protected]). A. Lambropoulos and S. Katafigiotis are with the Laboratory of Molecular Biology, 1st Department of Obstetrics and Gynecology, General Regional Hospital Papageorgiou, Aristotle University of Thessaloniki, Thessaloniki, 54124 GREECE (e-mail: [email protected], [email protected]).

978-1-4577-1732-1/11/$26.00 ©2011 IEEE

I. INTRODUCTION

According to recent epidemiological studies, the human papillomavirus (HPV) is considered to be the causal factor of cervical cancer [1], [2]. For this reason, the detection of HPV in the cervical cells of a female patient provides an indication of her probability of developing this cancer type. The significance of HPV detection is underscored by the high frequency of cervical cancer, which is one of the leading cancers affecting women worldwide [3]. However, HPV does not appear in only one form: currently, over 40 HPV types (i.e., variants of the virus characterized by different genotypes) that infect the anogenital tract have been discovered. Moreover, virologists have classified these types into four discrete categories with respect to their associated risk for the development of cervical cancer [4]. Due to the diversity among the type-specific risks, the identification of the exact HPV type(s) that have infected a female patient provides her medical practitioner with valuable prognostic information. The procedure of identifying the infecting HPV types based on their genotypic differences is called HPV genotyping or, more simply, HPV typing, and it is currently performed by a variety of molecular biology methods: reverse hybridization assays [5], DNA microarrays [6], and DNA sequencing [7], to name a few. Among them, PCR-RFLP gel electrophoresis [8], [9] is the method of choice for the majority of molecular laboratories worldwide, due

to its simplicity, cost-effectiveness, and moderate equipment requirements. The essence of this method is the digestion of the viral DNA into fragments of known lengths, which are next used to identify the HPV types. Although almost all HPV typing methods have been automated with the help of appropriate specifically-designed devices (e.g., see [10], [11]), this has not been the case for PCR-RFLP gel electrophoresis. Instead, HPV typing via the latter method is still performed manually in two phases with well-defined boundaries between them. The first phase involves the in vitro processing of a cervical tissue sample that has been collected from a subject, and it is followed by an in silico phase, where the outcome of the first phase is analyzed with the help of appropriate software in order to reach to a typing decision. However, as we will demonstrate here, an efficient HPV typing system can be built should interaction be introduced between the two phases. On top of it, the proposed HPV typing system is cost-effective, since it employs general-purpose equipment that is already available in the majority of molecular biology laboratories worldwide. This is opposed to the already automated HPV typing methods, which require specialized hardware and/or consumables (e.g., DNA microarrays and microarray scanners). The rest of the paper is structured as follows: In the next section we explore the state-of-the-art in the field of HPV typing via PCR-RFLP gel electrophoresis. Then, Section III introduces the proposed HPV typing system, presenting the system components and the associated HPV typing workflow. In Section IV two algorithms that are required at specific steps of the proposed methodology are presented in more detail. The proof-of-concept of our approach is evaluated in Section V. Finally, the conclusions of this work are drawn in Section VI. II. 
BACKGROUND In this section, we start by describing the conventional protocol for HPV typing via PCR-RFLP gel electrophoresis, which we call the single image protocol. Then, we present a recent related work, which constitutes an important component of the proposed system. Finally, we list the shortcomings of the single image approach that remain unresolved after the adoption of the aforementioned related work, and are tackled by the proposed system. A. The Single Image HPV Typing Protocol The in vitro phase starts with the collection of a cervical tissue sample and the extraction of the contained DNA.


Next, the polymerase chain reaction (PCR) [12, Ch. 24] amplifies a highly conserved region of HPV’s L1 gene sequence. After that, a predefined restriction enzyme digests the amplified viral DNA at sites that are characterized by a specific nucleotide sequence. This is the restriction fragment length polymorphism (RFLP) analysis [12, Ch. 50], the cornerstone of the discussed method; it produces for each HPV genotype a set of DNA fragments whose lengths in base pairs (bp) are known a priori. The fragment length pattern (FLP) that results from the digestion of each virus genotype serves as its signature throughout the typing process. The next step is the gel electrophoresis [12, Ch. 5]. First, the digested PCR product is stained with a fluorescent dye (ethidium bromide) and an appropriate solution of the stained DNA is injected into an individual well at the front end of a gel matrix. Then, in the presence of an electric field, the negatively-charged DNA fragments are forced to move with different mobilities (i.e., drift velocities) against the electric field and toward the anode. During the electrophoresis, the larger molecules remain closer to the well, while the more agile smaller molecules cover a much larger distance. This way, one lane starting from each well is formed; each lane contains concentrations of DNA of the same length, shaped as bands that extend in the direction perpendicular to the electric field. On each gel, one or more wells are reserved for DNA ladders, i.e., DNA fragments of known lengths. After the completion of electrophoresis, the gel matrix is excited by UV light, causing the ethidium bromide molecules to fluoresce. This way, the viral DNA on the gel becomes visible and can be captured by a common digital camera. The acquisition of a digitized image of the gel matrix (see Fig. 1 for an example) completes the in vitro phase. In the in silico phase, an expert biologist analyzes the single acquired image with the help of appropriate software. 
The essence of this analysis is the fact that the intensity of the image at some position can be related to the viral DNA concentration (viral load) at the corresponding position on the gel matrix. First, the fragment lengths that correspond to the observed bands on a lane of interest are estimated by a software application. This is achieved with the help of a virtual marker that associates positions along the electrophoresis axis with fragment lengths; this marker is constructed via interpolation from the bands of the image’s ladder(s). Finally, the biologist manually compares the set of estimated fragment lengths in the investigated lane with the FLPs of all HPV types in order to judge which type or combination of types has produced the observed pattern of fragment lengths (HPV typing decision). B. Recent Progress Although there are many software applications that automate the first step of the in silico phase, i.e., the estimation of the fragment lengths that correspond to the observed lane bands [13]–[17], respective efforts for the actual HPV typing decision process (i.e., the phase’s last step) have been missing. Recently, a methodology for automating the above process was introduced in [18]; this methodology has further

[Figure 1 annotations: LANE, LADDER, BAND; ladder bands labeled 200bp, 160bp, 120bp, 80bp]

Fig. 1. Typical image of a gel matrix after one-dimensional electrophoresis. Four lanes that correspond to cervical samples and one DNA ladder are depicted. Samples of lanes, bands and ladder are enclosed in rectangles.

evolved and has been extensively evaluated in [19]. Since this constitutes an important stepping stone for the herein proposed system, the rest of this section is occupied by a brief description of the aforementioned methodology. First, the background intensity of the gel electrophoresis image is estimated and the result is subtracted from the observed image intensity. Then, an one-dimensional curve that aggregates the intensity information of an investigated lane across the perpendicular to the electrophoresis axis is extracted. This is the intensity profile of the lane and consists of several (let us assume K) bell-shaped bands. The methodology of [18] introduces an appropriate observation model, m(·), to describe the above intensity profile as the superposition of K appropriately-shaped peak functions:

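$$m(x; \mathbf{A}, \boldsymbol{\beta}, \boldsymbol{\gamma}, \mathbf{x}) = \sum_{i=1}^{K} A_i \cdot \exp\left(-\frac{1}{\gamma_i}\left|\frac{x - x_i}{\beta_i}\right|^{\gamma_i}\right), \tag{1}$$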

where x denotes the position in the electrophoresis axis, A = [A1 , A2 , . . . , AK ] and β, γ, x are defined accordingly. The optimization procedure for fitting the observation model to the intensity profile is described in [18], [19]. A method for selecting the value of K is described also in [18]; however, here we will employ an alternative method for dealing with this issue (see Section IV-B). An appropriate function to model the relation between the lengths (l) of the DNA fragments and their positions along the electrophoresis axis (x as above) is also introduced: x = d(l; µ) = µ1 + µ2 log2 (µ3 + µ4 l + l2 ).

(2)
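As an illustration of the observation model in (1), the following minimal Python sketch evaluates the superposition of K peak functions at a given position. The function name `observation_model`, the argument name `centers` (standing for the band positions x_i), and the demo parameter values are assumptions made for this example only; the fitting procedure of [18], [19] is not reproduced here.

```python
import math

def observation_model(x, A, beta, gamma, centers):
    """Evaluate the K-peak intensity model of Eq. (1) at a single position x.

    A, beta, gamma and centers are equal-length sequences holding the
    amplitude, width, shape and position parameters of the K bands.
    """
    total = 0.0
    for a_i, b_i, g_i, x_i in zip(A, beta, gamma, centers):
        z = abs((x - x_i) / b_i)                     # normalized distance to band i
        total += a_i * math.exp(-(z ** g_i) / g_i)   # contribution of band i
    return total

# Hypothetical two-band lane sampled at 100 positions (illustrative values).
demo_profile = [
    observation_model(x, A=[2.0, 1.0], beta=[3.0, 3.0],
                      gamma=[2.0, 2.0], centers=[20.0, 60.0])
    for x in range(100)
]
```

With a shape parameter of 2, each peak reduces to a Gaussian whose standard deviation equals the corresponding width parameter, which makes the sketch easy to sanity-check against familiar curves.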

The length-position relation is calibrated by exploiting the information that is provided by the bands in the DNA ladder(s) of the examined image; the optimization procedure required for the calibration is detailed in [18], [19]. The suitability of the formulas in (1) and (2) for modeling the intensity profile and the length-position relation respectively has been evaluated extensively by the experiments that have been conducted on real HPV data in [19]. With the help of these models, the methodology of [18] is able to estimate the fragment lengths (li ) and concentrations (ci ) that correspond to the bands of the investigated lane as follows:

$$l_i = d^{(-1)}(x_i; \boldsymbol{\mu}), \tag{3}$$

$$c_i = \frac{2\,\gamma_i^{1/\gamma_i}\,\beta_i\,\Gamma(1/\gamma_i)}{l_i\,\gamma_i} \cdot A_i, \tag{4}$$

for each i = 1, . . . , K, where Γ(·) is the complete gamma function. This concludes the first step of the methodology, namely fragment information extraction. The aim of the second step, namely virus typing algorithm, is to decide which combination(s) of HPV types provide the best quantitative explanation for the extracted fragment information. First, the compatibility of each HPV type with the estimated fragment lengths l = [l1 , . . . , lK ] is checked individually, so as to eliminate the incompatible types. Then, each combination of the remaining HPV types is tested for its ability to generate the estimated fragment concentrations c = [c1 , . . . , cK ] via an optimization procedure (see [18]). The final typing decision is based on both the optimization results and the prior probabilities of the HPV type combinations. An alternative approach to testing the combinations of compatible HPV types is presented in [19]. C. Current Shortcomings The HPV typing methodology of [18] manages to tackle several problems that are associated with PCR-RFLP gel electrophoresis. However, even with its help, the conventional single image protocol, which was described in Section II-A, suffers from certain shortcomings. The main cause of these shortcomings is the lack of interaction and synchronization between the in vitro and the in silico phase: the sole input to the second phase is a single image that has already been taken in the first phase. As we will demonstrate in the subsequent sections, the proposed automatic HPV typing methodology treats these shortcomings by shifting away from the single image approach. The main shortcomings to be tackled are the following: a) Insufficient background intensity information: The efficient removal of the background intensity from the examined image is critical if the fragment concentration information is to be employed [18]–[20]. 
However, the information that is provided by the single image approach to the second typing phase does not suffice for accurate background subtraction, since the observed intensity of the acquired postelectrophoresis image is the product of interference between the background intensity and the intensity that is produced by the viral DNA. b) Inaccurate fragment concentration estimation: The accurate estimation of fragment concentration is very important for the methodology of [18], and it has been the reason for introducing the observation model of (1). However, the desired accuracy can be achieved only if the bands of a lane lie adequately far from each other. In the opposite case, i.e., when there is extensive overlapping between two bands, the situations that are depicted in Fig. 2 can occur. If the observed intensity profile on the left part of Fig. 2 is denoted by i(x), then a parameter vector [A0 , β0 , γ0 , x0 ] can be found such that

Fig. 2. Two discrete cases of extensive band overlapping. For each case, the underlying overlapping bands are depicted with non-continuous lines, while the resulting intensity profile is depicted with a continuous line.

$$i(x) \simeq A_0 \cdot \exp\left(-\frac{1}{\gamma_0}\left|\frac{x - x_0}{\beta_0}\right|^{\gamma_0}\right),$$

for each x along the electrophoresis axis. This means that the superposition of the two bands will be confused by the observation model with a single band. Another type of problem is caused by the band overlapping on the right part of Fig. 2. This time, the observed intensity profile i(x) is not confused with a single band, but there exists more than one combination of parameter vectors [A1, β1, γ1, x1] and [A2, β2, γ2, x2] such that

$$i(x) \simeq A_1 \cdot \exp\left(-\frac{1}{\gamma_1}\left|\frac{x - x_1}{\beta_1}\right|^{\gamma_1}\right) + A_2 \cdot \exp\left(-\frac{1}{\gamma_2}\left|\frac{x - x_2}{\beta_2}\right|^{\gamma_2}\right),$$

for each x along the electrophoresis axis. In other words, the observation model cannot decide how to distribute the observed concentration to the underlying bands. c) Resource-ineffectiveness: The duration of the gel electrophoresis has to be decided carefully. Short electrophoresis duration will probably lead to intense band overlapping (see Section II-A), complicating and sometimes falsifying the typing decision. On the other hand, unnecessarily long electrophoresis durations contribute to the ineffective use of the laboratory resources and decrease the throughput of the HPV typing method. However, in order to circumvent possible band overlapping problems, the single image protocol usually exaggerates the electrophoresis duration. III. T HE PROPOSED HPV TYPING SYSTEM The HPV typing system that we introduce in this work is able to automatically make HPV typing decisions while the gel electrophoresis is still in progress. This is achieved owing to (i) the introduction of interaction between the in vitro and the in silico typing phase and (ii) the adoption of the methodology presented in [18]. A. System Components The proposed system can be built from common, generalpurpose devices that belong to the standard equipment of most molecular biology laboratories worldwide, and it requires only minor adjustments/modifications to integrate the

[Figure 3: block diagram showing the Image Acquisition System (camera, fluorescence chamber, UV light source, gel), the Electrophoresis Device, and the Computer System, linked by "image acquisition request/data" and "start/stop electrophoresis".]

Fig. 3. The components of the proposed HPV typing system and the flow of information between them.

employed devices into a single system. As it can be observed in Fig. 3, the proposed system consists of three components: Electrophoresis Device. This is the device where the electrophoresis of the gel matrix takes place (see Section II-A). It is configurable with respect to the main electrophoresis parameters and provides the computer system with an interface to initiate/terminate the electrophoresis. Image Acquisition System. This component includes a fluorescence chamber, a UV light source set on the chamber ground, and a digital camera attached to the chamber ceiling. The electrophoresis device (including the gel matrix) is placed inside the fluorescence chamber, which ensures controlled illumination conditions, and the UV light excites the fluorescent dye contained in the gel to make the viral DNA visible. The digital camera is configurable with respect to imaging parameters (e.g., exposure time, focus, aperture); it acquires an image of the visible gel upon request by the computer system and sends the image back to the computer system. Computer System. The computer system bears the software that (i) orchestrates the entire HPV typing process (e.g., by requesting gel images from the digital camera and by retrieving them when they are acquired), and (ii) makes the HPV typing decisions. The integration of the employed components is facilitated significantly by the fact that most modern electrophoresis devices are by design UV transparent. Hence, all that is required is to fix the electrophoresis device on top of the UV light source inside the fluorescence chamber, and to ensure a power supply (external or internal) for the electrophoresis. B. The Proposed HPV Typing Procedure The proposed HPV typing procedure is outlined by the flowchart of Fig. 4. It is assumed that the following parameters regarding the examined gel matrix are known: (i) the number of HPV-related lanes (N ), and (ii) the number (M ) and relative positions of the DNA ladders. 
Moreover, a time step t0 has been defined. Once the viral DNA has been prepared according to the description of Section II-A and injected into the wells of the

[Figure 4: flowchart. Step 1: capture gel image; step 2: locate lanes; step 3: start electrophoresis; step 4.1: capture gel image at time t = k · t0; step 4.2: subtract background; then, for each lane n = 1, ..., N: step 4.3.n: extract intensity profile; step 4.4.n: not typed & typing suitable? (if no, skip the lane); step 4.5.n: estimate fragment lengths & concentrations; step 4.6.n: apply typing algorithm; all lanes typed? (if no, repeat from step 4.1; if yes, step 5: stop electrophoresis).]

Fig. 4. The flowchart of the proposed HPV typing procedure.

gel matrix, the gel is placed on the electrophoresis device inside the fluorescence chamber and an initial image of it ($I_0$) is acquired (step 1). This image serves two purposes. First, it is used to predict the future boundaries of the lanes and ladders (step 2), as described in Section IV-A. Second, since the viral DNA is restricted to only a very small portion of the image (i.e., the well areas), it provides an accurate model for the background intensity component of the subsequent images; this holds owing to the controlled illumination conditions in the chamber. After the above preliminary processing actions, the electrophoresis is initiated (step 3). Then, an iterative procedure commences (step 4). Briefly, it is described as follows: until a typing decision has been reached for all the HPV-related lanes, capture instances of the gel matrix at each time $t = k \cdot t_0$, with the same image acquisition conditions as in $I_0$, and attempt to type each lane individually. This is the main loop of the proposed procedure and, as is clear from its description, the lanes can be typed in parallel. We elaborate on this iterative procedure in the following paragraphs by focusing on the nth lane. When the gel image $I_k$ at time $k \cdot t_0$ has been acquired (step 4.1), its background intensity component is removed by subtracting $I_0$ (step 4.2). Hence, the background-corrected image $\bar{I}_k$ is given by $\bar{I}_k = I_k - I_0$, where we assume pixelwise intensity subtraction. Then, the intensity profile $i_n(\cdot)$ of the nth lane is extracted (step 4.3.n) as follows:

$$i_n(x) = \operatorname{median}_{y}\left[\bar{I}_k(x, y) \mid y_n^l \le y \le y_n^r\right], \quad x = 1, \ldots, S_X, \tag{5}$$

where (i) $S_X$ is the size of the image along the electrophoresis axis X, and (ii) $y_n^l$, $y_n^r$ are the left and right boundaries of the lane along axis Y, which is perpendicular to the electrophoresis axis (see Section IV-A). The objective of the next step (step 4.4.n) is to decide whether the extracted profile $i_n(x)$ is suitable to be analyzed according to the methodology of [18] in order to make a typing decision for the lane. For this purpose, a novel algorithm is employed to decide whether the profile provides sufficient information to apply the aforementioned typing methodology. If $i_n(x)$ is found to be suitable for typing, HPV typing is attempted according to [18] using exclusively the information contained in the intensity profile of the current image. Otherwise, or if the lane has already been typed, $i_n(x)$ is discarded. The typing suitability algorithm is introduced in Section IV-B. The estimation of the lane’s fragment lengths and concentrations (step 4.5.n) coincides with the fragment information extraction procedure of [18], which has been briefly described in Section II-B. Assuming a lane that includes K bands, the models of (1) and (2) are employed to estimate via (3) and (4) the fragment lengths $\mathbf{l}^n = [l_1^n, l_2^n, \ldots, l_K^n]$ and concentrations $\mathbf{c}^n = [c_1^n, c_2^n, \ldots, c_K^n]$ that correspond to these bands. The estimated fragment properties are propagated to the next step (step 4.6.n). This step makes the HPV typing decision for the lane by applying the virus typing algorithm of [18]. A brief description of this procedure has also been given in Section II-B. The outcome of this step is a set of HPV type combinations that provide the best explanation for the estimated fragment lengths and concentrations, when the probability of each type has also been taken into account. The execution of steps 4.x.n for all the N HPV-related lanes of the gel matrix completes one iteration of step 4. 
This procedure is repeated until all lanes are typed. Then, the electrophoresis is terminated (step 5).
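Steps 4.2 and 4.3.n can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: images are plain lists of rows, the row index plays the role of the electrophoresis axis X, the column index plays the role of axis Y, and the lane boundaries are 0-based and inclusive. The function name `intensity_profile` and the demo data are hypothetical.

```python
from statistics import median

def intensity_profile(image_k, image_0, y_left, y_right):
    """Background-correct image_k with image_0 and return the lane profile of Eq. (5).

    image_k, image_0: lists of rows; row index = position x on the
    electrophoresis axis, column index = position y across the lanes.
    y_left, y_right: inclusive column boundaries of the examined lane.
    """
    profile = []
    for row_k, row_0 in zip(image_k, image_0):
        # Step 4.2: pixelwise subtraction of the initial image.
        corrected = [row_k[y] - row_0[y] for y in range(y_left, y_right + 1)]
        # Step 4.3.n: median across the lane width for this position x.
        profile.append(median(corrected))
    return profile

# Hypothetical 5 x 6 images: flat background of 10, one band at x = 2.
demo_i0 = [[10] * 6 for _ in range(5)]
demo_ik = [row[:] for row in demo_i0]
for y in (1, 2, 3):
    demo_ik[2][y] += 5
demo_lane_profile = intensity_profile(demo_ik, demo_i0, y_left=1, y_right=3)
```

Using the median rather than the mean across the lane width makes the profile less sensitive to isolated bright or dark pixels inside the lane.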

Fig. 5. Illustration of application of the lane detection algorithm. On the left, a sample initial image I0 (·) is depicted. On the right, the binary image that results from thresholding at 3 · medianx,y [I0 (x, y)] is shown. The predicted lane boundaries have been superimposed on the right image.
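The thresholding and region selection used for lane detection (Section IV-A) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the image is a list of rows with the column index along axis Y, uses 4-connectivity for the connected regions, and interprets "most extended" as "largest pixel count"; all three choices, as well as the name `detect_lanes`, are assumptions of this example.

```python
from collections import deque
from statistics import median

def detect_lanes(image_0, n_regions, th1=5.0):
    """Return sorted (y_left, y_right) pairs for the n_regions largest bright blobs."""
    rows, cols = len(image_0), len(image_0[0])
    # Threshold at a multiple of the median image intensity.
    threshold = th1 * median(v for row in image_0 for v in row)
    seen = [[False] * cols for _ in range(rows)]
    regions = []  # one (pixel_count, y_min, y_max) tuple per connected region
    for x0 in range(rows):
        for y0 in range(cols):
            if seen[x0][y0] or image_0[x0][y0] <= threshold:
                continue
            # Breadth-first flood fill over 4-connected bright pixels.
            seen[x0][y0] = True
            queue, count, y_min, y_max = deque([(x0, y0)]), 0, y0, y0
            while queue:
                x, y = queue.popleft()
                count += 1
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < rows and 0 <= ny < cols
                            and not seen[nx][ny]
                            and image_0[nx][ny] > threshold):
                        seen[nx][ny] = True
                        queue.append((nx, ny))
            regions.append((count, y_min, y_max))
    # Keep the largest regions (M + N in the paper) and discard noise blobs.
    largest = sorted(regions, reverse=True)[:n_regions]
    return sorted((y_min, y_max) for _, y_min, y_max in largest)

# Hypothetical initial image: two wells plus one isolated noise pixel.
demo_image = [[1] * 12 for _ in range(6)]
for x in (0, 1):
    for y in (1, 2, 3, 6, 7, 8):
        demo_image[x][y] = 50
demo_image[4][10] = 50
demo_lanes = detect_lanes(demo_image, n_regions=2)
```

In this toy image the single noise pixel exceeds the threshold but is dropped because only the two largest regions are kept.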

IV. D ETAILED ALGORITHM DESCRIPTIONS A. Lane detection algorithm Step 2 detects the lane boundaries, or more accurately predicts the future boundaries of the lanes. This task is performed on the basis of the initial gel image I0 and, owing to the predetermined lane shape, reduces to the estimation of a pair of positions on axis Y (perpendicular axis) that bound each lane (left and right boundary). Lane detection is greatly facilitated by the predictable appearance of I0 : The image is formed by bright blobs of DNA that are concentrated in the well areas, while the background – although variable – is characterized by much lower intensities (see the left image in Fig. 5). For this reason, the pixels that correspond to DNA are easily identified by thresholding the image at a certain multiple of its median intensity (i.e., at th1 · medianx,y [I0 (x, y)]). Then, the connected white regions in the resulting binary image (right image in Fig. 5) are detected. Since – as discussed in Section III-B – the number of lanes (N ) and ladders (M ) on the gel are known, we keep the M + N most extended white regions, eliminating this way false noise-related regions. The leftmost and rightmost pixels of the remaining regions are found and their Y -coordinates constitute the algorithm’s estimate of the lane boundaries. B. Typing suitability algorithm The main task of the typing suitability algorithm (step 4.4.n) is to decide whether the concentration information of the lane bands can be inferred accurately enough from a given intensity profile. Since the accurate estimation of the band concentrations is critical for the employed HPV typing methodology of [18], the algorithm’s decision will determine whether the proposed system will proceed with HPV typing (steps 4.5.n – 4.6.n) based on the current profile or it will wait some time for a more suitable profile to be extracted. 
In the case that the current profile is considered to be suitable for typing, the algorithm also estimates the number K and approximate positions xi of the lane bands, which serve as input to the employed typing methodology. If we assume that the background intensity of the profile has been removed efficiently in step 4.2, then the typing suitability decision can be extracted by examining the local extrema of the intensity profile and its first derivative. Let us now take a look at the upper part of Fig. 6, which illustrates the evolution of the intensity profile i(x) resulting from two

overlapping bands, as electrophoresis proceeds. Initially, the overlapping bands appear as a single band; then, they start to separate but still produce a single peak; finally, they are separated enough to form two peaks in the profile. The first derivative curves, i′(x), of the investigated intensity profiles are depicted in the lower part of Fig. 6. Among the described band overlapping “instances”, only the third type is exploitable, i.e., suitable for typing. As discussed in Section II-C, the other two types hinder the accurate estimation of the underlying concentrations and, for this reason, they should be detected by the algorithm. Regarding the detection of the second type of overlapping, we can exploit the following observation: in the other two overlapping types, each pair of consecutive local extrema (maximum and minimum) of the first derivative “surrounds” a local maximum of the profile, whereas an “orphan” pair of extrema is observed in the first derivative of the second overlapping type. If a one-to-one correspondence between the pairs of local extrema of the first derivative and the local maxima of the intensity profile can be established, then the potential typing suitability criterion is satisfied and the second type of overlapping is ruled out. However, both the first and the third type of overlapping satisfy the above criterion. In order to resolve this ambiguity, we do not proceed immediately to HPV typing once the criterion is satisfied. Instead, we give any existing bands of the first type the required time to transition to the second type. If no such transitions are observed for a specific amount of time, i.e., if a specific number of consecutive criterion satisfactions is observed, then we proceed to steps 4.5.n – 4.6.n. In this case, the local maxima of the last intensity profile are also employed for estimating the number and approximate positions of the lane bands.

Fig. 6. Evolution of the overlapping between two bands during electrophoresis. In the upper part, the intensity profiles (continuous lines) resulting from the underlying overlapping bands (dashed lines) are depicted at three time instances. Below the profiles, the corresponding first derivative curves are drawn. The local maxima of the intensity profiles and the local extrema of the first derivatives are noted (stars).

The potential typing suitability criterion is expressed formally by Algorithm 1. The investigated intensity profile is denoted by i(x) for x = 1, . . . , S_X. The operators Dx[·] and medianx[·] calculate the first derivative and the median of a digital signal respectively. The boolean function lmax(s(·), i) (lmin(s(·), i)) is true if the ith sample of signal s(·) is its local maximum (minimum). Function append(L, j) appends j to the end of list L, whose cardinality is denoted by ‖L‖. At algorithm termination, the lists L1, L2, and L3 hold the local maxima of the intensity profile, the local maxima of the first derivative, and the local minima of the first derivative respectively. Symbol ¬ denotes the negation of a logical proposition. Finally, the thresholding operation i(x) > th2 · m is employed to avoid the consideration of small noise-related local maxima of the intensity profile.

Algorithm 1 Evaluate potential typing suitability criterion
  L1, L2, L3 = ∅
  i′(x) = Dx[i(x)]
  m = medianx[i(x)]
  for x = 1 to S_X do
    if i(x) > th2 · m then
      if lmax(i(·), x) then append(L1, x) end if
      if lmax(i′(·), x) then append(L2, x) end if
      if lmin(i′(·), x) then append(L3, x) end if
    end if
  end for
  if ¬(‖L1‖ = ‖L2‖ = ‖L3‖) then return false end if
  for i = 1 to ‖L1‖ do
    if ¬(L2(i) ≤ L1(i) ≤ L3(i)) then return false end if
  end for
  return true

V. PROOF OF CONCEPT

In order to demonstrate the feasibility of our typing methodology, we emulated the operation of the proposed system to perform HPV typing on a set of cervical samples collected from 4 female patients. These had already been typed according to the conventional approach by expert biologists in the Molecular Biology Laboratory, Papageorgiou Hospital, Thessaloniki (Greece). The samples were processed as described in Section II-A. Regarding the RFLP analysis, two digestion configurations were employed: (i) digestion by the restriction enzyme HpyCH4V [9], and (ii) concurrent triple digestion by the enzymes PstI, HaeIII and RsaI [8]. The resulting 8 DNA samples (4 patients × 2 digestions) along with 2 DNA ladders were injected into the wells of a gel matrix. The gel matrix was placed at a specific position inside the fluorescence chamber of the image acquisition system (AlphaImager® HP was employed) and the initial image was acquired by the integrated camera. The exposure time parameter was adjusted so as to avoid saturation of the image

554

TABLE I G EL MATRIX SETUP AND ASSOCIATED HPV TYPING RESULTS .

Fig. 7. The gel matrix images that have been acquired for the system emulation after 30, 60, and 80 min. of electrophoresis (top to bottom). Image processing techniques have been applied to improve visualization.

intensity. The acquired image was stored in a computer system via the camera’s firewire interface. The employed imaging parameters (aperture, zoom, focus, exposure time) were recorded. After that, the gel matrix was fixed on top of the electrophoresis device and was electrophorized at 160 V. Every 10 min. the following procedure was performed: The electrophoresis was stopped and the gel matrix was placed at the predefined position inside the fluorescence chamber to be photographed according to the recorded imaging parameters. The image was stored to the computer system and the gel matrix was returned to the electrophoresis device to resume the electrophoresis. This procedure was repeated 8 times, resulting in an overall electrophoresis duration of 80 min. The 9 resulting images (see Fig. 7) were employed to type the DNA samples in an entirely automatic manner according to the methodology described in Section III-B with a single exception. Due to the inability of the system emulation to ensure the desired accuracy in placing the gel matrix inside the fluorescence chamber, an alternative method for background subtraction was employed, namely the rolling disk approach [21]. The HPV types that were taken into

Lane | Class    | Expert Diagnosis      | Typed | Diagnosis Rank
-----+----------+-----------------------+-------+---------------
1st  | P1/D1    | HPV53                 | yes   | 1/4
2nd  | P1/D2    | HPV53                 | yes   | 1/1
3rd  | P2/D1    | HPV66a                | yes   | 1/1
4th  | P2/D2    | HPV66a                | yes   | 1/2
5th  | Ladder 1 | –                     | –     | –
6th  | P3/D1    | HPV16                 | yes   | 1/1
7th  | P3/D2    | HPV16                 | yes   | 1/2
8th  | P4/D1    | HPV6 & HPV53 & HPV59  | yes   | 2/18
9th  | P4/D2    | HPV6 & HPV53 & HPV59  | no    | –
10th | Ladder 2 | –                     | –     | –
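As a reading aid for the Diagnosis Rank column (z/w: the expert diagnosis is ranked z-th among the w solutions returned by the typing algorithm), the following Python sketch tallies the sample rows of Table I. The tuple layout and the helper name `parse_rank` are illustrative assumptions; they are not part of the MATLAB prototype.

```python
# Sample rows of Table I (the two ladder lanes are omitted).
# Each tuple: (lane number, class, typed?, diagnosis rank "z/w" or None if untyped).
TABLE_I = [
    (1, "P1/D1", True, "1/4"),
    (2, "P1/D2", True, "1/1"),
    (3, "P2/D1", True, "1/1"),
    (4, "P2/D2", True, "1/2"),
    (6, "P3/D1", True, "1/1"),
    (7, "P3/D2", True, "1/2"),
    (8, "P4/D1", True, "2/18"),
    (9, "P4/D2", False, None),
]


def parse_rank(rank):
    """Split a 'z/w' entry into (z, w): position of the expert diagnosis, number of solutions."""
    z, w = rank.split("/")
    return int(z), int(w)


# Lanes for which a typing decision was reached by the end of the experiment.
typed_lanes = [lane for lane, _, typed, _ in TABLE_I if typed]

# Typed lanes where the expert diagnosis is the top-ranked solution (z == 1).
top_ranked = [
    lane for lane, _, typed, rank in TABLE_I
    if typed and parse_rank(rank)[0] == 1
]

# Lanes left untyped.
untyped_lanes = [lane for lane, _, typed, _ in TABLE_I if not typed]

print("typed:", typed_lanes)
print("expert diagnosis ranked first:", top_ranked)
print("untyped:", untyped_lanes)
```

Read this way, seven of the eight samples were typed, the expert diagnosis is ranked first for six of them, and it is ranked second of 18 solutions for the 8th lane.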

account by the typing algorithm are those considered in [9]. The prior probabilities of these types were estimated with the help of the type-specific HPV infection frequencies that were retrieved from the cervical cancer repository of the ASSIST project [22]. These probabilities were employed for ranking the solutions of the HPV typing algorithm as described in [18].

A prototype implementation of the proposed methodology was developed in MATLAB®. The value 5 was employed for thresholds th1 and th2 (see Sections IV-A and IV-B), while a disk radius equal to 3% of the lane's height was considered for the background subtraction. The compatibility and length coincidence thresholds that are associated with the methodology of [18] (steps 4.5.n-4.6.n) were set to 20 and 7 respectively. Moreover, two consecutive satisfactions of the potential typing suitability criterion were required before attempting an HPV typing decision for a certain sample.

In Fig. 7, three of the employed gel images are depicted. The gel matrix setup and the typing results for each sample are given in Table I. In the 2nd column, Px denotes the xth patient and Dy the yth digestion configuration. In the 5th column, z/w denotes that the expert diagnosis (provided in the 3rd column) has been ranked zth among the w solutions of the HPV typing algorithm. Finally, the 4th column indicates whether the sample had been typed by the end of the experiment. The values of the potential typing suitability criterion for each sample at each examined time instance are presented in Table II. In this table, 1 denotes that the criterion is satisfied, while 0 denotes the opposite case. Once a typing decision has been reached, the criterion is no longer evaluated. The extracted intensity profiles of two samples at the time when their typing decisions were made and the associated first derivative curves are depicted in Fig. 8.

VI. CONCLUSION

The lack of interaction between the phases of the single image protocol for HPV typing via PCR-RFLP gel electrophoresis has been the source of several shortcomings: erroneous typing decisions due to extensive band overlapping and unnecessary loss of laboratory resources are probably the most serious among them. The need to tackle these shortcomings has been the motivation for proposing the system


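TABLE II. POTENTIAL TYPING SUITABILITY CRITERION FOR EACH SAMPLE (COLUMNS) AT EACH TIME INSTANCE (ROWS).

time\sample | 1st | 2nd | 3rd | 4th | 6th | 7th | 8th | 9th
------------+-----+-----+-----+-----+-----+-----+-----+----
10 min.     |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0
20 min.     |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0
30 min.     |  0  |  0  |  0  |  1  |  0  |  0  |  0  |  0
40 min.     |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0
50 min.     |  1  |  1  |  0  |  1  |  1  |  0  |  1  |  0
60 min.     |  1  |  1  |  1  |  1  |  1  |  1  |  0  |  1
70 min.     |  –  |  –  |  1  |  –  |  –  |  1  |  1  |  0
80 min.     |  –  |  –  |  –  |  –  |  –  |  –  |  1  |  0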
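The decision rule used in the experiment (a typing decision is attempted only after two consecutive satisfactions of the criterion) can be replayed on the entries of Table II. The Python sketch below is an illustration only; the function name `first_decision_time` and the use of `None` for entries marked "–" are assumptions made here, not part of the original prototype.

```python
# Time instances (minutes) at which the criterion was evaluated.
TIMES = [10, 20, 30, 40, 50, 60, 70, 80]

# Columns of Table II; None stands for "–" (criterion no longer evaluated).
TABLE_II = {
    "1st": [0, 0, 0, 0, 1, 1, None, None],
    "2nd": [0, 0, 0, 0, 1, 1, None, None],
    "3rd": [0, 0, 0, 0, 0, 1, 1, None],
    "4th": [0, 0, 1, 0, 1, 1, None, None],
    "6th": [0, 0, 0, 0, 1, 1, None, None],
    "7th": [0, 0, 0, 0, 0, 1, 1, None],
    "8th": [0, 0, 0, 0, 1, 0, 1, 1],
    "9th": [0, 0, 0, 0, 0, 1, 0, 0],
}


def first_decision_time(flags, times, required=2):
    """Return the first time at which `required` consecutive flags equal 1, else None."""
    run = 0  # length of the current streak of satisfied evaluations
    for t, flag in zip(times, flags):
        run = run + 1 if flag == 1 else 0
        if run >= required:
            return t
    return None


decision_times = {
    lane: first_decision_time(flags, TIMES) for lane, flags in TABLE_II.items()
}

for lane, t in decision_times.items():
    print(lane, "typed at", t, "min" if t is not None else "(not typed)")
```

Under these assumptions the 4th lane is not typed at 30 min, because its isolated satisfaction is followed by a failure at 40 min, and the 9th lane never accumulates two satisfactions in a row within the 80 min run.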

Fig. 8. The intensity profile (middle) and corresponding first derivative curves (bottom) of the first (left) and the third (right) lane of the gel matrix after 60 and 70 min. of electrophoresis respectively. It is at these time instances that the HPV typing decisions regarding the associated samples are made. The thresholding value that is involved in the potential typing suitability criterion is indicated by a horizontal line in the intensity profile graphs.
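The quantities shown in Fig. 8 (intensity profile, its first derivative, and the threshold th2 · m) are the inputs of Algorithm 1. A minimal Python sketch of that criterion follows. It rests on several assumptions that the text above does not fix: the derivative operator D_x is taken to be a central finite difference, lmax and lmin are taken to be comparisons with the two immediate neighbours, and all function names are hypothetical. The original prototype was written in MATLAB and may differ in these details.

```python
from statistics import median


def derivative(profile):
    """Central finite difference (one-sided at the ends); an assumed choice for D_x."""
    n = len(profile)
    if n < 2:
        return [0.0] * n
    d = [0.0] * n
    d[0] = profile[1] - profile[0]
    d[-1] = profile[-1] - profile[-2]
    for x in range(1, n - 1):
        d[x] = (profile[x + 1] - profile[x - 1]) / 2.0
    return d


def is_local_max(values, x):
    """Assumed lmax: interior point above its left neighbour and not below its right one."""
    return 0 < x < len(values) - 1 and values[x - 1] < values[x] >= values[x + 1]


def is_local_min(values, x):
    """Assumed lmin: interior point below its left neighbour and not above its right one."""
    return 0 < x < len(values) - 1 and values[x - 1] > values[x] <= values[x + 1]


def typing_suitability(profile, th2=5.0):
    """Evaluate the criterion of Algorithm 1 on a 1-D lane intensity profile."""
    d = derivative(profile)
    m = median(profile)
    peaks, rising, falling = [], [], []  # L1, L2, L3 in Algorithm 1
    for x in range(len(profile)):
        if profile[x] > th2 * m:  # only positions well above the median intensity
            if is_local_max(profile, x):
                peaks.append(x)
            if is_local_max(d, x):
                rising.append(x)
            if is_local_min(d, x):
                falling.append(x)
    # Every band must contribute exactly one extremum to each list ...
    if not (len(peaks) == len(rising) == len(falling)):
        return False
    # ... and each peak must lie between its derivative maximum and minimum.
    return all(r <= p <= f for r, p, f in zip(rising, peaks, falling))


# A single clean band on a flat background.
clean = [1] * 20 + [10, 40, 90, 100, 90, 40, 10] + [1] * 20
# A band with a shoulder, as produced by two overlapping fragments.
shoulder = [1] * 20 + [10, 30, 60, 80, 85, 88, 95, 60, 30, 10] + [1] * 20

print(typing_suitability(clean))     # True
print(typing_suitability(shoulder))  # False
```

In the second profile the shoulder adds an extra maximum and minimum to the derivative without adding a peak to the intensity, so the three lists differ in length and the criterion fails. As in the pseudocode, a profile with no extrema above the threshold passes trivially; the sketch adds no guard for that case.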

that has been described here. Owing to the introduction of interaction between the typing phases and the appropriate novel algorithms, the proposed system manages to automate entirely the task of HPV typing; this represents significant progress for the discussed molecular biology method.

The proof-of-concept of the proposed approach has been evaluated on a small set of real HPV data. The results from the emulation of the system operation have been very encouraging. Indeed, the system has automatically reached correct HPV typing decisions for all but one of the examined samples. It is worth mentioning that the untyped sample (9th lane) is the product of a triple infection and, for this reason, the pattern of the contained DNA fragment lengths is unusually complex. This complexity has most probably prevented the potential typing suitability criterion from being satisfied two times in a row within the employed electrophoresis run. In support of this explanation, the other triply infected sample (8th lane) was the last one to be typed, at the last examined time instance (see Table II).

The feasibility of the proposed approach requires further validation through emulation experiments on larger sets of real HPV data and/or simulations. Then, we should proceed with the prototype implementation of the proposed system, possibly also as a compact integrated device.

REFERENCES

[1] J. Walboomers, M. Jacobs, M. Manos, F. Bosch, J. Kummer, K. Shah, P. Snijders, J. Peto, C. Meijer, and N. Muñoz, “Human papillomavirus is a necessary cause of invasive cervical cancer worldwide,” Journal of Pathology, vol. 189, no. 1, pp. 12–19, 1999.

[2] F. Bosch, A. Lorincz, N. Muñoz, C. Meijer, and K. Shah, “The causal relation between human papillomavirus and cervical cancer,” Journal of Clinical Pathology, vol. 55, no. 4, pp. 244–265, 2002.
[3] S. Landis, T. Murray, S. Bolden, and P. Wingo, “Cancer statistics, 1999,” CA: A Cancer Journal for Clinicians, vol. 49, no. 1, p. 8.
[4] N. Muñoz, F. Bosch, S. de Sanjosé, R. Herrero, X. Castellsagué, K. Shah, P. Snijders, C. Meijer et al., “Epidemiologic classification of human papillomavirus types associated with cervical cancer,” New England Journal of Medicine, vol. 348, no. 6, pp. 518–527, 2003.
[5] B. Kleter, L. Van Doorn, L. Schrauwen, A. Molijn, S. Sastrowijoto, J. Ter Schegget, J. Lindeman, B. Ter Harmsel, M. Burger, and W. Quint, “Development and clinical evaluation of a highly sensitive PCR-reverse hybridization line probe assay for detection and identification of anogenital human papillomavirus,” Journal of Clinical Microbiology, vol. 37, no. 8, p. 2508, 1999.
[6] T. Oh, C. Kim, S. Woo, T. Kim, D. Jeong, M. Kim, S. Lee, H. Cho, and S. An, “Development and clinical evaluation of a highly sensitive DNA microarray for detection and genotyping of human papillomaviruses,” Journal of Clinical Microbiology, vol. 42, no. 7, p. 3272, 2004.
[7] B. Gharizadeh, M. Kalantari, C. Garcia, B. Johansson, and P. Nyrén, “Typing of human papillomavirus by pyrosequencing,” Laboratory Investigation, vol. 81, no. 5, pp. 673–679, 2001.
[8] O. Lungu, T. Wright et al., “Typing of human papillomaviruses by polymerase chain reaction amplification with L1 consensus primers and RFLP analysis,” Molecular and Cellular Probes, vol. 6, no. 2, pp. 145–152, 1992.
[9] E. Santiago, L. Camacho, M. Junquera, and F. Vázquez, “Full HPV typing by a single restriction enzyme,” Journal of Clinical Virology, vol. 37, no. 1, pp. 38–46, 2006.
[10] J. Lee, M. Kim, S. Song, J. Hong, K. Min, J. Kim, E. Song, J. Lee, J. Lee, and S. Hur, “Comparison of Human Papillomavirus Detection and Typing by Hybrid Capture 2, Linear Array, DNA Chip, and Cycle Sequencing in Cervical Swab Samples,” International Journal of Gynecological Cancer, vol. 19, no. 2, p. 266, 2009.
[11] A. Ermel, B. Qadadri, A. Morishita, I. Miyagawa, G. Yamazaki, B. Weaver, W. Tu, Y. Tong, M. Randolph, H. Cramer et al., “Human papillomavirus detection and typing in thin prep cervical cytologic specimens comparing the Digene Hybrid Capture II Assay, the Roche Linear Array HPV Genotyping Assay, and the Kurabo GeneSquare Microarray Assay,” Journal of Virological Methods, 2010.
[12] D. Tagu and C. Moussard, Techniques for molecular biology. Science Pub Inc, 2006.
[13] “TotalLab::Phoretix,” http://www.totallab.com/products/totallabquant, June 2011.
[14] “GelCompar II - Fingerprint and Gel Analysis Software,” http://www.applied-maths.com/gelcompar/gelcompar.htm, June 2011.
[15] “Gel-Pro Analyzer - Gel Analysis and electrophoresis analysis software,” http://www.mediacy.com/index.aspx?page=GelPro, June 2011.
[16] I. Bajla, I. Holländer, S. Fluch, K. Burg, and M. Kollar, “An alternative method for electrophoretic gel image analysis in the GelMaster software,” Computer Methods and Programs in Biomedicine, vol. 77, no. 3, pp. 209–231, 2005.
[17] S. Shadle, D. Allen, H. Guo, W. Pogozelski, J. Bashkin, and T. Tullius, “Quantitative analysis of electrophoresis data: novel curve fitting methodology and its application to the determination of a protein-DNA binding constant,” Nucleic Acids Research, vol. 25, no. 4, pp. 850–860, 1997.
[18] C. Maramis, A. Delopoulos, and A. Lambropoulos, “Analysis of PCR-RFLP Gel Electrophoresis Images for Accurate and Automated HPV Typing,” in 10th International Conference on Information Technology and Applications in Biomedicine, ITAB 2010, Corfu, Greece, 2010, pp. 1–6.
[19] C. Maramis, A. Delopoulos, and A. Lambropoulos, “A Computerized Methodology for Improved Virus Typing by PCR-RFLP Gel Electrophoresis,” IEEE Transactions on Biomedical Engineering, in press.
[20] C. Maramis and A. Delopoulos, “Efficient Quantitative Information Extraction from PCR-RFLP Gel Electrophoresis Images,” in 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 2010, pp. 2564–2567.
[21] I. Mikhailyuk and A. Razzhivin, “Background subtraction in experimental data arrays illustrated by the example of Raman spectra and fluorescent gel electrophoresis patterns,” Instruments and Experimental Techniques, vol. 46, no. 6, pp. 765–769, 2003.
[22] “The Cervical Cancer Repository of the ASSIST Project,” http://kastor.ee.auth.gr:8888/assist, June 2011.
