Int. J. Bioinformatics Research and Applications, Vol. 10, No. 2, 2014

DNA-algorithm for timetable problem

Igor Yu. Popov*, Anastasiya V. Vorobyova and Irina V. Blinova
St. Petersburg National Research University of Information Technologies, Mechanics and Optics, 49 Kronverkskiy, St. Petersburg, 197101, Russia
Email: [email protected]
Email: [email protected]
Email: [email protected]
*Corresponding author

Abstract: The use of DNA molecules for solving NP-complete problems is discussed. The properties of DNA allow one to reduce the number of operations from exponential to polynomial. A DNA-algorithm for solving the timetable problem is suggested. The starting point is a set of classes, teachers and hours with some limitations. It is necessary to determine whether there is a timetable satisfying all limitations. The sets of classes, teachers and hours are coded by chains of nucleotides. After the input multi-set containing all possible timetables has been prepared, a filtering procedure is applied that excludes all illegal timetables. The filtering algorithm is suggested. An example is described. The algorithm is analysed.

Keywords: DNA; molecular computing; DNA-algorithm; bioinformatics.

Reference to this paper should be made as follows: Popov, I.Yu., Vorobyova, A.V. and Blinova, I.V. (2014) ‘DNA-algorithm for timetable problem’, Int. J. Bioinformatics Research and Applications, Vol. 10, No. 2, pp.145–156.

Biographical notes: Igor Yu. Popov graduated from Leningrad State University (Department of Mathematical Physics) in 1978. He received his PhD at Leningrad State University in 1984, and completed his Doctor of Sciences degree at the Saint Petersburg Branch of the Steklov Mathematical Institute of RAS in 1996. His research interests include spectral operator theory, scattering theory, physics of nanosystems, quantum computing, DNA computing, mathematical models in geophysics and biophysics, and physics of fluids.

Anastasiya V. Vorobyova graduated from Saint Petersburg State University of Information Technologies, Mechanics and Optics in 2010. Currently she is head of a research group at YANDEX and her research interest is programming.

Irina V. Blinova graduated from Saint Petersburg State University of Information Technologies, Mechanics and Optics in 2003. She received her PhD in 2008. Her research interests include mathematical modelling of nanosystems. She is Associate Professor in the Mathematical Department of Saint Petersburg National Research University of Information Technologies, Mechanics and Optics.

Copyright © 2014 Inderscience Enterprises Ltd.

1 Introduction

Recent achievements in molecular biology open a new way not only in biology and medicine but also in a completely different (unexpected) field: programming (see, e.g., Paun et al., 1998; Rambidi, 2007). The use of DNA molecules allows one to suggest a few algorithms based on new principles for solving some NP-complete problems (see, e.g., Katsanyi, 2003; Malinetskii and Naumenko, 2005). The main DNA property that makes DNA-algorithms possible is complementarity, which leads to a deterministic way of pairing single DNA chains of nucleotides. In molecular biology, complementarity is a property of double-stranded nucleic acids such as DNA. Each strand is complementary to the other in that the base pairs between them are non-covalently connected via two or three hydrogen bonds. For DNA, adenine (A) bases complement thymine (T) bases and vice versa; guanine (G) bases complement cytosine (C) bases and vice versa. The pioneering work in the field is by Adleman (1994), where a DNA-algorithm for the Hamiltonian Path Problem was suggested. It is known that all NP-complete problems can be reduced to one another. But the reduction can be non-trivial and difficult to realise in practice (Korman et al., 2009). That is why the construction of specific DNA-algorithms for various problems is interesting and important. For instance, such algorithms were suggested for the following problems (see, e.g., the good review in Katsanyi, 2003): Boolean satisfiability (Amos, 2005), three colouring (Rooss and Wagner, 1995), Quantified Boolean formulae (Bach et al., 1996), Independent set (Baum and Boneh, 1996), Knapsack (Amos et al., 1996), Subgraph isomorphism (Boneh et al., 1996), Maximum clique (Ogihara and Ray, 1996), MAX-CNF satisfiability (Beigel and Fu, 1997), Circuit satisfiability (Gloor et al., 1998), (3-2)-system (Kari et al., 2000), Shortest common superstring (Rozenberg and Salomaa, 1997), Bounded Post correspondence (Amos, 1997). We suggest a DNA-algorithm for the timetable problem.

It is based on the use of a great number of DNA molecules. That is why the main object is a multi-set of single DNA chains (strings over a given alphabet), which is called a test-tube. The models of molecular computing fall into four natural categories: filtering, splicing, constructive and membrane. We deal with the filtering model. The first operation of any computation is the creation of an initial multi-set. This multi-set should include all possible solutions (each encoded by a string) to the problem to be solved. The point here is that the superset, in any implementation of the model, is supposed to be relatively easy to generate as a starting point for a computation. The computation then proceeds by filtering out strings which cannot be a solution. For example, the computation may begin with a multi-set containing strings representing all possible paths in a graph (for the Hamiltonian Path Problem), and then proceed by removing those strings that encode illegal paths. As a result of the executed operations, several multi-sets may exist at the same time. Typically, a computation finishes by determining whether the resulting multi-set is nonempty. The following operations are allowed in the computation (see, e.g., Paun et al., 1998):

“Input(N)” creates the initial multi-set.

“Union((N1, N2, ...Nn), N)” creates the set N, which is the multi-set union of the distinct multi-sets N1, N2, ...Nn, where n is an arbitrary non-negative integer.

“Copy” makes two copies of a given test-tube N; more generally, “Copy(N; (N1, N2, ...Nn))” creates n duplicates of the multi-set N and places them in the multi-sets N1, N2, ...Nn (n is an arbitrary positive integer).

“Separate” (or “extract”, or “remove”): for a given test-tube N and word w, prepare two test-tubes: +(N, w), containing all chains from N which have w as a substring, and -(N, w), containing all chains from N which do not have w as a substring.

“Separate with respect to length”: for a given test-tube N and integer n, prepare the test-tube (N, ≤n) containing all chains from N of length not greater than n.

“Separate with respect to prefix (suffix)”: for a given test-tube N and word w, prepare the test-tube B(N, w) (respectively, E(N, w)) containing all chains from N whose beginning (respectively, end) coincides with w.

“Select(N)” selects and returns an element of N at random. If N is the empty set, then empty is returned. This operation is executed only at the end of a computation.

“Detect(N)”: given a set N, return “true” if N is nonempty, otherwise return “false”.
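The operations listed above can be imitated in software to make their semantics concrete. The following is only a minimal sketch, assuming that a test-tube is modelled as a Python `Counter` of strings; the function names are illustrative and are not taken from the cited literature, and a real test-tube performs each operation on all chains in parallel.

```python
from collections import Counter

# Assumption: a test-tube is a multi-set of strings, modelled as a Counter.

def union(*tubes):
    """Union((N1, ..., Nn), N): pour several test-tubes together."""
    out = Counter()
    for tube in tubes:
        out.update(tube)
    return out

def copy(tube, n=2):
    """Copy(N; (N1, ..., Nn)): make n duplicates of the test-tube."""
    return [Counter(tube) for _ in range(n)]

def separate(tube, w):
    """Return (+(N, w), -(N, w)): chains with and without the substring w."""
    plus = Counter({s: k for s, k in tube.items() if w in s})
    minus = Counter({s: k for s, k in tube.items() if w not in s})
    return plus, minus

def separate_by_length(tube, n):
    """(N, <=n): chains whose length does not exceed n."""
    return Counter({s: k for s, k in tube.items() if len(s) <= n})

def separate_by_prefix(tube, w):
    """B(N, w): chains that begin with w."""
    return Counter({s: k for s, k in tube.items() if s.startswith(w)})

def detect(tube):
    """Detect(N): True if the test-tube is nonempty."""
    return sum(tube.values()) > 0

tube = Counter({"ACGT": 3, "TTGA": 1, "ACCA": 2})
plus, minus = separate(tube, "AC")
```

Here `plus` holds the chains containing "AC" and `minus` the remaining ones; mixing them again with `union` restores the original multi-set.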

2 Problem description

In general, a simplified model of timetabling problems consists of a set of resources, a set of activities, and a set of dependencies between the activities. Time is divided into time slots of the same duration. Every slot may be assigned a constraint, either hard or soft: a hard constraint indicates that the slot is forbidden for any activity, a soft constraint indicates that the slot is not preferred (time preferences). Every activity and every resource may be assigned a set of time preferences, which indicate forbidden and not preferred time slots. An activity (which can be, for instance, directly mapped to a lecture) is described by its duration (expressed as a number of time slots), by time preferences, and by the set of resources it requires. A resource is fully described by its time preferences. There is a hard condition that only one activity can use a resource at a time. For instance, in the lecture timetabling problem such a resource can represent a teacher, a class, a classroom, or another special resource. Finally, we need a mechanism for defining and handling direct dependencies between the activities. It seems sufficient to use only binary dependencies, which define a relationship between two activities. Three temporal constraints are possible: the activity finishes before another activity, the activity finishes exactly at the time when the second activity starts, and two activities run concurrently (they have the same start time). The solution of the problem defined by the above model is a timetable in which every scheduled activity is assigned its start time and a set of reserved resources that are needed for its execution. This timetable must satisfy all the constraints. Let us describe our particular timetable problem in more detail (see Even et al., 1976). Input data:

H = {hk}: a set of hours, i.e. time units (slots) for the timetable;
T = {ti}: a set of teachers (resources);
H(ti): the set of hours at which teacher ti can teach, H(ti) ⊆ H;
C = {cj}: a set of classes;
H(cj): the set of hours at which class cj can be taught, H(cj) ⊆ H.


For each pair (ti, cj) the workload function R(ti, cj) ≥ 0 is given. It is defined as follows: teacher ti should teach class cj for R(ti, cj) hours. We would like to answer the following question: “Is there a timetable satisfying all these constraints?” In other words, it is necessary to determine whether there is a function f(t, c, h): T × C × H → {0,1} (called the “meeting function”) such that:

1  $f(t,c,h) = 1 \Rightarrow h \in H(t) \cap H(c)$. It means that the fixed class is taught by the fixed teacher only at a time when both the teacher and the class are available.

2  $\sum_{c \in C} f(t,c,h) \le 1$ for every t, h. It means that at each hour each teacher teaches exactly one class or does not teach (a teacher cannot teach several classes simultaneously).

3  $\sum_{t \in T} f(t,c,h) \le 1$ for every c, h. It means that at each hour each class is taught by exactly one teacher or is not taught (a class cannot be taught by several teachers simultaneously).

4  $\sum_{h \in H} f(t,c,h) = R(t,c)$ for every t, c. It means that the workload is satisfied exactly.
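The four conditions can be checked mechanically for any candidate meeting function. The sketch below is an illustration only: it assumes that f is given as the set of triples (t, c, h) on which it equals 1, and it uses the data of the example in Section 8 with hours written as integers; the function and variable names are hypothetical.

```python
def is_meeting_function(f, T, C, H, Ht, Hc, R):
    """Check conditions 1-4 for f, given as the set of triples with f(t, c, h) = 1."""
    # Condition 1: a meeting happens only when teacher and class are both available.
    if any(h not in Ht[t] & Hc[c] for t, c, h in f):
        return False
    # Condition 2: each teacher teaches at most one class per hour.
    if any(sum((t, c, h) in f for c in C) > 1 for t in T for h in H):
        return False
    # Condition 3: each class is taught by at most one teacher per hour.
    if any(sum((t, c, h) in f for t in T) > 1 for c in C for h in H):
        return False
    # Condition 4: the workload R(t, c) is met exactly.
    return all(sum((t, c, h) in f for h in H) == R.get((t, c), 0)
               for t in T for c in C)

# Data of the example in Section 8 (hours h1, h2, h3 written as 1, 2, 3).
H = [1, 2, 3]
T = ["t1", "t2"]
C = ["c1", "c2", "c3"]
Ht = {"t1": {1, 3}, "t2": {2, 3}}
Hc = {"c1": {1, 2, 3}, "c2": {2, 3}, "c3": {3}}
R = {("t1", "c1"): 1, ("t1", "c3"): 1, ("t2", "c2"): 1}

good = {("t1", "c1", 1), ("t2", "c2", 2), ("t1", "c3", 3)}
clash = {("t1", "c1", 3), ("t2", "c2", 2), ("t1", "c3", 3)}  # t1 teaches twice at h3
```

With these inputs, `good` satisfies all four conditions, while `clash` violates condition 2.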

3 Complexity of the timetable problem

It is known that the general timetable problem is NP-complete (Even et al., 1976). Moreover, it remains NP-complete even under the following restrictions:

1  |H| = 3, i.e. the timetable contains only three hours;

2  H(c) = H for each c, i.e. each class is always available;

3  |H(t)| = 2 or |H(t)| = 3 for each t, i.e. each teacher is available during two or three hours;

4  |H(t)| = Σc R(t, c) for each t, i.e. each teacher should teach at any time when he is available;

5  R(t, c) ∈ {0,1}, i.e. the fixed teacher teaches the fixed class for not longer than one hour.

It is also known that the timetable problem can be solved in polynomial time in two cases:

1  |H(t)| ≤ 2 for each t;

2  H(t) = H(c) = H for each t, c.

It is interesting that in the last case the meeting function always exists. Below we consider the timetable problem with only one restriction: R(t, c) ∈ {0,1}. In this case the problem remains NP-complete, and no algorithm is known that works (in the worst case) faster than full search. Nevertheless, full search becomes feasible if we use DNA-computing. The procedure is as follows: (a) the input data are coded by DNA chains (strings); (b) one prepares the initial multi-set of strings (DNA chains) containing all possible timetables; (c) one filters the multi-set using a polynomial number of steps (due to massive parallelism all chains are filtered simultaneously).
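For comparison, a conventional full search under the restriction R(t, c) ∈ {0,1} can be sketched as follows. This is only an illustrative sequential baseline, not part of the DNA procedure: it tries every assignment of one admissible hour to each pair with R = 1, so the number of candidates grows exponentially with the number of such pairs. The names and the integer coding of hours are assumptions; the data are those of the example in Section 8.

```python
from itertools import product

def full_search(Ht, Hc, R):
    """Enumerate all timetables for R(t, c) in {0, 1} by exhaustive search."""
    pairs = [p for p, load in R.items() if load == 1]
    # Admissible hours for each pair: H(t) intersected with H(c).
    options = [sorted(Ht[t] & Hc[c]) for t, c in pairs]
    found = []
    for hours in product(*options):  # exponentially many candidates in general
        teacher_slots = {(t, h) for (t, _), h in zip(pairs, hours)}
        class_slots = {(c, h) for (_, c), h in zip(pairs, hours)}
        # No teacher and no class may be used twice at the same hour.
        if len(teacher_slots) == len(pairs) and len(class_slots) == len(pairs):
            found.append(sorted((t, c, h) for (t, c), h in zip(pairs, hours)))
    return found

Ht = {"t1": {1, 3}, "t2": {2, 3}}
Hc = {"c1": {1, 2, 3}, "c2": {2, 3}, "c3": {3}}
R = {("t1", "c1"): 1, ("t1", "c3"): 1, ("t2", "c2"): 1}
timetables = full_search(Ht, Hc, R)
```

The DNA approach replaces the explicit loop over candidates by the simultaneous formation of all candidate chains in one test-tube.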

4 Information coding

Let us code the information in such a way that the first condition (f(t, c, h) = 1 ⇒ h ∈ H(t) ∩ H(c)) is satisfied automatically. To ensure this, we take the pairs ti, cj for which the workload function equals one (R(ti, cj) = 1), find the hours at which both the teacher and the class are available, i.e. h ∈ H(ti) ∩ H(cj), and form the triples ti cj hk : R(ti, cj) = 1, hk ∈ H(ti) ∩ H(cj). Each triple is represented by a nucleotide chain consisting of three parts corresponding, respectively, to the teacher, the class and the hour:

$t_i\, c_j\, h_k$

Each element of the triple is coded by a sub-chain of the same length. The length depends on the cardinality of the sets of teachers, classes and hours. On the one hand, it should be long enough to encode all the elements of the sets. On the other hand, it should be as short as possible to decrease the probability of errors. We consider the cases in which all intersections H(ti) ∩ H(cj) satisfying the condition R(ti, cj) = 1 are non-empty. Otherwise we can give a negative answer immediately (the corresponding timetable does not exist). Let us modify the obtained single chains and find the corresponding complementary ones, which allow us to form compounds representing all possible timetables. We form two chains from each initial one. In the first case we add to the triple a sub-chain representing the same hour (to state the relation between classes which can take place simultaneously). In the second case we add to the triple a sub-chain representing the next hour (to state the relation between sequential classes):

$t_i\, c_j\, h_k\, h_k$
$t_i\, c_j\, h_k\, h_{k+1}$

To form the complementary chains (denoted by an overbar) we take the triple complementary to the initial one and make the following additions:

$\overline{h_k\, t_i\, c_j\, h_k}$
$\overline{h_{k-1}\, t_i\, c_j\, h_k}$

It is also necessary to take into account that a timetable can contain “windows”, i.e. hours when there are no classes. We code this situation in the following manner:

$w\, h_k\, h_k$
$w\, h_k\, h_{k+1}$
$\overline{h_k\, w\, h_k}$
$\overline{h_{k-1}\, w\, h_k}$

Now, if we have a great number of copies of such short chains and the corresponding enzymes in a test-tube, they will assemble into various double chains, for example (in each diagram the lower row shows the complementary chains):

| t1 c1 h1 h2 | t2 c2 h2 h2 | t3 c2 h3 h4 |
| h0 t1 c1 h1 | h2 t2 c2 h2 | h2 t3 c2 h3 |

| t1 c1 h1 h1 | t1 c3 h1 h1 | t1 c3 h1 h1 | t1 c1 h1 h2 |
| h1 t1 c1 h1 | h1 t1 c3 h1 | h1 t1 c3 h1 | h1 t1 c1 h1 |


After denaturation, each single chain represents some timetable. With our way of coding, the sub-chains are related in such a way that we get a sequential timetable: first come the classes taking place in the first hour, then those in the second hour, etc. So we have a great number of possible timetables in the test-tube. Many of them, perhaps all, do not satisfy the conditions. It is necessary to organise the filtering so as to exclude the inadmissible chains.
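The set of short chains described in this section can be listed programmatically. The sketch below is only an illustration at the symbolic level: it does not assign nucleotide sequences, it writes hours as integers, and it marks a complementary chain with a leading "~", which is a notation chosen here and not the authors'. The input is the data of the example in Section 8.

```python
def encode(Ht, Hc, R, hours):
    """Return (teaching_chains, window_chains) as symbolic strings.

    A leading '~' marks a complementary chain (assumed notation).
    """
    teaching, windows = [], []
    for (t, c), load in R.items():
        if load != 1:
            continue  # pairs with R = 0 are never coded
        for k in sorted(Ht[t] & Hc[c]):  # only hours when both are available
            teaching += [
                f"{t}{c}h{k}h{k}",       # link to a class at the same hour
                f"{t}{c}h{k}h{k + 1}",   # link to a class at the next hour
                f"~h{k}{t}{c}h{k}",      # complement with same-hour prefix
                f"~h{k - 1}{t}{c}h{k}",  # complement with previous-hour prefix
            ]
    for k in hours:
        windows += [f"wh{k}h{k}", f"wh{k}h{k + 1}",
                    f"~h{k}wh{k}", f"~h{k - 1}wh{k}"]
    return teaching, windows

Ht = {"t1": {1, 3}, "t2": {2, 3}}
Hc = {"c1": {1, 2, 3}, "c2": {2, 3}, "c3": {3}}
R = {("t1", "c1"): 1, ("t1", "c2"): 0, ("t1", "c3"): 1, ("t2", "c2"): 1}
teaching, windows = encode(Ht, Hc, R, hours=[1, 2, 3])
```

For these data the sketch produces four chains per admissible triple and four window chains per hour.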

5 Algorithm

The filtering algorithm that yields the chains satisfying the above-mentioned conditions 2–4 (condition 1 is valid for all chains due to our construction) is as follows. It uses the operations described in Section 1.

(1)  for all ti cj hk : R(ti, cj) = 1, hk ∈ H(ti) ∩ H(cj)
(2)      comment: Nijk is the test-tube obtained at the previous step
(3)      Nijk(1) = +(Nijk, ti cj hk)
(4)      Nijk(2) = -(Nijk, ti cj hk)
(5)      for all cm ≠ cj, R(ti, cm) = 1
(6)          Nijk(1) = -(Nijk(1), ti cm hk)
(7)      for all tn ≠ ti, R(tn, cj) = 1
(8)          Nijk(1) = -(Nijk(1), tn cj hk)
(9)      Nijk = ∪(Nijk(1), Nijk(2))
(10) for all ti, cj : R(ti, cj) = 1
(11)     comment: Nij is the test-tube obtained at the previous step
(12)     Nij = +(Nij, ti cj)
(13) comment: N is the test-tube obtained at the last step
(14) detect(N)

The algorithm input is the test-tube (multi-set) containing chains corresponding to all possible timetables. In accordance with our construction these chains do not contain subsequences corresponding to superfluous load ((ti, cj) : R(ti, cj) = 0). Also there are no subsequences corresponding to the situation when a teacher or a class is not available at the fixed hour ((ti, cj, hk) : hk ∉ H(ti) ∩ H(cj)). Hence, the corresponding conditions are valid. Lines (1)–(9) form a loop in which all triples ti cj hk are checked one after another. The initial test-tube is separated into two parts: Nijk(1), containing all chains with ti cj hk, and Nijk(2), containing all other chains. We need the second test-tube only to preserve the chains which are not operated on at this stage of the loop. The first test-tube is subjected to two loops of filtering. During the first loop (lines (5)–(6)) we exclude all chains (timetables) in which the teacher simultaneously teaches several classes. During the second loop (lines (7)–(8)) we exclude all chains (timetables) in which several teachers (more than one) are appointed to the same class. After this filtering the first test-tube is mixed with the second test-tube. The obtained test-tube is the input for the next step of the loop, which checks the next triple. So, when we come to line (10), the test-tube contains only chains corresponding to timetables in which at each hour each teacher teaches exactly one class or does not teach, and each class is taught by exactly one teacher or is not taught. It is now necessary to exclude the timetables in which the workload function is not satisfied. This is done at the next stage (lines (10)–(12)). It has been mentioned that chains containing sub-sequences corresponding to superfluous load ((ti, cj) : R(ti, cj) = 0) are not formed due to the construction. Hence, it is necessary to check only that the needed load is present in the timetable, i.e. correct chains (timetables) should contain all pairs (ti, cj) for which R(ti, cj) = 1. Consequently, when we come to line (13) we have only good timetables in the test-tube. It should be noted that for some timetables the load can exceed the needed one, but we shall show below that this does not influence the result essentially. The final operation (line (14)) is to check whether there is any chain in the test-tube after filtering.
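Lines (1)–(14) can be followed step by step in a small software imitation. The sketch below is an illustration under stated assumptions: a test-tube is a plain Python list of strings, each chain is written as space-separated words such as `t1c1h1h1`, hours are integers, and the input tube holds only the six sample chains listed in Section 8 rather than all possible timetables. The sequential list operations stand in for what would be a single parallel laboratory step.

```python
def separate(tube, w):
    """Return (+(N, w), -(N, w)) for a tube given as a list of strings."""
    return [s for s in tube if w in s], [s for s in tube if w not in s]

def filter_tube(tube, Ht, Hc, R):
    pairs = [p for p, load in R.items() if load == 1]
    for t, c in pairs:                                   # line (1)
        for k in sorted(Ht[t] & Hc[c]):
            n1, n2 = separate(tube, f"{t}{c}h{k}")       # lines (3)-(4)
            for t2, c2 in pairs:
                if t2 == t and c2 != c:                  # lines (5)-(6)
                    n1 = separate(n1, f"{t}{c2}h{k}")[1]
                if c2 == c and t2 != t:                  # lines (7)-(8)
                    n1 = separate(n1, f"{t2}{c}h{k}")[1]
            tube = n1 + n2                               # line (9)
    for t, c in pairs:                                   # lines (10)-(12)
        tube = separate(tube, f"{t}{c}")[0]
    return tube

Ht = {"t1": {1, 3}, "t2": {2, 3}}
Hc = {"c1": {1, 2, 3}, "c2": {2, 3}, "c3": {3}}
R = {("t1", "c1"): 1, ("t1", "c3"): 1, ("t2", "c2"): 1}

# The six sample chains of Section 8, keyed by their labels.
chains = {
    1: "t1c1h1h1 t2c2h2h2 t1c3h3h4",
    2: "t1c1h1h1 t2c2h2h2 w2h2h2 t1c3h3h4",
    3: "w1h1h1 t1c1h1h1 w2h2h2 t1c1h3h4",
    4: "w1h1h2 w2h2h2 w3h3h3",
    5: "t1c1h1h2 t2c2h2h3 t2c2h3h3 t1c3h3h3",
    6: "t2c2h3h3 t1c1h3h3 t1c3h3h3",
}
survivors = filter_tube(list(chains.values()), Ht, Hc, R)
labels = sorted(n for n, s in chains.items() if s in survivors)
exists = len(survivors) > 0                              # line (14): detect(N)
```

Substring matching on these short labels is safe only because no label is a prefix of another; with multi-digit indices a stricter word encoding would be needed.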

6 Analysis of the obtained results

After information coding, formation of all possible timetables and filtering, the test-tube can be empty or can contain chains satisfying all needed conditions. If the test-tube contains a chain, then the needed timetable exists. By sequencing this chain (or some of them, if there are several chains), we can decode the timetable. If the test-tube is empty, the needed timetable does not exist. Of course, this statement is valid only with some probability less than one, because it is possible that the needed chain was not formed in the test-tube initially. The condition that may be unsatisfied for chains after the filtering is the equality of the workload function exactly to 1. Due to the construction procedure, pairs (ti, cj) such that R(ti, cj) = 0 are not formed in the test-tube, but pairs (ti, cj) such that R(ti, cj) = 1 can occur several times in a timetable that remains after filtering. Strictly speaking, there should be exactly one such sub-chain. The timetable with the needed load can be obtained from the timetable with extra load: it is necessary to take into account only one such pair and to ignore the others. Moreover, our filtering procedure is such that if we obtain a timetable with extra load after filtering, then we also obtain the timetable with the exact (correct) load. To answer the question about the existence of the corresponding timetable, nothing more needs to be done. From the practical point of view, the existence of timetables with extra loads increases the probability of getting the correct answer (in practice it is possible that the test-tube does not contain all the chains, e.g., there is no chain with exact load but there is a chain with extra load). Of course, if desired, one can simply add the corresponding filtering step to exclude timetables with extra loads.

7 Analysis of complexity and efficiency

The main advantage of applying DNA-computing to NP-complete problems is the reduction of the exponential working time to a polynomial one. It is necessary to check that we make no more than a polynomial number of steps both at the information coding stage and at the filtering stage.


First, it is necessary to find all pairs (ti, cj) which need to be coded, i.e. such that R(ti, cj) = 1. To do this one should look through the whole matrix R. It takes |T|·|C| steps. For the selected pairs it is necessary to find the intersection of sets. It takes |H(ti)|·|H(cj)| steps, |H(ti)|·|H(cj)| ≤ |H|². For each pair it is necessary to generate 4·|H(ti) ∩ H(cj)| coding chains and, additionally, four chains per hour for “window” coding. Moreover, it is necessary to get a great (exponential) number of copies of each chain, but due to the properties of DNA it takes linear time. Thus, the data coding stage takes polynomial time. The formation of all possible double chains is one chemical reaction. One can assume that its duration is independent of the size of the input data, i.e. this time is constant. At the filtering stage each unit filtering takes a time independent of the input data, i.e. it is necessary to find only the number of steps in each loop. Operations (1)–(9) in the algorithm are done for all coded triples ti cj hk, the number of which is bounded above by |T|·|C|·|H|, and each step contains two sequential sub-loops with the numbers of steps not exceeding |C| (for lines (5)–(6)) and |T| (for lines (7)–(8)). Thus, the working time of the loop (1)–(9) is comparable with |T|·|C|·|H|·max(|T|, |C|). During the loop (10)–(12) we look through all coded pairs (ti, cj). Evidently, the number of steps at this stage does not exceed |T|·|C|. Thus, all necessary calculations can be done in a time that depends polynomially on the size of the input data. We do not discuss here how to realise the algorithm experimentally. From the theoretical point of view, it is known that all the operations used in our algorithm can be done. Of course, there are experimental difficulties. Moreover, at the current level of technology, each filtration step takes a rather long time. Hence, the constant corresponding to the unit filtration time (mentioned above) is large. This is an additional argument against including one more filtering step in the algorithm to exclude timetables with extra loads.
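The estimate for the filtering stage can be written as one explicit bound. The following is only a worked illustration: the constant 3 (one unit operation for each of lines (3), (4) and (9)) and the final +1 for line (14) are bookkeeping assumptions made here, and the numbers are those of the example in Section 8.

```latex
% Assumption: every pass of loop (1)-(9) costs lines (3), (4), (9)
% plus at most |C| + |T| separations in the two sub-loops.
N_{\mathrm{filter}} \;\le\; |T|\,|C|\,|H|\,\bigl(3 + |C| + |T|\bigr) \;+\; |T|\,|C| \;+\; 1

% Example of Section 8: |T| = 2, |C| = 3, |H| = 3
N_{\mathrm{filter}} \;\le\; 2\cdot 3\cdot 3\cdot(3 + 3 + 2) + 2\cdot 3 + 1 \;=\; 144 + 7 \;=\; 151
```

The bound is of the same order as |T|·|C|·|H|·max(|T|, |C|) and is pessimistic, since only the triples with R(ti, cj) = 1 and hk ∈ H(ti) ∩ H(cj) are actually processed.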

8 Example

Consider a simple example. Let us take the following sets: H = {h1, h2, h3}; T = {t1, t2}; H(t1) = {h1, h3}; H(t2) = {h2, h3}; C = {c1, c2, c3}; H(c1) = {h1, h2, h3}; H(c2) = {h2, h3}; H(c3) = {h3}. That is, we have three hours in the timetable, two teachers (the first can work at the first and the last hour, the second can work at the two last hours), and three classes with some limitations in time. As for the particular coding by nucleotides, it is clear that in this simple example it is sufficient to use, e.g., sub-chains of six nucleotides for coding each element of the triple “the teacher, the class and the hour”. Let us define the workload function by the following matrix:

      c1  c2  c3
t1     1   0   1
t2     0   1   0


First of all, let us choose the pairs (t, c) which need to be coded (for which the workload function is not zero) and find for them the intersection of the available hours:

(t1, c1): {h1, h3} ∩ {h1, h2, h3} = {h1, h3}
(t1, c3): {h1, h3} ∩ {h3} = {h3}
(t2, c2): {h2, h3} ∩ {h2, h3} = {h2, h3}

Thus, we have 2·4 + 1·4 + 2·4 = 20 chains coding hours in which there is teaching and, additionally, 3·4 chains coding windows (each row gives a chain and the complementary chain listed next to it; an overbar denotes a complementary chain):

(t1, c1):  $t_1 c_1 h_1 h_1$   $\overline{h_1 t_1 c_1 h_1}$
           $t_1 c_1 h_1 h_2$   $\overline{h_0 t_1 c_1 h_1}$
           $t_1 c_1 h_3 h_3$   $\overline{h_3 t_1 c_1 h_3}$
           $t_1 c_1 h_3 h_4$   $\overline{h_2 t_1 c_1 h_3}$

(t1, c3):  $t_1 c_3 h_3 h_3$   $\overline{h_3 t_1 c_3 h_3}$
           $t_1 c_3 h_3 h_4$   $\overline{h_2 t_1 c_3 h_3}$

(t2, c2):  $t_2 c_2 h_2 h_2$   $\overline{h_2 t_2 c_2 h_2}$
           $t_2 c_2 h_2 h_3$   $\overline{h_1 t_2 c_2 h_2}$
           $t_2 c_2 h_3 h_3$   $\overline{h_3 t_2 c_2 h_3}$
           $t_2 c_2 h_3 h_4$   $\overline{h_2 t_2 c_2 h_3}$

Windows:   $w h_1 h_1$   $\overline{h_1 w h_1}$
           $w h_1 h_2$   $\overline{h_0 w h_1}$
           $w h_2 h_2$   $\overline{h_2 w h_2}$
           $w h_2 h_3$   $\overline{h_1 w h_2}$
           $w h_3 h_3$   $\overline{h_3 w h_3}$
           $w h_3 h_4$   $\overline{h_2 w h_3}$

If we prepare a great number of copies of these chains and put them into the test-tube, then various timetables will form. Let us take a few examples of them and consider the filtering process for these chains (in each diagram the lower row shows the complementary chains):

(1)  | t1 c1 h1 h1 | t2 c2 h2 h2 | t1 c3 h3 h4 |
     | h0 t1 c1 h1 | h1 t2 c2 h2 | h2 t1 c3 h3 |

(2)  | t1 c1 h1 h1 | t2 c2 h2 h2 | w2 h2 h2 | t1 c3 h3 h4 |
     | h0 t1 c1 h1 | h1 t2 c2 h2 | h2 w2 h2 | h2 t1 c3 h3 |

(3)  | w1 h1 h1 | t1 c1 h1 h1 | w2 h2 h2 | t1 c1 h3 h4 |
     | h0 w1 h1 | h1 t1 c1 h1 | h1 w2 h2 | h2 t1 c1 h3 |

(4)  | w1 h1 h2 | w2 h2 h2 | w3 h3 h3 |
                | h2 w2 h2 | h2 w3 h3 |

(5)  | t1 c1 h1 h2 | t2 c2 h2 h3 | t2 c2 h3 h3 | t1 c3 h3 h3 |
                   | h2 t2 c2 h2 | h3 t2 c2 h3 | h3 t1 c3 h3 |

(6)  | t2 c2 h3 h3 | t1 c1 h3 h3 | t1 c3 h3 h3 |
     | h2 t2 c2 h3 | h3 t1 c1 h3 | h3 t1 c3 h3 |

First, the loop (1)–(9) of the algorithm is executed. It excludes the timetables in which one teacher teaches several classes simultaneously or one class is taught by several teachers simultaneously. At this stage, chain (6) is excluded, as it corresponds to the situation when teacher t1 at hour h3 should simultaneously teach both class c1 and class c3. The other chains remain in the test-tube after this loop. During the loop (10)–(12) of the algorithm the timetables which do not ensure the needed load are excluded. Namely, chain (3) is excluded, as it corresponds to the situation when teacher t1 never teaches class c3 and teacher t2 never teaches class c2. Chain (4) is also excluded, because it consists only of windows and does not correspond to any load at all. Thus, the output test-tube contains the ideal chain (1) (it is dense and contains nothing extra), chain (2), which is identical to chain (1) except for an extra window (we can simply ignore this window), and chain (5). Chain (5) corresponds to extra load: teacher t2 teaches class c2 both at hour h2 and at hour h3. But there is no conflict, and one can simply choose either of these hours and ignore the other (in this way we obtain a correct timetable from (5) without extra filtering).
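The last remark, that a correct timetable is obtained from chain (5) by keeping one hour per pair, can be illustrated by a short decoding sketch. It is an assumption-laden illustration: the chain is written as a string of words such as `t2c2h2h3`, hours are integers, and the rule "keep the first occurrence" is just one admissible choice among the equivalent ones mentioned above.

```python
import re

def decode(chain):
    """Extract (teacher, class, hour) triples from a chain; windows are skipped."""
    return [(t, c, int(h)) for t, c, h in re.findall(r"(t\d+)(c\d+)h(\d+)", chain)]

def exact_load(chain):
    """Keep the first meeting of each (teacher, class) pair and ignore the rest."""
    seen, timetable = set(), []
    for t, c, h in decode(chain):
        if (t, c) not in seen:
            seen.add((t, c))
            timetable.append((t, c, h))
    return timetable

chain5 = "t1c1h1h2 t2c2h2h3 t2c2h3h3 t1c3h3h3"
timetable = exact_load(chain5)
```

Dropping meetings from a conflict-free timetable cannot create a conflict, so the reduced timetable still satisfies conditions 1–3 and now meets the workload exactly.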

9 Difficulties and possible improvements

The first difficulty is related to loops. In our algorithm the coupling is made by hours. Hence, it is possible to couple one triple corresponding to a fixed hour with another (or even the identical) triple containing the same hour. Due to this fact, the number of all possible timetables becomes infinite. An analogous situation occurs in the Adleman algorithm (Adleman, 1994): the number of paths becomes infinite when the graph contains loops. A possible (heuristic) way to decrease the influence of this undesirable phenomenon is to prepare an essentially greater number of chains of type $t_i c_j h_k h_{k+1}$ and $\overline{h_{k-1} t_i c_j h_k}$ than of type $t_i c_j h_k h_k$ and $\overline{h_k t_i c_j h_k}$ in the test-tube. In the algorithm described above we have a limitation on the workload function: R(t, c) ∈ {0,1}. In the initial problem R may be equal to any nonnegative integer. To solve this generalised problem it would be necessary to count the number of occurrences of each pair (ti, cj) in a chain coding the timetable. One can suggest some improvements of the described algorithm. The chosen way of coding assumes that the timetable is assembled successively with respect to hours, but this is not necessary. If we allow an arbitrary order, it is possible to avoid doubling the sub-sequences representing an hour in the coding chain. This decreases the “memory” volume. Possibly, the second (complementary) chain can be used more efficiently (in our algorithm it is used only for coupling with respect to hours).


From the experimental point of view, the filtering process is rather difficult. That is why it would be better to decrease the number of such operations; for example, if the triples ti cj hk are ordered, loops (5)–(6) and (7)–(8) of the algorithm can be executed only for elements which have not been checked earlier. Our algorithm is not free of the general sources of errors appearing in the realisation of any DNA-algorithm: spontaneous breaking of chains, the possibility of coupling with an approximately (not exactly) complementary chain, errors of the “separation” operation, etc. Due to these errors the theoretical maximum of information density in a DNA molecule (2 bits per nucleotide) is unreachable in practice. Such a high density would lead to unacceptable sensitivity to errors. There is a more essential general obstacle for DNA-computing: the scaling problem (roughly speaking, DNA-computing replaces exponential time by exponential memory volume) (Paun et al., 1998). It is possible to pose more complicated problems, for example, to find the optimal timetable with the minimum number of “windows”, or timetables in which only a limited number of classes can be taught at a fixed hour (for example, if the number of lecture rooms is limited). Of course, the most interesting unsolved problem is the experimental realisation of our theoretical algorithm.

References

Adleman, L.M. (1994) ‘Molecular computation of solutions to combinatorial problems’, Science, Vol. 266, pp.1021–1024.
Amos, M. (1997) The complexity and viability of DNA computations (extended draft) CTAG97001, The University of Liverpool, Liverpool.
Amos, M. (2005) Theoretical and Experimental DNA Computation, Springer-Verlag, Berlin-Heidelberg.
Amos, M., Gibbons, A. and Hodgson, D. (1996) ‘Error-resistant implementation of DNA computations’, Proceedings of the 2nd Annual Meeting on DNA Based Computers, Princeton University, Princeton.
Bach, E., Condon, A., Glaser, E. and Tanguay, C. (1996) ‘DNA models and algorithms for NP-complete problems’, Proceedings of the 11th Annual IEEE Conference on Computational Complexity, IEEE Computer Society Press, pp.290–300.
Baum, E.B. and Boneh, D. (1996) ‘Running dynamic programming algorithms on a DNA computer’, Proceedings of the 2nd Annual Meeting on DNA Based Computers, Princeton University.
Beigel, R. and Fu, B. (1997) ‘Molecular computing, bounded nondeterminism, and efficient recursion’, Lecture Notes in Computer Science, Vol. 1256, pp.816–826.
Boneh, D., Dunworth, C. and Sgall, J. (1996) ‘On the computational power of DNA’, Discrete Applied Mathematics, Vol. 71, Nos. 1–3, pp.79–94.
Even, S., Itai, A. and Shamir, A. (1976) ‘On the complexity of timetable and multi-commodity flow problems’, SIAM Journal of Computing, Vol. 5, No. 4, pp.691–703.
Gloor, G., Kari, L., Gaasenbeek, M. and Sheng, Yu. (1998) ‘Towards a DNA solution to the Shortest Common Superstring Problem’, Proceedings of IEEE’98 International Joint Symposia on Intelligence and Systems, pp.111–113.
Kari, L., Gloor, G. and Sheng, Yu. (2000) ‘Using DNA to solve the Bounded Post Correspondence Problem’, Theoretical Computer Science, Vol. 231, No. 2, pp.193–203.
Katsanyi, I. (2003) Molecular computing solution of some classical problems, Technical report of Eotveos Lorand University, Budapest, pp.1–13.


Korman, T., Leizerson, Ch. and Rivest, R. (2009) Algorithms: Construction and Analysis, Williams, Moscow.
Malinetskii, G.G. and Naumenko, S.A. (2005) DNA Computing, Experiments, Models, Algorithms, Instruments, Nauka, Moscow.
Ogihara, M. and Ray, A. (1996) Simulating Boolean Circuits on a DNA computer, Technical Report TR631, University of Rochester, Rochester.
Paun, G., Rozenberg, G. and Salomaa, A. (1998) DNA Computing – New Computing Paradigms, Springer-Verlag, Berlin.
Rambidi, N.G. (2007) Nanotechnologies and Molecular Computers, Fizmatlit, Moscow.
Rooss, D. and Wagner, K.W. (1995) On the power of DNA-computers, Technical Report 103, University of Wurzburg, Wurzburg.
Rozenberg, G. and Salomaa, A. (1997) Handbook of Formal Languages, Springer-Verlag, Berlin, Heidelberg, New York.
