A simple interpretation of two stochastic processes subject to an independent death process.
A stochastic generalized Born (GB) solver is presented that can predict energies arbitrarily close to those given by exact effective GB radii. Unlike the errors of analytical GB solvers, its errors are Gaussian, and estimates of them are easily obtained from the algorithm. The method was tested by computing the electrostatic solvation energies (ΔGsolv) and the electrostatic binding energies (ΔGbind) of a set of DNA-drug complexes, a set of protein-drug complexes, a set of protein-protein...
We consider the likelihood ratio test (LRT) process related to the test of the absence of a QTL (a quantitative trait locus, i.e. a gene with a quantitative effect on a trait) on the interval representing a chromosome. The novelty is that some genotypes are missing. We give the asymptotic distribution of this LRT process under the null hypothesis that there is no QTL on the interval and under local alternatives with a QTL at a given position on the interval. We show that the LRT process is asymptotically...
In this paper a novel method of noise reduction in color images is presented. The new technique is capable of attenuating both impulsive and Gaussian noise, while preserving and even enhancing the sharpness of the image edges. Extensive simulations reveal that the new method significantly outperforms the standard techniques widely used in multivariate signal processing. In this work we apply the new noise reduction method to enhance images of so-called gene chips. We demonstrate...
The following is an expository article meant to give a simplified introduction to applications of topology to DNA.
The features of an evolutionary algorithm that most determine its performance are the coding by which its chromosomes represent candidate solutions to its target problem and the operators that act on that coding. Also, when a problem involves constraints, a coding that represents only valid solutions and operators that preserve that validity represent a smaller search space and result in a more effective search. Two genetic algorithms for the leaf-constrained minimum spanning tree problem illustrate...
In this paper, we consider a possible representation of a DNA sequence in a quaternary tree, in which one can visualize repetitions of subwords (seen as suffixes of subsequences). The CGR-tree turns a sequence of letters into a Digital Search Tree (DST), obtained from the suffixes of the reversed sequence. Several results are known concerning the height and the insertion depth of DSTs built from independent successive random sequences having the same distribution. Here the successive inserted words...
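The construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the root is left empty, that the k-th inserted word is the first k letters of the sequence read backwards, and that each word is stored at the first free node along the path spelled by its letters. The names `Node`, `insert` and `build_cgr_tree` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """One node of a digital search tree; stores at most one word."""

    def __init__(self, word: Optional[str] = None) -> None:
        self.word = word
        self.children: Dict[str, "Node"] = {}


def insert(root: Node, word: str) -> int:
    """Store `word` at the first free node on its letter path; return its depth."""
    node = root
    for depth, letter in enumerate(word, start=1):
        child = node.children.get(letter)
        if child is None:
            node.children[letter] = Node(word)
            return depth
        node = child
    # Defensive check: reached only if every node on the path is occupied.
    raise ValueError("word exhausted before a free node was found")


def build_cgr_tree(sequence: str) -> Tuple[Node, List[int]]:
    """Insert the reversed prefixes of `sequence` one by one (assumed ordering)."""
    root = Node()  # assumption: the root itself holds no word
    depths = []
    for k in range(1, len(sequence) + 1):
        reversed_prefix = sequence[:k][::-1]
        depths.append(insert(root, reversed_prefix))
    return root, depths


if __name__ == "__main__":
    _, depths = build_cgr_tree("ACA")
    print(depths)
```

Under these assumptions, a repeated subword in the sequence forces later words deeper into the tree, which is why insertion depth and height are the quantities of interest.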
A classification of all possible icosahedral viral capsids is proposed. It takes into account the diversity of hexamers’ compositions, leading to definite capsid size. We show how the self-organization of observed capsids during their production results from definite symmetries of constituting hexamers. The division of all icosahedral capsids into four symmetry classes is given. New subclasses implementing the action of symmetry groups Z2, Z3 and S3 are found and described. They concern special cases...
In this review paper we discuss fatgraphs as a conceptual framework for RNA structures. We discuss various notions of coarse-grained RNA structures and relate them to fatgraphs. We motivate and discuss the main intuition behind the fatgraph model and showcase its applicability to canonical as well as noncanonical base pairs. Recent discoveries regarding novel recursions of pseudoknotted (pk) configurations as well as their translation into context-free grammars for pk-structures are discussed. This...
Determining amino acid sequences of protein molecules is one of the most important issues in molecular biology. These sequences determine protein structure and functionality. Unfortunately, direct biochemical methods for reading amino acid sequences can be used for reading short sequences only. This makes peptide assembly algorithms an important complement to these methods. In this paper, a genetic algorithm solving the problem of short amino acid sequence assembly is presented....
This paper presents an application of methods from the machine learning domain to solving the task of DNA sequence recognition. We present an algorithm that learns to recognize groups of DNA sequences sharing common features such as sequence functionality. We demonstrate application of the algorithm to find splice sites, i.e., to properly detect donor and acceptor sequences. We compare the results with those of reference methods that have been designed and tuned to detect splice sites. We also show...
To establish lists of words with unexpected frequencies in long sequences, for instance in a molecular biology context, one needs to quantify the exceptionality of families of word frequencies in random sequences. To this end, we study large deviation probabilities of multidimensional word counts for Markov and hidden Markov models. More specifically, we compute local Edgeworth expansions of arbitrary degrees for multivariate partial sums of lattice-valued functionals of finite Markov...
Insertion and deletion are operations that occur commonly in DNA processing and RNA editing. Since biological macromolecules can be viewed as symbols, gene sequences can be represented as strings and structures can be interpreted as languages. This suggests that the bio-molecular structures that occur at different levels can be theoretically studied by formal languages. In the literature, there is no unique grammar formalism that captures various bio-molecular structures. To overcome this deficiency,...