Ieva Mitasiunaite scite author profile

Ieva Mitasiunaite

4 Publications

20 Citation Statements Received

135 Citation Statements Given


Affiliations

Vilnius University, Laboratoire d'Informatique en Images et Systèmes d'Information, Institut National des Sciences Appliquées de Lyon

Publications


Looking for monotonicity properties of a similarity constraint on sequences

Mitasiunaite, Boulicaut 2006


Constraint-based mining techniques on sequence databases have been studied extensively over the last few years, and efficient algorithms make it possible to compute complete collections of patterns (e.g., sequences) that satisfy conjunctions of monotonic and/or anti-monotonic constraints. Studying new applications of these techniques, we believe that a primitive constraint that enforces sufficient similarity w.r.t. a given reference sequence would be extremely useful and should benefit from this recent algorithmic breakthrough. A nontrivial similarity constraint is, however, neither monotonic nor anti-monotonic. Therefore, we have studied its definition as a conjunction of two constraints that satisfy the desired monotonicity properties: a pattern is called similar to a reference pattern x when its longest common subsequence (LCS) with x is long enough (the monotonic part) and when the number of deletions needed to reduce it to that LCS is small enough (the anti-monotonic part). We provide an experimental validation that confirms the added value of this approach on a biological database. Classical issues such as scalability and pruning efficiency are discussed.
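The two-part similarity test described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `lcs_length` and `is_similar` and the threshold parameters `min_lcs` and `max_deletions` are hypothetical, and the LCS is computed with the textbook dynamic-programming recurrence rather than any algorithm from the paper.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (textbook DP)."""
    prev = [0] * (len(b) + 1)  # DP row for the prefix of a processed so far
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def is_similar(pattern: str, reference: str,
               min_lcs: int, max_deletions: int) -> bool:
    """Conjunction of the two parts named in the abstract.

    min_lcs and max_deletions are illustrative thresholds (assumptions).
    """
    lcs = lcs_length(pattern, reference)
    # Symbols that must be deleted from the pattern to obtain its LCS.
    deletions = len(pattern) - lcs
    return lcs >= min_lcs and deletions <= max_deletions
```

For instance, `is_similar("ACGT", "AGT", min_lcs=3, max_deletions=1)` holds because the LCS is "AGT" and a single deletion turns the pattern into it. Extending a pattern can never shorten its LCS with the reference, and removing a symbol can never increase the deletion count, which is why each part can be used separately for pruning.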


Parameter Tuning for Differential Mining of String Patterns

Besson, Rigotti, Mitasiunaite et al. 2008


About softness for inductive querying on sequence databases

Mitasiunaite, Boulicaut


In many application domains (e.g., WWW usage mining, telecommunication data analysis, molecular biology), large sequence databases are available and yet under-exploited. The inductive database framework assumes that both such databases and the various patterns holding within them can be queried. In this setting, queries that return patterns are called inductive queries, and solving them is one of the main topics in database mining research. Indeed, constraint-based mining techniques on sequence databases have been studied extensively over the last few years, and efficient algorithms make it possible to compute complete collections of patterns (e.g., sequences) that satisfy conjunctions of monotonic and/or anti-monotonic constraints (e.g., minimal and maximal frequency constraints) in potentially large sequence databases. Studying new applications of these techniques, we consider fault-tolerance and softness to be extremely important issues for tackling real-life data analysis. In this paper, we address some of the open problems that arise when computing soft occurrences of patterns within database sequences instead of the classical exact-matching ones. Such an extension is not trivial, since it prevents the clever use of monotonicity for pruning the search space. We describe our proposal and provide an experimental validation on real-life clickstream data that confirms the added value of this approach.


Extracting Signature Motifs from Promoter Sets of Differentially Expressed Genes

Mitasiunaite, Rigotti, Schicklin et al. 2009


There is a critical need for new and efficient computational methods aimed at discovering putative transcription factor binding sites (TFBSs) in promoter sequences. Among the existing methods, two families can be distinguished: statistical or stochastic approaches, and combinatorial approaches. Here we focus on a complete approach that combines combinatorial exhaustive motif extraction with a statistical Twilight Zone Indicator (TZI), applied to two datasets, a positive set and a negative one, which represent the result of a classical differential expression experiment. Our approach relies on the existence of prior biological information in the form of two sets of promoters of differentially expressed genes. We describe the complete procedure used for extracting either exact or degenerate motifs, ranking these motifs, and finding their known related TFBSs. We exemplify this approach using two different sets of promoters. The first set consists of promoters of genes either repressed or not by the transforming form of the v-erbA oncogene. The second set consists of genes whose expression varies between self-renewing and differentiating progenitors. The biological meaning of the TFBSs found is discussed and, for one TF, its biological involvement is demonstrated. This study therefore illustrates the power of using relevant biological information in the form of a set of differentially expressed genes, which is a classical outcome of most transcriptomics studies. This makes it possible to severely reduce the search space and to design an adapted statistical indicator. Taken together, this allows the biologist to concentrate on a small number of putatively interesting TFs.
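The idea of exhaustively extracting exact motifs that separate a positive promoter set from a negative one can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure of the paper: motifs are restricted to exact k-mers, the Twilight Zone Indicator and degenerate motifs are not modeled, and the names `kmer_support`, `differential_motifs`, `min_pos` and `max_neg` are hypothetical.

```python
from collections import Counter


def kmer_support(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most 1 per k-mer.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return support


def differential_motifs(positive, negative, k, min_pos, max_neg):
    """Exact k-mers frequent in the positive set and rare in the negative set.

    min_pos and max_neg are illustrative thresholds (assumptions).
    """
    pos = kmer_support(positive, k)
    neg = kmer_support(negative, k)
    return sorted(
        motif for motif, count in pos.items()
        if count >= min_pos and neg.get(motif, 0) <= max_neg
    )
```

With `positive = ["ACGTAC", "TACGTT", "GGACGT"]`, `negative = ["TTTTAC", "GGGTAC"]`, `k=4`, `min_pos=3` and `max_neg=0`, the only motif kept is "ACGT", which occurs in all three positive sequences and in no negative one.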

