Smith-Waterman alignment algorithm

Not sure I know why my posts stay blank here…
I posted, even replied but nothing is visible …

Maybe my first post is too long. Is there a limit in esoTalk?

You might have been logged out before you posted. I now tend to write lengthy posts in a text editor, then log in and post them.

[quote=125248:@Franck Perez]Not sure I know why my posts stay blank here…
I posted, even replied but nothing is visible …
[/quote][quote=125249:@Franck Perez]Maybe my first post is too long. Is there a limit in esoTalk?[/quote]

Maybe your post was indeed very long. If you tried to post the code you mentioned in “fuzzy search”, that is very nice. A better approach could be to place the project on a web site or on Dropbox and link to it using the link icon at the top of the post editor.

try again… split to 2

Something else is wrong : this thread reports 10 posts total in the forum general view, when there are only 5…

OK - I posted the code nine. Here is my first (shortened) post.

Hi,

I have coded a Smith-Waterman routine to compare and align 2 DNA sequences. It seems from another thread that some people may be interested. Some may optimise it (and I would be interested!), so I am posting it here. It was not written to be public, so I hope it will be clear (and not too bad…).

It takes 2 strings (DNA, degenerate DNA [or raw text], or protein) and returns them with the gaps needed to align them (using the ‘-’ character), together with a SimilarityString that has “|” or “:” at identical or similar sites and whitespace (or a custom NonIdenticalChar) at non-aligned sites. If the alignment starts to take long, the routine tries to estimate the time needed to complete it and asks whether the user wants to go on.
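For readers who want to experiment with the idea, here is a minimal Python sketch of a Smith-Waterman local alignment that returns the two gapped strings and a similarity string. It is not the routine described above: the function name `smith_waterman`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen for illustration, and it only marks identical sites with “|” (no “:” for similar sites).

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (aligned_a, similarity, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not the original routine's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; local alignment never lets a cell drop below 0.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            s = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            score[i][j] = s
            if s > best:
                best, best_pos = s, (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b, sim = [], [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            sim.append("|" if a[i - 1] == b[j - 1] else " ")
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])   # residue of a facing a gap in b
            out_b.append("-")
            sim.append(" ")
            i -= 1
        else:
            out_a.append("-")        # residue of b facing a gap in a
            out_b.append(b[j - 1])
            sim.append(" ")
            j -= 1

    return "".join(reversed(out_a)), "".join(reversed(sim)), "".join(reversed(out_b))
```

The matrix has (len(a)+1) × (len(b)+1) cells, so both time and memory grow with the product of the two lengths, which is why a time estimate becomes useful for long sequences.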

Here is an example of the output

[code]Seq_1   1 GTGGAGTGCCCACCTTGCCCAGCACCACCTGTGGCAGGACTT-CAGCTTCCTCTTCCCCC 59
            | |   ||  ||   |||   | | ||||  ||   | ||  |  |    |||||||||
Seq_2   1 -AGCAAGCCCACCCCACCCCCTGAACTCCTGGGGACCGTCTGTCTTCA---TCTTCCCCC 56

Seq_1  60 CAAAACC--AAGG-A-CCCGATGATCTCCAGAACCCCTGAGGTCACGTGCGTGGTGGTGG 115
          |||||||  | |  | ||  ||||||||  | ||||| |||||||| |||||||||||||
Seq_2  57 CAAAACCCAAGGACACCCTCATGATCTCACGCACCCCCGAGGTCACATGCGTGGTGGTGG 116

Seq_1 116 ACGTGAGCCACGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGACGGCATGGAGGTGC 175
          |||||||||| || ||||||||||| |||||||  |||||| |  ||  |  | ||||||
Seq_2 117 ACGTGAGCCAGGATGACCCCGAGGTGCAGTTCACATGGTACATAAACAACGAGCAGGTGC 176[/code]
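The layout of this sample (fixed-width blocks, with start and end positions that count residues but not gaps) could be produced along the following lines. This is only a sketch: `format_alignment`, its parameters, and the 60-column width are assumptions, and it expects both sequence names to have the same length so that the columns line up.

```python
def format_alignment(aln_a, sim, aln_b, width=60, names=("Seq_1", "Seq_2")):
    """Format two gapped strings and their similarity string in blocks."""
    lines = []
    pos_a = pos_b = 0  # residues consumed so far; gap characters do not count
    for start in range(0, len(aln_a), width):
        chunk_a = aln_a[start:start + width]
        chunk_s = sim[start:start + width]
        chunk_b = aln_b[start:start + width]
        n_a = len(chunk_a) - chunk_a.count("-")
        n_b = len(chunk_b) - chunk_b.count("-")
        lines.append(f"{names[0]} {pos_a + 1:>3} {chunk_a} {pos_a + n_a}")
        # Indent the similarity line by the width of "name + space + number + space".
        lines.append(" " * (len(names[0]) + 5) + chunk_s)
        lines.append(f"{names[1]} {pos_b + 1:>3} {chunk_b} {pos_b + n_b}")
        lines.append("")
        pos_a += n_a
        pos_b += n_b
    return "\n".join(lines).rstrip()


print(format_alignment("ACG-T", "||| |", "ACGAT"))
```

The output needs a monospaced font to be readable, which is why it is best posted inside a code block.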

I posted an example app here Align_SW_Example. I only tested it on a Mac.

This routine is used in Serial Cloner, although in Serial Cloner I first carry out a block search to find long identical sections so as not to lose time aligning them. I then only align the stretches in between.
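The block-search idea can be illustrated with a short Python sketch. It is not Serial Cloner's implementation: it borrows `difflib.SequenceMatcher` from the Python standard library as a stand-in for the block search, and the function name `split_on_identical_blocks` and the `min_block` threshold are assumptions.

```python
from difflib import SequenceMatcher


def split_on_identical_blocks(a, b, min_block=20):
    """Split two sequences into long identical blocks and the stretches between.

    Returns a list of (kind, part_of_a, part_of_b) tuples in sequence order,
    where kind is "identical" or "between".
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    segments = []
    ia = ib = 0  # end of the last identical block kept, in a and in b
    for m in matcher.get_matching_blocks():
        if m.size < min_block:
            continue  # too short to be worth anchoring on (also skips the final dummy block)
        if m.a > ia or m.b > ib:
            segments.append(("between", a[ia:m.a], b[ib:m.b]))
        segments.append(("identical", a[m.a:m.a + m.size], b[m.b:m.b + m.size]))
        ia, ib = m.a + m.size, m.b + m.size
    if ia < len(a) or ib < len(b):
        segments.append(("between", a[ia:], b[ib:]))
    return segments
```

Only the "between" pairs would then need to go through the alignment routine, which shrinks the score matrices considerably when the two sequences share long identical stretches.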

I hope it will interest some of you,
best,
Franck