Regex help needed ... Kem ?

Hi (Kem ?)

I would like to replace every CR in a text that does not come right after a pipe (|) with a minus sign “-”.
I want to replace only the CR and keep the non-pipe character just before it.
I am able to find these with [^|]\r as a search pattern,
but what can I use as a replacement pattern?
I really don’t know how to do that.

thanks !

Maybe, like this?

[code]dim rx as new RegEx
rx.SearchPattern = "(?mi-Us)(.*)[^|]\R"
rx.ReplacementPattern = "$1-"

dim rxOptions as RegExOptions = rx.Options
rxOptions.LineEndType = 4
rxOptions.ReplaceAllMatches = true

dim replacedText as string = rx.Replace( sourceText )
[/code]

Made with RegExRX

Just found that this works, using the fantastic tool RegExRX …

dim rx as new RegEx
rx.SearchPattern = "(?mi-Us)([^|])(\r)([^|])"
rx.ReplacementPattern = "$1-"

dim rxOptions as RegExOptions = rx.Options
rxOptions.LineEndType = 4
rxOptions.ReplaceAllMatches = true

dim replacedText as string = rx.Replace( sourceText )

Sascha is on the right track but the pattern doesn’t have to match (or care) what comes before the character right before the EOL. It will also replace the non-pipe before the EOL with nothing, which is not what you want (I think).

I’d modify it like this:

rx.SearchPattern = "([^|])\R"
rx.ReplacementPattern = "$1-"

Note that this requires that something precede the EOL character, so a leading EOL won’t count. Also, check the behavior when there are multiple EOLs in a row, if that’s a possibility.
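
To make those two caveats concrete, here is a small sketch. It is written in Python rather than Xojo, purely as an assumption of this example because it is easy to run. Python's `re` module has no `\R`, so the explicit alternation `EOL` stands in for it, and `join_wrapped_lines` is a made-up helper name, not anything from the Xojo RegEx class.

```python
import re

# Python's re has no \R, so this alternation stands in for "any EOL".
EOL = r"(?:\r\n|\r|\n)"

def join_wrapped_lines(text: str) -> str:
    """Replace each EOL that follows a non-pipe character with '-'."""
    # Group 1 puts the non-pipe character back; the EOL itself is dropped.
    return re.sub(r"([^|])" + EOL, r"\1-", text)

# Normal case: only the EOL inside the free-text field is replaced.
print(repr(join_wrapped_lines("a|b|\rc|NOTE ONE\rNOTE TWO|\r")))
# 'a|b|\rc|NOTE ONE-NOTE TWO|\r'

# Leading EOL: nothing precedes it, so it is left alone.
print(repr(join_wrapped_lines("\rabc")))   # '\rabc'

# Two EOLs in a row: only the first is replaced, because the first match
# consumes "x\r" and the second CR is not itself followed by an EOL.
print(repr(join_wrapped_lines("x\r\ry")))  # 'x-\ry'
```

The last case is the one to watch: a CR also counts as "not a pipe", so runs of EOLs do not collapse the way one might expect.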

Sascha, thanks for your code, but it seems to remove all the CRs, not only the ones that do not have a pipe before them.

ha ha … I’ve been faster than Kem … too much ! :wink:

Nice, but did you mean to replace the character after the CR with nothing?

Also, if you aren’t using the groups, you don’t need to put the sub-patterns in parentheses.

In my file, with my pattern above, the character after the CR is kept in the replaced text.
If I remove the 3rd group, the CR is kept in the replaced text, and I want to remove it.

Some sample text:

197929|2615|||TEAU|27,7|||21/06/2018|10:46|O|C||
197929|2615|||TRANSQ|NORMALE|||21/06/2018|10:46||||
197860|2530|||CL2COMB|0,20|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS
PISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|
197860|2530|||CL2DIS|2,40|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS
PISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|

becomes

197929|2615|||TEAU|27,7|||21/06/2018|10:46|O|C||
197929|2615|||TRANSQ|NORMALE|||21/06/2018|10:46||||
197860|2530|||CL2COMB|0,20|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS-PISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|
197860|2530|||CL2DIS|2,40|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS-PISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|

(There is no CR after the “150” here; it’s just because of the text wrap of the forum.)

I don’t see how, as the pattern match includes the character right after the CR and replaces the whole sequence with the first subgroup (the character before the CR) and a minus.

Using your text and pattern, I get:

197929|2615|||TEAU|27,7|||21/06/2018|10:46|O|C||
197929|2615|||TRANSQ|NORMALE|||21/06/2018|10:46||||
197860|2530|||CL2COMB|0,20|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS-ISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|
197860|2530|||CL2DIS|2,40|||21/06/2018|12:25|O|C|NETTOYAGE ROBOT EN COURS-ISCINE FERME LE JEUDI 150 PERSONNES LA VEILLE|
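
For anyone following along, the missing “P” can be reproduced outside RegExRX. This is a Python sketch (Python is an assumption of this example; its `re` module uses `\1` where the Xojo replacement pattern uses `$1`), run on a shortened, CR-only version of the sample line. The `\1-\3` replacement is only an illustration of where the character goes, not a pattern anyone in the thread proposed.

```python
import re

# Shortened sample with a bare CR (no LF) inside the free-text field.
cr_only = "EN COURS\rPISCINE FERME|"

# Three groups are matched, but the replacement only puts group 1 back,
# so the CR (group 2) and the character after it (group 3) are both dropped.
dropped = re.sub(r"([^|])(\r)([^|])", r"\1-", cr_only)

# Putting group 3 back in the replacement keeps the character after the CR.
kept = re.sub(r"([^|])(\r)([^|])", r"\1-\3", cr_only)

print(dropped)  # EN COURS-ISCINE FERME|
print(kept)     # EN COURS-PISCINE FERME|
```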

I use regexrx to test it !

As am I (as you might have guessed :slight_smile: ). Are you sure you posted the pattern you are using?

search : [code]([^|])(\r)([^|])[/code]
replace : $1-

I think I know what’s happening. Change \r to \R and tell me if you get the same results.

Isn’t it a line ending problem? I have only CR as end of line in my file.

yes with \R the P of PISCINE is missing.

what’s the difference between \r and \R ?

\r will only match a CR, whereas \R will match an EOL character, meaning it will match CR, LF, or CR+LF. Your data seems to be using the last of these (CR+LF), so the \r pattern looks to you to be working because its third group is stripping the invisible LF that comes after the CR.

Edit: Or some invisible character after the CR.
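
A quick way to see the difference is to run the patterns on CR+LF data. The sketch below is in Python, which is an assumption of this example; its `re` module has no `\R`, so the alternation `EOL` stands in for it, and the sample string is made up from the data above.

```python
import re

# Stand-in for \R, which Python's re module does not support.
EOL = r"(?:\r\n|\r|\n)"

# Hypothetical CR+LF data: a wrapped note, then a pipe-terminated line end.
crlf = "EN COURS\r\nPISCINE FERME|\r\n"

# \r plus a third group: group 3 swallows the LF, so the result looks right.
three_groups = re.sub(r"([^|])(\r)([^|])", r"\1-", crlf)

# \r alone: the CR is replaced but the LF is left behind.
bare_cr = re.sub(r"([^|])\r", r"\1-", crlf)

# Any EOL: CR+LF is consumed as one unit, but [^|] can also match the CR
# of the final "|\r\n", so that line ending gets touched too.
any_eol = re.sub(r"([^|])" + EOL, r"\1-", crlf)

# Excluding CR and LF from the character class avoids that.
strict = re.sub(r"([^|\r\n])" + EOL, r"\1-", crlf)

for name, value in [("three_groups", three_groups), ("bare_cr", bare_cr),
                    ("any_eol", any_eol), ("strict", strict)]:
    print(name, repr(value))
```

In this sketch only `three_groups` and `strict` leave the final `|\r\n` intact while joining the wrapped note. I would expect `[^|]\R` to behave like `any_eol` in RegExRX on CR+LF data as well, but that is worth testing there rather than taking from a Python stand-in.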

ok.
many thanks for the expert talk,
and the “one more thing” …
\R seems better, will use it !

regex is really fantastic … if you know how to use it !

Every time I try… I think… Nah, I’d rather have my fingernails pulled off, or learn Greek.