Wikipedia:Reference desk/Archives/Computing/2022 May 27

From Wikipedia, the free encyclopedia
Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives

May 27

Regex

With regular expressions (I am using the Perl variety), how do you capture a character and then match any character except that one?

I tried /(.)[^\1]/ but apparently backreferences don't work inside bracketed character classes. I also tried to match (say) "banana" with something like /(.)(.)([^\1])\2\3\2/, which doesn't work either. -- SGBailey (talk) 06:43, 27 May 2022 (UTC)

Backreferences such as \1 are not recognized inside a bracketed character class [ ... ], so [^\1] does not mean "anything except the captured character". That is why it doesn't work. Unfortunately, that is the limit of my regex knowledge. 97.82.165.112 (talk) 14:44, 27 May 2022 (UTC)
I haven't got a complete answer yet, but (.)(?!\1) will match the first character not followed by the same character. I picked that up from here: "Negative lookahead is indispensable if you want to match something not followed by something else." So (.)(?!\1). matches a character and then any character except that one.
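A quick way to check the lookahead idea is a small sketch like the one below. It is written in Python rather than Perl, on the assumption that Python's re module handles backreferences and negative lookahead the same way for a pattern this simple. The helper name differs_from_previous is made up for illustration.

```python
import re

# (.) captures one character; (?!\1) asserts that the next character is not
# that same character; the final . then consumes that next character.
PAIR = re.compile(r"(.)(?!\1).")

def differs_from_previous(text):
    """Return True if the whole string is one character followed by a different one."""
    return PAIR.fullmatch(text) is not None

print(differs_from_previous("ab"))  # True: two different characters
print(differs_from_previous("aa"))  # False: the same character twice

# For contrast: in Python, \1 inside [...] is read as an octal escape for the
# character with code 1, not as a backreference, so this happily matches "aa".
CLASS_ATTEMPT = re.compile(r"(.)[^\1]")
print(CLASS_ATTEMPT.fullmatch("aa") is not None)  # True
```

The last line is specific to Python's handling of numeric escapes in a character class; it is only meant to show that the bracketed form does not do what was hoped.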
OK, I think a parallel of what you tried to write for "banana" is (.)(.)(?!\1)(.)\2\3\2
This says "some first character, some second character not followed by a repeat of the first, some third character," and then the \2\3\2 repeats characters 2, 3 and 2 again. This will match banana, but will also match baaaaa, because nothing stops the third character from equalling the second. Note that (?!\1) is not itself a capture group: it is a zero-width assertion that consumes no characters, so it gets no number (it's not \3) and you can't refer back to it. It just checks, from its position after the second character, a fact about the character that comes next.
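The behaviour just described can be tried out with a short sketch. It uses Python's re module instead of Perl, assuming the two agree on this syntax, and the names FIRST_TRY and matches_first_try are invented for the example.

```python
import re

# Group 3 is only required to differ from group 1, so it may still
# equal group 2.
FIRST_TRY = re.compile(r"(.)(.)(?!\1)(.)\2\3\2")

def matches_first_try(word):
    """Return True if the whole word fits the pattern."""
    return FIRST_TRY.fullmatch(word) is not None

for word in ("banana", "baaaaa", "bababa"):
    print(word, matches_first_try(word))

# The lookahead is not counted as a capture group, so only three are reported.
print(FIRST_TRY.groups)
```

Here banana and baaaaa are accepted, while bababa is rejected because its third character repeats the first.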
Then there's the variation (.)(.)(?!\2)(.)\2\3\2
This says "some first character, some second character not followed by itself, some third character," then second, third, second again. So it matches banana, bbnbnb and bababa.
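The same kind of check works for this variation. Again this is a Python sketch, assuming its re module matches Perl's behaviour for this pattern, and the names VARIATION and matches_variation are hypothetical.

```python
import re

# Group 3 must differ from group 2, but is still free to equal group 1,
# and group 2 is free to equal group 1.
VARIATION = re.compile(r"(.)(.)(?!\2)(.)\2\3\2")

def matches_variation(word):
    """Return True if the whole word fits the pattern."""
    return VARIATION.fullmatch(word) is not None

for word in ("banana", "bbnbnb", "bababa", "baaaaa"):
    print(word, matches_variation(word))
```

Only baaaaa is rejected here, since its third character repeats the second.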
My super ultimate banana-matcher in its final form is (.)(?!\1)(.)(?!\1|\2)(.)\2\3\2 Card Zero (talk) 14:44, 27 May 2022 (UTC)
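To see what the final pattern accepts and rejects, here is a minimal sketch in Python (not Perl; it assumes Python's re module treats lookahead, alternation and backreferences the same way here). The names FINAL and is_banana_shaped and the list of test words are made up for illustration.

```python
import re

# Three captured characters that must all differ from one another,
# followed by characters 2, 3 and 2 again.
FINAL = re.compile(r"(.)(?!\1)(.)(?!\1|\2)(.)\2\3\2")

def is_banana_shaped(word):
    """Return True if the whole word has the letter pattern of 'banana'."""
    return FINAL.fullmatch(word) is not None

candidates = ["banana", "rococo", "baaaaa", "bbnbnb", "bababa"]
results = {word: is_banana_shaped(word) for word in candidates}

for word, ok in results.items():
    print(word, ok)
```

Using fullmatch means the whole string has to fit the pattern; in Perl the rough equivalent would be to anchor the pattern with ^ and $.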

Thx -- SGBailey (talk) 06:21, 29 May 2022 (UTC)