Jump to content

Talk:Lexicographically minimal string rotation

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Mr. Andrew,

The Booth's algorithm that you talk about in this article... doesn't seem to work for arrays / strings which have repeated values in them.

For ex: Suppose there is a number 343. The minimal lexicographic index (per say) should actually be 2 (if the index values start from 0). That would generate the cyclic permutation 334.

Instead this algorithm returns the index as 0, which is not quite correct because 343 > 334.

Please excuse me if I am making some sort of mistake here. I could be wrong, but I wanted to discuss in length about this.

Thanks Soumya

Hey Soumya,
Thanks for letting me know this. I just implemented the algorithm and tried it on a few cases, I haven't put it through a battery of tests. I will test it on that input and try to find the problem. Thanks!
Andrew Helwer (talk) 22:16, 14 April 2012 (UTC)[reply]
Hey Soumya,
Thought I'd update you on how things are going. I've been trying to find the source of this bug, and I've ruled out a translation issue - my implementation is a line-for-line match of the algorithm as presented in the paper. Changing the line f[j-k] = -1 to f[j-k] = 0 (which is more similar to the KMP preprocessing function) fixes the aba case, but then it fails on asdf. Currently my plan is to really try and understand what is going on with the algorithm, as I only have a vague idea - I know the ideas behind the KMP preprocess cold, but translating them to this situation is proving somewhat difficult as the paper does not explain things very well. I'll keep you posted.
Andrew Helwer (talk) 04:54, 16 April 2012 (UTC)[reply]
Hey Soumya,
I've figured out the bug. The Booth algorithm will misreport the minimal rotation when a candidate rotation has a suffix matching its prefix; in all such cases, the correct minimal rotation should begin where the suffix begins. I've added a commented line to the algorithm fixing this issue.
Errors in the Booth algorithm were mentioned in A fast average case algorithm for Lyndon decomposition by Smyth and Iliopoulos (1995), although the precise nature of the errors were never elaborated upon. I believe my patch fixes the problem, however. I've emailed Dr. Smyth to ask if this was the problem he fixed, and will also email Dr. Booth to see if he is aware of the issue.
Thanks again for pointing out the error!
Andrew Helwer (talk) 17:45, 17 April 2012 (UTC)[reply]

As of today the Python code for Booth's algorithm was still wrong (I tested it as a subroutine in my code, and it gave the wrong results; sorry but I am currently unable to provide a concrete example), so I replaced it with a (hopefully) working version based on one by Booth himself. 139.124.5.27 (talk) 15:06, 25 July 2022 (UTC)[reply]