Jump to content

Talk:Three-way comparison

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Function vs special Syntax for three way comparison

[edit]

I agree with the sentiment that any language could define a function to do a three way comparison but Python has a cmp function built-in to the language, you don't even have to import a module to have it available. It seems that Javas' support is via a function from a package. There is a special method __cmp__ that can be defined for arbitrary classes that then allows them to be sorted via Pythons in-built, optimised in C, sort routine too.

It seems as if Nicky edited the article to exclude Python on a false premise that its cmp function had to be defined by the user, when to me, it seems that Python has every right to be their especially as an examle of another way that the three way comparison is offered by other languages. I do not accept that the article be limited to mentioning those languages having special syntax for the three way merge. --Paddy (talk) 06:10, 17 June 2008 (UTC)[reply]

Further, the section "Composite data types" is about the Haskell function compare. Which goes against NickyMcLean's reasoning, as does the fact that NickyMcLean's edit in the section on "High level languages" tries to argue that inclusion should be based on the lower level implementation details. I think it's better to concentrate on whether the high level language, out-of-the-box, supports the three way comparison. --Paddy (talk) 06:25, 17 June 2008 (UTC)[reply]

Yes, I misread cmp as being a derived function. There are incidentally, different degrees of "built-in"ness: a language such as MatLab supplies a whole horde of apparently built-in functions and suchlike, which are in fact pre-supplied scripts themselves written in the basic language syntax. Similarly, table-driven compilers can allow the introduction of additional syntax (though conforming to certain allowed styles only) that will lead to code production quite similar to that generated for the fundamental language. In this context, the key is whether or not the syntax will result in the usage of the special machine code instructions that refer to the "overflow" indicator, or perform two tests on the results of the comparison without repeating the comparison itself. In other words, that the special problem is explicitly recognised in some sort of custom-fitted syntax rather than being approximated by multiple statements. NickyMcLean (talk) 21:21, 17 June 2008 (UTC)[reply]

Hi Nicky, in a high level language, there might well be differences in how 'close to the metal' a three way comparison is created, but I think the main thing to look for is whether the language comes with a built in comparator returning three different values. The reason to use higher level languages is sometimes to mask details of implementation. It way be that some languages in-built three way comparator took two values and returned three distinct strings rather than the three integers -1, 0, 1 - I'd think it odd, but as long as the result indicated the <, ==, or > comparison it would have the necessary 'three way comparison' traits.

When talking of an assembler language then it would not have a three way comparison, (TWC), if such an instruction was not part of the instruction set presented. If a TWC is part of the assembler then we shouldn't exclude the assembler if the CPU used micro-instructions to create the TWC. (It gets hazy if the TWC is a non-optional pre-defined macro though). In a high level language if a TWC is given as part of the base language then that should be enough. If the base language goes further and incorporates its TWC as part of other standard constructs, for example, how Python links its standard (and highly optimised), sort function to the results of a TWC, then the language most definitely does recognise the TWC.

Your latest comment in the main article might be best be removed as it overly restricts the article to a particular type of implementation. --Paddy (talk) 04:47, 18 June 2008 (UTC)[reply]

Comparison with two way comparison

[edit]

I originally gave the sample Perl implementation of a random two-way comparison on this page. This was removed by Tokek as "not well written", though I'd like to see a better written one! The point is that two operators are needed for all but the last element of a composite. The inspiration for the comment about "easy to get wrong" was having to point out to my ex-boss that his implementation of operator<= as "return l.a <= r.a || l.b <= r.b || l.c <= r.c" was the bug in his report.

The original rationale for things like strcmp (if known) would be useful too. I can only assume that having just one comparison function beats six even if the early programming languages had six relational operators for built-in data types.

More significantly, I'd like to see something on the relative efficiency of the two approaches. I would generally assume that 3-way compare is either equally or only slightly more expensive than a single < or <= operation. But are there cases where it is expensive to distinguish < from == but <= is cheap, for example? 92.234.34.39 (talk) 18:48, 18 June 2008 (UTC)[reply]

First of all, although I don't have a source, I believe the justification for strcmp was to push as much functionality as possible into the library - and particularly in C, which doesn't have a boolean type and wanted to keep the library small, it seemed verbose to create a proliferation of string comparison functions. Early languages such as Pascal did in fact feature all 6 comparison operators for strings.
As for performance, it's pretty much a red herring - a good compiler should produce the same code regardless of whether two comparisons or a three-way comparison is used. At the hardware level, some platforms support three-way comparison and some don't - Intel instruction sets allow you to do one CMP and then multiple conditional jumps based on the flags, which is effectively a three-way comparison. In any case, there is no strong reason to prefer one approach to the other; it's a conceptual tool, not a performance tool. Dcoetzee 19:18, 18 June 2008 (UTC)[reply]
One can speculate about what code a "good compiler" could produce in some situation by equivalencing the good compiler to the tricks and cunning of a human. The difficulty is that a human knows a purpose, but a compiler sees only the syntax. There may well be fancy op codes or sequences that could be used, but one of the significant arguments of the "reduced instruction set" proponents was that compiler-generated code did not use the interesting op codes, merely concatenations of standard and simple productions and so those fancy instructions were left idle. One can speculate about code optimisation analysis of arbitrary brilliance, but remember that this is walking into the the Entscheidungsproblem in general.
In this specific case, suppose that the machine code indeed allows something like
 Load    A
 Compare B
 JumpPos LabelPos
 JumpNeg LabelNeg
 code for =
to handle the three-way test. With a specific syntax (e.g. fortrannish) if (a - b) LabelNeg,LabelEqual,LabelPos then obviously the compiler writer would have every reason to generate suitable code, but if there is no specific syntax, the general syntax would be something like
if A > B then PosStuff
 else if A < B then NegStuff
  else EqualStuff;
and a general optimisation approach would have no particular reason to compare the expression of one if-statement with that of another except insofar as it is comparing all expressions in developing a network of dependencies. The difficulty in general is that although there can be relationships between two expressions, identifying them (via formal symbol manipulation) is arbitrarily difficult especially when "A" and "B" could be complex expressions. Even in the simple case the relationship is not so obvious: "<" is not the same as ">" or alternatively, if the second form had been written B > A the terms are not the same even though the operators are. The compiler's analysis would have to recognise that the second if-expression was very specifically related to the first. In an isolated example with only two if-statements in the entire programme this seems especially obvious to impatient humans. There is more connection between expressions in a compound if-statement than in two isolated if-statements, but has anyone ever seen a compiler message noting that a compound if-statement is over-determined? For instance that the third part was else if A = B then EqualStuff (an unnecessary test given what has gone before), or if the second if was else if A > B again, an error.
The question is whether or not a suitable optimisation for a compiler could be designed, and if so, is it likely to be done? Only then will the desired code be produced. It seems unlikely to me.
I've just tried a test and even with all available optimisations declared desired, there are two comparisons performed, not one. Evidently the compiler is not a "good" one.NickyMcLean (talk) 22:05, 18 June 2008 (UTC)[reply]
Okay, good points - I would think compilers could do this with integer variables as a variant of common subexpression elimination, but if modern compilers don't do this then I'm just making things up. More importantly, a user-defined type with a function implementing less-than would almost certainly require at least two calls to get the same information that a three-way comparison function would yield, which (particularly for expensive comparisons) could be a big deal where comparison is the bottleneck. A good example is determining the frequency of the lexicographically-minimum string in a list of strings. However, I'm still convinced that most uses of three-way comparison for performance are ill-informed premature optimization, and would be careful to avoid making three-way comparisons look like some kind of panacea.
Even the conceptual advantage is not clear-cut to me - three-way comparisons are not as familiar from high school mathematics as other comparisons, but are more concise and less redundant. But I'm not aware of any serious user studies on syntax like this. Dcoetzee 00:38, 19 June 2008 (UTC)[reply]

I have just amused myself with the equivalent of the following minimal test code:

 if fred then print "True"
  else if fred then print "Also true!"  %Same expression! Cannot possibly be true here!!
   else print "False";

This being a compound or nested if-statement there is a relationship between the parts, but despite activating "maximum optimisations", not a twitch was evoked from the compilers I tried. The code did work correctly though. There appears scant chance that reverberating expressions will be recognised. Ho hum. NickyMcLean (talk) 00:15, 9 July 2008 (UTC)[reply]

Compound if? Yes. Nested if? No. — Preceding unsigned comment added by 192.91.171.34 (talk) 20:57, 13 September 2018 (UTC)[reply]

"Shuttle operator" listed at Redirects for discussion

[edit]

An editor has asked for a discussion to address the redirect Shuttle operator. Please participate in the redirect discussion if you wish to do so. signed, Rosguill talk 18:32, 28 February 2020 (UTC)[reply]