
Talk:No free lunch in search and optimization

Former good article: No free lunch in search and optimization was one of the Engineering and technology good articles, but it has been removed from the list. There are suggestions below for improving the article to meet the good article criteria. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.
Article milestones
Date | Process | Result
January 19, 2008 | Good article nominee | Listed
April 24, 2008 | Good article reassessment | Delisted
Current status: Delisted good article



When I checked the link at "Most of these references can be obtained from...", not only were the files unavailable, but the citations were incorrect and/or incomplete. 22:34, 21 November 2006 (UTC)

A request/challenge: somebody might try translating this explanation (and others like it) into English, so that the rest of us can figure out how it might apply to our fields. I do appreciate the critique of Dembski's application, but I need more examples in familiar (as opposed to jargon) terms.

Factually incorrect figure


The figure is incorrect, and has to go. In fact, when all cost functions are equally likely, each algorithm observes each possible sequence of cost values with equal likelihood, so there is no specialist/generalist trade-off of the sort depicted in the plot. To state that more precisely: if there are N domain elements and M codomain elements (distinct costs), then the probability that an arbitrarily chosen algorithm will observe values in an arbitrarily chosen sequence is M^(-N).
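A brute-force check of this claim on a tiny search space (my own sketch, not from the literature; it uses fixed probe orders for simplicity, though the same counting argument covers adaptive algorithms):

    from itertools import product
    from collections import Counter

    N, M = 3, 2                        # N domain elements, M distinct cost values
    alg_a = (0, 1, 2)                  # two non-repeating probe orders
    alg_b = (2, 0, 1)

    for order in (alg_a, alg_b):
        # Tally the value sequence each algorithm observes, over all M**N functions.
        counts = Counter(tuple(f[x] for x in order)
                         for f in product(range(M), repeat=N))
        # Every possible sequence occurs exactly once, so under a uniform prior
        # on functions each sequence has probability M**(-N).
        assert len(counts) == M ** N and set(counts.values()) == {1}
    print("probability of any given sequence:", M ** -N)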

At the risk of seeming like I'm engaged in self-promotion, I'm going to take a shot at rewriting this article. The touted "simple explanation" is not freely accessible online, and it happens that I gave a very simple explanation during a student's thesis defense in 1994. I will try to respond to the "request / challenge" above, but part of me says that into every life a little math must fall. NFL is inherently about probability distributions on functions and sequences. ThomHImself 00:54, 11 March 2007 (UTC) Tom English[reply]


"The No Free Lunch Theorem" is not the point


Section 3 of Wolpert and Macready (1997) gives two NFL theorems, not one. Both the 1995 tech report and the 1997 article of W&M refer to theorems, plural, in their titles. This makes reference to "the theorem" something of an embarrassment. Beyond that, people who state "the theorem" rarely, if ever, state the first NFL theorem (the one they seem to have in mind). Instead they state their favorite interpretations.

The fact is that the actual NFL theorems are not as interesting as their immediate implications. That is why I am moving this article to "No Free Lunch in Search and Optimization," with "No-free-lunch theorem" redirecting here. ThomHImself 21:10, 11 March 2007 (UTC) Tom English[reply]

Note that double redirects do not work, so if you do move this page, you will need to update any double redirects created as a result of the move. The What Links To No-free-lunch_theorem special page will help in locating them. -- Cat Whisperer 21:18, 11 March 2007 (UTC)[reply]

More changes to come


A figure will clarify the introduction.

Watch for sections giving a gentle introduction to NFL and formalizing NFL.

I made a large number of changes without logging in. Sorry, but I'm a neophyte, and didn't know I wasn't logged in. ThomHImself 05:25, 29 March 2007 (UTC)[reply]

Deleted folkloric theorem from introduction


It is somewhat humorous that someone added to the introduction precisely what I document as mathematical folklore in a later section on "the" NFL theorem:

Put simply, the theorem says that over the average of all possible solution landscapes, no search algorithm is on average better than any other.

No one who makes this claim ever cites both an article and a theorem number. There is no such theorem in the writings of Wolpert and Macready. Furthermore, at the point where the sentence was added, there had been no mention whatsoever of a theorem. The article is now about NFL in search and optimization, and the folkloric theorem is just one aspect of that. ThomHImself 06:16, 29 March 2007 (UTC)[reply]
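For what it's worth, the folkloric statement is true as an implication of the original theorems even though it is not itself a numbered theorem. A brute-force check on a tiny space (my own sketch, with "best cost found in K evaluations" as an assumed performance measure):

    from itertools import product, permutations
    from statistics import mean

    N, M, K = 3, 3, 2
    functions = list(product(range(M), repeat=N))   # all M**N "solution landscapes"

    def best_of_k(order, f):
        return min(f[x] for x in order[:K])         # best (lowest) cost in K probes

    # Averaged over all landscapes, every probe order performs identically.
    averages = {mean(best_of_k(order, f) for f in functions)
                for order in permutations(range(N))}
    assert len(averages) == 1
    print("common average performance:", averages.pop())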

Deleted "Additional Notes" message


I took this out, and I would be interested in knowing what's going on:

==Additional Notes==
This is also referenced by the infamous Pascal. If this is heard, please disbelieve him.

ThomHImself 23:34, 26 April 2007 (UTC)[reply]

Math typesetting


Thanks to whoever cleaned up the math typesetting in "Synopsis of NFL in mathematical terms." I will see later whether I can use <math> to improve some other material. —The preceding unsigned comment was added by ThomHImself (talkcontribs) 05:05, 29 April 2007 (UTC).[reply]

Moved Info on Baylor Lab


The treatment of Baylor's Evolutionary Informatics Laboratory as part of the "great intelligent design conspiracy" was unjustified. We have to go on what the lab actually gives evidence of doing. What I see is a generalization of Dembski's ideas to what could be an entirely legitimate field of study. It is certainly the case that practitioners of evolutionary computation supply domain knowledge when they can, and that it is important to understand how this information affects the results obtained.

The text entered in the article reduced ultimately to argumentum ad hominem, directed against Dembski, actually. Bob Marks has made outstanding contributions to engineering that not even an opponent of intelligent design like me questions. When he says he is engaged in legitimate scholarship, one has to suppose he is, in the absence of evidence to the contrary. ThomHImself 23:21, 19 June 2007 (UTC)[reply]

Cullen Schaffer and conservation of generalization accuracy


NFL is in fact what I prefer to call conservation of search performance, in analogy with Schaffer's conservation of generalization accuracy in machine learning. (There is as yet no article on conservation of generalization accuracy, and the machine learning article makes no reference to conservation.) This article could use a section relating NFL to conservation of generalization accuracy, but only once the main article exists.

Instead we got "Cullen Schaffer did it first, only it was a little different" editing that put his name in the introduction and at the head of the list of references. I have given positive suggestions, but to what was entered in the article, I will simply say no.

A little story: I sketched a proof of the first NFL theorem in 1994, during a public thesis defense. I asked the student what he hoped to accomplish by tweaking optimizers to get better performance on test functions. He said, of course, that he hoped to obtain a generally superior optimizer. After a brief pause, I responded that optimizing a uniformly drawn function was equivalent to optimizing the output of a uniform random number generator. The silence that fell over the room was remarkable. I was never so arrogant as to think I was the only person to complete such a shallow line of reasoning. Over the years, I have met various people who realized there was NFL before Wolpert and Macready called it that.
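The equivalence in the story is easy to demonstrate by simulation (my own sketch; the total-variation diagnostic is an assumed check, not anything from the literature):

    import random
    from collections import Counter

    M, N, TRIALS = 5, 4, 100_000
    seen_fn, seen_rng = Counter(), Counter()
    for _ in range(TRIALS):
        f = [random.randrange(M) for _ in range(N)]   # uniformly drawn function
        seen_fn[tuple(f)] += 1                        # values seen by a fixed probe order
        seen_rng[tuple(random.randrange(M) for _ in range(N))] += 1   # raw RNG output
    # Both empirical distributions over the M**N value sequences converge to the
    # same uniform distribution: an optimizer cannot tell the two sources apart.
    tv = sum(abs(seen_fn[s] - seen_rng[s])
             for s in set(seen_fn) | set(seen_rng)) / (2 * TRIALS)
    print(f"empirical total-variation distance: {tv:.3f}")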

The substantive part of Wolpert and Macready's work is the information-geometric analysis of "alignment" of algorithms and problems. ThomHImself 23:59, 26 June 2007 (UTC)[reply]

I have added a couple of sentences about NFL as conservation of search performance, along with an indication that conservation of generalization accuracy came first, to the overview. Somebody please write the other article and give Schaffer his due. ThomHImself 04:48, 27 June 2007 (UTC)[reply]

GA Tip


Perhaps images should be added to this. In the [criteria], it says that articles should be illustrated. --Dan LeveilleTALK 08:54, 28 December 2007 (UTC)[reply]

Thanks, Dan. There used to be an illustration, but it was based on a common misunderstanding of NFL. The only clear illustration I have seen of how NFL "works" is a collection of three figures in the English (2004) paper. They convey technical detail, and adapting them for the article is not easy. I plan to decorate the introduction with something like Figure 4 (it fits with the final paragraph), though I worry about discouraging the general reader. Illustrations of a decision tree and of its operation on objective functions would enhance the "formal synopsis" section. I would very much appreciate advice from anyone on this matter. ThomHImself (talk) 23:38, 15 January 2008 (UTC)[reply]
Good point. It is a hard situation. --Dan LeveilleTALK 01:24, 16 January 2008 (UTC)[reply]
See "Illustration" section below.ThomHImself (talk) 07:48, 18 January 2008 (UTC)[reply]

New section title


Regarding "Specified complexity, active information, and NFL": The unpublished papers coauthored by Dembski and Marks make no reference to intelligent design or to specified complexity. The notion of active information is very different from specified complexity, and is in itself reasonable. Marks seems to me an honest creationist who shares IDists' interest in information. The use of "active information" in arguments dismissing the success of computational simulations of evolutionary processes is undeniably analogous to the use of "specified complexity" in Dembski's earlier arguments. There is certainly the possibility that ID has again morphed. Thus I have opted to keep discussion of the two concepts together, but to avoid the suggestion that Marks is a member of the intelligent design movement.ThomHImself (talk) 20:24, 15 January 2008 (UTC)[reply]

I decided to tag both Dembski and Marks with a neo-creationism "see also." I've left it to the reader to decide whether what I've described in neutral terms qualifies as neo-creationism. Marks has given science-oriented apologetics presentations around the world, so I think this is fair enough. I also added text to make it clear that Dembski and Marks' argument that researchers "smuggle" active information into evolutionary simulations is analogous to Dembski's earlier arguments that researchers smuggle complex specified information into algorithms. ThomHImself (talk) 23:12, 15 January 2008 (UTC)[reply]

Illustration


I think a simple illustration to go with the introduction would be two menus, one offering low prices to the vegan and the other offering low prices to the carnivore. The prices on each menu would simply be those on the other, reordered. I would appreciate hearing from any passersby whether this would clarify matters. ThomHImself (talk) 07:48, 18 January 2008 (UTC)[reply]

I apologize for being slow in adding an illustration, and I appreciate Josedavid's attempt to improve the article by adding one. Unfortunately, the illustration was very similar to the one (taken from NFL.org) I discussed in "Factually incorrect figure" above. In the NFL framework, there really is no such thing as a specialized algorithm. Any two algorithms A and B have identically distributed performance values; they may differ in which problems yield given levels of performance. The illustrated situation simply cannot arise. The fact that such depictions of NFL are common indicates just how widespread misunderstanding of the fundamental working of NFL is.
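To make that distinction concrete (my own sketch): over all cost functions on a tiny domain, two probe orders have identical performance histograms yet succeed on different sets of problems.

    from itertools import product

    N, M, K = 3, 2, 2
    functions = list(product(range(M), repeat=N))

    def perf(order, f):
        return min(f[x] for x in order[:K])     # best cost found in K probes

    A, B = (0, 1, 2), (2, 1, 0)
    # Identically distributed performance values over all problems...
    assert sorted(perf(A, f) for f in functions) == sorted(perf(B, f) for f in functions)
    # ...achieved on different problems.
    wins_A = {f for f in functions if perf(A, f) == 0}
    wins_B = {f for f in functions if perf(B, f) == 0}
    print(wins_A == wins_B)                     # False: the "good" problems differ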

Again, thanks to Josedavid for the effort, and I'm sorry to have to remove the work. I will try to get the promised illustration into the article by tomorrow. ThomHImself (talk) 02:01, 19 February 2008 (UTC)[reply]

OK, I now have in place the simplest correct illustration of NFL I've ever come by. I'm not happy with the fuzziness of the captions, and I'm going to take a shot (in my own due time, perhaps) at making an SVG, rather than PNG, image. ThomHImself (talk) 20:54, 19 February 2008 (UTC)[reply]

A suggestion for the illustration: For the f000 entry, the bars should both be 3, since all 3 plates must be examined to determine no good result exists. Yoderj (talk) 13:58, 18 September 2008 (UTC)[reply]

"Paring this down per WP:NPOV. Size and credulousness of section gave undue weight to ID proponents views."


Treatment of a topic in the encyclopedia is not endorsement. user:Odd Nature has been unable to provide a reliable source at Robert J. Marks II for his claim that Marks is an ID advocate, so he has come here to assert that Marks is an ID advocate. He obliterated a section here documenting the fact that Marks' active information is not equivalent to Dembski's specified complexity. The content was sourced. He also obliterated a section of Evolutionary Informatics Lab that included Mark Chu-Carroll's criticism of specified complexity and acknowledgment that active information might be useful. This looks like systematic removal of a balanced response, not enforcement of NPOV. Any human being, including Dembski, is capable of producing both bogus and useful ideas. We have to report on the ideas as ideas, and not engage in thinly veiled ad hominem attacks.

Odd Nature has a habit of deleting without discussing. I look forward to having my credulousness demonstrated to me in detail. BTW, I have a chapter highly critical of ID theory to appear in an edited volume this June. ThomHImself (talk) 09:17, 18 April 2008 (UTC)[reply]

Removing original research


I had little knowledge of Wiki policy when I began editing this article. There is a great deal of my original research here. Having since held others to WP:NOR in other articles, I am going to police myself here. ThomHImself (talk) 16:47, 24 April 2008 (UTC)[reply]

Introduction


The second paragraph elaborating the NFL metaphor was something that I have never seen in the literature, and was my WP:Synthesis. I thought long and hard about the figure, because I had neither contributed nor seen a good, simple figure in the NFL literature. Obviously this was my synthesis. It did not follow immediately from anything in the literature, and was supported only by my own authority. ThomHImself (talk) 17:07, 24 April 2008 (UTC)[reply]

Example: NFL in a roulette game


When the article passed GA review, a reviewer expressed vague doubts about this section, and I didn't understand what s/he was driving at. The example was entirely novel, reflecting how I understand my own formal results in the source I gave. To my knowledge, there is nothing like the example in the NFL literature. ThomHImself (talk) 17:21, 24 April 2008 (UTC)[reply]

Formal synopsis of NFL


The "functionally equivalent" algorithm was my invention. To my knowledge, neither it nor its functional equivalence to anything appears in the NFL literature. The nice concept of Phi-NFL distributions on objective functions was something I came by working on this article. I have not published it, and to my knowledge no one has published an equivalent concept. ThomHImself (talk) 17:41, 24 April 2008 (UTC)[reply]

Original NFL theorems


My statement in plain language of what Theorem 1 "says" was novel, and supported only by my own "authority." Similarly, my claim that the theorem implies much more than the "folkloric theorem" is unsupported, and furthermore reflects my bias. I will remove this bias throughout the article later. ThomHImself (talk) 17:51, 24 April 2008 (UTC)[reply]

Interpretation of NFL results


This section was full of common misinterpretations of the NFL results when I began editing the article. I replaced them with my own "wise" interpretation, synthesizing statements from published sources and adding two longish paragraphs with no sources whatsoever. I have now pruned this section to the interpretation that was in the article when I first came on the scene, plus qualifying remarks directly supported by sources. ThomHImself (talk) 18:12, 24 April 2008 (UTC)[reply]

Folkloric NFL theorem


That people substitute a plain-language statement for Wolpert and Macready's Theorem 1, and that they reduce the entire body of NFL results to that folkloric theorem, is a pet peeve of mine. This probably owes in part to the fact that it deemphasizes my own contributions. Characterization of the plain-language statement as a folkloric theorem occurs nowhere but here to my knowledge, and emphasis of the characterization definitely reflects my bias. Thus I have removed it from the article. ThomHImself (talk) 18:40, 24 April 2008 (UTC)[reply]

Overview and (A)NFL theorem


Omission of the (A)NFL theorem was originally an oversight on my part. When I came upon it again, six years after first reading the paper proving it, I did not agree with the authors' interpretation, and pondered how to treat their interpretation without contradicting different claims in the article. I should never have considered this. What is published is what should be reported here. My unpublished reservations have no place here.

The (A)NFL theorem deserves mention in the introduction. ThomHImself (talk) 19:26, 24 April 2008 (UTC)[reply]

The discussion of Kolmogorov complexity of functions and algorithms is reduced to theoretical finery with the inclusion of the (A)NFL theorem. To allow it to remain in Wikipedia now would serve merely to promote my own work. Furthermore, Kolmogorov complexity is a confusing topic, and including it here tends to confuse the treatment of NFL. ThomHImself (talk) 19:56, 24 April 2008 (UTC)[reply]

Possibly defamatory claim about Robert J. Marks II


Please do not repeat the claim that Marks is a proponent of a certain ideology (see this diff) unless you can provide a source allowed by policy on reliable sources for living persons.

WP:BLP specifies immediate removal of possibly defamatory or otherwise damaging claims about living persons that are not sourced according to policy. This applies to all of Wikipedia (talk pages and so on), not just articles. The three-revert rule does not apply to defense of WP:BLP. Note in the diff above that I've done nothing but remove the possibly defamatory or otherwise damaging claim. ThomHImself (talk) 22:30, 24 April 2008 (UTC)[reply]

Suggestions for making this a GA again


Figure


I have twice removed factually incorrect figures from this article. Anyone who adds a figure should adapt it from a figure in the NFL literature (peer-reviewed article or published book), and not the one at the NFL website (it has already been in this article). ThomHImself (talk) 23:43, 24 April 2008 (UTC)[reply]

Elaboration and elucidation


With my removal of a massive amount of exposition that was my original research or reflected my personal opinion, some sections of the article are very thin. They need additional material that is not made up by the editor (don't repeat my violations of Wiki policy), but is directly supported by books or peer-reviewed papers on NFL. ThomHImself (talk) 23:43, 24 April 2008 (UTC)[reply]

Intelligent design and NFL


This section has been appropriated for polemics by anti-ID editors. I have been accused of conflict of interest in the matter, so I will do nothing with the section but defend Wikipedia policy on claims about living persons. There should be discussion of invocations of NFL by Dembski and Marks at Evolutionary Informatics Lab or in an article on active information. I recommend that someone who wants to make this a maintainable technical article simply delete the section. ThomHImself (talk) 23:43, 24 April 2008 (UTC)[reply]

A More Friendly Introductory Paragraph


I think this article could benefit from another sentence in the introduction explaining the topic to laypeople. As a computer science major unfamiliar with the topic, I found it confusing. —Preceding unsigned comment added by 69.143.161.239 (talk) 03:13, 8 May 2008 (UTC)[reply]

Deletion of constraint-satisfaction content from introduction


The search in constraint satisfaction is of a fundamentally different type from that addressed here. In constraint satisfaction, the problem instance is decomposed. This article addresses "search" as iterative sampling of the objective function; that is, the objective function is treated more or less as a black box. Schaffer's result in machine learning is relevant because it, too, treats the function as a black box. ThomHImself (talk) 21:52, 3 September 2008 (UTC)[reply]

Confusing illustration


From the current article:

In the "no free lunch" metaphor, each "restaurant" (problem-solving procedure) has a "menu" associating each "lunch plate" (problem) with a "price" (the performance of the procedure in solving the problem). The menus of restaurants are identical except in one regard — the prices are shuffled from one restaurant to the next. For an omnivore who is as likely to order each plate as any other, the average cost of lunch does not depend on the choice of restaurant. But a vegan who goes to lunch regularly with a carnivore who seeks economy pays a high average cost for lunch. To methodically reduce the average cost, one must use advance knowledge of a) what one will order and b) what the order will cost at various restaurants. That is, improvement of performance in problem-solving hinges on using prior information to match procedures to problems.

I don't think this is a good metaphor — or maybe I'm just missing something here. Clearly neither the vegan nor the carnivore will ever get a "free lunch" if each "lunch plate" has a non-zero "price"! So the "No Free Lunch" theorem doesn't seem to have anything to do with the cost of lunch in our allegory — which seems unnecessarily confusing to me. Either the mathematical "Free Lunch" should be represented in the allegory by a lunch which is free, or else the allegory shouldn't be talking about the allegorical price of allegorical lunches at all.

Lastly, the last sentence of the paragraph quoted above seems to be trying to explain that "No Free Lunch" is rarely relevant in the real world, but it would IMHO be a good idea to state more forcefully that yes, some search algorithms are better than others in the real world (e.g., don't use a linear search if a binary search would be faster; GIF images are usually smaller than BMPs; and so on) precisely because the real world is not uniformly distributed, and is full of "advance knowledge" and "prior information". The current article seems to be aimed at an audience of experts to the bafflement of non-experts, which I found weird given that it's an article about a theorem with a "folksy" name. --Quuxplusone (talk) 19:12, 8 October 2008 (UTC)[reply]
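For reference, the arithmetic the metaphor is driving at can be made concrete (a minimal sketch with hypothetical plates and prices, not taken from the article or the literature):

    from statistics import mean

    plates = ("tofu", "salad", "steak", "ribs")
    menu_1 = {"tofu": 9, "salad": 7, "steak": 3, "ribs": 5}   # hypothetical prices
    menu_2 = {"tofu": 3, "salad": 5, "steak": 9, "ribs": 7}   # same prices, shuffled

    # Omnivore (any plate, uniformly): average cost is the same at both restaurants.
    print(mean(menu_1[p] for p in plates), mean(menu_2[p] for p in plates))

    # Vegan (tofu or salad only): average cost depends strongly on the restaurant.
    veg = ("tofu", "salad")
    print(mean(menu_1[p] for p in veg), mean(menu_2[p] for p in veg))

The point is not that lunch is ever free; what is conserved is the average over all plates, which echoes the critique above about the naming.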

Indeed. I read the "dumbed down" version and was thoroughly confused. The technical explanations were much more palatable (a bad sign)! —Preceding unsigned comment added by 65.183.135.231 (talk) 22:02, 15 October 2008 (UTC)[reply]

Also, the no-free-lunch "metaphor" does not explain the name of the theorem. The name simply comes from the fact that you cannot get something for free: if some heuristic gives you improved performance on a certain problem, there exists another problem where the heuristic performs worse. So no heuristic gives you improved performance "for free"; hence, there is "no free lunch". This is a much simpler metaphor (Occam's razor, anyone?). -- HolKann
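HolKann's trade-off can be shown with a toy example (my own sketch): whatever a probe order gains on one function, it gives back on a permutation of that function.

    f      = [0, 9, 9, 9]                   # minimum sits at the left end
    f_perm = f[::-1]                        # same values, permuted

    def best(order, g, budget=2):
        return min(g[x] for x in order[:budget])

    L2R, R2L = [0, 1, 2, 3], [3, 2, 1, 0]   # two probe orders
    print(best(L2R, f), best(R2L, f))             # 0 9: left-to-right wins on f
    print(best(L2R, f_perm), best(R2L, f_perm))   # 9 0: right-to-left wins on f_perm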

Herb Sutter essay


Should this article, or perhaps a related article, mention Herb Sutter's "The Free Lunch is Over" essay (http://www.gotw.ca/publications/concurrency-ddj.htm), which describes another way in which there is no "free lunch" in computing? —Preceding unsigned comment added by DestroyerofDreams (talkcontribs) 01:53, 27 February 2010 (UTC)[reply]

No, that's a completely different use of the term "no free lunch". Sutter is talking about how software engineers have had it easy in the past: their software ran faster, with no work on their part, each time a faster CPU came out. His "no free lunch" means that now that CPUs aren't getting much faster, and speed is coming via parallel/multi-core hardware, the software engineer has to actually get off his/her butt and do some work to make single-core applications run on multiple cores and get significant speedups (or actually go back and re-code the inefficient garbage).

This does bring up an interesting related subject, though: I don't think the NFL theorems have anything to say about that sort of efficiency. For instance, will anyone disagree that coder #1's solution is "superior" to coder #2's if both implement "the same" random-guessing algorithm, but, averaged over all computer hardware/architectures available on the planet, coder #1's random guesser evaluates 10,000 guesses for every one that coder #2's evaluates? I don't think that's what the NFL theorems are talking about at all, but the way people try to explain the NFL theorems in plain English to a non-specialist makes them sound like a bunch of crap, when anyone involved in practical problems knows that there are practical reasons why you can easily claim one algorithm is superior to another.

Simplify/Rewrite Introduction for non-specialists


The introduction to the article is written using too much jargon, and is generally unclear. It should be rewritten so that people not already familiar with the topic can understand what it is talking about. The word "procedures" seems to have a special meaning, which is not a priori obvious. Perhaps the wording in the "Overview" section should be moved here, as it is written much more clearly. — Preceding unsigned comment added by 68.108.237.62 (talk) 17:43, 20 April 2012 (UTC)[reply]

Further research, closed under permutation, countable vs uncountable, multiobjective optimization, etc.


There are many missing extensions that should be added to this article. I do not have the time to write this out nicely in the main article, but I have done extensive background research, so I will merely post snippets of what I wrote for part of a written comprehensive exam at my university. I would appreciate it if someone would integrate this into the article.


The NFL theorems hold for any set of functions closed under permutation \cite{Schumacher2001, Droste2002, Whitley2005, Auger2010}.

The NFL theorems are independent of whether the set of functions is compressible \cite{Schumacher2001}. "Compressible" and "incompressible" refer to information-theoretic measures.

The NFL theorems cease to apply in multiobjective optimization. There, due to memory considerations, part of the Pareto front is forgotten. This leads to differences in performance between algorithms, but there is no "ideal algorithm" in multiobjective optimization \cite{Corne2003a, Corne2003b}.

\citet{Auger2007, Auger2010} show that a weaker version of the NFL theorems applies to countably infinite sets of functions, but that the NFL theorems fail to apply to uncountably infinite sets of functions.

\citet{Culberson1998} notes that black-box problems are much harder than the NFL theorems state. Without any prior knowledge, black-box optimization is harder than NP-hard optimization.

Another criticism is that not all functions are equally likely to occur in real life \cite{Giraud2005, Droste2002}. Functions which are closed under permutation are rare \cite{Igel2003, Igel2004}.
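A tiny illustration of the closure-under-permutation condition (my own sketch, in the spirit of \cite{Schumacher2001}): averaged over a permutation-closed set of functions, all probe orders perform identically; over a set that is not closed, they need not.

    from itertools import permutations
    from statistics import mean

    base = (0, 1, 2)
    closed     = {tuple(base[i] for i in p) for p in permutations(range(3))}
    not_closed = {(0, 1, 2), (0, 2, 1)}     # omits the other permutations

    def best(order, f, budget=2):
        return min(f[x] for x in order[:budget])

    for F in (closed, not_closed):
        averages = {mean(best(order, f) for f in F)
                    for order in permutations(range(3))}
        print(len(averages))                # 1 for closed, more than 1 for not_closed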

Sources below (not all may end up being used):

@article{Culberson1998,

   author = {Culberson, Joseph C.},
   title = {On the futility of blind search: An algorithmic view of `no free lunch'},
   journal = {Evolutionary Computation},
   issue_date = {Summer 1998},
   volume = {6},
   issue = {2},
   month = {June},
   year = {1998},
   issn = {1063-6560},
   pages = {109--127},
   numpages = {19},
   url = {http://dx.doi.org/10.1162/evco.1998.6.2.109},
   doi = {10.1162/evco.1998.6.2.109},
   acmid = {1326777},
   publisher = {MIT Press},
   address = {Cambridge, MA, USA},
   keywords = {Blind search, NP-complete, adversary arguments, computational complexity, knowledge classes, lower bounds, no free lunch},

}


@inproceedings{Schumacher2001,

   title={The no free lunch and problem description length},
   author={Schumacher, C. and Vose, M.D. and Whitley, L.D.},
   booktitle={Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001)},
   pages={565--570},
   year={2001},

}

@ARTICLE{Droste2002,

 author = {Stefan Droste and Thomas Jansen and Ingo Wegener},
 title = {Optimization with Randomized Search Heuristics -- The (A)NFL Theorem, Realistic Scenarios, and Difficult Functions},
 journal = {Theoretical Computer Science},
 volume = {287},
 number = {1},
 pages = {131--144},
 year = {2002},
 url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.5850}

}

@ARTICLE{Igel2004,

 author = {Christian Igel and Marc Toussaint},
 title = {A no-free-lunch theorem for non-uniform distributions of target functions},
 journal = {Journal of Mathematical Modelling and Algorithms},
 year = {2004},
 volume = {3},
 url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.8446}

}

@ARTICLE{Igel2003,

 author = {Igel, Christian and Toussaint, Marc},
 title = {On classes of functions for which No Free Lunch results hold},
 journal = {Information Processing Letters},
 year = {2003},
 volume = {86},
 pages = {317--321},
 number = {6},
 month = jun,
 acmid = {941196},
 address = {Amsterdam, The Netherlands, The Netherlands},
 doi = {10.1016/S0020-0190(03)00222-9},
 issn = {0020-0190},
 issue_date = {30 June 2003},
 keywords = {combinatorial problems, no free lunch, optimization, randomized algorithms},
 numpages = {5},
 publisher = {Elsevier North-Holland, Inc.},
 url = {http://dx.doi.org/10.1016/S0020-0190(03)00222-9}

}

@ARTICLE{Wolpert1992,

   author = {David H. Wolpert},
   title = {Stacked generalization},
   journal = {Neural Networks},
   year = {1992},
   volume = {5},
   issue = {2},
   month = {February},
   issn = {0893-6080},
   pages = {241--259},
   numpages = {19},
   url = {http://dx.doi.org/10.1016/S0893-6080(05)80023-1},
   doi = {10.1016/S0893-6080(05)80023-1},
   acmid = {148453},
   publisher = {Elsevier Science Ltd.},
   address = {Oxford, UK, UK},
   keywords = {Combining generalizers, Error estimation and correction, Generalization and induction, Learning set preprocessing, cross-validation},

}

@INPROCEEDINGS{Wolpert2001,

 author = {David H. Wolpert},
 title = {The supervised learning no-free-lunch Theorems},
 booktitle = {Proceedings 6th Online World Conference on Soft Computing in Industrial Applications},
 year = {2001},
 pages = {25--42},
 url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.133}

}

@ARTICLE{Wolpert2005,

   author = {Wolpert, D.H. and Macready, W.G.}, 
   journal = {IEEE Transactions on Evolutionary Computation}, 
   title = {Coevolutionary free lunches}, 
   year = {2005}, 
   month = {dec.}, 
   volume = {9}, 
   number = {6}, 
   pages = {721--735}, 
   keywords = { black box optimization; coevolutionary free lunches; multiarmed bandit problems; no free lunch theorem; self-play problem; subsequent multiplayer game; decision making; evolutionary computation; game theory;}, 
   doi = {10.1109/TEVC.2005.856205}, 
   ISSN = {1089-778X},

}


@inproceedings{Giraud2005,

   author = {Giraud-Carrier, C. and Provost, F.},
   title = {Toward a Justification of Meta-learning: Is the No Free Lunch Theorem a Show-stopper?},
   booktitle = {Proceedings of the ICML-2005 Workshop on Meta-learning},
   keywords = {ensembles, simplicity},
   pages = {12--19},
   posted-at = {2006-11-12 13:48:01},
   url = {http://faculty.cs.byu.edu/~cgc/Research/Publications/ICML2005WS.pdf},
   year = {2005},

}

@INPROCEEDINGS{Auger2007,

 author = {Auger, Anne and Teytaud, Olivier},
 title = {Continuous lunches are free!},
 booktitle = {Proceedings of the 9th annual conference on Genetic and evolutionary computation},
 year = {2007},
 series = {GECCO '07},
 pages = {916--922},
 address = {New York, NY, USA},
 publisher = {ACM},
 acmid = {1277145},
 doi = {10.1145/1276958.1277145},
 isbn = {978-1-59593-697-4},
 keywords = {Kolmogorov's extension theorem, free-lunch, no-free-lunch},
 location = {London, England},
 numpages = {7},
 url = {http://doi.acm.org/10.1145/1276958.1277145},

}

@ARTICLE{Auger2010,

 author = {Auger, Anne and Teytaud, Olivier},
 title = {Continuous Lunches Are Free Plus the Design of Optimal Optimization Algorithms},
 journal = {Algorithmica},
 year = {2010},
 volume = {57},
 pages = {121-146},
 doi = {10.1007/s00453-008-9244-5},
 issn = {0178-4617},
 issue = {1},
 keywords = {No-Free-Lunch; Kolmogorov’s extension theorem; Expensive optimization; Dynamic programming; Complexity; Bandit-based Monte-Carlo planning},
 language = {English},
 publisher = {Springer-Verlag},
 url = {http://dx.doi.org/10.1007/s00453-008-9244-5}

}

@INPROCEEDINGS{Corne2003a,

 author = {Corne, D. and Knowles, J.},
 title = {No free lunch and free leftovers theorems for multiobjective optimisation problems},
 booktitle = {Evolutionary Multi-Criterion Optimization},
 year = {2003},
 pages = {66--66},
 organization = {Springer},
 url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.1480}

}


@INPROCEEDINGS{Corne2003b,

 author = {Corne, D. and Knowles, J. },
 title = {Some multiobjective optimizers are better than others},
 booktitle = {Proceedings Congress Evolutionary Computation CEC '03},
 year = {2003},
 volume = {4},
 pages = {2506--2512},
 doi = {10.1109/CEC.2003.1299403},

}

@INCOLLECTION{Whitley2005,

 author = {Whitley, Darrell and Watson, Jean},
 title = {Complexity Theory and the No Free Lunch Theorem},
 booktitle = {Search Methodologies},
 publisher = {Springer US},
 year = {2005},
 editor = {Burke, Edmund K. and Kendall, Graham},
 pages = {317-339},
 abstract = {This tutorial reviews basic concepts in complexity theory, as well as various No Free Lunch results and how these results relate to computational complexity. The tutorial explains basic concepts in an informal fashion that illuminates key concepts. No Free Lunch theorems for search can be summarized by the following result: For all possible performance measures, no search algorithm is better than another when its performance is averaged over all possible discrete functions.}, 
 affiliation = {Colorado State University Department of Computer Science Fort Collins CO USA},
 doi = {10.1007/0-387-28356-0_11},
 isbn = {978-0-387-28356-2},
 keyword = {Mathematics and Statistics},
 url = {http://dx.doi.org/10.1007/0-387-28356-0_11}

}

NFL does not imply "matching algorithms to problems gives higher average performance than does applying a fixed algorithm to all"


As "matching algorithms" in itself requires an algorithm to optimize "average performance", according to NFL, you do not gain anything compared to randomly matching, or matching everything with "a fixed algorithm". Only if you have a priori knowledge of the subset of all problems you're going to match on, can you expect to gain anything. -- HolKann