Talk:Huber loss
This article is rated Start-class on Wikipedia's content assessment scale.
Corrections needed
As far as I can tell this article is wrong, and the notation is a mess.
+ Please don't use $L$ for every loss function.
+ The suggested criteria seem to be missing the important constraint of convexity.
+ A continuous function $f$ satisfies condition 1 iff $f(x)\geq 1 \, \forall x$. This is not what you want.
+ From the perspective of SVM-style learning, condition 1 or the ideal loss function should be $\delta(x)=\begin{cases} 0&\text{if } x\leq 0\\1& \text{otherwise}\end{cases}$. Then the hinge loss $L^1(x)=\max(x+1,0)$ and the quadratic hinge loss $L^2(x)=(\max(x+1,0))^2$ form upper bounds satisfying condition 1.
Then taking $H$ as the Huber function $H(x)=\begin{cases}x^2/2&x<1\\x &\text{otherwise}\end{cases}$, an appropriate Huber-style loss function would be either $H(\max(x+2,0))$ or $2H(\max(x+1,0))$, as both of these would satisfy the corrected conditions 1-3 and convexity.
I haven't made the above corrections as I'm unfamiliar with Huber loss, and it presumably has uses outside of SVMs in continuous optimization. For these cases criterion 1 will need to be fixed. Hopefully someone who is familiar with Huber's loss can make some corrections. 86.31.244.195 (talk) 17:08, 6 September 2010 (UTC)
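The upper-bound part of the comment above can be checked numerically. The following is only a minimal sketch, assuming the comment's sign convention (a loss of 0 is wanted for $x\leq 0$); the function names and the test grid are hypothetical and do not come from the article.

```python
"""Check on a grid that the hinge losses upper-bound the step loss."""


def step_loss(x: float) -> float:
    # Ideal loss from the comment: 0 if x <= 0, 1 otherwise.
    return 0.0 if x <= 0 else 1.0


def hinge(x: float) -> float:
    # Hinge loss in the comment's convention: max(x + 1, 0).
    return max(x + 1.0, 0.0)


def quadratic_hinge(x: float) -> float:
    # Quadratic hinge loss: (max(x + 1, 0))^2.
    return max(x + 1.0, 0.0) ** 2


def is_upper_bound(loss, grid) -> bool:
    # True if loss(x) >= step_loss(x) at every grid point.
    return all(loss(x) >= step_loss(x) for x in grid)


# Hypothetical test grid: 1001 evenly spaced points on [-5, 5].
GRID = [i / 100.0 for i in range(-500, 501)]

if __name__ == "__main__":
    for name, loss in [("hinge", hinge), ("quadratic hinge", quadratic_hinge)]:
        print(name, is_upper_bound(loss, GRID))
```

A grid check like this can only falsify a bound, not prove it; the shift by 1 matters, since the unshifted $\max(x,0)$ falls below the step loss for small positive $x$.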
Some corrections
I agreed with the previous writer. This article was poorly sourced and made a lot of unqualified and unreferenced claims, and suffered from imbalance, being written from the POV of an enthusiast for "machine learning". I tried to make the most important corrections. Kiefer.Wolfowitz (talk) 13:50, 30 October 2010 (UTC)
External links modified
Hello fellow Wikipedians,
I have just modified one external link on Huber loss. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20150126123924/http://statweb.stanford.edu/~tibs/ElemStatLearn/ to http://statweb.stanford.edu/~tibs/ElemStatLearn/
Cheers.—InternetArchiveBot (Report bug) 00:07, 8 November 2017 (UTC)
An error in a formula
The factor delta squared in the smooth version should be delta. Perhaps one should then add delta > 0 for good measure. 87.52.15.99 (talk) 11:17, 2 September 2023 (UTC)
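For reference, the two scalings can be compared by expanding the square root. This is a sketch under the assumption that the smooth version is the pseudo-Huber loss with residual $a$ and $\delta>0$; the symbol $\tilde L_\delta$ for the variant with a single factor of $\delta$ is introduced here only for illustration.

```latex
\begin{align*}
L_\delta(a) &= \delta^2\left(\sqrt{1+(a/\delta)^2}-1\right)
  \approx \begin{cases} a^2/2 & |a|\ll\delta \\ \delta|a|-\delta^2 & |a|\gg\delta \end{cases} \\
\tilde L_\delta(a) &= \delta\left(\sqrt{1+(a/\delta)^2}-1\right)
  \approx \begin{cases} a^2/(2\delta) & |a|\ll\delta \\ |a|-\delta & |a|\gg\delta \end{cases}
\end{align*}
```

With the factor $\delta^2$ the function behaves like $a^2/2$ near zero and has slope $\delta$ far from zero, which is the scale of the Huber loss; with the factor $\delta$ the slope far from zero is 1, which is the scale of the absolute error.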
Pseudo-Huber loss function (redundant scale factor in loss function)
The delta^2 multiplier is redundant, right? 162.246.139.210 (talk) 18:21, 30 October 2023 (UTC)
• Correct, the Pseudo-Huber loss function works fine without being scaled by delta^2. The only reason it is there is for those who want the Pseudo-Huber loss function to be scaled like the original Huber loss function.
• Notably, if you want the Huber loss function to result in the same scale as SAE (Sum of absolute errors), then it should be divided by delta, thus:
Original Huber: If |a| <= delta Then fn = |a| * |a| / (2 * delta) Else fn = |a| − delta / 2
Pseudo-Huber: fn = delta * (Sqr(1 + (Abs(y − x(i)) / delta) ^ 2) − 1)
• Incidentally, I feel that the choice of delta adds subjective complexity, so I use a much simpler alternative to the Huber loss function, which has a unique solution and functions like SAE at a distance:
fn = |a| ^ 1.001
...where a is the absolute deviation. This alternative is computationally slower than Huber, but beautifully simple! Peter.schild (talk) 14:58, 10 August 2024 (UTC)
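The formulas in the reply above can be written out as runnable code. This is a minimal sketch, assuming `a` is the deviation `y − x(i)`, `delta > 0`, and `Sqr` means square root; the function names are hypothetical and are not taken from the article or from any library.

```python
import math


def huber_sae(a: float, delta: float) -> float:
    # Huber loss divided by delta, so it grows like |a| at a distance.
    a = abs(a)
    if a <= delta:
        return a * a / (2.0 * delta)
    return a - delta / 2.0


def pseudo_huber_sae(a: float, delta: float) -> float:
    # Pseudo-Huber loss scaled by delta instead of delta^2.
    return delta * (math.sqrt(1.0 + (a / delta) ** 2) - 1.0)


def power_loss(a: float, p: float = 1.001) -> float:
    # The simpler alternative from the reply: |a| ** 1.001.
    return abs(a) ** p


if __name__ == "__main__":
    delta = 1.0  # hypothetical choice, only for this comparison
    for a in (0.1, 1.0, 10.0, 100.0):
        print(a, huber_sae(a, delta), pseudo_huber_sae(a, delta), power_loss(a))
```

For large deviations the first two functions differ from |a| only by a constant offset (delta / 2 and roughly delta, respectively), which is why both behave like the sum of absolute errors at a distance.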