
Talk:Commodity computing

From Wikipedia, the free encyclopedia

sentence on fault tolerance/mtbf doesn't make sense


This article currently says (in the lede): """At some point, the number of discrete systems in a cluster will be greater than the mean time between failures (MTBF) for any hardware platform, no matter how reliable, so fault tolerance must be built into the controlling software."""

The number of systems can't be "greater than" a time. (5 servers is greater than an hour? 10 servers is greater than a meter?)

There are 2 refs for that sentence atm and I wasn't able to read either of them. One is a journal article behind a paywall and one is a 404 (but might be available in the Wayback Machine). Can't look any further right now, maybe later. --Jeremyb (talk) 21:29, 30 November 2013 (UTC)

I've added a "dubious" tag next to this statement for the same reason as Jeremyb. I hope somebody will come across it and elaborate. --Ajgorhoe (talk) 13:52, 15 September 2017 (UTC)

I think what they were trying to say is that the number of discrete systems in a cluster will become so large that the mean time between any one of those systems failing will become so small as to be unacceptable.

Imagine a computer with an MTBF of 1 year, with failures that are independent and occur at a constant rate. If you built a system that used 2 of these computers, in which either computer failing would break the system, the MTBF of that system would be 6 months (theoretically). If you used 365 of them, the MTBF would be 1 day, which would be unacceptable for most applications, hence requiring fault-tolerant software.
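The arithmetic above can be checked with a short sketch. It assumes identical machines that fail independently at a constant rate, so that the failure rates simply add up; the function name `series_mtbf` and the 365-day year are illustrative choices, not anything taken from the article or its references.

```python
def series_mtbf(unit_mtbf_days: float, n_units: int) -> float:
    """MTBF (in days) of a system that breaks when ANY one of its units fails.

    Assumption: units are identical and fail independently at a constant
    rate, so the system failure rate is the sum of the unit failure rates.
    """
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    unit_rate = 1.0 / unit_mtbf_days      # failures per day for one unit
    system_rate = n_units * unit_rate     # rates add for independent units
    return 1.0 / system_rate


if __name__ == "__main__":
    # One unit has an MTBF of 1 year, taken here as 365 days.
    for n in (1, 2, 365):
        print(f"{n:>3} units: system MTBF is about {series_mtbf(365.0, n):.1f} days")
```

Under these assumptions, 2 units give roughly half a year and 365 units give roughly one day, which matches the figures in the comment. Real hardware rarely has a perfectly constant failure rate, so this is only a rough model.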

The problem is that the sentence is comparing two different kinds of quantity (as Jeremyb pointed out): a number of discrete systems and a time. This is the first time I'm contributing to Wikipedia, so I'll let someone else fix it as I don't know the protocol. --Tkeady5 (talk) 20:30, 27 August 2018 (UTC)