Talk:Thread safety
This article is rated Start-class on Wikipedia's content assessment scale.
Unclear article
This article sucks for someone who doesn't already know what thread safety is, like me.
"A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time."
Or in other words, "thread-safe code is code where the threads are safe". Hooray, great!
It was much clearer in 2008 (copy-pasted from Stack Overflow):
"Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads. In particular, it must satisfy the need for multiple threads to access the same shared data, and the need for a shared piece of data to be accessed by only one thread at any given time."
94.254.4.16 (talk) 00:01, 25 June 2015 (UTC)
Difficulties
Earlier versions of this page contained the following advice:
- One approach to making data thread-safe that combines several of the above elements is to make changes to a private copy of the shared data and then atomically update the shared data from the private copy. Thus, most of the code is concurrent, and little time is spent serialized.
As stated above, that advice is incorrect and will lead to lost updates of the shared data. Consider two threads, each maintaining a private copy of a shared integer. If each thread increments its own private copy and then writes that copy back to the shared integer, the net result can be an increment of the shared integer by 1 instead of 2. Before updating the shared data, each thread must check that there have been no updates since it took its private copy. If available, a compare-and-swap instruction may be useful.
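A minimal sketch of the lost update described above, and of the compare-and-swap remedy (the function names are invented for illustration, and C11 atomics are used for the compare-and-swap; neither comes from the article):

#include <stdatomic.h>

int shared_plain = 0;            // shared integer updated via private copies

// Lost update: each thread takes a private copy, increments it, and writes it
// back. If both threads copy before either writes back, one increment is lost.
void increment_lossy(void)
{
    int private_copy = shared_plain;   // take a private copy of the shared data
    private_copy += 1;                 // modify the private copy
    shared_plain = private_copy;       // write back: may overwrite a concurrent update
}

// Remedy: before publishing, check that the shared value is still the one the
// private copy was taken from, and retry otherwise (a compare-and-swap loop).
atomic_int shared_atomic = 0;

void increment_cas(void)
{
    int expected = atomic_load(&shared_atomic);
    while (!atomic_compare_exchange_weak(&shared_atomic, &expected, expected + 1)) {
        /* expected has been reloaded with the current value; try again */
    }
}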
- Update:
There are cases where the above-mentioned advice is not wrong, for example when shared data is only read by some threads and written by a single thread.
- Debates belong on the talk page, not in the article. 96.60.118.196 (talk) 22:12, 15 February 2010 (UTC)
Actually I wanted the old, bad advice on the main page so that readers could see that even people who think they understand thread safety find it hard. As the page stands, the advice is presented as general, although you claim here it only applies in a special case - one writer, multiple readers. Actually each reader thread needs to access the shared data atomically as well - so the local copies provide no benefit whatsoever.
I'm not going to get into a revert war, so I suggest you think this through carefully. The best advice is not to use shared data but rather to communicate by message passing only. —Preceding unsigned comment added by 217.155.175.25 (talk) 15:15, 7 March 2010 (UTC)
Layout
This article needs to be a little more clearly formatted to allow easy dissemination of what it is saying. Falls End (T, C) 00:36, 1 December 2005 (UTC)
- I've rewikified a bit. I have removed this phrase from the intro, as it's a repeat of what is found in the (new) section 'Achieving thread safety'. "Common ways of creating thread-safe code include writing reentrant code, using Thread-local storage to localize data to each thread, guarding shared data with mutual exclusion so that only one thread uses it at a time, and modifying shared data with atomic operations." In a longer article it's probably worth keeping this repetition, but I don't think it's justified yet.
- What do you think of my changes? :) --Stevage 02:57, 1 December 2005 (UTC)
Page move
I think this article should be renamed to 'Thread safety'. Can anyone work out how to do it? Jimbletang 03:05, 8 December 2006 (UTC)
- I thought the same thing, so I moved it. - furrykef (Talk at me) 00:58, 17 June 2007 (UTC)
Reentrancy does not always imply thread-safety
I think the article is wrong when it implies that reentrancy is a sufficient condition to ensure thread safety. It's not, and the two concepts of thread safety and reentrancy are distinct. It's true that reentrant functions are often thread-safe too, but it's easy to construct examples of reentrant functions that are not thread-safe.
So, I think it would be better to eliminate phrases like "A subroutine is reentrant, and thus thread-safe ..." because they give the incorrect impression that reentrancy is some sort of stronger guarantee of thread safety.
- The reentrant (subroutine) article also claims that "Every reentrant function is thread-safe, however, not every thread-safe function is reentrant.".
- Since it is so "easy to construct examples of reentrant functions that are not thread-safe", could you -- or anyone -- give an example of a "reentrant function that is not thread-safe"?
- Something kind of like this (but, of course, going the other direction):
// Thread-safe function that is not re-entrant: the mutex keeps other threads
// out, but a nested call on the same thread (e.g. from a signal or interrupt
// handler) would deadlock on the lock it already holds.
Tuple temporary_buffer;                                    // global scratch variable
pthread_mutex_t buffer_mutex = PTHREAD_MUTEX_INITIALIZER;  // guards temporary_buffer

void swap_tuples(Tuple *pa, Tuple *pb)
{
    pthread_mutex_lock(&buffer_mutex);
    temporary_buffer = *pa;
    *pa = *pb;
    *pb = temporary_buffer;
    pthread_mutex_unlock(&buffer_mutex);
}
- I think that the problem is that re-entrant isn't a black-and-white thing. In particular, recursive functions exhibit a form of re-entrancy; however, because the writer has full control of where the overlapping execution occurs, he/she can take advantage of that knowledge. For example, here's a (crappy) function that counts how many nodes in a binary tree are greater than a particular target value. It's re-entrant in the sense that there are multiple stack frames for this function on the stack at the same time, but it's not thread-safe/generally re-entrant.
// Re-entrant only in the limited sense described above: the writer controls
// where the nested (recursive) calls happen, but the static flag makes it
// neither generally re-entrant nor thread-safe.
int count_greater(Node *node, int target, int level = 0) {
    static bool found_it;               // one shared flag for all calls and all threads
    if (level == 0) found_it = false;   // reset only on the outermost call
    if (node == NULL) return 0;
    int left = count_greater(node->left, target, level + 1);
    if (node->value == target) found_it = true;
    return left + (found_it ? 1 : 0) + count_greater(node->right, target, level + 1);
}
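And a sketch of the direction asked about above, a function that is re-entrant but not thread-safe (the names are invented for illustration): saving and restoring the global lets a nested invocation on the same thread, for example from an interrupt handler, complete harmlessly, but a second thread running in parallel can overwrite the global at any point, so concurrent calls can corrupt the swap.

int tmp;                        // global scratch variable

// Re-entrant: a nested call (e.g. from an interrupt) saves and restores tmp,
// so the outer invocation still sees its own value when it resumes.
// Not thread-safe: another thread can overwrite tmp between "tmp = *x" and
// "*y = tmp", with no save/restore to protect that interleaving.
void swap_ints(int *x, int *y)
{
    int saved = tmp;            // save the global for the enclosing context
    tmp = *x;
    *x = *y;
    *y = tmp;
    tmp = saved;                // restore the global before returning
}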
Difference between mutual exclusion and atomic operations?
The article shows 4 ways to make functions thread-safe, two of which are guarding shared data with mutual exclusion and modifying shared data with atomic operations. How are atomic operations not a subclass of mutual exclusion? Acertain (talk) 04:16, 28 December 2009 (UTC)
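One common way to frame the distinction (sketched below with pthreads and C11 atomics; the counter names are invented for illustration): both approaches make the update indivisible with respect to other threads, but a mutex does so by blocking other threads until the lock is released, whereas an atomic operation is typically a single indivisible hardware read-modify-write that takes no lock and never blocks a thread.

#include <pthread.h>
#include <stdatomic.h>

// Mutual exclusion: other threads calling this block until the lock is free.
long counter_m = 0;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void increment_with_mutex(void)
{
    pthread_mutex_lock(&counter_lock);
    counter_m++;
    pthread_mutex_unlock(&counter_lock);
}

// Atomic operation: the read-modify-write is indivisible at the hardware
// level; no lock is taken and no thread is ever blocked waiting for one.
atomic_long counter_a = 0;

void increment_with_atomic(void)
{
    atomic_fetch_add(&counter_a, 1);
}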
Intro
Thread safety is a key challenge in multi-threaded programming. It was not a concern for most application programmers of little home applications, but since the 1990s, as Windows became multithreaded, and with the expansion of BSD and Linux operating systems, it has become a commonplace issue.
The world didn't start to spin with the invention of PCs. --195.113.23.117 (talk) 14:10, 25 May 2011 (UTC)
- Yes, that was obviously written by someone who did not start programming until the 1990s. The word "thread" was not always used; in IBM mainframe operating systems (such as MVS) the equivalent was called a "task". I do not know what the equivalent is in Unix, but Unix had (and has) the equivalent. Those and many other operating systems that existed before 1990 were used by Fortune-500 companies and supported multi-threading (though probably not under that name). Multi-threading goes back to the mid-1960s at least, and probably before that. Sam Tomato (talk) 05:04, 11 February 2015 (UTC)
- Note also that Windows was multithreaded before 1990. It was incapable of pre-emptive multitasking under DOS but it was capable of multi-threading. Sam Tomato (talk) 05:13, 11 February 2015 (UTC)
I removed the incorrect history. I hope it stays out. Sam Tomato (talk) 05:53, 11 February 2015 (UTC)
Also, thread safety usually refers to an object, not a "piece of code", correct? Probably the introduction should say something like that. Sam Tomato (talk) 05:53, 11 February 2015 (UTC)
Vagueness
There is in my opinion an unnecessary vagueness in the current definition of "thread safety" as "[being] usable in a multi-threaded environment", as even functions that are not thread-safe can be used in a multi-threaded environment, provided sufficient precautions are taken (such as calling them only from a single thread). Some of the examples push this confusion even further, as they present code that is not "process-safe" (code that fails when another process deletes a file, even when no threads are involved) as being thread-unsafe. I edited the article to use the definition found both in the well-known book "The Linux Programming Interface" and in the Oracle documentation. I also updated the corresponding example. --un_brice (talk) 08:07, 9 June 2011 (UTC)
Circular definition
The first paragraph:
- Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it only manipulates shared data structures in a thread-safe manner, which enables safe execution by multiple threads at the same time. There are various strategies for making thread-safe data structures [1].
This is a completely circular definition. "Thread-safe code is thread-safe." Who knew? --Cromas (talk) 22:04, 19 October 2011 (UTC)
- Yes I think thread safety usually refers to an object, not a "piece of code". Sam Tomato (talk) 05:54, 11 February 2015 (UTC)
Incomplete explanation?
At present the page includes: "
Examples
In the following piece of C code, the function is thread-safe, but not reentrant:
int function()
{
    mutex_lock();
    ...
    function body
    ...
    mutex_unlock();
}
In the above, function can be called by different threads without any problem. But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever. As interrupt servicing can disable other interrupts, the whole system could suffer.
"
The sentence "But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever" seems incorrect and incomplete to me, and therefore confusing.
- - There's no "second routine"; there's only a second invocation of a single routine. This, surely, is the whole point of thread-safety.
- - In the scenario described, certainly the second [invocation of the] routine will hang forever - no argument with that. But there's no mention of the fact that, for it to be true, the first invocation must also hang forever; nor is there any explanation of why either invocation hangs at all. In fact they hang for quite different reasons.
- What's really happening? The first invocation acquires the lock and enters the function body. The caller's described as a "re-entrant interrupt handler", so we know interrupts are enabled. Therefore when another interrupt occurs it too can invoke the handler, which again calls the function. This is the second invocation; it hangs because the first invocation is holding the lock.
- But why does it hang forever? The reason is that the first invocation has also hung forever; it's prevented from continuing and eventually releasing the lock. That happens because it was interrupted. In order to invoke the handler, the second interrupt must have been given priority over the first and allowed to interrupt it. The first invocation is therefore sitting in the function body, waiting for whatever interrupted it to finish and return. There's the problem - a deadlock: the second invocation is blocked by the first which holds the lock; but the first invocation has been interrupted so can't release the lock until the second invocation finishes and returns.
I think the first point - about there being no "second routine" - needs correcting.
Re. the second point, anyone reading that "the second routine will hang forever" could be forgiven for inferring that only the second invocation hangs, not the first also. Deducing (incorrectly) from that that the first routine doesn't hang, they might then conclude that the lock would eventually be released and wonder why the second routine hangs forever. 118.92.40.55 (talk) 01:45, 29 April 2012 (UTC) L Blythen
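A compact sketch of the scenario (the lock type, function name and handler name are placeholders, not taken from the article) may make both hangs easier to see:

mutex_t m;                      // placeholder lock type

void locked_function(void)
{
    mutex_lock(&m);             // 2nd invocation blocks here: the lock is already held
    /* ... function body: the 1st invocation is interrupted somewhere in here,
       still holding the lock ... */
    mutex_unlock(&m);           // the 1st invocation can only reach this line after the
                                // interrupting (2nd) invocation returns - a deadlock
}

void interrupt_handler(void)    // assumed reentrant: it can interrupt itself
{
    locked_function();
}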
Re: I think we could add this explanation to the page. I came to this talk page exactly to see if I could find this. I'm glad someone added it; it's very clear now and it matches my interpretation (after a lot of effort). Jard18 (talk) 20:39, 26 November 2020 (UTC)
Better example
Does anyone have an opinion on whether a slightly more complex function than increment_counter would make it more evident to the reader why interfering threads could lead to disaster, maybe by failing a test? Something like an "add_and_roundup" or similar? —Hobart (talk) 19:16, 21 March 2017 (UTC)
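One possible shape for such an example (purely a sketch of the suggestion; the name add_and_roundup, the invariant, and the constants are invented here): the function maintains an invariant across two steps, so interfering threads do not just lose updates, they can leave the shared value in a state a test can flag directly.

int balance = 0;    // shared; intended invariant: balance is always a multiple of 10

// Not thread-safe: the add and the round-up are two separate, non-atomic steps.
// Interleaved calls from two threads can lose an update and can even leave
// balance at a value that is not a multiple of 10, so a test asserting the
// invariant (or the expected total) fails visibly.
void add_and_roundup(int amount)
{
    balance += amount;                    // step 1: add the amount
    balance += (10 - balance % 10) % 10;  // step 2: round up to the next multiple of 10
}
// Wrapping both steps in a single mutex-protected critical section (or making the
// whole update one atomic read-modify-write) restores the invariant.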
Threads
Threads should be a separate topic and some of the material here belongs there. In the threads topic, the other terms, such as tasks, should be included. Sam Tomato (talk) 05:37, 11 February 2015 (UTC)