[Ipython-tickets] [IPython] #210: Race condition in MTInteractiveShell
IPython
ipython-tickets@scipy....
Thu Jan 31 01:06:27 CST 2008
#210: Race condition in MTInteractiveShell
---------------------+------------------------------------------------------
Reporter: marc | Owner: fperez
Type: defect | Status: closed
Priority: high | Milestone:
Component: ipython | Version:
Severity: major | Resolution: fixed
Keywords: |
---------------------+------------------------------------------------------
Changes (by fperez):
* status: new => closed
* resolution: => fixed
Old description:
> I am using ipython (0.8.2) in threaded mode (for pylab support) with a
> GTK backend in a (sort-of) embedded shell (by calling make_session).
> However, occasionally ipython locks up. I traced the problem after a
> while back to the threading synchronization in MTInteractiveShell:
>
> Runsource:
> 411 got_lock = self.thread_ready.acquire(False)
> 412 self.code_queue.put(code)
> 413 if got_lock:
> 414 self.thread_ready.wait() # Wait until processed in
> timeout interval
> 415 self.thread_ready.release()
>
> Runcode:
> 424 global CODE_RUN
> 425
> 426 # Exceptions need to be raised differently depending on
> which thread is
> 427 # active
> 428 CODE_RUN = True
> 429
> 430 # lock thread-protected stuff
> 431 got_lock = self.thread_ready.acquire(False)
> 432
> ...
> 449 # Flush queue of pending code by calling the run methood
> of the parent
> 450 # class with all items which may be in the queue.
> 451 while 1:
> 452 try:
> 453 code_to_run = self.code_queue.get_nowait()
> 454 except Queue.Empty:
> 455 break
> 456 if got_lock:
> 457 self.thread_ready.notify()
> 458 InteractiveShell.runcode(self,code_to_run)
> 459 else:
> 460 break
> 461
> 462 # We're done with thread-protected variables
> 463 if got_lock:
> 464 self.thread_ready.release()
> 465
> 466 # We're done...
> 467 CODE_RUN = False
>
> The problem is that both functions acquire the lock only if available
> (the got_lock parameter). The race condition that occurs (every 40
> commands or so) is that:
> A. runsource acquires lock, puts code in queue (411-412)
> B. runcode trys to acquire lock, fails as runsource has the lock (431)
> C. runsource starts waiting (as it has the lock) (414)
> D. runcode obtains code, but breaks as it doesn not have the lock. It
> does not notify the waiting Runsource! (451-460)
>
> (C and D) could also be in different order
>
> Possible solution:
> Make the Lock an Rlock (to enable the thread calling runcode to call
> runsource)
> 364 self.thread_ready = threading.Condition(threading.RLock())
>
> Runsource
> - Make lock acquire blocking
> 411 got_lock = self.thread_ready.acquire()
> - Only perform wait if this is not an reentrant lock (got_lock is True on
> outer lock, and 1 on inner locks)
> 413 if(got_lock is True):
> 414 self.thread_ready.wait() # Wait until processed in timeout
> interval
> - always release (not based on if(got_lock))
> 415 self.thread_ready.release()
>
> Runcode
> - make locking required
> 431 self.thread_ready.acquire()
> - always run code if available (not dependent on if(got_lock)) (in the
> current implementation code just disappears)
> - move notify out of while loop, only call it if code has been obtained
> and executed (not essential)
> - always release lock (not dependent on if(got_lock))
> 450 code_to_run=None
> 451 while 1:
> 452 try:
> 453 code_to_run = self.code_queue.get_nowait()
> 454 except Queue.Empty:
> 455 break
> 458 InteractiveShell.runcode(self,code_to_run)
> ...
> # We're done with thread-protected variables
> 461 if(not code_to_run is None):
> 462 self.thread_ready.notify()
> 463 self.thread_ready.release()
>
> This seems to solve the deadlocking problem I encountered. Furthermore,
> using this the code is still reentrant (e.g. you can run
> ip.IP.runsource('a=1') from the console, or even something like
> 'ip.IP.runsource('a=1'); ip.IP.runcode()' without deadlocking), so i
> guess macros are ok too. I did not test it with the other backends
> (QT,etc.) however.
New description:
I am using ipython (0.8.2) in threaded mode (for pylab support) with a GTK
backend in a (sort-of) embedded shell (by calling make_session). However,
occasionally ipython locks up. I traced the problem after a while back to
the threading synchronization in MTInteractiveShell:
{{{
Runsource:
411 got_lock = self.thread_ready.acquire(False)
412 self.code_queue.put(code)
413 if got_lock:
414 self.thread_ready.wait() # Wait until processed in
timeout interval
415 self.thread_ready.release()
Runcode:
424 global CODE_RUN
425
426 # Exceptions need to be raised differently depending on
which thread is
427 # active
428 CODE_RUN = True
429
430 # lock thread-protected stuff
431 got_lock = self.thread_ready.acquire(False)
432
...
449 # Flush queue of pending code by calling the run methood
of the parent
450 # class with all items which may be in the queue.
451 while 1:
452 try:
453 code_to_run = self.code_queue.get_nowait()
454 except Queue.Empty:
455 break
456 if got_lock:
457 self.thread_ready.notify()
458 InteractiveShell.runcode(self,code_to_run)
459 else:
460 break
461
462 # We're done with thread-protected variables
463 if got_lock:
464 self.thread_ready.release()
465
466 # We're done...
467 CODE_RUN = False
}}}
The problem is that both functions acquire the lock only if available (the
got_lock parameter). The race condition that occurs (every 40 commands or
so) is that:
A. runsource acquires lock, puts code in queue (411-412)
B. runcode trys to acquire lock, fails as runsource has the lock (431)
C. runsource starts waiting (as it has the lock) (414)
D. runcode obtains code, but breaks as it doesn not have the lock. It does
not notify the waiting Runsource! (451-460)
(C and D) could also be in different order
Possible solution:
{{{
Make the Lock an Rlock (to enable the thread calling runcode to call
runsource)
364 self.thread_ready = threading.Condition(threading.RLock())
Runsource
- Make lock acquire blocking
411 got_lock = self.thread_ready.acquire()
- Only perform wait if this is not an reentrant lock (got_lock is True on
outer lock, and 1 on inner locks)
413 if(got_lock is True):
414 self.thread_ready.wait() # Wait until processed in timeout
interval
- always release (not based on if(got_lock))
415 self.thread_ready.release()
Runcode
- make locking required
431 self.thread_ready.acquire()
- always run code if available (not dependent on if(got_lock)) (in the
current implementation code just disappears)
- move notify out of while loop, only call it if code has been obtained
and executed (not essential)
- always release lock (not dependent on if(got_lock))
450 code_to_run=None
451 while 1:
452 try:
453 code_to_run = self.code_queue.get_nowait()
454 except Queue.Empty:
455 break
458 InteractiveShell.runcode(self,code_to_run)
...
# We're done with thread-protected variables
461 if(not code_to_run is None):
462 self.thread_ready.notify()
463 self.thread_ready.release()
}}}
This seems to solve the deadlocking problem I encountered. Furthermore,
using this the code is still reentrant (e.g. you can run
ip.IP.runsource('a=1') from the console, or even something like
'ip.IP.runsource('a=1'); ip.IP.runcode()' without deadlocking), so i guess
macros are ok too. I did not test it with the other backends (QT,etc.)
however.
Comment:
Closed by r2997, where I applied your changes with small cleanups to the
current version of Shell.py.
Many thanks! Your analysis seems correct, which I greatly appreciate.
I'm pretty ignorant of threaded code, and I've known for a long time this
code had bugs, let's hope your fixes are the last word on this. I'd
appreciate it if you could test it further from SVN, in particular
regarding the behavior of Ctrl-C, which is always very iffy with threads.
--
Ticket URL: <http://ipython.scipy.org/ipython/ipython/ticket/210#comment:1>
IPython <http://ipython.scipy.org>
The IPython interactive Python system
More information about the Ipython-tickets
mailing list