[Ipython-tickets] [IPython] #210: Race condition in MTInteractiveShell

IPython ipython-tickets@scipy....
Thu Jan 31 01:06:27 CST 2008


#210: Race condition in MTInteractiveShell
---------------------+------------------------------------------------------
 Reporter:  marc     |        Owner:  fperez
     Type:  defect   |       Status:  closed
 Priority:  high     |    Milestone:        
Component:  ipython  |      Version:        
 Severity:  major    |   Resolution:  fixed 
 Keywords:           |  
---------------------+------------------------------------------------------
Changes (by fperez):

  * status:  new => closed
  * resolution:  => fixed

New description:

 I am using ipython (0.8.2) in threaded mode (for pylab support) with a GTK
 backend in a (sort-of) embedded shell (by calling make_session). However,
 occasionally ipython locks up. I traced the problem after a while back to
 the threading synchronization in MTInteractiveShell:
 {{{
 Runsource:
 411             got_lock = self.thread_ready.acquire(False)
 412             self.code_queue.put(code)
 413             if got_lock:
 414                 self.thread_ready.wait()  # Wait until processed in timeout interval
 415                 self.thread_ready.release()

 Runcode:
 424             global CODE_RUN
 425
 426             # Exceptions need to be raised differently depending on which thread is
 427             # active
 428             CODE_RUN = True
 429
 430             # lock thread-protected stuff
 431             got_lock = self.thread_ready.acquire(False)
 432
 ...
 449             # Flush queue of pending code by calling the run method of the parent
 450             # class with all items which may be in the queue.
 451             while 1:
 452                 try:
 453                     code_to_run = self.code_queue.get_nowait()
 454                 except Queue.Empty:
 455                     break
 456                 if got_lock:
 457                     self.thread_ready.notify()
 458                     InteractiveShell.runcode(self,code_to_run)
 459                 else:
 460                     break
 461
 462             # We're done with thread-protected variables
 463             if got_lock:
 464                 self.thread_ready.release()
 465
 466             # We're done...
 467             CODE_RUN = False
 }}}

 The problem is that both functions acquire the lock only if it is available
 (the got_lock flag). A race condition then occurs (roughly once every 40
 commands):
 A. runsource acquires the lock and puts code in the queue (411-412)
 B. runcode tries to acquire the lock and fails, because runsource holds it
 (431)
 C. runsource starts waiting, since it has the lock (414)
 D. runcode takes the code from the queue but breaks out of the loop because
 it does not have the lock. It never notifies the waiting runsource!
 (451-460)

 Steps C and D can also happen in the opposite order.
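The interleaving in steps A to D can be forced deterministically. The following is only a minimal sketch, assuming Python 3 (`queue` instead of `Queue`); `runcode_old`, `demo` and the module-level names are hypothetical stand-ins, not the code in Shell.py, and the 0.2 s timeout on `wait()` stands in for the indefinite hang seen in the real shell.

```python
import queue
import threading

thread_ready = threading.Condition()
code_queue = queue.Queue()
executed = []  # code objects that were actually "run"


def runcode_old():
    """Mimic the pre-fix runcode: non-blocking acquire, bail out without the lock."""
    got_lock = thread_ready.acquire(False)
    while True:
        try:
            code = code_queue.get_nowait()
        except queue.Empty:
            break
        if got_lock:
            thread_ready.notify()
            executed.append(code)
        else:
            break  # the code has already left the queue and is never run
    if got_lock:
        thread_ready.release()
    return got_lock


def demo():
    # Step A: "runsource" takes the lock and queues the code.
    got_lock = thread_ready.acquire(False)
    code_queue.put("a = 1")

    # Steps B and D: "runcode" runs in another thread while the lock is held.
    result = {}
    worker = threading.Thread(target=lambda: result.update(got=runcode_old()))
    worker.start()
    worker.join()

    # Step C: "runsource" waits for a notification that was never sent.
    notified = thread_ready.wait(timeout=0.2)
    thread_ready.release()
    return got_lock, result["got"], notified, list(executed), code_queue.empty()
```

Under these assumptions `demo()` shows runsource holding the lock, runcode failing to get it, no notification arriving, nothing executed, and the queue nevertheless empty, which matches the "code just disappears" symptom.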

 Possible solution:
 {{{
 Make the lock an RLock (to enable the thread calling runcode to call
 runsource)
 364      self.thread_ready = threading.Condition(threading.RLock())

 Runsource
 - Make lock acquire blocking
 411     got_lock = self.thread_ready.acquire()
 - Only perform wait if this is not a reentrant acquire (got_lock is True on
 the outer lock, and 1 on inner locks)
 413     if got_lock is True:
 414        self.thread_ready.wait()  # Wait until processed in timeout interval
 - always release (not based on if(got_lock))
 415     self.thread_ready.release()


 Runcode
 - make locking required
 431    self.thread_ready.acquire()
 - always run code if available (not dependent on if(got_lock)) (in the
 current implementation code just disappears)
 - move notify out of while loop, only call it if code has been obtained
 and executed (not essential)
 - always release lock (not dependent on if(got_lock))
 450    code_to_run=None
 451    while 1:
 452         try:
 453             code_to_run = self.code_queue.get_nowait()
 454         except Queue.Empty:
 455             break
 458         InteractiveShell.runcode(self,code_to_run)
 ...
         # We're done with thread-protected variables
 461     if code_to_run is not None:
 462        self.thread_ready.notify()
 463     self.thread_ready.release()
 }}}
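The proposed locking discipline (blocking acquire on both sides, always release, notify only after code has run) can be sketched as follows. This is an illustration under assumptions, not the patch applied to Shell.py: it targets Python 3, `MiniShell` and `demo` are hypothetical names, "running" code just records it in a list, and the reentrant case (`got_lock is True`) is left out because that test depends on the return value of the RLock implementation in use.

```python
import queue
import threading
import time


class MiniShell:
    """Hypothetical stand-in for MTInteractiveShell (illustration only)."""

    def __init__(self):
        self.thread_ready = threading.Condition(threading.RLock())
        self.code_queue = queue.Queue()
        self.executed = []

    def runsource(self, code, timeout=None):
        # Blocking acquire; the lock is held from put() until wait() releases it,
        # so runcode cannot drain the queue before we are waiting.
        with self.thread_ready:
            self.code_queue.put(code)
            return self.thread_ready.wait(timeout)

    def runcode(self):
        # Blocking acquire; always run whatever is queued, always release.
        with self.thread_ready:
            ran_any = False
            while True:
                try:
                    code = self.code_queue.get_nowait()
                except queue.Empty:
                    break
                self.executed.append(code)  # stands in for InteractiveShell.runcode
                ran_any = True
            if ran_any:
                self.thread_ready.notify()
        return ran_any


def demo():
    shell = MiniShell()
    results = []
    producer = threading.Thread(
        target=lambda: results.append(shell.runsource("a = 1", timeout=5.0))
    )
    producer.start()
    # Poll like a GUI timer until the queued code has been processed.
    while not shell.runcode():
        time.sleep(0.01)
    producer.join()
    return results, shell.executed
```

Because runsource keeps the lock between `put()` and `wait()`, runcode can only see the queued code once runsource is already waiting, so the notification cannot be lost in this sketch.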

 This seems to solve the deadlock I encountered. Furthermore, with this
 change the code is still reentrant (e.g. you can run
 ip.IP.runsource('a=1') from the console, or even something like
 'ip.IP.runsource('a=1'); ip.IP.runcode()' without deadlocking), so I guess
 macros are OK too. However, I did not test it with the other backends (Qt,
 etc.).
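The reentrancy mentioned here depends on the lock being an RLock: the thread that already holds it may acquire it again, which a plain Lock does not allow. A small sketch of that difference, assuming Python 3 (`can_reacquire` is a hypothetical helper, not part of IPython):

```python
import threading


def can_reacquire(lock):
    """Return True if the holding thread can take `lock` a second time."""
    lock.acquire()
    try:
        got_again = bool(lock.acquire(False))  # non-blocking second attempt
        if got_again:
            lock.release()  # undo the nested acquire
        return got_again
    finally:
        lock.release()  # undo the outer acquire
```

With `threading.RLock()` the second acquire succeeds, while with `threading.Lock()` it fails, which is why a call to runsource from the thread that is inside runcode would block on a non-reentrant lock.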

Comment:

 Closed by r2997, where I applied your changes with small cleanups to the
 current version of Shell.py.

 Many thanks!  Your analysis seems correct, which I greatly appreciate.
 I'm pretty ignorant of threaded code, and I've known for a long time that
 this code had bugs; let's hope your fixes are the last word on this.  I'd
 appreciate it if you could test it further from SVN, in particular
 regarding the behavior of Ctrl-C, which is always very iffy with threads.

-- 
Ticket URL: <http://ipython.scipy.org/ipython/ipython/ticket/210#comment:1>
IPython <http://ipython.scipy.org>
The IPython interactive Python system

