[IPython-User] troubleshooting ipcluster

Robert Nishihara robertnishihara@gmail....
Wed Jul 11 14:19:24 CDT 2012


Ah, I take that back. I see that the command "ipcluster start -n 5
--profile=sge" dumps more informative messages to my home directory, for
instance the attached files, which are copied below. I am looking into this.
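
For anyone searching for the same files: a minimal Python sketch for collecting them, assuming they follow the naming of the attachments to this message (ipcontroller.e2093.1, ipengine.e2094.5) and land in the home directory. The function name and the glob pattern are illustrative guesses, not part of IPython or Grid Engine.

```python
import glob
import os


def find_sge_error_files(directory, prefixes=("ipcontroller", "ipengine")):
    """Return sorted paths named like '<prefix>.e<jobid>...' in directory.

    The pattern is an assumption modeled on the attachment names in this
    message; other sites may configure different names or locations.
    """
    paths = []
    for prefix in prefixes:
        pattern = os.path.join(directory, prefix + ".e[0-9]*")
        paths.extend(glob.glob(pattern))
    return sorted(paths)


def print_error_files(directory):
    """Print each matching file with a small header, similar to `head`."""
    for path in find_sge_error_files(directory):
        print("==> %s <==" % path)
        with open(path) as handle:
            print(handle.read())


# Typical use: print_error_files(os.path.expanduser("~"))
```
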

ipcontroller error

    -sh: module: line 1: syntax error: unexpected end of file
    -sh: error importing function definition for `module'
    /usr/share/gridengine/hpc/spool/cloudcompute-6/job_scripts/2093: line 5: syntax error near unexpected token `--log-to-file'
    /usr/share/gridengine/hpc/spool/cloudcompute-6/job_scripts/2093: line 5: `/software/linux/x86_64/epd-7.3-1/bin/python -c from IPython.parallel.apps.ipcontrollerapp import launch_new_instance; launch_new_instance() --log-to-file --profile-dir="/home/robert/.ipython/profile_sge" --cluster-id=""'

ipengine error

    -sh: module: line 1: syntax error: unexpected end of file
    -sh: error importing function definition for `module'
    /usr/share/gridengine/hpc/spool/cloudcompute-7/job_scripts/2094: line 5: syntax error near unexpected token `--profile-dir="/home/robert/.ipython/profile_sge"'
    /usr/share/gridengine/hpc/spool/cloudcompute-7/job_scripts/2094: line 5: `/software/linux/x86_64/epd-7.3-1/bin/python -c from IPython.parallel.apps.ipengineapp import launch_new_instance; launch_new_instance() --profile-dir="/home/robert/.ipython/profile_sge" --cluster-id=""'
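
In both errors, line 5 of the job script passes the code after "-c" without quotes, so sh appears to read "launch_new_instance()" as the start of a shell function definition and then reject the option that follows. Here is a small sketch of the difference between joining the arguments with plain spaces and quoting each one first. The argv list is copied from the controller error above; the helper names are made up for illustration, and this is not a claim about how IPython builds the line or how it should be fixed.

```python
import shlex

# Arguments modeled on the controller command in the error output above.
argv = [
    "/software/linux/x86_64/epd-7.3-1/bin/python",
    "-c",
    "from IPython.parallel.apps.ipcontrollerapp import launch_new_instance; "
    "launch_new_instance()",
    "--log-to-file",
    "--profile-dir=/home/robert/.ipython/profile_sge",
    "--cluster-id=",
]


def join_unquoted(args):
    """Join with spaces only; this reproduces the shape of the failing line."""
    return " ".join(args)


def join_quoted(args):
    """Quote each argument for sh so the code after -c stays one word."""
    return " ".join(shlex.quote(arg) for arg in args)


unquoted = join_unquoted(argv)
quoted = join_quoted(argv)

# Parsing the quoted line back with sh-like rules recovers the original
# arguments; the unquoted line splits the -c code into several words.
print(shlex.split(quoted) == argv)
print(shlex.split(unquoted) == argv)
```

shlex.quote needs Python 3.3 or later; on the Python 2 installation shown in the logs, pipes.quote does the same job.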

On Wed, Jul 11, 2012 at 12:09 PM, Robert Nishihara <robertnishihara@gmail.com> wrote:

> There was no stdout, and the stderr (attached and copied below) looks
> normal.
>
> I started the controller and engines separately this time (with qsub),
> using the two attached scripts. This procedure worked fine before the
> upgrade.
>
> I tried recreating the sge profile using the instructions from this thread
> <http://python.6.n6.nabble.com/Getting-setup-on-a-remote-cluster-w-Sun-Grid-Engine-td1663090.html>,
> and this had no effect.
>
> stderr for controller
>
>     2012-07-11 14:59:49,674.674 [IPControllerApp] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
>     2012-07-11 14:59:49.797 [IPControllerApp] Hub listening on tcp://0.0.0.0:52512 for registration.
>     2012-07-11 14:59:49.799 [IPControllerApp] Hub using DB backend: 'NoDB'
>     2012-07-11 14:59:50.079 [IPControllerApp] hub::created hub
>     2012-07-11 14:59:50.085 [IPControllerApp] writing connection info to /home/robert/.ipython/profile_sge/security/ipcontroller-client.json
>     2012-07-11 14:59:50.103 [IPControllerApp] writing connection info to /home/robert/.ipython/profile_sge/security/ipcontroller-engine.json
>     2012-07-11 14:59:50.129 [IPControllerApp] task::using Python leastload Task scheduler
>     2012-07-11 14:59:50.135 [IPControllerApp] Heartmonitor started
>     2012-07-11 14:59:50.166 [scheduler] Scheduler started [leastload]
>     2012-07-11 14:59:50.195 [IPControllerApp] Creating pid file: /home/robert/.ipython/profile_sge/pid/ipcontroller.pid
>
> stderr for engines
>
>     2012-07-11 14:59:58,438.438 [IPClusterEngines] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
>     2012-07-11 14:59:58.470 [IPClusterEngines] IPython cluster: started
>     2012-07-11 14:59:58.471 [IPClusterEngines] Starting engines with [daemon=False]
>     2012-07-11 14:59:58.471 [IPClusterEngines] Starting 5 Engines with SGEEngineSetLauncher
>     2012-07-11 14:59:58.559 [IPClusterEngines] Job submitted with job id: '2092'
>     2012-07-11 15:00:28.559 [IPClusterEngines] Engines appear to have started successfully
>
> -Robert
>
> On Wed, Jul 11, 2012 at 10:42 AM, MinRK <benjaminrk@gmail.com> wrote:
>
>> What is the stdout/err of the controller and engine jobs?
>>
>> On Wed, Jul 11, 2012 at 6:03 PM, Robert Nishihara
>> <robertnishihara@gmail.com> wrote:
>> > My cluster recently upgraded to IPython 0.13. Now, when I run
>> >
>> >     ipcluster start -n 3 --profile=sge
>> >
>> > the controller and engines get submitted to the queue, but they terminate
>> > immediately after starting. However, the output looks normal:
>> >
>> >     2012-07-11 11:56:27,531.531 [IPClusterStart] Using existing profile dir: u'/home/robert/.ipython/profile_sge'
>> >     2012-07-11 11:56:27.566 [IPClusterStart] Starting ipcluster with [daemon=False]
>> >     2012-07-11 11:56:27.570 [IPClusterStart] Creating pid file: /home/robert/.ipython/profile_sge/pid/ipcluster.pid
>> >     2012-07-11 11:56:27.573 [IPClusterStart] Starting Controller with SGEControllerLauncher
>> >     2012-07-11 11:56:27.723 [IPClusterStart] Job submitted with job id: '2088'
>> >     2012-07-11 11:56:28.568 [IPClusterStart] Starting 3 Engines with SGEEngineSetLauncher
>> >     2012-07-11 11:56:28.645 [IPClusterStart] Job submitted with job id: '2089'
>> >     2012-07-11 11:56:58.647 [IPClusterStart] Engines appear to have started successfully
>> >
>> > Is there a good way to troubleshoot this? The --debug flag doesn't seem
>> > to give me any useful information.
>> >
>> > -Robert
>> >
>> > _______________________________________________
>> > IPython-User mailing list
>> > IPython-User@scipy.org
>> > http://mail.scipy.org/mailman/listinfo/ipython-user
>> >
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipengine.e2094.5
Type: application/octet-stream
Size: 548 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/ipython-user/attachments/20120711/f72eec84/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipcontroller.e2093.1
Type: application/octet-stream
Size: 530 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/ipython-user/attachments/20120711/f72eec84/attachment-0003.obj 

