[gridway-user] Job remains pending

Tino Vazquez tinova79 at gmail.com
Thu Nov 29 04:25:56 CST 2007


Hello Rasyid,

Could you send the output of gwhost? It may be that no host has a free
slot for the job to run in. Also, could you send the output of

gwhost -m <job_id>

where job_id is the jid of a job in the pending state.
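
If it helps to find the jids in question, here is a minimal sketch that picks the pending jobs out of saved gwps output and prints the matching gwhost -m command for each. It assumes the column order shown in the gwps output quoted below (USER, JID, DM, ...) and one job per line; the function name pending_jids is just an illustration, not part of GridWay.

```python
def pending_jids(gwps_output):
    """Return the JIDs of jobs whose DM state column reads 'pend'."""
    jids = []
    for line in gwps_output.splitlines():
        fields = line.split()
        # Assumed layout: USER JID DM ... (one job per line).
        # The header row is skipped because its second field ("JID")
        # is not a number.
        if len(fields) >= 3 and fields[1].isdigit() and fields[2] == "pend":
            jids.append(int(fields[1]))
    return jids


# Sample taken from the gwps output in the quoted message, with the
# mail line wrapping undone.
sample = """\
USER         JID DM   EM   START    END      EXEC    XFER    EXIT NAME  HOST
gwman        4   pend ---- 01:58:08 --:--:-- 0:00:00 0:00:00 --   ls.jt --
"""

for jid in pending_jids(sample):
    # Print the command to run by hand for each pending job.
    print(f"gwhost -m {jid}")
```

The script only prints the commands; it does not call gwhost itself.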

Regards,

-Tino

On Nov 29, 2007 2:12 AM, rasyid mujahid <rasyidmujahid at gmail.com> wrote:

> Hello,
> I have a problem when submitting a job: it always stays in the pending
> state. I hope someone can help me solve this problem.
> This is the job file:
>
> EXECUTABLE  = /bin/ls
> ARGUMENTS   = -la
> STDIN_FILE  = /dev/null
> STDOUT_FILE = ls.out.${JOB_ID}
> STDERR_FILE = ls.err.${JOB_ID}
>
>
> Below is the output of gwps and the log files:
>
> $gwps
> USER         JID DM   EM   START    END      EXEC    XFER    EXIT NAME            HOST
> gwman        4   pend ---- 01:58:08 --:--:-- 0:00:00 0:00:00 --   ls.jt           --
>
> $cat gwd.log
> Wed Nov 28 07:13:26 2007 [GW][I]: ---------------------------------------------------
> Wed Nov 28 07:13:26 2007 [GW][I]:                    gwd.conf values
> Wed Nov 28 07:13:26 2007 [GW][I]: ---------------------------------------------------
> Wed Nov 28 07:13:26 2007 [GW][I]:   Core configuration attributes
> Wed Nov 28 07:13:26 2007 [GW][I]:     GWD_PORT                 : 6725
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAX_NUMBER_OF_CLIENTS    : 25
> Wed Nov 28 07:13:26 2007 [GW][I]:     NUMBER_OF_ARRAYS         : 200
> Wed Nov 28 07:13:26 2007 [GW][I]:     NUMBER_OF_JOBS           : 5000
> Wed Nov 28 07:13:26 2007 [GW][I]:     NUMBER_OF_HOSTS          : 100
> Wed Nov 28 07:13:26 2007 [GW][I]:     NUMBER_OF_USERS          : 30
> Wed Nov 28 07:13:26 2007 [GW][I]:     SCHEDULING_INTERVAL      : 30
> Wed Nov 28 07:13:26 2007 [GW][I]:     DISCOVERY_INTERVAL       : 900
> Wed Nov 28 07:13:26 2007 [GW][I]:     MONITORING_INTERVAL      : 300
> Wed Nov 28 07:13:26 2007 [GW][I]:     POLL_INTERVAL            : 180
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAX_ACTIVE_IM_QUERIES    : 10
> Wed Nov 28 07:13:26 2007 [GW][I]:   Information Manager MADs
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAD(0)  name  : mds4
> Wed Nov 28 07:13:26 2007 [GW][I]:         executable: gw_im_mad_mds4_thr
> Wed Nov 28 07:13:26 2007 [GW][I]:         argument  : -l etc/hosts.list
> Wed Nov 28 07:13:26 2007 [GW][I]:         TM        :
> Wed Nov 28 07:13:26 2007 [GW][I]:         EM        : gridftp
> Wed Nov 28 07:13:26 2007 [GW][I]:   Transfer Manager MADs
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAD(0)  name  : gridftp
> Wed Nov 28 07:13:26 2007 [GW][I]:         executable: gw_tm_mad_ftp
> Wed Nov 28 07:13:26 2007 [GW][I]:         argument  :
> Wed Nov 28 07:13:26 2007 [GW][I]:   Execution Manager MADs
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAD(0)  name  : ws
> Wed Nov 28 07:13:26 2007 [GW][I]:         executable: gw_em_mad_ws
> Wed Nov 28 07:13:26 2007 [GW][I]:         argument  :
> Wed Nov 28 07:13:26 2007 [GW][I]:         rsl mode  : rsl
> Wed Nov 28 07:13:26 2007 [GW][I]:   Dispatch Manager Scheduler
> Wed Nov 28 07:13:26 2007 [GW][I]:         name      : builtin
> Wed Nov 28 07:13:26 2007 [GW][I]:         executable: gw_sched
> Wed Nov 28 07:13:26 2007 [GW][I]:         argument  :
> Wed Nov 28 07:13:26 2007 [GW][I]: ---------------------------------------------------
> Wed Nov 28 07:13:26 2007 [GW][I]:             sched.conf built-in policies
> Wed Nov 28 07:13:26 2007 [GW][I]: ---------------------------------------------------
> Wed Nov 28 07:13:26 2007 [GW][I]:   Scheduler configuration attributes
> Wed Nov 28 07:13:26 2007 [GW][I]:     DISABLE                  : no
> Wed Nov 28 07:13:26 2007 [GW][I]:     DISPATCH_CHUNK           : 15
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAX_RUNNING_USER         : 30
> Wed Nov 28 07:13:26 2007 [GW][I]:     MAX_RUNNING_RESOURCE     : 10
> Wed Nov 28 07:13:26 2007 [GW][I]:   Job Fixed Priority Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     FP_WEIGHT                : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     Fixed Priority Values (users)
> Wed Nov 28 07:13:26 2007 [GW][I]:       DEFAULT                : 0
> Wed Nov 28 07:13:26 2007 [GW][I]:   Job Share Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     SH_WEIGHT (share)        : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     SH_WINDOW_SIZE           : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     SH_WINDOW_DEPTH          : 5
> Wed Nov 28 07:13:26 2007 [GW][I]:     User Shares
> Wed Nov 28 07:13:26 2007 [GW][I]:       DEFAULT                : 5
> Wed Nov 28 07:13:26 2007 [GW][I]:   Job Waiting time Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     WT_WEIGHT                : 0.00
> Wed Nov 28 07:13:26 2007 [GW][I]:   Job Deadline Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     DL_WEIGHT (deadline)     : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     DL_HALF                  : 0
> Wed Nov 28 07:13:26 2007 [GW][I]:   Resource Fixed Priority Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     RP_WEIGHT                : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     Fixed Priority Values (information managers)
> Wed Nov 28 07:13:26 2007 [GW][I]:       DEFAULT                : 1
> Wed Nov 28 07:13:26 2007 [GW][I]:   Resource Failure Rate Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     RA_WEIGHT                : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:   Resource Failure Rank Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     FR_MAX_BANNED            : 3600
> Wed Nov 28 07:13:26 2007 [GW][I]:     FR_BANNED_C              : 650.00
> Wed Nov 28 07:13:26 2007 [GW][I]:   Resource Usage Policy
> Wed Nov 28 07:13:26 2007 [GW][I]:     UG_WEIGHT                : 1.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     UG_HISTORY_WINDOW        : 3.00
> Wed Nov 28 07:13:26 2007 [GW][I]:     UG_HISTORY_RATIO         : 0.25
> Wed Nov 28 07:13:26 2007 [GW][I]: ---------------------------------------------------
> Wed Nov 28 07:13:26 2007 [DM][I]: Job pool initialized.
> Wed Nov 28 07:13:26 2007 [DM][I]: Array pool initialized.
> Wed Nov 28 07:13:26 2007 [IM][I]: Host pool initialized.
> Wed Nov 28 07:13:26 2007 [UM][I]: User pool initiated.
> Wed Nov 28 07:13:26 2007 [GW][I]: Loading Information Manager MADs.
> Wed Nov 28 07:13:27 2007 [IM][I]:       MAD mds4 loaded (exec: gw_im_mad_mds4_thr, arg: -l etc/hosts.list).
> Wed Nov 28 07:13:27 2007 [GW][I]: Loading the scheduler.
> Wed Nov 28 07:13:27 2007 [DM][I]:       Scheduler builtin loaded (exec: gw_sched, arg: ).
> ...
> Wed Nov 28 07:14:27 2007 [UM][I]: User gwman registered.
> ...
> Thu Nov 29 01:58:08 2007 [DM][I]: New job 4 allocated and initialized.
>
> $cat 4/job.log
> Thu Nov 29 01:58:08 2007 [DM][I]: ----------- Job configuration file (
> ls.jt) values -----------
> Thu Nov 29 01:58:08 2007 [DM][I]:       EXECUTABLE             : /bin/ls
> Thu Nov 29 01:58:08 2007 [DM][I]:       ARGUMENTS              : -la
> Thu Nov 29 01:58:08 2007 [DM][I]:       INPUT_FILES   (Total 0):
> Thu Nov 29 01:58:08 2007 [DM][I]:       OUTPUT_FILES  (Total 0):
> Thu Nov 29 01:58:08 2007 [DM][I]:       RESTART_FILES (Total 0):
> Thu Nov 29 01:58:08 2007 [DM][I]:       STDIN_FILE             : /dev/null
> Thu Nov 29 01:58:08 2007 [DM][I]:       STDOUT_FILE            : ls.out.${JOB_ID}
> Thu Nov 29 01:58:08 2007 [DM][I]:       STDERR_FILE            : ls.err.${JOB_ID}
> Thu Nov 29 01:58:08 2007 [DM][I]:       REQUIREMENTS           :
> Thu Nov 29 01:58:08 2007 [DM][I]:       RANK                   :
> Thu Nov 29 01:58:08 2007 [DM][I]:       RESCHEDULING_INTERVAL  : 0
> Thu Nov 29 01:58:08 2007 [DM][I]:       RESCHEDULING_THRESHOLD : 300
> Thu Nov 29 01:58:08 2007 [DM][I]:       SUSPENSION_TIMEOUT     : 900
> Thu Nov 29 01:58:08 2007 [DM][I]:       CPULOAD_THRESHOLD      : 50
> Thu Nov 29 01:58:08 2007 [DM][I]:       RESCHEDULE_ON_FAILURE  : yes
> Thu Nov 29 01:58:08 2007 [DM][I]:       NUMBER_OF_RETRIES      : 3
> Thu Nov 29 01:58:08 2007 [DM][I]:       CHECKPOINT_INTERVAL    : 0
> Thu Nov 29 01:58:08 2007 [DM][I]:       CHECKPOINT_URL         :
> Thu Nov 29 01:58:08 2007 [DM][I]:       WRAPPER                : /home/gwadmin/gw/libexec/gw_wrapper.sh
> Thu Nov 29 01:58:08 2007 [DM][I]:       MONITOR                :
> Thu Nov 29 01:58:08 2007 [DM][I]:       PRE_WRAPPER            :
> Thu Nov 29 01:58:08 2007 [DM][I]:       PRE_WRAPPER_ARGUMENTS  :
> Thu Nov 29 01:58:08 2007 [DM][I]:       TYPE                   : single
> Thu Nov 29 01:58:08 2007 [DM][I]:       NP                     : 1
> Thu Nov 29 01:58:08 2007 [DM][I]:       DEADLINE               : 0:00:00 0
> Thu Nov 29 01:58:08 2007 [DM][I]: ----------------------------------------------------------
> Thu Nov 29 01:58:08 2007 [DM][I]: New state is PENDING.
>
> Strangely, though, I can submit a job manually using gw_em_mad_ws:
>
> $ gw_em_mad_ws
> INIT - - - -
> INIT - SUCCESS -
> SUBMIT 5 152.118.26.202 job.rsl
> SUBMIT 5 SUCCESS https://riset-c-3208-202.riset.cs.ui.ac.id:8443/wsrf/services/ManagedExecutableJobService?0ac03c50-9e13-11dc-8646-b69b9c478baf
>
> CALLBACK 5 SUCCESS CLEANUP
> CALLBACK 5 SUCCESS ACTIVE
> CALLBACK 5 SUCCESS DONE:0
>
>
> Thanks.
> --
> Rasyid
>
>


-- 
+-----------------------------------------------------------+
Tino Vázquez
Grid Technology Engineer/Researcher
Dpto. Arquitectura de Computadores y Automatica
Facultad de Informatica
Universidad Complutense 28040 Madrid
Phone : +34 91 394 75 74
http://asds.dacya.ucm.es/
+-----------------------------------------------------------+

GridWay, The Way to Grid! http://www.gridway.org