[gram-dev] Subject: PBS SEG not working properly
Andrew Howard
ahoward at purdue.edu
Mon Aug 11 09:11:06 CDT 2008
Stu,
Thanks for the reply. I've double-checked that the path and
permissions are correct. What confuses me is that when I submit a job
and watch the container log, the container does see the submission
(i.e., the Globus container log gives me the PBS job ID). It just never
seems to tell the client that the job was submitted.
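
A minimal sketch of the check Stu describes, for anyone repeating it: it
reports whether the current user can read a PBS log directory and every
file in it. The function name check_pbs_logs is hypothetical, and the
directory passed in must be whatever log path the PBS SEG was configured
with. The point is to run it as the account the container runs under,
since root can read files regardless of their permission bits.

```shell
#!/bin/sh
# check_pbs_logs DIR
# Returns 0 if DIR and every file directly inside it are readable by the
# user running this script, 1 otherwise. (Hypothetical helper, not part of
# the Globus Toolkit.)
check_pbs_logs() {
    log_dir=$1
    # The directory must exist, be listable (r) and be traversable (x).
    if [ ! -d "$log_dir" ] || [ ! -r "$log_dir" ] || [ ! -x "$log_dir" ]; then
        echo "cannot read directory: $log_dir"
        return 1
    fi
    status=0
    for f in "$log_dir"/*; do
        [ -e "$f" ] || continue    # empty directory: the glob did not expand
        if [ ! -r "$f" ]; then
            echo "unreadable: $f"
            status=1
        fi
    done
    return "$status"
}

# Example call (the path is an assumption; use your site's PBS log path):
#   check_pbs_logs /var/spool/pbs/server_logs
```

If this succeeds as root but fails as the container account, the
permissions are the likely problem.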
On Fri, Aug 8, 2008 at 10:26 AM, Stuart Martin <smartin at mcs.anl.gov> wrote:
> This email bounced due to majordomo finding u-n-s-u-b-m-i-t-t-e-d in the
> message body (/\buns\w*b/i at line 5), editing and resending...
>
> Andrew: take a look at the pbs section here:
> http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/admin-index.html#s-wsgram-Interface_Config_Fragscheduler_specific_config
>
> Can you confirm that the path and permissions are correct? The account the
> container is running under must be able to read the pbs log file.
>
> -Stu
>
>>>>
> Hi,
> I've been struggling with getting Globus-WS working with PBS. It
> worked at one point, but now it seems the PBS SEG isn't working
> properly, even after I've configured it. It keeps giving me "Current
> job state: Unsubmitted"
>
> I ran $GLOBUS_LOCATION/setup/globus/setup-seg-pbs.pl and it produced
> no errors. Then I ran the test at
> $GLOBUS_LOCATION/test/globus_scheduler_event_generator_pbs_test/TESTS.pl
> and got this output:
>
> [root@tg-steele globus_scheduler_event_generator_pbs_test]# ./TESTS.pl
> Warning: Do not start a service container while this test script is running.
> test-pbs-seg....ok
> All tests successful.
> Files=1, Tests=1, 10 wallclock secs ( 0.05 cusr + 0.06 csys = 0.11 CPU)
>
> Seeing that that was happy, I submitted a job to the server, but it
> still returns "Current job state: Unsubmitted":
> [ahoward@tg-steele globus_test]$ globusrun-ws -submit -F https://tg-steele.purdue.teragrid.org -Ft PBS -f hostname_ws.rsl
> Submitting job...Done.
> Job ID: uuid:4af67660-64b3-11dd-86dd-001ec9aa7d43
> Termination time: 08/08/2008 19:01 GMT
> Current job state: Unsubmitted
>
> However, if I look in the $GLOBUS_LOCATION/var/container.log, I can
> see that the job was successfully submitted to PBS:
> 2008-08-07 15:01:51,426 INFO exec.StateMachine [RunQueueThread_11,logJobAccepted:3424] Job 4b298a00-64b3-11dd-a07c-da8d50e1996e accepted for local user 'ahoward'
> 2008-08-07 15:01:52,056 INFO exec.StateMachine [RunQueueThread_15,logJobSubmitted:3436] Job 4b298a00-64b3-11dd-a07c-da8d50e1996e submitted with local job ID '150799.steele-adm.rcac.purdue.edu'
>
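
For cross-checking, here is a small sketch that pulls the local job ID out
of container log entries like the two above, so the ID can then be searched
for in the PBS server logs. The function name local_job_ids is
hypothetical, and it assumes each log entry sits on a single line in the
real container.log (the entries are only wrapped in mail).

```shell
#!/bin/sh
# local_job_ids LOGFILE
# Prints the quoted ID from every "submitted with local job ID '...'"
# entry in LOGFILE, one per line. (Hypothetical helper.)
local_job_ids() {
    sed -n "s/.*submitted with local job ID '\([^']*\)'.*/\1/p" "$1"
}

# Example use (the server_logs path is an assumption; substitute your own):
#   for id in $(local_job_ids "$GLOBUS_LOCATION/var/container.log"); do
#       grep -l "$id" /var/spool/pbs/server_logs/*
#   done
```

If the ID shows up in the PBS logs but the client never leaves its initial
state, the events are being written but not picked up.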
> FWIW, if I run the SEG test script again as myself instead of as root,
> the subtest is skipped rather than run:
> [ahoward@tg-steele globus_scheduler_event_generator_pbs_test]$ ./TESTS.pl
> Warning: Do not start a service container while this test script is running.
> test-pbs-seg....ok
> 1/1 skipped: PBS SEG not configured
> All tests successful, 1 subtest skipped.
> Files=1, Tests=1, 0 wallclock secs ( 0.03 cusr + 0.00 csys = 0.03 CPU)
>
>
> Any suggestions? Because this has me completely stumped at the moment.
>
> Thanks in advance!
>
> --
> Andrew Howard
> Rosen Center for Advanced Computing
> Purdue University
>
> <<<
>
>
--
Andrew Howard
Rosen Center for Advanced Computing
Purdue University