[gram-user] job cannot be submitted by globusrun-ws

Peter G Lane lane at mcs.anl.gov
Mon May 8 18:08:21 CDT 2006


You haven't provided any of the information that I asked for except for
the GT version. Are my instructions not clear enough? Please tell me if
I'm not making any sense. Anyway, I already know your fork SEG isn't
working. Could you try the steps from my last email? Thanks.

Peter
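
For reference, the manual Fork SEG check described in the quoted message
below comes down to running the SEG binary by hand with an earlier start
timestamp. Here is a minimal Python sketch of building that command line.
The helper names and the fallback install path are made up for
illustration; only the binary path and the -s / -t arguments are taken
from the example in this thread.

```python
import os
import time

# Path of the SEG binary relative to $GLOBUS_LOCATION, as shown in the thread.
SEG_RELATIVE_PATH = "libexec/globus-scheduler-event-generator"


def earlier_timestamp(now_epoch, days_back):
    """Move a seconds-since-epoch timestamp back by a whole number of days."""
    return int(now_epoch) - days_back * 24 * 60 * 60


def build_seg_command(globus_location, start_epoch, scheduler="fork"):
    """Return the argument list for running the SEG by hand (hypothetical helper)."""
    seg = os.path.join(globus_location, SEG_RELATIVE_PATH)
    return [seg, "-s", scheduler, "-t", str(int(start_epoch))]


if __name__ == "__main__":
    # "/usr/local/globus" is only a placeholder default, not a known install path.
    location = os.environ.get("GLOBUS_LOCATION", "/usr/local/globus")
    start = earlier_timestamp(time.time(), days_back=7)
    # Print the command so it can be pasted into a shell and run by hand.
    print(" ".join(build_seg_command(location, start)))
```

If the printed command shows no events, increase days_back and try again.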

On Mon, 2006-05-08 at 14:32 -0700, wenwen LI wrote:
> I'm using GT 4.0.1. When I ran globus-start-container today, it gave
> no errors or warnings and everything looked fine, but when I ran
> "globusrun-ws -submit -c /bin/true" I got the message
> "Current job state: Unsubmitted".
> When I ran TESTS.pl with the web service container shut down, I got
> these errors:
> ==================================================================
> [globus at srb globus_scheduler_event_generator_fork_test]$ ./TESTS.pl
> Warning: Do not start a service container while this test script is running.
> test-fork-seg....NOK 1# Test 1 got: 'Unable to run SEG with fork module: is it installed?' (test-fork-seg.pl at line 23)
> #   Expected: '0'
> #  test-fork-seg.pl line 23 is: skip($skip_all ? "Fork SEG not configured" : 0, &run_test, 0);
> test-fork-seg....FAILED test 1
>         Failed 1/1 tests, 0.00% okay
> Failed Test      Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
> test-fork-seg.pl                1    1 100.00%  1
> Failed 1/1 test scripts, 0.00% okay. 1/1 subtests failed, 0.00% okay.
> ===============================================================
>  
> Thank you very much!
>  
>  
>                     Wenwen 
>  
>  
> 
> Peter G Lane <lane at mcs.anl.gov> wrote:
>         What version of the Globus Toolkit are you using? I can't figure
>         out why it's trying to recover the same job twice. The recover
>         method is synchronized and also checks a flag to make sure it
>         doesn't run twice. This should be impossible. Do you have two
>         deployments of the GRAM services by any chance?
>         
>         I guess for now you can just delete your ~/.globus/persisted/
>         directory to clean up all the job persistence data. Then we can
>         address the original problem. Can you turn on full GRAM debug
>         logging in container-log4j.properties (just uncomment the
>         appropriate line) and just start your container (don't submit
>         any jobs)? You should see some lines that list the command-line
>         arguments for running the Fork SEG. If not, send me the
>         container log. If you do, reconstruct the command line from
>         those logging statements and run it by hand. If you don't see
>         any output, adjust the timestamp (it's in seconds since the
>         epoch) so that it represents an earlier time and try again. You
>         should eventually see something like the following (the command
>         should "hang"):
>         
>         logan% $GLOBUS_LOCATION/libexec/globus-scheduler-event-generator -s fork -t 1145994457
>         001;1145994457;58d45b32-d494-11da-8c01-000d61215ff0:6616;2;0
>         001;1145994457;58d45b32-d494-11da-8c01-000d61215ff0:6616;8;0
>         001;1145994584;a4ceee44-d494-11da-8600-000d61215ff0:6709;2;0
>         001;1145994584;a4ceee44-d494-11da-8600-000d61215ff0:6709;8;0
>         001;1146023492;f36ab716-d4d7-11da-9124-000d61215ff0:11605;2;0
>         001;1146023492;f36ab716-d4d7-11da-9124-000d61215ff0:11605;8;0
>         
>         Peter
>         
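
The sample SEG output quoted above can be split into fields with a few
lines of Python. This is only a sketch: the field names (message type,
timestamp, job ID, state, exit code) are my reading of the
semicolon-separated records shown above and are not confirmed anywhere in
this thread.

```python
from collections import namedtuple

# Field names are assumptions inferred from the sample records in the thread.
SegEvent = namedtuple("SegEvent", "msg_type timestamp job_id state exit_code")


def parse_seg_line(line):
    """Parse one 'type;timestamp;jobid;state;exitcode' record into a SegEvent."""
    fields = line.strip().split(";")
    if len(fields) != 5:
        raise ValueError("expected 5 ';'-separated fields, got %d" % len(fields))
    msg_type, timestamp, job_id, state, exit_code = fields
    return SegEvent(msg_type, int(timestamp), job_id, int(state), int(exit_code))


def states_by_job(lines):
    """Group the state codes seen for each job ID, in the order they appear."""
    seen = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between records
        event = parse_seg_line(line)
        seen.setdefault(event.job_id, []).append(event.state)
    return seen


if __name__ == "__main__":
    sample = [
        "001;1145994457;58d45b32-d494-11da-8c01-000d61215ff0:6616;2;0",
        "001;1145994457;58d45b32-d494-11da-8c01-000d61215ff0:6616;8;0",
    ]
    for job_id, states in states_by_job(sample).items():
        print(job_id, states)
```

A job that never shows up in this output is one the SEG has not reported
any events for.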
>         On Fri, 2006-04-28 at 10:56 -0700, wenwen LI wrote:
>         > Here is the result:
>         >
>         --------------------------------------------------------------------------------------------------------------------
>         > total 40
>         > -rw-rw-r-- 1 globus globus 6925 Apr 27 16:22
>         > 111fb780-d590-11da-b53b-00093d1067b1.xml
>         > -rw-rw-r-- 1 globus globus 6925 Apr 27 16:22
>         > 17889586-d579-11da-830a-00093d1067b1.xml
>         > -rw-rw-r-- 1 globus globus 6926 Apr 27 16:22
>         > 362de86c-d57c-11da-82c7-00093d1067b1.xml
>         > -rw-rw-r-- 1 globus globus 6925 Apr 27 16:22
>         > 438ae57e-d580-11da-babc-00093d1067b1.xml
>         > -rw-rw-r-- 1 globus globus 6925 Apr 27 16:22
>         > b6f4ab42-d59b-11da-a021-00093d1067b1.xml
>         > -rw-rw-r-- 1 globus globus 0 Apr 21 00:09 xph27814.tmp
>         > 
>         >
>         -------------------------------------------------------------------------------------------------------------------
>         > And I have attached the 111fb780-d590-11da-b53b-00093d1067b1.xml
>         > file to this mail.
>         > Thank you very much!
>         > 
>         > 
>         > Peter G Lane wrote:
>         > On Thu, 2006-04-27 at 12:17 -0700, wenwen LI wrote:
>         > > Here is the result:
>         > > [root at srb var]# ls -l
>         > > total 24
>         > > -rw-r--r-- 1 globus globus 4831 Apr 12 15:46 container.log
>         > > -rw-rw-rw- 1 globus globus 1346 Apr 26 23:12
>         > > globus-fork.log
>         > > -rw-rw-r-- 1 globus globus 46 Apr 27 04:02
>         > > globus-jsm-fork.stamp
>         > > -rw-rw-r-- 1 globus globus 46 Apr 27 04:02
>         > > globus-jsm-multi.stamp
>         > > drwxrwxr-x 3 globus globus 4096 Mar 30 17:19 lib
>         > > I think it has the right permissions.
>         > > But today when I started the web service container as user
>         > > 'globus', it produced errors that I had never seen before:
>         > >
>         >
>         -------------------------------------------------------------------------------------------------------------------
>         > > [globus at srb postgre]$ globus-start-container
>         > > 2006-04-27 16:22:07,937 INFO exec.ManagedExecutableJobHome
>         > > [Thread-3,recover:163] Recovered resource with ID
>         > > 438ae57e-d580-11da-babc-00093d1067b1.
>         > > 2006-04-27 16:22:07,944 INFO exec.RunQueue [Thread-3,:54]
>         > > Starting state machine with 16 run queues.
>         > > 2006-04-27 16:22:09,027 INFO exec.ManagedExecutableJobHome
>         > > [Thread-3,recover:163] Recovered resource with ID
>         > > 111fb780-d590-11da-b53b-00093d1067b1.
>         > > 2006-04-27 16:22:12,918 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > 438ae57e-d580-11da-babc-00093d1067b1.
>         > > 2006-04-27 16:22:12,919 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > 111fb780-d590-11da-b53b-00093d1067b1.
>         > > 2006-04-27 16:22:12,958 ERROR utils.JobStateMonitorSubscriptionManager [Thread-23,subscribe:179] unable to monitor job for state changes
>         > > org.globus.exec.monitoring.AlreadyRegisteredException
>         > 
>         > I don't understand how, but it looks like a job is being
>         > recovered twice (111fb780-d590-11da-b53b-00093d1067b1). What
>         > version of the toolkit are you using? Would it be possible for
>         > you to find the file in the container owner's
>         > ~/.globus/persisted/-/ManagedExecutableJobResourceStateType/
>         > directory named 111fb780-d590-11da-b53b-00093d1067b1.xml and
>         > attach it to your response? I'm wondering if the persistence
>         > data got corrupted. After that, if you delete that directory
>         > then you won't have all these jobs being recovered.
>         > 
>         > Peter
>         > 
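
One way to confirm the double recovery from a saved container log is to
count the "Recovered resource with ID" messages per job ID. The sketch
below assumes the log text has had the mail quoting removed; the function
name is made up for illustration, and the sample lines are copied from
the log output in this thread.

```python
import re
from collections import Counter

# Matches the INFO message printed by ManagedExecutableJobHome in the log above.
# \s+ also tolerates a line break between "ID" and the UUID.
RECOVERED = re.compile(r"Recovered resource with ID\s+([0-9a-f-]+)\.")


def recovered_more_than_once(log_text):
    """Return the sorted job IDs that were reported as recovered at least twice."""
    counts = Counter(RECOVERED.findall(log_text))
    return sorted(job_id for job_id, n in counts.items() if n > 1)


SAMPLE_LOG = """\
[Thread-3,recover:163] Recovered resource with ID 438ae57e-d580-11da-babc-00093d1067b1.
[Thread-3,recover:163] Recovered resource with ID 111fb780-d590-11da-b53b-00093d1067b1.
[Thread-6,recover:163] Recovered resource with ID 438ae57e-d580-11da-babc-00093d1067b1.
[Thread-6,recover:163] Recovered resource with ID 111fb780-d590-11da-b53b-00093d1067b1.
[Thread-6,recover:163] Recovered resource with ID 17889586-d579-11da-830a-00093d1067b1.
"""

if __name__ == "__main__":
    for job_id in recovered_more_than_once(SAMPLE_LOG):
        print("recovered more than once:", job_id)
```

In the sample, the two IDs recovered by both Thread-3 and Thread-6 are
the ones reported.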
>         > > at org.globus.exec.monitoring.JobStateMonitor.registerJobID(JobStateMonitor.java:227)
>         > > at org.globus.exec.service.exec.utils.JobStateMonitorSubscriptionManager.subscribe(JobStateMonitorSubscriptionManager.java:171)
>         > > at org.globus.exec.service.exec.utils.JobStateMonitorSubscriptionManager.run(JobStateMonitorSubscriptionManager.java:136)
>         > > 2006-04-27 16:22:12,963 ERROR utils.JobStateMonitorSubscriptionManager [Thread-23,subscribe:179] unable to monitor job for state changes
>         > > org.globus.exec.monitoring.AlreadyRegisteredException
>         > > at org.globus.exec.monitoring.JobStateMonitor.registerJobID(JobStateMonitor.java:227)
>         > > at org.globus.exec.service.exec.utils.JobStateMonitorSubscriptionManager.subscribe(JobStateMonitorSubscriptionManager.java:171)
>         > > at org.globus.exec.service.exec.utils.JobStateMonitorSubscriptionManager.run(JobStateMonitorSubscriptionManager.java:136)
>         > > 2006-04-27 16:22:12,963 WARN factory.ManagedJobFactoryResource [Thread-3,run:164] Recovery exception
>         > > org.globus.wsrf.NoSuchResourceException
>         > > at org.globus.wsrf.impl.ResourceHomeImpl.get(ResourceHomeImpl.java:285)
>         > > at org.globus.wsrf.impl.ResourceHomeImpl.find(ResourceHomeImpl.java:262)
>         > > at org.globus.exec.service.exec.ManagedExecutableJobHome.recover(ManagedExecutableJobHome.java:160)
>         > > at org.globus.exec.service.factory.ManagedJobFactoryResource$1RecoveryThread.run(ManagedJobFactoryResource.java:161)
>         > > 2006-04-27 16:22:13,084 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > 9a8bbb5c-d56a-11da-bb0b-00093d1067b1.
>         > > 2006-04-27 16:22:13,206 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > 17889586-d579-11da-830a-00093d1067b1.
>         > > 2006-04-27 16:22:13,324 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > 362de86c-d57c-11da-82c7-00093d1067b1.
>         > > 2006-04-27 16:22:13,438 INFO exec.ManagedExecutableJobHome
>         > > [Thread-6,recover:163] Recovered resource with ID
>         > > b6f4ab42-d59b-11da-a021-00093d1067b1.
>         > > 
>         > >
>         >
>         -------------------------------------------------------------------------------------------------------------------
>         > > 
>         > > What's wrong with it? 
>         > > 
>         > > Peter G Lane wrote:
>         > > On Wed, 2006-04-26 at 19:17 -0700, wenwen LI wrote:
>         > > > Hi,
>         > > > This is the information for globus-fork.conf:
>         > > >
>         > >
>         >
>         ----------------------------------------------------------------------------------------------------------------------------
>         > > > [root at srb etc]# ls -l globus-fork.conf
>         > > > -rw-rw-rw- 1 globus globus 47 Mar 30 17:20
>         > > > globus-fork.conf
>         > > 
>         > > I wasn't clear enough. I want you to look *in*
>         > > globus-fork.conf. It is a configuration file that contains a
>         > > path to the fork SEG log file. It is the fork SEG log file
>         > > that I want you to check for permissions. The
>         > > globus-fork.conf file only needs to be -rw for the owner.
>         > > 
>         > > Peter
>         > > 
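
The check described above (read the log path out of globus-fork.conf,
then look at the permissions on that log file) can be sketched in Python
as follows. The "log_path" key name and the key=value layout are
assumptions about the file format, and both helper names are made up for
illustration.

```python
import os
import stat


def read_log_path(conf_text, key="log_path"):
    """Return the value of `key` from key=value config text, or None if absent.

    The key name "log_path" is an assumption about globus-fork.conf.
    """
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def is_world_read_write(path):
    """True if 'other' users have both read and write permission on path."""
    mode = os.stat(path).st_mode
    wanted = stat.S_IROTH | stat.S_IWOTH
    return mode & wanted == wanted


if __name__ == "__main__":
    location = os.environ.get("GLOBUS_LOCATION", "")
    conf_file = os.path.join(location, "etc", "globus-fork.conf")
    with open(conf_file) as handle:
        log_path = read_log_path(handle.read())
    if log_path is None:
        print("no log path found in", conf_file)
    else:
        print(log_path, "world read/write:", is_world_read_write(log_path))
```

A log file with mode -rw-rw-rw- passes this check; one with mode
-rw-r--r-- does not.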
>         > > >
>         > >
>         >
>         ---------------------------------------------------------------------------------------------------------------------------
>         > > > After that I ran TESTS.pl as user 'wenwen', but I still
>         > > > get this:
>         > > >
>         > >
>         >
>         ----------------------------------------------------------------------------------------------------------------------------
>         > > > [wenwen at srb globus_scheduler_event_generator_fork_test]$ ./TESTS.pl
>         > > > Warning: Do not start a service container while this test script is running.
>         > > > test-fork-seg....ok
>         > > > 1/1 skipped: Fork SEG not configured
>         > > > All tests successful, 1 subtest skipped.
>         > > > Files=1, Tests=1, 0 wallclock secs ( 0.01 cusr + 0.02 csys = 0.03 CPU)
>         > > >
>         > >
>         >
>         ----------------------------------------------------------------------------------------------------------------------------
>         > > > Then I submitted a job. It says "job status: unsubmitted",
>         > > > and nothing more is printed after that line.
>         > > > I think the web service container is not receiving my job
>         > > > request.
>         > > > What's wrong with it?
>         > > > Thanks in advance!
>         > > > 
>         > > > Wenwen
>         > > > 
>         > > > 
>         > > > Peter G Lane wrote:
>         > > > On Wed, 2006-04-26 at 15:50 -0700, wenwen LI wrote:
>         > > > > Hi, everyone:
>         > > > > 
>         > > > > I started PostgreSQL as user 'postgre' successfully. Then I
>         > > > > started the web service container as user 'globus', and it
>         > > > > started fine. But when I ran:
>         > > > > globusrun-ws -submit -c /bin/true
>         > > > > as user 'wenwen', I got these results:
>         > > > > Submitting job...Done.
>         > > > >
>         > > >
>         > >
>         >
>         -------------------------------------------------------------------------------------------------------
>         > > > > Job ID: uuid:362de86c-d57c-11da-82c7-00093d1067b1
>         > > > > Termination time: 04/27/2006 23:27 GMT
>         > > > > (after waiting 2 minutes, I got)
>         > > > > Current job state: Unsubmitted
>         > > > > (After that, nothing more is printed in this window or in
>         > > > > the web service container window.)
>         > > > >
>         > > >
>         > >
>         >
>         -------------------------------------------------------------------------------------------------------
>         > > > > Then I run TESTS.pl by user 'wenwen' like this:
>         > > > >
>         > > >
>         > >
>         >
>         ------------------------------------------------------------------------------------------------------
>         > > > > [wenwen at srb globus_scheduler_event_generator_test]$ ./TESTS.pl
>         > > > > seg-api-test............ok
>         > > > > seg-module-load-test....ok
>         > > > > seg-timestamp-test......ok
>         > > > > All tests successful.
>         > > > > Files=3, Tests=6, 1 wallclock secs ( 0.09 cusr + 0.05 csys = 0.14 CPU)
>         > > > > [wenwen at srb globus_scheduler_event_generator_test]$ cd $GLOBUS_LOCATION/test/globus_scheduler_event_generator_fork_test
>         > > > > [wenwen at srb globus_scheduler_event_generator_fork_test]$ ./TESTS.pl
>         > > > > Warning: Do not start a service container while this test script is running.
>         > > > > test-fork-seg....ok
>         > > > > 1/1 skipped: Fork SEG not configured
>         > > > 
>         > > > Check $GLOBUS_LOCATION/etc/globus-fork.conf for a valid
>         > > > path. Check the file pointed to by that path for proper
>         > > > permissions. It should be world readable and writable.
>         > > > 
>         > > > Peter
>         > > > 
>         > > > > All tests successful, 1 subtest skipped.
>         > > > > Files=1, Tests=1, 0 wallclock secs ( 0.04 cusr + 0.00 csys = 0.04 CPU)
>         > > > >
>         > > >
>         > >
>         >
>         ------------------------------------------------------------------------------------------------------
>         > > > > But when I restarted the web service container, it printed
>         > > > > the following:
>         > > > >
>         > > >
>         > >
>         >
>         ------------------------------------------------------------------------------------------------------
>         > > > > globus-start-container
>         > > > > 2006-04-26 19:52:02,266 WARN factory.ManagedJobFactoryResource [Thread-3,run:164] Recovery exception
>         > > > > org.globus.wsrf.NoSuchResourceException
>         > > > > at org.globus.wsrf.impl.ResourceHomeImpl.get(ResourceHomeImpl.java:285)
>         > > > > at org.globus.wsrf.impl.ResourceHomeImpl.find(ResourceHomeImpl.java:262)
>         > > > > at org.globus.exec.service.exec.ManagedExecutableJobHome.recover(ManagedExecutableJobHome.java:160)
>         > > > > at org.globus.exec.service.factory.ManagedJobFactoryResource$1RecoveryThread.run(ManagedJobFactoryResource.java:161)
>         > > > > 2006-04-26 19:52:05,222 INFO exec.RunQueue [Thread-6,:54] Starting state machine with 16 run queues.
>         > > > > 2006-04-26 19:52:07,289 WARN factory.ManagedJobFactoryResource [Thread-6,run:164] Recovery exception
>         > > > > org.globus.wsrf.NoSuchResourceException
>         > > > > at org.globus.wsrf.impl.ResourceHomeImpl.get(ResourceHomeImpl.java:285)
>         > > > > at org.globus.wsrf.impl.ResourceHomeImpl.find(ResourceHomeImpl.java:262)
>         > > > > at org.globus.exec.service.exec.ManagedExecutableJobHome.recover(ManagedExecutableJobHome.java:160)
>         > > > > at org.globus.exec.service.factory.ManagedJobFactoryResource$1RecoveryThread.run(ManagedJobFactoryResource.java:161)
>         > > > > Starting SOAP server at: https://129.174.124.107:8443/wsrf/services/
>         > > > > With the following services:
>         > > > > [1]: https://129.174.124.107:8443/wsrf/services/TriggerFactoryService
>         > > > > [2]: https://129.174.124.107:8443/wsrf/services/DelegationTestService
>         > > > > [3]: https://129.174.124.107:8443/wsrf/services/SecureCounterService
>         > > > > [4]: https://129.174.124.107:8443/wsrf/services/IndexServiceEntry
>         > > > > [5]: https://129.174.124.107:8443/wsrf/services/DelegationService
>         > > > > [6]: https://129.174.124.107:8443/wsrf/services/InMemoryServiceGroupFactory
>         > > > > [7]: https://129.174.124.107:8443/wsrf/services/mds/test/execsource/IndexService
>         > > > > [8]: https://129.174.124.107:8443/wsrf/services/mds/test/subsource/IndexSe
>         > > > > ......
>         > > > > [51]
>         > > > >
>         > > >
>         > >
>         >
>         ------------------------------------------------------------------------------------------------------
>         > > > > Can anybody help?
>         > > > > Thank you very much!
>         > > > > 
>         > > > > 
>         > > > > Wenwen