[workspace-user] workspace doesn't work after a name change
Tim Freeman
tfreeman at mcs.anl.gov
Tue Apr 24 12:21:08 CDT 2007
On Tue, 24 Apr 2007 18:36:54 +0200
"Manuel Rodriguez Pascual" <supermanue at gmail.com> wrote:
> Hi everyone,
> I had successfully configured Workspace to run on a cluster. I could start
> VMs, and they executed on a remote machine.
>
> The problem came when I changed the name of all computers. Now, when I try
> to start a new VM, I keep receiving errors.
>
> I changed "etc/workspace_service/resourcepools/pool1" to have the new names
> instead of the old ones. As far as I know, that is the only place where the
> slave machines should appear, isn't it?
I think you're hitting a known usability problem with the database vs. the pool
files. Those pool files currently are used as a way to populate the database
with the node information.
But this works in one direction only: when you remove entries from the pool
file, they are not deleted from the DB. Doing that safely would need code that
a) makes sure nothing is using, or scheduled to use, the node being removed,
b) perhaps marks the node as unavailable for future use if something is still
running there (while letting that workspace continue to run), or
c) perhaps even shuts down or moves workspaces running on pool nodes that were
removed. So for now it has been left like this.
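To make the one-way behavior concrete, here is a minimal sketch. It is not the
actual workspace service code: the "nodes" table, its columns, the function
names, the use of sqlite and the "ursa01" host are all assumptions made for
illustration. Only the pool file format (host name, then memory, with "#" for
comments) is taken from the file quoted below.

```python
import sqlite3


def parse_pool(text):
    """Parse 'hostname memory_mb' lines; blank lines and '#' comments are skipped."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        host, mem = line.split()
        entries[host] = int(mem)
    return entries


def populate(conn, entries):
    """Add-only sync: hosts missing from the DB are inserted, nothing is deleted."""
    for host, mem in entries.items():
        conn.execute(
            "INSERT OR IGNORE INTO nodes (hostname, memory_mb) VALUES (?, ?)",
            (host, mem),
        )
    conn.commit()


def hostnames(conn):
    """Return every host name the DB knows about, sorted."""
    rows = conn.execute("SELECT hostname FROM nodes ORDER BY hostname")
    return [row[0] for row in rows]


# Hypothetical stand-in for the service database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (hostname TEXT PRIMARY KEY, memory_mb INTEGER)")

# First start: pool file with the old names (assumed contents).
populate(conn, parse_pool("ursa01 383\nursa02 383\n"))

# After the rename: the pool file lists only the new names.
populate(conn, parse_pool("piscis01 383\npiscis02 383\n#vmm1 1024\n"))

# The old names are still present, so they can still be picked for new VMs.
print(hostnames(conn))
```

With an add-only sync like this, editing the pool file can introduce new hosts
but can never retire old ones, which matches the symptom in the log.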
To remove the nodes from the DB, see this note tucked away in the admin guide
(if you have no running workspaces, the best option is currently just to run
"ant resetDB" from the workspace source directory):
http://workspace.globus.org/vm/TP1.2.3/doc/admin-index.html#workspaceVM-db-sidenote
Sorry for the inconvenience. Removing entries should be easier in future
releases.
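If you want to confirm that stale DB entries are the cause before resetting
anything, one rough check is to compare the hosts in the pool file with the
hosts the service reports choosing in container.log. This is only a sketch:
the function names are made up, and it assumes the log keeps the
"resource pool entry '<host>'" wording seen in the log quoted below.

```python
import re

# Matches the host name in log text such as:
#   resource pool entry 'ursa02': 64 MB reserved, 255 MB left
ENTRY_RE = re.compile(r"resource pool entry '([^']+)'")


def pool_hosts(pool_text):
    """Return the set of active (non-commented) host names in a pool file."""
    hosts = set()
    for line in pool_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.add(line.split()[0])
    return hosts


def stale_hosts(pool_text, log_text):
    """Hosts the service selected that are no longer listed in the pool file."""
    used = set(ENTRY_RE.findall(log_text))
    return sorted(used - pool_hosts(pool_text))


pool = "piscis01 383\npiscis02 383\n#vmm1 1024\n"
log = "resource pool entry 'ursa02': 64 MB reserved, 255 MB left\n"
print(stale_hosts(pool, log))
```

Any name this reports is one the service still has on record even though the
pool file no longer lists it.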
Tim
>
> Outside workspace everything works correctly, so I suppose the problem must
> be here.
>
> These are my etc/workspace_service/resourcepools/pool1 and container.log files:
>
> etc/workspace_service/resourcepools/pool1:
> piscis01 383
> piscis02 383
> #vmm1 1024
> #vmm2 2048
> #vmm3 2048
> #vmm4 2048
> #vmm5 2048
> #vmm6 2048
> #vmm7 2048
>
> ---------------------------
> container.log
> 2007-04-24 18:30:30,056 INFO
> factory.WorkspaceFactoryService[ServiceThread-13,create:62]
> [WORKSPACE-EVENT]: WS-CREATE invoked by
> /O=UCM/OU=ASDS-DACYA/CN=Manuel Rodriguez
> 2007-04-24 18:30:30,063 INFO
> factory.AssociationUtil[ServiceThread-13,getNextEntry:96]
> [WORKSPACE-EVENT]: 'public' association
> entry leased, ip=192.168.0.11
> 2007-04-24 18:30:30,076 INFO
> factory.ResourcepoolUtil[ServiceThread-13,getResourcepoolEntry:92]
> [WORKSPACE-EVENT]: 'pool1'
> resource pool entry 'ursa02': 64 MB reserved, 255 MB left
> 2007-04-24 18:30:30,087 INFO
> service.WorkspaceHome[ServiceThread-13,create:631]
> [WORKSPACE-EVENT][id-205]:
>
> WORKSPACE RESOURCE CREATED:
> - Name: 'http://example1/localhost/image'
> - Key: '205'
> - Start time: Apr 24, 2007 6:30:30 PM
> - Shutdown time: Apr 24, 2007 7:00:30 PM
> - Resource termination time: Apr 24, 2007 7:30:30 PM
> - Creator DN: /O=UCM/OU=ASDS-DACYA/CN=Manuel Rodriguez
>
> 2007-04-24 18:30:30,130 INFO
> impls.WorkspaceResourceImpl[Timer-0,setOpsEnabled:370]
> [WORKSPACE-EVENT][id-205]: WS-operations
> enabled
> 2007-04-24 18:30:30,192 INFO
> workspace.WorkspaceUtil[Thread-18_WorkspTaskThrd,runCommand:155]
> [WORKSPACE-EVENT]: /usr/bin/ssh
> xenadmin@ursa02 /opt/workspace/bin/workspace-control --create --name
> workspace-205 --memory 64 --networking
> 'eth0;public;ANY;Bridged;Static;192.168.0.11;null;192.168.0.255;255.255.255.0;192.168.0.1;null;null;null;null;null'
> --image file://base.img --imagemount sda1
> 2007-04-24 18:30:30,230 INFO
> workspace.WorkspaceUtil[Thread-18_WorkspTaskThrd,runCommand:176]
> [WORKSPACE-EVENT]: Return code is
> 255
> 2007-04-24 18:30:30,232 ERROR
> workspace.WorkspaceUtil[Thread-18_WorkspTaskThrd,runCommand:232]
> [WORKSPACE-EVENT]: system command
> FAILURE
> STDERR:
>
> ssh: ursa02: Name or service not known
> 2007-04-24 18:30:30,233 INFO
> xen.XenTask[Thread-18_WorkspTaskThrd,execute:126]
> [WORKSPACE-EVENT][id-205]: Start
> failed
> 2007-04-24 18:30:30,233 ERROR
> impls.StatefulResourceImpl[Thread-18_WorkspTaskThrd,notify:123]
> Problem moving [id-205] to state
> 'Started'
> org.globus.workspace.WorkspaceException: Unknown problem
> at org.globus.workspace.xen.XenUtil.throwErr(XenUtil.java:485)
> at org.globus.workspace.xen.XenUtil.translateReturnException(
> XenUtil.java:473)
> at org.globus.workspace.xen.XenTask._execute(XenTask.java:203)
> at org.globus.workspace.xen.XenTask.execute(XenTask.java:105)
> at org.globus.workspace.service.impls.async.WorkspaceThread.run(
> WorkspaceThread.java:56)
> 2007-04-24 18:30:30,238 WARN
> impls.StateTransition[Thread-18_WorkspTaskThrd,corrupted:232]
> Workspace was corrupted (when
> moving to state Started): can not change state anymore unless workspace is
> going to be destroyed
>
> As you can see, Workspace is trying to ssh to "ursa02", which is the old name
> of the computer. It should say "piscis02".
>
> Do you have any clues of what is happening?
>
> Thanks for your attention,
>
> Manuel Rodríguez Pascual
>