[gridway-user] GridGateway problem

rayban rayban at dsi.uclm.es
Mon Nov 12 07:37:29 CST 2007


Of course I don't mind ;-). My GridWay admin user is usuario and my gw
group is also usuario. Two user groups are involved: one for local and
remote Globus users and another for foreign Globus users. The foreign
Globus users also belong to the gw user group. All gw users are mapped
to the gwadmin user for the GridWay MADs. The problem is that the temp
folder that GridGateway creates to generate the new GridWay job template
for the local GridWay belongs to the gw user and has no write permission
for the gwadmin user, only for the gridway user. So when the job
finishes on the computing element and GridWay, which is executed by
usuario, brings the files back to the temp folder, it has no write
privileges on that folder and the job fails. Simply adding group write
permission on the temp folders solves the problem, but I don't think it
is an elegant way to do it. It should be possible to change the owner of
that temp folder to usuario instead, which would avoid giving write
access to the group. I haven't tested that.
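
As a minimal sketch of the first option (group access on the job
folder), assuming nothing about the real gw.in beyond what is described
above: add_group_access is a hypothetical helper, and a throwaway
temporary directory stands in for the real temp folder. Creating files
in a directory needs both write and search (x) permission on it, so the
sketch adds rwx for the group. The second option, changing the owner,
usually requires root privileges, so it only appears as a comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of gw.in): add rwx for the group on a
# directory while leaving the owner and "other" bits untouched.
sub add_group_access {
    my ($dir) = @_;
    my @st = stat($dir) or die "cannot stat $dir: $!";
    my $mode = $st[2] & 07777;   # keep only the permission bits
    my $new  = $mode | 0070;     # creating files needs w and x on the dir
    chmod($new, $dir) or die "cannot chmod $dir: $!";
    return $new;
}

# A throwaway directory stands in for the GridGateway temp folder.
my $job_dir = tempdir(CLEANUP => 1);
chmod(0700, $job_dir) or die "cannot chmod $job_dir: $!";

printf "%04o\n", add_group_access($job_dir);   # prints 0770

# Alternative (untested here): hand the folder to the GridWay admin user
# instead. Changing the owner usually requires root, e.g.
#   chown((getpwnam('usuario'))[2], -1, $job_dir);
```

Unlike a fixed mode, OR-ing the group bits into the current mode leaves
whatever access the owner and other users already had unchanged.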

Bye.

On Mon, 12-11-2007 at 11:11 +0100, Tino Vazquez wrote:
> Hi rayban,
> 
> Thanks for the feedback. It seems like you have discovered and fixed a
> bug.
> 
> Could you be more specific about what changes you made in gw.in? I
> am assuming you are changing permissions on the folder "$self->job_dir()"
> in the submit routine of the gw.in script. Is that correct?
> 
> I would love to take a look at that gw.in if you don't mind, so I can
> include the bug fix in the next release.
> 
> Thanks again, 
> 
> -Tino
> 
> On Nov 10, 2007 9:26 PM, <rayban at dsi.uclm.es> wrote:
>         Hi. I installed GridWay gateway yesterday and found an error
>         related to folder privileges.
>         
>         I haven't installed any GT patch for supporting file staging
>         because I only want to use GridGateway from another
>         meta-scheduler such as GridWay. When I launch the pi test
>         provided by the gw installation, the job goes through prolog
>         from my scheduler to GridGateway, then from GridGateway to
>         GridWay and on to the computing elements. The job executes
>         correctly, but then, while copying the files back to the
>         GridWay host (I mean the GridWay host behind GridGateway), I
>         get a failure and the job goes to the pending state to be
>         executed again. I read some logs and realized that the files
>         stdout.execution and stderr.execution could not be copied from
>         the computing element to the GridWay host (two-hop file
>         staging). Then I checked the privileges of the gwadmin user on
>         that folder and realized that it had no write privileges. This
>         is because the owner of that folder is different from gwadmin,
>         while it has been run as gwadmin. The owner of the problematic
>         folder, called ekis, is a member of the gwadmin group, but
>         gwadmin has no write privileges on that folder. Thus, when
>         GridWay tries to write back stdout.execution and
>         stderr.execution, it fails.
>         
>         The solution I found is to modify
>         $GLOBUS_LOCATION/setup/globus/gw.in by adding write privileges
>         for the gwadmin group, and then to run gpt-postinstall -force.
>         I don't know whether that was a bug or my own bad practice in
>         grid deployment.
>         
>         Has anyone had the same error? If so, the solution is above.
>         
>         
>         Bye.
>         
> 
> 
> 
> -- 
> +-----------------------------------------------------------+
> Tino Vázquez
> Grid Technology Engineer/Researcher
> Dpto. Arquitectura de Computadores y Automatica
> Facultad de Informatica
> Universidad Complutense 28040 Madrid 
> Phone : +34 91 394 75 74
> http://asds.dacya.ucm.es/
> +-----------------------------------------------------------+
> 
> GridWay, The Way to Grid! http://www.gridway.org
-- 
Iván Fernández Hernández
Research Fellow
Instituto de investigación en Informática de Albacete (I3A)
-------------- next part --------------
#-------------------------------------------------------------------------- 
# Copyright 1999-2006 University of Chicago
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#-------------------------------------------------------------------------- 

#--------------------------------------------------------------------------
# The adaptation of this software for GridWay was developed under the
# Apache License, Version 2.0, by the GridWay Team
#
# Copyright 2002-2006 GridWay Team, Distributed Systems Architecture
# Group, Universidad Complutense de Madrid
#-------------------------------------------------------------------------- 
use Globus::GRAM::Error;
use Globus::GRAM::JobState;
use Globus::GRAM::JobManager;
use Globus::Core::Paths;

use Config;

# NOTE: This package name must match the name of the .pm file!!
package Globus::GRAM::JobManager::gw;

use File::Basename qw(fileparse);

@ISA = qw(Globus::GRAM::JobManager);

my ($gwsubmit, $gwkill, $gwps);

BEGIN
{
    $gwsubmit = '@GWSUBMIT@';
    $gwkill   = '@GWKILL@';
    $gwps     = '@GWPS@';
    $ENV{'GW_LOCATION'} = '@GW_LOCATION@';
}

sub submit
{
    my $self = shift;
    my $description = $self->{JobDescription};
    my $status;
    my $gw_job_template_name;
    my $gw_job_err_name;
    my $job_id;
    my @arguments;
    my $id;
    my @filestageinarray = @{$description->{filestagein}};
    my @filestageoutarray = @{$description->{filestageout}};


    $self->log('Entering gw submit');

    $self->log('++++++++++++++++++++++++++++++ GW SUBMIT +++++++++++++++++++++++++++++++++++');


    # Set the directory for this job
    if( $description->directory eq '')
    {
		return Globus::GRAM::Error::RSL_DIRECTORY;
    }
    if ($description->directory =~ m|^[^/]|) 
    {
        $description->add('directory', (getpwuid($<))[7] . '/' . $description->directory);
    }
    if((! -d $description->directory) || (! -r $description->directory))
    {
		return Globus::GRAM::Error::BAD_DIRECTORY;
    }

    # make sure the files are accessible (NFS sync) when you check for them
    $self->nfssync( $description->executable() )
	unless $description->executable() eq '';
    $self->nfssync( $description->stdin() )
	unless $description->stdin() eq '';
	
	my $my_executable = $description->executable;
	
	# Check that the executable exists
    if ($description->executable =~ m|^[^/]|) 
    {
        $description->add('executable', $description->directory . '/' . $description->executable);
    }
    if( $description->executable eq '')
    {
		return Globus::GRAM::Error::RSL_EXECUTABLE();
    }
    elsif(! -f $description->executable())
    {
		return Globus::GRAM::Error::EXECUTABLE_NOT_FOUND();
    }
    elsif(! -x $description->executable())
    {
		return Globus::GRAM::Error::EXECUTABLE_PERMISSIONS();
    }

    $self->log('Building job template');

    $gw_job_template_name = $self->job_dir() . '/gwt';
    
    local(*JOB);
    open( JOB, '>' . $gw_job_template_name );
    print JOB<<"EOF";
    
EOF

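    # Fix for the permission problem: open up the job dir
    # (0765 = rwx for the owner, rw- for the group, r-x for others)
    chmod 0765, $self->job_dir();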

	($base_executable) = $description->executable() =~ m~^/.*/([^/]*)$~;
	foreach my $files_in_staging (@filestageinarray)
	{
		my @stage_in_files = split(/ /,$files_in_staging);
		my $base_stage_in = fileparse($stage_in_files[1]);
	
		if($base_stage_in eq  $base_executable) 
		{
			($full_path_stage_in_file) = $stage_in_files[1] =~ m~^.*//.*?/(.*)~;
			my $job_dir = $self->job_dir();
			`cp /$full_path_stage_in_file $job_dir`;
		}
	}	
    
	print JOB "EXECUTABLE = ", $my_executable, "\n";

    @arguments = $description->arguments();
    foreach(@arguments)
    {
        if(ref($_))
		{
	   	 return Globus::GRAM::Error::RSL_ARGUMENTS;
		}
    }
    if($#arguments >= 0)
    {   
		$args = '"'; 
        foreach(@arguments)
        {
             $_ =~ s/\\/\\\\/g;
	    	 $_ =~ s/\$/\\\$/g;
	    	 $_ =~ s/"/\\\"/g; #"
	    	 $_ =~ s/`/\\\`/g; #`
	     
	    	 $args .= ' ' . $_ . ' '; 
	    	 
        }
		$args .= '"';
    }
    else
    {
		$args = '';
    }
	
    print JOB "ARGUMENTS = ", $args, "\n";

    # We construct the environment filtering out fields that won't make
    # sense to the GGW
    
    print JOB "ENVIRONMENT = ";
    foreach my $tuple ($description->environment())
    {
		if(!ref($tuple) || scalar(@$tuple) != 2)
		{
		    return Globus::GRAM::Error::RSL_ENVIRONMENT();
		}

        if ($tuple->[0] eq "GW_JOB_ID")
        {
          $gw_job_id = $tuple->[1];
        }

        if ($tuple->[0] eq "GW_USER")
        {
          $gw_user = $tuple->[1];
        }

        if ($tuple->[0] eq "X509_USER_PROXY")
        {
          $user_proxy = $tuple->[1];
        }

        # Variables that only make sense on the gateway host are not forwarded
        my %skip = map { $_ => 1 } qw(
            X509_USER_PROXY GLOBUS_LOCATION GLOBUS_GRAM_JOB_CONTACT
            GLOBUS_GRAM_MYJOB_CONTACT HOME LOGNAME JAVA_HOME X509_CERT_DIR
            X509_USER_CERT X509_USER_KEY GLOBUS_GRAM_JOB_HANDLE GW_HOSTNAME
            GW_USER GW_JOB_ID
        );
        if (!$skip{$tuple->[0]})
        {
           print JOB $tuple->[0], '=', $tuple->[1], ", ";
        }
    }
    print JOB "\n";

    if ((defined($description->stdin())) && ($description->stdin() ne "/dev/null"))
    {
    	print JOB "STDIN_FILE = file://",  $description->stdin(),  "\n"; 
    }

    if (defined($description->stdout()))
    {
    	print JOB "STDOUT_FILE = file://", $description->stdout(),  "\n";
    }

    if (defined($description->stderr()))
    {
    	print JOB "STDERR_FILE = file://", $description->stderr(),  "\n";
    }      
    
    
    # Now we create the input_files job template entry, taking into 
    # account the fact that all the rft staged files will be in 
    # the job dir

    if(@filestageinarray)
    {
    	print JOB "INPUT_FILES = ";
	    foreach my $files_in (@filestageinarray)
	    {
			my @files_stage_in = split(/ /,$files_in);
			if ($files_stage_in[1] =~ m~^gsiftp~ )
			{
				($strip_url_tmp) = $files_stage_in[1] =~ m~^.*//.*?/(.*)~;
				if ("/".$strip_url_tmp ne $description->stdin())
				{
					print JOB "file:///","$strip_url_tmp, ";
				}
			}
			else
			{
				
				if($files_stage_in[1] =~ m~^file~)
				{
					print JOB "$files_stage_in[1], ";			
				}
				else
				{
					print JOB "file:///","$files_stage_in[1], ";
				}
			}	
	    }
	    print JOB "\n";
    }

    # Now we create the output_files job template entry, taking into 
    # account the fact that the output files are placed in the
    # job dir by GridWay

    if(@filestageoutarray)
    {
		print JOB "OUTPUT_FILES = ";

	    foreach my $files_out (@filestageoutarray)
	    {
	        my @files_stage_out = split(/ /,$files_out);

			if ($files_stage_out[0] =~ m~^gsiftp~ )
			{
				($base_name)   = $files_stage_out[0] =~ m~^.*//.*/(.*)~;
				($wDir)        = $files_stage_out[0] =~ m~^.*//.*?/(.*)~;
				# Skip the file if it is already going to be transferred as stdout or stderr
				if ( ( "/".$wDir ne  $description->stdout() ) && ( "/".$wDir ne  $description->stderr() ) )
				{
					print JOB "${base_name} /", $wDir , ", ";
				}
			}
			else
			{
				if($files_stage_out[0] =~ m~^file:/*~)
				{
					($base_name) = $files_stage_out[0] =~ m~^file:.*?([^/]*)$~;
					($wDir)      = $files_stage_out[0] =~ m~^file:/*(.*)~;
					print JOB "${base_name} /", $wDir , ", ";	
				}
				else
				{
					# Absolute path: use its base name; otherwise use the name as given
					if($files_stage_out[0] =~ m~^/~)
					{
						($base_name) = $files_stage_out[0] =~ m~^/.*/([^/]*)$~;
						print JOB "${base_name} ", $files_stage_out[0] , ", ";	
					}
					else
					{
						print JOB "$files_stage_out[0] ", $files_stage_out[0] , ", ";	
					}
				}
			}
	    }     
	    print JOB "\n";
    }

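    # Pre-create empty stdout and stderr files and open up their
    # permissions (0765) so that the output can be written back to them
    open(FD,">" . $description->stdout());
    close(FD);
    open(FD,">" . $description->stderr());
    close(FD);

    chmod 0765, $description->stdout();
    chmod 0765, $description->stderr();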
    
    $ext = $description->extensions();
    if(defined($ext))
    {
        $ext =~ s/<(.*:)?\bgw\b.*>/<gw>/;
		$ext =~ s/<\/(.*:)?gw>/<\/gw>/;
		$pos = index($ext, "<gw>");
		if ($pos >= 0)
		{
		  $pos += 4;
	      $len = index($ext, "</gw>") - $pos;
		  $sub = substr($ext, $pos, $len);
		  $sub =~s/&quot;/"/g;
		  print JOB $sub, "\n";
		}
    }

    close(JOB);
    chmod 0755, $gw_job_template_name;

    $gw_job_err_name = $self->job_dir() . '/gwsubmit_stderr';
    $self->log("job err is at $gw_job_err_name");

    $self->log("about to submit job");
    $self->nfssync( $gw_job_template_name );
    $self->nfssync( $gw_job_err_name );

	# Get the experiment directory
	$gwdir = (getpwuid($<))[7] . '/.gw_' . $gw_user . '_' . $gw_job_id;
	
	# Depending on how the job was submitted (either by GW or by
	# sending a RSL directly to GRAM) ,GGWs job dir will be 
	# GW's or Globus's one
	
    if (-d "$gwdir")
    {
     #chmod g+w -R, $gwdir;
      @argv = ("cp", "$gw_job_template_name", "$gwdir");
      system(@argv);

      $gwt_path = $gwdir . '/gwt';
    }
    else
    {
      #chmod g+w -R, $gwdir;
      $gwt_path = $gw_job_template_name;
    }

    my $user_home       = $ENV{'HOME'};

	if(defined($user_proxy) && $user_proxy ne "")
	{
	    my $globus_location = $ENV{'GLOBUS_LOCATION'};

	    # The delegated proxy is saved always in HOME/.globus/gwdelegcred
   
	    my $proxy_time_left = `$globus_location/bin/grid-proxy-info  -timeleft -f $user_home/.globus/gwdelegcred.pem  2> /dev/null`;

	    if($proxy_time_left == "-1")
	    {
		$proxy_time_left = "no";
	    }
	    else
	    {
	 	$proxy_time_left = "yes";
	    }

	     # We copy the new delegated proxy if:
	     #     1.- The old one doesn't exist
	     #     2.- The old one has expired

	    `if [ -f $user_home/.globus/gwdelegcred.pem ];  then if [ "x$proxy_time_left" = "xno" ]; then cp $user_proxy $user_home/.globus/gwdelegcred.pem; fi; else cp $user_proxy $user_home/.globus/gwdelegcred.pem; fi`; 
     
	     # We add the info in $HOME/.gwrc
   
	    `if [ ! -f $user_home/.gwrc ];  then touch $user_home/.gwrc; fi`;
	    `grep -v "X509_USER_PROXY=" $user_home/.gwrc > $user_home/.gwrc.tmp`;
	    `echo "X509_USER_PROXY=$user_home/.globus/gwdelegcred.pem" >> $user_home/.gwrc.tmp`;
	    `mv $user_home/.gwrc.tmp $user_home/.gwrc`;
	}

    $job_id = `$gwsubmit -v -t $gwt_path 2> $gw_job_err_name`;

    # Check gwsubmit exit code to see if there were any problems

    if($? == 0)
    {
		($trash,$id) = split (": ",$job_id);
        
		chomp($id);
		$self->log('++++++++++++++++++++++++++++++ END GW SUBMIT +++++++++++++++++++++++++++++++++++');
		return 
		{
		   JOB_ID => $id,
		   JOB_STATE => Globus::GRAM::JobState::PENDING
		};
    }
    else
    {
        $self->log("job submission failed, checking $gw_job_err_name");

        my $stderr;
        local(*ERR);
        $self->nfssync( $gw_job_err_name );
        open(ERR, "<$gw_job_err_name");
        local $/;
        $stderr = <ERR>;
        close(ERR);

        open(ERR, '>' . $description->stderr());
        print ERR $stderr;
        close(ERR);

        $stderr =~ s/\n/ /g;

        $self->respond({ GT3_FAILURE_MESSAGE => $stderr });
    }
    $self->log('++++++++++++++++++++++++++++++ END GW SUBMIT +++++++++++++++++++++++++++++++++++');
    return Globus::GRAM::Error::JOB_EXECUTION_FAILED;
}


sub poll
{
    # The GridWay gwps command is used to obtain the current
    # status of the job. This status is then returned.
    #
    # The Status field can contain one of the following strings:
    #
    # string   stands for                                   Globus context meaning
    # ------------------------------------------------------------------------------
    # pend     waiting for a resource to run on             PENDING
    # wrap     executing the Wrapper                        ACTIVE
    # prew     preparation tasks in the remote resource     ACTIVE
    # prol     preparing the remote system                  ACTIVE
    # epil     finalizing the remote system                 ACTIVE
    # zomb     the job is done                              DONE
    # fail     the job failed                               FAILED
    # migr     migrating from one resource to another       SUSPENDED
    # stop     the job is stopped                           SUSPENDED
    # hold     the job execution is delayed                 SUSPENDED

    my $self = shift;
    my $description = $self->{JobDescription};
    my $job_id = $description->jobid();
    my $state;

    $self->log("+++++++++++++++++++++++++++++++ polling job $job_id ++++++++++++++++++++++++++++++");
    $_ = ($self->pipe_out_cmd($gwps, '-n', $job_id))[0];

    $self->log("+++++++++++++++++++++++++ gwps: $_ +++++++++++++");

    @array = split(/\ +/, $_);

    $gws = $array[4];
    $self->log("gwps job state is: $gws");
	
    if ($gws eq "pend")
    {
	$state = Globus::GRAM::JobState::PENDING;
    }
    elsif(($gws eq "wrap")||($gws eq "prol")||($gws eq "epil")||($gws eq "prew"))
    {
        $state = Globus::GRAM::JobState::ACTIVE;
    }
    elsif($gws eq "zomb")
    {
	system("$gwkill $job_id >/dev/null 2>/dev/null");
        $state = Globus::GRAM::JobState::DONE;
    }
    elsif($gws eq "fail")
    {
        $state = Globus::GRAM::JobState::FAILED;
    }
    elsif(($gws eq "migr")||($gws eq "stop")||($gws eq "hold"))
    {
        $state = Globus::GRAM::JobState::SUSPENDED;
    }
    else
    {
        $self->log("gwps returned an unknown response.  Telling JM to ignore this poll");
        return {};
    }

    return {JOB_STATE => $state};
}

sub cancel
{
    my $self = shift;
    my $description = $self->{JobDescription};
    my $job_id = $description->jobid();

    $self->log("cancel job $job_id");
    system("$gwkill $job_id >/dev/null 2>/dev/null");

    if($? == 0)
    {
		return 
		{ 
			JOB_STATE => Globus::GRAM::JobState::FAILED 
		};
    }
    return Globus::GRAM::Error::JOB_CANCEL_FAILED();
}

1;

