[metrics-dev] What went on with GridFTP yesterday?

Ravi Madduri madduri at mcs.anl.gov
Wed Oct 18 13:05:38 CDT 2006


Lee,
John was testing control channel command pipelining yesterday, to be 
shown as part of an SC demo. I am cc'ing John.

Lee Liming wrote:
> Ok, this was an interesting test case of what we can (and can't) learn 
> using our usage stats data and the current report generators.  Here's 
> what I can tell from the reports that we currently generate for GridFTP.
> 
> SUMMARY: The most likely theory, I think, is that someone is using a 
> post-GT4 development version of GridFTP for stress testing purposes on a 
> LAN using third-party transfers between a large number of servers 
> (somewhere around 80, I suspect).
> 
>           -- Lee
> 
> Details:
> 
> The resulting data points were gleaned by comparing the usage profile on 
> the 16th (normal usage level) with the 17th (usage spike).  This 
> comparison is not exact, of course, because the "control" usage (the 
> baseline, ordinary usage) likely varied a bit between the two days as well.
> 
> -- Most relevant, I think: The GridFTP version used was version 3.2.  
> This version is a development version, post-GT 4.0.  So someone is using 
> what is probably the latest development version for testing purposes.
> 
> --  Something we need to follow up on: The data in the database suggests 
> that there were actually ~3.3M transfers reported yesterday, not the 
> 1.5M that the daily summary report showed. (The daily summary report 
> shows packets received, not number of transfers, but I would have 
> thought they'd be the same.)  I'm not sure why these numbers disagree 
> but it's sure to be interesting.  A possible clue is that in a 
> third-party transfer, both the sending server and the receiving server 
> are likely to send reports. But one would think that would be counted 
> both in the packets received and in the transfers reported, so it's not 
> clear why one is still twice the other.
> 
> --  The transfers were all single stream, single stripe (not striped).
> --  The file sizes were mostly in the 0 - 100kb range, though it appears 
> that a smaller set of them (10% or less) were in the 100kb - 1mb range.
> 
> -- A very large number of the transfers (~1.5M) appear to have been 
> handled at transfer rates of 100mb/s or higher. A smaller number were in 
> the lower bandwidth ranges (10mb/s - 50 mb/s, 500kb/s - 1 mb/s, etc.).
> NOTE: Given that these were all single stripe, single stream transfers, 
> it suggests to me that these were transfers on a local area network.
> 
> -- With regard to the servers that were used, the main difference 
> between the 17th and the 16th is that on the 17th, there were quite a 
> number of transfers between hosts with unregistered IP addresses (no 
> reverse DNS entry).  On the 16th, there were ~20 servers in this 
> category.  On the 17th, there were slightly over 100 servers in that 
> category.
> 
> NOTE: This might further support the theory that the "extra" transfers 
> were done between hosts on a LAN.
> 
> -- There was a 50/50 breakdown between STOR and RETR requests.
> 
> NOTE: I think it's safe to assume from this that what we're seeing are 
> third-party transfers between GridFTP servers, and we're seeing the 
> reports from both the sending hosts and the receiving hosts. (So the 
> true number of transfers is probably 50% of the total reported, because 
> each transfer is reported twice, once by sending host and once by 
> receiving.)
> 
> -- All of the transfers that were reported had return code 226 (which I 
> assume is success?), but based on the fact that I've never seen a 
> different response code in any of our reports, I suspect that we only 
> report successful transfers.
> 
> -- The TCP buffer sizes used were mostly in the 10kb - 100kb range, but 
> some (slightly over 500k transfers) were in the 100kb - 1mb range.  (Just 
> a guess, but I'd bet those were the files that were >100kb in size.)  The 
> block sizes used were all 100kb - 1mb.
> 
> 
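
The double-reporting arithmetic in Lee's NOTE on STOR/RETR can be sketched as
follows. This is only an illustration, not the actual report generator: the
record layout (a dict with a "type" key) and both function names are
hypothetical, and it assumes every third-party transfer is reported once as
RETR by the sending server and once as STOR by the receiving server.

```python
from collections import Counter


def count_by_type(reports):
    """Count usage reports per FTP command type (e.g. STOR, RETR).

    `reports` is an iterable of dicts with a "type" key; this layout is
    a hypothetical stand-in for rows in the usage stats database.
    """
    return Counter(r["type"] for r in reports)


def estimate_transfers(stor_reports, retr_reports):
    """Estimate distinct transfers from STOR and RETR report counts.

    Assumption: a third-party transfer produces one STOR report and one
    RETR report, so matched pairs count once. Any surplus on one side is
    treated as transfers that were reported by a single server only.
    """
    paired = min(stor_reports, retr_reports)
    unpaired = abs(stor_reports - retr_reports)
    return paired + unpaired


if __name__ == "__main__":
    # Made-up sample: three paired transfers plus one lone RETR report.
    sample = [{"type": "STOR"}, {"type": "RETR"}] * 3 + [{"type": "RETR"}]
    counts = count_by_type(sample)
    print(estimate_transfers(counts["STOR"], counts["RETR"]))
```

With an even 50/50 split, the estimate is simply half the report total, so
~3.3M reports would correspond to about 1.65M transfers under these
assumptions.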

-- 
Ravi K Madduri
The Globus Alliance | Argonne National Laboratory
http://www-unix.mcs.anl.gov/~madduri



