[metrics-dev] What went on with GridFTP yesterday?

Lee Liming liming at mcs.anl.gov
Wed Oct 18 12:53:12 CDT 2006


Ok, this was an interesting test case of what we can (and can't)  
learn using our usage stats data and the current report generators.   
Here's what I can tell from the reports that we currently generate  
for GridFTP.

SUMMARY: The most likely theory, I think, is that someone is using a  
post-GT4 development version of GridFTP for stress testing purposes  
on a LAN using third-party transfers between a large number of  
servers (somewhere around 80, I suspect).

           -- Lee

Details:

The resulting data points were gleaned by comparing the usage profile  
on the 16th (normal usage level) with the 17th (usage spike).  This  
comparison is not exact, of course, because the "control" usage (the  
baseline, ordinary usage) likely varied a bit between the two days as  
well.

-- Most relevant, I think: The GridFTP version reported was 3.2, which
is a development version newer than GT 4.0.  So someone is using what
is probably the latest development version for testing purposes.

--  Something we need to follow up on: The data in the database  
suggests that there were actually ~3.3M transfers reported yesterday,  
not the 1.5M that the daily summary report showed. (The daily summary  
report shows packets received, not number of transfers, but I would  
have thought they'd be the same.)  I'm not sure why these numbers  
disagree but it's sure to be interesting.  A possible clue is that in  
a third-party transfer, both the sending server and the receiving  
server are likely to send reports. But one would think that would be
counted both in the packets received and in the transfers reported,
so it's not clear why one is still roughly twice the other.

--  The transfers were all single stream, single stripe (not striped).  

--  The file sizes were mostly in the 0 - 100kb range, though it  
appears that a smaller set of them (10% or less) were in the 100kb -  
1mb range.

-- A very large number of the transfers (~1.5M) appear to have been  
handled at transfer rates of 100mb/s or higher. A smaller number were  
in the lower bandwidth ranges (10mb/s - 50 mb/s, 500kb/s - 1 mb/s,  
etc.).  

NOTE: Given that these were all single stripe, single stream
transfers, and most of them still ran at 100mb/s or higher, it
suggests to me that these were transfers on a local area network.

-- With regard to the servers that were used, the main difference  
between the 17th and the 16th is that on the 17th, there were quite a  
number of transfers between hosts with unregistered IP addresses (no  
reverse DNS entry).  On the 16th, there were ~20 servers in this  
category.  On the 17th, there were slightly over 100 servers in that  
category.

NOTE: This might further support the theory that the "extra"  
transfers were done between hosts on a LAN.
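
The server-count comparison above amounts to subtracting the baseline
day from the spike day. A minimal sketch in Python, using only the
approximate counts quoted in this message; the names are illustrative
and are not part of any report generator:

```python
# Approximate counts of servers with no reverse DNS entry, as quoted above.
# Assumption: "slightly over 100" is rounded down to 100 for this sketch.
unresolved_servers = {"2006-10-16": 20, "2006-10-17": 100}


def extra_over_baseline(counts, baseline_day, spike_day):
    """Return how many more items appear on the spike day than on the baseline day."""
    return counts[spike_day] - counts[baseline_day]


extra = extra_over_baseline(unresolved_servers, "2006-10-16", "2006-10-17")
print(f"Roughly {extra} extra servers on the 17th")
```

This lines up with the "somewhere around 80" estimate in the summary,
subject to the same caveat that the baseline may differ between days.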

-- There was a 50/50 breakdown between STOR and RETR requests.

NOTE: I think it's safe to assume from this that what we're seeing  
are third-party transfers between GridFTP servers, and we're seeing  
the reports from both the sending hosts and the receiving hosts. (So  
the true number of transfers is probably 50% of the total reported,  
because each transfer is reported twice, once by sending host and  
once by receiving.)
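
The double-reporting arithmetic can be sketched as follows. This is a
rough illustration in Python using the approximate figures quoted in
this message; the names are made up for the example, and it assumes
every report belongs to a third-party transfer, which ignores the
ordinary baseline traffic:

```python
# Figures quoted in this message (both approximate).
reports_in_database = 3_300_000   # transfer reports found in the database
packets_in_summary = 1_500_000    # packets received, per the daily summary

# Assumption: every transfer was third-party, so it was reported twice,
# once by the sending server (RETR) and once by the receiving one (STOR).
REPORTS_PER_TRANSFER = 2


def estimate_transfers(report_count, reports_per_transfer=REPORTS_PER_TRANSFER):
    """Estimate distinct transfers from a count of per-server reports."""
    return report_count // reports_per_transfer


estimated = estimate_transfers(reports_in_database)
ratio = reports_in_database / packets_in_summary
print(f"Estimated distinct transfers: {estimated:,}")
print(f"Reports per summary packet:   {ratio:.1f}")
```

With these inputs the estimate is about 1.65M distinct transfers, and
database reports outnumber summary packets by roughly 2.2 to 1. That
does not by itself explain the mismatch with the daily summary.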

-- All of the transfers that were reported had return code 226 (the
FTP reply code for a successfully completed transfer).  Since I've
never seen a different response code in any of our reports, I suspect
that we only report successful transfers.

-- The TCP buffer sizes used were mostly in the 10kb - 100kb range,
but for some transfers (slightly over 500k of them) they were in the
100kb - 1mb range.  (Just a guess, but I'd bet those were the files
that were >100kb in size.)  The block sizes used were all 100kb - 1mb.




