[metrics-dev] What went on with GridFTP yesterday?
Ian Foster
foster at mcs.anl.gov
Wed Oct 18 15:51:51 CDT 2006
Shouldn't the answer be to filter those out? My view is that we shouldn't be
including Argonne traffic in any case.
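
To make the filtering idea concrete, here is a minimal sketch of how it could work at report time. Everything in it is an assumption for illustration: the record layout (a dict with "hostname" and "ip" keys), the names is_internal and filter_external, and the excluded network, which is a placeholder documentation range rather than a real Argonne range. Matching on network as well as domain matters because many of the hosts in question have no reverse DNS entry.

```python
import ipaddress

# Hypothetical exclusion rules. The actual domains and address ranges to
# exclude are not given in this thread; 192.0.2.0/24 is a placeholder.
EXCLUDED_DOMAINS = (".anl.gov",)
EXCLUDED_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def is_internal(record):
    """Return True if a usage record looks like internal test traffic."""
    # Match on the reverse-DNS name when one exists.
    hostname = (record.get("hostname") or "").lower().rstrip(".")
    for domain in EXCLUDED_DOMAINS:
        if hostname == domain.lstrip(".") or hostname.endswith(domain):
            return True
    # Fall back to the address, which also covers hosts without reverse DNS.
    try:
        addr = ipaddress.ip_address(record["ip"])
    except (KeyError, ValueError):
        return False
    return any(addr in net for net in EXCLUDED_NETWORKS)


def filter_external(records):
    """Keep only the records that do not match an exclusion rule."""
    return [r for r in records if not is_internal(r)]


if __name__ == "__main__":
    sample = [
        {"hostname": "gridftp1.mcs.anl.gov", "ip": "198.51.100.7"},
        {"hostname": None, "ip": "192.0.2.17"},
        {"hostname": "login.example.edu", "ip": "198.51.100.9"},
    ]
    for rec in filter_external(sample):
        print(rec["hostname"], rec["ip"])
```

Filtering when the reports are generated, rather than turning reporting off on the test hosts, leaves the raw data in the database in case we later want to look at it.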
At 02:35 PM 10/18/2006 -0500, Ravi Madduri wrote:
>I think we should turn off usage stats when we are doing internal tests. A
>large quantity of useless data will skew the inferences we wish to draw. I
>know how to turn it off on the Java side, but I am not sure about the C side
>of things.
>My 2c.
>
>Lee Liming wrote:
>>Ravi:
>>Yes, that is what it looks like. Turns out I was wrong on at least one
>>point. John was testing both UC -> UC transfers (LAN) and UC -> SDSC
>>transfers (WAN). I'm guessing that the lower transfer rates (10mb/s - 50
>>mb/s, 500kb/s - 1 mb/s) were on the WAN because all of the transfers were
>>single TCP streams.
>>Also, although there were an additional ~80 servers in the "unregistered
>>IP address" category yesterday vs. the day before, John says that he used
>>only a few interactive nodes at SDSC (allocated to him by the local
>>scheduler) and a few nodes at U-C (same). It shouldn't have added up to
>>80 nodes total. So it might actually be that the additional 80 servers
>>were not related to John's testing. (I think it is more likely that they
>>were, and we just haven't figured out how it happened.)
>>FYI, John is apparently running these tests to generate some interesting
>>data on the performance difference resulting from changes made in the
>>development version of the GridFTP code.
>>I'd be interested in your thoughts regarding whether people doing testing
>>like this should be turning usage reporting off or not. Anyone have an opinion?
>> -- Lee
>>
>>On Oct 18, 2006, at 12:05 PM, Ravi Madduri wrote:
>>
>>>Lee
>>>John was testing some control channel command pipelining yesterday, to be
>>>shown as part of an SC demo. I am cc'ing John.
>>>
>>>Lee Liming wrote:
>>>>Ok, this was an interesting test case of what we can (and can't) learn
>>>>using our usage stats data and the current report generators.
>>>>Here's what I can tell from the reports that we currently generate for
>>>>GridFTP.
>>>>SUMMARY: The most likely theory, I think, is that someone is using a
>>>>post-GT4 development version of GridFTP for stress testing purposes on
>>>>a LAN using third-party transfers between a large number of servers
>>>>(somewhere around 80, I suspect).
>>>> -- Lee
>>>>Details:
>>>>The resulting data points were gleaned by comparing the usage profile
>>>>on the 16th (normal usage level) with the 17th (usage spike). This
>>>>comparison is not exact, of course, because the "control" usage (the
>>>>baseline, ordinary usage) likely varied a bit between the two days as well.
>>>>-- Most relevant, I think: The GridFTP version used was version 3.2.
>>>>This version is a development version, post GT-4.0. So someone is
>>>>using what is probably the latest development version for testing purposes.
>>>>-- Something we need to follow up on: The data in the database
>>>>suggests that there were actually ~3.3M transfers reported yesterday,
>>>>not the 1.5M that the daily summary report showed. (The daily summary
>>>>report shows packets received, not number of transfers, but I would
>>>>have thought they'd be the same.) I'm not sure why these numbers
>>>>disagree but it's sure to be interesting. A possible clue is that in a
>>>>third-party transfer, both the sending server and the receiving server
>>>>are likely to send reports. But one would think that would be counted
>>>>both in the packets received and in the transfers reported, so it's not
>>>>clear why one is still twice the other.
>>>>-- The transfers were all single stream, single stripe (not striped).
>>>>-- The file sizes were mostly in the 0 - 100kb range, though it
>>>>appears that a smaller set of them (10% or less) were in the 100kb - 1mb range.
>>>>-- A very large number of the transfers (~1.5M) appear to have been
>>>>handled at transfer rates of 100mb/s or higher. A smaller number were
>>>>in the lower bandwidth ranges (10mb/s - 50 mb/s, 500kb/s - 1 mb/s, etc.).
>>>>NOTE: Given that these were all single stripe, single stream transfers,
>>>>it suggests to me that these were transfers on a local area network.
>>>>-- With regard to the servers that were used, the main difference
>>>>between the 17th and the 16th is that on the 17th, there were quite a
>>>>number of transfers between hosts with unregistered IP addresses (no
>>>>reverse DNS entry). On the 16th, there were ~20 servers in this
>>>>category. On the 17th, there were slightly over 100 servers in that category.
>>>>NOTE: This might further support the theory that the "extra" transfers
>>>>were done between hosts on a LAN.
>>>>-- There was a 50/50 breakdown between STOR and RETR requests.
>>>>NOTE: I think it's safe to assume from this that what we're seeing are
>>>>third-party transfers between GridFTP servers, and we're seeing the
>>>>reports from both the sending hosts and the receiving hosts. (So the
>>>>true number of transfers is probably 50% of the total reported, because
>>>>each transfer is reported twice, once by sending host and once by receiving.)
>>>>-- All of the transfers that were reported had return code 226 (which I
>>>>assume is success?), but based on the fact that I've never seen a
>>>>different response code in any of our reports, I suspect that we only
>>>>report successful transfers.
>>>>-- The TCP buffer sizes used were mostly in the 10kb - 100kb range, but
>>>>some--slightly over 500k--were in the 100kb - 1mb range. (Just a
>>>>guess, but I'd bet those were the files that were >100kb in size.)
>>>>The block sizes used were all 100kb - 1mb.
>>>
>>>--Ravi K Madduri
>>>The Globus Alliance | Argonne National Laboratory
>>>http://www-unix.mcs.anl.gov/~madduri
>
>--
>Ravi K Madduri
>The Globus Alliance | Argonne National Laboratory
>http://www-unix.mcs.anl.gov/~madduri
_______________________________________________________________
Ian Foster -- Weblog: http://ianfoster.typepad.com
Computation Institute: www.ci.uchicago.edu & www.ci.anl.gov
Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
Tel: +1 630 252 4619 --- Globus Alliance: www.globus.org