Data transfer expectations

A common question that comes up when our Jisc network performance team talks with researchers and other people at Janet-connected sites is whether it’s realistic to expect to be able to move a certain amount of data over the network in a certain period of time. What are reasonable expectations?

You can come at this question from at least two perspectives.

For the researcher or research community wanting to move data, how much data is being generated or transferred? Over what period of time? If data is being sent to a remote compute facility to be returned for visualisation, what is an acceptable turnaround time? Do whole copies of data sets need to be moved, or just incremental changes or differences? And to what extent will that requirement grow as the project progresses? In general, a ballpark figure is enough to understand at least the order of magnitude of throughput required, but the more accurate the estimate, the better.

Ultimately the question is, given the volume of data, and the required time, what data rate do I need to achieve?
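As a rough illustration of that calculation, here is a minimal Python sketch. The function name is hypothetical, and it assumes decimal units (1TB = 10^12 bytes, 1Gbit/s = 10^9 bits per second) and a perfectly steady transfer with no protocol overhead.

```python
def required_rate_gbps(volume_tb: float, hours: float) -> float:
    """Average rate in Gbit/s needed to move volume_tb terabytes in the given hours.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbit/s = 1e9 bits per second.
    """
    bits = volume_tb * 1e12 * 8      # bytes -> bits
    seconds = hours * 3600
    return bits / seconds / 1e9      # bits per second -> Gbit/s


if __name__ == "__main__":
    # 1 TB in one hour
    print(f"{required_rate_gbps(1, 1):.2f} Gbit/s")  # prints "2.22 Gbit/s"
```

Real transfers rarely hold a perfectly steady rate, so it is sensible to treat the result as a floor rather than a target.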

A second perspective lies with a campus IT team, who will have established a certain capacity of connectivity to Janet. A typical university may have a 10Gbit/s link, with a second link as a backup. Campus connectivity to the server to/from which files need to be moved might be 10Gbit/s, but it may be 1Gbit/s. The best data rate available will be that of the lowest capacity link on the end-to-end path between the storage servers, in ideal conditions, with no other users moving data over that path.

Does that capacity allow a desired data rate to be achieved, allowing for other users on the network?
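The campus-side check can be sketched in the same way. The link capacities and the `share_available` fraction below are hypothetical values chosen for illustration; in practice the share left over by other users varies over time and would need to be measured.

```python
def bottleneck_gbps(link_capacities_gbps):
    """Best-case end-to-end rate: the lowest-capacity link on the path."""
    return min(link_capacities_gbps)


def is_achievable(required_gbps, link_capacities_gbps, share_available=1.0):
    """True if the required rate fits within the bottleneck capacity.

    share_available is a hypothetical fraction (0 to 1) of the bottleneck
    that is left for this transfer once other users are allowed for.
    """
    return required_gbps <= bottleneck_gbps(link_capacities_gbps) * share_available


if __name__ == "__main__":
    # Hypothetical path: Janet access link, campus core, server connection (Gbit/s)
    path = [10.0, 10.0, 1.0]
    print(bottleneck_gbps(path))        # the 1 Gbit/s server link limits the path
    print(is_achievable(2.2, path))     # 2.2 Gbit/s does not fit through 1 Gbit/s
    print(is_achievable(2.2, [10.0, 10.0, 10.0], share_available=0.5))
```

In the first path the Janet link has plenty of headroom, yet the 1Gbit/s connection to the server sets the ceiling for the whole transfer.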

To ground the question in tangible figures, the following chart, drawn from an article by ESnet, shows the data rate required to move various volumes of data within various periods of time. Or to flip the question, given a certain data rate available on a path, how long would it take at best to move a certain volume of data over that path?

It can be useful to remember specific figures from the chart. For example, to move 1TB in one hour, you need to average just over 2Gbit/s. Extending that, if you could sustain a 2Gbit/s average rate, you should be able to move around 10TB overnight between two sites. If you’re fortunate enough to have a 100Gbit/s link available, as a few Janet campuses now do, you could potentially move 1PB in a day.
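Those figures can be checked with a few lines of Python. This is a sketch under the same assumptions as the chart (ideal conditions, sole use of the link) plus decimal units, with 1TB taken as 10^12 bytes and 1PB as 1000TB; the function name is hypothetical.

```python
def best_case_hours(volume_tb: float, rate_gbps: float) -> float:
    """Best-case time in hours to move volume_tb terabytes at a sustained rate_gbps."""
    bits = volume_tb * 1e12 * 8
    seconds = bits / (rate_gbps * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    examples = [
        (1, 2),       # 1 TB at 2 Gbit/s: a little over an hour
        (10, 2),      # 10 TB at 2 Gbit/s: roughly 11 hours, i.e. overnight
        (1000, 100),  # 1 PB at 100 Gbit/s: a little over 22 hours
    ]
    for volume_tb, rate_gbps in examples:
        hours = best_case_hours(volume_tb, rate_gbps)
        print(f"{volume_tb} TB at {rate_gbps} Gbit/s: {hours:.1f} hours")
```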

The chart of course assumes ideal conditions – that you have sole use of the capacity, your data transfer tools are efficient, your file servers well-tuned, and no stateful firewall/IDS is reducing your data rate. Longer paths, with higher round trip times (RTTs) for packets and the greater potential for packet loss, will also make achieving high rates more challenging, but don’t underestimate what may be possible.
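One way to see why longer paths are harder is the bandwidth-delay product: the amount of data a single stream must keep in flight to fill a path is the data rate multiplied by the RTT. The sketch below is an illustration added here rather than something taken from the ESnet chart, and the RTT values are hypothetical examples, not measurements of any Janet path.

```python
def bdp_megabytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in megabytes (1 MB = 1e6 bytes).

    This is the data that must be in flight, and so buffered by the
    sender, to keep a path of the given rate and RTT full.
    """
    bits_in_flight = rate_gbps * 1e9 * (rtt_ms / 1000)
    return bits_in_flight / 8 / 1e6


if __name__ == "__main__":
    # Hypothetical RTTs for a shorter and a longer path, both at 10 Gbit/s
    for rtt_ms in (10, 100):
        print(f"10 Gbit/s at {rtt_ms} ms RTT: {bdp_megabytes(10, rtt_ms):.1f} MB in flight")
```

If the buffers on the end systems are smaller than this figure, a single stream cannot fill the path however much capacity is available, which is one reason well-tuned file servers matter.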

The ideal conditions the chart assumes don’t always hold in practice, which is why in recent years the Science DMZ model, as documented by ESnet, has proven popular. It’s the same model that the CERN experiments, including GridPP in the UK, have evolved from experience over the years. You can read more about that in our advice and guidance page on large scale data transfers over Janet.

If you’re looking to move large volumes of data over Janet, please do talk to our network performance team, reachable via netperf@jisc.ac.uk, or ask your Jisc relationship manager to put you in touch. And don’t be disheartened if you’ve tried moving data and seen poor results – there’s a good chance that your experience will be so much better if researchers, campus IT teams and Jisc work together to identify the bottlenecks and implement improvements. Your relationship manager should also be able to help with discussions on the capacity of your campus Janet network connectivity should you feel it needs to be increased.

The NetPerf team can advise on and assist with the use of our network performance test facilities, which include iperf servers, data transfer nodes, and perfSONAR servers connected to the Janet backbone at 10G and 100G capacities. We also have monthly Research Network Engineering calls which you can join, open to researchers, campus IT staff, or anyone interested – see the RNE community page to find out more and access past materials via our community Teams area.

If in doubt, please get in touch; the Jisc NetPerf team is here to help.
