NEW INTERNET PROTOCOL SETS MILESTONE FOR FAST
AND FRIENDLY TRANS-ATLANTIC
DATA TRANSPORT
transmission today by researchers at the
(UIC) who demonstrated the practicality of
transferring even very large
data sets over high-speed production
networks.
UIC's
Computing flashed a set of astronomical data
across the
gigabits per second --- 6800 times faster
than the 1 megabit per second
effective speed at which most companies
connect to the internet.
In the test, 1.4 terabytes of astronomical
data was transmitted from
the NCDM at the
the same amount of data using TCP,
which is the standard used
on the internet today for data transfers,
would take 25 days.
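The figures quoted in this release can be checked with simple
arithmetic. The sketch below assumes decimal units (1 terabyte =
10^12 bytes, 1 gigabit = 10^9 bits); the function names are
illustrative only. It computes the idealized time to move 1.4
terabytes at a sustained 6.8 gigabits per second, the average rate
implied by a 25 day transfer, and the ratio between 6.8 gigabits per
second and 1 megabit per second. The computed transfer time is an
idealized figure, not a measurement from the test.

```python
# Back-of-the-envelope check of the figures quoted in the release.
# Assumption: decimal units. Function names are hypothetical.

TERABYTE_BYTES = 10**12
GIGABIT_BITS = 10**9
MEGABIT_BITS = 10**6


def transfer_seconds(size_bytes, rate_bits_per_second):
    """Idealized transfer time at a perfectly sustained rate."""
    return size_bytes * 8 / rate_bits_per_second


def implied_rate_bps(size_bytes, seconds):
    """Average rate implied by moving size_bytes in the given time."""
    return size_bytes * 8 / seconds


data_bytes = 1.4 * TERABYTE_BYTES
udt_rate_bps = 6.8 * GIGABIT_BITS

# Idealized time at 6.8 Gb/s, in minutes.
udt_minutes = transfer_seconds(data_bytes, udt_rate_bps) / 60

# Average rate implied by the 25 day TCP estimate, in Mb/s.
tcp_rate_mbps = implied_rate_bps(data_bytes, 25 * 24 * 3600) / MEGABIT_BITS

# Ratio of 6.8 Gb/s to a 1 Mb/s connection.
speedup = udt_rate_bps / (1 * MEGABIT_BITS)

print(f"1.4 TB at 6.8 Gb/s: about {udt_minutes:.0f} minutes")
print(f"Rate implied by 25 days: about {tcp_rate_mbps:.1f} Mb/s")
print(f"6.8 Gb/s vs 1 Mb/s: about {speedup:.0f}x")
```

Under these assumptions the 25 day estimate corresponds to an
average TCP rate of roughly 5 megabits per second.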
Moving large data sets over the internet
faces several hurdles:
First, the network infrastructure for long
distance 1 Gigabit per second
and 10 Gigabit per second network links is
still maturing, and software that
can use this infrastructure is just being
developed. The UIC computer
clusters used for the test were connected to
the SURFnet network in
the quality and power of these, two of the
world's leading research
networks.
Second, today's predominant network protocol,
TCP, is not effective at
moving massive data over long distances. UDP, another network protocol
that is also widely deployed, cannot reliably
transport data (some data may
be lost) and is not friendly to other flows
(using it for large data
transfers can starve other network
traffic). Currently, efforts are
underway to improve TCP, to develop new
protocols to replace TCP, and/or to
develop protocols on top of TCP and UDP that
are effective for high
performance data transport.
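One commonly cited reason a window-based protocol such as TCP
struggles on long, fast links is the bandwidth-delay product: to
keep a link full, the sender needs roughly (rate x round-trip time)
of data in flight. The sketch below is a rough illustration, not an
analysis of the test itself. The 100 millisecond round-trip time is
an assumed value for a trans-Atlantic path, and the function names
are hypothetical. The 65,535 byte window is the largest a TCP
connection can advertise without the window scale option.

```python
# Bandwidth-delay product sketch.
# Assumption: 100 ms round-trip time (not a figure from the test).

ASSUMED_RTT_SECONDS = 0.100
CLASSIC_TCP_WINDOW_BYTES = 65535  # maximum without window scaling


def window_bytes_needed(rate_bps, rtt_seconds):
    """Data that must be in flight to keep the link full."""
    return rate_bps * rtt_seconds / 8


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput for a fixed window size."""
    return window_bytes * 8 / rtt_seconds


needed = window_bytes_needed(6.8e9, ASSUMED_RTT_SECONDS)
capped = max_throughput_bps(CLASSIC_TCP_WINDOW_BYTES, ASSUMED_RTT_SECONDS)

print(f"Window needed for 6.8 Gb/s: about {needed / 1e6:.0f} MB")
print(f"Cap with a 64 KB window: about {capped / 1e6:.1f} Mb/s")
```

With these assumed numbers, an unscaled window limits a single
flow to about 5 megabits per second no matter how fast the
underlying link is, while filling a 6.8 gigabit per second path
would need tens of megabytes in flight.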
To overcome these problems, in the past, high
speed data transfers of very
large data sets have used special purpose
research networks and employed
specialized data protocols that in practice
did not allow other network
traffic to share the same link.
Friday's test run used a new network protocol
called UDP-based Data
Transport or UDT, which was developed by the
Mining at the
protocols now being studied for high speed
data transfer, UDP-based
protocols can be used over today's Internet
without making changes to the
network infrastructure. Today's demonstration showed not only that
UDT was fast, but also that it was friendly and could
effectively coexist with thousands of other network connections.
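To illustrate how reliability can be layered on top of an
unreliable datagram service such as UDP, the sketch below simulates
a lossy channel and a sender that retransmits whatever the receiver
reports as missing. This is a generic technique (sequence numbers
plus retransmission of reported losses), not UDT's actual
algorithm, and it does not model the rate control that makes a
protocol friendly to other flows. All names are hypothetical, and
loss reports are assumed to arrive intact.

```python
import random


def lossy_send(packets, loss_rate, rng):
    """Simulate an unreliable channel: each packet may be dropped."""
    return [p for p in packets if rng.random() >= loss_rate]


def reliable_transfer(chunks, loss_rate=0.2, seed=0, max_rounds=1000):
    """Deliver all chunks in order despite random loss.

    Returns the reassembled bytes and the number of send rounds used.
    """
    rng = random.Random(seed)
    received = {}
    missing = set(range(len(chunks)))
    rounds = 0
    while missing:
        if rounds >= max_rounds:
            raise RuntimeError("gave up: too many retransmission rounds")
        rounds += 1
        # Sender transmits every chunk currently reported missing,
        # each tagged with its sequence number.
        sent = [(seq, chunks[seq]) for seq in sorted(missing)]
        for seq, payload in lossy_send(sent, loss_rate, rng):
            received[seq] = payload
        # Receiver reports which sequence numbers it still lacks.
        missing = {seq for seq in range(len(chunks)) if seq not in received}
    data = b"".join(received[seq] for seq in range(len(chunks)))
    return data, rounds


chunks = [b"astro", b"nomical", b" data"]
data, rounds = reliable_transfer(chunks, loss_rate=0.5, seed=1)
print(data, "delivered in", rounds, "round(s)")
```

Because this logic lives entirely in the sending and receiving
applications, a scheme of this kind needs no changes to routers or
other network infrastructure.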
The demonstration is part of an ongoing
international effort to find and
test new ways of reliably moving massive data
sets around the globe using
advanced networks and new data transfer
protocols. Such systems hold
enormous promise for advancing scientific
research, in addition to numerous
commercial applications. Today, although it is becoming common for
global
businesses to have important data in different
cities, it is still quite
difficult to integrate this data to create a
common view.
"Using UDT, it is now practical for the
first time to move even massive
data sets over very long distances in a
friendly fashion using today's
networks," said Robert Grossman,
Director of UIC's
Mining and President of Open Data Partners.
UDT is currently being used by several
international research
projects.
UDT is used by the OptIPuter, a research project developing next
generation computing infrastructures based
upon advanced photonics. UDT
also plays a role in research projects
developing high performance web
services, something that is required in order
to scale today's web services
to large remote and distributed data sets.
UDT is used as the network transport layer in
the joint University of
Illinois/Northwestern project on Photonic Data
Services (PDS), which is
developing open source data services for next
generation photonic networks,
such as the OptIPuter. The OptIPuter is an example of what are
sometimes
called lambda grids, distributed computing
infrastructures in which
applications can set up their own photonic
paths (lambdas) supporting data
transport at Gigabit per second speeds and
higher.
"Moving data at 6.8 Gigabits per second
across the
important milestone for the OptIPuter Project
and brings us a bit closer to
effective data management over lambda
grids," said Larry Smarr, Principal
Investigator of the OptIPuter Project and
Director of the
Institute for Telecommunications and
Information Technology, a UC San
Diego/ UC Irvine partnership.
UDT is also being used as one of the layers
of a UIC project called Open
DMIX (for Data Mining, Data Integration, and
Data Exploration), which is
developing open source high performance web
services for data mining.
"Using UDT and the scalable data mining
and data integration web services
built on top of it may emerge as an important
enabling technology for the
grid computing required for next generation
virtual observatories,"
according to Alex Szalay, Alumni Centennial
Professor in the Department of
Physics and Astronomy at The
The tests were made possible by support from
the following manufacturers
and organizations, who have generously
contributed their equipment,
facilities, and know-how: OMNInet, StarLight,
Nortel, SARA and
CANARIE.
Partial funding for the tests was provided by the National
Science Foundation (Grants 0129609, 9977868
and 0225642) and the University
of
For more information, contact:
Shirley Connelly, Associate Director, NCDM
312 413 2176, connelly@uic.edu.
Robert Grossman, Director, NCDM
312 413 2176, grossman@uic.edu.
The
Chicago (UIC) was established in 1998 to
serve as a national resource for
high performance and distributed data mining.
The Center sponsors research
projects, facilitates standards, operates
testbeds, and provides
outreach.
The Center is coordinating the development of the Predictive
Model Markup Language (PMML), the standard
for statistical and data mining
models, as well as the WS-DMX web services
for data mining and data
exploration standard. The NCDM also operates the Terra Wide Data
Mining
Testbed, a worldwide testbed for high performance
and distributed data
mining. For more information about NCDM, see www.ncdm.uic.edu.
About SURFnet
SURFnet operates and innovates the national
research network in The
the
sustained effort to improve the
infrastructure and to develop new
applications to give users faster and better
access to new Internet
services. Currently SURFnet's network
innovation is funded by the Dutch
government via the GigaPort project. For more
information please visit
About the OptIPuter
The OptIPuter, started in October 2002, is a
five-year, $13.5 million
project funded by the National Science
Foundation. It will enable
scientists who are generating massive amounts
of data to interactively
visualize, analyze and correlate their data
from multiple storage sites
connected to optical networks.
partners at
Information Sciences Institute at
Microsystems, Telcordia Technologies, Inc.
and Chiaro Networks. See