The University of Georgia has signed
a five-year, $3 million subcontract to develop a database that
will contain comprehensive information about some pathogens on a biodefense
priority list established by the National Institute of Allergy and
Infectious Diseases.
The subcontract teams UGA with the University of Pennsylvania to develop
a “virtual database” that serves as a single access point
to genomic and related information about parasites in the phylum Apicomplexa,
which includes organisms that cause malaria and toxoplasmosis.
NIAID, part of the National Institutes of Health, awarded a total
of eight contracts in 2004 to establish national Bioinformatics Resource
Centers, including the Penn-UGA award.
Jessica Kissinger, assistant professor of genetics and a member of
UGA’s Center for Tropical and Emerging Global Diseases, is the
principal investigator for UGA; co-principal investigators are Eileen
Kraemer, associate professor of computer science, and John A. Miller,
professor of computer science.
NIH has funded many genome-sequence projects over the past decade,
including projects on more than 50 organisms that either are considered to
be biothreats or are related to emerging or re-emerging infectious
diseases. Once a genome is sequenced, a database project must be developed
to provide access to the data and provide tools to analyze it.
“You have to be able to read the sequenced genome, use
it, learn it and study it,” Kissinger says. “Few of the
genome projects had a database project built into the original sequence
proposal.”
Tools already have been developed to facilitate database construction
for single organisms. However, as more genomes are sequenced, additional
information can be gathered by comparing one genome to another.
Existing apicomplexan databases do not provide access to
information about multiple organisms, making comparisons difficult.
Simultaneous access to information about multiple pathogens may accelerate
development of new vaccines, diagnostics and therapeutics.
Also, scientists want immediate access to as much information as possible
about a pathogen in the event of a sudden disease outbreak, Kissinger
says.
The UGA-Penn team plans to develop a database that links existing
databases for Plasmodium species,
the causative agents of malaria; Toxoplasma
gondii, a widespread parasite that is dangerous for pregnant
women and immunosuppressed individuals; and Cryptosporidium
parvum, a common intestinal parasite that is also dangerous
for the immunosuppressed.
“We could make the database for these organisms by collecting
all of the data together in one location,” Kissinger says. “I
call that the vacuum-cleaner approach. But there’s too much
data to suck it all up, so new approaches are needed.”
Instead, the UGA team will use a relatively new technology called
“Web services” to link the multiple databases. Web services
technology allows one database to send queries to, and receive results
from, another database over the Web.
“In fact there will be multiple separate databases,” Kraemer
says. “But with this Web service layer that will go on top of
them, users will have the illusion that there is one database. They’ll
be able to ask a single question that applies to multiple databases
and get a response.”
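The idea Kraemer describes, one question fanned out to several separate databases and merged into a single answer, can be illustrated with a minimal Python sketch. It is not the project's implementation: the three in-memory lists stand in for separately hosted databases, and every name, record and gene identifier in it is hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical stand-ins for three separately maintained organism databases.
# The gene identifiers and products below are invented for illustration.
PLASMODIUM_DB: List[Record] = [
    {"gene_id": "pf_0001", "product": "kinase"},
    {"gene_id": "pf_0002", "product": "transporter"},
]
TOXOPLASMA_DB: List[Record] = [
    {"gene_id": "tg_0001", "product": "kinase"},
]
CRYPTOSPORIDIUM_DB: List[Record] = [
    {"gene_id": "cp_0001", "product": "protease"},
]


def make_service(records: List[Record]) -> Callable[[str], List[Record]]:
    """Wrap one database behind a uniform search function (the 'service')."""
    def search(product: str) -> List[Record]:
        return [r for r in records if r["product"] == product]
    return search


# The service layer: each database is reached only through its service.
SERVICES: Dict[str, Callable[[str], List[Record]]] = {
    "Plasmodium": make_service(PLASMODIUM_DB),
    "Toxoplasma": make_service(TOXOPLASMA_DB),
    "Cryptosporidium": make_service(CRYPTOSPORIDIUM_DB),
}


def federated_search(product: str) -> List[Record]:
    """Send one question to every service and merge the answers."""
    merged: List[Record] = []
    for organism, search in SERVICES.items():
        for record in search(product):
            # Tag each hit with the database it came from.
            merged.append({"organism": organism, **record})
    return merged


if __name__ == "__main__":
    for hit in federated_search("kinase"):
        print(hit["organism"], hit["gene_id"])
```

In a real deployment each service would be a network call to a remote database rather than a local function, but the user-facing shape is the same: a single query and a single merged result.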
The database for Plasmodium is already well advanced; for the other
parasites, sequence data is just becoming available. “Penn
is largely producing the infrastructure and tools to store and analyze
all the different data types,” Kraemer says. “We are working
to produce the Web services infrastructure that will sit on top of
that and allow the multiple separate databases to be linked.”
Each UGA collaborator contributes special skills to the project: Miller
has the Web service expertise, Kraemer has user interface and visualization
background, and Kissinger is expert in the use of molecular and computational
tools to study parasite genomes. “There’s a tremendous
amount of research involved in how to integrate the data—how
do you link this to that. It’s actually very hard,” Kissinger
says. “But, that said, once you figure how to solve the problem,
you have to put it on the Web, make it public, make it work and keep
it all going. That’s a huge effort.”
The other NIAID awardees are developing databases that provide access
to information about other pathogens and their vectors, including bacteria that cause
anthrax, plague, and water- and food-borne diseases; viruses that cause
rabies, Ebola and influenza; and vectors such as mosquitoes.
“We hope some of the technologies that we develop to link our
databases will be useful for linking to the other databases created
by the eight Bioinformatics Resource Centers,” Kissinger says.