|
|
The RDP Tutorial
|
|
This tutorial takes the reader through the analytical functions of
the Ribosomal Database Project (RDP). It assumes basic computer skills
and at least advanced undergraduate standing in molecular biology and
evolution. The tutorial text opens in a separate window when the Begin
Tutorial button is clicked. The window you are now reading (the main
browser window) is used by the tutorial pages to present the
analytical functions of the RDP. For example, if you select Sequence
Match from the tutorial Table of Contents (TOC), the tutorial window
advances to the selected section and the main browser window advances
to the Sequence Match page of the RDP. Navigation buttons are provided
at the top and bottom of each tutorial page. The following
introductory pages describe sequence analysis at the RDP.
|
|
|
Structure of the RDP
|
|
Before we begin the tutorial, let's look at the structure of the
RDP. The RDP was initially designed as a repository of rRNA sequences
for use by investigators interested in phylogeny as well as rRNA
structure and function. To that end, rRNA sequences were deposited
along with annotation that included the source of the sequence and
references to the literature. The sequences are currently stored in
flat text files that can be queried in many different ways. More
importantly, the sequences are curated by the RDP, meaning that when a
new sequence is added to the RDP, it is aligned to all of the
sequences in the current database release and the references are
checked. The alignment is based on models of secondary and tertiary
structure of rRNA derived from chemical, physical, and comparative
sequence analyses (see Figure 1). Keep in mind that no
alignment is perfect (see below).
|
|
Figure 1. Secondary structure model for E. coli 16S
|
|
|
Because sequences are added to databases on a daily basis, there are
two files maintained by the RDP, one containing aligned sequences and
one containing newly deposited sequences that have yet to be
aligned. The unaligned sequences are included in the RDP analysis
functions wherever possible because they represent a significant
collection of information. In some instances, however, the nature of
the query demands that only the aligned sequence database be
addressed. The general structure of the RDP, presented in Figure 2,
depicts a user addressing the database through the provided analytical
functions. The two diagrammed queries address the RDP through either
the aligned database (e.g., the Sequence_Align function) or the
complete database, which includes both aligned and unaligned sequences
(e.g., the Sequence_Match function).
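The split between the aligned and the complete database can be pictured with a small sketch. It is illustrative only: the `Record` type, the `searchable` helper, and the toy records are hypothetical and do not describe how the RDP actually stores or queries its flat files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """A hypothetical database entry (not the RDP's real record layout)."""
    name: str
    sequence: str                  # raw, unaligned sequence
    aligned: Optional[str] = None  # gapped sequence; None until curated


def searchable(records: List[Record], aligned_only: bool) -> List[Record]:
    """Return the records a query may address.

    aligned_only=True models a function that needs the alignment;
    aligned_only=False models a function that can use every sequence.
    """
    if aligned_only:
        return [r for r in records if r.aligned is not None]
    return list(records)


# Toy data, invented for illustration.
db = [
    Record("curated_entry", "ACGUACGU", aligned="ACGU-ACGU"),
    Record("new_deposit", "ACGUUCGU"),  # not yet aligned
]

print([r.name for r in searchable(db, aligned_only=True)])
print([r.name for r in searchable(db, aligned_only=False)])
```

In this picture, an alignment-dependent function sees only the curated entry, while a function that works on raw sequences sees both.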
|
|
Figure 2. The structure of the RDP at Michigan State
|
|
|
|
Entering the Online Analyses pages of the RDP
|
|
All analytical functions of the RDP are accessed by clicking the
Online Analyses button on the RDP homepage
(Figure 3). Doing so takes the user to a page that lists the
analytical functions available online.
|


|
Figure 3. Entering the Analyses Page of the RDP
|
|
|
|
Sequence Analysis at the RDP
|
The Online Analyses functions currently available at the RDP are
shown in Figure 3. The first page of Online Analyses presents a
hyperlinked list of functions, with buttons linking to a detailed
description of each function (the info buttons) and to the work page
of each analysis function. The database that each analysis function
accesses is also indicated. As described above, some functions address
only the aligned database, while others can use unaligned
sequences. In the latter case, all sequences currently stored in the
database are queried, both those that are aligned and those that have
yet to be brought into alignment. The analysis functions are described
briefly below.
-
Probe Match searches the complete database for sequences that
match a user-provided oligonucleotide sequence (probe).
-
Sequence Match searches the complete database for the sequences
that most closely match a user-provided sequence.
-
Sequence Aligner searches the aligned database for sequences
that most closely match a user-provided sequence and aligns the
submitted sequence with the identified closely related sequences.
-
Similarity Matrix searches the aligned database for sequences
that most closely match a user-provided sequence, aligns the submitted
sequence with the related sequences, creates a mask (filter) that
eliminates all ambiguous positions, and calculates a similarity matrix
based on "identity" of characters at the unambiguous positions.
-
Chimera Check attempts to find a "breakpoint" in a user-provided
sequence at which the phylogenies inferred from the regions on either
side of the point differ.
-
Alignment Slices permits the user to specify a region of the
16S rRNA alignment to download.
-
T-RFLP constructs a similarity matrix for T-RFLP profiles
provided by the user.
-
TAP-TRFLP performs an in silico digestion of the entire database and
determines the sizes of the terminal restriction fragments generated
from a user-provided primer sequence and one or more user-selected
restriction enzymes.
-
Sub-Trees constructs a graphical Java display of phylogenetic trees
provided by either the RDP (17 provided) or the user.
-
Sequence Selection enables the user to select specific sequences or
entire phylogenetic groups to download.
-
Phylip Interface allows the user to construct UPGMA or
Neighbor-Joining trees through a graphical interface to Phylip
3.5c. Both user-supplied sequences and RDP-selected sequences can be
incorporated into the analysis. Matrices and trees can be downloaded
in several formats.
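To make the Similarity Matrix description concrete, here is a minimal sketch of the masking and identity steps. It assumes the sequences are already aligned to equal length and treats any character other than A, C, G, or U as ambiguous. The function names and the toy alignment are hypothetical, and the RDP's own mask may be defined differently.

```python
from itertools import combinations
from typing import List

UNAMBIGUOUS = set("ACGU")  # assumption: gaps and codes such as N are masked out


def build_mask(aligned: List[str]) -> List[int]:
    """Return the column indices where every sequence has an unambiguous base."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [i for i in range(length)
            if all(seq[i] in UNAMBIGUOUS for seq in aligned)]


def similarity_matrix(aligned: List[str]) -> List[List[float]]:
    """Pairwise fraction of identical characters over the masked columns."""
    mask = build_mask(aligned)
    if not mask:
        raise ValueError("no unambiguous columns left after masking")
    n = len(aligned)
    matrix = [[1.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        same = sum(aligned[a][i] == aligned[b][i] for i in mask)
        matrix[a][b] = matrix[b][a] = same / len(mask)
    return matrix


# Toy alignment, invented for illustration.
seqs = ["ACGU-ACGN", "ACGUAACGA", "ACCU-ACGA"]
for row in similarity_matrix(seqs):
    print(["%.3f" % value for value in row])
```

Columns containing a gap or an ambiguity code in any sequence are dropped for every pair, so all entries of the matrix are computed over the same set of positions.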
|
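The in silico digestion behind TAP-TRFLP can be sketched in a few lines. The sketch assumes exact primer matching on the forward strand of a DNA sequence and measures the fragment from the 5' end of the primer to the first cut site downstream. The helper name and the toy sequence are hypothetical, and the RDP implementation may handle mismatches, strands, and multiple enzymes differently.

```python
from typing import Optional

# Recognition site and cut offset (bases of the site kept on the 5' fragment).
ENZYMES = {
    "MspI": ("CCGG", 1),  # C^CGG
    "HhaI": ("GCGC", 3),  # GCG^C
}


def terminal_fragment_length(sequence: str, primer: str,
                             site: str, cut_offset: int) -> Optional[int]:
    """Length from the primer's 5' end to the first cut downstream of it.

    Returns None if the primer or the restriction site is not found.
    """
    start = sequence.find(primer)
    if start == -1:
        return None
    hit = sequence.find(site, start + len(primer))
    if hit == -1:
        return None
    return hit + cut_offset - start


# Toy sequence, invented for illustration: two leading bases, the primer,
# then a stretch containing one MspI site and one HhaI site.
primer = "AGAGTTTGATCCTGGCTCAG"
sequence = "TT" + primer + "AACCGGTTACGCGCAA"

for name, (site, offset) in ENZYMES.items():
    print(name, terminal_fragment_length(sequence, primer, site, offset))
```

Running the same function over every sequence in a database, keyed by sequence name, would give a table of predicted terminal fragment sizes per enzyme.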
|
|
The Design of the Tutorial
|
|
Figure 4. Flow chart of RDP Tutorial
|
|
|
The order in which analysis functions are presented in this tutorial
reflects a typical scenario for investigators in the
field. Frequently, a microorganism is isolated and its rDNA is cloned
and sequenced. The investigator then connects to the RDP and compares
the new sequence with sequences in the database in order to identify
the closest phylogenetic relative. This working scenario will guide us
through the analytical functions of the RDP. The series of steps in
this scenario is presented in Figure 4. Note that the first analysis
function discussed is Chimera Check. This is the logical starting
point, inasmuch as an initial triage that eliminates chimeras
conserves time and effort. The remaining analysis functions are
described in the order presented in Figure 4.
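The idea of a chimera "breakpoint" can be illustrated with a toy sketch. This is not the RDP's Chimera Check algorithm. It assumes the query and the reference sequences are already aligned to the same length, scores fragments by simple identity rather than by inferred phylogeny, and uses hypothetical function names and invented sequences.

```python
from typing import Dict, Optional, Tuple


def identity(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_match(fragment: str, references: Dict[str, str],
               start: int) -> Tuple[str, int]:
    """Reference most identical to the fragment over the same columns."""
    end = start + len(fragment)
    scores = {name: identity(fragment, seq[start:end])
              for name, seq in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


def most_likely_breakpoint(query: str, references: Dict[str, str],
                           min_fragment: int = 4
                           ) -> Optional[Tuple[int, str, str]]:
    """Best split point whose two sides match different references.

    Returns (position, left_reference, right_reference), or None if both
    sides prefer the same reference at every candidate position.
    """
    best = None
    for point in range(min_fragment, len(query) - min_fragment + 1):
        left_name, left_score = best_match(query[:point], references, 0)
        right_name, right_score = best_match(query[point:], references, point)
        if left_name == right_name:
            continue
        total = left_score + right_score
        if best is None or total > best[0]:
            best = (total, point, left_name, right_name)
    return None if best is None else best[1:]


# Toy references and an artificial chimera, invented for illustration.
refs = {"parent_1": "ACGTACGTACGTACGT", "parent_2": "TTGGCCAATTGGCCAA"}
chimera = refs["parent_1"][:8] + refs["parent_2"][8:]

print(most_likely_breakpoint(chimera, refs))
print(most_likely_breakpoint(refs["parent_1"], refs))
```

A sequence flagged this way would be set aside before the remaining analyses, which is the triage role the scenario above assigns to Chimera Check.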
|
|
|
|