RDP Overview FAQ
Overview
This FAQ contains information related to using the RDP database. Please direct all corrections and additions to RDP-II General Support (rdpstaff@msu.edu). This FAQ can be found on the Web at http://rdp.cme.msu.edu/docs/general_faq.html. Information in this FAQ is primarily based on e-mail received via the RDP-II Support page linked at the bottom of the Online Analysis form.
    Frequently Asked Questions
  1. Q: I am a first year graduate student just starting to work with molecular techniques to look at microbial ecology. I would appreciate an overview of what data and services are available at the RDP website. Currently I am interested in analyzing sequences I may get from PCR primers.
    A: The RDP provides several tools which analyze ribosomal RNA sequences. One important tool is Chimera Detection -- this will suggest whether the PCR product came from two different sequences and is thus a chimeric sequence. Using Sequence Match, Sequence Align, and Similarity Matrix, you can determine which sequences are similar to the ones that you amplified, align them, and calculate a similarity/distance matrix. Using the Phylip Interface, it is possible to generate a corrected distance matrix and a Neighbor-joining or UPGMA tree based on the matrix. It is also possible to edit and download the tree in four formats.
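
    As a rough illustration of what a similarity matrix and a "corrected" distance matrix contain, here is a minimal Python sketch. It assumes the sequences are already aligned to equal length with "-" as the gap character, and it uses the Jukes-Cantor correction as one common example. The function names are hypothetical, and this is not necessarily the calculation performed by Similarity Matrix or the Phylip Interface.

```python
import math

def similarity(a, b):
    """Fraction of identical bases over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x == y for x, y in pairs) / len(pairs)

def jukes_cantor(a, b):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = 1.0 - similarity(a, b)  # observed fraction of differing positions
    if p == 0.0:
        return 0.0
    if p >= 0.75:
        return float("inf")     # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Return {(name1, name2): distance} for every ordered pair of sequences."""
    names = list(seqs)
    return {(m, n): jukes_cantor(seqs[m], seqs[n]) for m in names for n in names}

if __name__ == "__main__":
    aligned = {"seqA": "ACGTACGT", "seqB": "ACGTACGA", "seqC": "AC-TACGT"}
    for pair, dist in distance_matrix(aligned).items():
        print(pair, round(dist, 4))
```

    A matrix like this is the input that a Neighbor-joining or UPGMA method works from when building a tree.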

  2. Q: I would like to submit a partial 16S rRNA sequence (bacterial, ABI traces, 900bp) for alignment and tree reconstruction analyses.
    A: ABI traces (or chromatograms) are not a suitable format for sequence analysis. In the simplest case, submit your sequence as a plain text file; many other file formats are also accepted. Please see our Tips page.

  3. Q: I am working on a project where I need to use FASTA-formatted SSU rRNA sequences. I would really like to obtain the RDP database in this format (not GenBank). Is that available from you? If not, do you know of software which will take the RDP GenBank-formatted unaligned sequences and convert them all into FASTA format?
    A: At present, RDP-II is not able to provide the data in FASTA format. ReadSeq (a program available from Indiana University Dept. of Biology) should be able to do this. The URL for ReadSeq is: http://iubio.bio.indiana.edu/IUBio-Software+Data/molbio/readseq/.
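
    For readers who prefer to script the conversion, the following is a minimal Python sketch rather than a replacement for ReadSeq. It assumes the files follow the usual GenBank flat-file layout (a LOCUS line, an optional DEFINITION, an ORIGIN section holding the sequence, and "//" ending each record). The function names are hypothetical, and anything else in the record is ignored.

```python
def genbank_records(lines):
    """Yield (locus, definition, sequence) for each record in GenBank flat-file lines."""
    locus, definition, chunks, section = None, "", [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("//"):                 # end of record
            if locus is not None:
                yield locus, definition, "".join(chunks)
            locus, definition, chunks, section = None, "", [], None
        elif line.startswith("LOCUS"):
            fields = line.split()
            locus = fields[1] if len(fields) > 1 else "unknown"
            section = "LOCUS"
        elif line.startswith("DEFINITION"):
            definition = line[len("DEFINITION"):].strip()
            section = "DEFINITION"
        elif line.startswith("ORIGIN"):
            section = "ORIGIN"
        elif section == "ORIGIN":
            # Sequence lines carry position numbers and spaces; keep letters only.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
        elif section == "DEFINITION" and line.startswith(" "):
            definition += " " + line.strip()      # continuation of DEFINITION
        elif line and not line.startswith(" "):
            section = None                        # some other top-level keyword

def to_fasta(lines, width=60):
    """Convert GenBank flat-file lines to a FASTA-formatted string."""
    out = []
    for locus, definition, seq in genbank_records(lines):
        out.append(f">{locus} {definition}".rstrip())
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)

if __name__ == "__main__":
    import sys
    # Usage: python gb2fasta.py input.gb > output.fasta
    with open(sys.argv[1]) as handle:
        sys.stdout.write(to_fasta(handle))
```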

  4. Q: I am a Research Associate working on a project where I am studying bacterial diveristy [sic] in forest soils. We have about 900 16S ribosomal partial sequences we are aligning and performing phylogenetic analysis. I would like to learn more about filters and masks (including lane masks and other masks used for diverse bacterial sequences or sequences from a common bacterial division). I would appreciate any information you can provide me or direct me where I could get more information.
    A: Filters and masks are generally acknowledged to be important in phylogenetic analysis, but are seldom discussed beyond the phrase "the analysis was carried out using unambiguously aligned positions" or words to that effect. The following references may be of some help.

    Olsen GJ, Woese CR. Ribosomal RNA: a key to phylogeny. FASEB J 1993 Jan;7(1):113-23

    James BD, Olsen GJ, Pace NR. Phylogenetic comparative analysis of RNA secondary structure. Methods Enzymol 1989;180:227-39

    Olsen GJ. Phylogenetic analysis using ribosomal RNA. Methods Enzymol 1988;164:793-812

    "Molecular Systematics" edited by Hillis et al. (1996, Sinauer Associates, Sunderland,MA)

    The ARB software package offers users at least four ways of generating masks/filters. It also offers an excellent way of organising your sequences and allied data. It is available from the Technical University of Munich's web site. The program DNArates, available in the download area of our website, can also be very useful in making masks/filters.
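
    To make the idea of a mask concrete, here is a minimal Python sketch. It treats a mask as a string of "1" (keep) and "0" (drop) characters, one per alignment column, and builds one from a simple gap-fraction rule. Both the representation and the rule are assumptions made for illustration; they are not the methods ARB or DNArates use, and the function names are hypothetical.

```python
def gap_mask(alignment, max_gap_fraction=0.5):
    """Build a mask that drops columns where too many sequences have a gap."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    mask = []
    for i in range(length):
        gaps = sum(s[i] in "-." for s in seqs)
        mask.append("1" if gaps / len(seqs) <= max_gap_fraction else "0")
    return "".join(mask)

def apply_mask(alignment, mask):
    """Return a new alignment containing only the columns flagged '1' in the mask."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    filtered = {}
    for name, seq in alignment.items():
        if len(seq) != len(mask):
            raise ValueError(f"{name}: sequence and mask lengths differ")
        filtered[name] = "".join(seq[i] for i in keep)
    return filtered

if __name__ == "__main__":
    aln = {"a": "AC-GT", "b": "A--GT", "c": "ACTGT"}
    mask = gap_mask(aln)
    print(mask)
    print(apply_mask(aln, mask))
```

    The same mask can then be applied to every sequence in the analysis so that all distances and trees are computed over the same set of positions.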



  5. Q: Can partial sequences of 16S rDNA sequences be submitted directly to RDP or should they be submitted to another data base such as Genebank [sic]?
    A: At the present time, it is better for the RDP and the scientific community if you submit the sequences to another database such as GenBank or EBI because they assign a unique, standardized accession number to the sequence. After you submit the sequence, you can send the RDP the accession number assigned to the sequence by GenBank.

  6. Q: I just submitted two 16S sequences to GenBank. How long will it take to have these sequences in the RDP datadbase [sic]? Do I need to submit them separately to the RDP? Would this speed up the process?
    A: The RDP has developed tools for harvesting ribosomal RNA sequences from the sequence repositories. Tests lead us to believe we are retrieving over 95% of the sequences of interest. It would be better, at present, if you sent your sequences to GenBank (or EMBL or DDBJ). We are currently developing a submission tool for the RDP that would allow users to submit sequences to the RDP and GenBank simultaneously. This will also allow users to enhance their submissions with information of interest to RDP users, for example, detailed environmental information about the source of the sequenced organism or clone, that is not collected by GenBank.

  7. Q: (1) Is the RDP short ID the only unique (and unchanging) identifier for sequences in the RDP, i.e. equivalent to an accession number? (2) Is the one line of information in the phylogenetic/alphabetic listing of organisms in the aligned set of sequences the only information obtainable from the RDP about the organisms? I am interested in sym.Ricket, an unclassified bacterial endosymbiont (SSU data). I would like to know which eukaryote this endosymbiont is found in.
    A: The short answers to your questions are "not really" and "no". The RDP short ID is unique among the sequences in the RDP, but it may change if the organism name or species changes, so it should not be considered equivalent to an accession number. The RDP has discussed the need for an unchanging unique identifier; I'm not certain when such an identifier system will be implemented. Regarding the amount of information available for a sequence, there should be more in the individual file for each sequence than the one line in the DESCRIPTION.

  8. Q: I recently downloaded TreeTool 2.0.1 on to my Irix machine. I changed the make file to find the xv directory on my system. But when I run make, I receive these error messages. I was wondering if you could let me know how to make TreeTool work on the Irix architecture.
    A: Mike Maciukenas, who wrote this program, left the RDP in 1992 and I'm afraid that no one remains on staff who can help with the errors you obtained when trying to load TreeTool. The software remains on the server because it can still be useful in situations where it can be installed successfully.

  9. Q: Will your 16S rRNA alignments be standardized to the 16S rRNA alignment of the German database program ARB? There was some talk of this at the "Workshop on the Phylogeny of Prokaryotes Based Upon Sequence Similarity of the Small Ribosomal Subunit" held in Michigan in October 1997 (sponsored by the NSF Center for Microbial Ecology at Michigan State University and Bergey's Manual Trust).
    A: I anticipate that in the future the RDP will attempt to coordinate its alignments with the ARB alignment; a mapping of one alignment to the other via an E. coli template would be straightforward. However, the subject of alignments and their standardization is an ongoing topic of discussion here. We would be very interested to hear users' thoughts and ideas about alignments.
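
    The mapping via a shared template can be sketched in a few lines of Python. The example assumes the same E. coli sequence appears in both alignments, differing only in where gaps ("-" or ".") were inserted. The function names are hypothetical, and this is an illustration of the idea rather than a tool the RDP provides.

```python
def column_to_position(aligned_ref):
    """Map each alignment column to the 0-based residue index in the ungapped
    reference, or None where the reference has a gap."""
    mapping, pos = [], 0
    for ch in aligned_ref:
        if ch in "-.":
            mapping.append(None)
        else:
            mapping.append(pos)
            pos += 1
    return mapping

def map_columns(ref_in_a, ref_in_b):
    """For each column of alignment A, return the matching column of alignment B,
    using the reference sequence present in both. Columns where the reference
    has a gap in A cannot be mapped and are returned as None."""
    def ungapped(s):
        return "".join(ch for ch in s.upper() if ch not in "-.")
    if ungapped(ref_in_a) != ungapped(ref_in_b):
        raise ValueError("reference sequences differ between the two alignments")
    a_to_pos = column_to_position(ref_in_a)
    b_to_pos = column_to_position(ref_in_b)
    pos_to_b = {pos: col for col, pos in enumerate(b_to_pos) if pos is not None}
    return [None if pos is None else pos_to_b[pos] for pos in a_to_pos]

if __name__ == "__main__":
    # The same reference residues (ACGT) gapped differently in two alignments.
    print(map_columns("AC-GT", "A-CG-T"))
```

    Columns that hold insertions relative to the template have no counterpart and would need to be handled separately.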

  10. Q: I am a little surprised to find that your Ribosome data base has no mention of proteins. Perhaps it should be called a Ribosomal RNA database. Or links to useful ribosomal protein databases would be useful for those of us who approach the ribosome from the protein perspective.
    A: Thank you for your message -- the staff has discussed the lack of ribosomal protein sequence data in the RDP several times. At present we do not have enough resources to add these data to the RDP. Do you know of any links that we can add to the site until we add ribosomal protein data?

    NB: We subsequently received a link from this correspondent (http://geta.life.uiuc.edu/~nikos/Ribosome/rproteins.html), and it now appears on the RDP Links to other WWW sites page. Anyone who knows of web pages related to ribosomal topics is welcome to send the address to us and we will add it to our Links page.



  11. Q: How do I search for specific rRNA sequences using your alphabetical list? Can I use information from the alphabetical list to search for a sequence? I[s] there a way to extract a few 16s sequences from your files?
    A: The alphabetical list is just that - a list. It does not have, for example, hyperlinks to sequence files. The alphabetical listing files are found in the "Organism lists" folder in the Download Area. This folder contains a subfolder "alignments" which in turn contains the subfolder "sequences". If the organism you are interested in appears on the alphabetical list you can retrieve the sequence(s) from the appropriate folder inside the "sequences" folder. If you know the phylogenetic placement of the sequence(s) of interest, you can also use the Sequence Selection tool in the Online Analyses area to create a list of files for downloading.

    NB: As of Release 8.1, it is possible to search for sequences from within the Hierarchy Browser. Users can search on the organism name, the RDP short ID, or a culture deposit number, and the search results can be selected as a group.

  12. Q: We are interested in getting the secondary structure of the SSU for beetles or an organism closely related to them. From your home page we don't understand how to decode the information on the secondary structure apparently contained in the sequences. Is there a foolproof description somewhere on how to understand the secondary structure information? Is the structure available in some kind of graphics file? If so, which program do we need to view it on a Macintosh?
    A: All secondary structures provided by R. Gutell to the RDP are in PostScript format. They are found in the Download Area in the "Secondary structure diagrams" folder. Download the structures you are interested in by choosing "Save As..." from the web browser and saving the file as text. PostScript files can be viewed using the 'ghostview' application on most UNIX machines, or by using software freely available at http://www.cs.wisc.edu/~ghost/index.html on Windows 3.x/95/98/NT, Macintosh and OS/2 Warp machines. These PostScript files can also be viewed in Adobe Illustrator. If you are on a Mac and simply want to print these files, and you are connected to a PostScript printer, just drag the downloaded file onto the printer icon.

    NB: The secondary structure files previously provided by Dr. Robin Gutell were removed prior to Release 8.0, as Dr. Gutell has a new WWW site which contains the newest versions of many more secondary structure files. The Gutell Lab URL is http://www.rna.icmb.utexas.edu/.

  13. Q: 1. Can I enter in say, 4 of my representative sequences and obtain suggested secondary structure information for them? 2. Better yet, can I enter in 5-6 sequences from GenBank (including outgroups) along with 4 of my representative sequences and obtain some kind of lineup based on secondary structure?
    A: RDP-II does not offer a secondary structure generator. The alignments provided do, of course, take secondary structure into account. Thus you could generate an alignment using the Sequence Aligner and, using this alignment as a guide, map your sequence onto the secondary structure most closely related to your sequence.

  14. Q: How do I go about generating a phylogenetic tree for an organism? I can pull 16S sequence from GenBank.