Ephemeris 1.0 Readme
Summary
Ephemeris searches for repetitive elements in DNA sequences. It will search any text file for them. It was designed to find microsatellites (a.k.a. SSRs, SSLPs, or STRs) and therefore includes all possible mono-, di-, tri-, and tetra-nucleotide repeats as the "Built-in patterns". It allows me to quickly scan through large numbers of ABI sequencing files to determine which ones have "good repeats" and thus deserve further attention. Since sequencing errors are frequent in repeats, especially as a repeat goes on and on, the program tolerates N's, treating each one as the "correct" base to continue the repeat. You will therefore notice that strings of N's are "incorrectly" identified as repeats. It is thus a good idea to delete leading and trailing N's from ABI files before you drop them on ephemeris 1.0; alternatively, you can ignore the many extra lines of junk output.
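To make "all possible mono-, di-, tri-, and tetra-nucleotide repeats" concrete, here is a small Python sketch that enumerates the unique repeat units. It is not the program's own Perl code, and all function names are made up for illustration. It assumes that two units count as the same repeat when one is a rotation of the other or of its reverse complement, that a unit which is itself a repeat of a shorter unit (such as ACAC) is not counted separately, and that each repeat is named by its alphabetically lowest form.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA string made of A, C, G, T."""
    return motif.translate(COMPLEMENT)[::-1]


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (ACAC is not primitive)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def canonical(motif):
    """Alphabetically lowest name among all rotations of the motif
    and of its reverse complement (assumed naming rule)."""
    forms = []
    for m in (motif, reverse_complement(motif)):
        forms.extend(m[i:] + m[:i] for i in range(len(m)))
    return min(forms)


def unique_motifs(length):
    """Sorted list of unique repeat units of the given length."""
    words = ("".join(p) for p in product("ACGT", repeat=length))
    return sorted({canonical(w) for w in words if is_primitive(w)})


for n in range(1, 5):
    print(n, len(unique_motifs(n)))
```

Under those assumptions the sketch gives 2 mono-, 4 di-, 10 tri-, and 33 tetra-nucleotide units. If your own list of patterns has a different count, one of the assumptions above is probably the place to look.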
OS & Hardware
- The program was written for Macs and has been tested under OS 8.0 & 8.6, on Mac G3's, 8500's, and Power Computing Powerbases. It should be relatively easy to modify the code to have it run on any computer, but that's up to you to do.
- Most of the Mac pleasantries, such as drag & drop and dialog boxes, do function appropriately.
Getting Started
- You must download MacPerl and install it on your computer. For info go to:
- To keep things organized, I keep ephemeris 1.0 in its own folder with this readme, the xx.pat file(s), and data files in their own folders within the ephemeris folder. Most of this is simply a matter of preference, but you can see the rationale below.
Input Files
- Any text file should work. I have tested the xx.seq files output from ABI sequencers, text files from SimpleText, and text files output from Sequencher. I definitely encourage you to import your ABI files into Sequencher, automatically trim the crap off the ends (and vector sequence), output the data from the trimmed files as ASCII plain text from Sequencher, and then search those files. If you don't have Sequencher (or something similar), use the ABI programs to trim N's from the beginning and end of the files.
- Multiple files can be highlighted and dropped on the ephemeris icon. There is an upper limit on the number of files that can be processed at once, somewhere around 40 on my computers.
- You can NOT drop a folder of files to be processed. Sorry.
- The files do not have to be in the same place (folder) as the ephemeris 1.0 program. The output files, however, will be placed where the program is, not where the files come from.
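If you have no trimming tool handy, stripping N's from the ends of a sequence is easy to script. The Python sketch below is only an illustration and is not part of ephemeris. The function name is hypothetical, and it assumes the text holds nothing but sequence characters and whitespace (no header line).

```python
def trim_ambiguous_ends(text):
    """Join the sequence onto one line, then strip N's (either case) from
    both ends. N's in the interior of the sequence are left alone."""
    sequence = "".join(text.split())  # drop line breaks and spaces
    return sequence.strip("Nn")


print(trim_ambiguous_ends("NNNNACGTNACACACNNN\n"))
```

Only the ends are touched, so an N inside a repeat is still there for the search to tolerate.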
Pattern Files
- The "Built-in patterns" include all possible mono, di-, tri-, and tetra-nucleotide repeats. [If I'm wrong about this, I'd really appreciate an e-mail letting me know what I've got wrong.] The lowest alphabetical designation for each unique repeat is given. This same information is given in the Msat.pat file.
- You can search for any particular repeat you want by making a text file named filename.pat and including a list of the repeats you want to find. See Msat.pat for an example. Each entry should be structured as (core_repeat_unit)minimum_number_of_repeats, e.g., (AC)7
- You can put as many different repeats as you want into the filename.pat file, one per line. Reverse complements are found automatically, so you need only list "unique" repeats (i.e., AC = CA = GT = TG).
- The filename.pat file must be in the same folder as ephemeris 1.0 to be found by the program.
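The pattern format described above, (core_repeat_unit)minimum_number_of_repeats with one entry per line, can be read with a few lines of code. The Python sketch below is an illustration only and is not how ephemeris itself (a Perl program) reads its files. All names are hypothetical. It assumes that blank or malformed lines are simply skipped and that lower-case entries are accepted, neither of which this readme promises.

```python
import re

# One entry per line, e.g. "(AC)7": core unit in parentheses, then a minimum count.
PATTERN_LINE = re.compile(r"^\(([ACGT]+)\)(\d+)$")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_pattern_line(line):
    """Return (core_unit, minimum_repeats) for a line such as '(AC)7', else None."""
    match = PATTERN_LINE.match(line.strip().upper())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def parse_patterns(text):
    """Parse the text of a .pat file, skipping lines that do not fit the format."""
    parsed = (parse_pattern_line(line) for line in text.splitlines())
    return [entry for entry in parsed if entry is not None]


def equivalent_units(unit):
    """All rotations of a unit and of its reverse complement."""
    rc = unit.translate(COMPLEMENT)[::-1]
    return sorted({u[i:] + u[:i] for u in (unit, rc) for i in range(len(u))})


print(parse_patterns("(AC)7\n(AAT)5\n"))
print(equivalent_units("AC"))
```

The second function shows why only one of AC, CA, GT, and TG needs to be listed: all four name the same repeat once rotations and the reverse complement are taken into account.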
Output Files
- Output from the program is directed to the MacPerl output window & an optional output file that will be written in the same folder as the ephemeris 1.0 program. If you've got the program on your desktop, the output files will go there.
- The output file can be renamed anything you want, but you can't change where it goes.
- The output file is a text document.
- The files that were read are listed first, followed by lines of the form:
sequence ID [position] repeats found
- Since sequencing errors are frequent in repeats, especially as a repeat goes on and on, the program tolerates N's, treating each one as the "correct" base to continue the repeat. You should manually inspect and edit the sequences to ensure you get the correct repeat count.
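One way to picture the N tolerance is as a regular expression in which every base of the repeat unit may also be an N. The Python sketch below illustrates that idea; it is not the program's actual Perl code, and the function names, the 1-based positions, and the layout of each hit are assumptions made for the example.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tolerant_regex(unit, minimum_repeats):
    """Regex for at least `minimum_repeats` copies of `unit`, where each
    base may also be read as N, e.g. (AC)7 -> (?:[AN][CN]){7,}."""
    one_unit = "".join(f"[{base}N]" for base in unit)
    return re.compile(f"(?:{one_unit}){{{minimum_repeats},}}")


def find_repeats(sequence, unit, minimum_repeats):
    """Return sorted (position, unit, copies, matched_text) tuples for the
    unit and its reverse complement. Positions are 1-based (an assumption)."""
    sequence = sequence.upper()
    rc = unit.translate(COMPLEMENT)[::-1]
    hits = []
    # dict.fromkeys removes the duplicate when a unit is its own reverse complement
    for u in dict.fromkeys((unit, rc)):
        for m in tolerant_regex(u, minimum_repeats).finditer(sequence):
            hits.append((m.start() + 1, u, len(m.group()) // len(u), m.group()))
    return sorted(hits)


print(find_repeats("TTACACNCACACACACGG", "AC", 5))
print(find_repeats("NNNNNNNNNN", "AC", 5))
```

In the first call the single N is read as an A and the repeat is reported as seven copies of AC. In the second call a bare run of N's is reported as a repeat for both AC and GT, which is exactly the junk output that untrimmed files produce and the reason the counts need checking by eye.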
About the program's name:
An ephemeris is a table listing the spatial position of celestial bodies, including satellites, as a function of time. There's no real time component here, but hey, that's as close as we could get for a clever name [cleverness all belonging to Dean]. For more than you ever wanted to know on the subject, go to:
This program was written out of the goodness of Dean Pentcheff's heart. Please don't make him regret doing it by pestering him with computer-specific questions such as, "Why won't this run on my ..." or "Why won't Perl install on my...".
If you find this program useful, please send a postcard with a picture of your study organism or someplace nice where they are found [an e-mail with a scanned photo is an acceptable alternative].
Happy Searching & Good Luck,
Travis C. Glenn
Savannah River Ecology Laboratory
University of Georgia
Drawer E
Aiken, SC 29802
e-mail: Glenn@srel.edu
&
Dept. of Biological Sciences
University of South Carolina
Columbia, SC 29208
e-mail: Travis.Glenn@sc.edu
----------------------------------------------------------------------------------------------------
The program is Copyright 1999 by N. Dean Pentcheff. All rights reserved. The program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The program may be redistributed under the terms of the GNU General Public Licence (any version) or Larry Wall's Artistic Licence (your choice). You can get a copy of the GNU Licence from <URL:ftp://prep.ai.mit.edu/pub/gnu/COPYING> (or with any GNU software, such as emacs or gcc), and the Artistic Licence from <URL:http://www.perl.com/perl/misc/Artistic.html> (or accompanying any distribution of the Perl language).
Last reviewed: December 7, 1999