Artemis Tutorial for the PPI Community



The purpose of this tutorial is to introduce new users to some of the tools in Artemis that are most relevant to analyses of the P. syringae genomes. A more comprehensive description of Artemis capabilities is given in the Artemis manual, available at the Sanger Institute website. Answers to some questions are also available at Artemis FAQ for PPI users. Though this tutorial was written for early versions of Artemis, the steps described apply to subsequent versions.

Note to Mac users: this tutorial was designed using Artemis running on Windows. However, all of the concepts described here apply equally to Artemis running on the Mac. For specific instructions related to use on the Mac, see Tips for Mac users.

I. Artemis terminology and file format
II. Obtaining the Artemis program and Artemis-readable files
III. Visualizing preexisting entries and their features

A. Starting the Artemis program
B. Locating features of interest using the Goto menu
C. Visualizing feature properties using the View menu

IV. Creation of new entry files

A. Creating and naming a new entry file
B. Selecting and moving features into your new entry

V. Changing colors and other display properties
VI. Creating and formatting new features
VII. Tips for Mac users

I. Artemis terminology and file format

A. Terminology

Entry

a file specifying a DNA sequence and/or one or more features in Genbank or EMBL format. Entries which do not contain sequence are generally referred to as feature tables or tab files

Feature

a unit of information describing some aspect of the genome.
Examples include: gene, CDS, promoter, operon

Key

the term in the entry file specifying the type of feature. Examples include "gene" and "CDS" (see sample entries below)

Location

genome coordinates of the feature. Given on the same line as the key (see sample entries below)

Qualifiers

terms specifying additional information about the feature. They are listed below the location and delimited by / and =. Examples include gene, note, product, and translation.

B. File format
Sequence files must be in FASTA format
Files specifying features must be in either Genbank or EMBL format

Sample entry specifying the first two features of the DC3000 chromosome in EMBL format:

FT   source          1..6397126
FT                   /organism="Pseudomonas syringae pv. tomato str. DC3000"
FT                   /strain="DC3000"
FT                   /db_xref="taxon:223283"
FT                   /note="pathovar: tomato"
FT   gene            339..1874
FT                   /gene="dnaA"
FT                   /note="locus_tag: PSPTO0001"
FT   CDS             339..1874
FT                   /gene="dnaA"
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /product="chromosomal replication initiator protein DnaA"
FT                   /protein_id="NP_789863.1"
FT                   /db_xref="GI:28867244"
FT                   /translation="MSVELWQQCVELLRDELPAQQFNTWIRPLQVEAEGDELRVYAPN
FT                   RFVLDWVNEKYLGRLLELLGERGQGMAPALSLLIGSKRSSAPRAAPNAPLAAAASQAL
FT                   SGNSVSSVSASAPAMAVPAPMVAAPVPVHNVATHDEPSRDSFDPMAGASSQQAPARAE
FT                   QRTVQVEGALKHTSYLNRTFTFENFVEGKSNQLARAAAWQVADNPKHGYNPLFLYGGV
FT                   GLGKTHLMHAVGNHLLKKNPNAKVVYLHSERFVADMVKALQLNAINEFKRFYRSVDAL
FT                   LIDDIQFFARKERSQEEFFHTFNALLEGGQQVILTSDRYPKEIEGLEERLKSRFGWGL
FT                   TVAVEPPELETRVAILMKKADQAKVDLPHDAAFFIAQRIRSNVRELEGALKRVIAHSH
FT                   FMGRDITIELIRESLKDLLALQDKLVSVDNIQRTVAEYYKIKISDLLSKRRSRSVARP
FT                   RQVAMALSKELTNHSLPEIGDVFGGRDHTTVLHACRKINELKESDADIREDYKNLLRT
FT                   LTT"

Sample Entry specifying the two features listed above in Genbank format:

FEATURES             Location/Qualifiers
     source          1..6397126
                     /organism="Pseudomonas syringae pv. tomato str. DC3000"
                     /strain="DC3000"
                     /db_xref="taxon:223283"
                     /note="pathovar: tomato"
     gene            339..1874
                     /gene="dnaA"
                     /note="locus_tag: PSPTO0001"
     CDS             339..1874
                     /gene="dnaA"
                     /codon_start=1
                     /transl_table=11
                     /product="chromosomal replication initiator protein DnaA"
                     /protein_id="NP_789863.1"
                     /db_xref="GI:28867244"
                     /translation="MSVELWQQCVELLRDELPAQQFNTWIRPLQVEAEGDELRVYAPN
                     RFVLDWVNEKYLGRLLELLGERGQGMAPALSLLIGSKRSSAPRAAPNAPLAAAASQAL
                     SGNSVSSVSASAPAMAVPAPMVAAPVPVHNVATHDEPSRDSFDPMAGASSQQAPARAE
                     QRTVQVEGALKHTSYLNRTFTFENFVEGKSNQLARAAAWQVADNPKHGYNPLFLYGGV
                     GLGKTHLMHAVGNHLLKKNPNAKVVYLHSERFVADMVKALQLNAINEFKRFYRSVDAL
                     LIDDIQFFARKERSQEEFFHTFNALLEGGQQVILTSDRYPKEIEGLEERLKSRFGWGL
                     TVAVEPPELETRVAILMKKADQAKVDLPHDAAFFIAQRIRSNVRELEGALKRVIAHSH
                     FMGRDITIELIRESLKDLLALQDKLVSVDNIQRTVAEYYKIKISDLLSKRRSRSVARP
                     RQVAMALSKELTNHSLPEIGDVFGGRDHTTVLHACRKINELKESDADIREDYKNLLRT
                     LTT"

As illustrated above, most loci have been assigned two features - one specifying the gene and the other, the Coding Sequence (CDS).
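For readers who want to inspect a feature table outside Artemis, the key/location/qualifier structure described above can be read with a few lines of Python. This is only a minimal sketch, not how Artemis itself parses files: the function name parse_embl_features is hypothetical, and the code assumes the standard EMBL layout in which every feature line starts with "FT" and a feature key begins in column 6. Locations that wrap onto a second line are not handled.

```python
def parse_embl_features(text):
    """Parse EMBL 'FT' lines into a list of feature dictionaries (sketch only)."""
    features = []
    last = None  # name of the qualifier most recently read, for continuation lines
    for line in text.splitlines():
        if not line.startswith("FT"):
            continue
        rest = line[2:].strip()
        if not rest:
            continue
        if len(line) > 5 and line[5] != " ":
            # A key in column 6 starts a new feature: "key   location"
            key, _, location = rest.partition(" ")
            features.append({"key": key, "location": location.strip(), "qualifiers": {}})
            last = None
        elif not features:
            continue
        elif rest.startswith("/"):
            # Qualifier line of the form /name=value
            name, _, value = rest[1:].partition("=")
            features[-1]["qualifiers"].setdefault(name, []).append(value.strip('"'))
            last = name
        elif last is not None:
            # Continuation of a long qualifier value (e.g. /translation)
            sep = "" if last == "translation" else " "
            values = features[-1]["qualifiers"][last]
            values[-1] = values[-1] + sep + rest.strip('"')
    return features


# Shortened example based on the dnaA entry above (translation truncated for brevity).
sample = """\
FT   gene            339..1874
FT                   /gene="dnaA"
FT                   /note="locus_tag: PSPTO0001"
FT   CDS             339..1874
FT                   /gene="dnaA"
FT                   /codon_start=1
FT                   /translation="MSVELWQQ
FT                   CVELL"
"""

features = parse_embl_features(sample)
for feature in features:
    print(feature["key"], feature["location"], sorted(feature["qualifiers"]))
```

Each qualifier name maps to a list because a feature may carry the same qualifier more than once (for example, several db_xref entries).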

II. Obtaining the Artemis program and Artemis-readable files

Users can choose between two basic options for viewing sequences in Artemis:

Option 1: Deploy Artemis over the network without downloading either the program or the sequence, as described in:
Tools for Genome Analysis - Search and view with the Artemis Genome Viewer (quick version)

Pros: Easiest approach for computing or bioinformatics novices. Users have ready access to the latest version of Artemis and the most up-to-date genome sequences available at NCBI and EBI
Cons: Requires that your computer have Java Web Start capability and a high-speed internet connection

Option 2: Download Artemis and the genome sequences to your hard drive as described below:

  1. Create an Artemis folder or directory on your local system
  2. Running Artemis requires a Java platform. Mac OS X and Windows XP come with Java pre-installed, but if you are using another operating system, you may need to install Java before running Artemis. Instructions for downloading the version of Java appropriate to your system can be accessed through the Sanger website.
  3. Download the version of Artemis appropriate for your system from the Sanger website to your Artemis folder. Mac users: see Tips for Mac users.
    TIP: Artemis should download as an application file with a .jar extension. If you receive a zipped folder when downloading from the WWW site, try downloading again from the FTP site. Alternatively, you can manually delete the .zip extension and select a non-compressed format before saving to your hard drive.
  4. Open the sequence file of interest and save as a text file (with extension .txt, .gbk or .tab) to your Artemis folder. RefSeq accession numbers in the table below are hyperlinked to the corresponding files at the RefSeq FTP site. Users are advised to download sequences from FTP sites, as the hyperlinks present in other file formats interfere with file loading by Artemis.
    TIP: If you have difficulty loading the sequence into Artemis, open the text file and verify that the entire sequence was successfully downloaded.
Sequence/Entry           EBI/Genbank accession   RefSeq accession
Pto DC3000 chromosome    AE016853                NC_004578
Pto DC3000 pDC3000A      AE016855                NC_004633
Pto DC3000 pDC3000B      AE016854                NC_004632
Psy B728a chromosome     CP000075                NC_007005
Psp 1448A chromosome     CP000058                NC_005773
Psp 1448A p1448A-A       CP000059                NC_007274
Psp 1448A p1448A-B       CP000060                NC_007275
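The tip about verifying that the entire sequence was downloaded can also be checked with a short script rather than by eye. The sketch below assumes a Genbank-format flat file, in which the LOCUS line declares the sequence length in bp, the bases follow the ORIGIN line, and the record ends with a line containing only "//". The function name check_genbank_download and the tiny example record are made up for illustration.

```python
import re


def check_genbank_download(text):
    """Report whether a Genbank flat file looks complete (sketch only)."""
    lines = text.rstrip().splitlines()
    # A complete record ends with the terminator line "//".
    terminated = bool(lines) and lines[-1].strip() == "//"
    # The LOCUS line declares the sequence length, e.g. "LOCUS  NAME  12 bp ...".
    match = re.search(r"^LOCUS\s+\S+\s+(\d+)\s+bp", text, re.MULTILINE)
    declared = int(match.group(1)) if match else None
    # Count the letters found between ORIGIN and the terminator.
    counted = 0
    in_sequence = False
    for line in lines:
        if line.startswith("ORIGIN"):
            in_sequence = True
        elif line.strip() == "//":
            in_sequence = False
        elif in_sequence:
            counted += sum(ch.isalpha() for ch in line)
    return {"terminated": terminated, "declared_bp": declared, "counted_bp": counted}


# Made-up 12 bp record, complete and truncated.
complete = (
    "LOCUS       EXAMPLE1                  12 bp    DNA     linear   BCT 01-JAN-2000\n"
    "ORIGIN\n"
    "        1 atgcatgcat gc\n"
    "//\n"
)
truncated = (
    "LOCUS       EXAMPLE1                  12 bp    DNA     linear   BCT 01-JAN-2000\n"
    "ORIGIN\n"
    "        1 atgcat"
)

print(check_genbank_download(complete))
print(check_genbank_download(truncated))

# To check a saved file (the file name here is only an example):
# with open("NC_004578.gbk") as handle:
#     print(check_genbank_download(handle.read()))
```

A download is suspect if the record is not terminated or if the counted bases differ from the declared length. Feature tables saved without sequence will report zero counted bases, which is expected.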

III. Visualizing preexisting entries and their features

A. Starting the Artemis Program

1. Click on the Artemis application. The window pictured immediately below will appear. (Figures below are taken from Artemis v5 but are generally applicable to later versions.)

2. Go to File>Open and select your sequence file. A window similar to the one below will appear (DC3000 genome sequence and annotation are shown in the example below):

[Figure: main Artemis window with description]

B. Locating features of interest using the Goto menu:

Bring up the Navigation window using Goto>Navigator or Ctl-G. Enter a search term.
[Figure: Artemis navigator window]

  • Double clicking on a feature will select it in all three windows (hold down the shift key to select more than one feature).
  • Go to the beginning and end of selected feature(s) using Ctl-left arrow and Ctl-right arrow
  • Go to the beginning and end of the genome using the Ctl-up arrow and Ctl-down arrow

C. Visualizing feature properties using the View menu:

View the entire record for the selected feature using Ctl-V or View>View Selected Features
View>View Selection brings up a window showing the entire record PLUS base content, % GC, 1st and last 300 bps and translation (shown below)

[Figure: Artemis selection view window]

Other useful commands:

  • Ctl-W displays hydrophobicity plots
  • The View menu allows display of bases and amino acids in FASTA format which can be copied and pasted into BLAST windows
  • The Write menu enables sequences to be saved to files
    (TIP: you may have to add a .txt extension to the file after it has been saved to your hard drive before it can be opened)


IV. Creation of new entry files containing features from the full annotation

One of the greatest assets of Artemis is that the user can create entry files containing lists of genes or other features of interest, which can be color coded so that they are easily viewed in their genomic context.

  • Preexisting entry files corresponding to many of the major virulence genes are available for downloading at genome analysis tools.
  • The following steps describe how to create novel entry files.

A. Creating and naming a novel entry file

  1. Create>"New entry" (a new entry entitled "no name" will appear on the entry line near the top of the Artemis display)
  2. Entries>"Set name of entry". (Select the entry "no name" and type in your name of choice followed by a .txt or .tab extension)
  3. File>"Save an entry" (select the new entry for saving)
    You now have an empty entry file to which you can save features of your choosing

B. Select and move features into your new entry
If you are moving only a small number of features to your new entry:

  1. Select features individually (you can select more than one by holding the shift key)
  2. Edit>"Copy selected features to" (copies selected features into the entry file of choice)
  3. File>"Save an entry" (saves the entry you select)
    TIP: avoid using File>"Save all entries". If Artemis crashes while saving, you may lose data even from those files which were not changed

Selection of a large number of features can be done using the Feature selector option:

Example 1. Select coding sequences involved in alginate biosynthesis:
  1. Select>"Feature selector.." (selects features based on their shared qualifiers) In this example, I am selecting coding sequences for which the "product" qualifier contains the word "alginate" (see window at right)
  2. click on Select
  3. click on View (brings up a window showing the list of selected features)
  4. Select desired features on the list
  5. Edit>"Copy selected features" (specify the entry file to which they will be copied)
[Figure: Artemis feature selector window]

Example 2. Select genes involved in alginate biosynthesis

  1. Select>"Feature selector.." using the following selection terms:
    Key = gene

    Qualifier = gene
    Containing this text = alg
  2. Proceed as described in Example 1.
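The logic of the two Feature selector examples (match a key, then look for text inside one qualifier) can be sketched in Python as follows. This illustrates the selection rule only and is not the Artemis implementation. The function name select_features is hypothetical, the match is assumed to be case-insensitive, and the "algX" entries are placeholders with invented coordinates rather than real DC3000 annotation.

```python
def select_features(features, key, qualifier, text):
    """Return features with the given key whose qualifier contains text."""
    wanted = text.lower()
    return [
        feature
        for feature in features
        if feature["key"] == key
        and any(wanted in value.lower() for value in feature["qualifiers"].get(qualifier, []))
    ]


features = [
    {"key": "gene", "location": "339..1874", "qualifiers": {"gene": ["dnaA"]}},
    {"key": "CDS", "location": "339..1874",
     "qualifiers": {"gene": ["dnaA"],
                    "product": ["chromosomal replication initiator protein DnaA"]}},
    # Placeholder entries for illustration only (invented name and coordinates).
    {"key": "gene", "location": "5000..6000", "qualifiers": {"gene": ["algX"]}},
    {"key": "CDS", "location": "5000..6000",
     "qualifiers": {"gene": ["algX"],
                    "product": ["alginate biosynthesis protein (placeholder)"]}},
]

# Example 1: CDS features whose "product" qualifier contains "alginate"
by_product = select_features(features, "CDS", "product", "alginate")

# Example 2: gene features whose "gene" qualifier contains "alg"
by_gene = select_features(features, "gene", "gene", "alg")

print([f["location"] for f in by_product])
print([f["qualifiers"]["gene"][0] for f in by_gene])
```

Note that a short search string such as "alg" may also match unrelated gene names, which is one reason to review the list in the View window before copying.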

V. Changing colors and other feature display properties
The default color for all "gene" and "CDS" features has been set to white (or white and turquoise if you are launching Artemis from the Sanger site). However, viewing multiple feature files simultaneously is greatly simplified by color coding the individual files. The feature files available for downloading from the genome analysis tools page are colored by means of color qualifiers added to the individual gene and CDS features. These colors can be altered in one of the following ways:

  1. Open the feature file in a text editor (Word will work fine, provided the file is saved as plain text). Search for the existing RGB coordinates and replace with the coordinates corresponding to your color of choice. RGB coordinates for various colors can be found in the options file or in the color settings in programs such as Powerpoint.
  2. To change feature colors (or any other qualifier) in Artemis:
    • Select the feature(s) you wish to change.
    • Go to Edit>Change qualifiers of selected...
    • In the window that appears, select "colour" from the pull down menu and click on "Insert qualifier"
    • Type the 3-number RGB coordinate after "/colour=" (see table below for RGB coordinates)
    • Click on either "Add" or "Replace". Selecting "Add" will leave any previous color qualifiers in place. The color shown corresponds to the last one listed in the feature record.
      Color          RGB coordinates    Color              RGB coordinates
      white          255 255 255        brown              200 150 100
      dark grey      100 100 100        pink               255 200 200
      red            255 0 0            light grey         170 170 170
      green          0 255 0            black              0 0 0
      blue           0 0 255            light steel blue   176 196 222
      cyan           0 255 255          purple             178 58 238
      magenta        255 0 255          tomato             255 99 71
      yellow         255 255 0          lemon chiffon      255 250 205
      pale green     152 251 152        turquoise          0 245 255
      sky blue       135 206 250        plum               238 174 238
      orange         255 165 0          brick red          139 58 58
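The search-and-replace approach in option 1 can be automated when a feature file contains many features. The following is a minimal sketch that assumes each color qualifier is written on a single line as /colour= followed by three unquoted numbers, as described above. The function name recolour is hypothetical, and it is wise to run it on a copy of your file first.

```python
import re

# A few RGB triples copied from the color table above.
COLOURS = {
    "red": (255, 0, 0),
    "tomato": (255, 99, 71),
    "sky blue": (135, 206, 250),
}

# Matches an unquoted qualifier such as "/colour=255 255 255" on one line.
COLOUR_PATTERN = re.compile(r"(/colour=)\d+[ \t]+\d+[ \t]+\d+")


def recolour(feature_text, rgb):
    """Return (new_text, replacements) with every RGB colour qualifier set to rgb."""
    if len(rgb) != 3 or not all(0 <= channel <= 255 for channel in rgb):
        raise ValueError("rgb must be three integers between 0 and 255")
    value = "{} {} {}".format(*rgb)
    # A function is used as the replacement so the digits are inserted literally.
    return COLOUR_PATTERN.subn(lambda m: m.group(1) + value, feature_text)


example = (
    "FT   CDS             339..1874\n"
    'FT                   /gene="dnaA"\n'
    "FT                   /colour=255 255 255\n"
)

new_text, count = recolour(example, COLOURS["tomato"])
print(count)
print(new_text)
```

Because the pattern only touches the numbers after /colour=, all other qualifiers and the feature locations are left unchanged.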

VI. Creating and formatting new features

A. Select the region to be included in a new feature by one of the following two methods:

  1. Highlight an area in the Artemis window using the mouse
    Create>"Create feature from base range"
  2. Create a new feature by manually specifying the base range
    • Create>"Create feature from base range"
    • Manually type in the base range in the #..# format

B. Format the new feature:

Key - Select a key term from the pulldown menu
Location - after entering the base range, the complement button can be used to specify the complementary strand
Qualifier - Select the desired qualifier from the pulldown menu and type the relevant information between the quotation marks. (Note is a good choice for general information.)

An example of a feature in create/edit mode is shown below:
[Figure: Artemis Feature Edit window]

To add additional base ranges to the feature, highlight an area or select another feature
Click "Grab Range" and the base range will be added to your feature
A line will appear linking the different components in the Artemis display

Use Ctl-V to verify that your feature has been formatted to your liking
The feature can be subsequently edited using the Edit menu
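For reference, the location strings involved in these steps follow the standard feature table syntax used in Genbank and EMBL files: a single range is written as start..end, a feature on the complementary strand is wrapped in complement(...), and a feature made of several base ranges is wrapped in join(...). The Python sketch below builds such strings. The function name make_location is hypothetical, and the exact text Artemis writes for a given feature should be confirmed with Ctl-V.

```python
def make_location(ranges, complement=False):
    """Build a feature location string from a list of (start, end) pairs."""
    if not ranges:
        raise ValueError("at least one base range is required")
    parts = []
    for start, end in ranges:
        if start < 1 or end < start:
            raise ValueError("invalid base range: {}..{}".format(start, end))
        parts.append("{}..{}".format(start, end))
    # One range stands alone; several ranges are linked with join(...).
    location = parts[0] if len(parts) == 1 else "join({})".format(",".join(parts))
    # Features on the complementary strand are wrapped in complement(...).
    return "complement({})".format(location) if complement else location


print(make_location([(339, 1874)]))
print(make_location([(339, 1874)], complement=True))
print(make_location([(100, 200), (300, 400)]))
```

The first call reproduces the dnaA location from the sample entries in section I. The ranges in the last call are arbitrary numbers chosen to show the join form.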

VII. Tips for Mac Users

Macintosh users with MacOS X (i.e. MacOS version 10) or later do not need to install Java because it comes as a standard feature of the operating system.

Sanger suggests that Mac OS X users download the UNIX/Linux version of Artemis (see suggestions for Mac OS X users in Tips and FAQs). However, for those less familiar with UNIX, Artemis for Windows is easily downloaded and fully functional on OS X. Note that when downloaded onto a Mac, the Artemis program is renamed "Diana".

The Artemis window on the Mac differs somewhat from the sample windows displayed above, the chief differences being (i) the location of the pull-down menu and (ii) the use of the Mac Command key instead of the Ctl key in shortcuts. However, all the functions described above are preserved.


Magdalen Lindeberg
PPI Project Coordinator
Dept Plant Pathology
Cornell University