Scaffold gives users the option to view GO annotations associated with proteins. These annotations come from a variety of sources, including NCBI and the UniProt GOA knowledge base. Scaffold comes preconfigured with the ability to add the UniProt All Proteomes and Human Only GOA databases. While the All Proteomes database contains a wealth of information on numerous species, it is very large and takes a long time to download and index. Users often want to quickly search a smaller database that is relevant to the species they are working with. Scaffold allows users to download and search organism-specific GOA databases, which are much smaller and therefore download and index more quickly.
For general information about GOA databases and downloads, see the following resources:
https://www.ebi.ac.uk/GOA/downloads
http://geneontology.org/page/download-annotations
To download specific databases, follow the instructions below. There are two ways to add GOA annotations in Scaffold:
Option 1 (preferred method):
- Access the UniProt FTP server, also available from the link above.
- From here, select the specific organism you are working with and open the corresponding folder, CHICKEN or DOG for example. Note, if you do not see your organism listed on the main page, click on the Proteomes folder to bring up a vast collection of organism-specific proteomes. You can use your browser's search function to help you locate specific organisms.
- It is important to select a file with the extension GAF.GZ so that Scaffold can read the data it contains. Note, select the standard file or one of the files *_complex.gaf.gz, *_isoform.gaf.gz, or *_rna.gaf.gz based on your experimental needs. Consult the UniProt website for more information.
- Right click the file you are interested in adding and select Copy Link Location.
- In Scaffold, follow this pathway: Edit > Edit Annotation Options... Here you will see any databases that have been added (NCBI is added automatically).
- Select Add. Here you have the option to add the All Proteomes and Human Only UniProt databases.
- Select Other Web Site... from the dropdown menu and paste the copied link into the box.
- Give your database a name and choose a folder to save your GOA database to. This should be a local folder that you have permission to read from and write to (where you store your data or FASTA files, for example). Click Add.
- Your database should appear under Database Name. Note, the currently selected database is highlighted in green.
- Once your database is selected, GO terms can be applied. You can add GO terms to an experiment from a selected database by clicking Experiment > Add or Edit Annotations... This dialog allows you to add either GO or Pathway annotations. Choose the appropriate GO term source from the dropdown menu next to the GO Terms button, then click OK.
- Here you have the option to select the terms displayed, adjust the default and add the selected terms. Simply selecting OK here will apply the default set of GO terms to your experiment file.
- If two or more samples have been loaded, Scaffold can find enriched terms using the PSEA-Quant algorithm. More information can be found in the User Guide.
Option 2 (alternative method):
- Follow steps 1 through 3 above to locate the organism-specific GOA database required.
- Download the file by clicking it and selecting Save File.
- Follow this pathway: Edit > Edit Annotation Options...
- Select Add
- To add an additional database, select Other File. This brings up a dialog box that allows you to add the GAF.GZ file you recently downloaded.
- Give your database a name and choose a folder to save your GOA database to. This should be a local folder that you have permission to read from and write to (where you store your data or FASTA files, for example). Click Add.
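Before adding a downloaded file through Other File, it can be useful to confirm that it really is a gzip-compressed GAF file. The following Python sketch is not part of Scaffold and makes some assumptions: it relies on the standard GAF layout (comment lines beginning with `!`, a `!gaf-version:` header line, and tab-separated annotation rows whose ninth column holds the GO aspect P, F, or C). The function name `summarize_gaf` is hypothetical.

```python
import gzip
from collections import Counter


def summarize_gaf(path, max_rows=None):
    """Return (gaf_version, row_count, aspect_counts) for a .gaf.gz file.

    Hypothetical helper for a quick sanity check; not a Scaffold API.
    """
    version = None
    rows = 0
    aspects = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("!"):
                # Comment/header lines start with "!"; one of them
                # carries the format version.
                if line.startswith("!gaf-version:"):
                    version = line.split(":", 1)[1].strip()
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 15:
                # Assumption: a valid annotation row has at least 15 columns.
                continue
            rows += 1
            aspects[fields[8]] += 1  # column 9: aspect (P, F, or C)
            if max_rows is not None and rows >= max_rows:
                break
    return version, rows, aspects
```

If the file is not gzip-compressed, `gzip.open` raises an error on the first read, and a version of `None` or a row count of zero suggests the file is not in GAF format. Passing a small `max_rows` keeps the check quick on large files.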
Note: Option 1 requires an Internet connection to download the database file the first time. If you are installing on a computer without Internet access, you can use Option 2 and transfer the file to that computer via a thumb drive.
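For the offline case, the download step of Option 2 can also be scripted on a machine that does have Internet access. The sketch below is one possible approach using only the Python standard library; it is not a Scaffold feature. The function name `download_gaf` is hypothetical, and the URL is assumed to be the link copied from the server for the chosen organism.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_gaf(url, dest_dir):
    """Save the GAF.GZ file at `url` into `dest_dir` and return its path.

    Hypothetical helper; `url` is whatever link you copied for your organism.
    """
    name = Path(urlparse(url).path).name
    # Scaffold expects the GAF.GZ extension, so refuse anything else early.
    if not name.lower().endswith(".gaf.gz"):
        raise ValueError(f"expected a .gaf.gz file, got: {name!r}")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / name
    # Stream to disk rather than loading the whole file into memory.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

The returned path is the file to copy to the thumb drive and then select with Other File on the offline computer.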