File Types ReadMe
  • Files named reviewedAnnotations.gaf contain Gene Ontology (GO) and Plant Ontology (PO) annotations combined in a single file in GO GAF 2.2 format.
    Full reference from GO site:
    http://geneontology.org/docs/go-annotation-file-gaf-format-2.2

    Column number: content

    1. Database: source database of the annotated object. One of UniProtKB, TAIR, or RNACentral.
    2. Database Object Identifier: the unique identifier for an object in the database from column 1.
    3. Database Object Symbol : the symbolic name of the object (gene, protein, locus, RNA) being annotated. Can be a gene product symbol or ORF name. Usually something with biological significance.
    4. Qualifier: term that qualifies the relationship between the object and ontology term (e.g. enables).
    5. Ontology ID: the unique identifier for a GO or PO term.
    6. Reference ID: the unique identifier for the reference used for annotation. The format is source:identifier. Source is either the PubMed database (e.g. PMID:23445566) or DOI (e.g. DOI:10.1073/pnas.1713574114).
    7. Evidence code: three-letter code corresponding to a GO evidence code (one of IDA, IGI, IPI, IMP, IEP). See http://www.geneontology.org/GO.evidence.html for details on evidence codes.
    8. Evidence With: used for some annotations (IPI and IGI) that require supporting information about the interacting component (for example a protein binding partner or a genetic suppressor). Format is DBname:DBidentifier. More than one entity is allowed. When more than one entity is included, a pipe (|) indicates an OR relation and a comma (,) indicates an AND relation.
    9. Aspect: Refers to the namespace or ontology aspect. F=GO molecular function, C=GO cellular component, P=GO biological process, S=PO structure, G=PO growth and development stage
    10. Database Object Name: name of the gene or gene product.
    11. Database Object Synonym: additional symbolic names for the gene product. Used to aid searching.
    12. Database Object Type: a description of the object (from column 2) that is being annotated. May be one of: gene_product, protein, RNA.
    13. Taxon: unique identifier corresponding to the taxon ID of the gene product being annotated (from column 2).
    14. Date: date on which the annotation was made; format is YYYYMMDD.
    15. Annotator: the ORCID of the individual who made the annotation.
    16. Annotation Extension: optional cross-references to other ontology terms or database objects that qualify or extend the annotation. May be empty.
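    17. Gene Product Form ID: optional identifier for a specific variant (for example a splice form) of the gene or gene product in column 2. May be empty.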
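
    The column layout above can be read with a few lines of Python. This is a minimal sketch, not an official parser: it assumes the usual GAF conventions (tab-separated fields, header and comment lines starting with "!"), and the key names in GAF_COLUMNS, the function names, and every value in EXAMPLE_ROW are made up for illustration.

```python
# Hypothetical key names for the 17 columns, in the order listed above.
GAF_COLUMNS = (
    "db", "db_object_id", "db_object_symbol", "qualifier", "ontology_id",
    "reference", "evidence_code", "with_from", "aspect", "db_object_name",
    "db_object_synonym", "db_object_type", "taxon", "date", "annotator",
    "annotation_extension", "gene_product_form_id",
)


def parse_gaf_line(line):
    """Split one tab-separated data line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(f"expected {len(GAF_COLUMNS)} columns, got {len(fields)}")
    return dict(zip(GAF_COLUMNS, fields))


def parse_with(value):
    """Parse column 8: '|' separates OR alternatives, ',' separates AND members.

    Returns a list of alternatives, each a list of DBname:DBidentifier strings.
    """
    if not value:
        return []
    return [group.split(",") for group in value.split("|")]


def read_gaf(lines):
    """Yield one dict per data line, skipping '!' header lines and blank lines."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        yield parse_gaf_line(line)


# Invented row for illustration only; it is not a real annotation.
EXAMPLE_ROW = "\t".join([
    "TAIR", "locus:0000000", "EXAMPLE1", "enables", "GO:0005515",
    "PMID:23445566", "IPI",
    "UniProtKB:P00001|UniProtKB:P00002,UniProtKB:P00003",
    "F", "example gene product", "EX1", "protein", "taxon:3702",
    "20200101", "0000-0000-0000-0000", "", "",
])

row = parse_gaf_line(EXAMPLE_ROW)
alternatives = parse_with(row["with_from"])
```

    In this sketch, alternatives holds two OR groups: the first with a single identifier, the second with two identifiers that must both apply. To read a whole file, pass an open file handle to read_gaf.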

  • Files named otherAnnotations.json contain gene names, comment annotations, and associations to publications, in JSON format. At a minimum, each record has a gene-publication link; other values may be null.
    Example:
    {
    "publication": "PMID:16891302",
    "locus": "Q08IT7",
    "source": "UniprotKB",
    "symbolicName": "GmICHG",
    "fullName": "SOYBEAN Isoflavone conjugate-specific beta-glucosidase",
    "comments": [ ]
    }
    publication: either a PubMed ID (PMID) or a DOI
    locus: one of UniProtKB ID, AGI locus identifier, or RNACentral ID
    source: one of UniProtKB, TAIR, or RNACentral
    symbolicName: may be null
    fullName: may be null
    comments: may be an empty list
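
    The field rules above can be checked with a short Python sketch. It is illustrative only: check_record and its messages are hypothetical, it treats publication and locus as the required gene-publication link, and it compares source case-insensitively because the example record spells it "UniprotKB". How records are arranged inside a file (for instance as a top-level list) is not stated here, so the sketch checks a single record.

```python
import json

KNOWN_SOURCES = {"uniprotkb", "tair", "rnacentral"}


def check_record(record):
    """Return a list of problems found in one record (an empty list means OK)."""
    problems = []
    # The gene-publication link is the minimum every record should carry.
    for key in ("publication", "locus"):
        if not record.get(key):
            problems.append(f"missing {key}")
    publication = record.get("publication") or ""
    if publication and not publication.startswith(("PMID:", "DOI:")):
        problems.append("publication is neither PMID nor DOI")
    source = record.get("source") or ""
    if source and source.lower() not in KNOWN_SOURCES:
        problems.append(f"unknown source {source}")
    comments = record.get("comments")
    if comments is not None and not isinstance(comments, list):
        problems.append("comments is not a list")
    return problems


# The example record shown above, parsed from its JSON text.
example = json.loads("""
{
  "publication": "PMID:16891302",
  "locus": "Q08IT7",
  "source": "UniprotKB",
  "symbolicName": "GmICHG",
  "fullName": "SOYBEAN Isoflavone conjugate-specific beta-glucosidase",
  "comments": []
}
""")

problems = check_record(example)
```

    For the example record, problems is an empty list. A record with a null locus would instead report "missing locus".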