Fungal Pathogen Gene and Protein Nomenclature

Fungal pathogen gene symbols are consistent with those assigned by genomic sequencing projects, wherever applicable, and otherwise conform to the most common usage in the literature. Whenever possible, the Proteome module follows the lead of the scientific community in choosing gene symbols. When the community has not agreed upon a single symbol, we must decide which primary symbol to assign to a gene. In cases where no genetic symbol exists, we assign the protein name ("Acetylase", for example) as the primary symbol. If no unique gene or protein symbol has been assigned, a clone symbol or number may be used as an identifier.

Each fungal pathogen gene has at least one symbol, which must be unique within that species. All symbols used in the literature or in GenBank records are collected and listed as synonyms. Synonyms that conflict with other gene symbols or synonyms appear in parentheses. Occasionally, if two different genes have been assigned the same symbol and no synonyms are available, numbers are appended to the gene symbols, e.g. 'ABC1_1', 'ABC1_2'.
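
The numbering rule for clashing symbols can be illustrated with a short Python sketch. The function name disambiguate and the choice to number genes in input order are assumptions made for illustration; the text above does not say how the numbers are assigned.

```python
from collections import Counter

def disambiguate(symbols):
    """Append _1, _2, ... to every symbol that occurs more than once.

    Hypothetical helper: numbering follows input order, which is an
    assumption, not a documented rule of the database.
    """
    totals = Counter(symbols)   # how often each symbol occurs overall
    seen = Counter()            # how many copies have been numbered so far
    result = []
    for symbol in symbols:
        if totals[symbol] > 1:
            seen[symbol] += 1
            result.append(f"{symbol}_{seen[symbol]}")
        else:
            result.append(symbol)
    return result

print(disambiguate(["ABC1", "XYZ2", "ABC1"]))  # ['ABC1_1', 'XYZ2', 'ABC1_2']
```

Symbols that are already unique pass through unchanged, so only the clashing genes receive a numeric suffix.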

The format of gene and protein symbols is an adaptation of the format agreed upon by genome projects or commonly used in the literature to the requirements of our database (italics, special characters, and spaces may not be used). The format differs for the different species, as indicated below.
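
The database restrictions on symbols could be checked mechanically, as in the Python sketch below. The exact set of permitted characters is an assumption: the pattern allows ASCII letters, digits, underscores, and hyphens, because those are the only characters that appear in the example symbols in this document. The name is_valid_symbol is hypothetical.

```python
import re

# Assumed character set (not stated in the text): ASCII letters, digits,
# underscore, and hyphen. Spaces and all other characters are rejected.
SYMBOL_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

def is_valid_symbol(symbol: str) -> bool:
    """Return True if the whole symbol consists of permitted characters."""
    return SYMBOL_PATTERN.fullmatch(symbol) is not None

print(is_valid_symbol("ABC1_1"))  # True
print(is_valid_symbol("abc 1"))   # False
```

Italics are a matter of typography rather than characters, so a plain-text check of this kind cannot detect them.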

Genes and Proteins for Aspergillus Species

For Aspergillus species, gene symbols are generally in the format abcA or abc1. Protein symbols are written by capitalizing the initial letter of the gene symbol (for example, AbcA or Abc1). Mutant alleles are denoted in the style used in the literature.
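
As a minimal illustration of the Aspergillus convention, the Python sketch below derives a protein symbol from a gene symbol. The function name is hypothetical, and the sketch does not attempt to handle mutant allele notation.

```python
def aspergillus_protein_symbol(gene_symbol: str) -> str:
    """Capitalize only the initial letter, leaving the rest unchanged."""
    return gene_symbol[:1].upper() + gene_symbol[1:]

print(aspergillus_protein_symbol("abcA"))  # AbcA
print(aspergillus_protein_symbol("abc1"))  # Abc1
```

Slicing is used instead of str.capitalize() because capitalize() would lowercase the trailing letter and turn abcA into Abca.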

Genes and Proteins for B. dermatitidis, C. immitis, H. capsulatum, and Candida Species

B. dermatitidis, C. immitis, H. capsulatum, and all of the Candida species follow the style established for S. cerevisiae: gene symbols are in the format ABC1 and protein symbols are in the format Abc1p. Mutant alleles are denoted in the style used in the literature (usually, recessive alleles are represented in lowercase letters). Many Candida names are derived from CandidaDB.
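
The S. cerevisiae-style conversion from gene symbol to protein symbol can be sketched in Python as follows. The function name is hypothetical, and symbols carrying extra suffixes or mutant allele notation are outside the scope of this sketch.

```python
def yeast_style_protein_symbol(gene_symbol: str) -> str:
    """Convert a gene symbol such as ABC1 into the protein form Abc1p."""
    # Initial letter uppercase, remaining letters lowercase, then a trailing "p".
    return gene_symbol[:1].upper() + gene_symbol[1:].lower() + "p"

print(yeast_style_protein_symbol("ABC1"))  # Abc1p
```

Because the case is normalized, a recessive allele written in lowercase (abc1) maps to the same protein symbol.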

Nucleotide sequence data for C. albicans were obtained from the Stanford Genome Technology Center website. Sequencing of C. albicans was accomplished with the support of the NIDR and the Burroughs Wellcome Fund. Information about coding sequences and proteins was obtained from CandidaDB, which has been developed by the Galar Fungal European Consortium (QLK2-2000-00795).

Genes and Proteins for C. neoformans

Several different serotypes, or strains, of C. neoformans are commonly studied, most often serotypes A and D. Recent work suggests that the serotypes are distinct enough from each other that they may represent separate species. Accordingly, information about homologous proteins derived from different serotypes is kept on separate Locus Reports. Gene symbols follow the format ADE2; to distinguish homologs, a suffix denoting the serotype is appended to the gene symbol (for example, ADE2_A or ADE2_D). Protein symbols are in the format Ade2_A. A gene symbol with no suffix indicates that no serotype information is available. The currently recognized serotypes of C. neoformans are shown in the table below.

C. neoformans Varieties and Associated Serotypes

Variety                                              Serotype
C. neoformans var. grubii                            A
C. neoformans var. gattii                            B
C. neoformans var. gattii                            C
C. neoformans var. gattii                            B/C (indicates var. gattii strains whose serotype is unknown but could be B or C; denoted "B-C" in Proteome databases)
C. neoformans var. neoformans                        D
C. neoformans var. grubii/var. neoformans hybrid     AD
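
The serotype suffix rule for C. neoformans can be sketched in Python as follows. The function name neoformans_symbols is hypothetical. Returning an unsuffixed protein symbol when no serotype is known is an assumption, since the text only describes the unsuffixed form for gene symbols.

```python
# Suffixes from the table above; "B-C" is the database form of serotype B/C.
SEROTYPE_SUFFIXES = {"A", "B", "C", "B-C", "D", "AD"}

def neoformans_symbols(base, serotype=None):
    """Return (gene_symbol, protein_symbol) for a base symbol such as 'ADE2'."""
    gene = base.upper()
    protein = gene[:1] + gene[1:].lower()
    if serotype is None:
        # No serotype information available: nothing is appended.
        return gene, protein
    suffix = "B-C" if serotype == "B/C" else serotype
    if suffix not in SEROTYPE_SUFFIXES:
        raise ValueError(f"unknown serotype: {serotype!r}")
    return f"{gene}_{suffix}", f"{protein}_{suffix}"

print(neoformans_symbols("ADE2", "A"))  # ('ADE2_A', 'Ade2_A')
print(neoformans_symbols("ADE2", "D"))  # ('ADE2_D', 'Ade2_D')
```

Mapping "B/C" to "B-C" inside the function keeps the slash, which the database format does not allow, out of the stored symbol.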

Genes and Proteins for Pneumocystis Species

Pneumocystis species. P. carinii exists in different host species as distinct strains termed formae speciales, or special forms. Since the special forms probably represent separate species, information about homologous proteins derived from different special forms is kept on separate Locus Reports. Gene symbols are in the format abc1, to which the name of the special form is appended (for example, abc1_carinii). Protein symbols are in the format Abc1_carinii. If the host organism is rat, but the distinction has not been made between f. sp. carinii and f. sp. ratti (two special forms which both exist in the rat host), _rat is appended to the gene or protein symbol. A gene symbol with no suffix indicates that no special form information is available. The most commonly studied P. carinii special forms are shown in the table below.

Special Forms of P. carinii and Their Associated Hosts

Special form                      Host
P. carinii f. sp. carinii         rat
P. carinii f. sp. ratti           rat
P. carinii f. sp. muris           mouse
P. carinii f. sp. mustelae        ferret
P. carinii f. sp. oryctolagi      rabbit
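
The special form suffix rule for P. carinii can be sketched in Python in a similar way. The function name is hypothetical. Only the suffixes _carinii and _rat appear in the text; the remaining suffixes are assumed by analogy with the special form names in the table, and returning an unsuffixed protein symbol when no special form is known is also an assumption.

```python
# "carinii" and "rat" come from the text; the other suffixes are assumed
# to mirror the special form names listed in the table.
SPECIAL_FORM_SUFFIXES = {"carinii", "ratti", "muris", "mustelae", "oryctolagi", "rat"}

def pneumocystis_symbols(base, special_form=None):
    """Return (gene_symbol, protein_symbol) for a base symbol such as 'abc1'."""
    gene = base.lower()
    protein = gene[:1].upper() + gene[1:]
    if special_form is None:
        # No special form information available: nothing is appended.
        return gene, protein
    if special_form not in SPECIAL_FORM_SUFFIXES:
        raise ValueError(f"unknown special form: {special_form!r}")
    return f"{gene}_{special_form}", f"{protein}_{special_form}"

print(pneumocystis_symbols("abc1", "carinii"))  # ('abc1_carinii', 'Abc1_carinii')
print(pneumocystis_symbols("abc1", "rat"))      # ('abc1_rat', 'Abc1_rat')
```

The "rat" entry covers the case in which the host is known to be rat but f. sp. carinii and f. sp. ratti have not been distinguished.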

P. jiroveci. P. carinii f. sp. hominis, the special form that exists in humans, has been recognized as a distinct species and renamed Pneumocystis jiroveci. The Locus Reports for all the P. carinii f. sp. hominis proteins are now available under the species name P. jiroveci, and the _hominis designation has been removed from the gene symbols.

