This folder contains the benchmark proteins and annotations used in the final assessment of CAFA3.

Benchmark proteins and annotations were generated by comparing the experimental annotations in UniProt GOA at two time points, t0 and t1, and keeping the annotations gained between them. This period is called the annotation growth period.

t0 = 20170213
t1 = 20171115

In this benchmark set, we considered an annotation experimental if it has one of the following evidence codes: EXP, IDA, IPI, IMP, IGI, IEP, TAS, or IC.

Some benchmark proteins were eliminated from this set for the following reasons.

1. UniProt GOA incorporated annotation data from the Candida Genome Database (CGD). Almost 1000 proteins gained experimental annotations due to this incorporation. However, these annotations were publicly available on CGD before time t0, just not on UniProt GOA. Therefore, we have eliminated any annotations that were gained during our annotation growth period but have an "assigned" date prior to time t0.

2. We have also eliminated 965 (46.7%) MFO benchmark proteins that have "GO:0005515" (protein binding) as their ONLY MFO annotation. In other words, if such a protein is also a benchmark in another ontology (BPO or CCO), it is not deleted as a benchmark from that ontology; it is deleted only from MFO. Such a large fraction of proteins annotated exclusively with protein binding introduced a large bias in the data, to the point that the Naive method performed unreasonably well. As some of you may recall, we debiased the benchmark in the same fashion in CAFA1.

For more information on CAFA benchmark generation, please refer to the CAFA2 paper [1]: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1037-6.

Code for comparing two UniProt GOA files for experimental annotation differences can be found at https://github.com/nguyenngochuy91/CAFA_benchmark.

If you have any questions regarding the CAFA benchmark generation process, please contact nzhou@iastate.edu.

[1] Jiang, Yuxiang, et al. "An expanded evaluation of protein function prediction methods shows an improvement in accuracy." Genome Biology 17.1 (2016): 184.