I’m not a protein person (DNA, arthropods, SNPs, RNA, that’s me), so I came across this tidbit while doing some research in the protein databases. UniProt is a central repository of protein sequences from Swiss-Prot, TrEMBL, and PIR. Check, I knew that. What I just learned (yes, slow on the uptake, I know) is that the IPI (International Protein Index) is somewhat different.
IPI protein sets are built for a limited number of higher eukaryotic species whose genomic sequence has been completely determined, but for which a large number of predicted protein sequences are not yet in UniProt. IPI takes data from UniProt and also from the sources of those predictions, and combines them non-redundantly into a comprehensive proteome set for each species.
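To make "combines them non-redundantly" a bit more concrete for myself, here is a minimal Python sketch of the idea. It is not IPI's actual pipeline: I'm assuming, purely for illustration, that two records are redundant when their sequences are exactly identical, and the function name `build_nonredundant_set`, the source labels, and every accession and sequence below are made up.

```python
def build_nonredundant_set(sources):
    """Merge protein records from several sources into one entry per unique sequence.

    `sources` is a list of (source_name, records) pairs, where `records` is a
    list of (accession, sequence) tuples. Exact sequence identity is used as
    the redundancy criterion, which is a simplifying assumption.
    """
    merged = {}  # normalized sequence -> merged entry (insertion order is kept)
    for source_name, records in sources:
        for accession, sequence in records:
            key = sequence.strip().upper()
            entry = merged.setdefault(key, {"sequence": key, "accessions": []})
            # Keep a cross-reference back to every source record that was merged.
            entry["accessions"].append((source_name, accession))
    return list(merged.values())


# Hypothetical toy data: none of these accessions or sequences are real.
uniprot_records = [("UNIPROT_A", "MKTAYIAK"), ("UNIPROT_B", "MSLLTEVE")]
predicted_records = [("PRED_X", "mktayiak"), ("PRED_Y", "MGGHHWQ")]

proteome = build_nonredundant_set([
    ("UniProt", uniprot_records),
    ("Prediction", predicted_records),
])

for entry in proteome:
    print(entry["sequence"], entry["accessions"])
```

Four input records collapse into three entries here, because `PRED_X` has the same sequence as `UNIPROT_A` and simply becomes a second cross-reference on that entry. The predicted protein that UniProt doesn't have yet (`PRED_Y`) still makes it into the set, which is the whole point of a combined proteome.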