This week’s highlighted discussion offers a peek at some odd situations in public databases. Sometimes a field you expect to find is missing, and it isn’t obvious why. I thought the exploration of why this happens was interesting, and informative about working with databases.
Biostars is a site for asking, answering and discussing bioinformatics questions and issues. We are members of the community and find it very useful. Often questions and answers arise at Biostars that are germane to our readers (end users of genomics resources). Every Thursday we will be highlighting one of those items or discussions here in this thread. You can ask questions in this thread, or you can always join in at Biostars.
This week’s highlighted issue at Biostars is the kind that can be really mystifying to encounter. Because of the way databases are curated, there are sometimes odd situations that don’t make sense at first glance. Sometimes these are real bugs, but other times they are decisions that had to be made to accommodate some strange feature of biology that doesn’t align with a database configuration.
I am looking at some mass-spec data.
I found several fragments mapping to the Ig heavy chain V-II region WAH protein and want to find the corresponding gene.
UniProt gives the gene name as “NULL”. Is this an annotation error, or is there some special aspect of Ig regions that I am missing? I want to map several proteins with these types of names to genes.
- Cluster of Ig heavy chain V-I region HG3
- Cluster of Ig heavy chain V-II region SESS
- Cluster of Ig heavy chain V-III region BRO
- Cluster of Ig lambda chain V-I region NEW
- Cluster of Ig lambda chain V-II region BUR
- Ig heavy chain V-II region WAH
- Ig heavy chain V-III region BUT
- Ig heavy chain V-III region GAL
- Ig heavy chain V-III region NIE
- Ig heavy chain V-III region WEA
- Ig kappa chain V-I region Kue
- Ig kappa chain V-I region Wes
- Ig kappa chain V-III region VG (Fragment)
- Ig lambda chain V-III region LOI
- Ig lambda chain V-III region SH
- Ig lambda chain V-V region DEL
How can I map these to the corresponding gene names?
Having been involved in curation, I can see how this transpired. There was a great answer in the thread from the UniProt folks themselves, along with input from others. I thought the discussion was fascinating. Go have a look at the outcome.
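If you are facing a similar list of protein names, a first organizing step might look like the minimal sketch below. It only splits names of the form quoted in the question into chain type, V subgroup, and protein label, and groups them; it does not look up gene names. The regular expression is an assumption inferred solely from the names listed above, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred only from the names quoted in the question;
# names in other formats will simply fail to match.
NAME_RE = re.compile(
    r"^(?:Cluster of )?Ig (heavy|kappa|lambda) chain "
    r"V-([IVX]+) region (\S+)(?: \(Fragment\))?$"
)


def parse_ig_name(name):
    """Split one protein name into chain type, V subgroup, and label.

    Returns None when the name does not follow the assumed pattern.
    """
    match = NAME_RE.match(name.strip())
    if match is None:
        return None
    chain, subgroup, label = match.groups()
    return {"chain": chain, "subgroup": subgroup, "label": label}


def group_by_chain_and_subgroup(names):
    """Group protein labels under (chain, subgroup) keys, skipping non-matches."""
    groups = defaultdict(list)
    for name in names:
        parsed = parse_ig_name(name)
        if parsed is not None:
            groups[(parsed["chain"], parsed["subgroup"])].append(parsed["label"])
    return dict(groups)


if __name__ == "__main__":
    example_names = [
        "Cluster of Ig heavy chain V-II region SESS",
        "Ig heavy chain V-II region WAH",
        "Ig kappa chain V-III region VG (Fragment)",
    ]
    for key, labels in group_by_chain_and_subgroup(example_names).items():
        print(key, labels)
```

Grouping by chain and subgroup at least shows which entries belong to the same family before you read the explanation in the thread about the missing gene names.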