AbGenBank - Documentation
AbGenBank curates non-structural antibody depositions from GenBank. We analyze translations of the sequences in GenBank by sequence-aligning them to antibody germline sequences from multiple species. Details of how you can interact with the database are given below.
Database Contents
AbGenBank curates the antibodies found in the sequence entries reported in GenBank translations. For each antibody sequence we record the following values:
Antibody sequence and CDRs: We record the original sequence entry, the identified variable region and the IMGT-numbered variable region.
Title and Description: These entries usually reveal the purpose of an antibody (e.g.
Organism: Not all organisms have a plentiful set of germlines, so specific organism annotations help determine the correct species.
References: Links to journal publications, if available.
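As a rough illustration of the fields listed above, one record could be modelled as follows. This is only a sketch: the class name, the field names and the example values are assumptions made for illustration and do not describe the actual AbGenBank schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AbGenBankEntry:
    """Hypothetical in-memory view of one AbGenBank record (not the real schema)."""
    accession: str                    # accession of the GenBank deposition
    original_sequence: str            # the original sequence entry
    variable_region: str              # the identified variable region
    imgt_numbering: Dict[str, str]    # IMGT position -> residue
    title: str = ""
    description: str = ""
    organism: Optional[str] = None    # may be missing for some depositions
    references: List[str] = field(default_factory=list)  # publication links


# Made-up values, for illustration only.
entry = AbGenBankEntry(
    accession="EXAMPLE0001",
    original_sequence="MEVQLVESGG",
    variable_region="EVQLVESGG",
    imgt_numbering={"1": "E", "2": "V", "3": "Q"},
    organism="Homo sapiens",
)
```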
Sequence Search
We facilitate interaction with the data in AbGenBank by offering a sequence-search functionality, available here. It is used to identify similar full variable region chains that might have been published. Below we describe the input and output pages and how to use them.
Input: The user is asked to provide the sequence of an antibody heavy or light chain (an example is also provided). Upon clicking Search, the system aligns the input sequence to the closest germline genes and afterwards performs IMGT sequence alignment to the sequences in AbGenBank with the same sequence. Up to the top 100 matches are returned.
Output: The results are presented in an interactive table. Users can apply various filters to constrain the results shown in the table, either by sequence identity to the entire variable region or by sequence identity to a particular CDR. Individual results can be examined by clicking on the accessions.
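The identity filters described above can be pictured with a short sketch. It assumes that both sequences are already IMGT-numbered and that identity is the fraction of occupied IMGT positions carrying the same residue; the function names, the dictionary layout and the exact identity definition are assumptions, not the service's actual implementation.

```python
from typing import Dict, List


def imgt_identity(query: Dict[str, str], hit: Dict[str, str]) -> float:
    """Fraction of IMGT positions (occupied in either sequence) with equal residues."""
    positions = set(query) | set(hit)
    if not positions:
        return 0.0
    same = sum(1 for p in positions if query.get(p) == hit.get(p))
    return same / len(positions)


def filter_by_identity(hits: List[dict], key: str, minimum: float) -> List[dict]:
    """Keep hits whose identity under `key` (e.g. 'variable' or 'H3') is >= minimum."""
    return [h for h in hits if h[key] >= minimum]


# Toy IMGT-numbered fragments (made-up data).
query = {"1": "E", "2": "V", "3": "Q", "4": "L"}
hit = {"1": "E", "2": "V", "3": "K", "4": "L"}
identity = imgt_identity(query, hit)  # 3 of 4 positions agree

hits = [
    {"accession": "EXAMPLE0001", "variable": 0.95, "H3": 0.60},
    {"accession": "EXAMPLE0002", "variable": 0.80, "H3": 1.00},
]
strict_h3 = filter_by_identity(hits, key="H3", minimum=0.9)
```

Comparing by IMGT position rather than by raw string offset means that insertions and deletions count as mismatches at the affected positions instead of shifting every residue after them.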
CDR Search
Since antibody binding is determined by the CDRs, we allow users to perform searches on this region alone, available here. It is used to identify antibodies with similar or identical CDRs, chiefly for shared specificity. Below we describe the input and output pages and how to use them.
Input: The user is asked to provide the sequence of an IMGT CDR and to specify which CDR it is, e.g. H1 (an example is also provided). Upon clicking Search, the system aligns the input CDR to IMGT CDRs of the same length. Up to the top 100 matches are returned.
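A same-length CDR comparison of this kind can be sketched as below. The sketch assumes position-by-position identity and a simple sort by that score; the function names and the toy CDR strings are made up, and the real service may score and rank differently.

```python
from typing import List, Tuple


def cdr_identity(a: str, b: str) -> float:
    """Fraction of identical residues between two CDRs of equal length."""
    if len(a) != len(b):
        raise ValueError("CDRs must have the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def search_cdrs(query: str, database: List[str], limit: int = 100) -> List[Tuple[str, float]]:
    """Rank same-length CDRs by identity to the query and keep the top `limit`."""
    same_length = [c for c in database if len(c) == len(query)]
    scored = [(c, cdr_identity(query, c)) for c in same_length]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy CDR strings (made-up data); the 5-residue entry is skipped by length.
database = ["GFTFSDYA", "GYTFTSYW", "GFTFSSYA", "GFTFS"]
results = search_cdrs("GFTFSSYA", database)
```

Restricting the comparison to CDRs of equal length removes the need for gapped alignment, so every residue can be compared directly against its counterpart.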
Output: The results are presented in an interactive table showing the matching CDR sequence, its sequence identity and the accessions of full sequences harboring such CDRs. Individual results can be examined by clicking on the accessions.
Text Search
Authors of GenBank depositions often describe the purpose of an antibody by revealing its target. To facilitate identifying such antibodies, we allow text searches in the titles and descriptions of AbGenBank entries, available here.
Input: The user is asked to provide keywords, such as
Output: The results are presented in an interactive table showing the accessions of the entries together with the matching text fields. Individual results can be examined by clicking on the accessions.
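A keyword lookup over titles and descriptions might look like the sketch below. It assumes case-insensitive substring matching and that every keyword must occur somewhere in the title or description; the function name, the record layout, the example keyword and the example entries are all hypothetical.

```python
from typing import Dict, List


def text_search(entries: List[Dict[str, str]], keywords: List[str]) -> List[dict]:
    """Return accessions whose title/description contain all keywords (case-insensitive)."""
    terms = [k.strip().lower() for k in keywords if k.strip()]
    results: List[dict] = []
    if not terms:
        return results
    for entry in entries:
        fields = {name: entry.get(name, "") for name in ("title", "description")}
        combined = " ".join(fields.values()).lower()
        if all(term in combined for term in terms):
            # Report which text fields contain at least one keyword.
            matching = [
                name for name, text in fields.items()
                if any(term in text.lower() for term in terms)
            ]
            results.append({"accession": entry["accession"], "matching_fields": matching})
    return results


# Made-up entries, for illustration only.
entries = [
    {"accession": "EXAMPLE0001",
     "title": "Anti-lysozyme antibody heavy chain",
     "description": "variable region"},
    {"accession": "EXAMPLE0002",
     "title": "Immunoglobulin light chain",
     "description": "isolated from B cells"},
]
found = text_search(entries, ["Lysozyme"])
```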
License
AbGenBank and all its contents were developed by NaturalAntibody and are released under the CC-BY-NC license; in lay terms, reuse is permitted so long as it is non-commercial. If you would like to use these data for commercial purposes or install the database in-house, please contact konrad@naturalantibody.com for details.