PDB Blast Search
This module defines functions for BLAST searching the Protein Data Bank.
-
class PDBBlastRecord(xml, sequence=None)
A class to store results from a Protein Data Bank BLAST search.
Instantiate a PDBBlastRecord instance.
Parameters:
- xml (str) – BLAST search results in XML format or an XML file that
contains the results
- sequence (str) – query sequence
-
getBest()
Return a dictionary containing structure and alignment information
for the hit with the highest sequence identity.
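As a rough illustration of what "the hit with the highest sequence identity" means, the sketch below picks the best entry from a small hand-made list. The identifiers, values, and dictionary keys (`pdb_id`, `percent_identity`) are made up for illustration and are not necessarily the keys that getBest() returns.

```python
# Hypothetical hits; identifiers, keys, and values are illustrative only.
hits = [
    {"pdb_id": "id_a", "percent_identity": 92.5},
    {"pdb_id": "id_b", "percent_identity": 99.1},
    {"pdb_id": "id_c", "percent_identity": 71.0},
]


def pick_best(hits):
    """Return the hit with the highest percent identity, or None if empty."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit["percent_identity"])


best = pick_best(hits)
print(best["pdb_id"])
```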
-
getHits(percent_identity=90.0, percent_overlap=70.0, chain=False)
Return a dictionary in which PDB identifiers are mapped to structure
and alignment information.
Parameters:
- percent_identity (float) – PDB hits with percent sequence identity equal
to or higher than this value will be returned; default is 90.0
- percent_overlap (float) – PDB hits whose percent coverage of the query
sequence is equal to or higher than this value will be returned; default is 70.0
- chain (bool) – if True, individual chains in a PDB file
will be considered as separate hits; default is False
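The two thresholds can be pictured with a small filter over hand-made data. This is only a sketch of the idea, assuming a hit must meet both thresholds to be returned. The function `filter_hits`, the identifiers, and the dictionary keys are hypothetical and do not reflect how getHits() is implemented.

```python
# Hypothetical hits; identifiers, keys, and values are illustrative only.
hits = [
    {"pdb_id": "id_a", "percent_identity": 95.0, "percent_overlap": 88.0},
    {"pdb_id": "id_b", "percent_identity": 90.0, "percent_overlap": 70.0},
    {"pdb_id": "id_c", "percent_identity": 99.0, "percent_overlap": 40.0},
    {"pdb_id": "id_d", "percent_identity": 60.0, "percent_overlap": 95.0},
]


def filter_hits(hits, percent_identity=90.0, percent_overlap=70.0):
    """Map identifiers to hits that are at or above both thresholds."""
    selected = {}
    for hit in hits:
        if (hit["percent_identity"] >= percent_identity
                and hit["percent_overlap"] >= percent_overlap):
            selected[hit["pdb_id"]] = hit
    return selected


default_hits = filter_hits(hits)
relaxed_hits = filter_hits(hits, percent_identity=50.0, percent_overlap=40.0)
print(sorted(default_hits))
print(sorted(relaxed_hits))
```

With the defaults, a hit sitting exactly on both thresholds (`id_b` here) is kept, because the comparison is "equal to or higher".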
-
getParameters()
Return the parameters used in the BLAST search.
-
getSequence()
Return the query sequence that was used in the search.
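A typical session might combine these methods as sketched below. This assumes ProDy is installed, that blastPDB can be imported from the top-level `prody` package, and that the NCBI service is reachable. The helper name `summarize_search` is hypothetical, and the import is placed inside the function so that nothing touches the network until it is called.

```python
def summarize_search(sequence):
    """Run a BLAST search against the PDB and print a short summary.

    Requires ProDy and network access; nothing runs until this is called.
    """
    # Imported here so this module can be loaded without ProDy installed.
    from prody import blastPDB

    record = blastPDB(sequence)
    print("Query:", record.getSequence())
    print("Parameters:", record.getParameters())

    # Defaults written out explicitly to show what can be adjusted.
    hits = record.getHits(percent_identity=90.0, percent_overlap=70.0)
    print("Hits:", sorted(hits))
    print("Best:", record.getBest())
    return record
```

Call `summarize_search` with a single-letter amino acid sequence string to run the search.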
-
blastPDB(sequence, filename=None, **kwargs)
Return a PDBBlastRecord instance that contains results from a
BLAST search of Protein Data Bank sequences using NCBI blastp.
Parameters:
- sequence (str) – single-letter code amino acid sequence of the protein
without any gap characters; all white spaces will be removed
- filename (str) – a filename to save the results in XML format
The hitlist_size (default is 250) and expect (default is 1e-10)
search parameters can be adjusted by the user. The sleep keyword argument
(default is 2 seconds) determines how long to wait before reconnecting for
results; the sleep time is doubled each time results are not ready. The
timeout keyword argument (default is 120 seconds) determines when to give
up waiting for the results.
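The waiting behaviour described above (start at sleep, double after each unsuccessful check, stop at timeout) can be sketched as a small polling loop. This is not the library's code. The names `wait_for_results`, `check_ready`, and `sleep_fn` are hypothetical, and the bookkeeping (waiting before the first check, summing requested sleeps instead of measuring wall-clock time) is an assumption.

```python
import time


def wait_for_results(check_ready, sleep=2.0, timeout=120.0,
                     sleep_fn=time.sleep):
    """Poll check_ready() until it returns True or the timeout is reached.

    Returns (ready, waits), where waits lists each sleep interval used.
    """
    waited = 0.0
    waits = []
    while waited < timeout:
        sleep_fn(sleep)
        waited += sleep
        waits.append(sleep)
        if check_ready():
            return True, waits
        sleep *= 2  # results not ready yet: double the next wait
    return False, waits


# Demo with a fake service that is ready on the third check; the no-op
# sleep_fn keeps the example instantaneous.
answers = iter([False, False, True])
ready, waits = wait_for_results(lambda: next(answers),
                                sleep_fn=lambda seconds: None)
print(ready, waits)

# A service that is never ready gives up once the summed waits pass timeout.
gave_up, all_waits = wait_for_results(lambda: False,
                                      sleep_fn=lambda seconds: None)
print(gave_up, all_waits)
```

Doubling keeps the number of requests small for slow searches while still returning quickly when results are ready early.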