Pfam Access Functions
This module defines functions for interfacing with the Pfam database.
-
searchPfam(query, search_b=False, skip_a=False, **kwargs)
Return Pfam search results as a dictionary. Matching Pfam accessions
are the keys; each maps to the e-value and the alignment start and end
residue positions.
Parameters: |
- query (str) – UniProt ID, PDB identifier, protein sequence, or a sequence
file; sequence queries must not contain gaps and must be at
least 16 characters long
- search_b (bool) – search Pfam-B families when True
- skip_a (bool) – do not search Pfam-A families when True
- ga (bool) – use gathering threshold when True
- evalue (float) – user-specified e-value cutoff, must be smaller than 10.0
- timeout (int) – timeout for blocking connection attempt in seconds, default
is 60
|
query can also be a PDB identifier, e.g. '1mkp', or '1mkpA' with a
chain identifier. The UniProt ID of the specified chain, or of the first
protein chain when no chain is given, is used to search the Pfam database.
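The limits on sequence queries and on evalue can be checked locally before a request is sent. The following is a minimal sketch, not part of this module: check_sequence_query, search_sequence and GAP_CHARACTERS are hypothetical names, and it assumes that gaps are written as dashes or dots.

```python
GAP_CHARACTERS = "-."  # assumption: gaps are written as dashes or dots


def check_sequence_query(sequence, min_length=16):
    """Raise ValueError if a raw sequence breaks the documented limits."""
    if any(gap in sequence for gap in GAP_CHARACTERS):
        raise ValueError("sequence query must not contain gaps")
    if len(sequence) < min_length:
        raise ValueError(
            "sequence query must be at least %d characters long" % min_length
        )
    return sequence


def search_sequence(sequence, **kwargs):
    """Validate a raw sequence and its evalue cutoff, then call searchPfam."""
    evalue = kwargs.get("evalue")
    if evalue is not None and not evalue < 10.0:
        raise ValueError("evalue must be smaller than 10.0")
    check_sequence_query(sequence)
    # Imported lazily so the checks above work without ProDy installed.
    from prody import searchPfam
    return searchPfam(sequence, **kwargs)
```

Calling search_sequence(sequence, evalue=1e-3, timeout=30) would then forward the keyword arguments unchanged to searchPfam, which needs ProDy and network access. UniProt IDs, PDB identifiers and sequence files are not raw sequences and should be passed to searchPfam directly.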
-
fetchPfamMSA(acc, alignment='full', compressed=False, **kwargs)
Return a path to the downloaded Pfam MSA file.
Parameters: |
- acc (str) – Pfam ID or Accession Code
- alignment – alignment type, one of 'full' (default), 'seed',
'ncbi', 'metagenomics', 'rp15', 'rp35', 'rp55',
or 'rp75' where rp stands for representative proteomes
- compressed – gzip the downloaded MSA file, default is False
|
Alignment Options
Parameters: |
- format – a Pfam supported MSA file format, one of 'selex'
(default), 'stockholm' or 'fasta'
- order – ordering of sequences, 'tree' (default) or
'alphabetical'
- inserts – letter case for inserts, 'upper' (default) or 'lower'
- gaps – gap character, one of 'dashes' (default), 'dots',
'mixed' or None for unaligned
|
Other Options
Parameters: |
- timeout – timeout for blocking connection attempt in seconds, default
is 60
- outname – output filename, default is built from the input as 'acc_alignment.format'
- folder – output folder, default is '.'
|
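Because most of these options accept only a fixed set of values, it can help to reject a mistyped value before a download is attempted. The sketch below is an illustration rather than part of this module: DOCUMENTED_CHOICES, check_msa_options and fetch_checked_msa are hypothetical names, and the allowed values are copied from the parameter lists above.

```python
# Allowed values, copied from the documented parameter lists.
DOCUMENTED_CHOICES = {
    "alignment": ("full", "seed", "ncbi", "metagenomics",
                  "rp15", "rp35", "rp55", "rp75"),
    "format": ("selex", "stockholm", "fasta"),
    "order": ("tree", "alphabetical"),
    "inserts": ("upper", "lower"),
    "gaps": ("dashes", "dots", "mixed", None),
}


def check_msa_options(**options):
    """Raise ValueError for any option whose value is not a documented choice."""
    for name, value in options.items():
        choices = DOCUMENTED_CHOICES.get(name)
        # Free-form options (timeout, outname, folder) are not checked here.
        if choices is not None and value not in choices:
            raise ValueError("%s=%r is not one of %r" % (name, value, choices))
    return options


def fetch_checked_msa(acc, alignment="full", compressed=False, **kwargs):
    """Validate the options, then download the MSA with fetchPfamMSA."""
    check_msa_options(alignment=alignment, **kwargs)
    # Imported lazily so the checks above work without ProDy installed.
    from prody import fetchPfamMSA
    return fetchPfamMSA(acc, alignment=alignment, compressed=compressed,
                        **kwargs)
```

For example, fetch_checked_msa('PF00017', alignment='seed', format='fasta', folder='.') would return the path reported by fetchPfamMSA, where 'PF00017' is used only as an illustrative accession. The download itself needs ProDy and network access.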