PDB File

This module defines functions for parsing and writing PDB files.

parsePDBStream(stream, **kwargs)

Return an AtomGroup and/or dictionary containing header data parsed from a stream of PDB lines.

Parameters:

stream – Anything that implements the method readlines (e.g. file, buffer, stdin)
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
ag (AtomGroup) – AtomGroup instance for storing data parsed from the PDB file. The number of atoms in ag must equal the number of atoms parsed from the PDB file, and the atoms must be in the same order. Non-coordinate data stored in ag will be overwritten with the data parsed from the file.
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
subset (str) – a predefined keyword for parsing a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
header (bool) – if True, PDB header content will be parsed and returned
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations will be parsed, as the single coordinate set of the AtomGroup. If altloc is set to True, all alternate locations will be parsed and each will be appended as a distinct coordinate set. Default is "A".
biomol (bool) – if True, the biomolecule obtained by transforming the coordinates using information from the header section will be returned, default is False
secondary (bool) – if True, secondary structure information from the header section will be assigned to atoms, default is False

If model=0 and header=True, return header dictionary only.

Note that this function does not evaluate CONECT records.
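As a minimal sketch of the stream interface described above, the following builds a small in-memory stream with io.StringIO and passes it to parsePDBStream(). The helper names atom_line, make_stream, and parse_calpha are hypothetical, and the coordinates are made up for illustration. The ProDy import is deferred so the helpers themselves run without ProDy installed.

```python
import io


def atom_line(serial, name, resname, chain, resnum, x, y, z, element):
    """Format one fixed-width PDB ATOM record.

    Hypothetical helper: assumes atom names of at most three characters
    and fills in occupancy 1.00 and B-factor 0.00.
    """
    return (
        f"ATOM  {serial:5d}  {name:<3s} {resname:>3s} {chain:1s}{resnum:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2s}\n"
    )


def make_stream():
    """Return a file-like object that implements readlines()."""
    records = [
        # Illustrative coordinates only, not taken from a real structure.
        atom_line(1, "N", "ALA", "A", 1, 11.104, 6.134, -6.504, "N"),
        atom_line(2, "CA", "ALA", "A", 1, 11.639, 6.071, -5.147, "C"),
        atom_line(3, "C", "ALA", "A", 1, 13.151, 6.231, -5.158, "C"),
        atom_line(4, "CA", "GLY", "B", 1, 14.560, 7.310, -3.920, "C"),
        "END\n",
    ]
    return io.StringIO("".join(records))


def parse_calpha(stream, chain=None):
    """Parse only alpha carbons from the stream, optionally for given chains."""
    from prody import parsePDBStream  # deferred import: requires ProDy

    return parsePDBStream(stream, title="example", subset="ca", chain=chain)
```

With ProDy installed, parse_calpha(make_stream()) should return an AtomGroup holding only the alpha carbons, and parse_calpha(make_stream(), chain='A') should restrict the result to chain A.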

parsePDB(pdb, **kwargs)

Return an AtomGroup and/or dictionary containing header data parsed from a PDB file.

This function extends parsePDBStream().

See Parse PDB files for a detailed usage example.

Parameters:

pdb – a PDB identifier or a filename. If needed, PDB files are downloaded using the fetchPDB() function.
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
ag (AtomGroup) – AtomGroup instance for storing data parsed from the PDB file. The number of atoms in ag must equal the number of atoms parsed from the PDB file, and the atoms must be in the same order. Non-coordinate data stored in ag will be overwritten with the data parsed from the file.
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
subset (str) – a predefined keyword for parsing a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
header (bool) – if True, PDB header content will be parsed and returned
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations will be parsed, as the single coordinate set of the AtomGroup. If altloc is set to True, all alternate locations will be parsed and each will be appended as a distinct coordinate set. Default is "A".
biomol (bool) – if True, the biomolecule obtained by transforming the coordinates using information from the header section will be returned, default is False
secondary (bool) – if True, secondary structure information from the header section will be assigned to atoms, default is False

If model=0 and header=True, return header dictionary only.

Note that this function does not evaluate CONECT records.
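The keyword arguments above can be combined in a single call. The following is a hedged sketch with hypothetical wrapper names. The import is deferred, so nothing is parsed or downloaded until a wrapper is called. The unpacking in load_with_header assumes that header=True returns the AtomGroup and the header dictionary as a pair.

```python
def load_chain_backbone(pdb, chain="A"):
    """Parse only the backbone atoms of the given chain(s)."""
    from prody import parsePDB  # deferred import: requires ProDy

    return parsePDB(pdb, chain=chain, subset="bb")


def load_with_header(pdb):
    """Parse atoms and header together.

    Assumption: with header=True the call returns (atoms, header).
    """
    from prody import parsePDB

    atoms, header = parsePDB(pdb, header=True)
    return atoms, header


def load_header_only(pdb):
    """Skip coordinates: model=0 with header=True returns the header only."""
    from prody import parsePDB

    return parsePDB(pdb, model=0, header=True)
```

Each wrapper accepts either a PDB identifier or a filename, as described for the pdb parameter.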

parsePQR(filename, **kwargs)

Return an AtomGroup containing data parsed from a PQR file.

Parameters:

filename (str) – a PQR filename
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
ag (AtomGroup) – AtomGroup instance for storing data parsed from the file. The number of atoms in ag must equal the number of atoms parsed from the file, and the atoms must be in the same order. Non-coordinate data stored in ag will be overwritten with the data parsed from the file.
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
subset (str) – a predefined keyword for parsing a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'

writePDBStream(stream, atoms, csets=None, **kwargs)

Write atoms in PDB format to a stream.

Parameters:

stream – anything that implements a `write()` method (e.g. file, buffer, stdout)
atoms – an object with atom and coordinate data
csets – coordinate set indices, default is all coordinate sets
beta – a list or array of numbers to be written to the beta column
occupancy – a list or array of numbers to be written to the occupancy column
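Because the stream only needs a write() method, an in-memory buffer works as a target. A minimal sketch follows, assuming atoms is an object previously returned by one of the parsing functions. The name atoms_to_pdb_text is hypothetical.

```python
import io


def atoms_to_pdb_text(atoms, csets=None):
    """Return the PDB-format text for atoms as a single string."""
    from prody import writePDBStream  # deferred import: requires ProDy

    buffer = io.StringIO()  # any object with a write() method would do
    writePDBStream(buffer, atoms, csets=csets)
    return buffer.getvalue()
```

Passing sys.stdout instead of the buffer would print the records directly.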

writePDB(filename, atoms, csets=None, autoext=True, **kwargs)

Write atoms in PDB format to a file with name filename and return filename. If filename ends with .gz, a compressed file will be written.

Parameters:

atoms – an object with atom and coordinate data
csets – coordinate set indices, default is all coordinate sets
beta – a list or array of numbers to be written to the beta column
occupancy – a list or array of numbers to be written to the occupancy column
autoext – if True, append the extension `.pdb` to filename when it is not already present
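A short sketch of writing a compressed file with a custom beta column. The function name save_compressed and the choice of zero-valued betas are illustrative assumptions. It also assumes that atoms provides numAtoms(), as an AtomGroup does.

```python
def save_compressed(atoms, basename):
    """Write atoms to '<basename>.pdb.gz' and return the filename."""
    from prody import writePDB  # deferred import: requires ProDy

    # One beta value per atom; zeros are placeholders for illustration.
    flat_beta = [0.0] * atoms.numAtoms()
    # The .gz suffix requests a compressed file, per the description above.
    return writePDB(basename + ".pdb.gz", atoms, beta=flat_beta)
```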