ProDy Manual¶
This is a partial copy of ProDy documentation. Please visit ProDy Homepage for complete documentation with tutorials.
Installation¶
Required Software¶
Python 2.6, 2.7, 3.2 or later
Windows: You need to use 32-bit Python on Windows to be able to install NumPy and ProDy.
NumPy 1.7 or later
When compiling from source, on Linux for example, you will need a C compiler (e.g. gcc) and Python developer libraries (i.e. Python.h).
If you don’t have Python developer libraries installed on your machine, use your package manager to install the python-dev package.
In addition, matplotlib is required for using plotting functions. ProDy, ProDy Applications, and Evol Applications can be operated without this package.
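A quick way to see whether these packages are visible to your Python interpreter is sketched below. This is only an illustration: the helper name check_requirements is hypothetical, and it tests whether a module can be found, not which version is installed.

```python
import importlib.util
import sys


def check_requirements(modules=("numpy", "matplotlib")):
    """Map each module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


# Report the interpreter version and the availability of each package.
print("Python", ".".join(str(part) for part in sys.version_info[:3]))
for name, found in check_requirements().items():
    print(name, "found" if found else "missing")
```

A missing matplotlib only disables plotting functions; a missing NumPy means ProDy cannot be installed or imported.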
Quick Install¶
If you have pip installed, type the following:
pip install -U ProDy
If you don’t have pip, please download an installation file and follow the instructions.
Download & Install¶
After installing the required packages, you will need to download a suitable ProDy source or installation file from http://python.org/pypi/ProDy. For changes and a list of new features, see Release Notes.
Linux
Download ProDy-x.y.z.tar.gz. Extract tarball contents and run setup.py as follows:
$ tar -xzf ProDy-x.y.z.tar.gz
$ cd ProDy-x.y.z
$ python setup.py build
$ python setup.py install
If you need root access for installation, try sudo python setup.py install.
If you don’t have root access, please consult alternate and custom installation
schemes in Installing Python Modules.
Mac OS
For installing ProDy, please follow the Linux installation instructions.
Windows
Remove any previously installed ProDy release using Uninstall a program in Control Panel.
Download ProDy-0.x.y.win32-py2.z.exe and run it to install ProDy.
To be able to use ProDy Applications and Evol Applications in command prompt (cmd.exe), append the Python and scripts folders (e.g. C:\Python27 and C:\Python27\Scripts) to the PATH environment variable.
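After adjusting PATH, you may want to confirm that the command line programs can be found. The sketch below uses only the Python standard library; the helper name find_commands is hypothetical, and it assumes the applications are installed under the names prody and evol, as used throughout this manual.

```python
import shutil


def find_commands(names=("prody", "evol")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Print where each command was found, if anywhere.
for name, path in find_commands().items():
    print(name, "->", path or "not found on PATH")
```

If a command is reported as not found, open a new command prompt after editing PATH, since running sessions keep the old value.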
Recommended Software¶
- SciPy, when installed, replaces the linear algebra module of NumPy. The SciPy linear algebra module is more flexible and can be faster.
- IPython is a must have for interactive ProDy sessions.
- PyReadline for colorful IPython sessions on Windows.
- MDAnalysis for reading molecular dynamics trajectories.
Included in ProDy¶
Following software is included in the ProDy installation packages:
Source Code¶
Source code is available at https://github.com/prody/ProDy.
Applications¶
ProDy comes with two sets of applications that automate structural dynamics and sequence coevolution analysis:
ProDy Applications¶
ProDy applications are command line programs that automate structure processing and structural dynamics analysis:
prody align¶
Usage¶
Running prody align -h displays:
usage: prody align [-h] [--quiet] [--examples] [-s SEL] [-m INT] [-i INT]
[-o INT] [-p STR] [-x STR]
pdb [pdb ...]
positional arguments:
pdb PDB identifier(s) or filename(s)
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
atom/model selection:
-s SEL, --select SEL reference structure atom selection (default: calpha)
-m INT, --model INT for NMR files, reference model index (default: 1)
chain matching options:
-i INT, --seqid INT percent sequence identity (default: 90)
-o INT, --overlap INT
percent sequence overlap (default: 90)
output options:
-p STR, --prefix STR output filename prefix (default: PDB filename)
-x STR, --suffix STR output filename suffix (default: _aligned)
Examples¶
Running prody align --examples displays:
Align models in a PDB structure or multiple PDB structures and save
aligned coordinate sets. When multiple structures are aligned, ProDy
will match chains based on sequence alignment and use best match for
aligning the structures.
Fetch PDB structure 2k39 and align models (reference model is the
first model):
$ prody align 2k39
Fetch PDB structure 2k39 and align models using backbone of residues
with number less than 71:
$ prody align 2k39 --select "backbone and resnum < 71"
Align 1r39 and 1zz2 onto 1p38 using residues with number less than
300:
$ prody align --select "resnum < 300" 1p38 1r39 1zz2
Align all models of 2k39 onto 1aar using residues 1 to 70 (inclusive):
$ prody align --select "resnum 1 to 70" 1aar 2k39
Align 1fi7 onto 1hrc using heme atoms:
$ prody align --select "noh heme and chain A" 1hrc 1fi7
prody anm¶
Usage¶
Running prody anm -h displays:
usage: prody anm [-h] [--quiet] [--examples] [-n INT] [-s SEL] [-c FLOAT]
[-g FLOAT] [-m INT] [-a] [-o PATH] [-e] [-r] [-u] [-q] [-v]
[-z] [-t STR] [-b] [-l] [-k] [-p STR] [-f STR] [-d STR]
[-x STR] [-A] [-R] [-Q] [-B] [-K] [-F STR] [-D INT]
[-W FLOAT] [-H FLOAT]
pdb
positional arguments:
pdb PDB identifier or filename
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
parameters:
-n INT, --number-of-modes INT
number of non-zero eigenvectors (modes) to calculate
(default: 10)
-s SEL, --select SEL atom selection (default: "protein and name CA or
nucleic and name P C4' C2")
-c FLOAT, --cutoff FLOAT
cutoff distance (A) (default: 15.0)
-g FLOAT, --gamma FLOAT
spring constant (default: 1.0)
-m INT, --model INT index of model that will be used in the calculations
output:
-a, --all-output write all outputs
-o PATH, --output-dir PATH
output directory (default: .)
-e, --eigenvs write eigenvalues/vectors
-r, --cross-correlations
write cross-correlations
-u, --heatmap write cross-correlations heatmap file
-q, --square-fluctuations
write square-fluctuations
-v, --covariance write covariance matrix
-z, --npz write compressed ProDy data file
-t STR, --extend STR write NMD file for the model extended to "backbone"
("bb") or "all" atoms of the residue, model must have
one node per residue
-b, --beta-factors write beta-factors calculated from GNM modes
-l, --hessian write Hessian matrix
-k, --kirchhoff write Kirchhoff matrix
output options:
-p STR, --file-prefix STR
output file prefix (default: pdb_anm)
-f STR, --number-format STR
number output format (default: %12g)
-d STR, --delimiter STR
number delimiter (default: " ")
-x STR, --extension STR
numeric file extension (default: .txt)
figures:
-A, --all-figures save all figures
-R, --cross-correlations-figure
save cross-correlations figure
-Q, --square-fluctuations-figure
save square-fluctuations figure
-B, --beta-factors-figure
save beta-factors figure
-K, --contact-map save contact map (Kirchhoff matrix) figure
figure options:
-F STR, --figure-format STR
pdf (default: pdf)
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8.0)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6.0)
Examples¶
Running prody anm --examples displays:
Perform ANM calculations for given PDB structure and output results in
NMD format. If an identifier is passed, structure file will be
downloaded from the PDB FTP server.
Fetch PDB 1p38, run ANM calculations using default parameters, and
write NMD file:
$ prody anm 1p38
Fetch PDB 1aar, run ANM calculations using default parameters for
chain A carbon alpha atoms with residue numbers less than 70, and save
all of the graphical output files:
$ prody anm 1aar -s "calpha and chain A and resnum < 70" -A
prody biomol¶
Usage¶
Running prody biomol -h displays:
usage: prody biomol [-h] [--quiet] [--examples] [-p STR] [-b INT] pdb
positional arguments:
pdb PDB identifier or filename
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
-p STR, --prefix STR prefix for output files (default: pdb_biomol_)
-b INT, --biomol INT index of the biomolecule, by default all are generated
Examples¶
Running prody biomol --examples displays:
Generate biomolecule coordinates:
$ prody biomol 2bfu
prody blast¶
Usage¶
Running prody blast -h displays:
usage: prody blast [-h] [--quiet] [--examples] [-i FLOAT] [-o FLOAT] [-d PATH]
[-z] [-f STR] [-e FLOAT] [-l INT] [-s INT] [-t INT]
sequence
positional arguments:
sequence sequence or file in fasta format
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
-i FLOAT, --identity FLOAT
percent sequence identity (default: 90.0)
-o FLOAT, --overlap FLOAT
percent sequence overlap (default: 90.0)
-d PATH, --output-dir PATH
download uncompressed PDB files to given directory
-z, --gzip write compressed PDB file
Blast Parameters:
-f STR, --filename STR
a filename to save the results in XML format
-e FLOAT, --expect FLOAT
blast search parameter
-l INT, --hit-list-size INT
blast search parameter
-s INT, --sleep-time INT
how long to wait to reconnect for results (sleep time
is doubled when results are not ready)
-t INT, --timeout INT
when to give up waiting for results
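The help text for --sleep-time notes that the wait is doubled each time results are not ready. The following sketch shows what such a schedule could look like. It is not ProDy's implementation: the function name is hypothetical, and treating --timeout as a cap on the total waiting time is an assumption.

```python
def sleep_schedule(sleep_time, timeout):
    """Return successive waits, doubling each time, while the total stays within timeout."""
    waits = []
    total = 0
    wait = sleep_time
    while total + wait <= timeout:
        waits.append(wait)
        total += wait
        wait *= 2  # double the wait when results are not ready yet
    return waits


print(sleep_schedule(2, 30))  # [2, 4, 8, 16]
```

Doubling keeps the number of reconnection attempts small when a search takes a long time.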
Examples¶
Running prody blast --examples displays:
Blast search PDB for the first sequence in a fasta file:
$ prody blast seq.fasta -i 70
Blast search PDB for the sequence argument:
$ prody blast MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
Blast search PDB for avidin structures, download files, and align all
files onto the 2avi structure:
$ prody blast -d . ARKCSLTGKWTNDLGSNMTIGAVNSRGEFTGTYITAVTATSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFSESTTVFT
$ prody align 2avi.pdb *pdb
prody catdcd¶
Usage¶
Running prody catdcd -h displays:
usage: prody catdcd [-h] [--quiet] [--examples] [-s SEL] [-o FILE] [-n]
[--psf PSF] [--pdb PDB] [--first INT] [--last INT]
[--stride INT] [--align SEL]
dcd [dcd ...]
positional arguments:
dcd DCD filename(s) (all must have same number of atoms)
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
-s SEL, --select SEL atom selection (default: all)
-o FILE, --output FILE
output filename (default: trajectory.dcd)
-n, --num print the number of frames in each file and exit
--psf PSF PSF filename (must have same number of atoms as DCDs)
--pdb PDB PDB filename (must have same number of atoms as DCDs)
--first INT index of the first output frame, default: 0
--last INT index of the last output frame, default: -1
--stride INT number of steps between output frames, default: 1
--align SEL atom selection for aligning frames, a PSF or PDB file
must be provided, if a PDB is provided frames will be
superposed onto PDB coordinates
Examples¶
Running prody catdcd --examples displays:
Concatenate two DCD files and output all atmos:
$ prody catdcd mdm2.dcd mdm2sim2.dcd
Concatenate two DCD files and output backbone atoms:
$ prody catdcd mdm2.dcd mdm2sim2.dcd --pdb mdm2.pdb -s bb
prody contacts¶
Usage¶
Running prody contacts -h displays:
usage: prody contacts [-h] [--quiet] [--examples] [-s SELSTR] [-r FLOAT]
[-t STR] [-p STR] [-x STR]
target ligand [ligand ...]
positional arguments:
target target PDB identifier or filename
ligand ligand PDB identifier(s) or filename(s)
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
-s SELSTR, --select SELSTR
selection string for target
-r FLOAT, --radius FLOAT
contact radius (default: 4.0)
-t STR, --extend STR output same residue, chain, or segment as contacting
atoms
-p STR, --prefix STR output filename prefix (default: target filename)
-x STR, --suffix STR output filename suffix (default: _contacts)
Examples¶
Running prody contacts --examples displays:
Identify contacts of a target structure with one or more ligands.
Fetch PDB structure 1zz2, save PDB files for individual ligands, and
identify contacting residues of the target protein:
$ prody select -o B11 "resname B11" 1zz2
$ prody select -o BOG "resname BOG" 1zz2
$ prody contacts -r 4.0 -t residue -s protein 1zz2 B11.pdb BOG.pdb
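To make the role of the contact radius concrete, the sketch below selects target atoms that lie within a given distance of any ligand atom. It is a plain-Python illustration with made-up coordinates, not ProDy's code; the function name and the choice of "less than or equal" at the radius are assumptions.

```python
import math


def contacting_atoms(target, ligand, radius=4.0):
    """Return indices of target atoms within `radius` of any ligand atom."""
    return [
        i for i, t_atom in enumerate(target)
        if any(math.dist(t_atom, l_atom) <= radius for l_atom in ligand)
    ]


# Hypothetical coordinates in angstroms.
target = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(0.0, 3.0, 0.0)]
print(contacting_atoms(target, ligand))       # [0]
print(contacting_atoms(target, ligand, 5.0))  # [0, 1]
```

The -t option would then widen this set to whole residues, chains, or segments that contain a contacting atom.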
prody eda¶
Usage¶
Running prody eda -h displays:
usage: prody eda [-h] [--quiet] [--examples] [-n INT] [-s SEL] [-a] [-o PATH]
[-e] [-r] [-u] [-q] [-v] [-z] [-t STR] [-j] [-p STR] [-f STR]
[-d STR] [-x STR] [-A] [-R] [-Q] [-J STR] [-F STR] [-D INT]
[-W FLOAT] [-H FLOAT] [--psf PSF | --pdb PDB] [--aligned]
dcd
positional arguments:
dcd file in DCD or PDB format
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
--psf PSF PSF filename
--pdb PDB PDB filename
--aligned trajectory is already aligned
parameters:
-n INT, --number-of-modes INT
number of non-zero eigenvectors (modes) to calculate
(default: 10)
-s SEL, --select SEL atom selection (default: "protein and name CA or
nucleic and name P C4' C2")
output:
-a, --all-output write all outputs
-o PATH, --output-dir PATH
output directory (default: .)
-e, --eigenvs write eigenvalues/vectors
-r, --cross-correlations
write cross-correlations
-u, --heatmap write cross-correlations heatmap file
-q, --square-fluctuations
write square-fluctuations
-v, --covariance write covariance matrix
-z, --npz write compressed ProDy data file
-t STR, --extend STR write NMD file for the model extended to "backbone"
("bb") or "all" atoms of the residue, model must have
one node per residue
-j, --projection write projections onto PCs
output options:
-p STR, --file-prefix STR
output file prefix (default: pdb_pca)
-f STR, --number-format STR
number output format (default: %12g)
-d STR, --delimiter STR
number delimiter (default: " ")
-x STR, --extension STR
numeric file extension (default: .txt)
figures:
-A, --all-figures save all figures
-R, --cross-correlations-figure
save cross-correlations figure
-Q, --square-fluctuations-figure
save square-fluctuations figure
-J STR, --projection-figure STR
save projections onto specified subspaces, e.g. "1,2"
for projections onto PCs 1 and 2; "1,2 1,3" for
projections onto PCs 1,2 and 1, 3; "1 1,2,3" for
projections onto PCs 1 and 1, 2, 3
figure options:
-F STR, --figure-format STR
pdf (default: pdf)
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8.0)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6.0)
Examples¶
Running prody eda --examples displays:
This command performs PCA (or EDA) calculations for given multi-model
PDB structure or DCD format trajectory file and outputs results in NMD
format. If a PDB identifier is given, structure file will be
downloaded from the PDB FTP server. DCD files may be accompanied with
PDB or PSF files to enable atoms selections.
Fetch pdb 2k39, perform PCA calculations, and output NMD file:
$ prody pca 2k39
Fetch pdb 2k39 and perform calculations for backbone of residues up to
71, and save all output and figure files:
$ prody pca 2k39 --select "backbone and resnum < 71" -a -A
Perform EDA of MDM2 trajectory:
$ prody eda mdm2.dcd
Perform EDA for backbone atoms:
$ prody eda mdm2.dcd --pdb mdm2.pdb --select backbone
prody fetch¶
Usage¶
Running prody fetch -h displays:
usage: prody fetch [-h] [--quiet] [--examples] [-d PATH] [-z] pdb [pdb ...]
positional arguments:
pdb PDB identifier(s) or a file that contains them
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
-d PATH, --dir PATH target directory for saving PDB file(s)
-z, --gzip write compressed PDB file(S)
Examples¶
Running prody fetch --examples displays:
Download PDB file(s) by specifying identifiers:
$ prody fetch 1mkp 1p38
prody gnm¶
Usage¶
Running prody gnm -h displays:
usage: prody gnm [-h] [--quiet] [--examples] [-n INT] [-s SEL] [-c FLOAT]
[-g FLOAT] [-m INT] [-a] [-o PATH] [-e] [-r] [-u] [-q] [-v]
[-z] [-t STR] [-b] [-k] [-p STR] [-f STR] [-d STR] [-x STR]
[-A] [-R] [-Q] [-B] [-K] [-M STR] [-F STR] [-D INT]
[-W FLOAT] [-H FLOAT]
pdb
positional arguments:
pdb PDB identifier or filename
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
parameters:
-n INT, --number-of-modes INT
number of non-zero eigenvectors (modes) to calculate
(default: 10)
-s SEL, --select SEL atom selection (default: "protein and name CA or
nucleic and name P C4' C2")
-c FLOAT, --cutoff FLOAT
cutoff distance (A) (default: 10.0)
-g FLOAT, --gamma FLOAT
spring constant (default: 1.0)
-m INT, --model INT index of model that will be used in the calculations
output:
-a, --all-output write all outputs
-o PATH, --output-dir PATH
output directory (default: .)
-e, --eigenvs write eigenvalues/vectors
-r, --cross-correlations
write cross-correlations
-u, --heatmap write cross-correlations heatmap file
-q, --square-fluctuations
write square-fluctuations
-v, --covariance write covariance matrix
-z, --npz write compressed ProDy data file
-t STR, --extend STR write NMD file for the model extended to "backbone"
("bb") or "all" atoms of the residue, model must have
one node per residue
-b, --beta-factors write beta-factors calculated from GNM modes
-k, --kirchhoff write Kirchhoff matrix
output options:
-p STR, --file-prefix STR
output file prefix (default: pdb_gnm)
-f STR, --number-format STR
number output format (default: %12g)
-d STR, --delimiter STR
number delimiter (default: " ")
-x STR, --extension STR
numeric file extension (default: .txt)
figures:
-A, --all-figures save all figures
-R, --cross-correlations-figure
save cross-correlations figure
-Q, --square-fluctuations-figure
save square-fluctuations figure
-B, --beta-factors-figure
save beta-factors figure
-K, --contact-map save contact map (Kirchhoff matrix) figure
-M STR, --mode-shape-figure STR
save mode shape figures for specified modes, e.g. "1-3
5" for modes 1, 2, 3 and 5
figure options:
-F STR, --figure-format STR
pdf (default: pdf)
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8.0)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6.0)
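The --mode-shape-figure help text gives "1-3 5" as an example meaning modes 1, 2, 3 and 5. One way such a string could be expanded is sketched below; the function name is hypothetical and this is not necessarily how ProDy parses the argument.

```python
def parse_mode_spec(spec):
    """Expand a spec such as "1-3 5" into a list of mode numbers."""
    modes = []
    for token in spec.split():
        if "-" in token:
            start, end = token.split("-")
            modes.extend(range(int(start), int(end) + 1))  # ranges are inclusive
        else:
            modes.append(int(token))
    return modes


print(parse_mode_spec("1-3 5"))  # [1, 2, 3, 5]
```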
Examples¶
Running prody gnm --examples displays:
This command performs GNM calculations for given PDB structure and
outputs results in NMD format. If an identifier is passed, structure
file will be downloaded from the PDB FTP server.
Fetch PDB 1p38, run GNM calculations using default parameters, and
results:
$ prody gnm 1p38
Fetch PDB 1aar, run GNM calculations with cutoff distance 7 angstrom
for chain A carbon alpha atoms with residue numbers less than 70, and
save all of the graphical output files:
$ prody gnm 1aar -c 7 -s "calpha and chain A and resnum < 70" -A
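The cutoff distance (-c) and spring constant (-g) determine the Kirchhoff matrix that GNM calculations are based on. As a rough illustration of the textbook construction, the sketch below builds this matrix for a few hypothetical node coordinates. It is not ProDy's code; the function name and the use of "less than or equal" at the cutoff are assumptions.

```python
import math


def kirchhoff(coords, cutoff=10.0, gamma=1.0):
    """Build a Kirchhoff (connectivity) matrix as a list of rows."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Off-diagonal elements mark contacts between nodes i and j.
                matrix[i][j] = matrix[j][i] = -gamma
                # Diagonal elements accumulate the contacts of each node.
                matrix[i][i] += gamma
                matrix[j][j] += gamma
    return matrix


# Three hypothetical nodes on a line, 7 angstroms apart.
nodes = [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0), (14.0, 0.0, 0.0)]
for row in kirchhoff(nodes, cutoff=10.0):
    print(row)
```

With a 10 Å cutoff only neighboring nodes are connected; lowering the cutoff to 7 Å, as in the second example above, generally yields a sparser matrix for a real structure.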
prody pca¶
Usage¶
Running prody pca -h displays:
usage: prody pca [-h] [--quiet] [--examples] [-n INT] [-s SEL] [-a] [-o PATH]
[-e] [-r] [-u] [-q] [-v] [-z] [-t STR] [-j] [-p STR] [-f STR]
[-d STR] [-x STR] [-A] [-R] [-Q] [-J STR] [-F STR] [-D INT]
[-W FLOAT] [-H FLOAT] [--psf PSF | --pdb PDB] [--aligned]
dcd
positional arguments:
dcd file in DCD or PDB format
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
--psf PSF PSF filename
--pdb PDB PDB filename
--aligned trajectory is already aligned
parameters:
-n INT, --number-of-modes INT
number of non-zero eigenvectors (modes) to calculate
(default: 10)
-s SEL, --select SEL atom selection (default: "protein and name CA or
nucleic and name P C4' C2")
output:
-a, --all-output write all outputs
-o PATH, --output-dir PATH
output directory (default: .)
-e, --eigenvs write eigenvalues/vectors
-r, --cross-correlations
write cross-correlations
-u, --heatmap write cross-correlations heatmap file
-q, --square-fluctuations
write square-fluctuations
-v, --covariance write covariance matrix
-z, --npz write compressed ProDy data file
-t STR, --extend STR write NMD file for the model extended to "backbone"
("bb") or "all" atoms of the residue, model must have
one node per residue
-j, --projection write projections onto PCs
output options:
-p STR, --file-prefix STR
output file prefix (default: pdb_pca)
-f STR, --number-format STR
number output format (default: %12g)
-d STR, --delimiter STR
number delimiter (default: " ")
-x STR, --extension STR
numeric file extension (default: .txt)
figures:
-A, --all-figures save all figures
-R, --cross-correlations-figure
save cross-correlations figure
-Q, --square-fluctuations-figure
save square-fluctuations figure
-J STR, --projection-figure STR
save projections onto specified subspaces, e.g. "1,2"
for projections onto PCs 1 and 2; "1,2 1,3" for
projections onto PCs 1,2 and 1, 3; "1 1,2,3" for
projections onto PCs 1 and 1, 2, 3
figure options:
-F STR, --figure-format STR
pdf (default: pdf)
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8.0)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6.0)
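The --projection-figure argument groups principal component indices with commas and separates figures with spaces. The sketch below shows one way to read such a string, following the examples in the help text; the function name is hypothetical and ProDy's own parsing may differ.

```python
def parse_projection_spec(spec):
    """Turn "1,2 1,3" into [(1, 2), (1, 3)]; each tuple describes one figure."""
    return [
        tuple(int(index) for index in group.split(","))
        for group in spec.split()
    ]


print(parse_projection_spec("1,2 1,3"))  # [(1, 2), (1, 3)]
print(parse_projection_spec("1 1,2,3"))  # [(1,), (1, 2, 3)]
```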
Examples¶
Running prody pca --examples displays:
This command performs PCA (or EDA) calculations for given multi-model
PDB structure or DCD format trajectory file and outputs results in NMD
format. If a PDB identifier is given, structure file will be
downloaded from the PDB FTP server. DCD files may be accompanied with
PDB or PSF files to enable atoms selections.
Fetch pdb 2k39, perform PCA calculations, and output NMD file:
$ prody pca 2k39
Fetch pdb 2k39 and perform calculations for backbone of residues up to
71, and save all output and figure files:
$ prody pca 2k39 --select "backbone and resnum < 71" -a -A
Perform EDA of MDM2 trajectory:
$ prody eda mdm2.dcd
Perform EDA for backbone atoms:
$ prody eda mdm2.dcd --pdb mdm2.pdb --select backbone
prody select¶
Usage¶
Running prody select -h displays:
usage: prody select [-h] [--quiet] [--examples] [-o STR] [-p STR] [-x STR]
select pdb [pdb ...]
positional arguments:
select atom selection string
pdb PDB identifier(s) or filename(s)
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
output options:
-o STR, --output STR output PDB filename (default: pdb_selected.pdb)
-p STR, --prefix STR output filename prefix (default: PDB filename)
-x STR, --suffix STR output filename suffix (default: _selected)
Examples¶
Running prody select --examples displays:
This command selects specified atoms and writes them in a PDB file.
Fetch PDB files 1p38 and 1r39 and write backbone atoms in a file:
$ prody select backbone 1p38 1r39
Running the prody command without arguments will provide a description of applications:
$ prody
usage: prody [-h] [-c] [-v]
{anm,gnm,pca,eda,align,blast,biomol,catdcd,contacts,fetch,select}
...
ProDy: A Python Package for Protein Dynamics Analysis
optional arguments:
-h, --help show this help message and exit
-c, --cite print citation info and exit
-v, --version print ProDy version and exit
subcommands:
{anm,gnm,pca,eda,align,blast,biomol,catdcd,contacts,fetch,select}
anm perform anisotropic network model calculations
gnm perform Gaussian network model calculations
pca perform principal component analysis calculations
eda perform essential dynamics analysis calculations
align align models or structures
blast blast search Protein Data Bank
biomol build biomolecules
catdcd concatenate dcd files
contacts identify contacts between a target and ligand(s)
fetch fetch a PDB file
select select atoms and write a PDB file
See 'prody <command> -h' for more information on a specific command.
Detailed information on a specific application can be obtained by typing the command and application names followed by -h, as in prody anm -h.
Running the prody anm application as follows will perform ANM calculations for the p38 MAP kinase structure, and will write eigenvalues/vectors in plain text and NMD Format:
$ prody anm 1p38
In the above example, the default parameters (cutoff=15. and gamma=1.) and all of the Cα atoms of the protein structure 1p38 are used.
In the example below, the cutoff distance is changed to 14 Å, the Cα atoms of residues with numbers smaller than 340 are used, and the output files are prefixed with p38_anm:
$ prody anm -c 14 -s "calpha resnum < 340" -p p38_anm 1p38
The output file p38_anm.nmd can be visualized using NMWiz.
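When many structures need to be processed, it can be convenient to assemble such command lines from a script. The sketch below builds the argument list for the example above using only the -c, -s, and -p options documented in the prody anm usage. The helper name anm_command is hypothetical, and the command is only printed, not executed.

```python
import shlex


def anm_command(pdb, cutoff=None, select=None, prefix=None):
    """Build a `prody anm` argument list from a few documented options."""
    cmd = ["prody", "anm"]
    if cutoff is not None:
        cmd += ["-c", str(cutoff)]
    if select is not None:
        cmd += ["-s", select]
    if prefix is not None:
        cmd += ["-p", prefix]
    cmd.append(pdb)
    return cmd


cmd = anm_command("1p38", cutoff=14, select="calpha resnum < 340", prefix="p38_anm")
print(shlex.join(cmd))  # shell-quoted form of the command
```

To actually run the command, the list can be passed to subprocess.run(cmd, check=True), provided ProDy is installed and prody is on PATH.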
Evol Applications¶
Evol applications are command line programs that automate retrieval, refinement, and analysis of multiple sequence alignments:
evol coevol¶
Usage¶
Running evol coevol -h displays:
usage: evol coevol [-h] [--quiet] [--examples] [-n] [-c STR] [-m STR] [-t]
[-p STR] [-f STR] [-S] [-L FLOAT] [-U FLOAT] [-X STR]
[-T STR] [-D INT] [-H FLOAT] [-W FLOAT] [-F STR]
msa
positional arguments:
msa refined MSA file
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
calculation options:
-n, --no-ambiguity treat amino acids characters B, Z, J, and X as non-
ambiguous
-c STR, --correction STR
also save corrected mutual information matrix data and
plot, one of apc, asc
-m STR, --normalization STR
also save normalized mutual information matrix data
and plot, one of sument, minent, maxent, mincon,
maxcon, joint
output options:
-t, --heatmap save heatmap files for all mutual information matrices
-p STR, --prefix STR output filename prefix, default is msa filename with
_coevol suffix
-f STR, --number-format STR
number output format (default: %12g)
figure options:
-S, --save-plot save coevolution plot
-L FLOAT, --cmin FLOAT
apply lower limits for figure plot
-U FLOAT, --cmax FLOAT
apply upper limits for figure plot
-X STR, --xlabel STR specify xlabel, by default will be applied on ylabel
-T STR, --title STR figure title
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8)
-F STR, --figure-format STR
figure file format, one of svgz, rgba, png, pdf, eps,
svg, ps, raw (default: pdf)
Examples¶
Running evol coevol --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
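The steps above can also be driven from a short script. The sketch below lists the same commands and, by default, only prints them (a dry run); the names PIPELINE and run_pipeline are hypothetical. Setting dry_run to False assumes that the evol command is installed and that each step produces the file names used by the next one, as in the example output above.

```python
import shlex
import subprocess

# The commands shown above, one argument list per step.
PIPELINE = [
    ["evol", "search", "2w5i"],
    ["evol", "fetch", "RnaseA"],
    ["evol", "refine", "RnaseA_full.slx", "-l", "RNAS1_BOVIN",
     "--seqid", "0.98", "--rowocc", "0.8"],
    ["evol", "occupancy", "RnaseA_full.slx", "-l", "RNAS1_BOVIN", "-o", "col", "-S"],
    ["evol", "conserv", "RnaseA_full_refined.slx"],
    ["evol", "coevol", "RnaseA_full_refined.slx", "-S", "-c", "apc"],
    ["evol", "rankorder", "RnaseA_full_refined_mutinfo_corr_apc.txt",
     "-p", "2w5i_1-121.pdb", "--seq-sep", "3"],
]


def run_pipeline(steps, dry_run=True):
    """Return the command lines; execute each one only when dry_run is False."""
    lines = []
    for step in steps:
        lines.append(shlex.join(step))
        if not dry_run:
            subprocess.run(step, check=True)  # stop at the first failing step
    return lines


for line in run_pipeline(PIPELINE):
    print(line)
```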
evol conserv¶
Usage¶
Running evol conserv -h displays:
usage: evol conserv [-h] [--quiet] [--examples] [-n] [-g] [-p STR] [-f STR]
[-S] [-H FLOAT] [-W FLOAT] [-F STR] [-D INT]
msa
positional arguments:
msa refined MSA file
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
calculation options:
-n, --no-ambiguity treat amino acids characters B, Z, J, and X as non-
ambiguous
-g, --gaps do not omit gap characters
output options:
-p STR, --prefix STR output filename prefix, default is msa filename with
_conserv suffix
-f STR, --number-format STR
number output format (default: %12g)
figure options:
-S, --save-plot save conservation plot
-H FLOAT, --height FLOAT
figure height (inch) (default: 6)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8)
-F STR, --figure-format STR
figure file format, one of raw, png, ps, svgz, eps,
pdf, rgba, svg (default: pdf)
-D INT, --dpi INT figure resolution (dpi) (default: 300)
Examples¶
Running evol conserv --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol fetch¶
Usage¶
Running evol fetch -h displays:
usage: evol fetch [-h] [--quiet] [--examples] [-a STR] [-f STR] [-o STR]
[-i STR] [-g STR] [-t INT] [-d PATH] [-p STR] [-z]
acc
positional arguments:
acc Pfam accession or ID
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
download options:
-a STR, --alignment STR
alignment type, one of full, seed, ncbi, metagenomics
(default: full)
-f STR, --format STR Pfam supported MSA format, one of selex, fasta,
stockholm (default: selex)
-o STR, --order STR ordering of sequences, one of tree, alphabetical
(default: tree)
-i STR, --inserts STR
letter case for inserts, one of upper, lower (default:
upper)
-g STR, --gaps STR gap character, one of dashes, dots, mixed (default:
dashes)
-t INT, --timeout INT
timeout for blocking connection attempts (default: 60)
output options:
-d PATH, --outdir PATH
output directory (default: .)
-p STR, --outname STR
output filename, default is accession and alignment
type
-z, --compressed gzip downloaded MSA file
Examples¶
Running evol fetch --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol filter¶
Usage¶
Running evol filter -h displays:
usage: evol filter [-h] [--quiet] [--examples] (-s | -e | -c) [-F] [-o STR]
[-f STR] [-z]
msa word [word ...]
positional arguments:
msa MSA filename to be filtered
word word to be compared to sequence label
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
filtering method (required):
-s, --startswith sequence label starts with given words
-e, --endswith sequence label ends with given words
-c, --contains sequence label contains with given words
filter option:
-F, --full-label compare full label with word(s)
output options:
-o STR, --outname STR
output filename, default is msa filename with _refined
suffix
-f STR, --format STR output MSA file format, default is same as input
-z, --compressed gzip refined MSA output
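The three filtering methods correspond to simple string tests on sequence labels. The sketch below illustrates the idea on a few example labels. It is not ProDy's implementation: the function name is hypothetical, the labels other than RNAS1_BOVIN are made up for illustration, and the effect of --full-label is not modeled.

```python
def filter_labels(labels, words, method="startswith"):
    """Keep labels that match any of the given words using the chosen method."""
    tests = {
        "startswith": lambda label, word: label.startswith(word),
        "endswith": lambda label, word: label.endswith(word),
        "contains": lambda label, word: word in label,
    }
    match = tests[method]
    return [label for label in labels if any(match(label, word) for word in words)]


labels = ["RNAS1_BOVIN", "RNAS1_HUMAN", "ANGI_HUMAN"]
print(filter_labels(labels, ["RNAS1"]))                      # labels starting with RNAS1
print(filter_labels(labels, ["_HUMAN"], method="endswith"))  # labels ending with _HUMAN
```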
Examples¶
Running evol filter --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol merge¶
Usage¶
Running evol merge -h displays:
usage: evol merge [-h] [--quiet] [--examples] [-o STR] [-f STR] [-z]
msa [msa ...]
positional arguments:
msa MSA filenames to be merged
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
output options:
-o STR, --outname STR
output filename, default is first input filename with
_merged suffix
-f STR, --format STR output MSA file format, default is same as first input
MSA
-z, --compressed gzip merged MSA output
Examples¶
Running evol merge --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol occupancy¶
Usage¶
Running evol occupancy -h displays:
usage: evol occupancy [-h] [--quiet] [--examples] [-o STR] [-p STR] [-l STR]
[-f STR] [-S] [-X STR] [-Y STR] [-T STR] [-D INT]
[-W FLOAT] [-F STR] [-H FLOAT]
msa
positional arguments:
msa MSA file
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
calculation options:
-o STR, --occ-axis STR
calculate row or column occupancy or both., one of
row, col, both (default: row)
output options:
-p STR, --prefix STR output filename prefix, default is msa filename with
_occupancy suffix
-l STR, --label STR index for column based on msa label
-f STR, --number-format STR
number output format (default: %12g)
figure options:
-S, --save-plot save occupancy plot/s
-X STR, --xlabel STR specify xlabel
-Y STR, --ylabel STR specify ylabel
-T STR, --title STR figure title
-D INT, --dpi INT figure resolution (dpi) (default: 300)
-W FLOAT, --width FLOAT
figure width (inch) (default: 8)
-F STR, --figure-format STR
figure file format, one of png, pdf, raw, svg, eps,
ps, svgz, rgba (default: pdf)
-H FLOAT, --height FLOAT
figure height (inch) (default: 6)
Examples¶
Running evol occupancy --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol rankorder¶
Usage¶
Running evol rankorder -h displays:
usage: evol rankorder [-h] [--quiet] [--examples] [-z] [-d STR] [-p STR]
[-m STR] [-l STR] [-n INT] [-q INT] [-t FLOAT] [-u]
[-o STR]
mutinfo
positional arguments:
mutinfo mutual information matrix
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
input options:
-z, --zscore apply zscore for identifying top ranked coevolving
pairs
-d STR, --delimiter STR
delimiter used in mutual information matrix file
-p STR, --pdb STR PDB file that contains same number of residues as the
mutual information matrix, output residue numbers will
be based on PDB file
-m STR, --msa STR MSA file used for building the mutual info matrix,
output residue numbers will be based on the most
complete sequence in MSA if a PDB file or sequence
label is not specified
-l STR, --label STR label in MSA file for output residue numbers
output options:
-n INT, --num-pairs INT
number of top ranking residue pairs to list (default:
100)
-q INT, --seq-sep INT
report coevolution for residue pairs that are
sequentially separated by input value (default: 3)
-t FLOAT, --min-dist FLOAT
report coevolution for residue pairs whose CA atoms
are spatially separated by at least the input value,
used when a PDB file is given and --use-dist is true
(default: 10.0)
-u, --use-dist use structural separation to report coevolving pairs
-o STR, --outname STR
output filename, default is mutinfo_rankorder.txt
Examples¶
Running evol rankorder --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol refine¶
Usage¶
Running evol refine -h displays:
usage: evol refine [-h] [--quiet] [--examples] [-l STR] [-s FLOAT] [-c FLOAT]
[-r FLOAT] [-k] [-o STR] [-f STR] [-z]
msa
positional arguments:
msa MSA filename to be refined
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
refinement options:
-l STR, --label STR sequence label, UniProt ID code, or PDB and chain
identifier
-s FLOAT, --seqid FLOAT
identity threshold for selecting unique sequences
-c FLOAT, --colocc FLOAT
column (residue position) occupancy
-r FLOAT, --rowocc FLOAT
row (sequence) occupancy
-k, --keep keep columns corresponding to residues not resolved in
PDB structure, applies label argument is a PDB
identifier
output options:
-o STR, --outname STR
output filename, default is msa filename with _refined
suffix
-f STR, --format STR output MSA file format, default is same as input
-z, --compressed gzip refined MSA output
Examples¶
Running evol refine --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
evol search¶
Usage¶
Running evol search -h displays:
usage: evol search [-h] [--quiet] [--examples] [-b] [-s] [-g] [-e FLOAT]
[-t INT] [-o STR] [-d STR]
query
positional arguments:
query protein UniProt ID or sequence, a PDB identifier, or a
sequence file, where sequence have no gaps and 12 or
more characters
optional arguments:
-h, --help show this help message and exit
--quiet suppress info messages to stderr
--examples show usage examples and exit
sequence search options:
-b, --searchBs search Pfam-B families
-s, --skipAs do not search Pfam-A families
-g, --ga use gathering threshold
-e FLOAT, --evalue FLOAT
e-value cutoff, must be less than 10.0
-t INT, --timeout INT
timeout in seconds for blocking connection attempt
(default: 60)
output options:
-o STR, --outname STR
name for output file, default is standard output
-d STR, --delimiter STR
delimiter for output data columns (default: )
Examples¶
Running evol search --examples displays:
Sequence coevolution analysis involves several steps that including
retrieving data and refining it for calculations. These steps are
illustrated below for RnaseA protein family.
Search Pfam database:
$ evol search 2w5i
Download Pfam MSA file:
$ evol fetch RnaseA
Refine MSA file:
$ evol refine RnaseA_full.slx -l RNAS1_BOVIN --seqid 0.98 --rowocc 0.8
Checking occupancy:
$ evol occupancy RnaseA_full.slx -l RNAS1_BOVIN -o col -S
Conservation analysis:
$ evol conserv RnaseA_full_refined.slx
Coevolution analysis:
$ evol coevol RnaseA_full_refined.slx -S -c apc
Rank order analysis:
$ evol rankorder RnaseA_full_refined_mutinfo_corr_apc.txt -p 2w5i_1-121.pdb --seq-sep 3
Running evol command will provide a description of applications:
$ evol
usage: evol [-h] [-c] [-v] [-e]
{search,fetch,filter,refine,merge,occupancy,conserv,coevol,rankorder}
...
Evol: Sequence Evolution and Dynamics Analysis
optional arguments:
-h, --help show this help message and exit
-c, --cite print citation info and exit
-v, --version print ProDy version and exit
-e, --examples show usage examples and exit
subcommands:
{search,fetch,filter,refine,merge,occupancy,conserv,coevol,rankorder}
search search Pfam with given query
fetch fetch MSA files from Pfam
filter filter an MSA using sequence labels
refine refine an MSA by removing gapped rows/colums
merge merge multiple MSAs based on common labels
occupancy calculate occupancy of rows and columns in MSA
conserv analyze conservation using Shannon entropy
coevol analyze co-evolution using mutual information
rankorder identify highly coevolving pairs of residues
See 'evol <command> -h' for more information on a specific command.
Detailed information on a specific application can be obtained by typing the command and application names as evol search -h.
Running evol search application as follows will search Pfam database for protein families that match the proteins in PDB structure 2w5i:
$ evol search 2w5i
On Linux, when installing ProDy from source, application scripts are placed into a default folder that is included in PATH environment variable, e.g. /usr/local/bin/.
On Windows, installer places the scripts into the Scripts folder under Python distribution folder, e.g. C:\Python27\Scripts. You may need to add this path to PATH environment variable yourself.
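A quick way to confirm that the scripts folder is on PATH is to ask Python where it finds the commands. The following is a minimal sketch that assumes Python 3.3 or later (for shutil.which); the helper name find_script is made up for this illustration and is not part of ProDy.

```python
import shutil


def find_script(name="evol"):
    """Return the full path of a script found on PATH, or None.

    Hypothetical helper for illustration; not part of ProDy.
    """
    return shutil.which(name)


if __name__ == "__main__":
    # Report where (or whether) each application script is found.
    for script in ("prody", "evol"):
        location = find_script(script)
        print(script, "->", location or "not found on PATH")
```

If a script is reported as not found, the folder that contains it most likely still needs to be appended to PATH as described above.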
Reference Manual¶
Atomic Data¶
Dynamics Analysis¶
Analysis Functions¶
Anisotropic Network Model¶
Comparison Functions¶
NMA Model Editing¶
Supporting Functions¶
Custom Gamma Functions¶
Gaussian Network Model¶
Heatmapper Functions¶
Normal Mode¶
Mode Set¶
Normal Mode Analysis¶
NMD File¶
Principal Component Analysis¶
Plotting Functions¶
Rotation Translation Blocks¶
Sampling Functions¶
Ensemble Analysis¶
Protein Structure¶
Sequence Analysis¶
ProDy Utilities¶
Applications API¶
Coevolution Application¶
Conservation Application¶
Pfam MSA Fetcher¶
MSA File Filter¶
MSA File Merger¶
MSA Occupancy Calculation¶
Identify Coevolving Pairs¶
MSA Refinement¶
Pfam Search¶
PDB Model/Structure Alignment¶
ANM Application¶
Biomolecule Builder¶
Blast Search PDB¶
DCD Files Concatenation¶
Contact Identification¶
PDB File Fetcher¶
GNM Application¶
PCA Application¶
Atom Selection¶
Configuration & Logging¶
This module defines functions for logging in files, configuring ProDy, and running tests.
Developer’s Guide¶
Contributing to ProDy¶
Install Git and a GUI¶
ProDy source code is managed using Git, a distributed revision control system. To contribute to the development of ProDy, you need to install git on your computer, along with a GUI for it if you prefer one.
On Debian/Ubuntu Linux, for example, you can run the following to install git and gitk:
$ sudo apt-get install git gitk
For other operating systems, you can obtain installation instructions and files from Git.
You will only need to use a few basic git commands. These commands are provided below, but usually without an adequate description. Please refer to Git book and Git docs for usage details and examples.
Fork and Clone ProDy¶
ProDy source code and issue tracker are hosted on GitHub. You need to create an account on this service, if you do not have one already.
If you work on Mac OS or Windows, you may consider getting GitHub Mac or GitHub Windows to help you manage a copy of the repository.
Once you have an account, you need to make a fork of ProDy, that is, create a copy of the repository in your account. You will see a link for this on ProDy source code page. You will have write access to this fork and will later use it to share your changes with others.
The next step is cloning the fork from your online account to your local system. If you are not using the GitHub software, you can do it as follows:
$ git clone https://github.com/prody/ProDy.git
This will create ProDy folder with a copy of the project files in it:
$ cd ProDy
$ ls
bdist_wininst.bat docs INSTALL.rst LICENSE.rst Makefile
MANIFEST.in prody README.rst scripts setup.py
Setup Working Environment¶
You can use ProDy directly from this clone by adding ProDy folder to your PYTHONPATH environment variable, e.g.:
export PYTHONPATH=$PYTHONPATH:/home/USERNAME/path/to/ProDy
This will not be enough though, since you also need to compile C extensions. You can run the following series of commands to build and copy C modules to where they need to be:
$ cd ProDy
$ python setup.py build_ext --inplace --force
or, on Linux you can:
$ make build
You may also want to make sure that you can run ProDy Applications from anywhere on your system. One way to do this is by adding ProDy/scripts folder to your PATH environment variable, e.g.:
export PATH=$PATH:/home/USERNAME/path/to/ProDy/scripts
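To check that the PYTHONPATH change took effect, you can ask Python whether it can locate the package without importing it. This is a minimal sketch assuming Python 3.4 or later; is_importable is a hypothetical helper name, and a positive result only shows that the package folder is visible, not that the C extensions were built.

```python
import importlib.util


def is_importable(package="prody"):
    """Return True if Python can locate the named top-level package.

    Hypothetical helper for illustration; it does not import the
    package, so it cannot tell whether C extensions are compiled.
    """
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("prody found:", is_importable("prody"))
```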
Modify, Test, and Commit¶
When modifying ProDy files you may want to follow the Style Guide for ProDy. Closely following the guidelines therein will allow for incorporation of your changes to ProDy quickly.
If you changed .py
files, you should ensure to check the integrity
of the package. To do this, you should at least run fast ProDy tests as
follows:
$ cd ProDy
$ nosetests
See Testing ProDy for alternate and more comprehensive ways of testing. ProDy unittest suite may not include a test for the function or the class that you just changed, but running the tests will ensure that the ProDy package can be imported and run without problems.
After ensuring that the package runs, you can commit your changes as follows:
$ git commit modified_file_1.py modified_file_2.py
or:
$ git commit -a
This command will open a text editor for you to describe the changes that you just committed.
Push and Pull Request¶
After you have committed your changes, you will need to push them to your GitHub account:
$ git push origin master
This step will ask for your account user name. If you are going to push to your GitHub/Bitbucket account frequently, you may add an SSH key for automatic authentication. To add an SSH key for your system, go to the SSH keys page on GitHub or on Bitbucket.
After pushing your changes, you will need to make a pull request from your fork to notify ProDy developers of the changes you made and facilitate their incorporation to ProDy.
Update Local Copy¶
You can also keep an up-to-date copy of ProDy by pulling changes from the master ProDy repository on a regular basis. You need to add the master repository as a remote to your local copy. You can do this by running the following command from the ProDy project folder:
$ cd prody
$ git remote add prodymaster git@github.com:abakan/ProDy.git
or:
$ cd prody
$ git remote add prodymaster git@bitbucket.org:abakan/prody.git
You may use any name in place of prodymaster except origin, which already points to the ProDy fork in your account.
After setting up this remote, calling git pull command will fetch latest changes from ProDy master repository and merge them to your local copy:
$ git pull prodymaster master
Note that when there are changes in C modules, you need to run the following commands again to update the binary module files:
$ python setup.py build_ext --inplace --force
Documenting ProDy¶
ProDy documentation is written using reStructuredText markup and prepared using Sphinx. You may install Sphinx using easy_install, i.e. easy_install -U Sphinx, or using package manager on your Linux machine.
Building Manual¶
ProDy Manual in HTML and PDF formats can be built as follows:
$ cd docs
$ make html
$ make pdf
If all documentation strings and pages are properly formatted according to reStructuredText markup, documentation pages should compile without any warnings. Note that to build PDF files, you need to install latex and pdflatex programs.
Read the Docs
A copy of ProDy manual is hosted on Read the Docs and can be viewed at http://prody.readthedocs.org/. Read the Docs is configured to build manual pages for the devel branch (latest) and the recent stable versions. The user name for Read the Docs is prody.
Building Website¶
ProDy-website source is hosted at https://github.com/prody/ProDy-website. This project contains tutorial files and the home pages for ProDy and other related software.
Latest version
To build website on ProDy server, start with pulling changes:
$ cd ProDy-website
$ git pull
Running the following command will build HTML pages for the latest stable release of ProDy:
$ make html
HTML pages for manual and all tutorials are built as a single project, which allows for referencing from manual to tutorials.
PDF files for the manual and tutorials, and also download files, are built as follows:
$ make pdf
PDF and TGZ/ZIP files are copied to appropriate places after they are built.
How to Make a Release¶
Make sure ProDy imports and passes all unit tests with both Python 2 and Python 3, using the nose nosetests command:
$ cd ProDy
$ nosetests
$ nosetests3
See Testing ProDy for more on testing.
Update the version number in prody/__init__.py. Also, comment + '-dev' out, so that documentation will build for a stable release.
Update the most recent changes and the latest release date in docs/release/vX.Y_series.rst. If there is a new incremental release, start a new file.
Make sure the following files are up-to-date:
- README.txt
- MANIFEST.in
- setup.py
If there is a new file format, that is a new extension not captured in MANIFEST.in, it should be included. If there is a new C extension, it should be listed in setup.py.
After checking these files, commit changes and push them to GitHub.
Generate the source distributions:
$ cd ..
$ python setup.py sdist --formats=gztar,zip
Prepare and test Windows installers (see Making Windows Installers).
Installers should be prepared for the following versions of Python:
$ C:\Python27\python setup.py bdist_msi
$ C:\Python35\python setup.py bdist_msi
$ C:\Python36\python setup.py bdist_msi
Alternatively, use bdist_msi.bat to run these commands. When there is a newer Python major release, it should be added to this list. Don’t forget to pull most recent changes to your Windows machine.
A good practice is installing ProDy using all newly created installers and checking that it works. ProDy script can be used to check that, e.g.:
$ C:\Python33\Scripts\prody.bat anm 1ubi
If this command runs for all supported Python versions, release is good to go.
Put all source distributions and installer executables in the dist directory.
Upload the new release files to the PyPI using twine:
$ twine upload dist/*
This will offer a number of options. ProDy on PyPI is owned by user prody.devel.
Commit final changes, if there are any:
$ cd ..
$ git commit -a
Tag the repository with the current version number and push new tag:
$ git tag vX.Y
$ git push --tags
Rebase devel branch to master:
$ git checkout master
$ git rebase devel
$ git push
Update the documentation on ProDy website. See Documenting ProDy.
Now that you made a release, you can go back to development. You may start with appending '-dev' to __release__ in prody/__init__.py.
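The '-dev' suffix handling described in the release steps can be sketched as a small string helper. This is only an illustration: the names DEV_SUFFIX and toggle_dev are hypothetical, and the actual contents of prody/__init__.py may be organized differently.

```python
DEV_SUFFIX = '-dev'  # suffix used for development builds, per the steps above


def toggle_dev(release, dev):
    """Return the release string with or without the '-dev' suffix.

    Hypothetical helper for illustration; not part of ProDy.
    """
    if release.endswith(DEV_SUFFIX):
        base = release[:-len(DEV_SUFFIX)]
    else:
        base = release
    return base + DEV_SUFFIX if dev else base


if __name__ == "__main__":
    print(toggle_dev('1.10.2-dev', dev=False))  # before making a release
    print(toggle_dev('1.10.2', dev=True))       # after the release is out
```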
Style Guide for ProDy¶
Introduction¶
PEP 8, the Style Guide for Python Code, is adopted in the development of ProDy package. Contributions to ProDy shall follow PEP 8 and the specifications and additions provided in this addendum.
Code Layout¶
Indentation
Use 4 spaces per indentation level in source code (.py) and never use tabs as a substitute.
In documentation files (.rst), use 2 spaces per indentation level.
Maximum line length
Limit all lines to a maximum of 79 characters in both source code and documentation files. Exceptions may be made when tabulating data in documentation files and strings. The length of lines in a paragraph may be much less than 79 characters if the line ends align better with the first line, as in this paragraph.
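The line length rule is easy to check automatically. The sketch below is one possible approach rather than a tool shipped with ProDy; find_long_lines is a hypothetical name, and it makes no attempt to apply the exceptions for tabulated data mentioned above.

```python
def find_long_lines(text, limit=79):
    """Return (line number, length) pairs for lines longer than limit.

    Hypothetical helper for illustration; not part of ProDy.
    """
    return [(number, len(line))
            for number, line in enumerate(text.splitlines(), start=1)
            if len(line) > limit]


if __name__ == "__main__":
    import sys
    # Usage: python check_length.py some_module.py
    for filename in sys.argv[1:]:
        with open(filename) as handle:
            for number, length in find_long_lines(handle.read()):
                print('%s:%d: %d characters' % (filename, number, length))
```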
Encodings
In cases where an encoding for a .py file needs to be specified, such as when characters like α, β, or Å are used in docstrings, use UTF-8 encoding, i.e. start the file with the following line:
# -*- coding: utf-8 -*-
Imports
In addition to the PEP 8 recommendations regarding imports, the following should be applied:
- relative intra-ProDy imports are discouraged, use from prody.atomic import AtomGroup and not from atomic import AtomGroup
- always import from second top level module, use from prody.atomic import AtomGroup and not from prody.atomic.atomgroup import AtomGroup, because file names may change or files that grow too big may be split into smaller modules, etc.
Here is a series of properly formatted imports following a module documentation string:
"""This module defines a function to calculate something interesting."""
import os.path
from collections import defaultdict
from time import time
import numpy as np
from prody.atomic import AtomGroup
from prody.measure import calcRMSD
from prody.tools import openFile
from prody import LOGGER, SETTINGS
__all__ = ['calcSomething']
Whitespaces¶
In addition to recommendations regarding whitespace use in Python code (PEP 8, Whitespace in Expressions and Statements), two whitespace characters should follow a period in documentation files and strings to help reading documentation in terminal windows and text editors.
Naming Conventions¶
ProDy naming conventions aim at making the library suitable for interactive sessions, i.e. easy to remember and type.
Class names
Naming style for classes is CapitalizedWords (or CapWords, or CamelCase). Abbreviations and/or truncated names should be used to keep class names short. Some class name examples are:
- ANM for Anisotropic Network Model
- HierView for Hierarchical View
Exception names
Prefer using a suitable standard-library exception over defining a new one. If you absolutely need to define one, use the class naming convention. Use the suffix “Error” for exception names, when exception is an error:
- SelectionError, the only exception defined in ProDy package
Method and function names
Naming style for methods and functions is mixedCase, which differs from CapWords by an initial lowercase character. Starting with a lowercase (no shift key) and using no underscore characters decreases the number of key strokes by half in many cases in interactive sessions.
Method and function names should start with a verb, suggestive of the action, followed by one or two words, where the second may start with a lowercase letter. Some examples are moveAtoms(), wrapAtoms(), assignSecstr(), and calcSubspaceOverlap().
Abbreviations and/or truncated names should be used and obvious words should be omitted to limit the length of names to 20 characters. For example, buildHessian() is preferred over buildHessianMatrix(). Another example is the change from using getResidueNames() to using AtomGroup.getResnames(). In fact, this was part of a series of major changes (see Release Notes) aimed at refining the library for interactive usage.
In addition, the following should be applied to enable grouping of methods and functions based on their action and/or return value:
- buildSomething(): methods and functions that calculate a matrix should start with build, e.g. GNM.buildKirchhoff() and buildDistMatrix()
- calcSomething(): methods that calculate new data but do not necessarily return anything, and especially those that take time-consuming actions, should start with calc, e.g. PCA.calcModes()
- getSomething(): methods, and sometimes functions, that return a copy of data should start with get, such as AtomGroup.getResnames()
- setSomething(): methods, and sometimes functions, that alter internal data should start with set
Variable Names¶
Variable names in functions and methods should contain only lower case letters, and may contain underscore characters to increase readability.
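The naming conventions above can be seen together in a short sketch. The class PointSet and its methods are invented for this illustration and do not exist in ProDy; only the naming pattern is the point.

```python
class PointSet(object):
    """Hypothetical class (CapWords name) holding 3D points."""

    def __init__(self, coords):
        # Internal data is stored privately; variables use lowercase
        # letters with optional underscores.
        self._coords = [tuple(xyz) for xyz in coords]

    def getCoords(self):
        """Return a copy of the coordinates (get prefix)."""
        return list(self._coords)

    def setCoords(self, coords):
        """Replace internal coordinates (set prefix)."""
        self._coords = [tuple(xyz) for xyz in coords]

    def calcCentroid(self):
        """Calculate the centroid of the points (calc prefix)."""
        n_points = len(self._coords)
        return tuple(sum(axis) / float(n_points)
                     for axis in zip(*self._coords))
```

Note that every method starts with a lowercase verb and contains no underscores, while local variables such as n_points do.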
Testing ProDy¶
Running Unittests¶
The easiest way to run ProDy unit tests is using nose. The following will run all tests:
$ nosetests prody
To skip tests that are slow, use the following:
$ nosetests prody -a '!slow'
To run tests for a specific module do as follows:
$ nosetests prody.tests.atomic prody.tests.sequence
Unittest Development¶
Unit test development should follow these guidelines:
- For comparing Python numerical types and objects, e.g. int, list, tuple, use methods of unittest.TestCase.
- For comparing Numpy arrays, use assertions available in numpy.testing module.
- All test files should be stored in tests folder in the ProDy package directory, i.e. prody/tests/
- All tests for functions and classes in a ProDy module should be in a single test file named after the module, e.g. test_atomic/test_select.py.
- Data files for testing should be located in tests/test_datafiles.
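A test file following these guidelines might look like the sketch below. The function countResidues and the test class are hypothetical and only show the shape of a unittest.TestCase that compares plain Python values; array comparisons would use numpy.testing assertions instead.

```python
import unittest


def countResidues(sequence):
    """Hypothetical helper: count non-gap characters in a sequence."""
    return sum(1 for char in sequence if char not in '-.')


class TestCountResidues(unittest.TestCase):
    """Tests use unittest.TestCase methods for Python types."""

    def testGapsAreIgnored(self):
        self.assertEqual(countResidues('AC-D.E'), 4)

    def testEmptySequence(self):
        self.assertEqual(countResidues(''), 0)
```

Saved in a file under prody/tests/, a class like this would be collected by nosetests along with the rest of the suite.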
Writing Tutorials¶
This is a short guide for writing ProDy tutorials that are published as part of online documentation pages, and also as individual downloadable PDF files.
Tutorial Setup¶
First go to doc folder in ProDy package and generate necessary files for your tutorial using start-tutorial.sh script:
$ cd doc
$ ./start-tutorial.sh
Enter tutorial title: ENM Analysis using ProDy
Enter a short title: ENM Analysis
Enter author name: First Last
Tutorial folders and files are prepared, see tutorials/enm_analysis
This will generate following folder and files:
$ cd tutorials/enm_analysis/
$ ls -lgo
-rw-r--r-- 1 328 Apr 30 16:48 conf.py
-rw-r--r-- 1 395 Apr 30 16:48 index.rst
-rw-r--r-- 1 882 Apr 30 16:48 intro.rst
-rw-r--r-- 1 1466 Apr 30 16:48 Makefile
lrwxrwxrwx 1 13 Apr 30 16:48 _static -> ../../_static
Note that short title will be used as filename and part of the URL of the online documentation pages.
If tutorial logo/image that you want to use is different from ProDy logo, update the following line in conf.py:
tutorial_logo = u'enm.png' # default is ProDy logo
tutorial_prody_version = u'' # default is latest ProDy version
Also, note ProDy version if the tutorial is developed for a specific release.
Style and Organization¶
ProDy documentation and tutorials are written using reStructuredText, an easy-to-read/write file format. See reStructuredText Primer for a quick introduction.
reStructuredText is stored in plain-text files with .rst extension, and converted to HTML and PDF pages using Sphinx.
index.rst and intro.rst files are automatically generated. index.rst file should include title and table of contents of the tutorial. Table of contents is just a list of .rst files that are part of the tutorial. They should be listed in the order that they should appear in the final PDF file:
.. _enm-analysis:
.. use "enm-analysis" to refer to this file, i.e. :ref:`enm-analysis`
*******************************************************************************
ENM Analysis using ProDy
*******************************************************************************
.. add .rst files to `toctree` in the order that you want them
.. toctree::
   :glob:
   :maxdepth: 2

   intro
Add more .rst files as needed. See other tutorials in doc/tutorials folder as examples.
Input/Output Files¶
All files needed to follow the tutorial should be stored in tutorial_name_files folder. There is usually no need to provide PDB files, as ProDy automatically downloads them when needed. Optionally, output files can also be provided.
Note
Small input and output files that contain textual information may be included in the git repository, but please avoid including large files, in particular those that contain binary data.
Including Code¶
Python code in tutorials should be included using IPython Sphinx directive. In the beginning of each .rst file, you should make necessary imports as follows:
.. ipython:: python

   from prody import *
   from matplotlib.pylab import *
   ion()
This will convert to the following:
In [1]: from prody import *
In [2]: from matplotlib.pylab import *
In [3]: ion()
Then you can add the code for the tutorial:
.. ipython:: python

   pdb = parsePDB('1p38')
In [4]: pdb = parsePDB('1p38')
Including Figures¶
IPython directive should also be used for including figures:
.. ipython:: python

   @savefig tutorial_name_figure_name.png width=4in
   plot(range(10))
   @savefig tutorial_name_figure_two.png width=4in
   plot(range(100)); # used ; to suppress output

@savefig decorator was used to save the figure.
Note
Figure names need to be unique within the tutorial and should be prefixed with the tutorial name.
Note that in the second plot() call, we used a semicolon to suppress the output of the function.
If you want to make modifications to the figure, save it after the last modification:
.. ipython:: python

   plot(range(10));
   grid();
   xlabel('X-axis')
   @savefig tutorial_name_figure_three.png width=4in
   ylabel('Y-axis')
Testing Code¶
If there is any particular code output that you want to test, you can use @doctest decorator as follows:
.. ipython::

   @doctest
   In [1]: 2 + 2
   Out[1]: 4
In [5]: 2 + 2
Out[5]: 4
Failing to produce the correct output will prevent building the documentation.
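The same idea of checking recorded output against actual output can be tried outside of Sphinx with the doctest module from the Python standard library. Note that this is a different mechanism from the IPython directive's @doctest decorator; the sketch below, with the hypothetical names add and count_failures, only illustrates how a mismatch would be detected.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 2)
    4
    """
    return a + b


def count_failures(func):
    """Run the doctest examples in func's docstring; return failures.

    Hypothetical helper for illustration; not part of ProDy.
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False skips module lookup; globs makes func visible
    # to the examples by its own name.
    tests = finder.find(func, name=func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False).failed
```

A non-zero return value corresponds to the situation where the documentation build would stop.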
Publishing Tutorial¶
To see how your .rst files convert to HTML format, use the following command:
$ make html
You will find HTML files in _build/html folder.
Once your tutorial is complete and looks good in HTML (no code execution problems), the following commands can be used to generate a PDF file and tutorial file archives:
$ make pdf
$ make files
ProDy online documentation will contain these files as well as tutorial pages in HTML format.
Making Windows Installers¶
MinGW (for 32-bit system) or MinGW-w64 (for 64-bit system) can be used for compiling C modules when making Windows installers. Please follow the instructions to install and configure them.
libpython is also required to be installed. If the compiler complains, such as, “’::hypot’ has not been declared”, please refer to this link.
Cross-platform Issues¶
This section describes cross-platform issues that may emerge and provides possible solutions for them.
Numpy integer type¶
Issues may arise when comparing Numpy integer types with Python int()
.
The Numpy integer type equivalent to Python int()
on Windows (Win7 64bit,
Python 32bit) is int32
, while on Linux (Ubuntu 64bit) it is
int64
. For example, the statement
isinstance(np.array([1], np.int64), int)
may return True, resulting
in unexpected behavior in ProDy functions or methods. If a Numpy integer type
needs to be specified, using int
seems to be a safe option.
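When the goal is only to decide whether a value is an integer, regardless of which fixed-width type the platform picked, one option is to test against numbers.Integral, with which NumPy registers its integer scalar types. The following is a minimal sketch under that assumption and is not taken from ProDy; is_integer is a hypothetical helper name.

```python
import numbers

import numpy as np


def is_integer(value):
    """Return True for Python ints and NumPy integer scalars of any width."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, (bool, np.bool_)):
        return False
    # NumPy integer scalars (int32, int64, ...) are registered as Integral.
    return isinstance(value, numbers.Integral)


print(is_integer(1), is_integer(np.int32(1)), is_integer(np.int64(1)))
print(is_integer(1.0), is_integer(np.array([1], np.int64)))
```

Note that an array is never treated as an integer by this check; individual elements have to be tested instead.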
Relative paths¶
The os.path.relpath()
function raises an exception when the working
directory and the path of interest are on separate drives, e.g. trying
to write to C:\temp
while running tests on D:\ProDy
.
Instead of os.path.relpath()
, the ProDy function relpath()
should be used to avoid problems.
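The failure mode can be reproduced on any operating system with the standard ntpath module, which implements Windows path rules. The sketch below shows one possible fallback strategy, returning the absolute path when no relative path exists. It is an illustration only, not ProDy's relpath() implementation, and safe_relpath is a hypothetical name.

```python
import ntpath


def safe_relpath(path, start, pathmod=ntpath):
    """Return path relative to start, or an absolute path if that fails."""
    try:
        return pathmod.relpath(path, start)
    except ValueError:
        # Raised by Windows path rules when path and start are on
        # different drives; fall back to the absolute path.
        return pathmod.abspath(path)


print(safe_relpath(r"D:\ProDy\tests", r"D:\ProDy"))
print(safe_relpath(r"C:\temp", r"D:\ProDy"))
```

Passing pathmod=os.path instead would apply the rules of the platform the code is running on.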
Release Notes¶
ProDy 1.10 Series¶
1.10.2 (May 2, 2018)¶
- Minor fixes.
1.10.1 (May 1, 2018)¶
- Added the function sliceAtomicData for slicing data based on slicing atoms.
- Updated the documentation for making a release.
- Other documentation and minor fixes.
1.10 (Apr 30, 2018)¶
Signature Dynamics¶
- Added
calcEnsembleENMs()
to compute ENMs on each conformation of a given ensemble to obtain an ensemble of modes.
- Added
ModeEnsemble
and sdarray
classes as the basic data types for signature dynamics.
- Added functions such as
calcSignatureSqFlucts()
,
calcSignatureCrossCorr()
, and calcSignatureFractVariance()
to extract signature dynamics.
- Added
calcEnsembleSpectralOverlaps()
to obtain dynamical overlaps/distances among the conformations in a given ensemble.
New Features:
Visualization
- Added
showAtomicLines()
and showAtomicMatrix()
functions to improve visualization.
- Added a networkx option to
showTree()
so that the user can choose to use
networkx
to visualize a given tree.
Ensemble and PDBEnsemble
- Associated an
MSA
object with the PDBEnsemble
class.
- Added a pairwise option to
Ensemble.getRMSDs()
to obtain an RMSD table of every pair of conformations in the ensemble.
- Improved
Ensemble.setAtoms()
for selecting a subset of residues/atoms of the ensemble.
Databases and Web Services
- Added methods and classes for obtaining data from CATH and Dali.
- Added additional functions for Uniprot and Pfam such as
queryUniprot()
and parsePfamPDBs()
.
Bug Fixes and Improvements
- Fixed compatibility problems for Python 2 and 3.
- Improved the
saveModel()
function to include class-specific features.
- Fixed a bug related to the
AtomGroup
addition method.
- Bug fixes to
NMA
classes.
- Fixed a problem with
MSA
indexing.
- Reorganized file structures and functions for consistency.
- Other bug fixes.
ProDy 1.9 Series¶
1.9.4 (Feb 02, 2018)¶
- Undocumented release and fixes.
1.9.3 (Oct 09, 2017)¶
Bugfixes
- Bug fix about http and ftp based pdb downloads.
- Bug fixes in PRS calculations.
1.9.2 (Aug 29, 2017)¶
New Features:
Migration to pypi.org
- All repositories are moved to pypi.org
1.9.1 (Aug 18, 2017)¶
New Features:
PDB Secondary Structures
- It is possible to write secondary structure information to PDBs.
Bugfixes
- Fixed the problem about clang compiler for saxs tools.
- If FTP client is not working, HTTP client will be used when downloading PDBs.
1.9 (May 23, 2017)¶
New Features:
Perturbation Response Scanning
- Perturbation Response Scanning method is fully implemented with new plotting tools.
- Effectors and sensors are calculated from PRS tool.
Visualization with py3Dmol
- In jupyter notebook, if you have installed py3Dmol you can use
py3Dmol visualization directly instead of simple matplotlib visualization.
mmcif parser
- Another structural format cif is also a part of ProDy parser now.
Bugfixes
- Various indexing issues are fixed.
- Some of the obsolete pdbs will not be downloaded anymore, instead
replaced pdbs will be downloaded. This will change the priority between ftp and http servers.
ProDy 1.8 Series¶
1.8.2 (Jun 5, 2016)¶
- addCoordset()
in PDBEnsemble
class has an additional argument for NMR models.
1.8.1 (May 28, 2016)¶
Bugfixes
- getHits()
in PDBBlastRecord
class: default overlap threshold changed to 0.7 to match mapOntoChain()
.
- calcModes()
in RTB
had a bug related to the number of modes, which is fixed.
- Tab and indentation errors with Python 3.4 are fixed.
1.8 (May 13, 2016)¶
MechStiff¶
- Identification of the weakest/strongest elements of the structure architecture provided together with 3D visualization and statistics analysis.
- Determination of the effective spring constant for selected pair of residues - useful for Single Molecule Force Spectroscopy (SMFS, AFM) and Steered Molecular Dynamics simulations.
- Evaluating the contributions of each mode to selected deformations
New Features:
Python 2 and 3 Support
- ProDy has been refactored to support Python 2.7 and 3.4. Windows installers for Python 2.7 and 3.4 are available in Installation.
- Unit tests are compatible with Python 2.7 and 3.4, and running them with other versions gives errors due to unavailability of some
unittest
features.
Bugfixes
- Various indexing issues are fixed.
- Compatibility issue of
searchPfam()
with Python 2.7.11 is fixed.
ProDy 1.7 Series¶
1.7.1 (May 31, 2015)¶
Changes:
searchPfam()
uses hmmer for given sequence inputs instead of pfam search.
1.7 (Dec 23, 2013)¶
New Features:
- buildPCMatrix()
is implemented for calculation of coevolution with PSICOV method from multiple sequence alignments.
- specMergeMSA()
is implemented for merging multiple sequence alignment files based on the species identifiers of sequences.
- exANM
is implemented for explicit membrane ANM calculations.
- writeMembranePDB()
is implemented for writing PDB structures of created membranes for exANM class.
ProDy 1.6 Series¶
1.6.1 (May 31, 2015)¶
Changes:
searchPfam()
uses hmmer for given sequence inputs instead of pfam search.
1.5 (Dec 23, 2013)¶
New Features:
- buildPCMatrix()
is implemented for calculation of coevolution with PSICOV method from multiple sequence alignments.
- specMergeMSA()
is implemented for merging multiple sequence alignment files based on the species identifiers of sequences.
- exANM
is implemented for explicit membrane ANM calculations.
- writeMembranePDB()
is implemented for writing PDB structures of created membranes for exANM class.
ProDy 1.5 Series¶
1.5 (Dec 23, 2013)¶
New Features:
- buildDirectInfoMatrix()
and calcMeff()
are implemented for calculation of direct information from multiple sequence alignments.
- showDirectInfoMatrix()
and showSCAMatrix()
functions are implemented for displaying coevolutionary data.
- RTB
is implemented for Rotations-Translations of Blocks calculations. Optional arguments also permit imANM calculations.
Availability:
- Source is moved from
lib/prody
to prody
.
- Source code will be hosted only at GitHub.
Improvements:
- DCDFile
and parseDCD()
support DCD files written by cpptraj.
Testing:
- ProDy test command (prody test) and function
prody.test()
have been removed for easier maintenance of testing functions. See Testing ProDy for more information on how to test ProDy.
ProDy 1.4 Series¶
1.4.9 (Nov 14, 2013)¶
Upcoming changes:
- Support for Python 3.1 and NumPy 1.5 will be dropped, meaning no Windows installers will be built for these versions.
Improvements:
- HierView
can handle Residue
instances that have same segment name, chain identifier, and resnum, if PDB file contains TER
lines to terminate these residues. If these three identifiers are shared by multiple residues, indexing AtomGroup
instances will return a list of residues. This behavior can be used as follows. Note that in v1.5, this will be the default behavior.
>>> pdb_lines = """
... ATOM 1 O WAT A 1 4.694 -3.891 -0.592 1.00 1.00
... ATOM 2 H1 WAT A 1 5.096 -3.068 -0.190 1.00 1.00
... ATOM 3 H2 WAT A 1 5.420 -4.544 -0.808 1.00 1.00
... TER
... ATOM 4 O WAT A 1 -30.035 19.116 -2.193 1.00 1.00
... ATOM 5 H1 WAT A 1 -30.959 18.736 -2.244 1.00 1.00
... ATOM 6 H2 WAT A 1 -29.993 19.960 -2.728 1.00 1.00
... TER
... ATOM 7 O WAT A 1 -77.584 -21.524 -37.894 1.00 1.00
... ATOM 8 H1 WAT A 1 -77.226 -21.966 -38.717 1.00 1.00
... ATOM 9 H2 WAT A 1 -77.023 -20.726 -37.674 1.00 1.00
... TER"""
>>> from StringIO import StringIO
>>> atoms = parsePDBStream(StringIO(pdb_lines))
Current behavior:
>>> print(atoms.numResidues())
1
>>> atoms['A', 1]
<Residue: WAT 1 from Chain A from Unknown (9 atoms)>
To activate the new behavior (which will be the default behavior in v1.5):
>>> hv = atoms.getHierView(ter=True)
>>> print(hv.numResidues())
>>> hv['A', 1]
Bugfixes:
- Fixed memory leaks in
uniqueSequences()
and buildSeqidMatrix()
.
1.4.8 (Nov 4, 2013)¶
New Features:
- New analysis functions
buildOMESMatrix()
and buildSCAMatrix()
are implemented.
- New
AtomGroup.numBytes()
method returns an estimate of memory usage.
- New
countBytes()
utility function is added for counting bytes used by NumPy arrays.
Improvements:
parsePDB()
resizes data arrays to decrease memory usage.
Bugfixes:
- Fixed memory leaks in MSA
analysis
functions.
- Fixed potential problems with importing contributed libraries.
1.4.7 (Oct 29, 2013)¶
Improvements:
- AtomGroup
, Selection
, and other Atomic
classes are picklable.
- Improved equality tests for
AtomGroup
. Two different instances are considered equal if they contain identical data and coordinate sets.
1.4.6 (Oct 16, 2013)¶
Bugfixes:
- Selection problem with using resid is fixed (issue 160)
- Fixed a memory leak in MSA parsers written in C. When dealing with large files, leak would cause a segmentation fault.
- Fixed a reference counting problem in MSA parsers in C that would cause segmentation fault when reading files that use the same label for multiple sequences.
- Updated
fetchPDBLigand()
to use PDB for fetching XML files.
- Revised handling of MSA file formats to avoid exceptions for unknown extensions.
1.4.5 (Sep 6, 2013)¶
New Features:
- parsePDBHeader()
function can parse space group information from header section specified as REMARK 290
, e.g. parsePDBHeader('1mkp', 'space_group')
or parsePDBHeader('1mkp')['space_group']
- heavy selection flag is defined as an alias for noh.
- matchChains()
function can match non-hydrogen atoms using subset='heavy'
keyword argument.
- Added
update_coords
keyword argument to PCA.buildCovariance()
, so that average coordinates calculated internally can be stored in ensemble or trajectory objects used as input.
Improvements:
- Unit tests can be run with Python 2.6 when unittest2 module is installed.
Bugfixes:
- Fixed problems with reading compressed PDB files using Python 3.3.
- Fixed a bug in
parseSTRIDE()
function that prevented reading files.
- Improved parsing of biomolecular transformations.
- Fixed memory allocation in C code used by
parseMSA()
(Python 2.6).
- Fixed a potential name error in trajectory classes.
- Fixed problems in handling compressed files when using Python 2.6 and 3.3.
- Fixed a problem with indexing
NMA
instances in Python 3 series.
1.4.4 (July 22, 2013)¶
Improvements:
- writeNMD()
and parseNMD()
write and read segment names. NMWiz is also improved to handle segment names. Improvements will be available in VMD v1.9.2.
Bugfixes:
1.4.3 (June 14, 2013)¶
Changes:
- getVMDpath()
and setVMDpath()
functions are deprecated for removal, use pathVMD()
instead.
- Increased
blastPDB()
timeout to 60 seconds.
- extendModel()
and extendMode()
functions have a new option for normalizing extended mode(s).
- sampleModes()
and traverseMode()
automatically normalize input modes.
Bugfixes:
- A bug in
applyTransformation()
is fixed. The function would interpret some external transformation matrices incorrectly.
- A bug in
fetchPDBLigand()
function is fixed.
1.4.2 (April 19, 2013)¶
Improvements:
- fetchPDB()
and fetchPDBfromMirror()
functions can handle partial PDB mirrors. See pathPDBMirror()
for setting a mirror path.
Changes:
Bugfixes:
- Atom selection problems related to using all and none in composite selections, e.g.
'calpha and all'
, are fixed by defining these keywords as Atom Flags.
- Fasta files with sequence labels using multiple pipe characters would cause C parser (and so
parseMSA()
) to fail. This issue is fixed by completely disregarding pipe characters.
- Empty chain identifiers for PDB hits would cause a problem in parsing XML results file and
blastPDB()
would throw an exception. This case is handled by slicing the chain identifier string.
- A problem in
viewNMDinVMD()
related to module imports is fixed.
- A problem with handling weights in
loadEnsemble()
is fixed.
1.4.1 (Dec 16, 2012)¶
New Features:
- buildSeqidMatrix()
and uniqueSequences()
functions are implemented for comparing sequences in an MSA
object.
- showHeatmap()
, parseHeatmap()
, and writeHeatmap()
functions are implemented to support VMD plugin Heat Mapper file format.
- Sequence
is implemented to handle individual sequence records and point to sequences in MSA
instances.
- evol occupancy application is implemented for refined MSA quality checking purposes.
- mergeMSA()
function and evol merge application are implemented for merging Pfam MSA to study multi-domain proteins.
Improvements:
- refineMSA()
function and evol refine application can perform MSA refinements by removing similar sequences.
- writePDB()
function takes beta and occupancy arguments to be outputted in corresponding columns.
- MSA
indexing and slicing are revised and improved.
- parseMSA()
is improved to handle indexing of sequences that have the same label in an MSA file, e.g. domains repeated in a protein.
- prody anm, prody gnm, and prody pca applications can write heatmap files for visualization using NMWiz and Heatmapper plugins.
- Several improvements made to handling sequence labels in Pfam MSA files. Files that contain sequence parts with same protein UniProt ID are handled delicately.
Changes:
- ProDy will not emit a warning message when a wwPDB server is not set using
wwPDBServer()
, and use the default US server.
- Indexing
MSA
returns Sequence
instances.
- Iterating over
MSA
and MSAFile
yields Sequence
instances.
Bugfixes:
- Fixed a syntax problem that prevented running ProDy using Python 2.6.
- Fixed
NMA
indexing problem that was introduced in v1.4.
Normal Mode Wizard¶
- NMWiz can visualize heatmaps linked to structural view via Heatmapper. Clicking on the heatmap will highlight atom or residue pairs.
- ProDy interface has the option to write and load cross-correlations.
- NMWiz can determine whether a model is an extended model. For extended models plotting mobility has been improved. Only a single value per residue will be plotted, and clicking on the plot will highlight all of the residue atoms.
1.4 (Dec 2, 2012)¶
New Features:
Python 3 Support
- ProDy has been refactored to support Python 3. Windows installers for Python 2.6, 2.7, 3.1, and 3.2 are available in Installation.
- Unit tests are compatible with Python 2.7 and 3.2, and running them with other versions gives errors due to unavailability of some
unittest
features.
Sequence Analysis
- New applications Evol Applications are available.
- searchPfam()
and fetchPfamMSA()
functions are implemented for searching and retrieving Pfam data. See MSA Files for usage examples.
- MSAFile
class, parseMSA()
and writeMSA()
functions are implemented for reading and writing multiple sequence alignments. See MSA Files for usage examples.
- MSA
class has been implemented for storing and manipulating MSAs in memory.
- calcShannonEntropy()
, buildMutinfoMatrix()
, and calcMSAOccupancy()
functions are implemented for MSA analysis. See Evolution Analysis for usage examples.
- showShannonEntropy()
, showMutinfoMatrix()
, and showMSAOccupancy()
functions are implemented for MSA analysis. See Evolution Analysis for usage examples.
- applyMutinfoCorr()
and applyMutinfoNorm()
functions are implemented for applying normalization and corrections to mutual information matrices.
- calcRankorder()
function is implemented for identifying highly correlated/co-evolving pairs of residues.
Bugfix:
- Fixed selection issues involving use of
x
or negative numbers.
ProDy 1.3 Series¶
1.3.1 (Nov 6, 2012)¶
New Features:
- Added
fetchPDBviaHTTP()
and fetchPDBviaFTP()
functions.
- Added
copyFile()
function to utilities
.
- Added
prody test
command for convenient testing of ProDy package.
Improvements:
- Improved
gunzip()
function to handle .gz
extensions and string buffers.
Changes:
- getWWPDBFTPServer()
and setWWPDBFTPServer()
are deprecated for removal in v1.4, use wwPDBServer()
instead.
- getPDBLocalFolder()
and setPDBLocalFolder()
are deprecated for removal in v1.4, use pathPDBFolder()
instead.
- getPDBMirrorPath()
and setPDBMirrorPath()
are deprecated for removal in v1.4, use pathPDBMirror()
instead.
- getPDBCluster()
is deprecated for removal in v1.4, use listPDBCluster()
instead.
- getReservedWords()
is deprecated for removal in v1.4, use listReservedWords()
instead.
- getNonstdProperties()
is deprecated for removal in v1.4, use listNonstdAAProps()
instead.
Bugfix:
- Fixed a bug in
HierView
that would cause wrong assignment of residue/chain indices to atoms when residue or chain atoms are separated by atoms of other entities. This would also cause problems when making keyword selections, such as protein.
- Added dummy atom check in
Ensemble.setAtoms()
and Trajectory.setAtoms()
methods to avoid indexing problems.
1.3 (Sep 30, 2012)¶
Improvements:
- select
module and its documentation are completely rewritten.
- Select
class uses simplest possible parser to evaluate selection strings and achieves more than 25% speed-up on average.
- Atom Selections become more forgiving of small typos, but will issue warning messages when they are detected via
SelectionWarning
. These messages can be turned off using confProDy()
.
- Functions used in ProDy Applications have been refactored to allow for using them directly. See
apps
for their documentation.
Bugfix:
- A problem in prody catdcd command that was introduced when refactoring
trajectory
classes is fixed.
ProDy 1.2 Series¶
1.2.1 (Sep 6, 2012)¶
If you are upgrading from ProDy v1.1, see also the below changes introduced in v1.2.
Bugfix:
- A problem in
select
module regarding Numpy numeric types is fixed. Problem would emerge on platforms which do not offer some numeric types, e.g. np.float16
.
- Fixed problems in prody anm, prody gnm, and prody fetch related to writing output files.
Changes:
- The way that prody fetch command handles files containing PDB identifiers has changed.
1.2 (Aug 30, 2012)¶
Important Changes:
Package folder prody
is moved into lib
folder to prevent
exceptions related to importing compiled packages from the installation
folder.
Some changes in Trajectory
and Ensemble
methods related
to linking, setting, and selecting atoms were made to make the interface
more intuitive. These changes, which may break your code, are as follows:
- AtomGroup
instances can be linked to a Trajectory
using Trajectory.link()
method and linking status of an instance can be checked using Trajectory.isLinked()
method.
- Trajectory.setAtoms()
method accepts AtomGroup
and Selection
instances and should be used to select a subset of atoms. This method will not link AtomGroup
instance to the trajectory and also will not update the reference coordinates of the instance.
- Trajectory.select()
and Ensemble.select()
methods are removed and their functions are overloaded to Trajectory.setAtoms()
and Ensemble.setAtoms()
methods, respectively.
- Trajectory.getSelection()
and Ensemble.getSelection()
methods are removed, use Trajectory.getAtoms()
and Ensemble.getAtoms()
instead.
- Trajectory
reference coordinates must be changed using Trajectory.setCoords()
method.
For usage examples see Trajectory Analysis, Trajectory Analysis II, Frames and Atom Groups, and Trajectory Output.
New Features:
- Atom Flags, which are used in Atom Selections, are implemented. See their documentation for handy usage examples.
- sortAtoms()
function is implemented.
- pickCentralConf()
function is implemented to pick the conformation or the active coordinate set that is closest to the average of coordinate sets.
- writePSF()
, a simple PSF file writer, is implemented.
- glob()
utility function is implemented.
- iterPDBFilenames()
function is implemented, which can be used to iterate over all PDB files stored in a local mirror of Protein Data Bank.
- findPDBFiles()
function is implemented, which can be used to access PDB files in a path.
Improvements:
- HierView
instances are built more efficiently. Two times speed-up is achieved by delaying instantiation of Chain
and Residue
instances until they are needed.
- Multiple Atom Flags can be used in Atom Selections without using
'and'
operator, e.g. 'sidechain carbon'
is the same as 'sidechain and carbon'
.
- writePDB()
accepts Ensemble
, Conformation
, and Frame
instances as atoms argument.
- writePDB()
function is around 25% faster.
- pickCentral()
is extended to accept Atomic
and Ensemble
instances. Old function is now pickCentralAtom()
.
- prody align command and
prody_align()
function can handle non-protein atom selections (see examples for prody align).
- parsePDB()
and writePDB()
support 100K and more atoms.
Changes:
- showOverlapTable()
displays first set of modes along x axis of the plot.
- AtomGroup.setData()
does not accept arrays with boolean data type, use AtomGroup.setFlags()
instead.
- writePDB()
function argument model is changed to csets that indicates the coordinate set index of atoms argument.
- PackageLogger.timing()
does not return elapsed time, only logs this information.
- PackageLogger.startLogfile()
is deprecated for removal in v1.3, use PackageLogger.start()
instead.
- PackageLogger.closeLogfile()
is deprecated for removal in v1.3, use PackageLogger.close()
instead.
- from prody.utilities import *
will not work anymore due to potential name conflicts with Python standard library functions. Import required functions explicitly.
- writePDB()
appends .pdb
extension to filename when it is not present.
- prody select command positional argument order is changed to allow for handling multiple PDBs at a time. Old order will be supported until v1.4, but a warning message will be issued.
- select argument in
alignCoordsets()
is removed, make selection outside of the function instead.
Deprecations:
- AtomGroup.getHeteros()
method has been deprecated for removal in v1.3, use getFlags('hetatm')
instead.
- AtomMap.getMappedFlags()
and AtomMap.getDummyFlags()
methods have been deprecated for removal in v1.3, use getFlags('mapped')
and getFlags('dummy')
instead.
- getVerbosity()
and setVerbosity()
are deprecated for removal in v1.3, use confProDy()
instead, which saves changes permanently.
- NMA.getModes()
and ModeSet.getModes()
methods are deprecated for removal in v1.3, use list()
, e.g. list(model)
, instead.
Bugfixes:
- Fixed a bug in prody contacts command that caused problems when selecting a subset of the target atoms.
Normal Mode Wizard¶
Improvements:
- ProDy Interface shows the size of the trajectory output file for PCA calculations.
- Mode Graphics Options allows for copying arrows settings from one mode to another.
- Color scale method and midpoint for protein coloring based on mobility and bfactors can be adjusted from Protein Graphics Options panel.
ProDy 1.1 Series¶
1.1 (June 1, 2012)¶
New Features:
- iterFragments()
function is added.
- findNeighbors()
function is added.
- calcMSF()
and calcRMSF()
functions are added.
- wrapAtoms()
function is added.
- extendMode()
and extendVector()
functions are added.
- prody contacts command is added.
Improvements:
- moveAtoms()
function is improved to move atoms to a specified location.
- DCDFile
and parseDCD()
take astype keyword argument for automatic type recasting for coordinate arrays. This option can be used to convert 32-bit coordinate arrays to 64-bit automatically for higher precision calculations.
- Commands prody anm, prody gnm, and prody pca can extend a coarse grained model to backbone or all atoms of the residues. See their documentation pages.
Changes:
- Color scale used by
showOverlapTable()
is normalized by default.
- tools
module is deprecated for removal, use utilities
instead.
- array argument in
moveAtoms()
is replaced with by keyword argument.
- which argument in
AtomGroup.copy()
method is deprecated for removal in version 1.2.
- DCDFile
does not log information for most common type of DCD file, i.e. 32-bit CHARMM format.
- Trajectory.getNextIndex()
method is deprecated for removal in v1.2, use nextIndex()
instead.
Bugfixes:
- Fixed several problems in
iterNeighbors()
function and Contacts
class that were introduced after transition to new KDTree
interface.
- Fixed a problem in setting selection strings of fragments identified using
findFragments()
.
- Fixed a problem in
calcCenter()
related to weighted center calculation.
- Fixed a problem in copying
AtomMap
instances, which would emerge when bond information was present in unusual mappings, such as when atom orders are changed or an atom is present multiple times in the mapping.
Normal Mode Wizard¶
Improvements:
- Mode scaling options are improved.
- Options added for extending coarse grained NMA models to residue backbone or all atoms.
ProDy 1.0 Series¶
1.0.4 (May 2, 2012)¶
Bugfixes:
- Fixed a problem in
calcPhi()
function that raised a name error.
- Fixed a problem in
KDTree.getDistances()
method that raised a name error when unitcell is provided.
- Fixed a problem in
buildDistMatrix()
and calcDistance()
functions causing miscalculations when unitcell is given.
- Revised
KDTree
methods to handle special cases where unitcell might have some dimensions zero.
Changes:
buildKDTree()
method is removed, earlier than planned due to unexpected bugfix releases.
1.0.3 (May 1, 2012)¶
Bugfixes:
- Fixed
kdtree
import problem.
New Features:
buildDistMatrix()
function that can take periodic boundary conditions is implemented.
Improvements:
calcDistance()
function is improved to take periodic boundary conditions into account when provided by the users.
1.0.2 (May 1, 2012)¶
New Features:
- Methods to deal with connected subsets of atoms are implemented, see
AtomGroup.iterFragments()
and AtomGroup.numFragments()
.
- pickCentral()
method is implemented for picking the atom that is closest to the centroid of a group or subset of atoms.
- ProDy configuration option auto_secondary is implemented to allow for parsing and assigning secondary structure information from PDB file header data automatically. See
assignSecstr()
and confProDy()
for usage details.
- prody align makes use of
select
when aligning multiple structures. See usage examples: prody align
- printRMSD()
function that prints minimum, maximum, and mean RMSD values when comparing multiple coordinate sets is implemented.
- findFragments()
function that identifies fragments in atom subsets, e.g. Selection
, is implemented.
- A new
KDTree
interface with coherent method names and capability to handle periodic boundary conditions is implemented.
Improvements:
- Performance improvements made in
saveAtoms()
and loadAtoms()
.
- sliceMode()
, sliceModel()
, sliceVector()
, and reduceModel()
functions accept Selection
instances as well as selection strings. In repeated use of these functions, if selections are already made outside of the function, considerable speed-ups are achieved when selection is passed instead of selection string.
- Fragment iteration (
AtomGroup.iterFragments()
) is improved to yield items faster.
Changes:
- There is a change in the behavior of addition operation on instances of
AtomGroup
. When operands do not have same number of coordinate sets, the result will have one coordinate set that is concatenation of the active coordinate sets of operands.
- buildKDTree()
function is deprecated for removal, use the new KDTree
class instead.
Bugfixes:
- A problem in building hierarchical views when making selections using resindex, chindex, and segindex keywords is fixed.
- A problem in
Chain
and Residue
selection strings that would emerge when a HierView
is built using a selection is fixed.
- A problem with copying
AtomGroup
instances whose coordinates are not set is fixed.
- AtomGroup
fragment detection algorithm is rewritten to avoid the problem of reaching maximum recursion depth for large molecules with the old recursive algorithm.
- A problem with picking central atom of
AtomGroup
instances in pickCentral()
function is fixed.
- A problem in
Select
class that caused exceptions when evaluating complex macro definitions is fixed.
- Fixed a problem in handling multiple trajectory files. The problem would emerge when a file was added (
addFile()
) to a Trajectory
after atoms were set (setAtoms()
). Newly added file would not be associated with the atoms and coordinates parsed from this file would not be set for the AtomGroup
instance.
1.0.1 (Apr 6, 2012)¶
New Features:
- ProDy can be configured to automatically check for updates on a regular basis, see
checkUpdates()
and confProDy()
functions for details.
- alignPDBEnsemble()
function is implemented to align PDB files using transformations calculated in ensemble analysis. See usage example in Homologous Proteins example.
- PDBConformation.getTransformation()
is implemented to return the transformation that was used to superpose conformation onto reference coordinates. This transformation can be used to superpose the original PDB file onto the reference PDB file.
- Amino acid sequences with regular expressions can be used to make atom selections, e.g.
'sequence "C..C"'
. See Atom Selections for usage details.
- calcCrossProjection()
function is implemented.
Improvements:
- Select
class raises a SelectionError
when potential typos are detected in a selection string, e.g. 'chain AB'
is a grammatically correct selection string that will return None since no atoms have chain identifier 'AB'
. In such cases, an exception noting that values exceed maximum number of characters is raised.
- prody align command accepts percent sequence identity and overlap parameters used when matching chains from given multiple structures.
- When using prody align command to align multiple structures, all models in NMR structures are aligned onto the reference structure.
- prody catdcd command accepts
--align SELSTR
argument that can be used to align frames when concatenating files.
- showProjection()
and showCrossProjection()
functions are improved to evaluate list of markers, color, labels, and texts. See usage example in Plotting.
- Trajectory
instances can be used for calculating and plotting projections using calcProjection()
, showProjection()
, calcCrossProjection()
, and showCrossProjection()
functions.
Changes:
- Phosphorylated amino acids, phosphothreonine (TPO), O-phosphotyrosine (PTR), and phosphoserine (SEP), are recognized as acidic protein residues. This prevents having breaks in protein chains which contain phosphorylated residues. See Atom Selections for definitions of protein and acidic keywords.
- Hit dictionaries from
PDBBlastRecord
will use percent_overlap instead of percent_coverage. Older key will be removed in v1.1.
- Transformation.get4x4Matrix()
method is deprecated for removal in v1.1, use Transformation.getMatrix()
method instead.
Bugfixes:
- A bug in some ProDy Applications is fixed. The bug would emerge when invalid arguments were passed to affected commands and throw an unrelated exception hiding the error message related to the arguments.
- A bug in
'bonded to ...'
is fixed that emerged when '...'
selected nothing.
- A bug in
'not'
selections using.
operator is fixed.
1.0 (Mar 7, 2012)¶
Improvements:
- ANM.buildHessian() method is not using a KDTree by default, since with some code optimization the version not using KDTree is running faster. Same optimization has gone into GNM.buildKirchhoff() too, but for Kirchhoff matrix, version using KDTree is faster and is the default. Both methods have kdtree argument to choose whether to use it or not.
- prody script is updated. Importing ProDy and NumPy libraries is avoided, so the script responds to help queries faster. See ProDy Applications for script usage details.
- Added bonded to ... selection method that expands a selection to immediately bonded atoms. See Atom Selections for its description.
- fetchPDBLigand() parses bond data from the XML file.
- fetchPDBLigand() can optionally save compressed XML files into ProDy package folder so that frequent access to same files will be more rapid. See confProDy() function for setting this option.
- Select class is revised. All exceptions are handled delicately to increase the stability of the class.
- Distance based atom selection is 10 to 15% faster for atom groups with more than 5K atoms.
- Added uncompressed file saving option to prody blast command.
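To make the kdtree trade-off above more concrete, the following is a minimal pure-Python sketch of the brute-force (no KDTree) way of building a Kirchhoff matrix by checking every atom pair. It is an illustration only and not ProDy's implementation; the function name build_kirchhoff and its cutoff and gamma parameters are assumptions made for this example.

```python
import math

def build_kirchhoff(coords, cutoff=10.0, gamma=1.0):
    """Return a Kirchhoff (connectivity) matrix as a list of lists.

    Hypothetical helper for illustration: it loops over all atom pairs,
    which is the approach that does not rely on a KDTree.
    """
    n = len(coords)
    kirchhoff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Off-diagonal elements mark pairs that are in contact.
                kirchhoff[i][j] = kirchhoff[j][i] = -gamma
                # Diagonal elements accumulate the number of contacts.
                kirchhoff[i][i] += gamma
                kirchhoff[j][j] += gamma
    return kirchhoff

# Three points on a line, 4 units apart; only direct neighbors are in contact.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
matrix = build_kirchhoff(coords, cutoff=5.0)
```

The pairwise loop scales quadratically with the number of atoms, which is why a spatial index such as a KDTree can pay off for some matrix types and system sizes.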
Changes:
- All deprecated methods and functions scheduled for removal are removed.
- getEigenvector() and getEigenvalue() methods are deprecated for removal in v1.1, use Mode.getEigvec() and Mode.getEigval() instead.
- getEigenvectors() and getEigenvalues() methods are deprecated for removal in v1.1, use NMA.getEigvecs() and NMA.getEigvals() instead.
- Mode.getCovariance() and ModeSet.getCovariance() methods are deprecated for removal in v1.1, use calcCovariance() method instead.
- Mode.getCollectivity() method is removed, use calcCollectivity() function instead.
- Mode.getFractOfVariance() method is removed, use the new calcFractVariance() function instead.
- Mode.getSqFlucts() method is removed, use calcSqFlucts() function instead.
- Renamed showFractOfVar() function as showFractVars().
- Removed calcCumOverlapArray(), use calcCumulOverlap() with array=True argument instead.
- Renamed extrapolateModel() as extendModel().
- The relation between AtomGroup, Trajectory, and Frame instances has changed. See Trajectory Analysis II and Trajectory Output, and Frames and Atom Groups usage examples.
- AtomGroup cannot be deformed by direct addition with a vector instance.
- Unmapped atoms in AtomMap instances are called dummies. AtomMap.numUnmapped() method, for example, is renamed as AtomMap.numDummies().
- fetchPDBLigand() accepts only filename (instead of save and folder) argument to save an XML file.
Bugfixes:
- A problem in distance based atom selection, which could occur when a distance based selection was made from a selection, is fixed.
- Changed prody blast so that when a path for downloading files is given, files are not saved to local PDB folder.
ProDy 0.9 Series¶
0.9.4 (Feb 4, 2012)¶
Changes:
- setAtomGroup() and getAtomGroup() methods are renamed as Ensemble.setAtoms() and Ensemble.getAtoms().
- AtomGroup class trajectory methods, i.e. AtomGroup.setTrajectory(), AtomGroup.getTrajectory(), AtomGroup.nextFrame(), AtomGroup.nextFrame(), and AtomGroup.gotoFrame() methods are deprecated. Version 1.0 will feature a better integration of AtomGroup and Trajectory classes.
Bugfixes:
- Bugfixes in Bond.setACSIndex(), saveAtoms(), and HierView.getSegment().
- Bugfixes in GammaVariableCutoff and GammaStructureBased classes.
- Bugfix in calcCrossCorr() function.
- Bugfixes in Ensemble.getWeights(), showOccupancies(), DCDFile.flush().
- Bugfixes in ProDy commands prody blast, prody fetch, and prody pca.
- Bugfix in calcCenter() function.
0.9.3 (Feb 1, 2012)¶
New Features:
- DBRef class is implemented for storing references to sequence databases parsed from PDB header records.
- Methods for storing coordinate set labels in AtomGroup instances are implemented: getACSLabel(), and getACSLabel().
- calcCenter() and moveAtoms() functions are implemented for dealing with coordinate translation.
- Hierarchical view, HierView, is completely redesigned. When PDB files contain a non-empty segment name column (or when such information is parsed from a PSF file), the new design uses this information to identify distinct chains and residues. This prevents merging distinct chains in different segments but with same identifiers, and residues in those with same numbers. New design is also using ordered dictionaries (collections.OrderedDict) and lists so that chain and residue iterations yield them in the order they are parsed from file. These improvements also bring modest improvements in speed.
- Segment class is implemented for handling segments of atoms defined in molecular dynamics simulations setup, using psfgen for example.
- Context manager methods are added to trajectory classes. A trajectory file can be opened as follows:
    with Trajectory('mdm2.dcd') as traj:
        for frame in traj:
            calcGyradius(frame)
- Chain slicing is implemented:
    p38 = parsePDB('1p38')
    chA = p38['A']
    res_4to10 = chA[4:11]
    res_100toLAST = chA[100:]
- Some support for bonds is implemented to AtomGroup class. Bonds can be set using setBonds() method. All bonds must be set at once. iterBonds() or iterBonds() methods can be used to iterate over bonds in an AtomGroup or an Atom.
- parsePSF() parses bond information and sets to the atom group.
- Selection.update() method is implemented, which may be useful to update a distance based selection after coordinate changes.
- buildKDTree() and iterNeighbors() methods are implemented for facilitating identification of pairs of atoms that are proximal.
- iterAtoms() method is implemented to all atomic classes to provide uniformity for atom iterations.
- calcAngle(), calcDihedral(), calcPhi(), calcPsi(), and calcOmega() methods are implemented.
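As background for the angle functions listed above, the following sketch shows one standard way to compute a dihedral angle from four points using only the Python standard library. It is not ProDy's calcDihedral(); the helper names are hypothetical and the sign convention may differ from the one ProDy uses.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral_degrees(p1, p2, p3, p4):
    """Dihedral angle in degrees defined by four points (illustrative helper)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    # Normals of the planes (p1, p2, p3) and (p2, p3, p4).
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    norm_b2 = math.sqrt(_dot(b2, b2))
    # m1 completes an orthogonal frame with n1 and the central bond.
    m1 = _cross(n1, tuple(x / norm_b2 for x in b2))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))

# Points on the same side of the central bond (cis) and on opposite sides (trans).
cis = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
```

Backbone angles such as phi, psi, and omega are dihedrals over specific sets of four backbone atoms, so a single routine of this kind can serve all three once the atoms are chosen.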
Improvements:
- Chain.getSelstr() and Residue.getSelstr() methods are improved to include the selection string of a Selection when they are built using one.
Changes:
- Residue methods getNumber(), setNumber(), getName(), setName() are deprecated and will be removed in v1.0.
- Chain methods getIdentifier() and setIdentifier() are deprecated and will be removed in v1.0.
- Polymer attribute identifier is renamed as chid.
- Chemical attribute identifier is renamed as resname.
- getACSI() and setACSI() are renamed as getACSIndex() and setACSIndex(), respectively.
- calcRadiusOfGyration() is deprecated and will be removed in v1.0. Use calcGyradius() instead.
Bugfixes:
- Fixed a problem in parsePDB() that caused losing existing coordinate sets in an AtomGroup when passed as ag argument.
- Fixed a problem with "same ... as ..." argument of Select that selected atoms when followed by an incorrect atom selection.
- Fixed another problem with "same ... as ..." which resulted in selecting multiple chains when same chain identifier is found in multiple segments, or multiple residues when same residue number is found in multiple segments.
- Improved handling of negative integers in indexing AtomGroup instances.
0.9.2 (Jan 11, 2012)¶
New Features:
- prody catdcd command is implemented for concatenating and/or slicing .dcd files. See prody catdcd for usage examples.
- DCDFile can be opened in write or append mode, and coordinate sets can be added using write() method.
- getReservedWords() can be used to get a list of words that cannot be used to label user data.
- confProDy() function is added for configuring ProDy.
- ProDy can optionally backup existing files with .BAK (or another) extension instead of overwriting them. This behavior can be activated using confProDy() function.
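The backup behavior described in the last item can be pictured with the short sketch below. It only illustrates the general backup-then-write idea with the standard library; write_with_backup and its backup_ext parameter are hypothetical names, and ProDy's own option is set through confProDy() rather than through a function like this.

```python
import os
import tempfile

def write_with_backup(path, text, backup_ext=".BAK"):
    """Write text to path, renaming any existing file to path + backup_ext first.

    Hypothetical helper used only to illustrate the idea.
    """
    if os.path.exists(path):
        # os.replace also overwrites an older backup, if one exists.
        os.replace(path, path + backup_ext)
    with open(path, "w") as handle:
        handle.write(text)

# Write the same file twice in a scratch directory; the first version
# should survive as the backup copy.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "model.txt")
write_with_backup(target, "first version")
write_with_backup(target, "second version")

with open(target + ".BAK") as handle:
    backup_text = handle.read()
with open(target) as handle:
    current_text = handle.read()
```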
Improvements:
- writeDCD() function accepts AtomGroup or other Atomic instances as trajectory argument.
- prody align command can be used to align multiple PDB structures.
- prody pca command allows atom selections for DCD files that are accompanied with a PDB or PSF file.
Changes:
- DCDFile instances raise an exception when used after being closed, similar to the behavior of file objects in Python.
- Title of AtomGroup instances resulting from copying an Atomic instance does not start with ‘Copy of’.
- changeVerbosity() and getVerbosityLevel() are renamed as setVerbosity() and getVerbosity(), respectively. Old names will be removed in v1.0.
- ProDy applications (commands) module is rewritten to use new argparse module. See ProDy Applications for details of changes.
- argparse module is added to the package for Python versions 2.6 and older.
Bugfixes:
- Fixed problems in loadAtoms() and saveAtoms() functions.
- Bugfixes in parseDCD() and writeDCD() functions for Windows compatibility.
0.9.1 (Nov 9, 2011)¶
Bug Fixes:
- Fixed problems with reading and writing configuration files.
- Fixed problem with importing nose for testing.
0.9 (Nov 8, 2011)¶
New Features:
- PDBML and mmCIF files can be retrieved using fetchPDB() function.
- getPDBLocalFolder() and setPDBLocalFolder() functions are implemented for local PDB folder management.
- parsePDBHeader() is implemented for convenient parsing of header data from .pdb files.
- showProtein() is implemented to allow taking a quick look at protein structure.
- Chemical and Polymer classes are implemented for storing chemical and polymer component data parsed from PDB header records.
Changes:
Warning
This release introduces numerous changes in method and function names, all aiming to improve the interactive usage experience. All changes are listed below. Currently these functions and methods are present under both old and new names, so code using ProDy should not be affected. Old function names will be removed in version 1.0, which is expected to happen late in the first quarter of 2012.
Old function names are marked as deprecated, but ProDy will not issue any warnings until the end of 2011. In 2012, ProDy will automatically start issuing DeprecationWarning upon calls using old names to remind the user of the name change.
For deprecated methods that are present in multiple classes, only the affected modules are listed for brevity.
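The deprecation scheme described above follows a common Python pattern, sketched here with the standard warnings module. The two function names are hypothetical stand-ins for an old and a new name; this is not ProDy's code, only an illustration of how an old name can keep working while reminding the caller of the rename.

```python
import warnings

def num_atoms():
    """New-style name (hypothetical stand-in for a renamed function)."""
    return 42

def get_num_of_atoms():
    """Old-style name kept as a deprecated alias of num_atoms()."""
    warnings.warn("get_num_of_atoms() is deprecated, use num_atoms()",
                  DeprecationWarning, stacklevel=2)
    return num_atoms()

# DeprecationWarning is hidden by default in many contexts, so it is
# enabled explicitly here while looking for uses of old names.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    value = get_num_of_atoms()
```

Running existing scripts with such warnings enabled is a quick way to list every call site that still uses an old name.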
Note
When modifying code using ProDy to adjust the name changes, turning on deprecation warnings may help locating all use cases of the deprecated names. See turnonDepracationWarnings() for this purpose.
Functions:
The following function name changes are mainly to reduce the length of the name in order to make them more suitable for interactive sessions:
Old name                            New name
applyBiomolecularTransformations()  buildBiomolecules()
assignSecondaryStructure()          assignSecstr()
scanPerturbationResponse()          calcPerturbResponse()
calcCrossCorrelations()             calcCrossCorr()
calcCumulativeOverlap()             calcCumulOverlap()
calcCovarianceOverlap()             calcCovOverlap()
showFractOfVariances()              showFractVars()
showCumFractOfVariances()           showCumulFractVars()
showCrossCorrelations()             showCrossCorr()
showCumulativeOverlap()             showCumulOverlap()
deform()                            deformAtoms()
calcSumOfWeights()                  calcOccupancies()
showSumOfWeights()                  showOccupancies()
trimEnsemble()                      trimPDBEnsemble()
getKeywordResidueNames()            getKeywordResnames()
setKeywordResidueNames()            setKeywordResnames()
getPairwiseAlignmentMethod()        getAlignmentMethod()
setPairwiseAlignmentMethod()        setAlignmentMethod()
getPairwiseMatchScore()             getMatchScore()
setPairwiseMatchScore()             setMatchScore()
getPairwiseMismatchScore()          getMismatchScore()
setPairwiseMismatchScore()          setMismatchScore()
getPairwiseGapOpeningPenalty()      getGapPenalty()
setPairwiseGapOpeningPenalty()      setGapPenalty()
getPairwiseGapExtensionPenalty()    getGapExtPenalty()
setPairwiseGapExtensionPenalty()    setGapExtPenalty()
Coordinate methods:
All getCoordinates() and setCoordinates() methods in atomic and ensemble classes are renamed as getCoords() and setCoords(), respectively.
getNumOf methods:
All method names starting with getNumOf now start with num. This change brings two advantages: method names (i) are considerably shorter, and (ii) do not suggest that there might also be corresponding set methods.
Old name                New name        Affected modules
getNumOfAtoms()         numAtoms()      atomic, ensemble, dynamics
getNumOfChains()        numChains()     atomic
getNumOfConfs()         numConfs()      ensemble
getNumOfCoordsets()     numCoordsets()  atomic, ensemble
getNumOfDegOfFreedom()  numDOF()        dynamics
getNumOfFixed()         numFixed()      ensemble
getNumOfFrames()        numFrames()     ensemble
getNumOfResidues()      numResidues()   atomic
getNumOfMapped()        numMapped()     atomic
getNumOfModes()         numModes()      dynamics
getNumOfSelected()      numSelected()   ensemble
getNumOfUnmapped()      numUnmapped()   atomic
getName method:
getName() methods are renamed as getTitle() to avoid confusion that might arise from changes in atomic method names listed below. All classes in atomic, ensemble, and dynamics are affected by this change.
In line with this change, parsePDB() and parsePQR() name arguments are changed to title, but name argument will also work until release 1.0.
This name change conflicted with DCDFile.getTitle() method. The conflict is resolved in favor of the general getTitle() method. An alternative method will be implemented to handle title strings in DCD files.
get/set methods of atomic classes:
Names of get and set methods allowing access to atomic data are all shortened as follows:
Old name                New name
getAtomNames()          getNames()
getAtomTypes()          getTypes()
getAltLocIndicators()   getAltlocs()
getAnisoTempFactors()   getAnisos()
getAnisoStdDevs()       getAnistds()
getChainIdentifiers()   getChains()
getElementSymbols()     getElements()
getHeteroFlags()        getHeteros()
getInsertionCodes()     getIcodes()
getResidueNames()       getResnames()
getResidueNumbers()     getResnums()
getSecondaryStrs()      getSecstrs()
getSegmentNames()       getSegnames()
getSerialNumbers()      getSerials()
getTempFactors()        getBetas()
This change affects all atomic classes, AtomGroup, Atom, Chain, Residue, Selection and AtomMap.
Other changes in atomic methods:
- getSelectionString() renamed as getSelstr()
Methods handling user data (which was previously called attribute) are renamed as follows:
Old name        New name
getAttribute()  getData()
getAttrNames()  getDataLabels()
getAttrType()   getDataType()
delAttribute()  delData()
isAttribute()   isData()
setAttribute()  setData()
To be removed:
Finally, the following methods will be removed, but other suitable methods are overloaded to perform their action:
- removed AtomGroup.getBySerialRange(), overloaded AtomGroup.getBySerial()
- removed getProteinResidueNames(), overloaded getKeywordResnames()
- removed setProteinResidueNames(), overloaded setKeywordResnames()
Scripts:
The way ProDy scripts work has changed. See ProDy Applications for details. Using older scripts will start issuing deprecation warnings in 2012.
Bug Fixes:
- Bugs in execDSSP() and execSTRIDE() functions that caused exceptions when compressed files were passed are fixed.
- A problem in scripts for PCA of DCD files is fixed.
Normal Mode Wizard¶
Development of NMWiz is finalized and it will not be distributed in the ProDy installation package anymore. See Normal Mode Wizard pages for instructions on installing it.
ProDy 0.8 Series¶
0.8.3 (Oct 16, 2011)¶
New Features:
- Functions to read and write PQR files: parsePQR() and writePQR().
- Added PDBEnsemble.getIdentifiers() method that returns identifiers of all conformations in the ensemble.
- ProDy tests are incorporated to the package installer. If you are using Python version 2.7, you can run the tests by calling prody.test().
Improvements:
- blastPDB() function and PDBBlastRecord class are rewritten to use faster and more compact code.
- New PackageLogger function is implemented to unify logging and reporting task progression.
- Improvements in PDB ensemble support functions, e.g. trimPDBEnsemble(), are made.
- Improvements in ensemble concatenations are made.
Bug Fixes:
- Bugfixes in PDBEnsemble() slicing operation. This may have affected users when slicing a PDB ensemble for plotting projections in color for different forms of the protein.
0.8.2 (Oct 14, 2011)¶
New Features:
- fetchPDBClusters(), loadPDBClusters(), and getPDBCluster() functions are implemented for handling PDB sequence cluster data. These functions can be used instead of blastPDB() function for fast access to structures of the same protein (at 95% sequence identity level) or similar proteins.
- Perturbation response scanning method described in [AA09] is implemented as scanPerturbationResponse() based on the code provided by Ying Liu.
Changes:
- fetchPDBLigand() returns the URL of the XML file in the ligand data dictionary.
- The ProDy configuration file in user home directory is renamed as .prodyrc (used to be .prody).
- applyBiomolecularTransformations() and assignSecondaryStructure() functions raise ValueError when the function fails to perform its action due to missing data in header dictionary.
- fetchPDB() decompresses PDB files found in the working directory when user asks for decompressed files.
- parsePDB() appends chain and subset arguments to AtomGroup() name.
- chain argument is added to PDBBlastRecord.getHits().
Improvements:
- Atom selection class Select is completely redesigned to prevent breaking of the parser when evaluating invalid selection strings.
- Improved type checking in parsePDB() function.
Bug Fixes:
- Bugfixes in parseDSSP(): one problem emerged in lines indicating chain breaks, another was that bridge-partners were not parsed correctly. Both fixes are contributed by Kian Ho.
- Bugfix in parsePDB() function. When only header is desired (header=True, model=0), it would return a tuple containing an empty atom group and the header.
Developmental:
- Unit tests for proteins and select modules are developed.
0.8.1 (Sep 16, 2011)¶
New Features:
- fetchLigandData() is implemented for fetching ligand data from Ligand Expo.
- parsePSF() function is implemented for parsing X-PLOR format PSF files.
Changes:
- __slots__ is used in AtomGroup and Atomic classes. This change prevents user from assigning new variables to instances of all classes derived from the base Atomic.
- pyparsing is updated to version 1.5.6.
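The effect of the __slots__ change above can be seen with a toy class. SlottedAtoms is a hypothetical stand-in, not ProDy's AtomGroup; the sketch only shows the standard Python behavior that instances of a class defining __slots__ reject attributes that were not declared.

```python
class SlottedAtoms:
    """Toy class using __slots__; not ProDy's AtomGroup."""
    __slots__ = ("title", "coords")

    def __init__(self, title):
        self.title = title
        self.coords = None

atoms = SlottedAtoms("demo")
atoms.coords = [(0.0, 0.0, 0.0)]    # allowed: declared in __slots__

try:
    atoms.my_note = "not allowed"    # not declared, so assignment fails
    assignment_failed = False
except AttributeError:
    assignment_failed = True
```

Code that used to attach ad hoc variables to atomic instances would therefore need another place to keep them, such as a separate dictionary keyed by the instance title.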
Bug Fixes:
- A bug in AtomGroup.copy() method is fixed. When AtomGroup instance itself is copied, deep copies of data arrays were not made.
- A bug in Select class raising exceptions when negative residue number values are present is fixed.
- Another bug in Select class misinterpreting same residue as ... statement when specific chains are involved is fixed.
- A bug in AtomGroup.addCoordset() method duplicating coordinates when no coordinate sets are present in the instance is fixed.
Normal Mode Wizard¶
Changes:
- Version number in main window is iterated.
- Mode graphics material is stored for individual modes.
- Mode scaling factor is printed when active mode or RMSD is changed.
- All selections are deleted to avoid memory leaks.
0.8 (Aug 24, 2011)¶
Note
After installing v0.8, you may need to make a small change in your
existing scripts. If you are using Ensemble
class
for analyzing PDB structures, rename it as PDBEnsemble
.
See the other changes that may affect your work below and the class
documentation for more information.
New Features:
- DCDFile is implemented for handling DCD files.
- Trajectory is implemented for handling multiple trajectory files.
- writeDCD() is implemented for writing DCD files.
- Trajectory Analysis example is added to illustrate usage of new classes for handling DCD files. Essential Dynamics Analysis example is updated to use new ProDy classes.
- PCA supports Trajectory and DCDFile instances.
- Ensemble and PDBEnsemble classes can be associated with AtomGroup instances. This allows selecting and evaluating coordinates of subset of atoms. See setAtomGroup(), select(), getAtomGroup(), and getSelection() methods.
- execDSSP(), parseDSSP(), and performDSSP() functions are implemented for executing and parsing DSSP calculations.
- execSTRIDE(), parseSTRIDE(), and performSTRIDE() functions are implemented for executing and parsing STRIDE calculations.
- parsePDB() function parses atom serial numbers. Atoms can be retrieved from an AtomGroup instance by their serial numbers using getBySerial() and getBySerialRange() methods.
- calcADPs() function can be used to calculate anisotropic displacement parameters for atoms with anisotropic temperature factor data.
- getRMSFs() is implemented for calculating root mean square fluctuations.
- AtomGroup and Mode or Vector additions are supported. This adds a new coordinate set to the AtomGroup instance.
- getAttrNames() is implemented for listing user set attribute names.
Improvements:
- calcProjection(), showProjection(), and showCrossProjection() functions can optionally calculate/display RMSD along the normal mode.
- ANM, GNM, and PCA applications can optionally write compressed ProDy data files.
- fetchPDB() function can optionally write decompressed files and force copying a file from local mirror to target folder.
- PCA.buildCovariance() and PCA.performSVD() methods accept Numpy arrays as coordinate sets.
- Performance of PCA.buildCovariance() method is optimized for evaluation of PDB ensembles.
- calcRMSD() and superpose() functions are optimized for speed and memory usage.
- Ensemble.getMSFs() is optimized for speed and memory usage.
- Improvements in memory operations in atomic, ensemble, and dynamics modules for faster data (PDB/NMD) output.
- Optimizations in Select and Contacts classes.
Changes:
- Ensemble does not store conformation names. Instead, newly implemented PDBEnsemble class stores identifiers for individual conformations (PDB IDs). This class should be used in cases where source of individual conformations is important.
- calcProjection(), showProjection(), and showCrossProjection() functions calculate/display root mean square deviations, by default.
- Oxidized cysteine residue abbreviation CSO is added to the definition of protein keyword.
- getMSF() method is renamed as getMSFs().
- parseDCD() function returns Ensemble instances.
Bug Fixes:
- A bug in select module causing exceptions when regular expressions are used is fixed.
- Another bug in select module raising exception when “(not ..,” is passed is fixed.
- Various bugfixes in ensemble module.
- Problem in prody fetch that occurred when a file is found in a local mirror is fixed.
- Bugfix in AtomPointer.copy() method.
Normal Mode Wizard¶
New Features:
- NMWiz can be used to compare two structures by calculating and depicting structural changes.
- Arrow graphics is scaled based on a user specified RMSD value.
Improvements:
- NMWiz writes DCD format trajectories for PCA using ProDy. This provides significant speed up in cases where IO rate is the bottleneck.
Changes:
- Help is provided in a text window to provide a cleaner GUI.
ProDy 0.7 Series¶
0.7.2 (Jun 21, 2011)¶
New Features:
- parseDCD() is implemented for parsing coordinate sets from DCD files.
Improvements:
- parsePDB() parses SEQRES records in header sections.
Changes:
- Major classes can be instantiated without passing a name argument.
- Default selection in NMWiz ProDy interface is changed to ensure selection of only protein Cα atoms.
Bug Fixes:
- A bug in writeNMD() function causing problems when writing a single mode is fixed.
- Other bugfixes in dynamics module functions.
0.7.1 (Apr 28, 2011)¶
Highlights:
- Atomic __getattribute__() is overloaded to interpret atomic selections following the dot operator. For example, atoms.calpha is interpreted as atoms.select('calpha').
- AtomGroup class is integrated with HierView class. Atom group instances now can be indexed to get chains or residues, and number of chains/residues can be retrieved. A hierarchical view is generated and updated when needed.
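The dot-operator selections described above can be illustrated with a toy class. Note that this sketch uses __getattr__, which Python calls only after normal attribute lookup fails, as a simpler stand-in for the __getattribute__ overloading mentioned in the text. ToyAtoms, its keyword table, and the dictionary-based atom records are all hypothetical and unrelated to ProDy's actual classes.

```python
class ToyAtoms:
    """Toy atom container; names and selection rules are hypothetical."""

    _keywords = {
        "calpha": lambda atom: atom["name"] == "CA",
        "water": lambda atom: atom["resname"] == "HOH",
    }

    def __init__(self, atoms):
        self._atoms = list(atoms)

    def select(self, keyword):
        test = self._keywords[keyword]
        return ToyAtoms(a for a in self._atoms if test(a))

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        if name in self._keywords:
            return self.select(name)
        raise AttributeError(name)

    def __len__(self):
        return len(self._atoms)

atoms = ToyAtoms([
    {"name": "CA", "resname": "ALA"},
    {"name": "CB", "resname": "ALA"},
    {"name": "O", "resname": "HOH"},
])
n_calpha = len(atoms.calpha)   # same as len(atoms.select('calpha'))
```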
New Features:
- matchAlign() is implemented for quick alignment of protein structures. See Ligand Extraction usage example.
- setAttribute(), getAttribute(), delAttribute(), and isAttribute() functions are implemented for AtomGroup class to facilitate storing user provided atomic data. See Storing data in AtomGroup example.
- saveAtoms() and loadAtoms() functions are implemented to allow for saving atomic data and loading it. This saves custom atomic attributes and is much faster than parsing data from PDB files.
- calcCollectivity() function is implemented to allow for calculating collectivity of deformation vectors.
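For readers unfamiliar with collectivity, the sketch below computes it for a set of per-atom displacement vectors using a commonly used entropy-based definition: kappa = exp(-sum_i p_i ln p_i) / N, where p_i is the share of atom i in the total squared displacement. Treat the formula and the function name as assumptions made for illustration; whether calcCollectivity() uses exactly this definition is not stated here.

```python
import math

def collectivity(displacements):
    """Collectivity of a deformation given per-atom 3D displacement vectors.

    Assumed definition: exp(-sum_i p_i * ln(p_i)) / N, where p_i is the
    fraction of the total squared displacement carried by atom i.
    """
    squared = [sum(c * c for c in vec) for vec in displacements]
    total = sum(squared)
    entropy = 0.0
    for value in squared:
        p = value / total
        if p > 0.0:  # p * ln(p) tends to 0 as p approaches 0
            entropy -= p * math.log(p)
    return math.exp(entropy) / len(squared)

# All four atoms move equally versus only one atom moving.
uniform = collectivity([(1.0, 0.0, 0.0)] * 4)
localized = collectivity([(1.0, 0.0, 0.0)] + [(0.0, 0.0, 0.0)] * 3)
```

With this definition the value ranges from 1/N, when a single atom carries all of the motion, to 1, when all atoms move by the same amount.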
Improvements:
- parsePDB() can optionally return biomolecule when biomol=True keyword argument is passed.
- parsePDB() can optionally make secondary structure assignments when secondary=True keyword argument is passed.
- calcSqFlucts() function is changed to accept Vector instances, e.g. deformation vectors.
Changes:
- Changes were made in calcADPAxes() function to follow the conventions in analysis of ADPs. See its documentation.
Bug Fixes:
- A bug in Ensemble slicing operations is fixed. Weights are now copied to the new instances obtained by slicing.
- Bug fixes in dynamics plotting functions showScaledSqFlucts(), showNormedSqFlucts().
0.7 (Apr 4, 2011)¶
New Features:
- Regular expressions can be used in atom selections. See select module for details.
- User can define selection macros using defSelectionMacro() function. Macros are saved in ProDy configuration and loaded in later sessions. See select module for other related functions.
- parseSparseMatrix() function is implemented for parsing matrices in sparse format. See the usage example in Using an External Matrix.
- deform() function is implemented for deforming coordinate sets along a normal mode or linear combination of multiple modes.
- sliceModel() function is implemented for slicing normal mode data to be used with functions calculating atomic properties using normal modes.
Improvements:
- Atom selections using bare keyword arguments are optimized. New keyword definitions are added. See select module for the complete list.
- A new keyword argument for calcADPAxes() allows for comparing largest axis to the second largest one.
Changes:
- There are changes in functions used to alter definitions of selection keywords. See select for details.
- assignSecondaryStructure() function assigns SS identifiers to all atoms in a residue. Residues with no SS information specified are assigned coil conformation.
- When Ensemble and NMA classes are instantiated with an empty string, instances are called “Unnamed”.
- sliceMode(), sliceVector() and reduceModel() functions return the atom selection in addition to the sliced vector/mode/model instance.
Bug Fixes:
- Default selection for calcGNM() function is set to “calpha”.
Normal Mode Wizard¶
New Features:
- NMWiz supports GNM data and can use ProDy for GNM calculations.
- NMWiz can gather normal mode data from molecules loaded into VMD. This allows NMWiz to support all formats supported by VMD.
- User can write data loaded into NMWiz in NMD format.
- An Arrow Graphics option allows the user to draw arrows in both directions.
- User can select Licorice representation for the protein if model is an all atom model.
- User can select Custom as the representation of the protein to prevent NMWiz from changing a user set representation.
- Trace is added as a protein backbone representation option.
Improvements:
- NMWiz remembers all adjustments on arrow graphics for all modes.
- Plotting Clear button clears only atom labels that are associated with the dataset.
- Removing a dataset removes all associated molecule objects.
- Selected atom representations are turned on based on atom index.
- Padding around interface button has been standardized to provide a uniform experience between different platforms.
ProDy 0.6 Series¶
0.6.2 (Mar 16, 2011)¶
New Features:
- performSVD() function is implemented for faster and more memory efficient principal component analysis.
- extrapolateModel() function is implemented for extrapolating a coarse-grained model to an all atom model. See the usage example Extend a coarse-grained model.
- plog() is implemented for enabling users to make log entries.
Improvements:
- compare functions are improved to handle insertion codes.
- HierView allows for indexing using chain identifier and residue numbers. See usage example Hierarchical Views.
- Chain allows for indexing using residue number and insertion code. See usage example Hierarchical Views.
- addCoordset() function accepts Atomic and Ensemble instances as coords argument.
- New method HierView.getAtoms() is implemented.
- AtomGroup set functions check the correctness of dimension of data arrays to prevent runtime problems.
- prody pca script is updated to use the faster PCA method that uses SVD.
Changes:
- “backbone” definition now includes the backbone hydrogen atom (Thanks to Nahren Mascarenhas for pointing to this discrepancy in the keyword definition).
Bug Fixes:
- A bug in PCA that allowed calculating covariance matrix for less than 3 coordinate sets is fixed.
- A bug in mapOntoChain() function that caused problems when mapping all atoms is fixed.
0.6.1 (Mar 2, 2011)¶
New Features:
- setWWPDBFTPServer() and getWWPDBFTPServer() functions allow user to change or learn the WWPDB FTP server that ProDy uses to download PDB files. Default server is RCSB PDB in USA. User can change the default server to one in Europe or Japan.
- setPDBMirrorPath() and getPDBMirrorPath() functions allow user to specify or learn the path to a local PDB mirror. When specified, a local PDB mirror is preferred for accessing PDB files, over downloading them from FTP servers.
- mapOntoChain() function is improved to map backbone or all atoms.
Improvements:
- WWPDB_PDBFetcher can download PDB files from different WWPDB FTP servers.
- WWPDB_PDBFetcher can also use local PDB mirrors for accessing PDB files.
Changes:
- RCSB_PDBFetcher is renamed as WWPDB_PDBFetcher.
- mapOntoChain() and matchChains() functions accept "ca" and "bb" as subset arguments.
- Definition of selection keyword “protein” is updated to include some non-standard amino acid abbreviations.
Bug Fixes:
- A bug in WWPDB_PDBFetcher causing exceptions when non-string items are passed in a list is fixed.
- An important bug in parsePDB() is fixed. When parsing backbone or Cα atoms, residue names were not checked and this caused parsing water atoms with name "O" or calcium ions with name "CA".
0.6 (Feb 22, 2011)¶
New Features:
- Biopython module pairwise2 and packages KDTree and Blast are incorporated in ProDy package to make installation easier. Only NumPy needs to be installed before ProDy can be used. For plotting, Matplotlib is still required.
- Normal Mode Wizard is distributed with ProDy source. On Linux, if VMD is installed, ProDy installer locates VMD plugins folder and installs NMWiz. On Windows, user needs to follow a separate set of instructions (see Normal Mode Wizard).
- Gamma class is implemented for facilitating use of force constants based on atom type, residue type, or property. Example derived classes are GammaStructureBased and GammaVariableCutoff.
- calcTempFactors() function is implemented to calculate theoretical temperature factors.
- 5 new ProDy Applications are implemented, and existing scripts are improved to output figures.
- getModel() method is implemented to make function development easier.
- resetTicks() function is implemented to change X and/or Y axis ticks in plots when there are discontinuities in the plotted data.
Improvements:
- ANM.buildHessian() and GNM.buildKirchhoff() methods are improved to accept Gamma instances or other custom functions as gamma argument. See also Custom Gamma Functions.
- Select class is changed to treat single word keywords differently, e.g. “backbone” or “protein”. They are interpreted 10 times faster and in use achieve much higher speed-ups when compared to composite selections. For example, using the keyword “calpha” instead of name CA and protein, which returns the same selection, works >20 times faster.
- Optimizations in Select class to increase performance (Thanks to Paul McGuire for providing several Pythonic tips and Pyparsing specific advice).
- applyBiomolecularTransformations() function is improved to handle large biomolecular assemblies.
- Performance optimizations in parsePDB() and other functions.
- Ensemble class accepts Atomic instances and automatically adds coordinate sets to the ensemble.
Changes:
- PDBlastRecord is renamed as PDBBlastRecord.
- NMA instances can be indexed using a list or tuple of integers, e.g. anm[1,3,5].
- “ca”, “bb”, and “sc” keywords are defined as short-hands for “calpha”, “backbone”, and “sidechain”, respectively.
- Behavior of calcANM() and calcGNM() functions has changed. They return the atoms used for calculation as well.
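Indexing with several integers, as in anm[1,3,5] above, relies on the fact that Python passes comma-separated indices to __getitem__ as a single tuple. The toy class below shows the mechanism; ToyModes is hypothetical and returns plain lists of eigenvalues, whereas the object that ProDy returns for such an index is not described here.

```python
class ToyModes:
    """Toy mode container; not ProDy's NMA class."""

    def __init__(self, eigenvalues):
        self._eigvals = list(eigenvalues)

    def __getitem__(self, index):
        # modes[1, 3, 5] arrives here as the tuple (1, 3, 5).
        if isinstance(index, (list, tuple)):
            return [self._eigvals[i] for i in index]
        return self._eigvals[index]

modes = ToyModes([0.1, 0.2, 0.4, 0.8, 1.6, 3.2])
picked = modes[1, 3, 5]
single = modes[0]
```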
Bug Fixes:
Normal Mode Wizard¶
- NMWiz can be used as a graphical interface to ProDy. ANM or PCA calculations can be performed for molecules that are loaded in VMD.
- User can set default color for arrow graphics and paths to ANM and PCA scripts.
- Optionally, NMWiz can preserve the current view in VMD display window when loading a new dataset. Check the box in the NMWiz GUI main window.
- A bug that prevented selecting residues from plot window is fixed.
ProDy 0.5 Series¶
0.5.3 (Feb 11, 2011)¶
New Features:
- Membership, equality, and non-equality test operations are defined for all atomic classes. See Operations on Selections.
- Two functions are implemented for dealing with anisotropic temperature factors: calcADPAxes() and buildADPMatrix().
- NMA.setEigens() and NMA.addEigenpair() methods are implemented to assist analysis of normal modes calculated using external software.
- parseNMD() is implemented for parsing NMD files.
- parseModes() is implemented for parsing normal mode data.
- parseArray() is implemented for reading numeric data, particularly normal mode data calculated using other software for analysis using ProDy.
- The method in [BH02] to calculate overlap between covariance matrices is implemented as calcCovOverlap() function.
- trimEnsemble() to trim Ensemble instances is implemented.
- checkUpdates() to check for ProDy updates is implemented.
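To illustrate the kind of quantity an overlap between covariance matrices measures, here is a minimal pure-Python sketch. It assumes the commonly used form of the covariance overlap, 1 - sqrt((sum of all eigenvalues - 2 * sum over i,j of sqrt(a_i * b_j) * (u_i . v_j)^2) / sum of all eigenvalues), where a_i, u_i and b_j, v_j are the eigenvalues and orthonormal eigenvectors of the two matrices. The function name and argument layout are hypothetical and are not the signature of calcCovOverlap().

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def covariance_overlap(vals_a, vecs_a, vals_b, vecs_b):
    """Overlap of two covariance matrices given as eigenvalue/eigenvector lists.

    Hypothetical helper: eigenvectors are assumed orthonormal within each set.
    Returns 1.0 for identical matrices and 0.0 for orthogonal subspaces.
    """
    total = sum(vals_a) + sum(vals_b)
    cross = 0.0
    for la, va in zip(vals_a, vecs_a):
        for lb, vb in zip(vals_b, vecs_b):
            cross += sqrt(la * lb) * dot(va, vb) ** 2
    # clamp tiny negative values caused by floating point rounding
    diff = max(total - 2.0 * cross, 0.0)
    return 1.0 - sqrt(diff / total)


same = covariance_overlap([2.0, 1.0], [[1, 0, 0], [0, 1, 0]],
                          [2.0, 1.0], [[1, 0, 0], [0, 1, 0]])
disjoint = covariance_overlap([1.0], [[1, 0]], [1.0], [[0, 1]])
print(same, disjoint)
```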
Changes:
- Change in default behavior of parsePDB() function. When alternate locations exist, those indicated by A are parsed. For parsing all alternate locations, the user needs to pass the altloc=True argument.
- getSumOfWeights() is renamed as calcSumOfWeights().
- mapAtomsToChain() is renamed as mapOntoChain().
- ProDyStartLogFile() is renamed as startLogfile().
- ProDyCloseLogFile() is renamed as closeLogfile().
- ProDySetVerbosity() is renamed as changeVerbosity().
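The alternate location default described above can be pictured with a small filter over PDB records. In the PDB format the alternate location indicator sits in column 17 of ATOM and HETATM records. The sketch below keeps atoms whose indicator is blank or A; make_atom_line and keep_altloc are hypothetical helpers written for this illustration and do not reflect how parsePDB() is implemented.

```python
def make_atom_line(serial, name, altloc, resname, chain, resnum):
    """Build a truncated ATOM record (columns 1-26) for demonstration."""
    return "ATOM  %5d %-4s%1s%3s %1s%4d" % (
        serial, name, altloc, resname, chain, resnum)


def keep_altloc(lines, altloc="A"):
    """Keep records with a blank or matching alternate location indicator.

    Passing altloc=None keeps every record (an assumption of this sketch).
    """
    if altloc is None:
        return list(lines)
    kept = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            flag = line[16:17] or " "   # column 17, zero-based index 16
            if flag not in (" ", altloc):
                continue
        kept.append(line)
    return kept


lines = [
    make_atom_line(1, "CA", " ", "GLY", "A", 1),
    make_atom_line(2, "CB", "A", "SER", "A", 2),
    make_atom_line(3, "CB", "B", "SER", "A", 2),
]
kept = keep_altloc(lines)
print(len(kept))
```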
Improvements:
- A few bugs in ensemble and dynamics classes are fixed.
- Improvements in RCSB_PDBFetcher allow it not to miss a PDB file if it exists in the target folder.
- writeNMD() is fixed to output B-factors (Thanks to Dan Holloway for pointing it out).
0.5.2 (Jan 12, 2011)¶
Bug Fixes:
- An important fix in sampleModes() function was made (Thanks to Alberto Perez for finding the bug and suggesting a solution).
Improvements:
- Improvements in ANM.calcModes(), GNM.calcModes(), and PCA.calcModes() methods prevent Numpy/Scipy from throwing an exception when the user requests more modes than are available.
- Improvements in blastPDB() enable ProDy to throw an exception when no internet connection is found, and to warn the user when downloads fail due to restrictions in network regulations (Thanks to Serkan Apaydin for helping identify these improvements).
- New example Write PDB file.
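One way to avoid an exception when too many modes are requested is to cap the request at the number of non-trivial modes. As a rough sketch: a model with 3 degrees of freedom per atom and 6 rigid-body modes (the ANM case) has 3N - 6 non-trivial modes, while a model with 1 degree of freedom per node and 1 trivial mode (the GNM case) has N - 1. The function below is a hypothetical helper, not the logic used inside the calcModes() methods.

```python
import warnings


def clamp_n_modes(n_requested, n_atoms, dof_per_atom=3, n_trivial=6):
    """Return a mode count that does not exceed the non-trivial modes.

    Hypothetical helper; defaults correspond to an ANM-like model.
    """
    available = max(n_atoms * dof_per_atom - n_trivial, 0)
    if n_requested > available:
        warnings.warn("%d modes requested, only %d available"
                      % (n_requested, available))
        return available
    return n_requested


print(clamp_n_modes(3, 5))                                # within range
print(clamp_n_modes(20, 5, dof_per_atom=1, n_trivial=1))  # GNM-like cap
```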
0.5.1 (Dec 31, 2010)¶
Changes in dependencies:
- Scipy (linear algebra module) is no longer a required package. When available, it replaces Numpy (linear algebra module) for greater flexibility and efficiency. A warning message is printed when Scipy is not found.
- Biopython KDTree module is not required for ENM calculations (specifically for building Hessian (ANM) or Kirchhoff (GNM) matrices). When available, it is used to increase the performance. A warning message is printed when KDTree is not found.
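The optional dependency pattern described above (prefer one module, fall back to another, and warn when the preferred one is missing) can be sketched with the standard library. load_backend is a hypothetical helper for illustration and is not a ProDy function.

```python
import importlib
import warnings


def load_backend(preferred, fallback):
    """Import the preferred module, or warn and import the fallback."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        warnings.warn("%s not found; falling back to %s"
                      % (preferred, fallback))
        return importlib.import_module(fallback)


# With NumPy installed, a call such as
#     linalg = load_backend("scipy.linalg", "numpy.linalg")
# would return SciPy's module when present and NumPy's otherwise.
backend = load_backend("json", "math")
print(backend.__name__)
```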
0.5 (Dec 21, 2010)¶
New Features:
- AtomPointer base class for classes pointing to atoms in an AtomGroup.
- AtomPointer instances (Selection, Residue, etc.) can be added. See Operations on Selections for examples.
- Select.getIndices() and Select.getBoolArray() methods to expand the usage of Select.
- sliceVector() and sliceMode() functions.
- saveModel() and loadModel() functions for saving and loading NMA data.
- parsePDBStream() can now parse specific chains or alternate locations from a PDB file.
- alignCoordsets() is implemented to superimpose coordinate sets of an AtomGroup instance.
Bug Fixes:
- A bug in parsePDBStream() that caused unidentified errors when a model in a multiple model file did not have the same number of atoms is fixed.
Changes:
- Iterating over a Chain instance yields Residue instances.
- Vector instantiation requires an array only. name is an optional argument.
- Functions starting with get and performing calculations are renamed to start with calc, e.g. getRMSD() is now calcRMSD().
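The calc prefix marks functions that compute a value rather than return stored data. As an illustration of such a calculation, the sketch below computes a plain root-mean-square deviation between two equally sized coordinate lists, without any superposition. The function name and its list-of-tuples input are assumptions of this sketch and differ from calcRMSD().

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displaced = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
value = rmsd(reference, displaced)
print(value)
```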
ProDy 0.2 Series¶
0.2 (Nov 16, 2010)¶
Important Changes:
- Single word keywords not followed by “and” logical operator are not accepted, e.g. “protein within 5 of water” will raise a SelectionError, use “protein and within 5 of water” instead.
- findMatchingChains() is renamed to matchChains().
- showOverlapMatrix() is renamed to showOverlapTable().
- Modules are reorganized.
New Features:
- Atomic for easy type checking.
- Contacts for faster intermolecular contact identification.
- Select can identify intermolecular contacts. See Intermolecular Contacts for examples and details.
- sampleModes() implemented for sampling conformations along normal modes.
Improvements:
- proteins.compare functions are improved. Now they perform sequence alignment if simple residue number/identity based matching does not work, or if the user passes the pwalign=True argument. This impacts the speed of X-ray ensemble analysis.
- Select can cache data optionally. This results in speedups of 2 to 50 fold, depending on the number of atoms and selection operations.
- Implementation of showProjection() is completed.
Normal Mode Wizard¶
Release 0.2.3
- For each mode, a molecule for drawing arrows and a molecule for showing animation are formed in VMD on demand. NMWiz remembers a color associated with a mode.
- Deselecting a residue by clicking on a plot is possible.
- A bug causing incorrect parsing of NMD files from ANM server is fixed.
Release 0.2.2
- Selection string option allows user to show a subset of arrows matching a VMD selection string. Optionally, this selection string may affect protein and animation representations.
- A bug that caused problems when overplotting modes is removed.
- A bug affecting line width changes in plots is removed.
- Selected residue representations are colored according to the color of the plot.
Release 0.2.1
- Usability improvements.
- Loading the same data file more than once is prevented.
- If a GUI window for a dataset is closed, it can be reloaded from the main window.
- A dataset and GUI can be deleted from the VMD session via the main window.
Release 0.2
- Instant documentation is improved.
- Problem with clearing selections is fixed.
- Plotting options frame is populated.
- Multiple modes can be plotted on the same canvas.
ProDy 0.1 Series¶
0.1.2 (Nov 9, 2010)¶
- Important bug fixes and improvements in NMA helper and plotting functions.
- Documentation updates and improvements.
0.1.1 (Nov 8, 2010)¶
- Important bug fixes and improvements in chain comparison functions.
- Bug fixes.
- Source clean up.
- Documentation improvements.
0.1 (Nov 7, 2010)¶
- First release.
About ProDy¶
ProDy is a free and open-source Python package for protein structural dynamics and sequence evolution analysis. It is designed as a flexible and responsive API suitable for interactive usage and application development.
People¶
ProDy is being developed in the Bahar Lab at the University of Pittsburgh with support from NIH R01 GM099738 award.
Development Team¶
Ahmet Bakan initiated the ProDy project, designed and developed ProDy, NMWiz, Evol, and DruGUI.
Cihan Kaya is currently overseeing the overall development of ProDy.
She (John) Zhang is currently helping to maintain and develop ProDy.
Hongchun Li is currently maintaining and developing ANM and GNM servers.
Anindita Dutta contributed to the development of Evol, database and sequence modules.
Tim Lezon contributed to development of Rotations and Translation of Blocks and Membrane ENM.
Wenzhi Mao contributed to development of MSA analysis functions.
Lidio Meireles provided insightful comments on the design of ProDy, and contributed to the development of ProDy Applications.
Contributors¶
In addition to the development team members, we acknowledge contributions and feedback from the following individuals:
Ying Liu provided the code for Perturbation Response Scanning method.
Kian Ho contributed bug fixes and unit tests for DSSP functions.
Gökçen Eraslan contributed bug fixes and development and maintenance insights.
Citing¶
When using ProDy or NMWiz in published work, please cite:
Bakan A, Meireles LM, Bahar I. ProDy: Protein Dynamics Inferred from Theory and Experiments. Bioinformatics 2011 27(11):1575-1577.
When using pairwise2 or KDTree modules in published work, please cite:
Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, Friedberg I, Hamelryck T, Kauff F, Wilczynski B, de Hoon MJ. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 2009 25(11):1422-3.
Credits¶
ProDy makes use of the following great software:
pyparsing is used to define the sophisticated atom selection grammar. This makes every user a power user by enabling fast access to and easy handling of atomic data via simple selection statements.
Biopython KDTree package and pairwise2 module, which are distributed with ProDy, significantly enrich and improve the ProDy user experience. KDTree package allows for fast distance based selections, making atom selections suitable for contact identification. pairwise2 module enables performing sequence alignment for protein structure comparison and ensemble analysis.
ProDy requires NumPy for almost all major functionality including, but not limited to, storing atomic data and performing normal mode calculations. The power and speed of NumPy make ProDy suitable for interactive and high-throughput structural analysis.
Finally, ProDy can benefit from SciPy and Matplotlib packages. SciPy makes ProDy normal mode calculations more flexible and makes them possible on low memory machines. Matplotlib greatly enriches the user experience by allowing plotting of protein dynamics data calculated using ProDy.
Funding¶
Continued development of protein dynamics software ProDy is supported by NIH through R01 GM099738 award.
License¶
ProDy¶
ProDy is available under the MIT License:
ProDy: A Python Package for Protein Dynamics Analysis
Copyright (C) 2010-2014 University of Pittsburgh
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Biopython¶
Biopython KDTree package and pairwise2 module are distributed with the ProDy package. Biopython is developed by The Biopython Consortium and is available under the Biopython license:
Biopython License Agreement
Permission to use, copy, modify, and distribute this software and its
documentation with or without modifications and for any purpose and
without fee is hereby granted, provided that any copyright notices
appear in all copies and that both those copyright notices and this
permission notice appear in supporting documentation, and that the
names of the contributors or copyright holders not be used in
advertising or publicity pertaining to distribution of the software
without specific prior permission.
THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE DISCLAIM ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE
CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT
OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
OR PERFORMANCE OF THIS SOFTWARE.
Pyparsing¶
The pyparsing module is distributed with the ProDy package. Pyparsing is developed by Paul T. McGuire and is available under the MIT License:
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Argparse¶
The argparse module is distributed with the ProDy package. Argparse is developed by Steven J. Bethard and is available under the Python Software Foundation License.
CEalign¶
CEalign module is distributed with ProDy. The original CE method was developed by Ilya Shindyalov and Philip Bourne. The Python version which is used by ProDy is developed by Jason Vertrees and available under the New BSD license:
Copyright (c) 2007, Jason Vertrees.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Release: 1.10.2
Date: May 02, 2018