Tips¶
Run in batch¶
You can easily run a single tool in batch and rename new files:
$ for i in *.pdb; do rna_pdb_tools.py --get_rnapuzzle_ready $i > ${i/.pdb/_rpr.pdb}; done
or write new files in a different folder (out):
$ for i in *.pdb; do rna_pdb_tools.py --get_rnapuzzle_ready $i > ../out/$i; done
You can also easily run a single tool parallel using parallel_:
$ parallel "rna_add_chain.py -c A {} > ../nchain/{}" ::: *.pdb
# ex2
$ parallel "rna_clashscore.py {} > {}.csv" ::: *.pdb
Using sed¶
sed (stream editor) is a Unix utility that parses and transforms text, using a simple, compact programming language.
You can used sed to find & replace parts of text files:
$ head 1msy_rnakbmd_decoy1661_clx.pdb.outCR
Classifier: Clarna
chains: 1 27
2 26 bp G U WW_cis 0.8500
3 25 bp C G WW_cis 0.8114
4 24 bp U A WW_cis 0.9222
5 23 bp C G WW_cis 0.9038
6 22 bp C G WW_cis 0.8913
9 10 bp G U SH_cis 0.8563
10 19 bp U A WH_tran 0.7826
11 18 bp A G HS_tran 0.7620
$ sed 's/chains: /chains: A/' 1msy_rnakbmd_decoy1661_clx.pdb.outCR
Classifier: Clarna
chains: A 1 27
2 26 bp G U WW_cis 0.8500
3 25 bp C G WW_cis 0.8114
4 24 bp U A WW_cis 0.9222
5 23 bp C G WW_cis 0.9038
6 22 bp C G WW_cis 0.8913
9 10 bp G U SH_cis 0.8563
10 19 bp U A WH_tran 0.7826
11 18 bp A G HS_tran 0.7620
12 17 bp C G WW_cis 0.7242
Read more about sed.
In PyMOL¶
Quickref:
set ignore_case, off
Rename a chain:
PyMOL>alter (sele), chain="B"
Alter: modified 708 atoms.
PyMOL>sort
don’t forget about sort.
To select all, use PyMOL>alter all, resv -= 12.
To renumber a fragment starting with 24 to 29, select the fragment and:
PyMOL>alter (sele), resv += 5
Alter: modified 109 atoms.
To renumber residues:
PyMOL>alter (chain B), resv -= 44
Alter: modified 708 atoms.
PyMOL>sort
Read more.
The example of the pistol ribozyme editing.
Run:
PyMOL>alter (sele), chain="B"
Alter: modified 236 atoms.
PyMOL>alter (chain B), resv -= 51
Alter: modified 236 atoms.
PyMOL>sort
In Python¶
To get residue index use:
resi = int(l[22:26].strip())
Example:
ATOM 1 P C A 1 -16.936 -3.789 68.770 1.00 11.89 P
Qucikref:
COLUMNS PYTHON DATA TYPE FIELD DEFINITION
-------------------------------------------------------------------------------------
1 - 6 [0:6] Record name "ATOM "
7 - 11 [6:11] Integer serial Atom serial number.
13 - 16 [12:16] Atom name Atom name.
17 [16] Character altLoc Alternate location indicator.
18 - 20 [17:20] Residue name resName Residue name.
22 [21] Character chainID Chain identifier.
23 - 26 [22:26] Integer resSeq Residue sequence number.
27 [26] AChar iCode Code for insertion of residues.
31 - 38 [30:38] Real(8.3) x Orthogonal coordinates for X in Angstroms.
39 - 46 [38:46] Real(8.3) y Orthogonal coordinates for Y in Angstroms.
47 - 54 [46:54] Real(8.3) z Orthogonal coordinates for Z in Angstroms.
55 - 60 [54:60] Real(6.2) occupancy Occupancy.
61 - 66 [60:66] Real(6.2) tempFactor Temperature factor.
77 - 78 [76:78] LString(2) element Element symbol, right-justified. # l[76:78]
79 - 80 [78:80] LString(2) charge Charge on the atom.
Working with cluster¶
Tips:
# get your pdb files
[mm] ade rsync -v peyote2:'~/ade/*.pdb' . # ' is required!
See long name with qstat
:
magnus@peyote2:~$ qstat -xml | tr '\n' ' ' | sed 's#<job_list[^>]*>#\n#g' \
> | sed 's#<[^>]*>##g' | grep " " | column -t
4752204 5.54737 r_6bd26658_run_04 magnus dr 2017-02-20T22:09:04 all.q@c6.cluster3.genesilico.pl 10
4752201 5.54737 r_6bd26658_run_01 magnus dr 2017-02-20T22:09:04 all.q@c6.cluster3.genesilico.pl 10
4752203 5.54737 r_6bd26658_run_03 magnus dr 2017-02-20T22:09:04 all.q@c6.cluster3.genesilico.pl 10
4752202 5.54737 r_6bd26658_run_02 magnus dr 2017-02-20T22:09:04 all.q@c6.cluster3.genesilico.pl 10
4805710 5.54737 r_hTERC_251-451-85d4ac69_run_01 magnus r 2017-08-20T17:04:15 all.q@c11.cluster3.genesilico.pl 10
4805711 5.54737 r_hTERC_251-451-85d4ac69_run_02 magnus r 2017-08-20T17:04:15 all.q@c11.cluster3.genesilico.pl 10
4805712 5.54737 r_hTERC_251-451-85d4ac69_run_03 magnus r 2017-08-20T17:04:15 all.q@c8.cluster3.genesilico.pl 10
4805713 5.54737 r_hTERC_251-451-85d4ac69_run_04 magnus r 2017-08-20T17:04:15 all.q@c8.cluster3.genesilico.pl 10
4805714 5.54737 r_hTERC_251-451-85d4ac69_run_05 magnus r 2017-08-20T17:04:15 all.q@c8.cluster3.genesilico.pl 10
4805715 5.54737 r_hTERC_251-451-85d4ac69_run_06 magnus r 2017-08-20T17:04:15 all.q@c8.cluster3.genesilico.pl 10
4805716 5.54737 r_hTERC_251-451-85d4ac69_run_07 magnus r 2017-08-20T17:04:15 all.q@c8.cluster3.genesilico.pl 10
4805728 5.54737 r_mCherry_sub3-3c970489_run_03 magnus r 2017-08-20T17:21:15 all.q@c15.cluster3.genesilico.pl 10
https://stackoverflow.com/questions/26104116/qstat-and-long-job-names
Numbering line used in my flat-file notes¶
Numbering:
|1.......|10.......|20.......|30.......|40.......|50.......|60.......|70.......|80.......|90.......
123456789112345678921234567893123456789412345678951234567896123456789712345678981234567899123456789
TER format¶
Example of pro TER:
ATOM 72307 C4 U x 9 304.768 147.960 320.897 1.00218.84 C
ATOM 72308 O4 U x 9 304.171 146.902 321.104 1.00225.09 O
ATOM 72309 C5 U x 9 304.190 149.269 320.912 1.00211.91 C
ATOM 72310 C6 U x 9 304.960 150.336 320.668 1.00205.76 C
TER 72311 U x 9
Add missing atoms¶
Add missing atoms etc.:
(py37) [mx] cwc46$ pdbfixer prp46.pdb --add-atoms all --add-residues
Read more:
Test for Cuda¶
Run a test for CUDA:
rna_test_cuda.py
> rna_test_cuda.py:9 in <module>- torch.cuda.current_device(): 0
> rna_test_cuda.py:10 in <module>
torch.cuda.device(0): <torch.cuda.device object at 0x146bf9bf7b80>
> rna_test_cuda.py:11 in <module>- torch.cuda.is_available(): True
Using device: cuda
NVIDIA A40
Memory Usage:
Allocated: 0.0 GB
Reserved: 0.0 GB
tensor([[ 0.9374, 1.1526, -0.5648, 0.9870]], device='cuda:0')