MolAlignLib is a Fortran and Python library based on random rotations and quasi-local RMSD minimizations to align rigid molecules and clusters. The details of the method can be found in publication [1].
Before installing
You can try MolAlignLib on Binder if you don’t want to build and install it.
- To build the native executable you will only need GFortran 4.8 or higher or any other compiler with Fortran 2008 support.
- To build the Python module you will need GFortran 4.8 or higher, Python 3.6 or higher, NumPy 1.18-1.26 and ASE.
The easiest way to install the required packages is with your distro’s package manager:
RHEL, CentOS, Fedora, etc.
yum install git python3 python3-pip gcc-gfortran
Debian, Ubuntu, Mint, etc.
apt install git python3 python3-pip gfortran
To build the native executable
clone the repository:
git clone https://github.com/qcuaeh/molalignlib.git
then enter the cloned directory, edit the build.env file to suit your system and run:
./build.sh
it will create the molalign executable inside the build directory.
or to build and install the python module
just run:
pip3 install git+https://github.com/qcuaeh/molalignlib.git
it will install NumPy, ASE and MolAlignLib in your site packages and the molalign script in your path.
Program options
Options supported by the native executable and the python script
-remap/-sort
Remap atoms to minimize the RMSD.
-fast
Prune assignments that surpass the displacement tolerance.
-tol TOL
Set the displacement tolerance to TOL (defaults to 0.35 Å).
-out NAME
Set the output file name to NAME (defaults to aligned.xyz).
-count N
Set the count threshold to N (defaults to 10).
-trials N
Set the maximum number of trials to N.
-rec N
Record up to N assignments (defaults to 1).
-stats
Print detailed stats of the calculation.
-test
Use a fixed random seed for testing.
-mirror
Reflect aligned coordinates.
-mass
Use mass weighted RMSD.
Options only supported by the native executable
-live
Print stats in real time (if stats are enabled).
-stdin FMT
Read coordinates from standard input with format FMT.
-stdout FMT
Write coordinates to standard output with format FMT.
Basic usage
The syntax of the command is:
molalign [option[s]] file1 file2
The coordinates in file2 will be aligned to the coordinates in file1. If there is more than one set of coordinates in a file, only the first one will be read. The native executable only reads xyz and mol2 files while the python script reads all the file formats supported by ASE.
- Running the command without options will align the atoms without reordering.
- Running the command with the -remap option will remap the atoms to minimize the RMSD and the aligned coordinates will be written in the optimal mapping order.
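When the command is driven from a script, the syntax above can be assembled programmatically. The following is a minimal sketch and not part of MolAlignLib: `molalign_command` and `run_molalign` are hypothetical helper names, and the sketch assumes that the `molalign` executable is on your PATH and that it prints its result to standard output.

```python
import subprocess

def molalign_command(file1, file2, remap=False, fast=False, tol=None,
                     executable="molalign"):
    """Build the argument list for `molalign [option[s]] file1 file2`."""
    cmd = [executable]
    if remap:
        cmd.append("-remap")
    if fast:
        cmd.append("-fast")
    if tol is not None:
        cmd += ["-tol", str(tol)]
    # Options first, then the reference (file1) and the structure to align (file2).
    cmd += [file1, file2]
    return cmd

def run_molalign(*args, **kwargs):
    """Run molalign and return whatever it wrote to standard output."""
    result = subprocess.run(molalign_command(*args, **kwargs),
                            stdout=subprocess.PIPE,
                            universal_newlines=True,  # decode output as text
                            check=True)               # raise on a non-zero exit code
    return result.stdout

# Only builds and prints the command; nothing is executed here.
print(molalign_command("file1.xyz", "file2.xyz", remap=True))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.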
Advanced usage
When reordering is performed the computation can take a long time to complete, but it
can be sped up considerably with the -fast option, which enables pruning of any
assignment that surpasses the displacement tolerance. However, if the atom displacements
are larger than this tolerance, the assignment will fail, or, if they are very close to it,
the assignment can be suboptimal. In such cases the displacement tolerance should be
increased with the -tol option.
The count threshold is used to decide whether the procedure has converged. A threshold of 10
counts is used by default, which works fine for almost all cases, but you can change it
with the -count option. To avoid overly long computations you can set a maximum number of
random trials with the -trials option; the computation will be aborted if this maximum is
reached before the count threshold.
The algorithm always explores multiple possible assignments, but only the best one is
recorded by default. To record additional assignments use the -rec option. To print
the stats of the computation use the -stats option, but notice that they will be
different on each repeated run because a new random seed is used every time.
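The interplay between the count threshold and the maximum number of trials can be pictured with a toy stopping rule. This is only one reading of the description above, not MolAlignLib's implementation: `run_until_converged` is a hypothetical name, each trial is assumed to yield an assignment label and its RMSD, and convergence is assumed to mean that the lowest-RMSD assignment found so far has been visited `count_threshold` times.

```python
from collections import Counter

def run_until_converged(trial_results, count_threshold=10, max_trials=None):
    """Consume (assignment, rmsd) pairs until the best one is seen often enough.

    Returns (best_assignment, trials_used, converged).
    """
    counts = Counter()   # how many times each assignment was visited
    rmsd_of = {}         # RMSD of each visited assignment
    best = None
    trials = 0
    for trials, (assignment, rmsd) in enumerate(trial_results, start=1):
        counts[assignment] += 1
        rmsd_of[assignment] = rmsd
        best = min(rmsd_of, key=rmsd_of.get)
        if counts[best] >= count_threshold:
            return best, trials, True      # converged
        if max_trials is not None and trials >= max_trials:
            break                          # trial budget exhausted first
    return best, trials, False

# Fixed sequence standing in for the outcomes of random trials.
outcomes = [("A", 0.67), ("B", 0.05), ("B", 0.05), ("A", 0.67), ("B", 0.05)]
print(run_until_converged(outcomes, count_threshold=3))                # ('B', 5, True)
print(run_until_converged(outcomes, count_threshold=3, max_trials=4))  # ('B', 4, False)
```

In the second call the trial limit is hit before the best assignment reaches three counts, which corresponds to the aborted case described above.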
Command line examples
For small atom displacements the default tolerance is enough:
./build/molalign examples/Co138_0.xyz examples/Co138_1.xyz -remap -fast
0.0506
but for atom displacements larger than the tolerance the assignment will fail:
./build/molalign examples/Co138_0.xyz examples/Co138_3.xyz -remap -fast
Error: Assignment failed
Increasing the tolerance will fix the problem but the calculation will slow down:
./build/molalign examples/Co138_0.xyz examples/Co138_3.xyz -remap -fast -tol 0.7
0.1973
Printing multiple alignments and stats can be useful to identify rotationally symmetric clusters:
./build/molalign examples/Co138_0.xyz examples/Co138_1.xyz -remap -fast -rec 5 -stats
Map Count Steps Angle RMSD
-------------------------------------------
1 10 12.1 60.5 0.0506
2 9 12.3 66.4 0.0506
3 15 10.1 50.2 0.0506
4 1 9.0 41.8 0.6652
5 1 6.0 4.1 0.6716
-------------------------------------------
Random trials = 66
Minimization steps = 595
Visited local minima > 5
There are three different assignments with the same RMSD, indicating that the cluster has three-fold symmetry. Notice that the three optimal symmetric assignments are visited multiple times, while the suboptimal ones are visited only once.
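The same check can be done automatically by grouping the recorded assignments by RMSD. The sketch below is an illustration rather than a MolAlignLib feature: `parse_stats` and `degenerate_groups` are hypothetical helpers, and the parser assumes that the stats table has the five whitespace-separated columns shown above.

```python
from collections import defaultdict

STATS = """\
Map Count Steps Angle RMSD
-------------------------------------------
1 10 12.1 60.5 0.0506
2 9 12.3 66.4 0.0506
3 15 10.1 50.2 0.0506
4 1 9.0 41.8 0.6652
5 1 6.0 4.1 0.6716
-------------------------------------------
Random trials = 66
Minimization steps = 595
Visited local minima > 5
"""

def parse_stats(text):
    """Return one dict per table row; header, rules and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0].isdigit():
            rows.append({"map": int(fields[0]), "count": int(fields[1]),
                         "steps": float(fields[2]), "angle": float(fields[3]),
                         "rmsd": float(fields[4])})
    return rows

def degenerate_groups(rows, decimals=4):
    """Group map indices whose RMSD agrees to the given number of decimals."""
    groups = defaultdict(list)
    for row in rows:
        groups[round(row["rmsd"], decimals)].append(row["map"])
    return dict(groups)

groups = degenerate_groups(parse_stats(STATS))
largest = max(groups.values(), key=len)
print(len(largest), "assignments share the lowest RMSD:", largest)
```

For the table above the largest group contains maps 1, 2 and 3, which matches the three-fold symmetry noted in the text.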
Python examples
Minimize the RMSD between two cobalt clusters:
from ase.io import read
from molalignlib import assign_atoms
mol0 = read('Co138_0.xyz')
mol1 = read('Co138_1.xyz')
# Find optimal assignment
assignment = assign_atoms(mol0, mol1, fast=True)
# Reorder mol1 with the optimal assignment
mol1 = mol1[assignment.order]
# Align mol1 to mol0 (returns RMSD)
mol1.align_to(mol0)
MolAlignLib uses ASE to read and write atomic coordinates.
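To make the quantity reported by the alignment concrete, the following plain-Python sketch evaluates an RMSD for two coordinate sets that are already ordered and superimposed. It is not MolAlignLib's implementation: `rmsd` is a hypothetical function, and the optional mass weighting uses the usual textbook definition, which is assumed here to correspond to the -mass option.

```python
import math

def rmsd(coords_a, coords_b, masses=None):
    """Root-mean-square deviation between matched atoms, optionally mass weighted."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must have the same number of atoms")
    if masses is None:
        masses = [1.0] * len(coords_a)   # unweighted case: every atom counts equally
    weighted_sq = sum(
        m * sum((x - y) ** 2 for x, y in zip(a, b))
        for m, a, b in zip(masses, coords_a, coords_b)
    )
    return math.sqrt(weighted_sq / sum(masses))

# Two atoms; only the second one is displaced, by 2 along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))                     # sqrt(4 / 2)
print(rmsd(a, b, masses=[1.0, 3.0]))  # sqrt(3 * 4 / 4)
```

With ASE, the arrays returned by `get_positions()` and `get_masses()` of an `Atoms` object can be passed directly to this function.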
References
[1] J. M. Vasquez-Perez, L. A. Zarate-Hernandez, C. Z. Gomez-Castro, U. A. Nolasco-Hernandez. A Practical Algorithm to Solve the Near-Congruence Problem for Rigid Molecules and Clusters, Journal of Chemical Information and Modeling (2023), DOI: https://doi.org/10.1021/acs.jcim.2c01187