\documentclass{article}
\usepackage{cgw}
\usepackage{epsf}
\title{Molecular Dynamics on Parallel Computers}
\author{W. Alda\inst{1}, K. Boryczko\inst{1}, M. Bubak\inst{1},
W. Dzwinel\inst{1}, J. Kitowski\inst{1,2},
J. Mo\'sci\'nski\inst{1,2}, and M. Pogoda\inst{1}}
\institute{Institute of Computer Science,
AGH, al. Mickiewicza 30, 30-059, Krak\'ow, Poland
\and
Academic Computer Center CYFRONET, ul. Nawojki 11, 30-950 Krak\'ow,
Poland\\
\emph{email:\/} \texttt{[bubak,kito]@uci.agh.edu.pl}\\
\emph{phone:\/} (+48 12) 617 3964, \ \ \ {\em fax:\/} (+48 12) 338 054 }
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%% abstract %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{abstract}
In the paper we present experiments with parallel molecular dynamics
algorithms and programs for simulation of systems with large number of
particles and for feature extraction in pattern recognition.
In the first two algorithms Lennard-Jones systems are incorporated. In
the 3-D ({\sl Pipe Method}) domain decomposition and data-parallel
paradigm are applied.
In the second 2-D algorithm ({\sl 2-D MD}) domain decomposition with
distributed neighbor (Verlet) and link (Hockney) lists are applied. In
the third algorithm ({\sl FEA}) a kind of the harmonic potential and
all-pair interactions are introduced and parallelization is implemented
with both message-passing and data-parallel programming models.
All programs are equipped with dynamic load balancing based on
adaptive repartitioning of the computational box during the
simulation.
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%% introduction %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
\noindent
One of the fast developing approaches to material science is application
of particle simulation, mainly molecular dynamics (MD) method to solve
the problems of origin of different instabilities and others, out of
reach of classical models based on flow continuity of matter, momenta
and energy. Some early results of such simulations (e.g.
\cite{rap1}-\cite{rap3}) have been followed by other works concerning
sedimentation \cite{alda1} and simulation of macro-scale phenomena of
intrinsically discontinuous nature like fracture \cite{mora} as well as
others like convection, explosions, formation of cracks, deformation and
penetration in materials of non-liquid properties (e.g.
\cite{alda2,pc96}) showing requirements for large number of ``{\sl
molecules}'' involved in realistic simulations. More recently a
scalable multicell MD code has been implemented on the CM5 and it
demonstrated that MD simulations with $10^8+$ molecules are now feasible
\cite{beazley,lom1}.
In the paper we present our experiments with parallel MD algorithms and
programs for simulation of system dynamics with large number of
particles and for feature extraction for general pattern recognition
method.
%%%%%%%%%%%%%%%%%%%%% Characteristics of parallel algorithms %%%%%%%%%%%
\section{Characteristics of Parallel Algorithms}
\subsection{Pipe Method}
\noindent
The computational box applied in the study is a 3-D long cylinder. It
is useful for studies of fluids and mixtures in microcapillaries or
growth of optical fibers. The algorithm was originally developed for
vector computers \cite{pip1}. Short-range interactions defined by 12/6
Lennard-Jones pair potential are used with leap-frog algorithm for
solving Newtonian equations of motion. Simulation of microcanonical
ensemble is performed.
Periodic boundary conditions are introduced along the cylinder axis only
(in {\sl z}-direction). Since the number of neighbors interacting with
a given molecule is nearly constant, the integer cutoff number $n_C$ is
introduced, where $n_C$ is a number of neighbours interacting
potentially with a given particle. Particles in the cylinder are sorted
due to their {\sl z} coordinates and the index vector is set up on
returning to original particles indices. For forces calculation the
computational cylinder is stepwise shifted in respect to its copy to
subsequent neighbouring particles from 1 to $n_C$.
\subsection{2-D MD Program}
\noindent
Calculation of forces is always the most time consuming part of any MD
program. In the 2-D MD program we have applied the Verlet neighbor-list
\cite{gupta} to speed up the evaluation of interactions between
molecules as well as Hockney link-list and sorted lists to build the
neighbor list. The neighbor list is reconstructed automatically.
Equations of motion are solved using leap-frog algorithm. Tuning of the
sequential program consisted in referencing large global arrays by
locally declared pointers, assigning local variables to frequently used
global ones and removing the effect of the cache on the performance by
avoiding allocations of arrays of a size close to multiplicity of cache
line length and sorting of particle data during simulation
.....
\subsection{Feature Extraction by Molecular Dynamics}
\noindent
Feature extraction (mapping) is such a transformation of N-dimensional
space to low-dimensional $E^3$ space which preserves in the best way
mutual distances between cluster members. This may be achieved by
finding the global minimum of the Sammon's criterion function.
The quality of the transformation is a distance between
the global minimum to that found by mapping. Particles
interact via $V_{ij}$ particle-particle pair potential
\begin{equation}
V_{ij} = k(r_{ij}^2-a_{ij}^2)^2,
\end{equation}
where:
$k$ - the stiffness factor, $k > 0$,
$r_{ij}$ - the distance between particles $i$ and $j$ in ${E^3}$,
$a_{ij}$ - the distance between particles $i$ and $j$ in ${R^N}$.
The sum of potentials for each pair of $(i,j)$ gives the total potential
energy of particles system $E_p$. Minimization of $E_p$ represents the
Sammon's criterion. The system evolves according to the Newtonian
equations of motion. The transient total force acting on a
single particle $i$ is equal to the gradient of potential energy. To
reduce the numerical oscillation and assure energy dissipation the
friction force is introduced. This is the principal condition to reach
$E_p$ minimum ....
%%%%%%%%%%%%%%%%%%%%%%%%% Computers and environments %%%%%%%%%%%%%%%%%%
\section{Computers and Environments}
\noindent
\begin{table}[htb]
\centering
\begin{tabular}{lllll}
\hline
\hline
& & & System: & \\
System & Nodes & Inter- & version and & Options \\
name & & connection & environment & \\
\hline
HP/Convex & 32 & two-layer -- & SPP-UX: & \\
SPP1200/XA & & crossbar/ & 3.2.129, C (6.4), & -O2 \\
SPP1600/XA & & 1D torus & ConvexPVM (3.3.10) & \\
\hline
Intel & 98 & 2D mesh & OSF/1: & \\
Paragon XP/S & & & 1.0.4, C, NX-2 (R4.5)& -O, \\
& & & & -Knoieee\\
& & & 1.3.3, Fortran77 & -O4 \\
& & & & -Knoieee\\
\hline
TMC & 896 & data/control & C Most & \\
CM5 & & network & CM Fortran 2.1/2.2 & -O \\
& & fat/binary tree& & \\
\hline
CRAY & 512 & 3-D torus & UNICOS 8.0.3 & \\
T3D & & latency hiding & C 4.0.5, & \\
& & & PVM Cray v. 3.0.0.2 & \\
\hline
IBM RS6K/320 & 2 & Ethernet & AIX 3.2.4 & \\
SUN & & & & \\
SparcStation2& 2 & network & SunOS 4.1 & -O \\
& & & C, PVM (3.3.8) & \\
\hline
\hline
\end{tabular}
\caption{Comparison of parallel computer systems and environments.}
\label{tab1}
\end{table}
Main feature of three NUMA multiprocessor computers applied in the study
are summarized in Tab. \ref{tab1}.
%%%%%%%%%%%%%%%%%%%%%% Program performance %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Program Performance}
\subsection{The PIPE Method}
\noindent
In Fig.~\ref{WCtime} execution wall-clock time, $\tau$, per MD timestep
and particle is presented for the Pipe Method for the both models of
programming: domain decomposition and data-parallel. For the domain
decomposition paradigm, for which ...
\subsection{2-D MD Parallel Program}
\noindent
Wall-clock $\tau$ of the parallel program on homogeneous clusters of the
SUN SPARCstation IPX, IBM RS/6000-320 and CONVEX Exemplar SPP 1200, SPP
1600 and Cray T3D is presented in Fig. \ref{PARTIM}.
\begin{figure}[htb]
\centering
\begin{minipage}{5.8cm}
\epsfxsize=5.8cm
\epsfbox{fig1.eps}
\caption{Execution time for different number of computing nodes, $K$,
for PIPE algorithm on network of workstations
(NW), CM5 (CM), Paragon (PN) and SPP1200 (CX).}
\label{WCtime}
\end{minipage}
\hspace{0.2cm}
\begin{minipage}{5.8cm}
\vspace*{0.48cm}
\epsfxsize=5.8cm
%\epsfysize=7.3cm
\epsfbox{fig2.eps}
%\vspace*{0.4cm}
\caption{Execution time for the parallel 2-D MD program
on virtual network computer and on Exemplar \mbox{SPP 1200},
\mbox{SPP 1600} and Cray T3D;
constant size problem -- $131072$ particles.}
\label{PARTIM}
\end{minipage}
\end{figure}
Timings were done for the constant size problem, i.e. the number of
particles indicated in Fig. \ref{PARTIM} is the total
number of particles in the computational box.
Calculations on SPP 1200 and SPP 1600 were done on dedicated subcomplex
with 24 and 16 processors, respectively.
%%%%%%%%%%%%%%%%%%%%%%%% Sample results %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Sample Results}
\noindent
MD application is especially valuable when the fluctuations play
predominant role in the system evolution, e.g. the hydrodynamical
phenomena driven by Rayleigh - Taylor type of instabilities. The
process of mixing of two fluids: heavier one placed above and lighter
on the bottom is a good example.
The system consists of two kinds of particles which form lighter and
heavier fluids. The particles are placed in a 2-D rectangular box.
Periodic boundary conditions are applied to vertical edges, while
horizontal edges represent reflecting walls. Initially, the heavier
fluid occupies the upper half of the box. Due to external gravity field
acting downwards on every particle the motion of both layers starts.
\begin{figure}[htb]
\centering
\begin{minipage}{\textwidth}
\centering
\fbox{
\epsfxsize=9.5cm
\epsfbox{fig3.eps}
}
\caption{Simulation of mixing of two fluids.}
\label{BANKI}
\end{minipage}
\end{figure}
The MD simulations (see Fig.\ref{BANKI}) have been performed for
$2.8\times 10^6$ particles on 128 processors of Cray T3D (at Minnesota
Supercomputer Center). In Fig.\ref{BANKI} one may observe the situation
after $10^5$ timesteps.
%%%%%%%%%%%%%%%%%%%%%%%%%%% Concluding remarks %%%%%%%%%%%%%%%%%%%%%%%%%
\section{Concluding Remarks}
\noindent
We have elaborated two parallel 2-D/3-D short-range MD programs. The
pipe method is used for studies of fluid properties in microcapillaries
and test purposes while the second is highly optimized for production
runs. Both algorithms turn out to be efficient in parallel
implementation for moderate number of particles and computing nodes.
The 2-D MD code is based on PVM and thus portable over a wide range of
computer architectures like networked workstations and multiprocessors.
When it was being developed the main objective was to make large
simulations feasible by reducing the execution time and providing more
memory for computer experiments. It is suitable for simulation of
Lennard-Jones systems with large number of particles ($10^6+$) what is
required for fluid flow and macroscopic phenomena investigation with MD.
Important advantage of the parallel program is the load balancing
procedure tuning dynamically distribution of the program data on
processors to their current load which results in improvement of
performance.
MD approach can also be used as "natural solver" and applied
straightforwardly to global minimum search. In this case we were able
to use parallel systems efficiently by just adapting existing parallel
MD algorithms and software.
%%%%%%%%%%%%%%%%%%%% acknowledgement %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{acknowledgement}
We are very grateful to Prof. David A. Yuen from Minnesota
Supercomputer Institute and to Dr. Jeremy Cook from University of
Bergen for supporting us with computer time. The support of ACC
CYFRONET staff, in particular Mrs Z. Mosurska, Mr M. Pilipczuk and
mbox{Dr. P. Wyrostek}, was very helpful. This research was partially
supported by the II Joint American-Polish \mbox{M. Sk\l{}odowska-Curie}
Fund (MEN/NSF--94--193).
\end{acknowledgement}
%%%%%%%%%%%%%%%%%%%%%%%% bibliography %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{thebibliography}{99}
\bibitem{rap1}
Rapaport, D.:
Microscale hydrodynamics: Discrete particle simulation of evolving
flow patterns,
{\sl Phys.Rev.}, {\bf A36}, 7 (1987) 3288.
\bibitem{kop}
Koplik, J., Banavar, J.R., Willemsen, J.F.:
Molecular dynamics of Poiseuille flow and moving contact lines,
{\sl Phys. Rev. Lett.}, {\bf 60} (1988) 1282.
\bibitem{cui}
Cui, S.T., Evans, D.J.:
Molecular dynamics simulation of two dimensional flow past a plate,
{\sl Mol. Simul.}, {\bf 9} (1992) 179.
\bibitem{rap3}
Rapaport, D.:
Unpredictable convection in a small box: molecular dynamics
experiments flow patterns,
{\sl Phys.Rev.}, {\bf A46}, 4 (1992) 1971.
\bibitem{alda1}
Alda, W., Dzwinel, W., Kitowski, J., Mo\'sci\'nski, J., and Yuen, D.:
Convection driven by sedimentation using molecular dynamics approach,
{\sl Hertzberger, B., Serazzi, G., (eds.), Proc. Int. Conf. High
Performance Computing and Networking, Milan, Italy, May 1995\/},
LNCS {\bf 796\/},
678-683, Springer-Verlag, 1995.
\bibitem{mora}
Mora, P., Place, D.:
A lattice solid model for the nonlinear dynamics of earthquakes,
International Journal of Modern Physics C
{\bf 4} (1993) 1059-1074.
\bibitem{alda2}
Alda, W., Bubak, M., Dzwinel, W., Kitowski, J., Mo\'sci\'nski, J.:
Computational molecular dynamics at the Institute of Computer
Science, AGH - Cracow, Poland,
Supercomputer Institute Research Report UMSI 94/224 (1994).
\bibitem{pc96}
Alda, W., Bubak, M., Dzwinel, W., Kitowski, J., Mo\'sci\'nski, J.,
Pogoda, M., and Yuen D.:
Fluid dynamics simulation with particles,
in: {\sl Borcherds, P., Bubak, M., and Maksymowicz, A., (Eds.),
Proc. of the 8 Joint EPS-APS Int. Conf. on Physics Computing},
Krakow, September 17-21, 1996, Krak\'ow, Poland,
pp. 281-284, ACC CYFRONET-KRAK\'OW, Poland, 1996.
\bibitem{beazley}
Beazley, D.M., Lomdahl, P.S.:
Message-passing multi-cell molecular dynamics on the Connection
Machine 5, Parallel Computing {\bf 20} (1994) 173-195.
\bibitem{lom1}
Beazley, D.M., Lomdahl, P.S., Gronbech-Jensen, N., Giles, R.,
Tamayo, P.: Parallel algorithms for short-range molecular
dynamics, {\sl World Scientific's Annual Reviews in Computational
Physics}, {\bf 3} (1995).
\bibitem{pip1}
Mo\'sci\'nski, J., Kitowski, J., Rycerz, Z.A.,
Jacobs, P.W.M.: A vectorized algorithm on the ETA 10-P for molecular
dynamics
simulation of large number of particles confined in a long cylinder,
{\sl Comput. Phys. Commun.\/}, {\bf 54\/} (1989) 47.
\bibitem{gupta}
Gupta, S.:
Computing aspects of molecular dynamics simulation,
Computer Physics Communication {\bf 70} (1992) 243-270.
\bibitem{lb95}
Bubak, M., Mo\'sci\'nski, J., Pogoda, M., S\l{}ota, R.:
Load balancing for lattice gas and molecular dynamics simulations
on networked workstations,
{\sl Hertzberger, B., Serazzi, G., (eds.), Proc. Int. Conf. High
Performance Computing and Networking, Milan, Italy, May 1995\/},
LNCS {\bf 796\/},
329-334, Springer-Verlag, 1995.
\end{thebibliography}
\end{document}