Softpanorama

VASP Performance optimization

The Vienna Ab initio Simulation Package, version 5 (VASP 5), is an application package for performing ab-initio quantum-mechanical molecular dynamics (MD).

Scalability of the package in a cluster environment is limited by how far the algorithms it uses can be parallelized. Both rackmount servers and blades are good candidates for High Performance Computing clusters running VASP, but certain questions remain unanswered: What hardware parameters are critical for VASP performance and which are not? What aspects of a hardware architecture make it the ideal choice for running VASP?

First of all, the ability to parallelize VASP is limited. Unless the system and hardware are tuned to the specific task, in a cluster environment VASP does not scale well beyond 8 nodes or 128 cores, and even that level of scalability is achievable only with InfiniBand. Without InfiniBand, 16 nodes are probably closer to a limit. Paradoxically, if you use fewer cores on each server, for example 2 cores on each of 4 servers, the speed is higher than on a single server with 8 cores. That suggests that memory bandwidth is a bottleneck for VASP.
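One way to try the "fewer cores per server" layout described above is to give the MPI launcher a host file that offers only a few slots per node. The following is a minimal sketch, assuming Open MPI style hostfile syntax (`hostname slots=N`); the node names and the helper `make_hostfile` are hypothetical and not part of VASP or of any MPI distribution.

```python
def make_hostfile(nodes, slots_per_node):
    """Return Open MPI style hostfile text with a fixed number of slots per node."""
    if slots_per_node < 1:
        raise ValueError("slots_per_node must be at least 1")
    return "".join(f"{node} slots={slots_per_node}\n" for node in nodes)


# Hypothetical node names: 4 servers with 2 MPI ranks each (8 ranks in total).
spread = make_hostfile(["node01", "node02", "node03", "node04"], 2)

# For comparison: the same 8 ranks packed onto a single server.
packed = make_hostfile(["node01"], 8)

print(spread)
```

Timing the same job with both layouts would show whether the memory-bandwidth effect appears on a given cluster.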

Among the hardware parameters critical for VASP performance, we can currently suggest the following (again, this is a work in progress and our understanding of VASP so far is pretty superficial):

How to improve performance?

Important parameters for a cluster running VASP: network

Severe scalability limitations are seen with Ethernet networks: 1GbE, 10GbE and 40GbE performance starts to decline after 2 nodes.

QDR InfiniBand delivers the best performance for VASP among the tested interfaces. Low latency is really important.

QDR InfiniBand reduces the amount of time spent in MPI communications. As the number of nodes increases, MPI communication time increases gradually while compute time decreases.


Old News ;-)

[Jul 26, 2013] VASP Laboratory Computing Resource Center

anl.gov

VASP 5.2.11 and VASP 4.6.36 were compiled using the Intel-11.1 compilers and use the Intel MKL libraries for BLAS. As long as you run VASP with the "vasprun" that is included with the "+vasp-5.2.11" keyword, no further modification of your Softenv is necessary. VASP 5.2.11 can coexist with other MPICH compilers/libraries in your environment.

General notes:

... ... ...

• If you are running a calculation with only the gamma-point in your "integration" over reciprocal space, you can save a factor of two in elapsed time by using the gamma-point only version of VASP.

• The RMM-DIIS iterative matrix diagonalization algorithm (ALGO = Very_Fast) will give the best parallel performance.

• The parallel performance of VASP is very sensitive to the NPAR parameter on Fusion. You can improve performance by a factor of two (or more) by using an optimum value of NPAR. Our tests with the medium Teragrid benchmark indicate that NPAR should be half the number of nodes for optimal performance.

• The LPLANE input variable should be set to ".FALSE." for best performance.

• VASP 4.6.36 is faster than VASP 5.2.11. If you don't need the advanced features of VASP 5.2.11, you might consider using VASP 4.6.36.
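The INCAR-related notes above can be collected into a small helper. This is only a sketch: the function name `suggest_incar_tags` is hypothetical, and the "NPAR = half the number of nodes" rule is the one reported for Fusion with the medium Teragrid benchmark, so it should be re-measured on any other machine.

```python
def suggest_incar_tags(num_nodes):
    """Return INCAR lines that follow the Fusion notes quoted above."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # Rule of thumb from the notes: NPAR is half the number of nodes.
    # Clamping to 1 for a single node is an assumption of this sketch.
    npar = max(1, num_nodes // 2)
    return [
        "ALGO = Very_Fast",   # RMM-DIIS, reported to parallelize best
        f"NPAR = {npar}",
        "LPLANE = .FALSE.",
    ]


# Example: tags for a job spread over 8 nodes.
print("\n".join(suggest_incar_tags(8)))
```

The resulting lines would be merged by hand into an existing INCAR; the sketch does not touch any other tags.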

Useful Tools:

• Vasp Data Viewer - A user recommended visualization tool for Windows.

• P4VASP - Another user-recommended visualization tool for VASP. This has been installed on Fusion but we are currently sorting out a few bugs in the code. Please email [email protected] if you are interested in using P4VASP.

• VESTA is a free crystal structure viewer and builder which can read and write POSCAR and CONTCAR files. VESTA can also visualize 3D data such as charge densities, potentials and orbitals from CHG, CHGCAR, PARCHG, LOCPOT and ELFCAR files from VASP.

Note that POSCAR and CONTCAR files for VASP.5.2.* can contain the element names on line 6, followed by the number of atoms for each element on line 7. This undocumented feature of VASP.5.2.* facilitates the input of coordinates and elements into VESTA.
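The line layout described in the note (element names on line 6, atom counts on line 7) can be read with a few lines of Python. This is a minimal sketch: the function `read_species` and the sample file contents are made up for illustration, and the atomic coordinates are omitted from the sample because they are not needed here.

```python
def read_species(poscar_text):
    """Map element name -> atom count for a VASP 5.2.* style POSCAR/CONTCAR."""
    lines = poscar_text.splitlines()
    if len(lines) < 7:
        raise ValueError("file is too short to contain lines 6 and 7")
    names = lines[5].split()                        # line 6: element names
    if not names or all(tok.isdigit() for tok in names):
        # Only numbers on line 6 means there is no element-name line.
        raise ValueError("no element names found on line 6")
    counts = [int(tok) for tok in lines[6].split()]  # line 7: atoms per element
    if len(names) != len(counts):
        raise ValueError("element names and counts do not match")
    return dict(zip(names, counts))


# Made-up sample header (comment, scale factor, three lattice vectors, names, counts).
SAMPLE = """hypothetical cell for illustration
4.0
1.0 0.0 0.0
0.0 1.0 0.0
0.0 0.0 1.0
Na Cl
4 4
Direct
"""

print(read_species(SAMPLE))
```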

• RINGS can extract pair distribution functions, mean square displacements and other properties from the trajectory files generated by VASP molecular dynamics runs.

• Graeme Henkelman's group, at UT Austin, maintain the TST tools for VASP. These tools provide methods for finding saddle points, evaluating transition state theory (TST) rate constants and Bader charge analysis with VASP.

• VMD can be used to visualize structures and trajectories from the VASP xml file. See http://www.uni-due.de/~hp0058/?file=vmdplugins.html&lang=en for more details.

Benchmark Results for VASP

Figure 1 - Performance and scalability of VASP-4.6.36 on Intel Westmere (X5650) comparing three binaries: walltime (s) vs. number of cores (NPAR=1). The speedup is shown on top of each bar.

#Cores   vasp (s)     vasp_cd (s)   vasp_gamma (s)
1        87924.603    77432.523     43641.539
6        20377.494    23307.909     15003.526
12       13425.067    12418.448      8030.118
24        6575.835     6370.142      4415.920
36        4874.981     4545.864      3121.167
48        3839.336     3537.369      2430.424
60        3313.659     3476.121      2398.570
72        3870.750     3309.903      2104.616

Table 1 - Walltime of XRQTC Benchmarks on Intel Westmere (X5650) with Infiniband QDR (NPAR=1).
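Speedup and parallel efficiency can be derived from the walltimes in Table 1. The sketch below uses the `vasp` column only; the dictionary name `walltime` and the function `scaling` are illustrative, and efficiency is defined here as speedup divided by the number of cores.

```python
# Walltimes in seconds for the "vasp" binary, copied from Table 1.
walltime = {
    1: 87924.603,
    6: 20377.494,
    12: 13425.067,
    24: 6575.835,
    36: 4874.981,
    48: 3839.336,
    60: 3313.659,
    72: 3870.750,
}


def scaling(times):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in sorted(times.items())}


for cores, (speedup, efficiency) in scaling(walltime).items():
    print(f"{cores:3d} cores: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")
```

For this column the 72-core run is slower than the 60-core run, so the computed speedup drops at the last row.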

VASP Performance Benchmark and Profiling - HPC Advisory Council

How to compile VASP - Micro and Nano Mechanics Group

The binary (executable) file vasp can run in both serial mode (e.g. ./vasp) and parallel mode (e.g. mpiexec -np 4 vasp in a PBS script). The following table compares the time to run a simple benchmark case (one Au atom, LDA, ENCUT=400, ISMEAR=1, SIGMA=0.1, KPOINTS=21x21x21) using our executable here and the one available at /share/apps/vasp.4.6/bin/vasp. Our executable is about 70% faster.

The following table compares the time to run the same benchmark case as above using both executables. Our executable shows a speedup with multiple CPUs.

Number of CPUs   vasp compiled here (s)   /share/apps/vasp.4.6/vasp (s)   vasp another build (below) (s)
1                76                       75                              48
2                59                       72                              34
4                35                       64                              29
8                37                       65                              31

... ... ...

This will create the executable vasp in this directory. This time the executable vasp cannot run interactively, but can only run in the queue through a PBS script (e.g. mpiexec --comm=ib -np 4 vasp).

Notice that here we need to specify the communication channel. (For mvapich1 we need to use --comm=ib and for mvapich2 use --comm=pmi.)

The following table provides the timing information for the same benchmark case studied above.

Number of CPUs   vasp compiled here (s)
1                40
2                28
4                27
6                29
8                31
16               215

A Word of Caution: Make sure to run a few test cases to confirm your executable not only runs but produces the correct numerical results.

For example, we have found that on su-ahpcrc, function BRMIX (broyden.f) was giving serious errors. This was solved by changing the compilation options to "OFLAG=-O1 -mtune core2 -axW -unroll", and by changing "ICHARG = 0" to "ICHARG = 2" in INCAR. ("ICHARG=2" is the default when "ISTART=0" or if there are no CHG, CHGCAR, WAVECAR files in the folder.)
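A simple way to act on this caution is to compare the final total energy from a new build against a trusted reference run. The sketch below is hedged in several ways: the regular expression assumes OUTCAR lines of the form `free  energy   TOTEN  =  <value> eV`, the tolerance of 1e-4 eV is an arbitrary assumption, and the two sample lines are made-up values, not real VASP output.

```python
import re

# Assumed OUTCAR line format; adjust the pattern if your output differs.
TOTEN = re.compile(r"free\s+energy\s+TOTEN\s*=\s*(-?\d+\.\d+)\s*eV")


def final_energy(outcar_text):
    """Return the last TOTEN value (eV) found in the given OUTCAR text."""
    matches = TOTEN.findall(outcar_text)
    if not matches:
        raise ValueError("no TOTEN line found")
    return float(matches[-1])


def energies_agree(reference_text, candidate_text, tol_ev=1e-4):
    """True if the two final energies differ by at most tol_ev."""
    diff = abs(final_energy(reference_text) - final_energy(candidate_text))
    return diff <= tol_ev


# Made-up sample lines standing in for two OUTCAR files.
ref = "  free  energy   TOTEN  =        -3.20451234 eV\n"
new = "  free  energy   TOTEN  =        -3.20451301 eV\n"
print(energies_agree(ref, new))
```

In practice the two strings would be read from the OUTCAR files of the reference and the newly compiled executable, run on the same inputs.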

... ... ...

Intel MKL instructions

There are "official" instructions for compiling VASP with the Intel Compiler family, named Using Intel MKL in VASP. Those instructions are unrelated to the present instructions but can be a good reference for future builds.

A study of improving the parallel performance of VASP by Baker, Matthew

Open Access Dissertation - ProQuest
2010, 46 pp. East Tennessee State University, M.S. degree.

Abstract (summary)

This thesis involves a case study in the use of parallelism to improve the performance of an application for computational research on molecules. The application, VASP, was migrated from a machine with 4 nodes and 16 single-threaded processors to a machine with 60 nodes and 120 dual-threaded processors. When initially migrated, VASP's performance deteriorated after about 17 processing elements (PEs), due to network contention. Subsequent modifications that restrict communication amongst VASP processes, together with additional support for threading, allowed VASP to scale up to 112 PEs, the maximum number that was tested. Other performance-enhancing optimizations that were attempted included replacing old libraries, which produced improvements of about 10%, and prefetching, which degraded, rather than enhanced, VASP performance.

VASP on a GPU: application to exact-exchange calculations of the stability of elemental boron
hgpu.org
Maxwell Hutchinson, Michael Widom

2011, Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213

General purpose graphical processing units (GPUs) offer high processing speeds for certain classes of highly parallelizable computations, such as matrix operations and Fourier transforms, that lie at the heart of first-principles electronic structure calculations. Inclusion of exact-exchange increases the cost of density functional theory by orders of magnitude, motivating the use of GPUs. Porting the widely used electronic density functional code VASP to run on a GPU results in a 5-20 fold performance boost of exact-exchange compared with a traditional CPU. We analyze performance bottlenecks and discuss classes of problems that will benefit from the GPU. As an illustration of the capabilities of this implementation, we calculate the lattice stability of α- and β-rhombohedral boron structures utilizing exact-exchange. Our results confirm the energetic preference for symmetry-breaking partial occupation of the β-rhombohedral structure at low temperatures, but do not resolve the stability of α relative to β.

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Last modified: March 12, 2019