Softpanorama

May the source be with you, but remember the KISS principle ;-)

High Performance Computing (HPC)


Partially based on

Most HPC systems use the concept of parallelism. HPC hardware falls into three categories:

The term "cluster" can take different meanings in different contexts. Here we discuss high-performance clusters, which are used to run parallel programs for time-intensive computations and are mostly used by the scientific community. They commonly run simulations and other CPU-intensive programs. Two other types of clusters exist:

Grid computing is a broad term that typically refers to a set of servers (not necessarily uniform, as in a cluster) connected to a common scheduler, for example Sun Grid Engine. From this point of view, HPC is a special case of grid computing in which the nodes are uniform and tightly coupled.

Some features of HPC are as follows:

Parallel programming and Amdahl's Law

Parallel programming is mostly limited to a certain subclass of computational algorithms (and more recently to search and genomic decoding algorithms). Many algorithms are not parallelizable.

Parallel programming (like all programming) is as much art as science, always leaving room for major design improvements and performance enhancements. Software and hardware go hand in hand when it comes to achieving high performance on a cluster. Programs must be written to explicitly take advantage of the underlying hardware, and existing non-parallel programs must be re-written if they are to perform well on a cluster.

A parallel program does many things at once. Just how many depends on the problem at hand. Suppose 1/N of the total time taken by a program is in a part that can not be parallelized, and the rest (1-1/N) is in the parallelizable part (see Figure 3).


Figure 3. Illustrating Amdahl's Law
Illustrating Amdahl's Law

In theory you could apply an infinite amount of hardware to do the parallel part in zero time, but the sequential part will see no improvements. As a result, the best you can achieve is to execute the program in 1/N of the original time, but no faster. In parallel programming, this fact is commonly referred to as Amdahl's Law.

Amdahl's Law governs the speedup of using parallel processors on a problem versus using only one serial processor. Speedup is defined as the time it takes a program to execute in serial (with one processor) divided by the time it takes to execute in parallel (with many processors):

     T(1)
S = ------
     T(j)

Where T(j) is the time it takes to execute the program when using j processors.

The really hard work in writing a parallel program is to make N as large as possible. But there is an interesting twist to it. You normally attempt bigger problems on more powerful computers, and usually the proportion of time spent on the sequential parts of the code decreases with increasing problem size (as you tend to modify the program and increase the parallelizable portion to optimize the available resources). Therefore, the value of N automatically becomes large. (See the re-evaluation of Amdahl's Law in the Resources section later in this article.)
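To make the ceiling concrete, here is a small sketch (my illustration, not from the article) that tabulates the Amdahl speedup S(j) = 1/(s + (1-s)/j) for an assumed serial fraction s = 1/N = 0.1:

```shell
# Sketch: Amdahl speedup S(j) = 1/(s + (1-s)/j) for serial fraction s.
# s=0.1 (i.e., N=10) is an assumed value for illustration only.
s=0.1
for j in 1 2 4 8 16 1000; do
  awk -v s="$s" -v j="$j" \
    'BEGIN { printf "j=%-4d  S=%.2f\n", j, 1 / (s + (1 - s) / j) }'
done
```

Even with j = 1000 processors the speedup stays just under 1/s = 10, the limit set by the sequential part.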

Approaches to parallel programming

Creating a parallel program is a huge challenge, and scalability across multiple cores is typically limited. Distributing computation between nodes via MPI is another huge challenge, and the return on investment beyond a certain number of nodes can be negative.

Most researchers I have worked with are extremely naive about this and generally subscribe to the simple religion of "the more cores the better".

There are two major parallel programming approaches:

Distributed memory approach

It is useful to think of a master-slave model here:

Obvious practical problems in this approach stem from the distributed-memory organization. Because each node has access to only its own memory, data structures must be duplicated and sent over the network if other nodes want to access them, leading to network overhead.

Shared memory approach

In the shared-memory approach, memory is common to all processors (such as SMP). This approach does not suffer from the problems mentioned in the distributed-memory approach. Also, programming for such systems is easier since all the data is available to all processors and is not much different from sequential programming. The big issue with these systems is scalability: it is not easy to add extra processors.

When file I/O becomes a bottleneck

Some applications frequently need to read and write large amounts of data to disk, which is often the slowest step in a computation. This can be the case, for example, in genome decoding. Faster SSD drives help, but there are times when they are not enough.

The problem becomes especially pronounced if a physical disk partition is shared between all nodes (using NFS, or GPFS, which for simplicity can be viewed as NFS with multiple masters).

Parallel filesystems such as GPFS can somewhat relieve the bottleneck between the NFS server and the switch to which the computational nodes are connected (which is typically limited to 40 Mbit or 100 Mbit).

Parallel filesystems spread the data in a file over several disks attached to multiple specialized (or non-specialized) nodes of the cluster, known as I/O nodes. When a program tries to read a file, small portions of that file are read from several disks in parallel. This reduces the load on any given disk controller and allows it to handle more requests. (PVFS is a good example of an open source parallel filesystem; disk performance of better than 1 GB/sec has been achieved on Linux clusters using standard IDE hard disks.)

Open source cluster application resources

Clearly, it will be hard to maintain the cluster above. It is not convenient to copy files to every node, set up SSH and MPI on every node that gets added, make appropriate changes when a node is removed, and so on.

Fortunately, there are integrated solutions such as Rocks, which provide most of the things we need for the cluster and automate some of the typical tasks. When it comes to managing a cluster in a production environment with a large user base, job scheduling and monitoring are crucial.

They include:

  1. Scheduler
  2. Monitoring subsystem
  3. Performance measuring tools
  4. Imaging subsystem

Scheduler

Sun Grid Engine (now Oracle Grid Engine) can be used as a powerful scheduler for computational clusters. Other popular scheduling systems are OpenPBS and Torque. Using a scheduler you can create queues and submit jobs to them.

You can also create sophisticated job-scheduling policies.

All "grid" schedulers let you view executing jobs, submit jobs, and cancel jobs. They also allow control over the maximum amount of CPU time available to a particular job, which is quite useful for an administrator.

Monitoring subsystem

An important aspect of managing clusters is monitoring, especially if your cluster has a large number of nodes. Several options are available, such as Nagios and Ganglia.

Ganglia has a Web-based front end and provides real-time monitoring for CPU and memory usage; you can easily extend it to monitor just about anything. For example, with simple scripts you can make Ganglia report on CPU temperatures, fan speeds, etc.
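A sketch of such a script (the "Core 0" sensor label and the metric name are assumptions and vary per machine; it presumes lm_sensors and Ganglia's gmetric tool are installed):

```shell
#!/bin/sh
# Sketch: publish the CPU temperature to Ganglia via gmetric.
# The "Core 0:" label and the metric name "cpu0_temp" are assumptions;
# adjust them to whatever your sensors output actually shows.
TEMP=$(sensors | awk '/^Core 0:/ { gsub(/[^0-9.]/, "", $3); print $3; exit }')
if [ -n "$TEMP" ]; then
  gmetric --name cpu0_temp --value "$TEMP" --type float --units Celsius
fi
```

Run it from cron on each node and the metric appears in the Ganglia front end alongside the built-in ones.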

Measuring performance: Megaflops as a false/fake measure of performance of HPC clusters

Clusters are built to perform, and you need to know how fast they are. The standard benchmark, based on LINPACK calculations, is a bizarre and false measure in all respects but one: applications with unlimited parallelism.

Real applications usually do not scale well above, say, 128 cores, and many do not scale above 32 cores. So the megaflop value used for large clusters and in the Top500 index is just "art for art's sake" and has very little connection to reality. It is a fake measure that suggests that more nodes and more cores are better. That's completely untrue. If your application does not scale above, say, 16 nodes, the only thing you can do is run many instances of the application in parallel. You will never achieve higher productivity from a cluster with a higher megaflop rating, and most probably your computations will take longer, not less.

IBM is among the firms that propagate this nonsense:

A cluster is generally only as good as each single computational node, and megaflops tend to push designs toward a larger number of cores per CPU, which is completely the wrong approach. It's common to think that processor frequency determines performance. While this is true to a certain extent, it is of little value in comparing processors from different vendors, or even different processor families from the same vendor, because different processors do different amounts of work in a given number of clock cycles. This was especially obvious when we compared vector processors with scalar processors (see Part 1). The speed of memory also matters.

A more natural way to compare performance is to run some standard tests. Over the years a test known as the LINPACK benchmark has become the gold standard for comparing performance of HPC clusters. It was written by Jack Dongarra more than a decade ago and is still used by top500.org (see Resources for a link). This test is fake in the sense that it assumes unlimited parallelism.

This test involves solving a dense system of N linear equations, where the number of floating-point operations is known (of the order of N^3). This test is well suited to speed-testing computers meant to run scientific applications and simulations because they tend to solve linear equations at some stage or another.

The standard unit of measurement is the number of floating-point operations or flops per second (in this case, a flop is either an addition or a multiplication of a 64-bit number). The test measures the following:

To appreciate these numbers, consider that IBM BlueGene/L can compute in one second a task that on your home computer may take up to five days.

This phrase, "IBM BlueGene/L can compute in one second a task that on your home computer may take up to five days", is complete nonsense, however. Much depends on the particular application.



[Sep 07, 2018] Experiences with Sun Grid Engine

Sep 07, 2018 | auckland.ac.nz

Experiences with Sun Grid Engine

In October 2007 I updated the Sun Grid Engine installed here at the Department of Statistics and publicised its presence and how it can be used. We have a number of computation hosts (some using Māori fish names as fish are often fast) and a number of users who wish to use the computation power. Matching users to machines has always been somewhat problematic.

Fortunately for us, SGE automatically finds a machine to run compute jobs on. When you submit your job you can define certain characteristics, eg, the genetics people like to have at least 2GB of real free RAM per job, so SGE finds you a machine with that much free memory. All problems solved!

Let's find out how to submit jobs! (The installation and administration section probably won't interest you much.)

I gave a talk to the Department on 19 February 2008, giving a quick overview of the need for the grid and how to rearrange tasks to make better use of parallelism.

Installation

My installation isn't as polished as Werner's setup, but it comes with more carrots and sticks and informational emails to heavy users of computing resources.

For this very simple setup I first selected a master host, stat1. This is also the submit host. The documentation explains how to go about setting up a master host.

Installation for the master involved:

  1. Setting up a configuration file, based on the default configuration.
  2. Uncompressing the common and architecture-specific binaries into /opt/sge
  3. Running the installation. (Correcting mistakes, running again.)
  4. Success!

With the master setup I was ready to add compute hosts. This procedure was repeated for each host. (Thankfully a quick for loop in bash with an ssh command made this step very easy.)

  1. Login to the host
  2. Create /opt/sge .
  3. Uncompress the common and architecture-specific binaries into /opt/sge
  4. Copy across the cluster configuration from /opt/sge/default/common . (I'm not so sure on this step, but I get strange errors if I don't do this.)
  5. Add the host to the cluster. (Run qhost on the master.)
  6. Run the installation, using the configuration file from step 1 of the master. (Correcting mistakes, running again. Mistakes are hidden in /tmp/install_execd.* until the installation finishes. There's a problem where if /opt/sge/default/common/install_logs is not writeable by the user running the installation then it will be silently failing and retrying in the background. Installation is pretty much instantaneous, unless it's failing silently.)
    • As a sub-note, you receive architecture errors on Fedora Core. You can fix this by editing /opt/sge/util/arch and changing line 248 that reads 3|4|5) to 3|4|5|6) .
  7. Success!
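The "quick for loop" might look something like this sketch; the host names and tarball path are placeholders, and the echo makes it a dry run (remove it to execute):

```shell
#!/bin/sh
# Dry-run sketch of pushing the SGE binaries to each compute host.
# Hostnames and the tarball path are assumptions for illustration.
for HOST in exec1 exec2 exec3; do
  echo ssh "$HOST" "mkdir -p /opt/sge && tar -C /opt/sge -xzf /tmp/sge-bin.tar.gz"
done
```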

If you now run qhost on some host, eg, the master, you will see all your hosts sitting waiting for instructions.

Administration

The fastest way to check if the Grid is working is to run qhost , which lists all the hosts in the Grid and their status. If you're seeing hyphens it means that host has disappeared. Is the daemon stopped, or has someone killed the machine?

The glossiest way to keep things up to date is to use qmon . I have it listed as an application in X11.app on my Mac. The application command is as follows. Change 'master' to the hostname of the Grid master. I hope you have SSH keys already setup.

ssh master -Y . /opt/sge/default/common/settings.sh \; qmon

Want to gloat about how many CPUs you have in your cluster? (Does not work with machines that have > 100 CPU cores.)

admin@master:~$ qhost | sed -e 's/^.\{35\}[^0-9]\+//' | cut -d" " -f1
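One way around that limitation (a sketch of my own, not part of the original page): qhost reports the CPU count as the third whitespace-separated field, so awk can sum it regardless of column width:

```shell
# Sketch: total CPU cores in the cluster, immune to wide columns.
# Relies only on NCPU being the third field of qhost output; the header,
# separator, and "global" lines carry no number there.
qhost | awk '$3 ~ /^[0-9]+$/ { total += $3 } END { print total + 0 }'
```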

Adding Administrators

SGE will probably run under a user you created for it, known as "sgeadmin". "root" does not automatically become all-powerful in the Grid's eyes, so you probably want to add your usual user account as a Manager or Operator. (Have a look in the manual for how to do this.) It will make your life a lot easier.

Automatically sourcing environment

Normally you have to manually source the environment variables, eg, SGE_ROOT, that make things work. On your submit hosts you can have this setup to be done automatically for you.

Create links from /etc/profile.d to the settings files in /opt/sge/default/common and they'll be automatically sourced for bash and tcsh (at least on Redhat).
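A sketch of those links (the target paths match the /opt/sge layout used here; the writability check is my addition so the sketch degrades gracefully when not run as root):

```shell
#!/bin/sh
# Sketch: link the SGE settings files into /etc/profile.d so login
# shells source them automatically (bash and tcsh respectively).
SGE_COMMON=${SGE_COMMON:-/opt/sge/default/common}
PROFILED=${PROFILED:-/etc/profile.d}
if [ -w "$PROFILED" ]; then
  ln -sf "$SGE_COMMON/settings.sh"  "$PROFILED/sge.sh"
  ln -sf "$SGE_COMMON/settings.csh" "$PROFILED/sge.csh"
else
  echo "skipping: $PROFILED not writable (run as root)"
fi
```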

Slots

The fastest processing you'll do is when you have one CPU core working on one problem. This is how the Grid is setup by default. Each CPU core on the Grid is a slot into which a job can be put.

If you have people logging on to the machines and checking their email, or being naughty and running jobs by hand instead of via the Grid engine, these calculations get mucked up. Yes, there still is a slot there, but it is competing with something being run locally. The Grid finds a machine with a free slot and the lowest load for when it runs your job so this won't be a problem until the Grid is heavily laden.

Setting up queues

Queues are useful for doing crude prioritisation. Typically a job gets put in the default queue and when a slot becomes free it runs.

If the user has access to more than one queue, and there is a free slot in that queue, then the job gets bumped into that slot.

A queue instance is the queue on a host that it can be run on. 10 hosts, 3 queues = 30 queue instances. In the example below you can see three queues and seven queue instances: all.q@paikea, dnetc.q@paikea, beagle.q@paikea, all.q@exec1, dnetc.q@exec1, all.q@exec2, dnetc.q@exec2. Each queue can have a list of machines it runs on so, for example, the heavy genetics work in beagle.q can be run only on the machines attached to the SAN holding the genetics data. (A queue does not have to include all hosts, ie, @allhosts.)

Diagram to explain hosts, queues, and slots

From this diagram you can see how CPUs can become oversubscribed. all.q covers every CPU. dnetc.q covers some of those CPUs a second time. Uh-oh! (dnetc.q is setup to use one slot per queue instance. That means that even if there are 10 CPUs on a given host, it will only use 1 of those.) This is something to consider when setting up queues and giving users access to them. Users can't put jobs into queues they don't have access to, so the only people causing contention are those with access to multiple queues but don't specify a queue ( -q ) when submitting.

Another use for queues is subordinate queues . I run low priority jobs in dnetc.q. When the main queue gets busy, all the jobs in dnetc.q are suspended until the main queue's load decreases. To do this I edited all.q, and under Subordinates added dnetc.q.

So far the shortest queue I've managed to make is one that uses 1 slot on each host it is allowed to run on. There is some talk in the documentation regarding user defined resources ( complexes ) which, much like licenses, can be "consumed" by jobs, thus limiting the number of concurrent jobs that can be run. (This may be useful for running an instance of Folding@Home, as it is not thread-safe , so you can set it up with a single "license".)

You can also change the default nice value of processes, but possibly the most useful setting is to turn on "rerunnable", which allows a task to be killed and run again on a different host.

Parallel Environment

Something that works better than queues and slots is to set up a parallel environment . This can have a limited number of slots which counts over the entire grid and over every queue instance. As an example, Folding@Home is not thread safe. Each running thread needs its own work directory.

How can you avoid contention in this case? Make each working directory a parallel environment, and limit the number of slots to 1.

I have four working directories named fah-a to fah-d . Each contains its own installation of the Folding@Home client:

$ ls ~/grid/fah-a/
fah-a
client.cfg
FAH504-Linux.exe
work

For each of these directories I have created a parallel environment:

admin@master:~$ qconf -sp fah-a
pe_name           fah-a
slots             1
user_lists        fah

These parallel environments are made available to all queues that the job can be run in and all users that have access to the working directory - which is just me.
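For reference, a PE like this can also be created non-interactively by feeding a definition file to qconf -Ap instead of editing it in qmon. This is a sketch: only pe_name, slots, and user_lists come from the listing above; the remaining fields are assumed defaults and may need adjusting.

```shell
#!/bin/sh
# Sketch: create the fah-a parallel environment from a file.
cat > /tmp/fah-a.pe <<'EOF'
pe_name           fah-a
slots             1
user_lists        fah
xuser_lists       NONE
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   $pe_slots
control_slaves    FALSE
job_is_first_task TRUE
EOF
# Register the PE (needs SGE admin rights; skipped if qconf is absent).
command -v qconf >/dev/null && qconf -Ap /tmp/fah-a.pe
echo "PE definition written to /tmp/fah-a.pe"
```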

The script to run the client is a marvel of grid arguments. It requests the parallel environment, bills the job to the Folding@Home project, names the project, etc. See for yourself:

#!/bin/sh
# use bash
#$ -S /bin/sh
# current directory
#$ -cwd
# merge output
#$ -j y
# mail at end
#$ -m e
# project
#$ -P fah
# name in queue
#$ -N fah-a
# parallel environment
#$ -pe fah-a 1
./FAH504-Linux.exe -oneunit

Note the -pe argument that says this job requires one slot worth of fah-a please.

Not a grid option, but the -oneunit flag for the folding client is important as this causes the job to quit after one work unit and the next work unit can be shuffled around to an appropriate host with a low load whose queue isn't disabled. Otherwise the client could end up running in a disabled queue for a month without nearing an end.

With the grid taking care of the parallel environment I no longer need to worry about manually setting up job holds so that I can enqueue multiple units for the same work directory. -t 1-20 ahoy!
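For completeness, a minimal sketch of such a task-array job (the echo body is mine; on a real grid each task would process one work unit):

```shell
#!/bin/sh
# Sketch: a task-array job. Submitted with "qsub -t 1-20 array.sh", it
# runs as 20 tasks, each seeing its own $SGE_TASK_ID (1..20). The
# defaults let the script also run standalone outside the grid.
#$ -S /bin/sh
#$ -cwd
#$ -j y
echo "Task ${SGE_TASK_ID:-1} of job ${JOB_ID:-0} on $(hostname)"
```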

Complex Configuration

An alternative to the parallel environment is to use a Complex. You create a new complex, say how many slots are available, and then let people consume them!

  1. In the QMON Complex Configuration, add a complex called "fah_l", type INT, relation <=, requestable YES, consumable YES, default 0. Add, then Commit.
  2. I can't manage to get this through QMON, so I do it from the command line. qconf -me global and then add fah_l=1 to the complex_values.
  3. Again through the command line. qconf -mq all.q and then add fah_l=1 to the complex_values. Change this value for the other queues. (Note that a value of 0 means jobs requesting this complex cannot be run in this queue.)
  4. When starting a job, add -l fah_l=1 to the requirements.

I had a problem to start off with, where qstat was telling me that -25 licenses were available. However this is due to the default value, so make sure that is 0!

Using Complexes I have set up license handling for Matlab and Splus .

As one host group does not have Splus installed on them I simply set that host group to have 0 Splus licenses available. A license will never be available on the @gradroom host group, thus Splus jobs will never be queued there.

Quotas

Instead of Complexes and parallel environments, you could try a quota!

Please excuse the short details:

admin@master$ qconf -srqsl
admin@master$ qconf -mrqs lm2007_slots
{
   name         lm2007_slots
   description  Limit the lm2007 project to 20 slots across the grid
   enabled      TRUE
   limit        projects lm2007 to slots=20
}

Pending jobs

Want to know why a job isn't running?

  1. Job Control
  2. Pending Jobs
  3. Select a job
  4. Why ?

This is the same as qstat -f , shown at the bottom of this page.

Using Calendars

A calendar is a list of days and times along with states: off or suspended. Unless specified the state is on.

A queue, or even a single queue instance, can have a calendar attached to it. When the calendar says that the queue should now be "off" then the queue enters the disabled (D) state. Running jobs can continue, but no new jobs are started. If the calendar says it should be suspended then the queue enters the suspended (S) state and all currently running jobs are stopped (SIGSTOP).

First, create the calendar. We have an upgrade for paikea scheduled for 17 January:

admin@master$ qconf -scal paikeaupgrade
calendar_name    paikeaupgrade
year             17.1.2008=off
week             NONE

By the time we get around to opening up paikea's case and pulling out the memory, jobs will have had several hours to complete after the queue is disabled. Now, we have to apply this calendar to every queue instance on this host. You can do this all through qmon but I'm doing it from the command line because I can. Simply edit the calendar line to append the hostname and calendar name:

admin@master$ qconf -mq all.q
...
calendar              NONE,[paikea=paikeaupgrade]
...

Repeat this for all the queues.
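Instead of editing each queue by hand, a loop over qconf -sql with qconf -mattr can attach the calendar in one go (a sketch; prefix the inner qconf with echo first if you want a dry run):

```shell
#!/bin/sh
# Sketch: attach the paikeaupgrade calendar to the paikea instance of
# every queue. qconf -sql lists all queue names; qconf -mattr modifies
# a single queue attribute without opening an editor.
for Q in $(qconf -sql); do
  qconf -mattr queue calendar 'NONE,[paikea=paikeaupgrade]' "$Q"
done
```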

There is a user who likes to use one particular machine and doesn't like jobs running while he's at the console. Looking at the usage graphs I've found out when he is using the machine and created a calendar based on this:

admin@master$ qconf -scal michael
calendar_name    michael
year             NONE
week             mon-sat=13-21=off

This calendar is obviously recurring weekly. As in the above example it was applied to queues on his machine. Note that the end time is 21, which covers the period from 2100 to 2159.

Suspending jobs automatically

Due to the number of slots being equal to the number of processors, system load is theoretically not going to exceed 1.00 (when divided by the number of processors). This value can be found in the np_load_* complexes .

But (and this is a big butt) there are a number of ways in which the load could go past a reasonable level:

For example, with paikea , there are three queues:

  1. all.q (4 slots)
  2. paikea.q (4 slots)
  3. beagle.q (overlapping with the other two queues)

all.q is filled first, then paikea.q. beagle.q, by project and owner restrictions, is only available to the sponsor of the hardware. When their jobs come in, they can get put into beagle.q, even if the other slots are full. When the load average comes up, other tasks get suspended: first in paikea.q, then in all.q.

Let's see the configuration:

qname                 beagle.q
hostlist              paikea.stat.auckland.ac.nz
priority              19,[paikea.stat.auckland.ac.nz=15]
user_lists            beagle
projects              beagle

We have the limited access to this queue through both user lists and projects. Also, we're setting the Unix process priority to be higher than the other queues.

qname                 paikea.q
hostlist              paikea.stat.auckland.ac.nz
suspend_thresholds    NONE,[paikea.stat.auckland.ac.nz=np_load_short=1.01]
nsuspend              1
suspend_interval      00:05:00
slots                 0,[paikea.stat.auckland.ac.nz=4]

The magic here being that suspend_thresholds is set to 1.01 for np_load_short. This is checked every 5 minutes, and 1 process is suspended at a time. This value can be adjusted to get what you want, but it seems to be doing the trick according to graphs and monitoring the load. np_load_short is chosen because it updates the most frequently (every minute), more than np_load_medium (every five), and np_load_long (every fifteen minutes).

all.q is fairly unremarkable. It just defines four slots on paikea.

Submitting jobs

Jobs are submitted to the Grid using qsub . Jobs are shell scripts containing commands to be run.

If you would normally run your job by typing ./runjob , you can submit it to the Grid and have it run by typing: qsub -cwd ./runjob

Jobs can be submitted while logged on to any submit host: sge-submit.stat.auckland.ac.nz .

For all the commands on this page I'm going to assume the settings are all loaded and you are logged in to a submit host. If you've logged in to a submit host then they'll have been sourced for you. You can source the settings yourself if required: . /opt/sge/default/common/settings.sh - the dot and space at the front are important .

Depending on the form your job is currently in they can be very easy to submit. I'm just going to go ahead and assume you have a shell script that runs the CPU-intensive computations you want and spits them out to the screen. For example, this tiny test.sh :

#!/bin/sh
expr 3 + 5

This computation is very CPU intensive!

Please note that the Sun Grid Engine ignores the bang path at the top of the script and will simply run the file using the queue's default shell which is csh. If you want bash, then request it by adding the very cryptic line: #$ -S /bin/sh

Now, let's submit it to the grid for running:

user@submit:~$ qsub test.sh
Your job 464 ("test.sh") has been submitted
user@submit:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                slots ja-task-ID 
-------------------------------------------------------------------------------------------------------
    464 0.00000 test.sh    user         qw    01/10/2008 10:48:03                          1

There goes our job, waiting in the queue to be run. We can run qstat a few more times to see it as it goes. It'll be run on some host somewhere, then disappear from the list once it is completed. You can find the output by looking in your home directory:

user@submit:~$ ls test.sh*
test.sh  test.sh.e464  test.sh.o464
user@submit:~$ cat test.sh.o464
8

The output file is named based on the name of the job, the letter o , and the number of the job.

If your job had problems running have a look in these files. They probably explain what went wrong.

Easiest way to submit R jobs

Here are two scripts and a symlink I created to make it easy as possible to submit R jobs to your Grid:

qsub-R

If you normally do something along the lines of:

user@exec:~$ nohup nice R CMD BATCH toodles.R

Now all you need to do is:

user@submit:~$ qsub-R toodles.R
Your job 3540 ("toodles.R") has been submitted

qsub-R is linked to submit-R, a script I wrote. It calls qsub and submits a simple shell wrapper with the R file as an argument. It ends up in the queue and eventually your output arrives in the current directory: toodles.R.o3540
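The wrapper itself is not reproduced here, but a minimal sketch of the same idea (the function name, the options, and the use of R CMD BATCH are my assumptions; the real submit-R script may differ) looks like this:

```shell
#!/bin/sh
# Sketch of a qsub-R style wrapper: validate the argument, then pipe a
# one-line job script into qsub (SGE reads the job script from stdin
# when no script file is given).
qsub_R() {
  RFILE="$1"
  if [ ! -f "$RFILE" ]; then
    echo "usage: qsub-R file.R" >&2
    return 1
  fi
  qsub -cwd -j y -N "$(basename "$RFILE")" -S /bin/sh <<EOF
R CMD BATCH "$RFILE"
EOF
}
```

Invoked as qsub_R toodles.R it submits a job named toodles.R, matching the behaviour described above.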

Download it and install it. You'll need to make the ' qsub-R ' symlink to ' 3rd_party/uoa-dos/submit-R ' yourself, although there is one in the package already for lx24-x86: qsub-R.tar (10 KiB, tar)

Thrashing the Grid

Sometimes you just want to give something a good thrashing, right? Never experienced that? Maybe it's just me. Anyway, here are two ideas for submitting lots and lots of jobs:

  1. Write a script that creates jobs and submits them
  2. Submit the same thing a thousand times

There are merits to each of these methods, and both of them mimic typical operation of the grid, so I'm going to explain them both.

Computing every permutation

If you have two lists of values and wish to calculate every permutation, then this method will do the trick. There's a more complicated solution below .

qsub will happily pass on arguments you supply to the script when it runs. Let us modify our test.sh to take advantage of this:

#!/bin/sh
#$ -S /bin/sh
echo Factors $1 and $2
expr $1 + $2

Now, we just need to submit every permutation to Grid:

user@submit:~$ for A in 1 2 3 4 5 ; do for B in 1 2 3 4 5 ; do qsub test.sh $A $B ; done ; done

Away the jobs go to be computed. If we have a look at different jobs we can see that it works. For example, job 487 comes up with:

user@submit:~$ cat test.sh.?487
Factors 3 and 5
8

Right on, brother! That's the same answer as we got previously when we hard coded the values of 3 and 5 into the file. We have algorithm correctness!

If we use qacct to look up the job information we find that it was computed on host mako (shark) and used 1 unit of wallclock and 0 units of CPU.

Computing every permutation, with R

This method of creating job scripts and running them will allow you to compute every permutation of two variables. Note that you can supply arguments to your script, so it is not actually necessary to over-engineer your solution quite this much. This script has the added advantage of not clobbering previous computations. I wrote this solution for Yannan Jiang and Chris Wild and posted it to the r-downunder mailing list in December 2007. ( There is another method of doing this! )

In this particular example the output of the R command is deterministic, so it does not matter that a previous run (which could have taken days of computing time) gets overwritten, however I also work around this problem.

To start with I have my simple template of R commands (template.R):

alpha <- ALPHA
beta <- c(BETA)
# magic happens here
alpha
beta

The ALPHA and BETA parameters change for each time this simulation is run. I have these values stored, one per line, in the files ALPHA and BETA.

ALPHA:

0.9
0.8
0.7

BETA (please note that these values must work in filenames, shell commands, and R code):

0,0,1
0,1,0
1,0,0
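The heart of the submission script below is the sed templating step. Here is a standalone sketch of just that step, requiring no Grid at all, using the first value from each file:

```shell
#!/bin/sh
# Fill template.R for a single ALPHA/BETA pair, the same way submit.sh does.
printf 'alpha <- ALPHA\nbeta <- c(BETA)\n' > template.R
ALPHA=0.9
BETA=0,0,1
FILE="t-${ALPHA}-${BETA}"
sed -e "s/ALPHA/${ALPHA}/" -e "s/BETA/${BETA}/" template.R > ${FILE}.R
cat ${FILE}.R
# prints:
# alpha <- 0.9
# beta <- c(0,0,1)
```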

I have a shell script that takes each combination of ALPHA x BETA, creates a .R file based on the template, and submits the job to the Grid. This is called submit.sh:

#!/bin/sh

if [ "X${SGE_ROOT}" = "X" ] ; then
         echo Run: . /opt/sge/default/common/settings.sh
         exit
fi

cat ALPHA | while read ALPHA ; do
         cat BETA | while read BETA ; do
                 FILE="t-${ALPHA}-${BETA}"

                 # create our R file
                 cat template.R | sed -e "s/ALPHA/${ALPHA}/" -e "s/BETA/${BETA}/" > ${FILE}.R

                 # create a script
                 echo \#!/bin/sh > ${FILE}.sh
                 echo \#$ -S /bin/sh >> ${FILE}.sh
                 echo "if [ -f ${FILE}.Rout ] ; then echo ERROR: output file exists already ; exit 5 ; fi" >> ${FILE}.sh
                 echo R CMD BATCH ${FILE}.R ${FILE}.Rout >> ${FILE}.sh
                 chmod +x ${FILE}.sh

                 # submit job to grid
                 qsub -j y -cwd ${FILE}.sh
         done
done

qstat

When this script runs it will, for each permutation of ALPHA and BETA,

  1. create an R file based on the template, filling in the values of ALPHA and BETA,
  2. create a script that checks if this permutation has been calculated and then calls R,
  3. submit this job to the queue

... and finally shows the jobs waiting in the queue to execute.

Once computation is complete you will have a lot of files waiting in your directory: for each permutation, the generated R file (t-ALPHA-BETA.R), the generated job script (t-ALPHA-BETA.sh), the R output (t-ALPHA-BETA.Rout), and the job's merged stdout/stderr file (t-ALPHA-BETA.sh.oNNN).

The merged stdout/stderr files from when R was run are always empty (unless something goes terribly wrong). For each permutation we receive four files, and there are nine permutations (n ALPHA = 3, n BETA = 3, 3 × 3 = 9), so a total of 36 files are created. (This example has been pared down from the original for purposes of demonstration.)

My initial question to the r-downunder list was how to get the output from R onto stdout, and thus into t-ALPHA-BETA.sh.oNNN instead of t-ALPHA-BETA.Rout; however, in this particular case I have dodged that. In fact, since the job is deterministic, it is better that it writes its output to a known filename, so I can do a one-line test to see whether the job has already been run.

I should also point out the -cwd option to the qsub command, which causes the job to be run in the current directory (which, if it is in your home directory, is accessible in the same place on all machines), rather than in /tmp/*. This allows us to find the R output, since R writes it to the directory it is currently in. Otherwise it could be discarded as a temporary file once the job ends!

Submit the same thing a thousand times

Say you have a job that, for example, pulls in random numbers and runs a simulation, or grabs a work unit from a server, computes it, then quits. (FAH -oneunit springs to mind, although it cannot be run in parallel; refer to the parallel environment setup.) The script is identical every time.

SGE sets the JOB_ID environment variable, which tells you the job number. You can use this as a crude way of generating a unique filename for your output. However, the best approach is to write everything to standard output (stdout) and let the Grid take care of returning it to you.

There are also array jobs: identical tasks differentiated only by an index number, available through the -t option of qsub. Each task sees its index in the environment variable SGE_TASK_ID.
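A common way to use the task number is to have every task pick one line out of a shared parameter file, so a single script serves all tasks. A minimal sketch (params.txt and the default value are illustrative; under qsub -t, SGE sets SGE_TASK_ID for you):

```shell
#!/bin/sh
#$ -S /bin/sh
# Each task processes the line of params.txt matching its task number.
# The default value below is only so the script can be tested outside the Grid.
: ${SGE_TASK_ID:=1}
PARAMS=`sed -n "${SGE_TASK_ID}p" params.txt`
echo "Task ${SGE_TASK_ID} working on: ${PARAMS}"
```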

For this example I will be using the Distributed Sleep Server. The Distributed Sleep Project passes out work units (packages of time) to clients, who then process them. The Distributed Sleep Client, dsleepc, connects to the server to fetch a work unit, which can then be processed using the sleep command. A sample script:

#!/bin/sh
#$ -S /bin/sh
WORKUNIT=`dsleepc`
sleep $WORKUNIT && echo Processed $WORKUNIT seconds

Work units of 300 seconds typically take about five minutes to complete, but are known to be slower on Windows. (The more adventurous can add the -bigunit option to get a larger package for themselves, but note that they take longer to process.)

So, let us submit an array job to the Grid. We are going to submit one job with 100 tasks, and they will be numbered 1 to 100:

user@submit:~$ qsub -t 1-100 dsleep
Your job-array 490.1-100:1 ("dsleep") has been submitted

Job 490, tasks 1 to 100, are waiting to run. Later we can come back and pick up our output from our home directory. You can also visit the Distributed Sleep Project and check the statistics server to see if your work units have been received.

Note that running 100 jobs will fill the default queue, all.q. This has two effects. First, if you have access to any other queues, jobs will be added to those queues and run there. (As the current queue setup overlaps CPUs, this can lead to oversubscription of processing resources, which can cause jobs to be paused, depending on how the queue is set up.) Second, any queues subordinate to all.q will be put on hold until slots free up.

Array jobs, with R

Using the above method of submitting an array job, we can read the task number from within an R script as follows:

# alpha+1 is found in the SGE TASK number (qsub -t)
alphaenv <- Sys.getenv("SGE_TASK_ID")
alpha <- (as.numeric(alphaenv)-1)

Here the value of alpha is pulled from the task number. Two manipulations are done: first it is turned from a string into a number, then it is shifted into the expected range. Task numbers run from 1 upward, but in this case the code wants them to start at 0.

The same can be done with Java by passing the environment value as an argument when invoking the main class.

Advanced methods of queue submission

When you submit your job you have a lot of flexibility over it. Here are some options to consider that may make your life easier. Remember you can always look in the man page for qsub for more options and explanations.

qsub -N timmy test.sh

Here the job is called "timmy" and runs the script test.sh. Your output files will be named timmy.[oe]*

The working directory is usually somewhere in /tmp on the execution host. To use a different working directory, eg, the current directory, use -cwd

qsub -cwd test.sh

To request specific characteristics of the execution host, for example, sufficient memory, use the -l argument.

qsub -l mem_free=2500M test.sh

The above example requests 2500 megabytes (M = 1024x1024 bytes; m = 1000x1000) of free physical memory (mem_free) on the remote host. This means the job won't be run on a machine that has only 2.0GB of memory, and will instead be placed on a machine with a sufficient amount of memory for BEAGLE Genetic Analysis. There are two other options for ensuring you get enough memory:

If your binary is architecture dependent you can ask for a particular architecture.

qsub -l arch=lx24-amd64 test.bin

This can also be done in the script that calls the binary so you don't accidentally forget about including it.

#$ -l arch=lx24-amd64

Resource requests can also be used to ask for a specific host, which defeats the point of letting the Grid find a host for you. Don't do this!

qsub -l hostname=mako test.sh

If your job needs to be run multiple times then you can create an array job. You ask for a job to be run several times, and each run (or task) is given a unique task number which can be accessed through the environment variable SGE_TASK_ID. In each of these examples the script is run 50 times:

qsub -t 1-50 test.sh
qsub -t 76-125 test.sh

You can request a specific queue. Different queues have different characteristics.

qsub -q dnetc.q test.sh

A job can be held until a previous job completes. For example, this job will not run until job 380 completes:

qsub -hold_jid 380 test.sh

Can't figure out why your job isn't running? qstat can tell you:

qstat -j 490
... lots of output ...
scheduling info:            queue instance "dnetc.q@mako.stat.auckland.ac.nz" dropped because it is temporarily not available
                            queue instance "dnetc.q@patiki.stat.auckland.ac.nz" dropped because it is full
                            cannot run in queue "all.q" because it is not contained in its hard queue list (-q)

Requesting licenses

Should you be using software that requires licenses, you should specify this when you submit the job. We currently have two licensed applications set up, Splus and Matlab, but more can easily be added on request.

The Grid engine will hold your job until a Splus license or Matlab license becomes available.

Note: The Grid engine keeps track of the license pool independently of the license manager. If someone is using a license that the Grid doesn't know about, eg, an interactive session left running on a desktop, then the count will be off. Believing a license is available, the Grid will run your job, but Splus will not start and your job will end. Here is a job script that detects this error and allows your job to be retried later:

#!/bin/sh
#$ -S /bin/bash
# run in current directory, merge output
#$ -cwd -j y
# name the job
#$ -N Splus-lic
# require a single Splus license please
#$ -l splus=1
Splus -headless < $1
RETVAL=$?
if [ $RETVAL -eq 1 ] ; then
        echo No license for Splus
        sleep 60
        exit 99
fi
if [ $RETVAL -eq 127 ] ; then
        echo Splus not installed on this host
        # you could try something like this:
        #qalter -l splus=1,h=!`hostname` $JOB_ID
        sleep 60
        exit 99
fi
exit $RETVAL

Please note that the script exits with code 99 to tell the Grid to reschedule this job (or task) later. Note also that the script, upon receiving the error, sleeps for a minute before exiting, thus slowing the loop of errors as the Grid continually reschedules the job until it runs successfully. Alternatively you can exit with error 100, which will cause the job to be held in the error (E) state until manually cleared to run again.

You can clear a job's error state by using qmod -c jobid .

Here's the same thing for Matlab, with only minor differences from the Splus version:

#!/bin/sh
#$ -S /bin/sh
# run in current directory, merge output
#$ -cwd -j y
# name the job
#$ -N ml
# require a single Matlab license please
#$ -l matlab=1

matlab -nodisplay < $1

RETVAL=$?
if [ $RETVAL -eq 1 ] ; then
        echo No license for Matlab
        sleep 60
        exit 99
fi
if [ $RETVAL -eq 127 ] ; then
        echo Matlab not installed on this host, `hostname`
        # you could try something like this:
        #qalter -l matlab=1,h=!`hostname` $JOB_ID
        sleep 60
        exit 99
fi
exit $RETVAL

Save this as "run-matlab". To run your matlab.m file, submit with: qsub run-matlab matlab.m

Processing partial parts of input files in Java

Here is some code I wrote for Lyndon Walker to process a partial dataset in Java.

It comes in two parts: a job script that passes the correct arguments to Java, and some Java code that extracts the correct information from the dataset for processing.

First, the job script gives some Grid task environment variables to Java. Our job script is merely translating from the Grid to the simulation:

java Simulation "$@" $SGE_TASK_ID $SGE_TASK_LAST

This does assume your shell is bash, not csh. If your job is in 10 tasks, then SGE_TASK_ID will be a number between 1 and 10, and SGE_TASK_LAST will be 10. I'm also assuming that you are starting your jobs from 1, but you can also change that setting and examine SGE_TASK_FIRST.

Within Java we now read these variables and act upon them:

sge_task_id   = Integer.parseInt(args[args.length-2]);
sge_task_last = Integer.parseInt(args[args.length-1]);

For a more complete code listing, refer to sun-grid-qsub-java-partial.java (Simulation.java).
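The partitioning arithmetic itself can be sketched in plain shell. The variables are hard-coded here to mimic what the Grid provides, and the 100-row total is just an example:

```shell
#!/bin/sh
# Divide TOTAL rows evenly across SGE_TASK_LAST tasks; this task handles
# rows FIRST..LAST. Under the Grid, qsub -t sets SGE_TASK_ID and
# SGE_TASK_LAST; they are set by hand here for illustration.
SGE_TASK_ID=3
SGE_TASK_LAST=10
TOTAL=100
CHUNK=$(( TOTAL / SGE_TASK_LAST ))
FIRST=$(( (SGE_TASK_ID - 1) * CHUNK + 1 ))
LAST=$(( SGE_TASK_ID * CHUNK ))
echo "Task ${SGE_TASK_ID} of ${SGE_TASK_LAST}: rows ${FIRST}-${LAST}"
# prints: Task 3 of 10: rows 21-30
```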

Preparing confidential datasets

The Grid setup here includes machines that users can log in to, which creates the risk that someone could snag a confidential dataset while it is being processed. One way to keep the files secure is described below.

A script that does this would look like the following:

#!/bin/sh
#$ -S /bin/bash

DATASET=confidential.csv

# check our environment
umask 0077
cd ${TMPDIR}
chmod 0700 .

# find srm
SRM=`which srm`
NOSRM=$?
if [ $NOSRM -eq 1 ] ; then
        echo system srm not found on this host, exiting >> /dev/stderr 
        exit 99
fi

# copy files from data store
RETRIES=0
while [ ${RETRIES} -lt 5 ] ; do
        ((RETRIES++))
        scp user@filestore:/store/confidential/${DATASET} .
        if [ $? -eq 0 ] ; then
                RETRIES=5000
        else
                # wait for up to a minute (MaxStartups 10 by default)
                sleep `expr ${RANDOM} / 542`
        fi
done
if [ ! -f ${DATASET} ] ; then
        # unable to copy dataset after 5 retries, quit but retry later
        echo unable to copy dataset from store >> /dev/stderr
        exit 99
fi
# if you were decrypting the dataset, you would do that here

# copy our code over too
cp /mount/code/*.class .

# process data
java Simulation ${DATASET}

# collect results
# (We are just printing to the screen.)

# clean up
${SRM} -v ${DATASET} >> /dev/stderr

echo END >> /dev/stderr

Code will need to be adjusted to match your particular requirements, but the basic form is sketched out above.

As the confidential data exists only in files and directories that root and the running user can access, and the same precaution is taken on the datastore, only the system administrator and the user who owns the dataset have access to these files.

The one problem here is how to manage password-less scp securely. As this runs unattended, it is not possible to protect the key with a passphrase, nor to forward authentication to a local agent. It may also be possible to grab the packets that make up the key material. There must be a better way to do this. Remember that the job script is stored world-readable in the Grid cell's spool, so nothing secret can be put in there either.

Talk at Department Retreat

I gave a talk about the Sun Grid Engine to the Department on 19 February 2008, giving a quick overview of the need for the grid and how to rearrange tasks to better exploit parallelism. It was aimed at end users and summarises, in neat slides, the reasons for using the grid engine, along with a tutorial and example on how to use it all.

Download: Talk (with notes) PDF 5.9MiB

Question time afterwards was very good. Here are, as I recall them, the questions and answers.

Which jobs are better suited to parallelism?

Q (Ross Ihaka): Which jobs are better suited to parallelism? (Jobs with large data sets do not lend themselves to this sort of parallelism due to I/O overheads.)

A: Most of the jobs being used here are CPU intensive. The grid copies your script to /tmp on the local machine on which it runs. You could copy your data file across as well at the start of the job, thus all your later I/O is local.

(This is a bit of a poor answer. I wasn't really expecting it.) Bayesian priors and multiple identical simulations (eg, MCMC differing only by random numbers) lend themselves well to being parallelised.

Can I make sure I always run on the fastest machine?

A: The grid finds the machine with the least load to run jobs on. If you pile all your jobs onto one host, that host will slow down and become the slowest overall. Submit through the grid and some days you'll get the fast host, some days the slow host, and it is better in the long run. It is also fair to other users. You can force a particular host with -l; however, that is selfish.

Preemptable queues?

Q (Nicholas Horton): Is there support for preemptable queues? A person who paid for a certain machine might like it to be available only to them when they require it all for themselves.

A: Yes, the Grid has support for queues like that; it can all be configured. This particular example will have to be looked into further. Beagle.q, as an example, only runs on paikea and overlaps with all.q. Also, when the load on paikea gets too high, jobs in a certain queue (dnetc.q) are stopped.

An updated answer: the owner of a host can have an exclusive queue that preempts the other queues on the host. When the system load is too high, less important jobs can be suspended using suspend_thresholds.

Is my desktop an execution host?

Q (Ross Ihaka): Did I see my desktop listed earlier?

A: No. So far the grid is only running on the servers in the basement and the desktops in the grad room. Desktops in staff offices and used by PhD candidates will have to opt in.

(Ross Ihaka) Offering your desktop to run as an execution host increases the total speed of the grid, but your desktop may run slower at times. It is a two way street.

Is there job migration?

A: It's crude, and depends on your job. If something goes wrong (eg, the server crashes, power goes out) your job can be restarted on another host. When queue instances become unavailable (eg, we're upgrading paikea) they can send a signal to your job, telling it to save its work and quit, then can be restarted on another host.

Migration to faster hosts

Q (Chris Wild): What happens if a faster host becomes available while my job is running?

A: Nothing. Your job will continue running on the host it is on until it ends. If a host is overloaded through no fault of the grid's, some jobs can be suspended until load decreases. The grid isn't migrating jobs. The best method is to break your job down into smaller jobs, so that when the next part of the job starts, it is put onto the best host currently available.

Over sufficient jobs it will become apparent that the faster host is processing more jobs than a slower host.

Desktops and calendars

Q (Stephane Guindon): What about when I'm not at my desktop. Can I have my machine be on the grid then, and when I get to the desktop the jobs are migrated?

A: Yes, we can set up calendars so that at certain times no new jobs will be started on your machine. Jobs that are already running will continue until they end. (Disabling the queue.) Since some jobs run for days this can appear to have no influence on how many jobs are running. Alternatively jobs can be paused, which frees up the CPU, but leaves the job sitting almost in limbo. (Suspending the queue.) Remember the grid isn't doing migration. It can stop your job and run it elsewhere (if you're using the -notify option on submission and handling the USR1 signal).

Jobs under the grid

Q (Sharon Browning): How can I tell if a job is running under the grid's control? It doesn't show this under top .

A: Try ps auxf . You will see the job taking a lot of CPU time, the parent script, and above that the grid (sge_shepherd and sge_execd).

Talk for Department Seminar

On September 11 I gave a talk to the Department covering:

Download slides with extensive notes: Supercomputing and You (PDF 3MiB)

A range of good questions followed.

Summary

In summary, I heartily recommend the Sun Grid Engine. After a few days of installing, configuring, and messing around, I am very impressed with what can be done with it.

Try it today.

[Aug 17, 2018] Rocks 7.0 Manzanita (CentOS 7.4)

Aug 17, 2018 | www.rocksclusters.org

Operating System Base

Rocks 7.0 (Manzanita) x86_64 is based upon CentOS 7.4 with all updates available as of 1 Dec 2017.

Building a bare-bones compute cluster

Building a more complex cluster

In addition to the above, select the following rolls:

  • area51
  • fingerprint
  • ganglia
  • kvm (used for virtualization)
  • hpc
  • htcondor (used independently or in conjunction with sge)
  • perl
  • python
  • sge
  • zfs-linux (used to build reliable storage systems)
Building Custom Clusters

If you wish to build a custom cluster, you must choose from our a la carte selection, but make sure to download the required base, kernel, and both CentOS rolls. The CentOS rolls include CentOS 7.4 with updates pre-applied. Most users will want the fully updated OS so that other software can be added.

MD5 Checksums

Please double check the MD5 checksums for all the rolls you download.

Downloads

All ISOs are available for downloads from here . Individual links are listed below.

  • kernel: Rocks Bootable Kernel Roll (required)
  • base: Rocks Base Roll (required)
  • core: Core Roll (required)
  • CentOS: CentOS Roll (required)
  • Updates-CentOS: CentOS Updates Roll (required)
  • kvm: Support for building KVM VMs on cluster nodes
  • ganglia: Cluster monitoring system from UCB
  • area51: System security related services and utilities
  • zfs-linux: ZFS On Linux Roll. Build and manage multi-terabyte filesystems.
  • fingerprint: Fingerprint application dependencies
  • hpc: Rocks HPC Roll
  • htcondor: HTCondor High Throughput Computing (version 8.2.8)
  • sge: Sun Grid Engine (Open Grid Scheduler) job queueing system
  • perl: Support for newer version of Perl
  • python: Python 2.7 and Python 3.x
  • openvswitch: Rocks integration of Open vSwitch

[Aug 17, 2018] Installation of Son of Grid Engine(SGE) on CentOS7 by byeon iksu

Oct 15, 2017 | biohpc.blogspot.com


SGE Master installation

master# hostnamectl set-hostname qmaster.local

master# vi /etc/hosts
192.168.56.101 qmaster.local qmaster
192.168.56.102 compute01.local compute01

master# mkdir -p /BiO/src
master# yum -y install epel-release
master# yum -y install jemalloc-devel openssl-devel ncurses-devel pam-devel libXmu-devel hwloc-devel hwloc hwloc-libs java-devel javacc ant-junit libdb-devel motif-devel csh ksh xterm db4-utils perl-XML-Simple perl-Env xorg-x11-fonts-ISO8859-1-100dpi xorg-x11-fonts-ISO8859-1-75dpi
master# groupadd -g 490 sgeadmin
master# useradd -u 495 -g 490 -r -m -d /home/sgeadmin -s /bin/bash -c "SGE Admin" sgeadmin
master# visudo
%sgeadmin ALL=(ALL) NOPASSWD: ALL
master# cd /BiO/src
master# wget http://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/sge-8.1.9.tar.gz
master# tar zxvfp sge-8.1.9.tar.gz
master# cd sge-8.1.9/source/
master# sh scripts/bootstrap.sh && ./aimk && ./aimk -man
master# export SGE_ROOT=/BiO/gridengine && mkdir $SGE_ROOT
master# echo Y | ./scripts/distinst -local -allall -libs -noexit
master# chown -R sgeadmin.sgeadmin /BiO/gridengine

master# cd $SGE_ROOT
master# ./install_qmaster
  1. Press Enter at the intro screen.
  2. Press "y" and then specify sgeadmin as the user id.
  3. Leave the install dir as /BiO/gridengine.
  4. You will now be asked about port configuration for the master; normally you would choose the default (2), which uses the /etc/services file.
  5. Accept the sge_qmaster info.
  6. You will be asked the same port question for the execution daemon; again, normally choose the default (2).
  7. Accept the sge_execd info.
  8. Leave the cell name as "default".
  9. Enter an appropriate cluster name when requested.
  10. Leave the spool dir as is.
  11. Press "n" for no Windows hosts!
  12. Press "y" (permissions are set correctly).
  13. Press "y" for all hosts in one domain.
  14. If you have Java available on your qmaster and wish to use SGE Inspect or SDM, enable the JMX MBean server and provide the requested information; otherwise answer "n" at this point.
  15. Press Enter to accept the directory creation notification.
  16. Enter "classic" for classic spooling (berkeleydb may be more appropriate for large clusters).
  17. Press Enter to accept the next notice.
  18. Enter "20000-20100" as the GID range (increase this range if you have execution nodes capable of running more than 100 concurrent jobs).
  19. Accept the default spool dir or specify a different folder (for example, if you wish to use a shared or local folder outside of SGE_ROOT).
  20. Enter an email address that will be sent problem reports.
  21. Press "n" to refuse to change the parameters you have just configured.
  22. Press Enter to accept the next notice.
  23. Press "y" to install the startup scripts.
  24. Press Enter twice to confirm the following messages.
  25. Press "n" for a file with a list of hosts.
  26. Enter the names of the hosts that will be able to administer and submit jobs (press Enter alone to finish adding hosts).
  27. Skip shadow hosts for now (press "n").
  28. Choose "1" for normal configuration and agree with "y".
  29. Press Enter to accept the next message, "n" to refuse to see the previous screen again, and finally Enter to exit the installer.

master# cp /BiO/gridengine/default/common/settings.sh /etc/profile.d/
master# qconf -ah compute01.local
compute01.local added to administrative host list

master# yum -y install nfs-utils
master# vi /etc/exports
/BiO 192.168.56.0/24(rw,no_root_squash)

master# systemctl start rpcbind nfs-server
master# systemctl enable rpcbind nfs-server

SGE Client installation

compute01# yum -y install hwloc-devel
compute01# hostnamectl set-hostname compute01.local
compute01# vi /etc/hosts
192.168.56.101 qmaster.local qmaster
192.168.56.102 compute01.local compute01

compute01# groupadd -g 490 sgeadmin
compute01# useradd -u 495 -g 490 -r -m -d /home/sgeadmin -s /bin/bash -c "SGE Admin" sgeadmin

compute01# yum -y install nfs-utils
compute01# systemctl start rpcbind
compute01# systemctl enable rpcbind
compute01# mkdir /BiO
compute01# mount -t nfs 192.168.56.101:/BiO /BiO
compute01# vi /etc/fstab
192.168.56.101:/BiO /BiO nfs defaults 0 0

compute01# export SGE_ROOT=/BiO/gridengine
compute01# export SGE_CELL=default
compute01# cd $SGE_ROOT
compute01# ./install_execd
compute01# cp /BiO/gridengine/default/common/settings.sh /etc/profile.d/

[Jun 21, 2018] How to install pbs on compute node and configure the server and compute node - Users-Site Administrators - PBS Professional Op

Jun 21, 2018 | community.pbspro.org

How to install pbs on compute node and configure the server and compute node? Users/Site Administrators

Joey Jun '16 Hi guys,
I am new to HPC and PBS or Torque. I am able to install PBS Pro from source code on my head node, but not sure how to install the compute node and configure it. I didn't see any documentation in the GitHub repo either. Can anyone give me some help? Thanks

buchmann Jun '16 Install is pretty similar on the compute nodes; however, you do not need the "server" parts.
There are OK docs on the Altair "pro" site; see the answer to the previous question "documentation-is-missing/81".

In short, use the Altair docs for v13, and/or the INSTALL file procedure. (Or install from pre-built binaries.)
Actual method will depend on your system type etc.

I prefer to install using pre-compiled RPMs (CentOS72 systems), which presently means that I will compile these from tarball+spec-file (slightly modified spec-file).

Hope this helps.
/Bjarne

subhasisb Jun '16 @Joey thanks for joining the pbspro forum.

You can find the documentation about pbspro here: https://pbspro.atlassian.net/wiki/display/PBSPro/User+Documentation 730

Kindly do not hesitate to post questions about any specific issues you are facing.

Thanks,
Subhasis Joey Jun '16 1 Thanks for your reply.

I rebuilt the CentOS72 rpm with the source from Centos7.zip,
installed pbspro-server-14.1.0-13.1.x86_64.rpm on my head node, and
installed pbspro-execution-14.1.0-13.1.x86_64.rpm on my compute node.
On the head node I created /var/spool/pbs/server_priv/nodes with the following:

computenode1 np=1

/etc/pbs.conf:
PBS_SERVER=headnode
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=0
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp

On the compute node:

/var/spool/pbs/mom_priv/config as following

$logevent 0x1ff
$clienthost headnode
$restrict_user_maxsysid 999

/etc/pbs.conf
PBS_SERVER=headnode
PBS_START_SERVER=0
PBS_START_SCHED=0
PBS_START_COMM=0
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/bin/scp

After that I started PBS on the head node and the compute node without error:
#/etc/init.d/pbs start
But when I try to run pbsnodes -a, it tells me:
pbsnodes: Server has no node list
If I submit a script, it just sits in the queue.

firewalld is turned off on both servers, and they can ping each other.

Can anyone give me some help? Thanks

subhasisb Jul '16 Hi @Joey ,

Unlike torque, pbspro uses a real relational database underneath to store information about nodes, queues, jobs etc. Thus creating a nodes file is not supported under pbspro.

To add a node to pbs cluster, use the qmgr command as follows:

qmgr -c "create node hostname"

HTH
regards,
Subhasis

Joey Jul '16 Thanks for your reply. I thought PBS and Torque were the same, except one is open source and one is commercial.

subhasisb Jul '16 Hi @Joey

They might feel similar, since Torque was based on the OpenPBS codebase. OpenPBS was a version of PBS released as open source many years back.

Since then, Altair Engineering has put a huge amount of effort into PBS Professional, adding tons of features and improvements in scalability, robustness, and ease of use over the decades, which resulted in it becoming the number one workload manager in the HPC world. Altair has now open-sourced PBS Professional.

So, pbspro is actually very different from torque in terms of capability and performance, and is actually a completely different product.

Let us know if you need further information in switching to pbspro.

Thanks and Regards,
Subhasis

sxy Apr '17 Hi Subhasis,

To add a node to pbs cluster, use the qmgr command as follows:

qmgr -c "create node hostname"

If a site has a few hundred compute nodes, the above method is very tedious.
Is there an easy/quick way to register compute nodes with the PBS server, like the nodes file in Torque?

Thanks,

Sue

mkaro Apr '17 This is one way to accomplish it:

while read line; do [ -n "$line" ] && qmgr -c "create node $line"; done <nodefile

where nodefile contains the list of nodes, one per line.
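You can verify the loop without a PBS server by substituting echo for qmgr; the nodefile contents here are made up for the demonstration:

```shell
#!/bin/sh
# Dry run of the bulk node-creation loop: print each qmgr command
# instead of executing it. Blank lines are skipped, as in the original.
printf 'node01\nnode02\n\nnode03\n' > nodefile
while read line; do
    [ -n "$line" ] && echo qmgr -c "create node $line"
done < nodefile
```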

[Jun 13, 2018] The Fundamentals of Building an HPC Cluster by Jeff Layton

Jun 13, 2018 | www.admin-magazine.com

The King in Alice in Wonderland said it best, "Begin at the beginning." The general goal of HPC is either to run applications faster or to run problems that can't or won't run on a single server. To do this, you need to run parallel applications across separate nodes. Although you could use a single node and then create two VMs, it's important to understand how applications run across physically different servers and how you administer a system of disparate physical hardware.

With this goal in mind, you can make some reasonable assumptions about the HPC system. If you are interested in parallel computing using multiple nodes, you need at least two separate systems (nodes), each with its own operating system (OS). To keep things running smoothly, the OS on both nodes should be identical. (Strictly speaking, it doesn't have to be this way, but otherwise, it is very difficult to run and maintain.) If you install a package on node 1, then it needs to be installed on node 2 as well. This lessens a source of possible problems when you have to debug the system.

The second thing your cluster needs is a network to connect the nodes so they can communicate to share data, the state of the solution to the problem, and possibly even the instructions that need to be executed. The network can theoretically be anything that allows communication between nodes, but the easiest solution is Ethernet. In this article, I am initially going to consider a single network, but later I will consider more than one.

Storage in each node can be as simple as an SD card to hold the OS, the applications, and the data. In addition to some basic storage, and to make things a bit easier, I'll create a shared filesystem from the master node to the other nodes in the cluster.

The most fundamental HPC architecture and software is pretty unassuming. Most distributions have the basic tools for making a cluster work and for administering the tools; however, you will most likely have to add the tools and libraries for the parallel applications (e.g., a message-passing interface [MPI] library or libraries, compilers, and any additional libraries needed by the application). Perhaps surprisingly, the other basic tools are almost always installed by default on an OS; however, before discussing the software, you need to understand the architecture of a cluster.

Architecture

The architecture of a cluster is pretty straightforward. You have some servers (nodes) that serve various roles in a cluster and that are connected by some sort of network. That's all. It's that simple. Typically the nodes are as similar as possible, but they don't have to be; however, I highly recommend that they be as similar as possible because it will make your life much easier. Figure 1 is a simple illustration of the basic architecture.

Figure 1: Generic cluster layout.

Almost always you have a node that serves the role of a "master node" (sometimes also called a "head node"). The master node is the "controller" node or "management" node for the cluster. It controls and performs the housekeeping for the cluster and many times is the login node for users to run applications. For smaller clusters, the master node can be used for computation as well as management, but as the cluster grows larger, the master node becomes specialized and is not used for computation.

Other nodes in the cluster fill the role of compute nodes, which describes their function. Typically compute nodes don't do any cluster management functions; they just compute. Compute nodes are usually systems that run the bare minimum OS – meaning that unneeded daemons are turned off and unneeded packages are not installed – and have the bare minimum hardware.

As the cluster grows, other roles typically arise, requiring that nodes be added. For example, data servers can be added to the cluster. These nodes don't run applications; rather, they store and serve data to the rest of the cluster. Additional nodes can provide data visualization capabilities within the cluster (usually remote visualization), or very large clusters might need nodes dedicated to monitoring the cluster or to logging in users to the cluster and running applications.

For a simple two-node cluster that you might use as your introduction to HPC, you would typically designate one master node and one compute node. However, because you only have two nodes, applications would most likely run on both – because why waste 50% of your nodes?

The network connecting the cluster nodes could be any networking technology, but the place to start is with wired Ethernet, which ranges from 100Mbps to 56Gbps; however, I'll stick to the more common Fast Ethernet (100Mbps) or Gigabit Ethernet (1,000Mbps).

The network topology you use for clusters is important because it can affect application performance. If you are just starting out, stick to the basics. A simple network layout has a single switch with all nodes plugged in to that switch. This setup, called a fat tree topology, has only one level and is simple and effective, particularly when building smaller systems. As systems get larger, you can stay with the fat tree topology, but you will likely need more levels of switches. If you re-use your existing switches, design the topology carefully so you don't introduce bottlenecks.

For smaller systems, Ethernet switches are pretty inexpensive, costing just a few dollars per port. Switches are better than an Ethernet hub, but if all you have is a hub, you can use it. Although a hub will limit performance, it won't stop the cluster from working.

Because you're interested in "high performance," you want to do everything possible to keep the cluster network from reducing performance. A common approach is to put the cluster on a private Ethernet network. Private address space is not routable, so the compute nodes are effectively "hidden" from the public network, allowing you to separate your cluster logically from it.

However, you will want to log in to the cluster from the public network, and the way to do that when the cluster is on a private network is to add a second network interface controller (NIC) to the master node. This NIC has a public IP address that allows you to log in to the cluster. Only the master node needs the public address, because there is no reason for compute nodes to have two addresses. (You want them to stay private.) For example, you can make the public address for the master node something like 72.x.x.x and the private address something like 10.x.x.x. The order of the network interfaces doesn't make a huge difference, but you have to pay attention to it when installing the OS.
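As a sketch, on a Red Hat-style system the two interfaces on the master node might be configured like this (interface names and addresses are illustrative; the private address matches the 10.1.0.250 master used in the NTP example later):

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 -- public-facing NIC
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp        # or static, with your real public address

# /etc/sysconfig/network-scripts/ifcfg-eth1 -- private cluster network
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.1.0.250
NETMASK=255.255.255.0
```

Compute nodes get only the second kind of configuration, each with its own 10.x.x.x address.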

You can give the master node two private addresses if you are behind a network address translator (NAT). This configuration is very common with home routers, which are also NAT devices. For example, my home network has an Internet router that is really a NAT. It converts packets from a private network, such as 192.168.x.x, to the address of the router (on the Internet) and vice versa. My simple clusters have a master node with a "public" IP of 192.168.x.x (public relative to the cluster) and a second NIC with an address of 10.x.x.x, which is the cluster's private network.

Another key feature of a basic cluster architecture is a shared directory across the nodes. Strictly speaking this isn't required, but without it, some MPI applications would not run. Therefore, it is a good idea simply to use a shared filesystem in the cluster. NFS is the easiest to use because both server and client are in the kernel, and the distribution should have the tools for configuring and monitoring NFS.

The classic NFS approach to a shared directory is to export a directory from the master node to the compute nodes. You can pick any directory to export, but many times people just share /home from the master node, although sometimes they also export a new directory, such as /shared. The compute nodes mount the shared directory as /home as well; therefore, anything in a node's local /home will be hidden and inaccessible.
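A minimal sketch of the export, assuming the master is 10.1.0.250 and the cluster network is 10.1.0.0/24 (addresses and options are illustrative; no_root_squash is a common but debatable convenience on private clusters):

```shell
# On the master node, /etc/exports:
#   /home 10.1.0.0/24(rw,no_root_squash,sync)
# then apply it with:
#   exportfs -ra

# On each compute node, an /etc/fstab entry mounts it over the local /home:
#   10.1.0.250:/home  /home  nfs  defaults  0 0
# or mount it by hand:
#   mount -t nfs 10.1.0.250:/home /home
```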

Of course, you can get much fancier and more complicated, and you might have good reasons to do so, but in general you should adopt the KISS (Keep It Simple Silly) approach. Simple means it is easier to debug problems. Simple also means it's easier to reconfigure the cluster if you want (or need). With the architecture established, I'll turn to the software you'll need.

Software Layers

Unfortunately, no single secret magic pixie dust can be sprinkled on a cluster to make it work magically and run applications in parallel. It takes careful thought and planning to operate a set of separate systems as a single system. This is especially true for the cluster software.

This article is really about the basics of HPC, so I want to start with the basic software you need to run parallel applications. However, additional tools can be added to make operating the cluster much easier for the administrator, as well as tools to make life easier for people using the cluster. Rather than just provide yet another list of these tools, I want to present them in three layers. The first layer is the basic software you need and really nothing extra. The second layer adds some administrative tools to make it easier to operate the cluster, as well as tools to reduce problems when running parallel applications. The third layer adds more sophisticated cluster tools and the concept of monitoring, so you can understand what's happening.

Layer 1: Software Configuration

The first layer of software contains only the minimum needed to run parallel applications. Obviously, the first thing you need is an OS. The typical installation options are usually good enough; they install almost everything you need.

The next thing you need is a set of MPI libraries, such as Open MPI or MPICH. These are the libraries you will use for creating parallel applications and running them on your cluster. You can find details on how to build and install them on their respective websites.

Each node has to have the same libraries for MPI applications to run. You have two choices at this point: build, install, and set up the appropriate paths for the libraries in your shared directory, or build and install the libraries on each node individually. The easiest choice is to build and install them once into the shared directory.
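A sketch of the shared-directory approach with Open MPI (the version number and install prefix are illustrative; any NFS-shared path works):

```shell
# Build Open MPI once on the master node and install it into the
# NFS-shared tree so every node sees the same copy:
tar xzf openmpi-4.1.5.tar.gz
cd openmpi-4.1.5
./configure --prefix=/home/shared/openmpi-4.1.5
make -j4 && make install

# On every node (e.g., in a shared shell profile), point at that copy:
export PATH=/home/shared/openmpi-4.1.5/bin:$PATH
export LD_LIBRARY_PATH=/home/shared/openmpi-4.1.5/lib:$LD_LIBRARY_PATH
```

Because the prefix lives on the shared filesystem, upgrading MPI means rebuilding in one place rather than touching each node.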

The next, and actually last, piece of software you need is SSH. More specifically, you need to be able to SSH to and from each node without a password, allowing you to run the MPI applications easily. Make sure, however, that you set up SSH after you have NFS working across the cluster and each node has mounted the exported directory.
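A minimal sketch of the passwordless SSH setup. Because /home is NFS-shared across the cluster, appending the public key to authorized_keys on the master makes it visible on every node at once; the key type and paths below are the common defaults:

```shell
# Generate a password-less key pair once (skipped if one already exists),
# then authorize it. With /home NFS-mounted on all nodes, this single
# authorized_keys file is seen cluster-wide.
KEYDIR="${KEYDIR:-$HOME/.ssh}"
mkdir -p "$KEYDIR" && chmod 700 "$KEYDIR"
if [ ! -f "$KEYDIR/id_rsa" ]; then
    ssh-keygen -q -t rsa -N "" -f "$KEYDIR/id_rsa"
fi
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
```

Afterward, `ssh <node> hostname` from the master should print the node's hostname without prompting for a password.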

In addition to NFS across the cluster, you need the same users and groups on the nodes. Because you have to create the same user on every node (recall that the OS is specific to each node), this can be a monumental pain if you have a few thousand nodes.
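One way to keep accounts consistent, sketched below, is to always create users with explicit, fixed UIDs and GIDs so that file ownership on the NFS-shared /home matches on every node (the name and ID numbers are illustrative; the commands must run on each node, e.g., via a parallel shell):

```shell
# Same group and user, same numeric IDs, on every node:
groupadd -g 1001 hpcusers
useradd -u 1001 -g 1001 -m laytonjb
```

If the UIDs differ between nodes, files written over NFS appear to belong to the wrong user on some nodes, which is a classic source of confusing permission errors.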

Running applications with this configuration is not too difficult because the nodes have a shared directory. Note also that you can have more than one shared directory. I'll assume that you compile your MPI application on your master node in your home directory (e.g., /home/laytonjb/bin/<app>, where <app> is the executable). The directory /home is shared across the cluster, so each node can access the application and the same input and output files (presumably the input and output files are in the shared directory).

As the application starts, SSH is used to communicate between MPI ranks (the MPI processes). Because you can SSH without passwords, the application should run without problems. The details of running your MPI application depend on your MPI library, which typically provides a simple script or small executable for launching the application.

This software configuration is the bare minimum to allow you to run applications. Even then you might have some issues, but with some careful consideration you should be able to run applications.

Layer 2: Architecture and Tools

The next layer of software adds tools to help reduce cluster problems and make it easier to administer. Using the basic software mentioned in the previous section, you can run parallel applications, but you might run into difficulties as you scale your system, including:

  1. Running commands on each node (parallel shell)
  2. Configuring identical nodes (package skew)
  3. Keeping the same time on each node (NTP)
  4. Running more than one job (job scheduler/resource manager)

These issues arise as you scale the cluster, but even for a small two-node cluster, they can become problems.

First, you need to be able to run the same command on every node, so you don't have to SSH to each and every node. One solution would be to write a simple shell script that takes the command line arguments as the "command" and then runs the command on each node using SSH. However, what happens if you only want to run the command on a subset of the nodes? What you really need is something called a parallel shell.
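Such a script might look like the following sketch (node names are illustrative; the DRY_RUN switch is a convenience added here so the loop can be exercised without SSH access):

```shell
#!/bin/bash
# Run the same command on a list of nodes over SSH, in parallel.
# Usage: run_on_nodes "<command>" node1 node2 ...
# Set DRY_RUN=1 to only print what would be executed.
run_on_nodes() {
    local cmd="$1"; shift
    local node
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run on $node: $cmd"
        else
            ssh "$node" "$cmd" &
        fi
    done
    wait   # wait for all background SSH sessions to finish
}

# Example: check the date on two (hypothetical) compute nodes
DRY_RUN=1 run_on_nodes "date" n0001 n0002
```

Running a subset of nodes is then just a matter of which names you pass on the command line, which is exactly the convenience a real parallel shell like pdsh generalizes.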

A number of parallel shell tools are available; the most common is pdsh, which lets you run the same command across a set of nodes. However, simply having a parallel shell doesn't mean the cluster will magically solve all problems, so you have to develop some procedures and processes. More specifically, you can use a parallel shell to overcome the second issue: package skew.

Package skew can cause lots of problems for HPC admins. If you have an application that runs fine one day, but try it again the next day and it won't run, you have to start looking for reasons why. Perhaps during the 24-hour period, a node that had been down suddenly comes back to life, and you start running applications on it. That node might not have the same packages or the same versions of software as the other nodes. As a result, applications can fail, and they can fail in weird ways. Using a parallel shell, you can check that each node has the package installed and that the versions match.

To help with package skew, I recommend that after first building the cluster and installing a parallel shell, you start examining key components of the installation on every node. For example, check the following:

      grep bogomips /proc/cpuinfo
      grep MemTotal /proc/meminfo

Many more package versions and pieces of system information can be checked and stored in a spreadsheet for future reference. The point is to do this at the very beginning and then develop a process or procedure for periodically rechecking the information. That way, you can quickly find package skew problems as they occur and correct them.
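One simple way to spot skew, sketched below, is to collect the same listing from every node (e.g., `rpm -qa` run via your parallel shell, redirected to one file per node) and diff the files pairwise. The sample data here is illustrative:

```shell
# Report package-list differences between two nodes. Each argument is a
# file containing one package name per line, as collected from a node.
pkg_skew() {
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    diff "$1.sorted" "$2.sorted"
}

# Illustrative sample data for two hypothetical nodes:
d=$(mktemp -d)
printf 'bash-5.1\nopenmpi-4.1.5\n' > "$d/n0001"
printf 'bash-5.1\nopenmpi-4.0.7\n' > "$d/n0002"
pkg_skew "$d/n0001" "$d/n0002" || true   # prints a diff of the mismatched openmpi versions
```

An empty diff means the two nodes agree; any output pinpoints exactly which package (or version) has drifted.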

I also recommend keeping a good log so that if a node is down when you install or update packages, you can come back to it when the node is back up. Otherwise, you start getting package skew in your nodes and subsequent problems.

The third issue to overcome is keeping the same time on each node. The Network Time Protocol (NTP) synchronizes system clocks. Most distributions install ntp by default and enable it, but be sure you check for it on each node in the cluster – and check the version of ntpd as well.

Use chkconfig, if the distribution has this package, to check that the ntp service is enabled. Otherwise, you will have to look at the processes running on the nodes to see whether ntpd is listed (hint – use your parallel shell). Configuring NTP can be a little tricky, because you have to pay attention to the architecture of the cluster.

On the master node, make sure that the NTP configuration file points to external servers (outside the cluster) and that the master node can resolve these server names (try using ping or nslookup on each server). Also be sure the ntpd daemon is running.

For nodes that are on a private network that doesn't have access to the Internet, you should configure NTP to use the master node as the timekeeper. This can be done by editing /etc/ntp.conf and changing the NTP servers to point to the master node's IP address. Roughly, it should look something like Listing 1. The IP address of the master node is 10.1.0.250. Be sure to check that the compute nodes can ping this address. Also be sure that ntp starts when the nodes are booted.

[root@test1 etc]# more ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
 
#driftfile /var/lib/ntp/drift
 
restrict default ignore
restrict 127.0.0.1
server 10.1.0.250
restrict 10.1.0.250 nomodify

The last issue to address is the job scheduler (also called a resource manager). This is a key element of HPC and can be used even for small clusters. Roughly speaking, a job scheduler will run jobs (applications) on your behalf when the resources are available on the cluster, so you don't have to sit around and wait for the cluster to be free before you run applications. Rather, you can write a few lines of script and submit it to the job scheduler. When the resources are available, it will run your job on your behalf. (Resource managers allow HPC researchers to actually get some sleep.)

In the script, you specify the resources you need, such as the number of nodes or number of cores, and you give the job scheduler the command that runs your application, such as:

mpirun -np 4 <executable>
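For example, with Slurm the few lines of script might look like the following (the job name, resource counts, time limit, and executable are all illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=my_mpi_job     # illustrative job name
#SBATCH --nodes=2                 # ask for two nodes
#SBATCH --ntasks-per-node=2       # two MPI ranks per node
#SBATCH --time=00:30:00           # wall-clock limit

mpirun -np 4 ./my_app             # hypothetical executable
```

You submit it with `sbatch`, and the scheduler runs it on your behalf when two nodes become free.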

Among the resource managers available, many are open source, and they usually aren't too difficult to install and configure; however, be sure you read the installation guide closely. Examples of open source resource managers include Slurm, Torque, and Grid Engine.

With these issues addressed, you now have a pretty reasonable cluster with some administrative tools. Although it's not perfect, it's most definitely workable. However, you can go to another level of tools, which I refer to as the third layer, to really make your HPC cluster sing.

Layer 3: Deep Administration

The third level of tools gets you deeper into HPC administration and begins to gather more information about the cluster, so you can find problems before they happen. The tools I will discuss briefly are cluster management toolkits, monitoring tools, environment modules, and network segmentation.

A cluster management tool is really a toolkit that automates the configuration, launching, and management of compute nodes from the master node (or a node designated as master). In some cases, the toolkit will even install the master node for you. A number of open source cluster management tools are available, including Warewulf, xCAT, and Rocks.

Some very nice commercial tools exist as well.

The tools vary in their approach, but they typically allow you to create compute nodes that are part of the cluster. This can be done via images, in which a complete image is pushed to the compute node, or via packages, in which specific packages are installed on the compute nodes. How this is accomplished varies from tool to tool, so be sure you read about them before installing them.

The coolest thing about these tools is that they remove the drudgery of installing and managing compute nodes. Even with four-node clusters, you don't have to log in to each node and fiddle with it. The ability to run a single command and re-install identical compute nodes can eliminate so many problems when managing your cluster.

Many of the cluster management tools also include tools for monitoring the cluster. For example, being able to tell which compute nodes are up or down, or which compute nodes are using a great deal of CPU (and which aren't), is important information for HPC administrators. Statistics on the utilization of your cluster are also useful when it's time to ask the funding authorities for additional hardware, whether that be the household CFO, a university, or an agency such as the National Science Foundation. Regardless of who it is, they will want to see how heavily the cluster is being used.

Several monitoring tools are appropriate for HPC clusters, but a near-universal choice is Ganglia. Some of the cluster tools come preconfigured with Ganglia, and some don't, requiring a separate installation. By default, Ganglia comes with some predefined metrics, but the tool is very flexible and allows you to write simple code to gather specific metrics from your nodes.

Up to this point, you have the same development tools, the same compilers, the same MPI libraries, and the same application libraries installed on all of your nodes. However, what if you want to install and use a different MPI library? Or what if you want to try a different version of a particular library? At this point you would have to stop all jobs on the cluster, install the libraries or tools you want, make sure they are in the default path, and then start the jobs again. This process sounds like an accident waiting to happen. The remedy is called environment modules.

Originally, environment modules were developed to address the problem of having applications that need different libraries or compilers by allowing you to modify your user environment dynamically with module files. You can load a module file that specifies a specific MPI library or makes a specific compiler version the default. After you build your application using these tools and libraries, if you run an application that uses a different set of tools, you can "unload" the first module file and load a new module file that specifies a new set of tools. It's all very easy to do with a job script and is extraordinarily helpful on multiuser systems.
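A typical session might look like the following sketch (module names and versions are illustrative, and the version numbers echo the Open MPI examples elsewhere in this page):

```shell
module avail                     # list available module files
module load mpi/openmpi/1.6.5    # select an MPI stack
mpicc -o my_app my_app.c         # build against it
module unload mpi/openmpi/1.6.5  # switch stacks without touching the system
module load mpi/openmpi/1.7.2
```

Behind the scenes, each load simply prepends the right bin and lib directories to PATH and LD_LIBRARY_PATH for the current shell, which is why no jobs elsewhere on the cluster are disturbed.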

Lmod is a newer implementation of environment modules that adds module hierarchies (in essence, module dependencies), so that a single module "load" command can load a whole series of modules. Lmod is currently under very active development.

Up to now I have assumed that all traffic in the cluster, including administration, storage, and computation, use the same network. For improved computational performance or improved storage performance, though, you might want to contemplate separating the traffic into specific networks. For example, you might consider a separate network just for administration and storage traffic, so that each node has two private networks: one for computation and one for administration and storage. In this case, the master node might have three network interfaces.

Separating the traffic is pretty easy: give each network interface (NIC) in a node an IP address in a different address range. For example, eth0 might be on a 10.0.1.x network and eth1 on a 10.0.2.x network. Although theoretically you could give all interfaces addresses in the same IP range, different ranges make administration easier. Now, when you run MPI applications, you use addresses in 10.0.1.x; for NFS and any administration traffic, you use addresses in 10.0.2.x. In this way, you isolate computational traffic from all other traffic.
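With Open MPI, for example, you can pin MPI traffic to the compute network by naming the interface or its subnet (interface names and addresses are illustrative):

```shell
# Force MPI point-to-point (TCP) traffic onto the compute network (eth0),
# leaving eth1 free for NFS and administration traffic:
mpirun --mca btl_tcp_if_include eth0 -np 4 ./my_app

# Alternatively, select by subnet instead of interface name:
mpirun --mca btl_tcp_if_include 10.0.1.0/24 -np 4 ./my_app
```

NFS needs no special handling: it naturally uses whichever network the server address in /etc/fstab belongs to.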

The upside to isolating traffic is additional bandwidth in the networks. The downside is twice as many ports, twice as many cables, and a little more cost. However, if the cost and complexity aren't great, using two networks while you are learning cluster administration, or even writing applications, is recommended.

Summary

Stepping back to review the basics is a valuable exercise. In this article I wanted to illustrate how someone could get started creating their own HPC system. If you have any comments, post to the Beowulf mailing list. I'll be there, as will a number of other people who can help.

[Apr 25, 2018] GridEngine cannot be installed on CentOS7

Apr 25, 2018 | github.com

nicoulaj commented on Dec 1 2016

FYI, I got a working version with SGE on CentOS 7 on my linked branch.

This is quick and dirty because I need it working right now, there are several issues:

[Apr 24, 2018] SGE Installation on Centos 7

Apr 24, 2018 | liv.ac.uk

From: JuanEsteban.Jimenez at mdc-berlin.de [mailto: JuanEsteban.Jimenez at mdc-berlin.de ]
Sent: 27 April 2017 03:54 PM
To: yasir at orionsolutions.co.in ; 'Maximilian Friedersdorff'; sge-discuss at liverpool.ac.uk
Subject: Re: [SGE-discuss] SGE Installation on Centos 7

I am running SGE on nodes with both 7.1 and 7.3. Works fine on both.

Just make sure that if you are using Active Directory/Kerberos for authentication and authorization, your DC's are capable of handling a lot of traffic/requests. If not, things like DRMAA will uncover any shortcomings.

Mfg,
Juan Jimenez
System Administrator, BIH HPC Cluster
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800

====================

I installed SGE on Centos 7 back in January this year. If my recollection is correct, the procedure was analogous to the instructions for Centos 6. There were some issues with the firewalld service (make sure that it is not blocking SGE), as well as some issues with SSL.
Check out these threads for reference:

http://arc.liv.ac.uk/pipermail/sge-discuss/2017-January/001047.html
http://arc.liv.ac.uk/pipermail/sge-discuss/2017-January/001050.html

[Apr 20, 2018] Environment Modules: Duplicate version symbol found error

Apr 20, 2018 | lists.sdsc.edu

[Rocks-Discuss] Environment Modules Recommendations

>> 1. I use the C modules code. But, I like the tcl version better. It seems more robust and it is certainly easier to fix (or not -- like the erroneous "Duplicate version symbol" below) and enhance (as I have done). I think I'll switch the next time I upgrade our cluster.

>> 2. I have come to not like the recommended organizational scheme of package/version. I think I'll switch to using the version as a suffix, like RPMs do, e.g. package-version. I think that would make it easier to use the built-in default selection mechanism (alphabetic ordering, last one is the default). Right now, for example, my modules are:

Environment Modules - Mailing Lists

[Modules] Duplicate version symbol found
[Modules] Duplicate version symbol found From: Christoph Niethammer <niethammer@hl...> - 2013-07-08 11:33:33
Hello,

I would like to mark modules as default and testing so that they show up like this:

$ module av mpi/openmpi
mpi/openmpi/1.6.5(default)
mpi/openmpi/1.7.2(testing)

and can be loaded via, e.g.,

$ module load mpi/openmpi/testing


I tried to use module-version in .modulerc to achieve this behaviour with the commands

module-version openmpi openmpi/1.6.5
module-version openmpi openmpi/1.7.2

but I get a warning "Duplicate version symbol 'testing' found".
For the default version there is no such warning.

So it seems to me, that there is a problem/bug in module-version.

Best regards
Christoph Niethammer



PS: My current workaround for this problem is to use a variable in all the .modulerc files.

#%Module1.0
# File: $MODULEPATH/mpi/openmpi/.modulerc
set DEFAULT  1.6.5
module-version openmpi openmpi/$DEFAULT

# circumvent problem with duplicate definition of symbol testing
# The used variable name has to be unique to prevent conflicts if
# this workaround is used in multiple .modulerc files.
if { ![info exists MPI_OPENMPI_TESTING] } {
  set MPI_OPENMPI_TESTING   1.7.2
  module-version mpi/openmpi/$MPI_OPENMPI_TESTING    testing
}




[Mar 27, 2018] Google Unveils 72-Qubit Quantum Computer With Low Error Rates

Mar 27, 2018 | hardware.slashdot.org

(tomshardware.com) BeauHD on Monday March 05, 2018 @07:30PM from the error-prone dept. An anonymous reader quotes a report from Tom's Hardware: Google announced a 72-qubit universal quantum computer that promises the same low error rates the company saw in its first 9-qubit quantum computer. Google believes that this quantum computer, called Bristlecone, will be able to bring us to an age of quantum supremacy. In a recent announcement, Google said: "If a quantum processor can be operated with low enough error, it would be able to outperform a classical supercomputer on a well-defined computer science problem, an achievement known as quantum supremacy. These random circuits must be large in both number of qubits as well as computational length (depth). Although no one has achieved this goal yet, we calculate quantum supremacy can be comfortably demonstrated with 49 qubits, a circuit depth exceeding 40, and a two-qubit error below 0.5%. We believe the experimental demonstration of a quantum processor outperforming a supercomputer would be a watershed moment for our field, and remains one of our key objectives."

According to Google, a minimum error rate for quantum computers needs to be in the range of less than 1%, coupled with close to 100 qubits. Google seems to have achieved this so far with the 72-qubit Bristlecone and its 1% error rate for readout, 0.1% for single-qubit gates, and 0.6% for two-qubit gates. Quantum computers will begin to become highly useful in solving real-world problems when we can achieve error rates of 0.1-1% coupled with hundreds of thousands to millions of qubits. According to Google, an ideal quantum computer would have at least hundreds of millions of qubits and an error rate lower than 0.01%. That may take several decades to achieve, even if we assume a "Moore's Law" of some kind for quantum computers (which so far seems to exist, judging by the progress of both Google and IBM in the past few years, as well as D-Wave).

[Dec 25, 2017] Huawei Showcases HPC Solutions at SC16

YouTube video.
Nov 29, 2016 | www.youtube.com

In this video from SC16, Francis Lam from Huawei describes the company's broad range of HPC and liquid cooling solutions.

"Huawei has increasingly become more prominent in the HPC market. It has successfully deployed HPC clusters for a large number of global vehicle producers, large-scale supercomputing centers, and research institutions. These show that Huawei's HPC platforms are optimized for industry applications which can help customers significantly simplify service processes and improve work efficiency, enabling them to focus on product development and research."

At SC16, Huawei also showcased high-density servers FusionServer E9000 and X6800 with 100G high-speed interconnect technology and powerful cluster scalability. These servers can dramatically accelerate industrial CAE simulation. Huawei was also displaying its KunLun supercomputer, which features the company's proprietary Node Controller interconnect chips. KunLun supports up to 32 CPUs and is perfect for cutting-edge research scenarios that require frequent in-memory computing. Additionally Huawei's high-performance storage solutions, OceanStor 9000 and ES3000 SSD, which can effectively improve data transmission bandwidth and latency, are on display.

Huawei and ANSYS announced the two parties' partnership in building industrial computer aided engineering (CAE) solutions, and jointly released their Fluent Performance white paper, which details optimal system configurations in fluid simulation scenarios for industry customers.

Learn more: http://e.huawei.com/us/solutions/busi...

[Oct 24, 2017] LAMMPS -- a classical molecular dynamics software

Oct 24, 2017 | lammps.sandia.gov

LAMMPS ( http://lammps.sandia.gov/index.html ) is a classical molecular dynamics code that models an ensemble of particles in a liquid, solid, or gaseous state. It can model atomic, polymeric, biological, metallic, granular, and coarse-grained systems using a variety of force fields and boundary conditions. LAMMPS runs efficiently on single-processor desktop or laptop machines, but is designed for parallel computers. It will run on any parallel machine that compiles C++ and supports the MPI message-passing library. This includes distributed- and shared-memory parallel machines and Beowulf-style clusters. LAMMPS can model systems with only a few particles up to millions or billions.

The current version of LAMMPS is written in C++. In the most general sense, LAMMPS integrates Newton's equations of motion for collections of atoms, molecules, or macroscopic particles that interact via short- or long-range forces with a variety of initial and/or boundary conditions. For computational efficiency LAMMPS uses neighbor lists to keep track of nearby particles. The lists are optimized for systems with particles that are repulsive at short distances, so that the local density of particles never becomes too large. On parallel machines, LAMMPS uses spatial-decomposition techniques to partition the simulation domain into small 3D sub-domains, one of which is assigned to each processor. Processors communicate and store "ghost" atom information for atoms that border their sub-domain.

The simulation used in this study is a strong scaling analysis with the Rhodopsin benchmark. The run time to compute the dynamics of the atomic fluid with 32,000 atoms for 100 time steps is measured. The execution time is shown in Figure 10. This LAMMPS benchmark is not memory intensive and does not show a significant difference in performance when memory and processor affinity are forced. Red Storm scales well even beyond 64 tasks, although the balance of computation to communication steadily decreases for this strong scaling test. Instrumentation data is being collected using performance tools to understand why TLCC does not scale beyond 64 MPI tasks.
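The benchmark inputs themselves ship with LAMMPS, so nothing needs to be written from scratch; still, to give a flavor of what a LAMMPS input deck looks like, here is a minimal Lennard-Jones melt script in the style of the bundled bench/in.lj (not the Rhodopsin input used in the study; the executable name and MPI launch line below are placeholders that vary by installation):

```
# 3d Lennard-Jones melt, in the style of LAMMPS bench/in.lj
units           lj
atom_style      atomic

lattice         fcc 0.8442
region          box block 0 20 0 20 0 20
create_box      1 box
create_atoms    1 box
mass            1 1.0

velocity        all create 1.44 87287 loop geom

pair_style      lj/cut 2.5
pair_coeff      1 1 1.0 1.0 2.5

neighbor        0.3 bin
neigh_modify    delay 0 every 20 check no

fix             1 all nve
run             100
```

A typical parallel run would then be something like "mpirun -np 16 lmp -in in.lj"; LAMMPS decomposes the simulation box into one spatial sub-domain per MPI rank, exactly as described above.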

[Oct 24, 2017] The combined law of Parkinson-Murphy

my.safaribooksonline.com

"The increase in the capacity and quantity of a system's resources does not improve the efficiency of its operation, since all the new resources, and even some of the old ones, will be wasted on eliminating the internal problems (errors) that arise as a result of that very increase in resources." One only has to look at the space science sphere right now.

[Oct 23, 2017] Optimizing HPC Applications with Intel Cluster Tools

Oct 23, 2017 | my.safaribooksonline.com

Table of Contents

Chapter 1: No Time to Read This Book?

Chapter 2: Overview of Platform Architectures

Chapter 3: Top-Down Software Optimization

Chapter 4: Addressing System Bottlenecks

Chapter 5: Addressing Application Bottlenecks: Distributed Memory

Chapter 6: Addressing Application Bottlenecks: Shared Memory

Chapter 7: Addressing Application Bottlenecks: Microarchitecture

Chapter 8: Application Design Considerations

[Oct 17, 2017] Perf- A Performance Monitoring and Analysis Tool for Linux

Oct 17, 2017 | www.tecmint.com

In a time of fierce competition between companies, it is important that we learn to use what we have to the best of its capacity. Wasting hardware or software resources, or not knowing how to use them more efficiently, ends up being a loss that we simply cannot afford if we want to stay at the top of our game.

At the same time, we must be careful to not take our resources to a limit where sustained use will yield irreparable damage.

In this article we will introduce you to a relatively new performance analysis tool and provide tips you can use to monitor your Linux systems, including hardware and applications. This will help you ensure that they operate efficiently, so that you can produce the desired results without wasting resources or your own energy.

Introducing and installing Perf in Linux

Among others, Linux provides a performance monitoring and analysis tool conveniently called perf . So what distinguishes perf from other well-known tools you are already familiar with?

The answer is that perf provides access to the Performance Monitoring Unit in the CPU, and thus allows us to have a close look at the behavior of the hardware and its associated events.

In addition, it can also monitor software events, and create reports out of the data that is collected.

You can install perf in RPM-based distributions with:

# yum update && yum install perf     [CentOS / RHEL / Fedora]
# dnf update && dnf install perf     [Fedora 23+ releases]

In Debian and derivatives:

# sudo aptitude update && sudo aptitude install linux-tools-$(uname -r) linux-tools-generic

If uname -r in the command above returns extra strings besides the actual version ( 3.2.0-23-generic in my case), you may have to type linux-tools-3.2.0-23 instead of using the full output of uname .

It is also important to note that perf yields incomplete results when run in a guest on top of VirtualBox or VMware, as they do not allow access to hardware counters the way other virtualization technologies (such as KVM or Xen ) do.

Additionally, keep in mind that some perf commands may be restricted to root by default, which can be disabled (until the system is rebooted) by doing:

# echo 0 > /proc/sys/kernel/perf_event_paranoid

If you need to disable paranoid mode permanently, update the following setting in the /etc/sysctl.conf file:

kernel.perf_event_paranoid = 0
Subcommands

Once you have installed perf , you can refer to its man page for a list of available subcommands (think of subcommands as special options that open a specific window into the system). For the best and most complete results, run perf either as root or through sudo .

Perf list

perf list (without options) returns all the symbolic event types (a long list). If you want to view the list of events available in a specific category, use perf list followed by the category name ([ hw|sw|cache|tracepoint|pmu|event_glob ]), such as:

Display list of software pre-defined events in Linux:

# perf list sw
List Software Pre-defined Events in Linux

Perf stat

perf stat runs a command and collects Linux performance statistics for the duration of its execution. What happens in our system when we run dd ?

# perf stat dd if=/dev/zero of=test.iso bs=10M count=1
Collects Performance Statistics of Linux Command

The stats shown above indicate, among other things:

  1. The execution of the dd command took 21.812281 milliseconds of CPU time. Dividing this by the "seconds time elapsed" value below ( 23.914596 milliseconds) yields 0.912 (CPUs utilized).
  2. While the command was executed, 15 context-switches (also known as process switches) indicate that the CPUs were switched 15 times from one process (or thread) to another.
  3. The CPU-migrations count is the expected result when, on a 2-core CPU, the workload is distributed evenly between the cores. During that time ( 21.812281 milliseconds), the total number of CPU cycles consumed was 62,025,623 , which divided by 0.021812281 seconds gives 2.843 GHz.
  4. If we divide the number of cycles by the total instruction count we get 4.9 Cycles Per Instruction, which means each instruction took almost 5 CPU cycles to complete (on average). We can blame this (at least in part) on the number of branches and branch-misses (see below), which end up wasting or misusing CPU cycles.
  5. As the command was executed, a total of 3,552,630 branches were encountered. This is the CPU-level representation of decision points and loops in the code. The more branches, the lower the performance. To compensate for this, all modern CPUs attempt to predict the flow the code will take. The 51,348 branch-misses indicate the prediction feature was wrong 1.45% of the time.

The same principle applies to gathering stats (in other words, profiling) while an application is running: simply launch the desired application, close it after a reasonable period of time (which is up to you), and perf will display the stats on the screen. By analyzing those stats you can identify potential problems.

Perf top

perf top is similar to the top command , in that it displays an almost real-time system profile (also known as live analysis).

With the -a option you sample across all CPUs (system-wide collection, which is also the default), whereas the -e option allows you to choose a specific event (as returned by perf list ):

Sample the default cycles event across all CPUs:

perf top -a

Sample cpu-clock related events:

perf top -e cpu-clock
Live Analysis of Linux Performance

The first column in the output above represents the percentage of samples taken since the beginning of the run, grouped by function Symbol and Shared Object. More options are available in man perf-top

Perf record

perf record runs a command and saves the statistical data into a file named perf.data inside the current working directory. Its invocation is similar to that of perf stat .

Type perf record followed by a command:

# perf record dd if=/dev/zero of=test.iso bs=10M count=1
Record Command Statistical Data

Perf report

perf report formats the data collected in perf.data above into a performance report:

# sudo perf report
Perf Linux Performance Report

All of the above subcommands have a dedicated man page that can be invoked as:

# man perf-subcommand

where subcommand is either list , stat , top , record , or report . These are the most frequently used subcommands; others are listed in the documentation (refer to the Summary section for the link).

Summary

In this guide we have introduced you to perf , a performance monitoring and analysis tool for Linux. We highly encourage you to become familiar with its documentation which is maintained in https://perf.wiki.kernel.org .

If you find applications that are consuming a high percentage of resources, you may consider modifying the source code, or use other alternatives.

If you have questions about this article or suggestions to improve, we are all ears. Feel free to reach us using the comment form below.

[Oct 17, 2017] perf-stat(1) - Linux man page

Oct 17, 2017 | linux.die.net

Name

perf-stat - Run a command and gather performance counter statistics

Synopsis
perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
Description

This command runs a command and gathers performance counter statistics from it.

Options

<command>...

Any command you can specify in a shell.
-e, --event=
Select the PMU event. Selection can be a symbolic event name (use perf list to list all events) or a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a hexadecimal event descriptor.
-i, --no-inherit
child tasks do not inherit counters
-p, --pid=<pid>
stat events on existing process id (comma separated list)
-t, --tid=<tid>
stat events on existing thread id (comma separated list)
-a, --all-cpus
system-wide collection from all CPUs
-c, --scale
scale/normalize counter values
-r, --repeat=<n>
repeat command and print average + stddev (max: 100)
-B, --big-num
print large numbers with thousands' separators according to locale
-C, --cpu=
Count only on the list of CPUs provided. Multiple CPUs can be provided as a comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. In per-thread mode, this option is ignored. The -a option is still necessary to activate system-wide monitoring. Default is to count on all CPUs.
-A, --no-aggr
Do not aggregate counts across all monitored CPUs in system-wide mode (-a). This option is only valid in system-wide mode.
-n, --null
null run - don't start any counters
-v, --verbose
be more verbose (show counter open errors, etc)
-x SEP, --field-separator SEP
print counts using a CSV-style output to make it easy to import directly into spreadsheets. Columns are separated by the string specified in SEP.
-G name, --cgroup name
monitor only in the container (cgroup) called "name". This option is available only in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to container "name" are monitored when they run on the monitored CPUs. Multiple cgroups can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup to first event, second cgroup to second event and so on. It is possible to provide an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have corresponding events, i.e., they always refer to events defined earlier on the command line.
-o file, --output file
Print the output into the designated file.
--append
Append to the output file designated with the -o option. Ignored if -o is not specified.
--log-fd
Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd
Examples

$ perf stat -- make -j

Performance counter stats for 'make -j':
8117.370256  task clock ticks     #      11.281 CPU utilization factor
        678  context switches     #       0.000 M/sec
        133  CPU migrations       #       0.000 M/sec
     235724  pagefaults           #       0.029 M/sec
24821162526  CPU cycles           #    3057.784 M/sec
18687303457  instructions         #    2302.138 M/sec
  172158895  cache references     #      21.209 M/sec
   27075259  cache misses         #       3.335 M/sec
Wall-clock time elapsed:   719.554352 msecs
See Also

perf-top (1), perf-list (1)

Referenced By perf (1), perf-record (1), perf-report (1)

perf-record(1) - Linux man page

Name

perf-record - Run a command and record its profile into perf.data

Synopsis
perf record [-e <EVENT> | --event=EVENT] [-l] [-a] <command>
perf record [-e <EVENT> | --event=EVENT] [-l] [-a] -- <command> [<options>]
Description

This command runs a command and gathers a performance counter profile from it, into perf.data - without displaying anything.

This file can then be inspected later on, using perf report .

Options

<command>...

Any command you can specify in a shell.
-e, --event=
Select the PMU event. Selection can be a symbolic event name (use perf list to list all events) or a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a hexadecimal event descriptor.
--filter=<filter>
Event filter.
-a, --all-cpus
System-wide collection from all CPUs.
-l
Scale counter values.
-p, --pid=
Record events on existing process ID (comma separated list).
-t, --tid=
Record events on existing thread ID (comma separated list).
-u, --uid=
Record events in threads owned by uid. Name or number.
-r, --realtime=
Collect data with this RT SCHED_FIFO priority.
-D, --no-delay
Collect data without buffering.
-A, --append
Append to the output file to do incremental profiling.
-f, --force
Overwrite existing data file. (deprecated)
-c, --count=
Event period to sample.
-o, --output=
Output file name.
-i, --no-inherit
Child tasks do not inherit counters.
-F, --freq=
Profile at this frequency.
-m, --mmap-pages=
Number of mmap data pages. Must be a power of two.
-g, --call-graph
Do call-graph (stack chain/backtrace) recording.
-q, --quiet
Don't print any message, useful for scripting.
-v, --verbose
Be more verbose (show counter open errors, etc).
-s, --stat
Per thread counts.
-d, --data
Sample addresses.
-T, --timestamp
Sample timestamps. Use it with perf report -D to see the timestamps, for instance.
-n, --no-samples
Don't sample.
-R, --raw-samples
Collect raw sample records from all opened counters (default for tracepoint counters).
-C, --cpu
Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. In per-thread mode with inheritance mode on (default), samples are captured only when the thread executes on the designated CPUs. Default is to monitor all CPUs.
-N, --no-buildid-cache
Do not update the buildid cache. This saves some overhead in situations where the information in the perf.data file (which includes buildids) is sufficient.
-G name,..., --cgroup name,...
monitor only in the container (cgroup) called "name". This option is available only in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to container "name" are monitored when they run on the monitored CPUs. Multiple cgroups can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup to first event, second cgroup to second event and so on. It is possible to provide an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have corresponding events, i.e., they always refer to events defined earlier on the command line.
-b, --branch-any
Enable taken branch stack sampling. Any type of taken branch may be sampled. This is a shortcut for --branch-filter any. See --branch-filter for more info.
-j, --branch-filter
Enable taken branch stack sampling. Each sample captures a series of consecutive taken branches. The number of branches captured with each sample depends on the underlying hardware, the type of branches of interest, and the executed code. It is possible to select the types of branches captured by enabling filters. The following filters are defined:
• any: any type of branches
• any_call: any function call or system call
• any_ret: any function return or system call return
• ind_call: any indirect branch
• u: only when the branch target is at the user level
• k: only when the branch target is in the kernel
• hv: only when the target is at the hypervisor level
The option requires at least one branch type among any, any_call, any_ret, ind_call. The privilege levels may be omitted, in which case the privilege levels of the associated event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege levels are subject to permissions. When sampling on multiple events, branch stack sampling is enabled for all the sampling events. The sampled branch type is the same for all events. The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k Note that this feature may not be available on all processors.
See Also

perf-stat (1), perf-list (1)

Referenced By perf (1), perf-annotate (1), perf-archive (1), perf-buildid-cache (1), perf-buildid-list (1), perf-diff (1), perf-evlist (1), perf-inject (1), perf-kmem (1), perf-kvm (1), perf-probe (1), perf-sched (1), perf-script (1), perf-timechart (1)

[Oct 15, 2017] cp2k download SourceForge.net

Oct 15, 2017 | sourceforge.net

[Oct 14, 2017] Performance analysis in Linux

Notable quotes:
"... Based on the example from here . ..."
Oct 14, 2017 | www.collabora.com

Posted on 21/03/2017 by Gabriel Krisman Bertazi

Dynamic profilers are tools to collect data statistics about applications while they are running, with minimal intrusion on the application being observed.

The kind of data that can be collected by profilers varies widely, depending on the requirements of the user. For instance, one may be interested in the amount of memory used by a specific application, or maybe the number of cycles the program executed, or even how long the CPU was stuck waiting for data to be fetched from the disks. All this information is valuable when tracking performance issues, allowing the programmer to identify bottlenecks in the code, or to learn how to tune an application to a specific environment or workload.

In fact, maximizing performance, or even understanding what is slowing down your application, is a real challenge on modern computer systems. A modern CPU employs so many hardware techniques to optimize performance for the most common usage case that if an application doesn't intentionally exploit them, or worse, if it accidentally falls into an uncommon special case, it may end up experiencing terrible results without doing anything apparently wrong.

As an example, let's take a look at a quite non-obvious way in which things can go wrong.

Forcing branch mispredictions

Based on the example from here .

The code below is a good example of how non-obvious performance assessment can be. In this function, the first for loop initializes a vector of size n with random values ranging from 0 to N. We can assume the values are well distributed enough for the vector elements to be completely unsorted.

The second part of the code has a for loop nested inside another one. The outer loop, going from 0 to K, is actually a measurement trick: by executing the inner loop many times, it amplifies the performance issues in that part of the code and helps reduce any external factors that might affect our measurement.

The inner loop is where things get interesting. This loop crawls over the vector and decides whether each value should be accumulated in another variable, depending on whether the element is higher than N/2 or not. This is done using an if clause, which gets compiled into a conditional branch instruction that modifies the execution flow depending on the calculated value of the condition: if vec[i] > N/2, execution enters the if leg; otherwise it skips it entirely.

#include <stdlib.h>

long rand_partsum(int n)
{
  int i, k;
  long sum = 0;
  int *vec = malloc(n * sizeof(int));

  /* Initialize the vector with unsorted pseudo-random values in [0, n). */
  for (i = 0; i < n; i++)
    vec[i] = rand() % n;

  /* K = 1000000 repetitions amplify the cost of the inner loop;
     the if below compiles to a hard-to-predict conditional branch. */
  for (k = 0; k < 1000000; k++)
    for (i = 0; i < n; i++)
      if (vec[i] > n/2)
        sum += vec[i];

  free(vec);
  return sum;
}


When executing the code above on an Intel Core i7-5500U, with a vector size of 5000 elements (N=5000), it takes an average of 29.97 seconds. Can we do any better?

One may notice that this vector is unsorted, since each element comes from a call to rand(). What if we sorted the vector before executing the second for loop? For the sake of the example, let's say we add a call to the glibc implementation of QuickSort right after the initialization loop.

A naive guess would suggest that the algorithm got worse, because we just added a sorting step, thus raising the complexity of the entire code. One would assume this would result in a higher execution time.

But in fact, when executing the sorted version on the same machine, the average execution time drops to 13.20 seconds, a reduction of 56% in execution time. Why does adding a new step actually reduce the execution time? The fact is that pre-sorting the vector in this case allows the CPU to do a much better job of internally optimizing the code during execution. In this case, the issue observed was a high number of branch mispredictions, triggered by the conditional branch that implements the if clause.

Modern CPUs have quite deep pipelines, meaning that the instruction being fetched on any given cycle is always a few instructions ahead of the instruction actually being executed on that cycle. When there is a conditional branch along the way, there are two possible paths that can be followed, and the prefetch unit has no idea which one it should choose until the actual condition for that branch is calculated.

The obvious choice for the prefetch unit in such cases is to stall and wait until the execution unit decides the correct path to follow, but stalling the pipeline like this is very costly. Instead, a speculative approach can be taken by a unit called the branch predictor, which tries to guess which path should be taken. After the condition is calculated, the CPU verifies the guessed path: if it got the prediction right, in other words if a branch prediction hit occurs, execution just continues without much performance impact; but if it got it wrong, the processor needs to flush the entire pipeline, go back, and restart executing the correct path. The latter is called a branch prediction miss, and is also a costly operation.

In systems with a branch predictor, like any modern CPU, the predictor is usually based on the history of the particular branches. If a conditional branch usually goes a specific way, the next time it appears, the predictor will assume it will take the same route.

Back to our example code: that if condition inside the for loop does not follow any specific pattern. Since the vector elements are completely random, sometimes execution enters the if leg and sometimes it skips it entirely. That is a very hard situation for the branch predictor, which keeps guessing wrong, triggering pipeline flushes that keep delaying the application.

In the sorted version, instead, it is very easy to guess whether execution should enter the if leg or not. For the first part of the vector, where the elements are all < N/2, the if leg will always be skipped, while for the second part it will always be entered. The branch predictor is capable of learning this pattern after a few iterations and is able to make much better guesses about the flow, reducing the number of branch misses and thus increasing the overall performance.

Well, pointing out specific issues like this is usually hard, even for a simple code like the example above. How could we be sure that the program is hitting enough branch mispredictions to affect performance? In fact, there are always many things that could be the cause of slowness, even for a slightly more complex program.

Perf_events is an interface in the Linux kernel, plus a userspace tool, to sample hardware and software performance counters. It allows us, among many other things, to query the CPU's counters for the statistics of the branch predictor, i.e. the number of prediction hits and misses of a given application.

The userspace tool, known as the perf command, is available in the usual channels of common distros. In Debian, for instance, you can install it with:

sudo apt install linux-perf

We'll dig deeper into the perf tool later in another post, but for now let's use the perf record and perf annotate commands, which trace the program and annotate the source code with the time spent on each instruction, and the perf stat command, which runs a program and displays statistics about it.

First, we instruct perf to instrument the program and trace its execution:

[krisman@dilma bm]$ perf record ./branch-miss.unsorted
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 4.649 MB perf.data (121346 samples) ]


perf record will execute the program passed as a parameter and collect performance information into a new perf.data file. This file can then be passed to other perf commands. In this case, we pass it to the perf annotate command, which crawls over each address in the program and prints the number of samples that were collected while the program was executing each instruction. Instructions with a higher number of samples indicate that the program spent more time in that region, meaning it is hot code and a good part of the program to try to optimize. Notice that, for modern processors, the exact position is an estimation, so this information must be used with care. As a rule of thumb, one should look for hot regions instead of single hot instructions.

Below is the output of perf annotate, when analyzing the function above. The output is truncated to display only the interesting parts.

[krisman@dilma bm]$ perf annotate

        :
        :      int rand_partsum()
        :      {
   0.00 :        74e:   push   %rbp
   0.00 :        74f:   mov    %rsp,%rbp
   0.00 :        752:   push   %rbx
   0.00 :        753:   sub    $0x38,%rsp
   0.00 :        757:   mov    %rsp,%rax
   0.00 :        75a:   mov    %rax,%rbx

   [...] 

   0.00 :        7ce:   mov    $0x0,%edi
   0.00 :        7d3:   callq  5d0 <time@plt>
   0.00 :        7d8:   mov    %eax,%edi
   0.00 :        7da:   callq  5c0 <srand@plt>
        :              for (i = 0; i < n; i++)
   0.00 :        7df:   movl   $0x0,-0x14(%rbp)
   0.00 :        7e6:   jmp    804 <main+0xb6>
        :                      vec[i] = rand()%n;
   0.00 :        7e8:   callq  5e0 <rand@plt>
   0.00 :        7ed:   cltd   
   0.00 :        7ee:   idivl  -0x24(%rbp)
   0.00 :        7f1:   mov    %edx,%ecx
   0.00 :        7f3:   mov    -0x38(%rbp),%rax
   0.00 :        7f7:   mov    -0x14(%rbp),%edx
   0.00 :        7fa:   movslq %edx,%rdx
   0.00 :        7fd:   mov    %ecx,(%rax,%rdx,4)
        :              for (i = 0; i < n; i++)
   0.00 :        800:   addl   $0x1,-0x14(%rbp)
   0.00 :        804:   mov    -0x14(%rbp),%eax
   0.00 :        807:   cmp    -0x24(%rbp),%eax
   0.00 :        80a:   jl     7e8 <main+0x9a>

   [...]

         :              for (k = 0; k < 1000000; k++)
    0.00 :        80c:   movl   $0x0,-0x18(%rbp)
    0.00 :        813:   jmp    85e <main+0x110>
         :                      for (i = 0; i < n; i++)
    0.01 :        815:   movl   $0x0,-0x14(%rbp)
    0.00 :        81c:   jmp    852 <main+0x104>
         :                              if (vec[i] > n/2)
    0.20 :        81e:   mov    -0x38(%rbp),%rax
    6.47 :        822:   mov    -0x14(%rbp),%edx
    1.94 :        825:   movslq %edx,%rdx
   26.86 :        828:   mov    (%rax,%rdx,4),%edx
    0.08 :        82b:   mov    -0x24(%rbp),%eax
    1.46 :        82e:   mov    %eax,%ecx
    0.62 :        830:   shr    $0x1f,%ecx
    3.82 :        833:   add    %ecx,%eax
    0.06 :        835:   sar    %eax
    0.70 :        837:   cmp    %eax,%edx
    0.42 :        839:   jle    84e <main+0x100>
         :                                      sum += vec[i];
    9.15 :        83b:   mov    -0x38(%rbp),%rax
    5.91 :        83f:   mov    -0x14(%rbp),%edx
    0.26 :        842:   movslq %edx,%rdx
    5.87 :        845:   mov    (%rax,%rdx,4),%eax
    2.09 :        848:   cltq
    9.31 :        84a:   add    %rax,-0x20(%rbp)
         :                      for (i = 0; i < n; i++)
   16.66 :        84e:   addl   $0x1,-0x14(%rbp)
    6.46 :        852:   mov    -0x14(%rbp),%eax
    0.00 :        855:   cmp    -0x24(%rbp),%eax
    1.63 :        858:   jl     81e <main+0xd0>
         :              for (k = 0; k < 1000000; k++)

   [...]

The first thing to notice is that the perf command tries to interleave C code with the Assembly code. This feature requires compiling the test program with -g3 to include debug information.

The number before the ':' is the percentage of samples collected while the program was executing each instruction. Once again, this is not exact information, so you should look for hot regions, not specific instructions.

The first and second hunks are the function prologue, which was executed only once, and the vector initialization. According to the profiling data, there is little point in attempting to optimize them, because the execution spent practically no time there. The third hunk is the second loop, where almost all the execution time was spent. Since that loop is where most of our samples were collected, we can assume it is a hot region worth trying to optimize. Also, notice that most of the samples were collected around the if leg. This is another indication that we should look into that specific code.

To find out what might be causing the slowness, we can use the perf stat command, which prints a bunch of performance counter statistics for the entire program. Let's take a look at its output.

[krisman@dilma bm]$ perf stat ./branch-miss.unsorted

 Performance counter stats for './branch-miss.unsorted':

    29876.773720  task-clock (msec) #    1.000 CPUs utilized
              25  context-switches  #    0.001 K/sec
               0  cpu-migrations    #    0.000 K/sec
              49  page-faults       #    0.002 K/sec
  86,685,961,134  cycles            #    2.901 GHz
  90,235,794,558  instructions      #    1.04  insn per cycle
  10,007,460,614  branches          #  334.958 M/sec
   1,605,231,778  branch-misses     #   16.04% of all branches

   29.878469405 seconds time elapsed


perf stat will dynamically profile the program passed on the command line and report back a number of statistics about the entire execution. In this case, let's look at the last three counter lines in the output. The first gives the rate of instructions executed per CPU cycle; the second, the total number of branches executed; and the third, the percentage of those branches that resulted in a branch miss and a pipeline flush.

perf is even nice enough to put important or unexpected results in red. In this case, the branch-misses value was unexpectedly high, so it was displayed in red in this test.

And now, let's profile the pre-sorted version. Look at the number of branch misses:

[krisman@dilma bm]$ perf stat ./branch-miss.sorted

 Performance counter stats for './branch-miss.sorted':

    14003.066457  task-clock (msec) #    0.999 CPUs utilized
             175  context-switches  #    0.012 K/sec
               4  cpu-migrations    #    0.000 K/sec
              56  page-faults       #    0.004 K/sec
  40,178,067,584  cycles            #    2.869 GHz
  89,689,982,680  instructions      #    2.23  insn per cycle
  10,006,420,927  branches          #  714.588 M/sec
       2,275,488  branch-misses     #    0.02% of all branches

  14.020689833 seconds time elapsed


It went down from over 16% to just 0.02% of the total branches! This is very impressive and likely explains the reduction in execution time. Another interesting value is the number of instructions per cycle, which more than doubled. This happens because, once we reduce the number of stalls, we make better use of the pipeline, obtaining better instruction throughput.

Wrapping up

As demonstrated by the example above, figuring out the root cause of a program slowness is not always easy. In fact, it gets more complicated every time a new processor comes out with a bunch of shiny new optimizations.

Despite being a short example, the branch misprediction case is still quite non-trivial for anyone not familiar with how the branch prediction mechanism works. In fact, if we looked only at the algorithm, we might have concluded that adding a sort step would just add more overhead. Thus, this example gives us a high-level view of how helpful profiling tools really are. By using just one of the several features provided by the perf tool, we were able to draw major conclusions about the program being examined.

Comments (10)
  1. Alan:
    Apr 03, 2017 at 11:46 AM

    sum += n[i];
    should be
    sum += vec[i];


    1. Krisman:
      Apr 03, 2017 at 01:47 PM

      Thanks Alan. That's correct, I've fixed it now.


  2. Arvin:
    Apr 03, 2017 at 06:08 PM

    Thank you for the excellent write-up, Krisman. For those following along, I was able to grab perf for my current kernel on Ubuntu with the following command: sudo apt install linux-tools-`uname -r`

    I was amazed at how much better the -O3 compiler option was than -O2 and below with the unsorted code (-O2, -O1, and no optimization were pretty much the same, interestingly enough).

    https://pastebin.com/RvS9EAwY

    Is this essentially doing under-the-hood what the sorted code is doing? Or is the compiler using other tricks to drastically improve performance here? Thanks again!


    1. krisman:
      Apr 03, 2017 at 08:03 PM

      The compiler is likely not sorting the vector, because it can't be sure such a transformation would be correct or even helpful. But which optimizations it actually applies at each optimization level depends on the compiler and the exact version you used. It may try, for instance, unrolling the loop to use more prediction slots, though I don't think it would make a difference here.

      A higher optimization level could also eliminate that outer loop entirely, should it conclude it is useless for calculating the overall sum. To find out what happened in your case, you might wish to dump the binary with a tool like objdump and check out the generated assembly for clues.

      gcc -O3 main.c -o branch-miss
      objdump -D branch-miss | less

      In my system, when compiling with -O3, gcc was able to optimize that inner loop with vector instructions, which eliminated most of the branch misses.

      In the second perf stat output you shared, you can see that the result was similar: it drastically reduced the number of branch misses, resulting in an increase in the instructions-per-cycle rate.

      1. Arvin:
        Apr 03, 2017 at 10:17 PM

        Interesting, thanks! I'll keep playing with it. I was also curious how clang compared. Same number of branch misses, but many more instructions! Notable increase in execution time.

        https://pastebin.com/5yue69LF

        All in all, this was fun and I learned something new today :)

        1. Anon:
          Apr 06, 2017 at 07:50 AM

          The optimisation change that flattens the results is explained in the most popular stack overflow answer ever: http://stackoverflow.com/a/11227902


  3. Thomas:
    Apr 03, 2017 at 06:39 PM

    Nice post, thanks for sharing.
    The return type of rand_partsum() should be long though to match the variable sum.
    1. krisman:
      Apr 03, 2017 at 08:05 PM

      Thanks! Fixed that as well.
  4. Solerman Kaplon:
    Apr 03, 2017 at 08:56 PM

    What does the perf annotate output look like for the sorted version? I'm curious how the CPU would understand that the data is sorted; I've never heard of such a thing.
    1. krisman:
      Apr 04, 2017 at 01:42 AM

      Hi Solerman,

      It's not that the CPU understands the data is sorted; it doesn't. Instead, we use the knowledge acquired with perf to arrange the data in a specific way that exploits the characteristics of the processor.

      In this case, we prepared the data in a way that made the conditional branch taken by the 'if' clause predictable for a history-based branch predictor, like the ones in modern CPUs. By sorting the data, we ensure the first part of the array will always skip the 'if' leg, while the second part will always enter it. There might still be branch misses, for instance when entering the vector and when switching from the first part of the vector to the second. But those branch misses are negligible since, by putting some order in the data, we ensured the vast majority of iterations won't trigger mispredictions.

      The expectation for the perf annotation of the optimized version would be a more even distribution of samples along the program code. If we have only this function in our program, it's likely that most samples will still be in the nested loops, since that is by far the hottest path in our simple program. But even then, the optimized version may still have a slightly better distribution of samples, since we don't waste as much time stalled on that conditional branch. In the article's example, perf annotate allowed us to isolate the region that made the most sense to try to optimize, which is always where the execution spends the most time.


[Jul 28, 2017] Module Environment Developer Notes

Jul 28, 2017 | hpc.nrel.gov
Contents
  1. Toolchains:
  2. Building Module Files
  3. Naming Modules
  4. Module Directory Organization
  5. Module Migration

Instructions and policies for installing and maintaining environment modules on Peregrine.

Toolchains:

Libraries and applications are built around the concept of 'toolchains'; at present a toolchain is defined as a specific version of a compiler plus an MPI library (or the lack of one). Applications are typically built with only a single toolchain, whereas libraries are built with and installed for potentially multiple toolchains as necessary to accommodate ABI differences produced by different toolchains. Workflows are primarily composed of a sequence of applications, which may use different toolchains and may be orchestrated by another application or tool. The toolchains presently supported are:

Loading one of the above MPI-compiler modules will also automatically load the associated compiler module (currently gcc 4.8.2 and comp-intel/13.1.3 are the recommended compilers). Certain applications may of course require alternative toolchains. If demand for additional options becomes significant, requests for additional toolchain support will be considered on a case-by-case basis.

Building Module Files

Here are the steps for building an associated environment module for the installed mysoft software. First, create the appropriate module location:
% mkdir -p /nopt/nrel/apps/modules/candidate/modulefiles/mysoft  # Use a directory and not a file.
% touch /nopt/nrel/apps/modules/candidate/modulefiles/mysoft/1.3 # Place environment module tcl code here.
% touch /nopt/nrel/apps/modules/candidate/modulefiles/mysoft/.version  # If required, indicate default module in this file.
Next, edit the module file itself ("1.3" in the example). The current version of the HPC Standard Module Template is:
#%Module -*- tcl -*-

# Specify conflicts
# conflict 'appname'

# Prerequsite modules
# prereq 'appname/version....'

#################### Set top-level variables #########################

# 'Real' name of package, appears in help,display message
set PKG_NAME      pkg_name

# Version number (eg v major.minor.patch)
set PKG_VERSION   pkg_version 

# Name string from which enviro/path variable names are constructed
# Will be similar to, but not necessarily the same as, PKG_NAME
# eg  PKG_NAME-->VisIt PKG_PREFIX-->VISIT
set PKG_PREFIX    pkg_prefix

# Path to the top-level package install location.
# Other enviro/path variable values constructed from this
set PKG_ROOT      pkg_root

# Library name from which to construct link line
# eg PKG_LIBNAME=fftw ---> -L/usr/lib -lfftw
set PKG_LIBNAME   pkg_libname
######################################################################


proc ModulesHelp { } {
    global PKG_VERSION
    global PKG_ROOT
    global PKG_NAME
    puts stdout "Build:       $PKG_NAME-$PKG_VERSION"
    puts stdout "URL:         http://www.___________"
    puts stdout "Description: ______________________"
    puts stdout "For assistance contact HPC-Help@nrel.gov"
}

module-whatis "$PKG_NAME: One-line basic description"

#
# Standard install locations
#
prepend-path PATH             $PKG_ROOT/bin
prepend-path MANPATH          $PKG_ROOT/share/man
prepend-path INFOPATH         $PKG_ROOT/share/info
prepend-path LD_LIBRARY_PATH  $PKG_ROOT/lib
prepend-path LD_RUN_PATH      $PKG_ROOT/lib

#
# Set environment variables for configure/build
#

##################### Top level variables ##########################
setenv ${PKG_PREFIX}              "$PKG_ROOT"
setenv ${PKG_PREFIX}_ROOT         "$PKG_ROOT"
setenv ${PKG_PREFIX}_DIR          "$PKG_ROOT"
####################################################################

################ Template include directories ######################
# Only path names
setenv ${PKG_PREFIX}_INCLUDE      "$PKG_ROOT/include"
setenv ${PKG_PREFIX}_INCLUDE_DIR  "$PKG_ROOT/include"
# 'Directives'
setenv ${PKG_PREFIX}_INC          "-I $PKG_ROOT/include"
####################################################################

##################  Template library directories ####################
# Only path names
setenv ${PKG_PREFIX}_LIB          "$PKG_ROOT/lib"    
setenv ${PKG_PREFIX}_LIBDIR       "$PKG_ROOT/lib"
setenv ${PKG_PREFIX}_LIBRARY_DIR  "$PKG_ROOT/lib"
# 'Directives'
setenv ${PKG_PREFIX}_LD           "-L$PKG_ROOT/lib"
setenv ${PKG_PREFIX}_LIBS         "-L$PKG_ROOT/lib -l$PKG_LIBNAME"
####################################################################

The current module file template is maintained in a version-control repo at git@github.nrel.gov:hpc/hpc-devel.git. The template file is located in hpc-devel/modules/modTemplate. To see the current file:

git clone git@github.nrel.gov:hpc/hpc-devel.git
cd ./hpc-devel/modules/
cat modTemplate

Next, specify a default version of the module package. Here is an example of an associated .version file for a set of module files:

% cat /nopt/nrel/apps/modules/candidate/modulefiles/mysoft/.version
#%Module########################################
# vim: syntax=tcl

set ModulesVersion "1.3"

The .version file is only useful if there are multiple versions of the software installed. Print notes to stderr from the modulefile as necessary so the user can use the software correctly, along with any additional pointers.

NOTE: For modules with more than one level of sub-directory, although the default module specified above is displayed correctly by the modules system, it is not loaded correctly; if more than one version exists, the most recent one will be loaded by default. In other words, the above works fine for dakota/5.3.1 when 5.3.1 is a file alongside dakota/5.4, but not for dakota/5.3.1/openmpi-gcc when a dakota/5.4 directory is present. In this case, to force the correct default module to be loaded, a dummy symlink must be added in dakota/ that points to the module specified in .version.

Example

% cat /nopt/nrel/apps/modules/default/modulefiles/dakota/.version
#%Module########################################
# vim: syntax=tcl

set ModulesVersion "5.3.1/openmpi-gcc"

% module avail dakota
------------------------------------------------------------------ /nopt/nrel/apps/modules/default/modulefiles -------------------------------------------------------------------
dakota/5.3.1/impi-intel           dakota/5.3.1/openmpi-epel         dakota/5.3.1/openmpi-gcc(default) dakota/5.4/openmpi-gcc            dakota/default

% ls -l /nopt/nrel/apps/modules/default/modulefiles/dakota
total 8
drwxrwsr-x 2 ssides   n-apps 8192 Sep 22 13:56 5.3.1
drwxrwsr-x 2 hsorense n-apps   96 Jun 19 10:17 5.4
lrwxrwxrwx 1 cchang   n-apps   17 Sep 22 13:56 default -> 5.3.1/openmpi-gcc
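
The symlink fix shown in the listing above can be reproduced in a scratch directory. This is an illustrative sketch only; the real tree lives under /nopt/nrel/apps and the modulefiles are Tcl, not empty files:

```shell
# Reproduce the dakota layout from above in a scratch directory.
root=$(mktemp -d)
mkdir -p "$root/dakota/5.3.1" "$root/dakota/5.4"
touch "$root/dakota/5.3.1/openmpi-gcc" "$root/dakota/5.4/openmpi-gcc"

# The dummy symlink that makes the version named in .version load by default:
ln -sfn 5.3.1/openmpi-gcc "$root/dakota/default"

readlink "$root/dakota/default"   # -> 5.3.1/openmpi-gcc
```

Because the link target is relative, the symlink keeps working if the modulefiles tree is moved or mounted elsewhere.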
Naming Modules

Software which is made accessible via the modules system generally falls into one of three categories.

  1. Applications: these may be intended to carry out scientific calculations, or tasks like performance profiling of codes.
  2. Libraries: collections of header files and object code intended to be incorporated into an application at build time, and/or accessed via dynamic loading at runtime. The principal exceptions are communication libraries such as MPI, which are categorized as toolchain components below.
  3. Toolchains: compilers (e.g., Intel, GCC, PGI) and MPI libraries (OpenMPI, IntelMPI, mvapich2).

Often a package will contain both executable files and libraries. Whether it is classified as an Application or a Library depends on its primary mode of utilization. For example, although the HDF5 package contains a variety of tools for querying HDF5-format files, its primary usage is as a library which applications can use to create or access HDF5-format files. Each package can also be distinguished as a vendor- or developer-supplied binary, or as a collection of source code and build components (e.g., Makefiles).

For pre-built applications or libraries, or for applications built from source code, the basic form of the module name should be

{package_name}/{version}

For libraries built from source, or any package containing components which can be linked against in normal usage, the name should be

{package_name}/{version}/{toolchain}

The difference arises from two considerations. For supplied binaries, the assumed vendor or developer expectation is that a package will run either on a specified Linux distribution (and may have specific requirements satisfied by the distribution), or across varied distributions (and has fairly generic requirements satisfied by most or all distributions). Thus, the toolchain for supplied binaries is implicitly supplied by the operating system. For source code applications, the user should not be directly burdened with the underlying toolchain requirement; where this is relevant (i.e., satisfying dependencies), the associated information should be available in module help output, as well as through dependency statements in the module itself.
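
The two naming forms can be sketched as a tiny helper. The function name and inputs here are hypothetical, chosen only to illustrate how the parts compose:

```shell
# Hypothetical helper (illustration only): compose a module name from the
# parts described above. A toolchain component is appended only for
# libraries built from source.
module_name() {
    pkg=$1; ver=$2; toolchain=$3
    if [ -n "$toolchain" ]; then
        echo "$pkg/$ver/$toolchain"
    else
        echo "$pkg/$ver"
    fi
}

module_name gaussian 09              # -> gaussian/09 (pre-built application)
module_name fftw 3.3.3 openmpi-gcc   # -> fftw/3.3.3/openmpi-gcc (library built from source)
```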

Definitions:

{package_name}: This should be chosen such that the associated Application, Library, or Toolchain component is intuitively obvious, while concomitantly distinguishing its target from other Applications, Libraries, or Toolchain components likely to be made available on the system through the modules. So, "gaussian" is a sensible package_name, whereas "gsn" would be too generic and of unclear intent. Within these guidelines, though, there is some discretion left to the module namer.

{version}: The base version generally reflects the state of development of the underlying package, and is supplied by the developers or vendor. However, a great deal of flexibility is permitted here with respect to build options outside of the recognized {toolchain} terms. So, a Scalapack-enabled package version might be distinguished from a LAPACK-linked one by appending "-sc" to the base version, provided this is explained in the "module help" or "module show" information. {version} provides the most flexibility to the module namer.

{toolchain}: This is solely intended to track the compiler and MPI library used to build a source package. It is not intended to track the versions of these toolchain components, nor to track the use of associated toolkits (e.g., Cilk Plus) or libraries (e.g., MKL, Scalapack). As such, this term takes the form {MPI}-{compiler}, where {MPI} is one of

  1. openmpi
  2. impi (Intel MPI)

and {compiler} is one of

  1. gcc
  2. intel
  3. epel (which implies the gcc supplied through EPEL, possibly at a newer version than the gcc in the base OS that is exposed in the filesystem without the EPEL module).
Module Directory Organization

For general support, modulefiles can be installed in three top locations:

In addition, more specific requests can be satisfied in two other ways:

For the '/nopt/nrel/apps' modules location (where most general installations should be made), the following sub-directories have been created to manage how modules are developed, tested, and provided for production-level use. An example directory hierarchy for the module files is as follows:
[wjones@login2 nrel]$ tree -a apps/modules/default/modulefiles/hdf5-parallel/
apps/modules/default/modulefiles/hdf5-parallel/
├── .1.6.4
│   ├── impi-intel
│   ├── openmpi-gcc
│   └── .version
├── 1.8.11
│   ├── impi-intel
│   └── openmpi-gcc
└── .version

[wjones@login2 nrel]$ tree -a apps/modules/default/modulefiles/hdf5
apps/modules/default/modulefiles/hdf5
├── .1.6.4
│   └── intel
├── 1.8.11
│   ├── gcc
│   └── intel
└── .version

[wjones@login2 nrel]$ module avail hdf5

------------------------------------------------------- /nopt/nrel/apps/modules/default/modulefiles -------------------------------------------------------
hdf5/1.8.11/gcc                          hdf5-parallel/1.8.11/impi-intel(default)
hdf5/1.8.11/intel(default)               hdf5-parallel/1.8.11/openmpi-gcc
Module Migration
  1. This document covers three file paths. Each corresponds to a status of modules within a broader workflow for managing modules. (The other module locations are not directly part of this policy.)
    1. /nopt/nrel/apps/modules/candidate/modulefiles : This is the starting point for new modules. Modules are to be created here for testing and validation prior to production release. Modules here are not necessarily expected to work without issues, and may be modified or deleted without warning.
    2. /nopt/nrel/apps/modules/default/modulefiles : This is the production location, visible to the general user community by default. Modules here carry the expectation of functioning properly. Movement of modulefiles into and out of this location is managed through a monthly migration process.
    3. /nopt/nrel/apps/modules/deprecated/modulefiles : This location contains older modules which are intended for eventual archiving. Conflicts with newer software may render these modules non-functional, and so there is not an expectation of maintenance for these. They are retained to permit smooth migration out of the Peregrine software stack ( i.e. , users will still have access to them and may register objections/issues while retaining their productivity).
  2. "modifications" to modules entail
    1. Additions to any of the three stages;
    2. Major changes in functionality for modules in /default or /deprecated;
    3. Archiving modules from /deprecated; or,
    4. Making a module "default"

    These are the only acceptable atomic operations. Thus, a migration is defined as an addition to one path and a subsequent deletion from its original path.

  3. Announcements to users may be one of the following six options:
    1. Addition to /candidate → "New Module";
    2. Migration from /candidate to /default → "Move to Production";
    3. Migration from /default to /deprecated → "Deprecate";
    4. Removing visibility and accessibility from /deprecated → "Archive"; or,
    5. Major change in functionality in /default or /deprecated → "Modify";
    6. Making a module default → "Make default"

    Changes outside of these options (e.g., edits in /candidate) will not be announced, as batching these changes would inhibit our ability to respond nimbly to urgent problems.

  4. A "major change in functionality" is an edit to the module that could severely compromise users' productivity in the absence of adaptation on their part. So, pointing to a different application binary could result in incompatibilities in datasets generated before and after the module change; changing a module name can break workflows over thousands of jobs. On the other hand, editing inline documentation, setting an environment variable that increases performance with no side effects, or changing a dependency maintenance revision (e.g., a secondary module load of a library from v3.2.1 to v3.2.2) is unlikely to create major problems and does not need explicit attention.
  5. All module modifications are to be documented in the Sharepoint Modules Modifications table prior to making any changes (this table is linked at http://cs.hpc.nrel.gov/modeling/hpc-sharepoint-assets).
  6. Module modifications are to be batched for execution on monthly calendar boundaries, and (a) announced to peregrine-users@nrel.gov two weeks prior to execution, and (b) added to http://hpc.nrel.gov/users/announcements as a new page, which will auto-populate the table visible on the front page. Endeavor to make this list final prior to the first announcement.
  7. Modules may not be added to or deleted from /default without a corresponding deletion/addition from one of the other categories, i.e., they may only be migrated relative to /default, not created or deleted directly.
  8. Good faith testing. There is not currently a formally defined testing mechanism for new modules in /candidate. It is thus left to the discretion of the individual module steward (most likely the individual who owns the modulefile in the *NIX sense) what constitutes a defensible test regimen. Within the current document's scope, this specifically relates to the module functionality, not the application functionality.
  9. Library and toolchain dependencies must be checked for prior to removal of modules from .../deprecated. For example, if a user identifies an application dependency on a deprecated library or toolchain, then the application module will point to the specific library or toolchain version; if it were not, then presumably an updated library/toolchain would be breaking the application. Thus, checking for dependencies on deprecated versions can be done via a simple grep of all candidate and production modules. (An obvious exception is if the user is handling the dependencies in their own scripts; this case cannot be planned around.) It is assumed that an identified dependency on a deprecated module would spur rebuilding and testing of the application against newer libraries/toolchains, so critical dependencies on deprecated tools should not often arise in practice.
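
The grep-based dependency check described in item 9 can be sketched as follows. The directory layout, module names, and the deprecated hdf5/1.6.4 build are all illustrative stand-ins for the real /nopt/nrel/apps tree:

```shell
# Build a toy modulefile tree (illustrative names and paths).
root=$(mktemp -d)
mkdir -p "$root/candidate/modulefiles/myapp" "$root/default/modulefiles/otherapp"
printf '%s\n' '#%Module -*- tcl -*-' 'prereq hdf5/1.6.4/openmpi-gcc' \
    > "$root/candidate/modulefiles/myapp/2.0"
printf '%s\n' '#%Module -*- tcl -*-' 'prereq fftw/3.3/openmpi-gcc' \
    > "$root/default/modulefiles/otherapp/1.1"

# Before archiving the deprecated hdf5/1.6.4 module, list every candidate
# or production module that still references it:
grep -rl 'hdf5/1.6.4' "$root/candidate" "$root/default"
```

Only the module that declares the deprecated prerequisite is printed, giving a quick go/no-go list before removal.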
last modified Jul 06, 2015 03:16 PM

[Jul 28, 2017] HPC Environment Modules

Jul 28, 2017 | genomics.upenn.edu
Basic module usage

To know what modules are available, you'll need to run the "module avail" command from an interactive session:

[asrini@consign ~]$ bsub -Is bash
Job <9990024> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
    
[asrini@node063 ~]$ module avail

------------------------------------------------------------------- /usr/share/Modules/modulefiles -------------------------------------------------------------------
NAMD-2.9-Linux-x86_64-multicore dot                             module-info                     picard-1.96                     rum-2.0.5_05
STAR-2.3.0e                     java-sdk-1.6.0                  modules                         pkg-config-path                 samtools-0.1.19
STAR-hg19                       java-sdk-1.7.0                  mpich2-x86_64                   python-2.7.5                    use.own
STAR-mm9                        ld-library-path                 null                            r-libs-user
bowtie2-2.1.0                   manpath                         openmpi-1.5.4-x86_64            ruby-1.8.7-p374
devtoolset-2                    module-cvs                      perl5lib                        ruby-1.9.3-p448


The module names should be pretty self-explanatory, but some are not. To see information about a module, you can issue module show [module name]:

[asrini@node063 ~]$ module show null
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/null:

module-whatis    does absolutely nothing
-------------------------------------------------------------------

[asrini@node063 ~]$ module show r-libs-user
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/r-libs-user:

module-whatis    Sets R_LIBS_USER=$HOME/R/library
setenv           R_LIBS_USER ~/R/library
-------------------------------------------------------------------

[asrini@node063 ~]$ module show devtoolset-2
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/devtoolset-2:

module-whatis    Devtoolset-2 packages include the newer versions of gcc
prepend-path     PATH /opt/rh/devtoolset-2/root/usr/bin
prepend-path     MANPATH /opt/rh/devtoolset-2/root/usr/share/man
prepend-path     INFOPATH /opt/rh/devtoolset-2/root/usr/share/info
-------------------------------------------------------------------

Example use of modules:

[asrini@node063 ~]$ python -V
Python 2.6.6

[asrini@node063 ~]$ which python
/usr/bin/python

[asrini@node063 ~]$ module load python-2.7.5

[asrini@node063 ~]$ python -V
Python 2.7.5

[asrini@node063 ~]$ which python
/opt/software/python/python-2.7.5/bin/python

After running the above commands, you will be able to use Python 2.7.5 until you exit the interactive session or unload the module:

[asrini@node063 ~]$ module unload python-2.7.5

[asrini@node063 ~]$ which python
/usr/bin/python

Modules may also be included in your job scripts and submitted as a batch job.

Using Modules at Login

In order to have modules automatically loaded into your environment, add the module commands to your $HOME/.bashrc file. Note that modules are not available on the PMACS head node, hence you'll need to ensure that your login script attempts to load a module only if you are on a compute node:

[asrini@consign ~]$ more .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
#
#
# Modules to load
if [ $HOSTNAME != "consign.hpc.local" ] && [ $HOSTNAME != "mercury.pmacs.upenn.edu" ]; then
        module load python-2.7.5
fi

# more stuff below .....

[asrini@consign ~]$ which python
/usr/bin/python
[asrini@consign ~]$ bsub -Is bash
Job <172129> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
[asrini@node063 ~]$ which python
/opt/software/python/python-2.7.5/bin/python
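
An alternative to enumerating head-node hostnames in .bashrc is to guard on the command itself. This is a sketch, assuming the module shell function simply is not defined on the head node:

```shell
# Load modules only where the `module` command actually exists,
# instead of hard-coding head-node hostnames.
if command -v module >/dev/null 2>&1; then
    module load python-2.7.5
fi
```

This keeps working if the head node is renamed or another login node is added, at the cost of silently skipping the load on any machine where modules are missing.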

[Jun 19, 2017] Source Repository – OpenHPC

Jun 19, 2017 | www.openhpc.community

Welcome to the OpenHPC site. OpenHPC is a collaborative, community effort that initiated from a desire to aggregate a number of common ingredients required to deploy and manage High Performance Computing (HPC) Linux clusters including provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Packages provided by OpenHPC have been pre-built with HPC integration in mind with a goal to provide re-usable building blocks for the HPC community. Over time, the community also plans to identify and develop abstraction interfaces between key components to further enhance modularity and interchangeability. The community includes representation from a variety of sources including software vendors, equipment manufacturers, research institutions, supercomputing sites, and others.

All of the source collateral related to the OpenHPC integration effort is managed with git and is hosted on GitHub at the following location:

https://github.com/openhpc/ohpc

The top-level organization of the git repository is grouped into three primary categories:

Components

The components/ directory houses all of the build-related and packaging collateral for each individual package currently included within OpenHPC. This generally includes items such as RPM .spec files and any patches applied during the build. Note that packages are generally grouped by functionality and the following functional groupings have been identified:

Note that the above functionality groupings are also used to organize work-item issues on the OpenHPC GitHub site via labels assigned to each component.

Documentation

The docs/ directory in the GitHub repo houses related installation recipes that leverage OpenHPC packaged components.

The documentation is typeset using LaTeX, and companion parsing utilities are used to derive automated installation scripts directly from the raw LaTeX files in order to validate the embedded instructions as part of the continuous integration (CI) process.

Copies of the latest documentation products are available on the Downloads page.

[Jun 19, 2017] How to easily install configure the Torque-Maui open source scheduler in Bright by Robert Stober

Jun 19, 2017 | www.brightcomputing.com
Bright Cluster Manager makes most cluster management tasks very easy to perform, and installing workload managers is one of them. There are many workload managers that are pre-configured, admin-selectable options when you install Bright, including PBS Pro, SLURM, LSF, openlava, Torque, and Grid Engine.

The open source scheduler Maui is not pre-configured, but it's really easy to install and configure this software in Bright Cluster Manager. This article shows you how. The process is to download and install the Maui scheduler, then to configure Bright to use Maui to schedule torque jobs.

Getting Started

Step 1: Download the Maui scheduler from the Adaptive Computing website. You will need to register on their site before you can download it.

Step 2: Install it as shown below. This command will overwrite the Bright zero-length Maui placeholder file.

# cp -f maui-3.3.1.tar.gz /usr/src/redhat/SOURCES/maui-3.3.1.tar.gz

Step 3: Build the Maui RPM.

# rpmbuild -bb /usr/src/redhat/SPECS/maui.spec

Step 4: Install the RPM.

# rpm -ivh /usr/src/redhat/RPMS/x86_64/maui-3.3.1-59_cm6.0.x86_64.rpm

Preparing... ########################################### [100%]

1:maui ########################################### [100%]


In Bright Cluster Manager, select the node that is running the Torque server (usually the head node), then open the "Roles" tab. Configure the "scheduler" property of the Torque Server role to use the Maui scheduler.

Step 5. Load the Torque and Maui modules. This adds the Maui commands to your PATH in the current shell.

$ module load torque

$ module load maui

The "initadd" command adds the Torque and Maui modules to your environment so that next time you log in they're automatically loaded.

$ module initadd torque maui


Step 6. Submit a simple Torque job.

$ qsub stresscpu.sh

5.torque-head.cm.cluster


The job has been submitted and is running.

$ qstat

Job id Name User Time Use S Queue

------------------------- ---------------- --------------- -------- - -----

5.torque-head stresscpu rstober 0 R shortq


The Maui showq command displays information about active, eligible, blocked, and/or recently completed jobs. Since Torque is not actually scheduling jobs (Maui is), the showq command displays the actual job ordering.

$ showq

ACTIVE JOBS--------------------

JOBNAME USERNAME STATE PROC REMAINING STARTTIME

5 rstober Running 1 99:23:59:28 Thu Aug 9 11:40:45

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

Total Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0


The Maui checkjob command displays detailed job information for queued, blocked, active, and recently completed jobs.

$ checkjob 5

checking job 5

State: Running
Creds:  user:rstober  group:rstober  class:shortq  qos:DEFAULT
WallTime: 00:01:31 of 99:23:59:59
SubmitTime: Thu Aug  9 11:40:44
  (Time Queued  Total: 00:00:01  Eligible: 00:00:01)

StartTime: Thu Aug  9 11:40:45
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Allocated Nodes:
[node003.cm.cluster:1]

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE

Reservation '5' (-00:01:31 -> 99:23:58:28  Duration: 99:23:59:59)
PE:  1.00  StartPriority:  1

[Jun 16, 2017] Tutorial - Submitting a job using qsub by Sreedhar Manchu

Notable quotes:
"... PBS_O_HOME (the path to your home directory) ..."
"... PBS_O_LANG (which language you are using) ..."
"... PBS_O_LOGNAME (the name that you logged in with) ..."
"... PBS_O_PATH (standard path to executables) ..."
"... PBS_O_MAIL (location of the user's mail file) ..."
"... PBS_O_SHELL (command shell, i.e. bash, sh, zsh, csh, etc.) ..."
"... PBS_O_HOST (the name of the host upon which the qsub command is running) ..."
"... PBS_SERVER (the hostname of the pbs_server to which qsub submits the job) ..."
"... PBS_O_QUEUE (the name of the original queue to which the job was submitted) ..."
"... PBS_O_WORKDIR (the absolute path of the current working directory of the qsub command) ..."
"... PBS_ARRAYID (each member of a job array is assigned a unique identifier) ..."
"... PBS_ENVIRONMENT (set to PBS_BATCH to indicate the job is a batch job, or to PBS_INTERACTIVE to indicate the job is a PBS interactive job) ..."
"... PBS_JOBID (the job identifier assigned to the job by the batch system) ..."
"... PBS_JOBNAME (the job name supplied by the user) ..."
"... PBS_NODEFILE (the name of the file containing the list of nodes assigned to the job) ..."
"... PBS_QUEUE (the name of the queue from which the job is executed) ..."
"... PBS_WALLTIME (the walltime requested by the user or the default walltime allotted by the scheduler) ..."

Last modified by Yanli Zhang on Jul 10, 2012

qsub Tutorial

  1. Synopsis
  2. What is qsub
  3. What does qsub do?
  4. Arguments to control behavior
Synopsis

qsub [-a date_time] [-A account_string] [-b secs] [-c checkpoint_options] [-C directive_prefix] [-d path] [-D path] [-e path] [-f] [-h] [-I] [-j join] [-k keep] [-l resource_list] [-m mail_options] [-N name] [-o path] [-p priority] [-P user[:group]] [-q destination] [-r c] [-S path_list] [-t array_request] [-u user_list] [-v variable_list] [-V] [-W additional_attributes] [-X] [-z] [script]

The checkpoint_options for -c are:

n - no checkpointing is to be performed.
s - checkpointing is to be performed only when the server executing the job is shut down.
c - checkpointing is to be performed at the default minimum time for the server executing the job.
c=minutes - checkpointing is to be performed at an interval of minutes, which is the integer number of minutes of CPU time used by the job. This value must be greater than zero.

For detailed information, see the qsub man page.

What is qsub?

qsub is the command used for job submission to the cluster. It takes several command line arguments and can also use special directives found in the submission scripts or command file. Several of the most widely used arguments are described in detail below.

Useful Information

For more information on qsub:

$ man qsub


What does qsub do?

Overview

All of our clusters have a batch server, referred to as the cluster management server, running on the headnode. This batch server monitors the status of the cluster and controls/monitors the various queues and job lists. Tied into the batch server, a scheduler makes decisions about how a job should be run and its placement in the queue. qsub interfaces with the batch server and lets it know that there is another job that has requested resources on the cluster. Once a job has been received by the batch server, the scheduler decides its placement and notifies the batch server, which in turn notifies qsub (Torque/PBS) whether the job can be run or not. The current status (whether the job was successfully scheduled or not) is then returned to the user. You may use a command file or STDIN as input for qsub.
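As a concrete sketch of this workflow (the script name and resource values here are arbitrary), a submission script is just a shell script whose #PBS comment lines carry directives for qsub:

```shell
# Create a minimal submission script. The #PBS lines are directives read by
# qsub; to the shell they are ordinary comments, so the script can also be
# run directly as a quick local check.
cat > hello.pbs <<'EOF'
#!/bin/bash
#PBS -N hello
#PBS -l nodes=1:ppn=1,walltime=00:05:00
cd "${PBS_O_WORKDIR:-.}"   # qsub sets PBS_O_WORKDIR; fall back to . locally
echo "Hello from $(hostname)"
EOF

bash hello.pbs   # local check; on the cluster you would run: qsub hello.pbs
```

Running it through qsub instead of bash is what hands the script to the batch server described above.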

Environment variables in qsub

The qsub command will pass certain environment variables in the Variable_List attribute of the job. These variables will be available to the job. The values for the following variables are taken from the environment of the qsub command: HOME, LANG, LOGNAME, PATH, MAIL, SHELL, and TZ.

These values are assigned to new names, each being the original name prefixed with the string "PBS_O_". For example, the job will have access to an environment variable named PBS_O_HOME which has the value of the variable HOME in the qsub command environment.

In addition to these standard environment variables, additional environment variables are available to the job, such as PBS_O_HOST, PBS_O_WORKDIR, PBS_O_QUEUE, PBS_JOBID, PBS_JOBNAME, PBS_NODEFILE, PBS_ARRAYID, and PBS_ENVIRONMENT.
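A job script can read these variables directly. The sketch below fakes two of the values for illustration, since outside a real job qsub has not set them:

```shell
# show_env.pbs -- print a few of the variables qsub passes to the job.
cat > show_env.pbs <<'EOF'
#!/bin/bash
echo "Submitted from host:  ${PBS_O_HOST:-<not set>}"
echo "Submission directory: ${PBS_O_WORKDIR:-<not set>}"
echo "Original queue:       ${PBS_O_QUEUE:-<not set>}"
echo "Job id:               ${PBS_JOBID:-<not set>}"
EOF

# Under qsub these variables are set automatically; here we fake two of them
# to show the PBS_O_ prefixing described above.
PBS_O_HOST=headnode PBS_O_QUEUE=shortq bash show_env.pbs
```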

Arguments to control behavior

As stated before, there are several arguments that you can use to make your jobs behave a specific way. This is not an exhaustive list, but it covers some of the most widely used arguments and many that you will probably need to accomplish specific tasks.

Declare the date/time a job becomes eligible for execution

To set the date/time at which a job becomes eligible to run, use the -a argument. The date/time format is [[[[CC]YY]MM]DD]hhmm[.SS]. If -a is not specified, qsub assumes that the job should be run immediately.

Example

To test -a, get the current time from the command line and add a couple of minutes to it. It was 10:45 when I checked. Pass the result in hhmm form to -a and submit a command from STDIN.

Example: Set the date/time at which a job becomes eligible to run:
$ echo "sleep 30" | qsub -a 1047

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -a 1047
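Rather than reading the clock by hand, the eligibility time can be computed. A sketch assuming GNU date (the -d option is not portable to BSD date), with a made-up script name:

```shell
# Submit a job that becomes eligible two minutes from now. GNU date's -d
# option does the arithmetic; +%H%M matches qsub's hhmm field.
ELIGIBLE=$(date -d '+2 minutes' +%H%M)
echo "qsub -a $ELIGIBLE myjob.pbs"   # myjob.pbs is a placeholder name
```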
Defining the working directory path to be used for the job

To define the working directory path to be used for the job, the -d option can be used. If it is not specified, the default working directory is the home directory.

Example
Example: Define the working directory path to be used for the job:

$ pwd
/home/manchu
$ cat dflag.pbs
echo "Working directory is $PWD"
$ qsub dflag.pbs
5596682.hpc0.local
$ cat dflag.pbs.o5596682
Working directory is /home/manchu
$ mv dflag.pbs random_pbs/
$ qsub -d /home/manchu/random_pbs/ /home/manchu/random_pbs/dflag.pbs
5596703.hpc0.local
$ cat random_pbs/dflag.pbs.o5596703
Working directory is /home/manchu/random_pbs
$ qsub /home/manchu/random_pbs/dflag.pbs
5596704.hpc0.local
$ cat dflag.pbs.o5596704
Working directory is /home/manchu

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -d /home/manchu/random_pbs


Manipulate the output files

By default, all jobs print all stdout (standard output) messages to a file named <job_name>.o<job_id>, and all stderr (standard error) messages to a file named <job_name>.e<job_id>. These files are copied to your working directory when the job finishes. To rename the files or specify a different location for the standard output and error files, use -o for standard output and -e for the standard error file. You can also combine the two streams using -j.

Example
Create a simple submission file:

$ cat sleep.pbs
#!/bin/sh
for i in {1..60} ; do
    echo $i
    sleep 1
done

Submit your job with the standard output file renamed:

$ qsub -o sleep.log sleep.pbs

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -o sleep.log
Submit your job with the standard error file renamed:

$ qsub -e sleep.log sleep.pbs

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -e sleep.log
Combine them using the name sleep.log:

$ qsub -o sleep.log -j oe sleep.pbs

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:

#PBS -o sleep.log
#PBS -j oe

Warning

The order of the two letters after the -j flag is important: it must start with the letter that has already been defined, in this case 'o'.

Place the joined output in a location other than the working directory:

$ qsub -o $HOME/tutorials/logs/sleep.log -j oe sleep.pbs

Mail job status at the start and end of a job

The mailing options are set using the -m and -M arguments. The -m argument sets the conditions under which the batch server will send a mail message about the job, and -M defines the users to whom email will be sent (multiple users can be specified in a comma-separated list). The conditions for the -m argument include:

a - mail is sent when the job is aborted by the batch system
b - mail is sent when the job begins execution
e - mail is sent when the job ends
n - no mail is sent

Example
Using the sleep.pbs script created earlier, submit a job that emails you for all conditions:

$ qsub -m abe -M NetID@nyu.edu sleep.pbs

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:

#PBS -m abe
#PBS -M NetID@nyu.edu

Submit a job to a specific queue

You can select a queue based on walltime needed for your job. Use the 'qstat -q' command to see the maximum job times for each queue.

Example
Submit a job to the bigmem queue:

$ qsub -q bigmem sleep.pbs

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -q bigmem

Submitting a job that is dependent on the output of another

Often you will have jobs that depend on another job's output in order to run. To add a dependency, we need to use the -W (additional attributes) flag with the depend option. We will use the afterok rule, but there are several other rules that may be useful (see man qsub).

Example

To illustrate the ability to hold execution of a specific job until another has completed, we will write two submission scripts. The first will create a list of random numbers. The second will sort those numbers. Since the second script will depend on the list that is created we will need to hold execution until the first has finished.

random.pbs:

$ cat random.pbs
#!/bin/sh
cd $HOME
sleep 120
for i in {1..100}; do
    echo $RANDOM >> rand.list
done

sort.pbs:

$ cat sort.pbs
#!/bin/sh
cd $HOME
sort -n rand.list > sorted.list
sleep 30

Once the files are created, let's see what happens when they are submitted at the same time:

Submit at the same time:

$ qsub random.pbs ; qsub sort.pbs
5594670.hpc0.local
5594671.hpc0.local
$ ls
random.pbs  sorted.list  sort.pbs  sort.pbs.e5594671  sort.pbs.o5594671
$ cat sort.pbs.e5594671
sort: open failed: rand.list: No such file or directory

Since they both ran at the same time, the sort script failed because the file rand.list had not been created yet. Now submit them with the dependencies added.

Submit them with the dependencies added:

$ qsub random.pbs
5594674.hpc0.local
$ qsub -W depend=afterok:5594674.hpc0.local sort.pbs
5594675.hpc0.local
$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5594674.hpc0.loc manchu   ser2     random.pbs        18029   1   1    --  48:00 R 00:00
5594675.hpc0.loc manchu   ser2     sort.pbs             --   1   1    --  48:00 H    --

We now see that the sort.pbs job is in a hold state. And once the dependent job completes the sort job runs and we see:

Job status with the dependencies added:

$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5594675.hpc0.loc manchu   ser2     sort.pbs          18165   1   1    --  48:00 R    --

Useful Information

Submitting multiple jobs in a loop that depend on output of another job

This example shows how to submit multiple jobs in a loop, where each job depends on the output of the job submitted before it.

Example

Let's say we need to write the numbers from 0 to 999999, in order, to a file output.txt. We can do 10 separate runs to achieve this, where each run has a separate pbs script writing 100,000 numbers to the output file. Let's see what happens if we submit all 10 jobs at the same time.

The script below creates required pbs scripts for all the runs.

Create PBS scripts for all the runs:

$ cat creation.sh
#!/bin/bash
for i in {0..9}
do
cat > pbs.script.$i << EOF
#!/bin/bash
#PBS -l nodes=1:ppn=1,walltime=600
cd \$PBS_O_WORKDIR
for ((i=$((i*100000)); i<$(((i+1)*100000)); i++)) {
    echo "\$i" >> output.txt
}
exit 0;
EOF
done
Change permissions to make it executable:

$ chmod u+x creation.sh

Run the script:

$ ./creation.sh
List of created PBS scripts:

$ ls -l pbs.script.*
-rw-r--r-- 1 manchu wheel 134 Oct 27 16:32 pbs.script.0
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.1
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.2
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.3
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.4
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.5
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.6
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.7
-rw-r--r-- 1 manchu wheel 139 Oct 27 16:32 pbs.script.8
-rw-r--r-- 1 manchu wheel 140 Oct 27 16:32 pbs.script.9
PBS script:

$ cat pbs.script.0
#!/bin/bash
#PBS -l nodes=1:ppn=1,walltime=600
cd $PBS_O_WORKDIR
for ((i=0; i<100000; i++)) {
    echo "$i" >> output.txt
}
exit 0;
Submit multiple jobs at a time:

$ for i in {0..9}; do qsub pbs.script.$i ; done
5633531.hpc0.local
5633532.hpc0.local
5633533.hpc0.local
5633534.hpc0.local
5633535.hpc0.local
5633536.hpc0.local
5633537.hpc0.local
5633538.hpc0.local
5633539.hpc0.local
5633540.hpc0.local
$
output.txt:

$ tail output.txt
699990
699991
699992
699993
699994
699995
699996
699997
699998
699999
$ grep -n 999999 output.txt
210510:999999
$

This clearly shows the numbers are not in the order we wanted, because all the runs wrote to the same file at the same time.

Let's submit jobs using qsub dependency feature. This can be achieved with a simple script shown below.

Simple script to submit multiple dependent jobs:

$ cat dependency.pbs
#!/bin/bash
job=`qsub pbs.script.0`
for i in {1..9}
do
    job_next=`qsub -W depend=afterok:$job pbs.script.$i`
    job=$job_next
done
Make it executable:

$ chmod u+x dependency.pbs
Submit the dependent jobs by running the script:

$ ./dependency.pbs
$ qstat -u manchu

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5633541.hpc0.loc manchu   ser2     pbs.script.0      28646   1   1    --  00:10 R    --
5633542.hpc0.loc manchu   ser2     pbs.script.1         --   1   1    --  00:10 H    --
5633543.hpc0.loc manchu   ser2     pbs.script.2         --   1   1    --  00:10 H    --
5633544.hpc0.loc manchu   ser2     pbs.script.3         --   1   1    --  00:10 H    --
5633545.hpc0.loc manchu   ser2     pbs.script.4         --   1   1    --  00:10 H    --
5633546.hpc0.loc manchu   ser2     pbs.script.5         --   1   1    --  00:10 H    --
5633547.hpc0.loc manchu   ser2     pbs.script.6         --   1   1    --  00:10 H    --
5633548.hpc0.loc manchu   ser2     pbs.script.7         --   1   1    --  00:10 H    --
5633549.hpc0.loc manchu   ser2     pbs.script.8         --   1   1    --  00:10 H    --
5633550.hpc0.loc manchu   ser2     pbs.script.9         --   1   1    --  00:10 H    --
$
Output after the first run:

$ tail output.txt
99990
99991
99992
99993
99994
99995
99996
99997
99998
99999
$
Output after the final run:

$ tail output.txt
999990
999991
999992
999993
999994
999995
999996
999997
999998
999999
$ grep -n 100000 output.txt
100001:100000
$ grep -n 999999 output.txt
1000000:999999
$

This shows that the numbers are written in order to output.txt, which in turn shows that each job ran only after the successful completion of the one before it.

Opening an interactive shell to the compute node

To open an interactive shell to a compute node, use the -I argument. This is often used in conjunction with the -X (X11 forwarding) and -V (pass all of the user's environment) arguments.

Example
Open an interactive shell to a compute node:
$ qsub -I

Passing an environment variable to your job

You can pass user defined environment variables to a job by using the -v argument.

Example

To test this we will use a simple script that prints out an environment variable.

Passing an environment variable:

$ cat variable.pbs
#!/bin/sh
if [ "x" == "x$MYVAR" ] ; then
    echo "Variable is not set"
else
    echo "Variable says: $MYVAR"
fi

Next, use qsub without -v and check your standard output file:

qsub without -v:

$ qsub variable.pbs
5596675.hpc0.local
$ cat variable.pbs.o5596675
Variable is not set

Then use -v to set the variable:

qsub with -v:

$ qsub -v MYVAR="hello" variable.pbs
5596676.hpc0.local
$ cat variable.pbs.o5596676
Variable says: hello

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -v MYVAR="hello"

Useful Information

Multiple user-defined environment variables can be passed to a job at a time.

Passing multiple variables:

$ cat variable.pbs
#!/bin/sh
echo "$VAR1 $VAR2 $VAR3" > output.txt
$ qsub -v VAR1="hello",VAR2="Sreedhar",VAR3="How are you?" variable.pbs
5627200.hpc0.local
$ cat output.txt
hello Sreedhar How are you?
$

Passing your environment to your job

You may declare that all of your environment variables are passed to the job by using the -V argument in qsub.

Example

Use qsub to perform an interactive login to one of the nodes:

Passing your environment: qsub with -V:
$ qsub -I -V

Handy Hint

This option can be added to a pbs script with an equivalent PBS directive:
#PBS -V

Once the shell is opened, use the env command to check that your environment was passed to the job correctly. You should still have access to all the modules you loaded previously.

Submitting an array job: Managing groups of jobs

Job arrays are submitted with the -t flag, and each member of the array gets its own value of the PBS_ARRAYID environment variable; the first job of an array submitted with -t 0-4, for example, would have PBS_ARRAYID set to 0. This allows you to create job arrays where each job in the array performs slightly different actions based on the value of this variable, such as performing the same tasks on different input files. One other difference in the environment between jobs in the same array is the value of the PBS_JOBNAME variable.

Example

First we need to create data to be read. Note that in a real application, this could be data, configuration setting or anything that your program needs to run.

Create Input Data

To create input data, run this simple one-liner:

Creating input data:

$ for i in {0..4}; do echo "Input data file for an array $i" > input.$i ; done
$ ls input.*
input.0  input.1  input.2  input.3  input.4
$ cat input.0
Input data file for an array 0

Submission Script
Submission script: array.pbs:

$ cat array.pbs
#!/bin/sh
#PBS -l nodes=1:ppn=1,walltime=5:00
#PBS -N arraytest
cd ${PBS_O_WORKDIR}   # Take me to the directory where I launched qsub
# This part of the script handles the data. In a real world situation you will
# probably be using an existing application.
cat input.${PBS_ARRAYID} > output.${PBS_ARRAYID}
echo "Job Name is ${PBS_JOBNAME}" >> output.${PBS_ARRAYID}
sleep 30
exit 0;

Submit & Monitor

Instead of running five qsub commands, we can simply enter:

Submitting and monitoring an array of jobs:

$ qsub -t 0-4 array.pbs
5534017[].hpc0.local

qstat
$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534017[].hpc0.l sm4082   ser2     arraytest            --   1   1    --  00:05 R    --

$ qstat -t -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534017[0].hpc0. sm4082   ser2     arraytest-0       12017   1   1    --  00:05 R    --
5534017[1].hpc0. sm4082   ser2     arraytest-1       12050   1   1    --  00:05 R    --
5534017[2].hpc0. sm4082   ser2     arraytest-2       12084   1   1    --  00:05 R    --
5534017[3].hpc0. sm4082   ser2     arraytest-3       12117   1   1    --  00:05 R    --
5534017[4].hpc0. sm4082   ser2     arraytest-4       12150   1   1    --  00:05 R    --

$ ls output.*
output.0  output.1  output.2  output.3  output.4
$ cat output.0
Input data file for an array 0
Job Name is arraytest-0

pbstop

By default, pbstop doesn't show all the jobs in an array; instead, it shows the whole array as a single line in the job information. Pressing 'A' shows all the jobs in the array. The same can be achieved with the command-line option '-A'. This option, along with '-u <NetID>', shows all of your jobs, array jobs as well as normal jobs.

$ pbstop -A -u $USER

Note

Typing 'A' expands/collapses array job representation.

Comma delimited lists

The -t option of qsub also accepts comma delimited lists of job IDs so you are free to choose how to index the members of your job array. For example:

Comma-delimited lists:

$ rm output.*
$ qsub -t 2,5,7-9 array.pbs
5534018[].hpc0.local
$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534018[].hpc0.l sm4082   ser2     arraytest            --   1   1    --  00:05 Q    --

$ qstat -t -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534018[2].hpc0. sm4082   ser2     arraytest-2       12319   1   1    --  00:05 R    --
5534018[5].hpc0. sm4082   ser2     arraytest-5       12353   1   1    --  00:05 R    --
5534018[7].hpc0. sm4082   ser2     arraytest-7       12386   1   1    --  00:05 R    --
5534018[8].hpc0. sm4082   ser2     arraytest-8       12419   1   1    --  00:05 R    --
5534018[9].hpc0. sm4082   ser2     arraytest-9       12452   1   1    --  00:05 R    --

$ ls output.*
output.2  output.5  output.7  output.8  output.9
$ cat output.2
Input data file for an array 2
Job Name is arraytest-2

A more general for loop - Arrays with step size

By default, PBS doesn't allow array jobs with a step size: qsub -t 0-10 <pbs.script> increments PBS_ARRAYID by 1. To submit jobs in steps of a certain size, say a step size of 3 starting at 0 and ending at 10, one would have to do

qsub -t 0,3,6,9 <pbs.script>

To make this easy for users, we have put in place a wrapper that accepts the starting point, ending point, and step size as arguments to the -t flag. This removes the default requirement that PBS_ARRAYID increment by 1. The above request can be accomplished with the following (the expansion happens behind the scenes in the wrapper):

qsub -t 0-10:3 <pbs.script>

Here, 0 is the starting point, 10 the ending point, and 3 the step size. The starting point need not be 0; it can be any number. Incidentally, in a situation in which the upper bound is not equal to the lower bound plus an integer multiple of the increment, for example

qsub -t 0-10:3 <pbs.script>

the wrapper automatically changes the upper bound, as shown in the example below.

Arrays with step size:

[sm4082@login-0-0 ~]$ qsub -t 0-10:3 array.pbs
6390152[].hpc0.local
[sm4082@login-0-0 ~]$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
6390152[].hpc0.l sm4082   ser2     arraytest            --   1   1    --  00:05 Q    --

[sm4082@login-0-0 ~]$ qstat -t -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
6390152[0].hpc0. sm4082   ser2     arraytest-0       25585   1   1    --  00:05 R    --
6390152[3].hpc0. sm4082   ser2     arraytest-3       28227   1   1    --  00:05 R    --
6390152[6].hpc0. sm4082   ser2     arraytest-6        8515   1   1    --  00:05 R 00:00
6390152[9].hpc0. sm4082   ser2     arraytest-9         505   1   1    --  00:05 R    --

[sm4082@login-0-0 ~]$ ls output.*
output.0  output.3  output.6  output.9
[sm4082@login-0-0 ~]$ cat output.9
Input data file for an array 9
Job Name is arraytest-9
[sm4082@login-0-0 ~]$

Note

By default, PBS doesn't support arrays with step size. On our clusters, it's been achieved with a wrapper. This option might not be there on clusters at other organizations/schools that use PBS/Torque.
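The wrapper itself is site-specific, but its core is just range expansion. A minimal sketch (the function name is made up) of turning start-end:step into the comma-delimited list that stock Torque accepts:

```shell
# Expand "start-end:step" into a comma-delimited list of array indices,
# e.g. 0-10:3 -> 0,3,6,9 (the upper bound clamps to the last reachable id).
expand_t() {
  local start=${1%%-*} rest=${1#*-}
  local end=${rest%%:*} step=${rest#*:}
  local i ids=""
  for ((i = start; i <= end; i += step)); do
    ids=${ids:+$ids,}$i
  done
  printf '%s\n' "$ids"
}

expand_t 0-10:3                            # prints 0,3,6,9
# qsub -t "$(expand_t 0-10:3)" array.pbs   # what the wrapper effectively runs
```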

Note

If you're trying to submit jobs through ssh to the login nodes from your pbs scripts, with a statement such as
ssh login-0-0 "cd ${PBS_O_WORKDIR};`which qsub` -t 0-10:3 <pbs.script>"

arrays with step size won't work unless you either add

shopt -s expand_aliases

to your pbs script (if it is in bash), or add it to your .bashrc in your home directory. Adding this makes the alias for qsub take effect, thereby making the wrapper act on the command-line options to qsub. (For that matter, this brings any alias into effect for commands executed via SSH.)

If you have

#PBS -t 0-10:3

in your pbs script, you don't need to add this to either your pbs script or your .bashrc.

A List of Input Files/Pulling data from the ith line of a file

Suppose we have a list of 1000 input files, rather than input files explicitly indexed by a suffix, in a file file_list.text, one per line:

A list of input files / pulling data from the ith line of a file:

[sm4082@login-0-2 ~]$ cat array.list
#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,walltime=1:00:00
INPUT_FILE=`awk "NR==$PBS_ARRAYID" file_list.text`
#
# ...or use sed:
# sed -n -e "${PBS_ARRAYID}p" file_list.text
#
# ...or use head/tail:
# $(cat file_list.text | head -n $PBS_ARRAYID | tail -n 1)
#
./executable < $INPUT_FILE

In this example, sed's '-n' option suppresses all output except that which is explicitly printed (on the line equal to PBS_ARRAYID).

qsub -t 1-1000 array.list

Let's say you have a list of 1000 numbers in a file, one number per line. For example, the numbers could be random number seeds for a simulation. For each task in an array job, you want to get the ith line from the file, where i equals PBS_ARRAYID, and use that value as the seed. This is accomplished by using the Unix head and tail commands or awk or sed just like above.

[sm4082@login-0-2 ~]$ cat array.seed
#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1,walltime=1:00:00
SEEDFILE=~/data/seeds
SEED=$(cat $SEEDFILE | head -n $PBS_ARRAYID | tail -n 1)
~/programs/executable $SEED > ~/results/output.$PBS_ARRAYID
qsub -t 1-1000 array.seed 

You can use this trick for all sorts of things. For example, if your jobs all use the same program but with very different command-line options, you can list all the options in a file, one set per line; the exercise is basically the same as above, and you only have two files to handle (or three, if you have a perl script generate the file of command lines).
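A sketch of that command-line-options variant (the file name, option names, and program are made up): each line of opts.list holds one set of arguments, and task i pulls line i, exactly as with the seed file above:

```shell
# One set of command-line options per line; array task $PBS_ARRAYID uses
# the line whose number matches its id.
cat > opts.list <<'EOF'
--alpha 0.1 --seed 11
--alpha 0.5 --seed 42
--alpha 0.9 --seed 77
EOF

PBS_ARRAYID=2                          # set by PBS inside a real array job
OPTS=$(awk "NR==$PBS_ARRAYID" opts.list)
echo "would run: ./executable $OPTS"   # ./executable is a placeholder
```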

Delete
Delete all jobs in array

We can delete all the jobs in an array with a single command.

Deleting an array of jobs:

$ qsub -t 2-5 array.pbs
5534020[].hpc0.local
$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534020[].hpc0.l sm4082   ser2     arraytest            --   1   1    --  00:05 R    --

$ qdel 5534020[]
$ qstat -u $USER
$

Delete a single job in array

Delete single jobs in an array, e.g. numbers 4, 5, and 7.

Deleting single jobs in an array:

$ qsub -t 0-8 array.pbs
5534021[].hpc0.local
$ qstat -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534021[].hpc0.l sm4082   ser2     arraytest            --   1   1    --  00:05 Q    --

$ qstat -t -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534021[0].hpc0. sm4082   ser2     arraytest-0       26618   1   1    --  00:05 R    --
5534021[1].hpc0. sm4082   ser2     arraytest-1       14271   1   1    --  00:05 R    --
5534021[2].hpc0. sm4082   ser2     arraytest-2       14304   1   1    --  00:05 R    --
5534021[3].hpc0. sm4082   ser2     arraytest-3       14721   1   1    --  00:05 R    --
5534021[4].hpc0. sm4082   ser2     arraytest-4       14754   1   1    --  00:05 R    --
5534021[5].hpc0. sm4082   ser2     arraytest-5       14787   1   1    --  00:05 R    --
5534021[6].hpc0. sm4082   ser2     arraytest-6       10711   1   1    --  00:05 R    --
5534021[7].hpc0. sm4082   ser2     arraytest-7       10744   1   1    --  00:05 R    --
5534021[8].hpc0. sm4082   ser2     arraytest-8        9711   1   1    --  00:05 R    --

$ qdel 5534021[4]
$ qdel 5534021[5]
$ qdel 5534021[7]
$ qstat -t -u $USER

hpc0.local:
                                                                   Req'd  Req'd   Elap
Job ID           Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
---------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
5534021[0].hpc0. sm4082   ser2     arraytest-0       26618   1   1    --  00:05 R    --
5534021[1].hpc0. sm4082   ser2     arraytest-1       14271   1   1    --  00:05 R    --
5534021[2].hpc0. sm4082   ser2     arraytest-2       14304   1   1    --  00:05 R    --
5534021[3].hpc0. sm4082   ser2     arraytest-3       14721   1   1    --  00:05 R    --
5534021[6].hpc0. sm4082   ser2     arraytest-6       10711   1   1    --  00:05 R    --
5534021[8].hpc0. sm4082   ser2     arraytest-8        9711   1   1    --  00:05 R    --

$ qstat -t -u $USER
$

[Feb 08, 2017] SGE, Torque, PBS: What's the best choice for an NGS dedicated cluster?

Feb 08, 2017 | www.biostars.org
Question: SGE, Torque, PBS: What's the best choice for an NGS dedicated cluster?

abihouee wrote:

Sorry, it may be off topics...

We plan to install a scheduler on our cluster (DELL blade cluster over Infiniband storage on Linux CentOS 6.3). This cluster is dedicated to do NGS data analysis.

It seems to me that the most common one is SGE, but since Oracle bought the stuff, there are several alternative developments (Open Grid Scheduler, Son of Grid Engine, Univa Grid Engine...).

Another possible scheduler is Torque/PBS.

I'm a little bit lost in this scheduler forest! Is there someone with experience of these, or who knows of an existing benchmark?

Thanks a lot. Audrey


I worked with SGE for years at a genome center in Vancouver; it seemed to work quite well. Now I'm at a different genome center where we use LSF but are considering switching to SGE, which is ironic because we are trying to transition from Oracle DB to Postgres to get away from Oracle... SGE and LSF seem to offer similar functionality and performance as far as I can tell. Both clusters have several thousand CPUs.

— Malachi Griffith

openlava (source code) is an open-source fork of LSF that, while lacking some features, works fairly well.

— Malachi Griffith

Torque is fine and very well tested; either of the SGE forks is widely used in this sort of environment, and SGE has qmake, which some people are very fond of. SLURM is another good possibility.

— Jonathan Dursi

matted wrote:

I can only offer my personal experiences, with the caveat that we didn't do a ton of testing and so others may have differing opinions.

We use SGE, which installs relatively nicely on Ubuntu with the standard package manager (the gridengine-* packages). I'm not sure what the situation is on CentOS.
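For reference, a minimal sketch of such an install from the distribution packages (package names are from the Debian/Ubuntu gridengine-* set; the exact selection depends on whether the host is the master or an execution node):

```shell
# Sketch: install SGE from distribution packages on Debian/Ubuntu.
# gridengine-master goes on the qmaster host, gridengine-exec on compute nodes.
sudo apt-get install gridengine-common gridengine-client
sudo apt-get install gridengine-master   # qmaster host only
sudo apt-get install gridengine-exec     # execution hosts only
```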

We previously used Torque/PBS, but the scheduler performance seemed poor and it bogged down with lots of jobs in the queue. When we switched to SGE, we didn't have any problems. This might be a configuration error on our part, though.

When I last tried Condor (several years ago), installation was quite painful and I gave up. I believe it claims to work in cross-platform environments, which might be interesting if, for example, you want to send jobs to Windows workstations.

LSF is another option, but I believe the licenses cost a lot.

My overall impression is that once you get a system running in your environment, they're mostly interchangeable (once you adapt your submission scripts a bit). The ease with which you can set them up does vary, however. If your situation calls for "advanced" usage (MPI integration, Kerberos authentication, strange network storage, job checkpointing, programmatic job submission with DRMAA, etc. etc.), you should check to see which packages seem to support your world the best.


Recent versions of Torque have improved a great deal for large numbers of jobs, but yes, that was a real problem.

I also agree that all are more or less fine once they're up and working, and the main way to decide which to use would be to either (a) just pick something future users are familiar with, or (b) pick some very specific things you want to be able to accomplish with the resource manager/scheduler and start finding out which best support those features/workflows.

-- Jonathan Dursi

Jeremy Leipzig (Philadelphia, PA) wrote:

Unlike PBS, SGE has qrsh, a command that actually runs jobs in the foreground, allowing a script to easily know when a job is done. What will they think of next?
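A sketch of the difference (the script names are hypothetical): qsub returns as soon as the job is queued, while qrsh blocks until the job finishes and propagates its exit status, so a driver script can chain steps on success.

```shell
# qsub returns immediately; you have to poll qstat to learn the job's fate.
qsub align.sh

# qrsh runs the job in the foreground and returns its exit code,
# so the next pipeline stage can be chained directly (scripts hypothetical).
qrsh -cwd ./align.sh && ./postprocess.sh
```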

This is one area where I think the support you pay for going commercial might be worthwhile. At least you'll have someone to field your complaints.


EDIT: Some versions of PBS also have qsub -W block=true, which works in a very similar way to SGE's qrsh.

-- Sean Davis

You must have a newer version than me:

>qsub -W block=true dothis.sh 
qsub: Undefined attribute  MSG=detected presence of an unknown attribute
>qsub --version
version: 2.4.11

-- Jeremy Leipzig

For Torque, and perhaps for versions of PBS without -W block=true, you can use the following two switches. The behaviour is similar, but any qsub options embedded in the script will be ignored, and stderr/stdout is sent to the shell.

qsub -I -x dothis.sh
-- matt.demaere

My answer should be updated to say that any DRMAA-compatible cluster engine is fine, though running jobs through DRMAA (e.g. Snakemake --drmaa) instead of through the batch scheduler directly may anger your sysadmin, especially if they are not familiar with scientific computing standards.
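For instance, a Snakemake invocation along these lines submits each rule through the DRMAA library instead of shelling out to qsub (the quoted resource string is illustrative and scheduler-specific):

```shell
# Submit up to 100 concurrent jobs via DRMAA; the quoted string is passed
# to the scheduler as native submission options (values illustrative).
snakemake --drmaa " -cwd -pe smp {threads}" --jobs 100
```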

Using qsub -I just to get an exit code is not OK.

-- Jeremy Leipzig

Torque definitely allows interactive jobs -

qsub -I

As for Condor, I've never seen it used within a cluster; it was designed back in the day for farming out jobs between diverse resources (e.g., workstations after hours) and would have a lot of overhead for working within a homogeneous cluster. Scheduling jobs between clusters, maybe?

-- Jonathan Dursi

Ashutosh Pandey (Philadelphia) wrote:

We use Rocks Cluster Distribution that comes with SGE.

http://en.wikipedia.org/wiki/Rocks_Cluster_Distribution


+1 Rocks - If you're setting up a dedicated cluster, it will save you a lot of time and pain.

-- mike.thon

I'm not a huge Rocks fan personally, but one huge advantage, especially (but not only) if you have researchers who use XSEDE compute resources in the US, is that you can use the XSEDE campus bridging Rocks rolls, which bundle up a large number of relevant software packages as well as the cluster-management stuff. That also means you can directly use XSEDE's extensive training materials to help get the cluster's new users up to speed.

-- Jonathan Dursi

samsara wrote:

I have been using SGE for processing NGS data for more than a year and have not experienced any problems with it. I am happy with it. The only other scheduler I have used is Slurm, a few times.

richard.deborja (Canada) wrote:

I used SGE at my old institute; we're currently using PBS, and I really wish we had SGE on the new cluster. The things I miss most are qmake and the qsub "-sync y" option; those two were complete pipeline savers. I also appreciated SGE's MPI integration. I'm not sure how well that works with PBS, as we currently don't have it installed.
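With "-sync y", qsub itself blocks until the job completes and exits with the job's return code, which is what makes it useful inside pipelines; a minimal sketch (the script name is hypothetical):

```shell
# qsub -sync y waits for the job and propagates its exit status,
# so a failed step aborts the pipeline immediately (script hypothetical).
qsub -sync y -cwd variant_call.sh || exit 1
```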

joe.cornish826 (United States) wrote:

NIH's Biowulf system uses PBS, but most of my gripes about PBS are really about the typical user load. PBS always looks for the next smallest job, so your 30-node run that would take an hour can get stuck behind hundreds (or thousands) of single-node jobs that take a few hours each. Other than that it seems to work well enough.

In my undergrad, our cluster (UMBC's Tara) used SLURM. We didn't have as many problems there, but usage was different: more nodes per user (82 nodes with ~100 users) and more MPI-based jobs. However, a grad student in my old lab did manage to crash the head nodes because we were rushing to rerun a ton of jobs two days before a conference. I think that was likely a result of the head-node hardware and not SLURM. Made for a few good laughs.


"PBS always looks for the next smallest job" -- just so people know, that's not something inherent to PBS. It's a configurable choice made by the scheduler (probably Maui in this case), and you can easily configure the scheduler so that bigger jobs don't get starved out by little jobs that get "backfilled" into temporarily open slots.
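With the common Torque+Maui combination, for example, this is controlled in maui.cfg rather than in PBS itself; a sketch, with the parameter values purely illustrative:

```
# maui.cfg sketch: hold reservations for the top queued jobs so that
# backfilled small jobs cannot starve them (values illustrative).
RESERVATIONPOLICY   CURRENTHIGHEST
RESERVATIONDEPTH    2
BACKFILLPOLICY      FIRSTFIT
```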

-- Jonathan Dursi

Part of it is that Biowulf looks for the next smallest job, but it also prioritizes by how much CPU time a user has been consuming. If I've run 5 jobs of 30 24-core nodes, each taking 2 hours of wall time, I've used roughly 7,200 CPU-hours. Someone using a single core on each node (say, because of memory requirements) is basically at a 1:1 ratio between wall time and CPU time, so it will take a while for their CPU-hours to catch up to mine.
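The accounting arithmetic is easy to check (numbers from the example above; every allocated core is charged for the full wall time):

```shell
# CPU-hours = jobs * nodes_per_job * cores_per_node * wall_hours
# Five 30-node jobs on 24-core nodes, 2 hours of wall time each:
echo $((5 * 30 * 24 * 2))    # prints 7200

# A user given one core per node accrues only 1 CPU-hour per wall-clock hour,
# so their usage total catches up to the big user's very slowly.
echo $((1 * 1 * 1 * 1))      # prints 1
```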

It is a pain, but unlike math/physics/etc. there are fewer programs in bioinformatics that make use of message passing (and when they do, they don't always need a low-latency interconnect), so it makes more sense to tune PBS for the generic case. This behavior is mostly seen on the Ethernet-interconnect nodes; there's a much smaller (245-node) system set up with InfiniBand for jobs that really need it (e.g. MrBayes, structural work).

Still, I wish they'd try to strike a better balance. I'm guilty of it myself, but it stinks when the queue gets clogged with memory-intensive Python/Perl/R scripts that probably wouldn't need so much memory if they were written in C/C++/etc.

[Mar 02, 2016] Son of Grid Engine version 8.1.9 is available

Mar 02, 2016 | liv.ac.uk

README

This is Son of Grid Engine version 8.1.9.

See <http://arc.liv.ac.uk/repos/darcs/sge-release/NEWS> for information on recent changes. See <https://arc.liv.ac.uk/trac/SGE> for more information.

The .deb and .rpm packages and the source tarball are signed with PGP key B5AEEEA9.

* sge-8.1.9.tar.gz, sge-8.1.9.tar.gz.sig:  Source tarball and PGP signature

* RPMs for Red Hat-ish systems, installing into /opt/sge with GUI
  installer and Hadoop support:

  * gridengine-8.1.9-1.el5.src.rpm:  Source RPM for RHEL, Fedora

  * gridengine-*8.1.9-1.el6.x86_64.rpm:  RPMs for RHEL 6 (and
    CentOS, SL)

  See < https://copr.fedorainfracloud.org/coprs/loveshack/SGE/ > for
  hwloc 1.6 RPMs if you need them for building/installing RHEL5 RPMs.

* Debian packages, installing into /opt/sge, not providing the GUI
  installer or Hadoop support:

  * sge_8.1.9.dsc, sge_8.1.9.tar.gz:  Source packaging.  See
    <http://wiki.debian.org/BuildingAPackage> , and see
    < http://arc.liv.ac.uk/downloads/SGE/support/  > if you need (a more
    recent) hwloc.

  * sge-common_8.1.9_all.deb, sge-doc_8.1.9_all.deb,
    sge_8.1.9_amd64.deb, sge-dbg_8.1.9_amd64.deb: Binary packages
    built on Debian Jessie.

* debian-8.1.9.tar.gz:  Alternative Debian packaging, for installing
  into /usr.

* arco-8.1.6.tar.gz:  ARCo source (unchanged from previous version)

* dbwriter-8.1.6.tar.gz:  compiled dbwriter component of ARCo
  (unchanged from previous version)

More RPMs (unsigned, unfortunately) are available at < http://copr.fedoraproject.org/coprs/loveshack/SGE/ >.

[Nov 08, 2015] 2013 Keynote: Dan Quinlan: C++ Use in High Performance Computing Within DOE: Past and Future

YouTube.com: At 31 min there is an interesting slide with information about the scale of systems at DOE. The current system has 18,700 nodes; the new system will have 50K to 500K nodes with 32 cores per node (power consumption is ~15 MW, equal to that of a small city). The cost is around $200M.
Jun 09, 2013 | YouTube

https://www.youtube.com/watch?v=zZGYfM1iM7c

[Jan 30, 2014] 12 Best HPC Blogs To Follow

January 30, 2014 | brightcomputing.com

News Blogs

  1. HPCwire (hpcwire.com) - while not strictly a blog, HPCwire is a great source of short articles covering HPC news and opinion pieces written by their professional journalists. A handy feature is their independent RSS feeds that let you keep abreast of specific topics.
  2. InsideHPC (insidehpc.com) - is another reliable source of HPC industry news. While they cover many of the same stories as HPCwire, insideHPC often brings a different perspective.
  3. HPCinthecloud (hpcinthecloud.com) - is the sister site of HPCwire that focuses on high-end cloud computing in science, industry and the data center. It's a good source of news, ideas, and inspiration if you have an interest in combining HPC and Cloud.
  4. The Register HPC (theregister.co.uk/data_centre/hpc/) - Brings HPC news & opinions from around the world. Their card-like interface makes it easy to scan for stories that interest you.

Vendor Blogs

  1. Cray Computing Blog (blog.cray.com) - Cray has been an important name in supercomputing since - well, since forever - and their blog reflects that heritage. You'll find long, thoughtful posts on a range of supercomputing topics there.
  2. High Performance Computing (HPC) at Dell (hpcatdell.com) - Dell plays a big role in today's HPC market, with Dell servers bearing the load for many a compute cluster. Their blog tends to be news-oriented, but it also contains a number of good thought leadership pieces too.
  3. Cisco HPC Blog (blogs.cisco.com/tag/hpc/) - Cisco is a relative newcomer to HPC but their lead HPC blogger, Jeff Squyres, brings a veteran's passion to the task. This blog tends to focus on interesting technical aspects of HPC.
  4. Altair (www.simulatetoinnovate.com) - The folks at Altair have a very active blog that covers a range of HPC-related topics. Sometimes it's news about Altair, but often they cover industry news too, and interesting topics such as, "Multiphysics: Towards the Perfect Golf Swing" which talks about golf as an engineering problem. Fun. The modern layout of their page makes it a pleasure to scan.

Other Blogs

  1. Forrester HPC (http://blogs.forrester.com/category/hpc) - It's not a very active blog, and definitely not the place to look for news, but if you're looking for a straightforward analytic viewpoint on HPC, Forrester's blog is a good place to look.
  2. ISC HPC Blog (https://www.isc-events.com/isc13/blogs.html) - This community blog is hosted by the folks that put on the International Supercomputing Conference. It's a good source for thought-provoking articles about the science of supercomputing.
  3. HPC Notes (www.hpcnotes.blogspot.com) is a good source for news and information about HPC. It's run by the Vice President of HPC at a consulting firm, but his coverage makes a point of being independent.
  4. Marc Hamilton's Blog (http://marchamilton.wordpress.com) - This is a personal blog that captures Marc Hamilton's interests in HPC and... well... running. So while it's not strictly an HPC blog, I've included it here because his HPC-related posts can be interesting and varied. Just be prepared to skip over his occasional posts about running shoes.

There you have it. That's our list of go-to blogs about HPC. How does it line up with yours? Did I miss any good ones? Are there some on our list you think don't deserve to be there? Let me know.

[Jan 14, 2014] HPC Lessons for the Wider Enterprise World

Is HPC so specialized that the lessons learned from large-scale infrastructure (at all layers) are not transferrable to mirrored challenges in large-scale enterprise settings?

Put another way, are the business-critical problems that companies tackle really so vastly different than the associated hardware and software issues that large supercomputing centers have already faced and in many areas, overcome? Granted, there is already a significant amount of HPC to be found in enterprise datacenters worldwide in a number of areas-oil and gas, financial services, the life sciences, government and more. But as everything in technology seems bent on convergence, is there not a wider application for HPC-driven technologies in an expanding set of markets?

This is the first part of a series of focused pieces around these framing questions about HPC's map into the wider world. The sections of our extended special feature will target HPC-to-enterprise lessons in terms of hardware and infrastructure; software and applications; management at scale; cloud computing; big data; accelerators and more. But to kick things off, we wanted to build consensus around some of the main themes and ideas behind any movement that's happening (or needs to) as HPC lessons trickle into the scale, efficiency, performance and data-conscious world of the modern enterprise.

In some circles, HPC is viewed from afar as an academic-only landscape, dotted with rare peaks representing actual enterprise use. Of course, those inside supercomputing know that this portrait is limited: HPC has a strong foothold in the areas mentioned above, and tremendous potential to reshape new areas that either thought HPC was out of reach or are using HPC but simply don't use the term. What is needed is a comprehensive view of how HPC can be broadly useful to critical segments of enterprise IT, and that's what we intend to offer over the next couple of weeks.

The answer to whether there are a multitude of lessons HPC can teach the wider enterprise world, at least according to those we've spoken with for this series, is a resounding yes. If there's any disagreement, it's on how those lessons translate, which of them are truly unique to the HPC experience, and, of course, which hold the most promise for improved productivity and competitiveness in a given application area.

Addison Snell, CEO of Intersect360 Research, whose research group follows the overlap between enterprise and HPC, made some parallels to put the question in context. "Traditionally, one of the characteristics that separated HPC from enterprise computing was that HPC featured jobs that would run to completion, and there would be a benefit in completing them faster, such as running a weather forecast, simulating a crash test, or searching for proteins that fit together with a given molecule." By contrast, he says, enterprise environments are designed to run in steady state (email systems, CRM databases, etc.). "HPC purchases would tend to be driven by performance, with relatively faster adoption of new technologies, while enterprise computing was driven by reliability, with slower technology adoption."

"Early adopters and bellwethers in high performance computing are always the first to encounter new challenges as they push the limits of computation and data management," Herb Schultz from IBM's Technical Computing and Analytics group argued. He says that many of the challenges faced in the world of high performance computing "later come to haunt the broader commercial IT community." "How first movers respond to challenges with new technologies and improved techniques establishes a proven foundation that the next waves of users can exploit."

As Fritz Ferstl, CTO at Univa, told us, there are essentially three "divisions" of the HPC industry. There are the national labs and big science organizations; enterprise commercial HPC (as found in the expected verticals, including oil and gas, financial services, life sciences, etc.); and there is "a third not often recognized as HPC but rather as data-centric analysis, also known as big data."

Ferstl says the lab-level HPC category is "specific in that its leading edge requires tightly coupled architectures with the densest network interconnects, which drive up cost and complexity. They are geared toward running few ultra-large applications that demand aggregate memory and would take unacceptable amounts of runtime if not executed on such large systems." One step away from this are the commercial sectors that rely on HPC for their competitive edge. Here, Ferstl notes, whether it's new reservoirs of oil and gas being explored, next-generation products like cars or airplanes being designed and tested, or innovative drugs being discovered, "there would be no progress in any of these cases and many more if it wasn't for HPC as a key instrument for investigation, design, development, experimentation and validation."

But final on his list-and crucial to the enterprise transition (and HPC's lessons to teach it) is the heavy subject of data. What's really driving this forward motion of HPC tech into the enterprise is that buzzword we just can't get away from these days. Some might argue that the trend has actually been one of the best things that's happened for HPC's ability to propel into the wider enterprise world.

Snell commented that, "today, especially with big data analytics, more companies are encountering performance-sensitive applications that run to completion-at least in terms of iterations." He said his research has revealed that new categories of non-HPC enterprise users are emerging, all of whom are considering performance and scalability as top purchase criteria. "In some cases," he said, "these enterprises can be just as likely to explore new technologies as HPC users have been for years."

Some argue that in general, aside from being a question of data pressures, business need, and competitive edge, the real lessons HPC can teach are about talent and R&D capability. As Paul Dlugosch, Automata product director at Micron, described, "One of the first lessons that come to mind is that people matter. While the HPC industry often celebrates our accomplishments on the basis of technical and performance benchmarks, the cost of achieving those benchmarks is often not discussed. The cost of system and semiconductor development can be easy enough to quantify. It is far more difficult, though, to determine the 'use' cost of advanced technologies. While the raw power of our semiconductors and systems is immense, it is the organic part of the system, the human being, that is emerging as a significant bottleneck," said Dlugosch.

"Fully exploiting the parallelism that exists in many high performance computing systems continues to absorb incredible amounts of human resources," he argued. "Given the large scale of commercial/enterprise data centers, it is just as important to pay close attention to this human factor. The HPC industry is certainly aware of this problem and is developing new architectures, tools and methodologies to improve human productivity. As commercial and enterprise data centers grow in capability and scale it will become just as important to consider the productivity of the humans involved in system programming, management and scaling."

It should be noted that on any level of this question, it's not a simple matter of teaching from the top down. While HPC has solved a number of problems in some of the most challenging data and compute environments, especially in terms of scale, data movement, and application complexity, there are elements that can filter from the enterprise setting to HPC, even that "big national lab" variety Ferstl describes.

There is general agreement that there are multiple lessons that high performance computing can carry into mainstream enterprise environments, no matter what vertical is involved. But on the flipside, there has been general agreement that many innovations are spinning out of the new class of enterprise environments-that the web scale companies with their bare-bones hardware running open source, natively developed, and purpose-built, nimble applications-have something to offer the supercomputing world as well.

Jason Stowe, CEO of HPC cloud company Cycle Computing, put it best when he told us, "We in HPC pay attention to the fastest systems in the world: the fastest CPUs, interconnects, and benchmarks. From petaflops to petabytes, we [in HPC] publish and analyze these numbers unlike any other industry... While we'll continue to measure things like LINPACK, utilization, and queue wait times, we're now looking at things like Dollars per Unit Science and Dollars per Simulation, which, ironically, are lessons that have been learned from enterprise."

From the people who power both enterprise and HPC systems to the functional elements of the machines and how they differ, there are just as many new questions that emerge in the other direction: what can HPC lend to large-scale business operations?

Stay tuned over the next two weeks as this series expands and hones in on specific issues and topics that influence how enterprises will look to HPC for answers to solving scale, data, management and other challenges.

[Dec 17, 2013] Cambridge U Deploys UK's Fastest Academic-Based Supercomputer By Leila Meyer

12/11/13 | http://campustechnology.com

The University of Cambridge in England has deployed the fastest academic-based supercomputer in the United Kingdom as part of the new Square Kilometer Array (SKA) Open Architecture Lab, a multinational organization that is building the world's largest radio telescope.

The university built the new supercomputer, named Wilkes, in partnership with Dell, NVIDIA, and Mellanox. The system consists of 128 Dell T620 servers and 256 NVIDIA K20 GPUs (graphics processing units) connected by 256 Mellanox Connect IB cards. The system has a computational performance of 240 teraFLOPS (floating-point operations per second) and ranked 166th on the November 2013 Top500 list of supercomputers.

The Wilkes system also has a performance of 3,631 megaFLOPS per watt and ranked second in the November 2013 Green500 list that ranks supercomputers by energy efficiency. According to the university, this extreme energy efficiency is the result of the very high performance per watt provided by the NVIDIA K20 GPUs and the energy efficiency of the Dell T620 servers.

The system uses Mellanox's FDR InfiniBand solution as the interconnect. The dual-rail network was built using Mellanox's Connect-IB adapter cards, which provide throughput of 100 gigabits per second (Gbps) with a message rate of 137 million messages per second. The system also uses NVIDIA RDMA communication acceleration to significantly increase the system's parallel efficiency.

The Wilkes supercomputer is partly funded by the Science and Technology Facilities Council (STFC) to drive the Square Kilometer Array computing system development in the SKA Open Architecture Lab. According to Gilad Shainer, vice president of marketing at Mellanox, the supercomputer will "enable fundamental advances in many areas of astrophysics and cosmology."

The Cambridge High Performance Computing Service (HPCS) is home to another supercomputer, named Darwin, which ranked 234th on the November 2013 Top500 list of supercomputers.

Recommended Links


Softpanorama Recommended

Top articles

Sites

Get products and technologies

Tutorials

Papers

Other Cockcroft columns at www.sun.com

Books

System Performance Tuning

Oracle and Unix Performance Tuning ~ Usually ships in 24 hours
Ahmed Alomari / Paperback / Published 1997
Amazon price: $35.96 ~ You Save: $8.99 (20%)
Aix Performance Tuning ~ Usually ships in 2-3 days
Frank Waters / Paperback / Published 1996
Amazon price: $63.00
Optimizing Unix for Performance ~ Usually ships in 24 hours
Amir H. Majidimehr / Paperback / Published 1995
Amazon price: $40.00
Solaris Performance Administration : Performance Measurement, Fine Tuning, and Capacity Planning for Releases 2.5.1 and 2.6 ~ Usually ships in 24 hours
H. Frank Cervone / Paperback / Published 1998
Amazon price: $35.96 ~ You Save: $8.99 (20%)
Sun Performance and Tuning : Java and the Internet ~ Usually ships in 24 hours
Adrian Cockcroft, et al / Paperback / Published 1998
Amazon price: $40.80 ~ You Save: $10.20 (20%)
System Performance Tuning (Nutshell Handbooks) ~ Usually ships in 2-3 days
Michael Kosta Loukides, Mike Loukides / Paperback / Published 1991
Amazon price: $23.96 ~ You Save: $5.99 (20%)
UNIX Performance Tuning; Sys Admin-Essential Reference Series ~ Usually ships in 2-3 days
Sys Admin Magazine(Editor) / Paperback / Published 1997
Amazon price: $23.96 ~ You Save: $5.99 (20%)
Hp-Ux Tuning and Performance : Concepts, Tools and Methods (Hewlett-Packard Professional Books)
Robert F. Sauers, Peter Weygant / Paperback / Published 1999
Amazon price: $45.00 (Not Yet Published -- On Order)
Sun Performance and Tuning : Sparc & Solaris
Adrian Cockcroft / Paperback / Published 1994
(Publisher Out Of Stock)
Taming UNIX : UNIX Performance Management Series
Robert A. Lund / Spiral-bound / Published 1997
Amazon price: $59.95 (Special Order)


Etc



Copyright © 1996-2018 by Dr. Nikolai Bezroukov. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

 

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links, as it develops like a living tree...

You can use PayPal to make a contribution, supporting development of this site and speeding up access. In case softpanorama.org is down, you can use the mirror at softpanorama.info.

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: August 17, 2018