24 July 2015

614. SIESTA with MPI and acml on debian jessie

One of my students might be using SIESTA for some simulations, and a first step towards that is to set it up on my cluster.

This isn't an optimised build -- right now I'm just looking at having a simple parallel build that runs.

I had a look at http://www.pa.msu.edu/people/tomanek/SIESTA-installation.html and http://pelios.csx.cam.ac.uk/~mc321/siesta.html.
 
NOTE: don't use the int64 acml or openblas BLAS libs, or you'll get a SIGSEGV due to an invalid memory reference when running. NWChem is the complete opposite, and for some reason both the int64 and regular acml libs have the same names. Not sure how that's supposed to work out on a system with nwchem, which needs the int64 libs.
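A quick way to see which acml flavour a binary will pick up at run time is to look at the `ldd` output. The sketch below assumes the int64 build lives in a directory whose name contains `_int64`, as in the paths used in this post; the function name `acml_flavour` is made up for illustration.

```shell
#!/bin/sh
# Read `ldd` output on stdin and report which acml build libacml resolves to.
# Assumption: the int64 library directory has "_int64" in its name.
acml_flavour() {
    acml_line=$(grep 'libacml' || true)
    case "$acml_line" in
        '')        echo "none" ;;      # binary does not link against acml
        *_int64*)  echo "int64" ;;     # would crash SIESTA, per the note above
        *)         echo "regular" ;;
    esac
}

# Usage: ldd /opt/siesta/siesta-3.2-pl-5/Obj/siesta | acml_flavour
```

If this prints `int64` for the siesta binary, the dynamic linker is finding the wrong library first.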

See here for acml on debian. I've got /opt/acml/acml5.3.1/gfortran64_int64/lib in my /etc/ld.so.conf.d/acml.conf on behalf of nwchem.
 Being lazy, I opted for the debian scalapack and libblacs packages:
 
sudo apt-get install libscalapack-mpi-dev libblacs-mpi-dev libopenmpi-dev

To get the link to the SIESTA code, go to http://departments.icmab.es/leem/siesta/CodeAccess/selector.html

Then, if you're an academic, you can do:
sudo mkdir /opt/siesta
sudo chown $USER /opt/siesta
cd /opt/siesta
wget http://departments.icmab.es/leem/siesta/CodeAccess/Code/siesta-3.2-pl-5.tgz
tar xvf siesta-3.2-pl-5.tgz
cd siesta-3.2-pl-5/Obj
sh ../Src/obj_setup.sh
*** Compilation setup done. ***
Remember to copy an arch.make file or run configure as:
    ../Src/configure [configure_options]
../Src/./configure --help
`configure' configures siesta 2.0 to adapt to many kinds of systems.

Usage: ./configure [OPTION]... [VAR=VALUE]...
[..]
Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX [PREFIX]

By default, `make install' will install all the files in
`/usr/local/bin', `/usr/local/lib' etc. You can specify
an installation prefix other than `/usr/local' using `--prefix',
for instance `--prefix=$HOME'.
[..]
  --enable-mpi            Compile the parallel version of SIESTA
  --enable-debug          Compile with debugging support
  --enable-fast           Compile with best known optimization flags

Optional Packages:
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
  --with-netcdf=<lib>     use NetCDF library
  --with-siesta-blas      use BLAS library packaged with SIESTA
  --with-blas=<lib>       use BLAS library
  --with-siesta-lapack    use LAPACK library packaged with SIESTA
  --with-lapack=<lib>     use LAPACK library
  --with-blacs=<lib>      use BLACS library
  --with-scalapack=<lib>  use ScaLAPACK library
[..]
../Src/./configure --enable-mpi
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
[..]
checking for mpifc... no
checking for mpxlf... no
checking for mpif90... mpif90
checking for MPI_Init... no
checking for MPI_Init in -lmpi... yes
[..]
checking for sgemm in /opt/openblas/lib/libopenblas.so... yes
checking LAPACK already linked... yes
checking LAPACK includes divide-and-conquer routines... yes
configure: using DC_LAPACK routines packaged with SIESTA due to bug in library. Linker flag might be needed to avoid duplicate symbols
configure: creating ./config.status
config.status: creating arch.make
Edit arch.make:
#
# This file is part of the SIESTA package.
#
# Copyright (c) Fundacion General Universidad Autonoma de Madrid:
# E.Artacho, J.Gale, A.Garcia, J.Junquera, P.Ordejon, D.Sanchez-Portal
# and J.M.Soler, 1996- .
#
# Use of this software constitutes agreement with the full conditions
# given in the SIESTA license, as signed by all legitimate users.
#
.SUFFIXES:
.SUFFIXES: .f .F .o .a .f90 .F90

SIESTA_ARCH=x86_64-unknown-linux-gnu--unknown

FPP=
FPP_OUTPUT=
FC=mpif90
RANLIB=ranlib

SYS=nag

SP_KIND=4
DP_KIND=8
KINDS=$(SP_KIND) $(DP_KIND)

FFLAGS=-g -O2
FPPFLAGS= -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
LDFLAGS=

ARFLAGS_EXTRA=

FCFLAGS_fixed_f=
FCFLAGS_free_f90=
FPPFLAGS_fixed_F=
FPPFLAGS_free_F90=

BLAS_LIBS=-L/opt/acml/acml5.3.1/gfortran64/lib -lacml
LAPACK_LIBS=
BLACS_LIBS=-L/usr/lib -lblacs-openmpi -lblacsCinit-openmpi
SCALAPACK_LIBS=-L/usr/lib -lscalapack-openmpi

COMP_LIBS=dc_lapack.a

NETCDF_LIBS=
NETCDF_INTERFACE=

MPI_LIBS= -L/usr/lib/openmpi/lib -lmpi -lmpi_f90

LIBS=$(SCALAPACK_LIBS) $(BLACS_LIBS) $(LAPACK_LIBS) $(BLAS_LIBS) $(NETCDF_LIBS) $(MPI_LIBS) -lpthread

#SIESTA needs an F90 interface to MPI
#This will give you SIESTA's own implementation
#If your compiler vendor offers an alternative, you may change
#to it here.
MPI_INTERFACE=libmpi_f90.a
MPI_INCLUDE=.

#Dependency rules are created by autoconf according to whether
#discrete preprocessing is necessary or not.
.F.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_fixed_F) $<
.F90.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_free_F90) $<
.f.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_fixed_f) $<
.f90.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_free_f90) $<
make
cd ../
ln -s Obj/siesta siesta

I added /opt/siesta/siesta-3.2-pl-5 to $PATH.
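For the $PATH step, something like the following in `~/.bashrc` avoids piling up duplicate entries each time the file is sourced. This is a minimal sketch; the helper name `path_append` is hypothetical and not part of any package.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/siesta/siesta-3.2-pl-5
export PATH
```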

To test, edit /opt/siesta/siesta-3.2-pl-5/test.mk:
#SIESTA=../../../siesta
SIESTA=mpirun -n 2 ../../../siesta
Then
cd /opt/siesta/siesta-3.2-pl-5/Tests/h3po4_2
export LD_LIBRARY_PATH=/opt/acml/acml5.3.1/gfortran64/lib
make
>>>> Running h3po4_2 test...
==> Copying pseudopotential file for H...
==> Copying pseudopotential file for O...
==> Copying pseudopotential file for P...
==> Running SIESTA as mpirun -n 2 ../../../siesta
===> SIESTA finished successfully

Also, look at work/h3po4_2.out:
* Running on    2 nodes in parallel
>> Start of run:  24-JUL-2015  21:58:13

                           ***********************       
                           *  WELCOME TO SIESTA  *       
                           ***********************       

reinit: Reading from standard input
[..]
elaps:  optical           1       0.000       0.000     0.00
  
>> End of run:  24-JUL-2015  21:58:20
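
To check from a script that a run really was parallel, the process count can be pulled out of the output header. A small sketch, assuming the header line looks exactly like the "Running on ... nodes in parallel" line above; the function name `siesta_nodes` is invented for illustration.

```shell
# Print the process count SIESTA reports in its output header.
# Fields of the matching line: "*", "Running", "on", <count>, ...
siesta_nodes() {
    awk '/Running on .* nodes in parallel/ { print $4; exit }' "$1"
}

# Usage: siesta_nodes work/h3po4_2.out
```

For the run above this should report 2, matching the `mpirun -n 2` in test.mk.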

07 July 2015

613. Debian Jessie: Turn off update pop-ups in gnome, and switching to lxterminal, nemo and ksnapshot

I originally installed the OS on my desktop back in 2010 (Lenny) and haven't treated it very nicely: I've mixed releases and repos, installed, uninstalled and replaced packages with my own compiled ones, and fiddled with a few too many things. So when I began having issues compiling ECCE on my desktop -- issues that weren't present on any freshly installed system -- I decided to start over and install debian anew. I went straight for Debian Jessie, although I had many reasons to stay with Wheezy, such as systemd and, more importantly, the fact that Sun GridEngine is completely missing in Debian Jessie! See here.

Luckily the debian wheezy package works quite well on jessie -- but that's pure luck. I'm currently on the fence between hoping that SGE continues to work well until the SID version trickles down to backports (IF it does) and learning how to set up SLURM instead.

Either way, here are a few things that annoyed me in Jessie (more specifically they annoyed me in GNOME) and that had to be fixed:

* Turn off update notifications
I hate being bugged by notifications about updating/upgrading my system. I'm not running windows -- it's unbecoming of a linux desktop to behave like that.

To fix it, go to Settings, Personal/Notifications and untick Package Updater

(simple -- you just need to know it's there)

* Nautilus doesn't do extra pane anymore. Bye bye nautilus.
Instead, install nemo, which is the rightful heir to nautilus. It pulls in a lot of dependencies, but it's worth it.

To make nemo default do
me@beryllium:$ xdg-mime query default inode/directory
org.gnome.Nautilus.desktop
me@beryllium:$ xdg-mime default nemo.desktop inode/directory application/x-gnome-saved-search
me@beryllium:$ xdg-mime query default inode/directory
nemo.desktop
* Gnome-terminal doesn't do transparency anymore. Bye bye gnome-terminal.
Instead, install lxterminal. To set it as default in both the OS and gnome, in the terminal do
gsettings set org.cinnamon.desktop.default-applications.terminal exec lxterminal
sudo update-alternatives --config x-terminal-emulator
* Adding firefox and thunderbird to default applications
I also felt compelled to install firefox and add it to the list of available applications in gnome so I could set it as the default browser. Edit /var/lib/dpkg/alternatives/x-www-browser so that it reads
 auto
/usr/bin/x-www-browser
x-www-browser.1.gz
/usr/share/man/man1/x-www-browser.1.gz

/usr/bin/chromium
40

/usr/bin/firefox-bin
70

/usr/bin/iceweasel
70
/usr/share/man/man1/iceweasel.1.gz
I also followed this post: http://verahill.blogspot.com.au/2013/11/530-briefly-adding-new-entry-to-default.html
For thunderbird, I only followed the latter post.
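
Hand-editing files under /var/lib/dpkg is a bit fragile. `update-alternatives --install <link> <name> <path> <priority>` should register the same entry without touching the file directly. This is not what was done in this post, just a sketch; it assumes firefox lives at /usr/bin/firefox-bin with priority 70 as in the listing above, and the helper name `browser_alternative_cmds` is made up. The helper only prints the commands so they can be reviewed before running them with sudo.

```shell
# Print (rather than run) the commands that register a browser as an
# x-www-browser alternative and then let you pick the default.
browser_alternative_cmds() {
    browser=$1
    priority=$2
    echo "update-alternatives --install /usr/bin/x-www-browser x-www-browser $browser $priority"
    echo "update-alternatives --config x-www-browser"
}

browser_alternative_cmds /usr/bin/firefox-bin 70
# Review the two printed lines, then run each one with sudo.
```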

* To disable screen saver completely:
gsettings set org.gnome.settings-daemon.plugins.power active false 
* I've got no idea how to elegantly exorcise bijiben-shell-search-provider, which keeps making my CPU usage spike. It's not nice.

* Note that dragging windows no longer works with Alt+left click -- instead use the windows key (Super key). Why this old standard behaviour got nuked I don't understand.

* I installed ksnapshot and set it as the default for the Print Screen key instead of the crippled gnome-screenshot. Yes, it pulls in a lot of dependencies, but it's worth it.

Looking at the list above I'm slowly realising that it's probably time to say goodbye to gnome for good. It's not going to a place where I want to follow it.

Pity.

There are a few things that I like about gnome 3. Well, there's a single thing that I like that got introduced: quickly searching for programs in the Activities Overview. Turns out that the applications menu wasn't that necessary after all.

The removal of features from nautilus, screenshot, terminal etc annoys me a lot though. Same goes for the removal of the minimize button.

Finally, I only find gnome useable once I've installed the gnome extensions by frippery. Stock gnome is useless to me.

09 June 2015

612. Randomly Rebooting Router (E2500-AU v1.0 w/ TomatoUSB)

Rolling update:
* 24 June 2015: 7 days uptime with wifi working perfectly. Did reboot it last night because my work computer lost contact with the router somehow (connects via reverse tunnel). The issue with the Randomly Rebooting Router can be considered solved. Obviously, it's solved by crippling the router by turning off the 5 GHz band and switching from AES to TKIP (the latter may not be related though).

* Submitted a bug report: http://tomato.groov.pl/?page_id=334&bugerator_nav=display&bug_project=1&issue=1833

* 16 June 2015 12:42 AEST.  The router has been up for two days and four hours (and counting) in spite of heavy use of our phones. Seems like turning off 5 GHz and/or switching from AES to TKIP has worked. A fair criticism is that I don't have much of a baseline to compare with when it comes to reboots, but subjectively there's been a lot less swearing over crappy wifi the past two days.

* 14 June 2015 08:05 AEST. After two days of uptime when radio-silence was enforced, we turned our phones back on. The router rebooted later that night. Same thing happened the next night. After briefly putting dd-wrt on the router, I put tomatousb back on it, turned off 5 GHz and changed from AES to TKIP. The router has been up since 9.30 pm last night (10 h and counting)

Found this bug report: http://tomato.groov.pl/?page_id=334&bugerator_nav=display&bug_project=1&issue=1813

Also read this with interest: http://movingpackets.net/2013/11/18/linksys-e2500-deserves-no-airplay/

I've seen posts that find that dd-wrt doesn't have the randomly rebooting issue. dd-wrt doesn't support dual band, at least on e2500. I was surprised that v1 of the cisco linksys firmware had the same exact issue (random reboots when 5 GHz is on). It's all pointing in a specific direction.

Not sure why using the 5 GHz band with my laptop didn't trigger the reboots. Maybe it did, but they happened less frequently due to the lower number of 5 GHz capable devices prior to us getting the phones.

* 10 June 2015 16:24 AEST. Since turning off wifi on the Samsung Galaxy S4 phones (but using the two laptops and the tablet listed below) the router has stayed up for 1 day, 8 hours and 11 minutes, and counting. The night between Monday and Tuesday, when we were using our phones, the router rebooted at least twice.


This is another one of those posts that don't offer a solution, but rather state a problem. I'm doing this in the hope that others who are making the same observations as I am will see this and...well, feel slightly less alone at the very least. In the best case, someone will have a solution and offer it as a comment.

So, here's the issue: 
* I have a Linksys E2500-AU v1.0 that is running TomatoUSB (howto)
Tomato v1.28.0000 MIPSR2-128 K26 USB Max
========================================================
Welcome to the Linksys E2500 v1.0 [TomatoUSB]
Uptime: 08:14:26 up 1 min
Load average: 0.52, 0.18, 0.06
Mem usage: 28.4% (used 17.06 of 59.96 MB)
WAN : 192.168.1.100/24 @ 58:6D:8F:D3:XX:XX
LAN : 192.168.2.1/24 @ DHCP: 192.168.2.2 - 192.168.2.202
WL0 : volatile @ channel: AU13 @ 58:6D:8F:D3:XX:XX
WL1 : volatile50 @ channel: AU153 @ 00:01:36:1F:XX:XX
========================================================
* It has a "Broadcom BCM5357 chip rev 2 pkg 8"

* For a long time it, and its predecessor (a WRT-54GL), were running just fine. The predecessor got replaced due to a fried power supply.

* Over the past six-seven months there have been issues with the wireless signal dropping. It isn't just the wireless transmission being stopped and restarted, but the router actually reboots (according to uptime).

* We used to have the following wireless devices: Fujitsu lifebook (v100?), Thinkpad SL410, Google Nexus One and a HTC Legend. At some point we also got a Samsung Galaxy Tab 2. This configuration was running for a few years.

* Coinciding roughly with the perceived start of the rebooting issue was me purchasing a Samsung Galaxy S4 (i9505).

* The issue got a lot worse recently.

* Recently my partner also got a Samsung Galaxy S4 (i9505).

* I have an almost identical router (v2) at work, and the current uptime is 140 days. I do connect very occasionally via wireless to it using my Samsung Galaxy S4. However, this router has a "Broadcom BCM5357 chip rev 1 pkg 8".

What seems to be happening:
The Samsung Galaxy S4 phones seem to be destabilising the router and causing reboots. No, wait, hear me out. It shouldn't happen, and the adage about 'correlation vs causation' may well apply in this case too, but there are precedents (apparently) when it comes to Intel wireless devices:

On http://www.linksysinfo.org/index.php?threads/tomatousb-keeps-resetting-the-router.33208/ from 2010
Hrm...routers used to spontaneously reboot when the wireless driver failed on Tomato as a result of an Intel (mobile) wireless driver bug on Windows. Maybe similar?
On http://www.linksysinfo.org/index.php?threads/random-reboots.21020/ from 2007
Currently using DD-wrt V24, it's been up for 25 days I can confirm it has something to do with the broadcom wireless drivers.

and
What kind of wireless device does your laptop have? Is it Intel 2100/2200 by any chance?
And in the end the thread concludes that it was due to users with Intel 2100/2200 cards.

The Samsung Galaxy S4 has a Qualcomm Snapdragon 600 APQ8064AB, which is a system-on-a-chip. The Nexus One and HTC Legend also had snapdragons, but obviously much older models. The Galaxy Tab 2 seems to have a Texas Instruments chip (TI OMAP 4430).

Could it be that the phones are causing the issues?

Luckily it's something that's reasonably easy to test, so I'm looking forward to reporting back in a couple of days (of enforced radio silence).

A different test will be to swap routers (but not power supplies) between work and home and see if the behaviour is location dependent. That will take a lot more effort though due to the very specific set-ups.

As the logs get erased on reboot I'm tracking the uptime from now on using autossh and logging from a work computer that's always on.
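
The uptime tracking can be sketched roughly as below. It assumes key-based SSH access to the router (here through whatever host alias the tunnel provides) and that /proc/uptime and `cut` are available on the router; the function names, log file and state file are all invented for illustration, and the autossh tunnel setup is left out.

```shell
#!/bin/sh
# Uptime (in seconds) can only go down between two polls if the box restarted.
rebooted() {
    prev=$1
    curr=$2
    [ "$curr" -lt "$prev" ]
}

# Poll once: fetch uptime over SSH, compare with the last value, log the result.
# $1 = ssh host alias, $2 = log file, $3 = state file (all hypothetical)
poll_router() {
    host=$1
    log=$2
    state=$3
    curr=$(ssh "$host" 'cut -d. -f1 /proc/uptime') || return 1
    [ -n "$curr" ] || return 1
    prev=$(cat "$state" 2>/dev/null || echo 0)
    if rebooted "$prev" "$curr"; then
        echo "$(date '+%F %T') REBOOT detected (uptime ${curr}s, was ${prev}s)" >> "$log"
    else
        echo "$(date '+%F %T') up ${curr}s" >> "$log"
    fi
    echo "$curr" > "$state"
}

# Usage, e.g. from cron every five minutes:
# poll_router router-tunnel "$HOME/router-uptime.log" "$HOME/.router-uptime.last"
```

Because the state is kept on the logging machine, a reboot shows up in the log even though the router's own logs are wiped.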


Some more:
Below is a post regarding iphones, and rebooting routers.

While that post is not related to 5 GHz causing issues, it's got me thinking that as neither the Fujitsu, Galaxy Tab, Nexus One nor HTC Legend supports 5 GHz but the Samsung Galaxy S4 phones do, the issue may be related to that. There are obviously quite a lot of things to test.
ADDITIONAL INFORMATION: in the mean time we have been checking and elimination as well. We have been trying to connect certain wireless devices to the network through the DAP and something odd has come up. We have been trying mobile phones (smartphones) at first. My own telephone (Samsung Galaxy S3) seems to cause no troubles. With that phone connected for a whole day, internet connection does not fail once (the router does not reboot). I have been trying both 5Ghz and 2.4 Ghz. bands, both worked okay. One colleague also wanted to connect his Apple Iphone 3G (S?) to the network. I told him he could but this phone could not find the 5Ghz network, so I have switched back to 2.4Ghz again. The iPhone connected and within 5 minutes the connection interrupted. I have set the network back to 5Ghz (so the iPhone could no longer connect) and changed the network settings again. This morning I switched back to 2.4 with only my phone connected. Not a problem. This afternoon I let my colleague connect his phone again and disconnected my Samsung. Within 5 minutes the router started to reboot! After I had the iPhone disconnected again and let another colleague connect his phone, a Samsung Galaxy S(1). So far no problems.

Tomato Anon
Somehow the Spontaneously Rebooting Router doesn't show up here, while the stable one does: http://anon.groov.pl/index.php?country=Australia

Either way, the anon database is a great way of quickly finding out what Tomato version you can put on your router.