Showing posts with label nwchem 6.3. Show all posts

25 November 2013

531. Briefly: NWChem 6.3 -- issues with planewave (PSPW) module and AMD FX8150 and 8350 CPUs

This is more of an announcement or warning than a proper blog post:

Both the FX8350 and the FX8150 have trouble running the pspw module, with calculations ending in exploding structures:
http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id1059/Issue_with_pspw_using_nwchem_6.3....html

My other nodes have no trouble running the job in question. Also, the issue was only found in nwchem 6.3 -- nwchem 6.1.1 worked fine. So it's not an FX8150/FX8350-related fault per se.
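If you run a mixed cluster, it can help to flag the affected nodes before submitting pspw jobs. Here is a minimal sketch, assuming Linux nodes that expose `/proc/cpuinfo`; the function name and the pattern for the model string are my own assumptions, not anything that comes with nwchem:

```python
import re
from pathlib import Path

# Assumed pattern for the two CPUs named in this post; the exact
# "model name" string reported by the kernel may differ on your node.
AFFECTED = re.compile(r"FX\D{0,8}(8150|8350)", re.IGNORECASE)


def affected_model(cpuinfo_text):
    """Return the first 'model name' value that looks like an FX8150/FX8350, else None."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "model name" and AFFECTED.search(value):
            return value.strip()
    return None


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # On systems without /proc/cpuinfo there is nothing to inspect.
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    model = affected_model(text)
    if model:
        print(f"Warning: {model} detected; double-check pspw results from nwchem 6.3.")
    else:
        print("No FX8150/FX8350 detected.")
```

This only tells you which CPU a node has. It says nothing about whether a given nwchem build is actually affected.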

Again, see the post at the nwchem-sw.org site for more information.

04 November 2013

526. New release of NWChem 6.3 out (17th of October 2013)

A new release of nwchem 6.3 (release 2) came out recently. As usual there's no public announcement, no release notes, or anything else that actually tells you whether there are critical bug fixes, new functionality or other changes.

The new version can be found here: http://www.nwchem-sw.org/download.php?f=Nwchem-6.3.revision2-src.2013-10-17.tar.gz

I'm not competent to tell you whether you should upgrade or not, but here's a list of the changed files:
nwchem-6.3.revision2-src.2013-10-17/INSTALL
nwchem-6.3.revision2-src.2013-10-17/src/config/makefile.h
nwchem-6.3.revision2-src.2013-10-17/src/dplot/create_contour.F
nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot_dump.F
nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot.F
nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot_input.F
nwchem-6.3.revision2-src.2013-10-17/src/dplot/get_transden.F
nwchem-6.3.revision2-src.2013-10-17/src/mcscf/detci/detci_spin.F
nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_analysis.F
nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_davidson.F
nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_init.F
nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_residual.F
nwchem-6.3.revision2-src.2013-10-17/src/nwpw/nwpwlib/Parallel/Parallel-tcgmsg.F
nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_input.F
nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_hybrid_2eorb_split.F
nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_zones_4a_disk_ga_chop_N5.F
nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_zones_4a_disk_ga_N5.F
nwchem-6.3.revision2-src.2013-10-17/src/tools/GNUmakefile
nwchem-6.3.revision2-src.2013-10-17/src/util/util_nwchem_version.F

In other words, there have been changes to the TDDFT module, the dplot module, TCE etc.
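If you want to build a list of changed files like the one above yourself, here is a minimal sketch using Python's standard `filecmp` module. It assumes both tarballs have been extracted side by side; the function name `changed_files` is my own. It only reports files that exist in both trees with different contents, so entries that `diff -r` reports as "Only in ..." are not listed.

```python
import filecmp
import os


def changed_files(old_root, new_root):
    """Return sorted relative paths of files whose contents differ between two trees."""
    changed = []

    def walk(cmp, prefix):
        # dircmp's own diff_files relies on a shallow (os.stat based) check,
        # so compare the common files byte by byte instead.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        # Files that could not be compared are listed too, to be safe.
        changed.extend(os.path.join(prefix, name) for name in mismatch + errors)
        # Recurse into directories present in both trees.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(old_root, new_root), "")
    return sorted(changed)
```

Calling `changed_files("nwchem-6.3-src.2013-05-28", "nwchem-6.3.revision2-src.2013-10-17")` from the directory holding both extracted trees returns the paths relative to the tree roots, which you can then print one per line.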

Looking through the nwchem forum, I think the following posts may hint at what's been changed:

tddft: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id889/Possible_Bug_in_NWCHEM%3A_TD-B97.html

dplot: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id1013/Dplot_output_charge_density%2C_tot....html. The integrated (electron) density is printed now.

Not sure about the TCE though.

Here's the (almost full) diff -r output:
Only in nwchem-6.3-src.2013-05-28/QA/tests: dplot
diff -r nwchem-6.3-src.2013-05-28/src/config/makefile.h nwchem-6.3.revision2-src.2013-10-17/src/config/makefile.h
2c2
< # $Id: makefile.h 24201 2013-05-09 00:59:44Z edo $
---
> # $Id: makefile.h 24592 2013-09-24 18:49:32Z jhammond $
1171c1171
<         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c22)
---
>         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c22)
1173c1173
<         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c24)
---
>         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c24)
1305c1305
<         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c22)
---
>         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c22)
1307c1307
<         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c24)
---
>         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c24)
1532c1532
<         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c22)
---
>         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c22)
1534c1534
<         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c24)
---
>         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c24)
1697,1700c1697,1704
<        ifdef USE_I4FLAGS
<            ifeq ($(_FC),gfortran)
< #wrong             FOPTIONS += -fdefault-integer-8
<     else  ifeq ($(_FC),crayftn)
---
>       ifeq ($(_FC),gfortran)
>         ifdef USE_I4FLAGS
> #             FOPTIONS += -fdefault-integer-4
>         else
>              FOPTIONS += -fdefault-integer-8
>         endif
>       else ifeq ($(_FC),crayftn)
>         ifdef USE_I4FLAGS
1702,1708c1706
<     else   
<              FOPTIONS += -i4
<            endif
<        else
<          ifeq ($(_FC),gfortran)
<            FOPTIONS += -fdefault-integer-8
<          else  ifeq ($(_FC),crayftn)
---
>         else
1710,1715c1708,1717
<          else
<            FOPTIONS += -i8
<          endif
<        endif
<        DEFINES  += -DEXT_INT
<   MAKEFLAGS = -j 1 --no-print-directory
---
>         endif
>       else
>         ifdef USE_I4FLAGS
>              FOPTIONS += -i4
>         else
>              FOPTIONS += -i8
>         endif
>       endif
>       DEFINES  += -DEXT_INT
>       MAKEFLAGS = -j 1 --no-print-directory
1954c1956
<         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c22)
---
>         GNUMAJOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c22)
1956c1958
<         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null | egrep __VERS | cut -c24)
---
>         GNUMINOR=$(shell $(FC) -dM -E - < /dev/null 2> /dev/null | egrep __VERS | cut -c24)
1969a1972,1974
>         ifeq ($(GNU_GE_4_6),true) 
>           FOPTIMIZE += -march=native -mtune=native
>         else
1974,1976d1978
<         ifeq ($(GNU_GE_4_6),true) 
<           FOPTIMIZE += -march=native -mtune=native
<         else
2198d2199
<    EXPLICITF = TRUE
2211a2213
>     EXPLICITF = TRUE
2224a2227
>     EXPLICITF = TRUE
2247c2250,2251
<     #CC = mpicc
---
>     FC = mpixlf77_r
> 
2248a2253
>         CC         = mpicc
2251,2253c2256,2258
<         FOPTIONS  += -g -funderscoring
<         FOPTIMIZE += -O3 -ffast-math -Wuninitialized 
<         FOPTIMIZE += -O0 -g
---
>         FOPTIONS  += -g -funderscoring -Wuninitialized 
>         FOPTIMIZE += -O3 -ffast-math
>         FDEBUG    += -O1 -g
2262c2267,2281
<         CORE_LIBS +=  -llapack  -lblas 
---
>         CORE_LIBS +=  -llapack $(BLASOPT) -lblas
> 
>         # Here is an example for ALCF:
>         # IBMCMP_ROOT=${IBM_MAIN_DIR}
>         # BLAS_LIB=/soft/libraries/alcf/current/xl/BLAS/lib
>         # LAPACK_LIB=/soft/libraries/alcf/current/xl/LAPACK/lib
>         # ESSL_LIB=/soft/libraries/essl/current/essl/5.1/lib64
>         # XLF_LIB=${IBMCMP_ROOT}/xlf/bg/14.1/bglib64
>         # XLSMP_LIB=${IBMCMP_ROOT}/xlsmp/bg/3.1/bglib64
>         # XLMASS_LIB=${IBMCMP_ROOT}/xlmass/bg/7.3/bglib64
>         # MATH_LIBS="-L${XLMASS_LIB} -lmass -L${LAPACK_LIB} -llapack \
>                      -L${ESSL_LIB} -lesslsmpbg -L${XLF_LIB} -lxlf90_r \
>                      -L${XLSMP_LIB} -lxlsmp -lxlopt -lxlfmath -lxl \
>                      -Wl,--allow-multiple-definition"
>         # Note that ESSL _requires_ USE_64TO32 on Blue Gene
2265,2266d2283
<     FC = mpixlf77_r
<     CC = mpixlc_r
2267a2285,2286
>         EXPLICITF  = TRUE
>         CC         = mpixlc_r
2274,2278d2292
< ifdef USE_I4FLAGS
<         FOPTIONS  = -qintsize=4
< else
<         FOPTIONS  = -qintsize=8 
< endif
2280,2284c2294,2318
<         FOPTIONS  += -qEXTNAME -qxlf77=leadzero
<         FOPTIONS  +=    -qstrict -qthreaded -qnosave -g
<         FOPTIMIZE += -O2 -qarch=qp -qtune=qp -qcache=auto -qunroll=auto -qfloat=rsqrt
< #        FOPTIMIZE += -qhot=level=0 
<         FDEBUG    = -O0 
---
>         ifdef USE_I4FLAGS
>             FOPTIONS = -qintsize=4
>             ifeq ($(BLAS_SIZE),8)
>                 @echo "You cannot use BLAS with 64b integers when"
>                 @echo "the compiler generates 32b integers (USE_I4FLAGS)!"
>                 @exit 1
>             endif # BLAS_SIZE
>         else
>             FOPTIONS = -qintsize=8 
>             ifeq ($(BLAS_SIZE),4)
>                 ifneq ($(USE_64TO32),y)
>                     @echo "You cannot use BLAS with 32b integers when"
>                     @echo "the compiler generates 64b integers unless"
>                     @echo "you do the 64-to-32 conversion!"
>                     @exit 1
>                 endif # USE_64TO32
>             endif # BLAS_SIZE
>         endif # USE_I4FLAGS
> 
>         FDEBUG     = -g -qstrict -O3
>         FOPTIONS  += -g -qEXTNAME -qxlf77=leadzero
>         FOPTIONS  += -qthreaded -qnosave # -qstrict
> #        FOPTIMIZE += -g -O3 -qarch=qp -qtune=qp -qcache=auto -qunroll=auto -qfloat=rsqrt
>         FOPTIMIZE += -O3 -qarch=qp -qtune=qp -qsimd=auto -qhot=level=1 -qprefetch -qunroll=yes #-qnoipa
>         FOPTIMIZE += -qreport -qsource -qlistopt -qlist # verbose compiler output
2425a2460,2466
>   ifeq ($(ARMCI_NETWORK),ARMCI)
>     ifdef EXTERNAL_ARMCI_PATH
>       CORE_LIBS += -L$(EXTERNAL_ARMCI_PATH)/lib -larmci
>     else
>       CORE_LIBS += -L$(NWCHEM_TOP)/src/tools/install/lib -larmci
>     endif
>   else
2426a2468
>   endif
diff -r nwchem-6.3-src.2013-05-28/src/dplot/create_contour.F nwchem-6.3.revision2-src.2013-10-17/src/dplot/create_contour.F
4c4
<      .     no_of_spacings,
---
>      .                          no_of_spacings,tol_rho,
7c7
< * $Id: create_contour.F 19697 2010-10-29 16:57:34Z d3y133 $
---
> * $Id: create_contour.F 24552 2013-08-31 21:23:45Z niri $
30a31
>       double precision tol_rho
46d46
<       Double Precision TOLL
247d246
<          TOLL=1.D-15
257c256
< 
---
> c
259,271c258,271
<      T        TOLL,AO_Bas_Han,g_Dns,
<      &                  nbf_ao_mxnbf_ce,nAtom,1,1,1,
<      U        1,ngrpp,nBF,mBF,.false.,1,
<      &                  Dbl_mb(k_FMat),Dbl_mb(k_PMat),
<      &                  Dbl_mb(k_BMat),0d0,
<      &                  Dbl_mb(k_Scr1),0,0d0,Int_mb(k_ibf),
<      &                  Int_mb(k_iniz),Int_mb(k_ifin),
<      &                  Values(iOffg),0,
<      &              dbl_mb(irchi_atom), 0,
<      &              dbl_mb(k_rdat), int_mb(k_cetobfr),1d0,
<      &               0, 0, .false. )
< 
< 
---
>      T         tol_rho,
>      &         AO_Bas_Han,
>      &         g_Dns,
>      &         nbf_ao_mxnbf_ce,
>      &         nAtom,
>      &         1,1,1,
>      U         1,ngrpp,nBF,mBF,.false.,1,
>      &         Dbl_mb(k_FMat),Dbl_mb(k_PMat),Dbl_mb(k_BMat),0d0,
>      &         Dbl_mb(k_Scr1),0,0d0,Int_mb(k_ibf),
>      &         Int_mb(k_iniz),Int_mb(k_ifin), Values(iOffg),0,
>      &         dbl_mb(irchi_atom),0,
>      &         dbl_mb(k_rdat),int_mb(k_cetobfr),100.d0,
>      &         0, .false. )
> c
diff -r nwchem-6.3-src.2013-05-28/src/dplot/dplot_dump.F nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot_dump.F
3c3
<      ,     natom,xyz,charge,volume,
---
>      ,     natom,xyz,charge,volume,tol_rho,
17c17
<       double precision spread(3),step(3),angle(3)
---
>       double precision spread(3),step(3),angle(3),tol_rho
34,35c34,35
<             Write(Out_Unit,*)Title
<             Write(Out_Unit,*) 'Total Density'
---
>             Write(Out_Unit,*)"Cube file generated by NWChem"
>             Write(Out_Unit,*) Title
80c80,81
<          if(lgaussian) then
---
> c
>          if(lgaussian) then ! for cube files
85,86c86
<                if(abs(values(i)).lt.1d-10) 
<      .              values(i)=0d0
---
>                if(abs(values(i)).lt.tol_rho) values(i)=0d0
113c113
< c $Id: dplot_dump.F 21176 2011-10-10 06:35:49Z d3y133 $
---
> c $Id: dplot_dump.F 24552 2013-08-31 21:23:45Z niri $
diff -r nwchem-6.3-src.2013-05-28/src/dplot/dplot.F nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot.F
3c3
< * $Id: dplot.F 24177 2013-05-03 20:42:30Z d3y133 $
---
> * $Id: dplot.F 24552 2013-08-31 21:23:45Z niri $
59a60
>       double precision tol_rho
120a122,125
> c --  Read tol_rho 
>       if (.not. rtdb_get(rtdb, 'dplot:tol_rho', mt_dbl, 1,
>      &   tol_rho)) call errquit('dpinput:rtdbget failed',11, RTDB_ERR)
> c
498a504,505
>           call int_init(rtdb,1,AO_Bas_Han)
>           if (iproc.eq.0) write(luout,*) ' Root: ', iroot
500a508
>           call int_terminate()
563c571
<      .        no_of_spacings,
---
>      .                       no_of_spacings, tol_rho,
607c615
<      ,     natom,dbl_mb(k_xyz),dbl_mb(k_charge),volume,
---
>      ,     natom,dbl_mb(k_xyz),dbl_mb(k_charge),volume,tol_rho,
diff -r nwchem-6.3-src.2013-05-28/src/dplot/dplot_input.F nwchem-6.3.revision2-src.2013-10-17/src/dplot/dplot_input.F
3c3
< * $Id: dplot_input.F 22941 2012-09-30 02:37:23Z niri $
---
> * $Id: dplot_input.F 24552 2013-08-31 21:23:45Z niri $
22c22
<       Parameter (Num_Dirs  = 15)
---
>       Parameter (Num_Dirs  = 16)
40a41
>       double precision tol_rho
45c46
<      A     'dos',
---
>      A     'dos','tol_rho',
61c62,63
<       dodos =.false.
---
>       dodos     =.false.
>       iroot     = 1
103c105
<      &     900, 964, 1997, 9999) ind
---
>      &     900, 964, 1997, 1998, 9999) ind
162d163
<       iroot = 0
252a254
> c
255a258
> c
258a262,269
> c
>  1998 continue
>       tol_rho = 1d-15
>       If (.not. inp_f(tol_rho))
>      &  Call ErrQuit('DPlot_Input: failed to read tol_rho',0,
>      &     INPUT_ERR)
>       goto 10
> c
339a351,356
> *
>       If (.not.rtdb_put(rtdb,'dplot:tol_rho',mt_dbl,
>      &   1,tol_rho))
>      &   Call ErrQuit('DPlot_Input: rtdb_put failed - tol_rho',0,
>      &       RTDB_ERR)
> *
diff -r nwchem-6.3-src.2013-05-28/src/dplot/get_transden.F nwchem-6.3.revision2-src.2013-10-17/src/dplot/get_transden.F
4c4
<      &        g_movecs, g_dens)
---
>      &        g_movecs, g_tdens)
24c24
<          integer basis         ! AO basis set handle
---
>          integer basis            ! AO basis set handle
26c26
<          integer g_dens(ipol)     ! Number of AO basis functions
---
>          integer g_tdens(ipol)    ! Transition density matrix
36c36,37
<          double precision r
---
>          integer icntr,itmom
>          double precision r,cntr(3),tmom(20)
54a56
>          call ga_sync()
58a61
> c        initialization
60c63
<     call ga_zero(g_dens(i))
---
>     call ga_zero(g_tdens(i))
61a65,70
>          do icntr=1,3
>            cntr(icntr)=0.0d0
>          enddo
>          do itmom=1,20
>            tmom(itmom)=0.0d0
>          enddo
77a87,91
>             if (ipol.eq.1) nocc(2)=0
>             if (ipol.eq.1) nmo(2)=0
>             if (ipol.eq.1) nfc(2)=0
>             if (ipol.eq.1) nfv(2)=0
> c
119c133
<            open(unit=69,file=filename,form='formatted',
---
>           open(unit=69,file=filename,form='formatted',
133,135c147,150
<               read(69,*) r  ! energy of root
<               do i=1,ipol
<                if (tda) then
---
>              if (tda) then
>                read(69,*) r  ! energy of root
>                read(69,*) r  ! s2_save(n)
>                do i=1,ipol
140c155,159
<                else
---
>                end do ! ipol
>              else   ! full tddft
>                read(69,*) r  ! energy of root
>                read(69,*) r  ! s2_save(n)
>                do i=1,ipol
144a164,166
>                end do ! ipol
> c
>                do i=1,ipol
149,150c171,172
<                end if  ! tda
<               end do ! ipol
---
>                end do ! ipol
>              end if  ! tda
152c174,175
<            close(unit=69,status='keep',err=1002) ! file
---
>           close(unit=69,status='keep',err=1002) ! file
>           ok = 1
153a177,178
> c
>          call ga_brdcst(Msg_Vec_Stat+MSGINT, ok, inntsize, 0)
159c184
<           do i=1,ipol
---
>            do i=1,ipol
162c187
<           enddo
---
>            enddo
169c194,195
<               call ga_copy(g_temp(i),g_dens(i))
---
>           call multipole_density(basis,cntr,3,g_temp(i),tmom,20)  ! transition moments
>           call ga_copy(g_temp(i),g_tdens(i))
174c200,203
<              call tddft_transfm(iroot,g_y,g_movecs,nbf_ao,nocc,nmo,
---
>            do i = 1,ipol
>                 call ga_zero(g_temp(i))
>            end do
>            call tddft_transfm(iroot,g_y,g_movecs,nbf_ao,nocc,nmo,
177,180c206,210
< c            accumulate the Y component of the transition density matrix
<              do i = 1,ipol
<               call ga_add(1.d0,g_dens(i),1.d0,g_temp(i),g_dens(i))
<              end do
---
> c          accumulate the Y component of the transition density matrix
>            do i = 1,ipol
>               call multipole_density(basis,cntr,3,g_temp(i),tmom,20)  ! transition moments
>               call ga_add(1.d0,g_tdens(i),1.d0,g_temp(i),g_tdens(i))
>            end do
182a213,229
>          if (ipol.eq.1) then
>           do i=1,20
>             tmom(i)=tmom(i)*dsqrt(2.0d0)
>           enddo
>          end if 
> c
>          if (ga_nodeid().eq.0) then
>                 write(luout,*) " *** tmom(2)***: ", tmom(2)
>                 write(luout,*) " *** tmom(3)***: ", tmom(3)
>                 write(luout,*) " *** tmom(4)***: ", tmom(4)
>          end if
> c
> c        symmetrize the transition density matrix
>          do i = 1,ipol
>              call ga_symmetrize(g_tdens(i))
>          enddo
> c
186c233
<               Call GA_dAdd(1.d0,g_dens(1),1.d0,g_dens(2),g_dens(1))
---
>               Call GA_dAdd(1.d0,g_tdens(1),1.d0,g_tdens(2),g_tdens(1))
188c235
<               Call GA_dAdd(1.d0,g_dens(1),-1.d0,g_dens(2),g_dens(1))
---
>               Call GA_dAdd(1.d0,g_tdens(1),-1.d0,g_tdens(2),g_tdens(1))
191c238
<                Call GA_Copy(g_dens(2),g_dens(1))
---
>                Call GA_Copy(g_tdens(2),g_tdens(1))
194a242
> c        cleanup
diff -r nwchem-6.3-src.2013-05-28/src/mcscf/detci/detci_spin.F nwchem-6.3.revision2-src.2013-10-17/src/mcscf/detci/detci_spin.F
12c12
< * $Id: detci_spin.F 23708 2013-03-08 21:13:06Z bert $
---
> * $Id: detci_spin.F 24317 2013-06-12 16:58:14Z d3y133 $
162c162,163
<       call ga_access(g_civec, blo, bhi, alo, ahi, k_xxci, bdim)
---
>       if (bhi.gt.0.and.ahi.gt.0) then
>         call ga_access(g_civec, blo, bhi, alo, ahi, k_xxci, bdim)
164c165
< c  Allocate scatter data block and pointer blocks
---
> c       Allocate scatter data block and pointer blocks
166,203c167,206
<       scat_dim = 40000
<       if (.not.ma_push_get(MT_DBL, scat_dim, 'detci:lowdin',
<      $                     l_xa, k_xa))
<      $    call errquit('detci: cannot allocate xa lowdin',0, MA_ERR)
<       if (.not.ma_push_get(MT_INT, scat_dim, 'detci:lowdin',
<      $                     l_ib, k_ib))
<      $    call errquit('detci: cannot allocate ib lowdin',0, MA_ERR)
<       if (.not.ma_push_get(MT_INT, scat_dim, 'detci:lowdin',
<      $                     l_ia, k_ia))
<      $    call errquit('detci: cannot allocate ia lowdin',0, MA_ERR)
<       do istra = alo, ahi
<          call ifill((detci_maxorb*detci_maxorb),0,eij,1)
<          call ifill((detci_maxorb*detci_maxorb),0,pij,1)
<          do iex=1,nexa
<            eij(exa(6,iex,istra),exa(5,iex,istra)) = exa(1,iex,istra)
<            pij(exa(6,iex,istra),exa(5,iex,istra)) = exa(4,iex,istra)
<          enddo
<          offset=(istra-alo)*bdim-blo
<          do istrb = blo, bhi
<             val = -dbl_mb(k_xxci+offset+istrb)
<             if (dabs(val).gt.1.0d-14) then 
<                do iex=1,nexb
<                   iib = exb(5,iex,istrb)
<                   jjb = exb(6,iex,istrb)
<                   if ((eij(iib,jjb).ne.0).and.(pij(iib,jjb).ne.0)) then
<                     jstrb = exb(1,iex,istrb)
<                     jstra = eij(iib,jjb)
<                     xx = val*pij(iib,jjb)*exb(4,iex,istrb)
<                     if (dabs(xx).gt.1.0d-14) then
<                        isc=isc+1
<                        dbl_mb(k_xa+isc-1)=xx
<                        int_mb(k_ib+isc-1)=jstrb
<                        int_mb(k_ia+isc-1)=jstra
<                        if (isc.eq.scat_dim) then
<                           call ga_scatter_acc(g_pvec,dbl_mb(k_xa),
<      &                           int_mb(k_ib),int_mb(k_ia),isc,1.0d0)
<                           isc=0
<                        endif
---
>         scat_dim = 40000
>         if (.not.ma_push_get(MT_DBL, scat_dim, 'detci:lowdin',
>      $                       l_xa, k_xa))
>      $      call errquit('detci: cannot allocate xa lowdin',0, MA_ERR)
>         if (.not.ma_push_get(MT_INT, scat_dim, 'detci:lowdin',
>      $                       l_ib, k_ib))
>      $      call errquit('detci: cannot allocate ib lowdin',0, MA_ERR)
>         if (.not.ma_push_get(MT_INT, scat_dim, 'detci:lowdin',
>      $                       l_ia, k_ia))
>      $      call errquit('detci: cannot allocate ia lowdin',0, MA_ERR)
>         do istra = alo, ahi
>            call ifill((detci_maxorb*detci_maxorb),0,eij,1)
>            call ifill((detci_maxorb*detci_maxorb),0,pij,1)
>            do iex=1,nexa
>              eij(exa(6,iex,istra),exa(5,iex,istra)) = exa(1,iex,istra)
>              pij(exa(6,iex,istra),exa(5,iex,istra)) = exa(4,iex,istra)
>            enddo
>            offset=(istra-alo)*bdim-blo
>            do istrb = blo, bhi
>               val = -dbl_mb(k_xxci+offset+istrb)
>               if (dabs(val).gt.1.0d-14) then 
>                  do iex=1,nexb
>                     iib = exb(5,iex,istrb)
>                     jjb = exb(6,iex,istrb)
>                     if ((eij(iib,jjb).ne.0).and.(pij(iib,jjb).ne.0))
>      &              then
>                       jstrb = exb(1,iex,istrb)
>                       jstra = eij(iib,jjb)
>                       xx = val*pij(iib,jjb)*exb(4,iex,istrb)
>                       if (dabs(xx).gt.1.0d-14) then
>                          isc=isc+1
>                          dbl_mb(k_xa+isc-1)=xx
>                          int_mb(k_ib+isc-1)=jstrb
>                          int_mb(k_ia+isc-1)=jstra
>                          if (isc.eq.scat_dim) then
>                             call ga_scatter_acc(g_pvec,dbl_mb(k_xa),
>      &                             int_mb(k_ib),int_mb(k_ia),isc,1.0d0)
>                             isc=0
>                          endif
>                       endif
205,210c208,212
<                   endif
<                enddo
<             endif
<          enddo
<       enddo
<       if (isc.gt.0) call ga_scatter_acc(g_pvec,dbl_mb(k_xa),
---
>                  enddo
>               endif
>            enddo
>         enddo
>         if (isc.gt.0) call ga_scatter_acc(g_pvec,dbl_mb(k_xa),
212c214,221
<       call ga_release(g_civec, blo, bhi, alo, ahi)
---
>         call ga_release(g_civec, blo, bhi, alo, ahi)
>         if (.not.ma_pop_stack(l_ia))
>      $     call errquit('cannot pop stack ia detci:lowdin',0, MA_ERR)
>         if (.not.ma_pop_stack(l_ib))
>      $     call errquit('cannot pop stack ib detci:lowdin',0, MA_ERR)
>         if (.not.ma_pop_stack(l_xa))
>      $     call errquit('cannot pop stack xa detci:lowdin',0, MA_ERR)
>       endif
214,219d222
<       if (.not.ma_pop_stack(l_ia))
<      $   call errquit('cannot pop stack ia detci:lowdin',0, MA_ERR)
<       if (.not.ma_pop_stack(l_ib))
<      $   call errquit('cannot pop stack ib detci:lowdin',0, MA_ERR)
<       if (.not.ma_pop_stack(l_xa))
<      $   call errquit('cannot pop stack xa detci:lowdin',0, MA_ERR)
diff -r nwchem-6.3-src.2013-05-28/src/nwdft/lr_tddft/tddft_analysis.F nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_analysis.F
6c6
< c $Id: tddft_analysis.F 24091 2013-04-17 17:22:55Z bert $
---
> c $Id: tddft_analysis.F 24553 2013-08-31 21:27:02Z niri $
179a180,182
>       double precision s2_save(nroots) 
>       logical lstores2
>       double precision s2_tmp(nroots)
219c222
< c     CI Vectors file 
---
> c     CI Vectors file
222a226,239
>       if (lcivecs) then
>         do n=1,nroots
>           if (ipol.eq.2) then   ! unrestricted
>             s2_save(n) = 0.0d0
>             s2_tmp(n)  = 0.0d0
>           elseif (singlet) then ! restricted singlets
>             s2_save(n) = 0.0d0
>             s2_tmp(n)  = 0.0d0
>           elseif (triplet) then ! restricted triplets
>             s2_save(n) = 2.0d0
>             s2_tmp(n)  = 2.0d0
>           endif
>         enddo
>       endif
459,494d475
< c --------------------
< c Solution vector file
< c --------------------
< c
<        if (.not.rtdb_cget(rtdb,'tddft:civecs',1,fn_civecs))
<      1  call errquit('tddft_analysis: failed to read vector',0)
< c
<        len_fn_civecs = inp_strlen(fn_civecs)
<        if (singlet) fn_civecs=fn_civecs(1:len_fn_civecs)//"_singlet"
<        if (triplet) fn_civecs=fn_civecs(1:len_fn_civecs)//"_triplet"
< c
<        if (nodezero.and.lcivecs) then
<          write(luout,*) "fn_civecs: ",fn_civecs
<          call util_file_name_resolve(fn_civecs, .false.)
<          open(unit=69,file=fn_civecs,form='formatted',status='unknown')
<          write(LuOut,2010) fn_civecs
<          rewind(69)
<          write(69,*) tda
<          write(69,*) ipol
<          write(69,*) nroots
<          if (ipol.eq.1) nocc(2) = 0
<          write(69,*) nocc(1),nocc(2)
<          if (ipol.eq.1) nmo(2) = 0
<          write(69,*) nmo(1),nmo(2)
<          if (ipol.eq.1) nfc(2) = 0
<          write(69,*) nfc(1),nfc(2)
<          if (ipol.eq.1) nfv(2) = 0
<          write(69,*) nfv(1),nfv(2)
<          if (ipol.eq.1) nov(2) = 0
<          write(69,*) nov(1),nov(2)
<          write(69,*)
<        endif ! nodezero
< c
<  2000 format(/,2x,'No CI vector file is created')
<  2010 format(/,2x,'CI vectors are stored in ',a32)
< c
517,530d497
< c       Write out solution vectors: X (Y=0 in TDA)
< c
<         if (nodezero.and.lcivecs) then
<          do n=1,nroots
<            write(69,*)apbval(n)  ! energy of the root
<            do i=1,ipol
<              do m=1,nov(i)
<                call ga_get(g_x(i),m,m,n,n,r,1)
<                write(69,*) r
<              enddo ! nov
<            enddo ! ipol
<          enddo  ! nroots
<         endif  ! nodezero and lcivecs
< c
557,578d523
< c       g_x = X+Y and g_y = X-Y
< c       Write out vectors: X+Y and X-Y
< c
<         if (nodezero.and.lcivecs) then
<            do n=1,nroots
<              write(69,*)apbval(n) ! energy of the root
<              do i=1,ipol
<                do m=1,nov(i)
<                  call ga_get(g_x(i),m,m,n,n,r,1) ! X vectors
<                  write(69,*) r
<                enddo ! nov
<              enddo ! ipol
< c
<              do i=1,ipol
<                do m=1,nov(i)
<                  call ga_get(g_y(i),m,m,n,n,r,1) ! Y vectors
<                  write(69,*) r
<                enddo ! nov
<              enddo ! ipol
<            enddo ! nroots
<         endif  ! nodezero or lcivecs
< c
588,589d532
<       if (nodezero.and.lcivecs) close(unit=69)
< c
652,653c595
< 
< 
---
> c
877a820
>           if (lcivecs) s2_save(n) = s2
1662a1606,1730
> c
> c ----------------------------------------------------------------------
> c Store the <S2> value for the first cycle of a TDDFT
> c optimization in the RTDB.  This will allow us to use it as a reference
> c for all optimization cycles.
> c ----------------------------------------------------------------------
> c
>       if (lcivecs) then
>         lstores2 = .false.
> c Check if <S2> is already in the RTDB. If it is, we don't do anything
> c else.  Otherwise, we write s2_save to the RTDB.  This only happens if
> c tddft_grad:s2 doesn't exist.
>         if (.not.rtdb_get(rtdb,'tddft_grad:s2',mt_dbl,nroots,s2_tmp))
>      1    lstores2 = .true.
>         if (lstores2) then
>           if (.not.rtdb_put(rtdb,'tddft_grad:s2',mt_dbl,nroots,s2_save))
>      1      call errquit('tddft_analysis: failed to store s2', 0,
>      2        RTDB_ERR)
>         endif
>       endif
> c
> c ---------------------------
> c Handle solution vector file
> c ---------------------------
> c
> c On top of what was present originally for storing
> c the excited state information, we also need <S2> for unrestricted
> c calculations.  This is required because we store every state and
> c it is possible that the states reorder.  We can't use the character
> c of singlet and triplet states to identify states since they can be
> c similar.
> c
>        if (.not.rtdb_cget(rtdb,'tddft:civecs',1,fn_civecs))
>      1  call errquit('tddft_analysis: failed to read vector',0,
>      2    RTDB_ERR)
> c
>        len_fn_civecs = inp_strlen(fn_civecs)
>        if (singlet) fn_civecs=fn_civecs(1:len_fn_civecs)//"_singlet"
>        if (triplet) fn_civecs=fn_civecs(1:len_fn_civecs)//"_triplet"
> c
>        if (nodezero.and.lcivecs) then
>          write(luout,*) "fn_civecs: ",fn_civecs
>          call util_file_name_resolve(fn_civecs, .false.)
>          open(unit=69,file=fn_civecs,form='formatted',status='unknown')
>          write(LuOut,2010) fn_civecs
>          rewind(69)
>          write(69,*) tda
>          write(69,*) ipol
>          write(69,*) nroots
>          if (ipol.eq.1) nocc(2) = 0
>          write(69,*) nocc(1),nocc(2)
>          if (ipol.eq.1) nmo(2) = 0
>          write(69,*) nmo(1),nmo(2)
>          if (ipol.eq.1) nfc(2) = 0
>          write(69,*) nfc(1),nfc(2)
>          if (ipol.eq.1) nfv(2) = 0
>          write(69,*) nfv(1),nfv(2)
>          if (ipol.eq.1) nov(2) = 0
>          write(69,*) nov(1),nov(2)
>          write(69,*)
>        endif ! nodezero
> c
>  2000 format(/,2x,'No CI vector file is created')
>  2010 format(/,2x,'CI vectors are stored in ',a32)
> c
> c ------------
> c Tamm-Dancoff
> c ------------
> c
> c Modified for RPA with B = 0
> c
>       if (tda) then
> c
> c       Write out solution vectors: X (Y=0 in TDA)
> c
>         if (nodezero.and.lcivecs) then
>          do n=1,nroots
>            write(69,*)apbval(n)  ! energy of the root
>            write(69,*)s2_save(n) ! <S2> value of the root
>            do i=1,ipol
>              do m=1,nov(i)
>                call ga_get(g_x(i),m,m,n,n,r,1)
>                write(69,*) r
>              enddo ! nov
>            enddo ! ipol
>          enddo  ! nroots
>         endif  ! nodezero and lcivecs
> c
> c --------------------
> c Full linear response
> c --------------------
> c
>       else  ! full tddft
> c
> c       g_x = X+Y and g_y = X-Y
> c
>         do i=1,ipol
>            call ga_add(1.0d0,g_x(i), 1.0d0,g_y(i),g_x(i)) ! X+Y
>            call ga_add(1.0d0,g_x(i),-2.0d0,g_y(i),g_y(i)) ! X+Y-2Y = X-Y
>         enddo
> c
> c       Write out vectors: X+Y and X-Y
> c
>         if (nodezero.and.lcivecs) then
>            do n=1,nroots
>              write(69,*)apbval(n)  ! energy of the root
>              write(69,*)s2_save(n) ! <S2> value of the root
>              do i=1,ipol
>                do m=1,nov(i)
>                  call ga_get(g_x(i),m,m,n,n,r,1) ! X vectors
>                  write(69,*) r
>                enddo ! nov
>              enddo ! ipol
> c
>              do i=1,ipol
>                do m=1,nov(i)
>                  call ga_get(g_y(i),m,m,n,n,r,1) ! Y vectors
>                  write(69,*) r
>                enddo ! nov
>              enddo ! ipol
>            enddo ! nroots
>         endif  ! nodezero or lcivecs
>       endif ! tda
> c
>       if (nodezero.and.lcivecs) close(unit=69)
diff -r nwchem-6.3-src.2013-05-28/src/nwdft/lr_tddft/tddft_davidson.F nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_davidson.F
8c8
< c $Id: tddft_davidson.F 24076 2013-04-15 16:00:42Z niri $
---
> c $Id: tddft_davidson.F 24309 2013-06-06 18:30:18Z niri $
178d177
<       integer vshift
254,259d252
< c Get reference virtual state
< c --------------------------------------------
<       if (.not.rtdb_get(rtdb,'tddft:vshift',mt_int,1,vshift))
<      &   vshift = 0
< c
< c --------------------------------------------
481c474
<      2          lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2          lowin,owstart,owend,lewin,ewinl,ewinh)
485c478
<      2            lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2            lowin,owstart,owend,lewin,ewinl,ewinh)
500c493
<      2          lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2          lowin,owstart,owend,lewin,ewinl,ewinh)
520c513
<      2            lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2            lowin,owstart,owend,lewin,ewinl,ewinh)
664c657
<      2          lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2          lowin,owstart,owend,lewin,ewinl,ewinh)
668c661
<      2            lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2            lowin,owstart,owend,lewin,ewinl,ewinh)
683c676
<      2          lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2          lowin,owstart,owend,lewin,ewinl,ewinh)
703c696
<      2            lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      2            lowin,owstart,owend,lewin,ewinl,ewinh)
801c794
<      7    diff_max,lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      7    diff_max,lowin,owstart,owend,lewin,ewinl,ewinh)
diff -r nwchem-6.3-src.2013-05-28/src/nwdft/lr_tddft/tddft_init.F nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_init.F
10c10
< c $Id: tddft_init.F 22895 2012-09-23 01:19:55Z niri $
---
> c $Id: tddft_init.F 24357 2013-07-01 22:46:52Z edo $
112a113,114
>       logical xc_got2nd
>       external xc_got2nd
125a128,132
> 
>       if(.not.xc_got2nd()) call errquit(
>      A        'analytic 2nds not ready for these XC functionals',0,
>      &       CAPMIS_ERR)
> 
diff -r nwchem-6.3-src.2013-05-28/src/nwdft/lr_tddft/tddft_residual.F nwchem-6.3.revision2-src.2013-10-17/src/nwdft/lr_tddft/tddft_residual.F
7c7
<      6  diff_max,lowin,owstart,owend,lewin,ewinl,ewinh,vshift)
---
>      6  diff_max,lowin,owstart,owend,lewin,ewinl,ewinh)
9c9
< c $Id: tddft_residual.F 24037 2013-04-11 21:10:58Z bert $
---
> c $Id: tddft_residual.F 24309 2013-06-06 18:30:18Z niri $
88d87
<       integer vshift
828,832c827
<               if (vshift.gt.0) then
<                  k=nocc(i)+1+vshift
<               else
<                  k=mod(l-1,nmo(i)-nfv(i)-nocc(i))+nocc(i)+1
<               end if
---
>               k=mod(l-1,nmo(i)-nfv(i)-nocc(i))+nocc(i)+1
929,933c924
<                 if (vshift.gt.0) then
<                    k=nocc(i)+1+vshift
<                 else
<                    k=mod(l-1,nmo(i)-nfv(i)-nocc(i))+nocc(i)+1
<                 end if
---
>                 k=mod(l-1,nmo(i)-nfv(i)-nocc(i))+nocc(i)+1
diff -r nwchem-6.3-src.2013-05-28/src/nwpw/nwpwlib/Parallel/Parallel-tcgmsg.F nwchem-6.3.revision2-src.2013-10-17/src/nwpw/nwpwlib/Parallel/Parallel-tcgmsg.F
2c2
< * $Id: Parallel-tcgmsg.F 22562 2012-06-05 21:17:04Z bylaska $
---
> * $Id: Parallel-tcgmsg.F 24308 2013-06-06 03:34:42Z jhammond $
894c894
<       /* determine psr - should be made w/o using tmp array! */
---
> c      /* determine psr - should be made w/o using tmp array! */
1033c1033
<       /* determine psr - should be made w/o using tmp array! */
---
> c      /* determine psr - should be made w/o using tmp array! */
diff -r nwchem-6.3-src.2013-05-28/src/tce/tce_input.F nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_input.F
6c6
< c $Id: tce_input.F 24178 2013-05-03 22:05:45Z kowalski $
---
> c $Id: tce_input.F 24360 2013-07-02 18:09:01Z jhammond $
14c14
< c        [FREEZE [[core] (atomic || <integer nfzc default 0>)] \
---
> c        [FREEZE [[core] (atomic || <integer nfzc default 0>)] 
16,18c16,18
< c        [(LCCD||CCD||CCSD||LCCSD||CCSDT||CCSDTQ|| \ 
< c          CCSD(T)||CCSD[T]||QCISD||CISD||CISDT||CISDTQ|| \
< c          MBPT2||MBPT3||MBPT4||MP2||MP3||MP4|| \
---
> c        [(LCCD||CCD||CCSD||LCCSD||CCSDT||CCSDTQ|| 
> c          CCSD(T)||CCSD[T]||QCISD||CISD||CISDT||CISDTQ|| 
> c          MBPT2||MBPT3||MBPT4||MP2||MP3||MP4|| 
diff -r nwchem-6.3-src.2013-05-28/src/tce/tce_mo2e_hybrid_2eorb_split.F nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_hybrid_2eorb_split.F
3c3
< C     $Id: tce_mo2e_hybrid_2eorb_split.F 19706 2010-10-29 17:52:31Z d3y133 $
---
> C     $Id: tce_mo2e_hybrid_2eorb_split.F 24292 2013-06-04 01:26:22Z edo $
780c780
<        next = nxtask(-nprocs)
---
>        next = nxtask(-nprocs,1)
1070c1070
<       next = nxtask(-nprocs)
---
>       next = nxtask(-nprocs,1)
1297c1297
<       next = nxtask(-nprocs)
---
>       next = nxtask(-nprocs,1)
1543c1543
<       next = nxtask(-nprocs)
---
>       next = nxtask(-nprocs,1)
1734c1734
<        next = nxtask(-nprocs)
---
>        next = nxtask(-nprocs,1)
diff -r nwchem-6.3-src.2013-05-28/src/tce/tce_mo2e_zones_4a_disk_ga_chop_N5.F nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_zones_4a_disk_ga_chop_N5.F
4c4
< C     $Id: tce_mo2e_zones_4a_disk_ga_chop_N5.F 19706 2010-10-29 17:52:31Z d3y133 $
---
> C     $Id: tce_mo2e_zones_4a_disk_ga_chop_N5.F 24330 2013-06-19 22:02:55Z kowalski $
480,483c480,489
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c old transpositions
>        call TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),nalength(azone4),
>      & int_mb(k_range_alpha+g3b-1),
>      & 1,2,4,3,1.0d0)
> c
517,520c523,530
<       CALL TCE_SORT_4KG_(dbl_mb(k_4a),dbl_mb(k_aux),
<      & nalength(azone3),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_4a),dbl_mb(k_aux),
> ccx     & nalength(azone3),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
>        call TCE_SORT_4(dbl_mb(k_4a),dbl_mb(k_aux),
>      &  nalength(azone2),nalength(azone1),nalength(azone4),
>      &  nalength(azone3),
>      &  1,2,4,3,1.0d0)
553,556c563,572
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone3),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone3),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c  old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      1  nalength(azone2),nalength(azone1),nalength(azone3),
>      2  int_mb(k_range_alpha+g3b-1),
>      3  1,2,4,3,1.0d0)
> c
616,619c632,641
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),nalength(azone4),
>      & int_mb(k_range_alpha+g3b-1),
>      & 1,2,4,3,1.0d0)
> c
813,816c835,844
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
<      &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
> ccx     &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
853,856c881,889
<       CALL TCE_SORT_4KG_(dbl_mb(k_2g2a),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
<      & nalength(azone1),nalength(azone2), 
<      & 1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_2g2a),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
> ccx     & nalength(azone1),nalength(azone2), 
> ccx     & 1,2,4,3,1.0d0)
> c old transposition
>        CALL TCE_SORT_4(dbl_mb(k_2g2a),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
890,893c923,932
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
<      &nalength(azone2),int_mb(k_range_alpha+g2b-1),
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
> ccx     &nalength(azone2),int_mb(k_range_alpha+g2b-1),
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone2),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
959,962c998,1007
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
<      &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
> ccx     &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
diff -r nwchem-6.3-src.2013-05-28/src/tce/tce_mo2e_zones_4a_disk_ga_N5.F nwchem-6.3.revision2-src.2013-10-17/src/tce/tce_mo2e_zones_4a_disk_ga_N5.F
4c4
< C     $Id: tce_mo2e_zones_4a_disk_ga_N5.F 19706 2010-10-29 17:52:31Z d3y133 $
---
> C     $Id: tce_mo2e_zones_4a_disk_ga_N5.F 24328 2013-06-19 17:52:34Z kowalski $
440,443c440,449
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c old transpositions
>        call TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),nalength(azone4),
>      & int_mb(k_range_alpha+g3b-1),
>      & 1,2,4,3,1.0d0)
> c
477,480c483,491
<       CALL TCE_SORT_4KG_(dbl_mb(k_4a),dbl_mb(k_aux),
<      & nalength(azone3),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_4a),dbl_mb(k_aux),
> ccx     & nalength(azone3),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
>        call TCE_SORT_4(dbl_mb(k_4a),dbl_mb(k_aux),
>      &  nalength(azone2),nalength(azone1),nalength(azone4),
>      &  nalength(azone3),
>      &  1,2,4,3,1.0d0)
> c
513,516c524,533
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone3),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone3),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c  old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      1  nalength(azone2),nalength(azone1),nalength(azone3),
>      2  int_mb(k_range_alpha+g3b-1),
>      3  1,2,4,3,1.0d0)
> c 
576,579c593,602
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g3b-1),nalength(azone4),
<      & nalength(azone1),nalength(azone2),
<      &2,1,3,4,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g3b-1),nalength(azone4),
> ccx     & nalength(azone1),nalength(azone2),
> ccx     &2,1,3,4,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),nalength(azone4),
>      & int_mb(k_range_alpha+g3b-1),
>      & 1,2,4,3,1.0d0)
> c
775,778c798,807
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
<      &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
> ccx     &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
815,818c844,852
<       CALL TCE_SORT_4KG_(dbl_mb(k_2g2a),dbl_mb(k_aux),
<      & int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
<      & nalength(azone1),nalength(azone2), 
<      & 1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_2g2a),dbl_mb(k_aux),
> ccx     & int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
> ccx     & nalength(azone1),nalength(azone2), 
> ccx     & 1,2,4,3,1.0d0)
> c old transposition
>        CALL TCE_SORT_4(dbl_mb(k_2g2a),dbl_mb(k_aux),
>      & nalength(azone2),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
852,855c886,895
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
<      &nalength(azone2),int_mb(k_range_alpha+g2b-1),
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1),
> ccx     &nalength(azone2),int_mb(k_range_alpha+g2b-1),
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone2),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
921,924c961,970
<       CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
<      &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
<      &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
<      &1,2,4,3,1.0d0)
---
> ccx      CALL TCE_SORT_4KG_(dbl_mb(k_integral),dbl_mb(k_aux),
> ccx     &int_mb(k_range_alpha+g4b-1),int_mb(k_range_alpha+g3b-1), 
> ccx     &nalength(azone1),int_mb(k_range_alpha+g2b-1), 
> ccx     &1,2,4,3,1.0d0)
> c old transposition
>       CALL TCE_SORT_4(dbl_mb(k_integral),dbl_mb(k_aux),
>      & int_mb(k_range_alpha+g2b-1),nalength(azone1),
>      & int_mb(k_range_alpha+g3b-1),int_mb(k_range_alpha+g4b-1),
>      & 2,1,3,4,1.0d0)
> c
1011c1057
< c      write(6,*)'DONE --- DONE ---- DONE ---- DONE'
---
> c       write(6,*)'DONE --- DONE ---- DONE ---- DONE'
diff -r nwchem-6.3-src.2013-05-28/src/tools/GNUmakefile nwchem-6.3.revision2-src.2013-10-17/src/tools/GNUmakefile
335a336,338
>     ifdef EXTERNAL_ARMCI_PATH
>         MAYBE_ARMCI = --with-armci=$(EXTERNAL_ARMCI_PATH)
>     else
336a340
>     endif
diff -r nwchem-6.3-src.2013-05-28/src/util/util_nwchem_version.F nwchem-6.3.revision2-src.2013-10-17/src/util/util_nwchem_version.F
4c4
<       nwrev="24277"
---
>       nwrev="24652"
Only in nwchem-6.3-src.2013-05-28/: svnlog



12 June 2013

449. Nwchem 6.3 -- updated sources. Compiling on Debian

The previous post uses the sources from the 17th of May.


What's new?
Maybe it's just me, but I can't find any obvious location where they list the differences between the 28th of May and the 17th of May releases. In fact, I wouldn't have known that there was a new version out unless someone had specifically pointed it out to me.

Anyway, luckily there's diff. There were more changes than the ones I'm showing below, but most of them were minor ones such as .stamp files and a re-done pair of QA test output files.
Only in nwchem-6.3-src.2013-05-28/QA/tests: rodft-cam
diff -r nwchem-6.3-src.2013-05-17/src/nwdft/scf_dft/dft_canorg.F nwchem-6.3-src.2013-05-28/src/nwdft/scf_dft/dft_canorg.F
6c6
< c     $Id: dft_canorg.F 23846 2013-03-19 04:08:25Z edo $
---
> c     $Id: dft_canorg.F 24271 2013-05-24 06:48:33Z niri $
279c279,280
<          if(iter.ge.nfock/2)iswitc = iswitc+1
---
>          if(iter.ge.nfock/2) iswitc = iswitc+1
>          if(abs(delta).lt.1d-6) iswitc = iswitc+2
diff -r nwchem-6.3-src.2013-05-17/src/nwdft/scf_dft/dft_scf.F nwchem-6.3-src.2013-05-28/src/nwdft/scf_dft/dft_scf.F
9c9
< c     $Id: dft_scf.F 23988 2013-04-08 23:06:52Z d3y133 $
---
> c     $Id: dft_scf.F 24269 2013-05-24 00:55:11Z edo $
819c819
<          iswitc = 1
---
>          iswitc = 2
diff -r nwchem-6.3-src.2013-05-17/src/nwdft/scf_dft_cg/dft_roks_fock.F nwchem-6.3-src.2013-05-28/src/nwdft/scf_dft_cg/dft_roks_fock.F
6c6
< * $Id: dft_roks_fock.F 23999 2013-04-10 18:23:02Z d3y133 $
---
> * $Id: dft_roks_fock.F 24274 2013-05-24 07:34:16Z niri $
102c102
<       double precision edisp    ! [input] dispersion corrrection
---
>       double precision Edisp    ! [input] dispersion correction
107c107
<       double precision errmax, ebq
---
>       double precision errmax, Ebq
124c124
<       integer g_tmp(2)
---
>       integer g_tmp(nset)
326a327,335
>         call ga_zero(g_tmp(1))
>         if (nopen.gt.0) then
>           g_tmp(2) = ga_create_atom_blocked(geom, basis,
>      $                                      'dft_roks_fock: tmp2')
>           g_tmp(3) = ga_create_atom_blocked(geom, basis,
>      $                                      'dft_roks_fock: tmp3')
>           call ga_zero(g_tmp(2))
>           call ga_zero(g_tmp(3))
>         endif
330d338
<         call ga_zero(g_tmp(1))
339c347
<      $     tol2e, oskel, iv_dens, g_tmp(1), .false., .false.)
---
>      $     tol2e, oskel, iv_dens, g_tmp, .false., .false.)
340a349,352
>         if (nopen.gt.0) then
>           call ga_dadd(1d0,iv_fock(2),1d0,g_tmp(2),iv_fock(2))
>           call ga_dadd(1d0,iv_fock(3),1d0,g_tmp(3),iv_fock(3))
>         endif
344a357,360
>         if (nopen.gt.0) then
>           call ga_zero(g_tmp(2))
>           call ga_zero(g_tmp(3))
>         endif
353c369
<      $     tol2e, oskel, iv_dens, g_tmp(1), .false., .true.)
---
>      $     tol2e, oskel, iv_dens, g_tmp, .false., .true.)
354a371,374
>         if (nopen.gt.0) then
>           call ga_dadd(1d0,iv_fock(2),1d0,g_tmp(2),iv_fock(2))
>           call ga_dadd(1d0,iv_fock(3),1d0,g_tmp(3),iv_fock(3))
>         endif
358a379,385
>         if (nopen.gt.0) then
>           if (.not. ga_destroy(g_tmp(2))) call errquit
>      $               ('xc_getv: ga corrupt?',0, GA_ERR)
>           if (.not. ga_destroy(g_tmp(3))) call errquit
>      $               ('xc_getv: ga corrupt?',0, GA_ERR)
>         endif
> c
383c410
<       etwo = etwo_closed + etwo_open + edisp
---
>       etwo = etwo_closed + etwo_open + Edisp
diff -r nwchem-6.3-src.2013-05-17/src/tools/GNUmakefile nwchem-6.3-src.2013-05-28/src/tools/GNUmakefile
342a343,346
> ifeq ($(ARMCI_NETWORK),SOCKETS)
>         MAYBE_ARMCI = --with-sockets
> endif # SOCKETS
>
Only in nwchem-6.3-src.2013-05-28/src/util: util_ga_version.F
diff -r nwchem-6.3-src.2013-05-17/src/util/util_nwchem_version.F nwchem-6.3-src.2013-05-28/src/util/util_nwchem_version.F
4c4
<       nwrev="24252"
---
>       nwrev="24277"
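
For reference, a listing like the one above can be generated with a recursive diff of the two unpacked source trees. This is only a sketch: the helper name compare_trees is made up for illustration, and it assumes GNU diff, whose -x option skips files matching a pattern (here the .stamp files mentioned earlier).

```shell
# compare_trees is a hypothetical helper, not part of nwchem.
# Usage: compare_trees OLD_DIR NEW_DIR
compare_trees() {
    # -r: recurse into subdirectories
    # -x '*.stamp': ignore build stamp files (GNU diff option)
    # diff exits with 1 when the trees differ, which is the expected case here;
    # only an exit code above 1 signals real trouble.
    diff -r -x '*.stamp' "$1" "$2" || [ "$?" -eq 1 ]
}

# Example, with the directory names used in this post:
# compare_trees nwchem-6.3-src.2013-05-17 nwchem-6.3-src.2013-05-28 > changes.diff
```

GNU diff repeats its options in each "diff -r ..." header line, so the headers produced this way will also show the -x pattern.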

The section in red (the iswitc changes) is the Iswtch.patch in http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html.

I'm guessing that this patch is also included: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id840/rodft_and_range_separated-functi....html
I've indicated it in blue (the dft_roks_fock.F changes).

Purple (the SOCKETS block in src/tools/GNUmakefile) is probably to make sure that SOCKETS gets implemented correctly -- there were issues with that before.

If I were to guess, I'd say that if you compiled nwchem as shown here: http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html
AND if you are not going to use CAM with open-shell molecules, then you are not in a hurry to recompile.

The former issue manifests as slow run times, while the latter manifests as crashing calculations, so neither should be able to fly under the radar.

Anyway

The GabEdit Patch
The following patch allows you to use GabEdit as a GUI to nwchem, and allows you to compile nwchem with python support.
Copy the following and paste it into a file, e.g. 6.3.patch, and put it into /opt/nwchem/nwchem-6.3-src.2013-05-28:

diff -rupN src.original/config/makefile.h src/config/makefile.h
--- src.original/config/makefile.h      2013-04-15 12:41:45.016853322 +1000
+++ src/config/makefile.h       2013-04-15 12:38:44.933319544 +1000
@@ -2039,7 +2039,7 @@ endif
 
      ifeq ($(BUILDING_PYTHON),python)
 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
-     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
+     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz
   LDOPTIONS = -Wl,--export-dynamic 
      endif
 ifeq ($(NWCHEM_TARGET),CATAMOUNT)
diff -rupN src.original/ddscf/movecs_pr_anal.F src/ddscf/movecs_pr_anal.F
--- src.original/ddscf/movecs_pr_anal.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/movecs_pr_anal.F  2013-04-15 12:23:28.100409225 +1000
@@ -195,7 +195,7 @@ c
  22         format(1x,2('  Bfn.  Coefficient  Atom+Function  ',5x))
             write(LuOut,23)
  23         format(1x,2(' ----- ------------  ---------------',5x))
-            do klo = 0, min(n-1,9), 2
+            do klo = 0, min(n-1,199), 2
                khi = min(klo+1,n-1)
                write(LuOut,2) (
      $              int_mb(k_list+k)+1, 
diff -rupN src.original/ddscf/rohf.F src/ddscf/rohf.F
--- src.original/ddscf/rohf.F   2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/rohf.F    2013-04-15 12:23:28.100409225 +1000
@@ -153,7 +153,7 @@ c
             ilo = 1
             ihi = nmo
          endif
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'ROHF Final Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
diff -rupN src.original/ddscf/scf_vec_guess.F src/ddscf/scf_vec_guess.F
--- src.original/ddscf/scf_vec_guess.F  2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/scf_vec_guess.F   2013-04-15 12:23:28.100409225 +1000
@@ -511,19 +511,19 @@ c
          nprint = min(nclosed+nopen+30,nmo)
          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
             call movecs_print_anal(basis, 1,
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'ROHF Initial Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
          else
             nprint = min(nalpha+20,nmo)
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'UHF Initial Alpha Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs(2), 
+     &           nprint, 0.01d0, g_movecs(2), 
      &           'UHF Initial Beta Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      &           .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/ddscf/uhf.F src/ddscf/uhf.F
--- src.original/ddscf/uhf.F    2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/uhf.F     2013-04-15 12:23:28.096409414 +1000
@@ -144,11 +144,11 @@ C
          enddo
          ihi = max(ihi-1,1)
  9611    continue
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'UHF Final Alpha Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2), 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs(2), 
      $        'UHF Final Beta Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      $        .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/mcscf/mcscf.F src/mcscf/mcscf.F
--- src.original/mcscf/mcscf.F  2013-04-15 12:41:45.000854073 +1000
+++ src/mcscf/mcscf.F   2013-04-15 12:23:23.748613695 +1000
@@ -719,7 +719,7 @@ c
       if (util_print('final vectors analysis', print_default))
      $     call movecs_print_anal(basis, 
      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
-     $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
+     $     0.01d0, g_movecs, 'Analysis of MCSCF natural orbitals',
      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym), 
      $     .true., dbl_mb(k_occ))
 c     
diff -rupN src.original/nwdft/scf_dft/dft_mxspin_ovlp.F src/nwdft/scf_dft/dft_mxspin_ovlp.F
--- src.original/nwdft/scf_dft/dft_mxspin_ovlp.F        2013-04-15 12:41:45.604825677 +1000
+++ src/nwdft/scf_dft/dft_mxspin_ovlp.F 2013-04-15 12:23:28.228403211 +1000
@@ -184,14 +184,14 @@ c
       call ga_sync()
 c
       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
-     & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
+     & ,0.01d0,g_alpha,'Alpha Orbitals without Beta Partners',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
       if (nct.GE.2) then
       do i = 2,nct
       ind = int_mb(k_non+i-1)
       call movecs_print_anal(basis,ind,ind
-     & ,0.15d0,g_alpha,' ',
+     & ,0.01d0,g_alpha,' ',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
       enddo
       endif
@@ -350,7 +350,7 @@ c      endif
 c      endif
 c 9990 format(/,18x,'THERE ARE',i3,1x,'UN-PARTNERED ALPHA ORBITALS')
 c
-       call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
+       call movecs_print_anal(basis, 1, nalp, 0.01d0, g_ualpha,
      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
diff -rupN src.original/nwdft/scf_dft/dft_scf.F src/nwdft/scf_dft/dft_scf.F
--- src.original/nwdft/scf_dft/dft_scf.F        2013-04-15 12:41:45.608825490 +1000
+++ src/nwdft/scf_dft/dft_scf.F 2013-04-15 12:23:28.228403211 +1000
@@ -1774,7 +1774,7 @@ c
             else
                blob='DFT Final Beta Molecular Orbital Analysis' 
             endif
-            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0, 
+            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.01d0, 
      &           g_movecs(ispin), 
      &           blob, 
      &           .true., dbl_mb(k_eval(ispin)), oadapt, 
diff -rupN src.original/nwdft/scf_dft_cg/dft_cg_solve.F src/nwdft/scf_dft_cg/dft_cg_solve.F
--- src.original/nwdft/scf_dft_cg/dft_cg_solve.F        2013-04-15 12:41:45.612825303 +1000
+++ src/nwdft/scf_dft_cg/dft_cg_solve.F 2013-04-15 12:23:28.220403588 +1000
@@ -183,7 +183,7 @@ c
             blob = 'DFT Final Beta Molecular Orbital Analysis'
           endif
           call movecs_fix_phase(g_movecs(ispin))
-          call movecs_print_anal(basis, ilo, ihi, 0.15d0,
+          call movecs_print_anal(basis, ilo, ihi, 0.01d0,
      &         g_movecs(ispin),blob,
      &         .true., dbl_mb(k_eval+(ispin-1)*nbf),
      &         oadapt, int_mb(k_irs+(ispin-1)*nbf),




Compiling nwchem
First install ACML or OpenBlas:
http://verahill.blogspot.com.au/2013/05/423-openblas-on-debian-wheezy.html
http://verahill.blogspot.com.au/2013/05/422-set-up-acml-on-linux.html


sudo apt-get install build-essential gfortran python2.7-dev libopenmpi-dev openmpi-bin
sudo mkdir /opt/nwchem -p
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem
wget "http://www.nwchem-sw.org/download.php?f=Nwchem-6.3.revision1-src.2013-05-28.tar.gz" -O Nwchem-6.3.revision1-src.2013-05-28.tar.gz
tar xvf Nwchem-6.3.revision1-src.2013-05-28.tar.gz
cd nwchem-6.3-src.2013-05-28/

Apply the Gabedit patch:
patch -p0 < 6.3.patch 
patching file src/config/makefile.h
patching file src/ddscf/movecs_pr_anal.F
patching file src/ddscf/rohf.F
patching file src/ddscf/scf_vec_guess.F
patching file src/ddscf/uhf.F
patching file src/mcscf/mcscf.F
patching file src/nwdft/scf_dft/dft_mxspin_ovlp.F
patching file src/nwdft/scf_dft/dft_scf.F
patching file src/nwdft/scf_dft_cg/dft_cg_solve.F
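
To avoid ending up with a half-patched tree, it can help to test the patch before applying it. The sketch below assumes GNU patch, whose --dry-run option checks whether every hunk applies without modifying any files. The function name apply_patch_checked is hypothetical.

```shell
# apply_patch_checked is a hypothetical helper, shown for illustration only.
# Usage (from the top of the source tree): apply_patch_checked 6.3.patch
apply_patch_checked() {
    patchfile="$1"
    # --dry-run: report what would happen, but leave all files untouched.
    if patch -p0 --dry-run < "$patchfile" > /dev/null; then
        # Every hunk applies, so do it for real.
        patch -p0 < "$patchfile"
    else
        echo "patch does not apply cleanly; nothing was changed" >&2
        return 1
    fi
}
```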

export NWCHEM_TOP=`pwd`
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_int64/lib -lacml"
#export BLASOPT="-L/opt/openblas/lib -lopenblas"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib"
#export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib

export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err

cd $NWCHEM_TOP/contrib
export FC=gfortran
./getmem.nwchem
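
Since the make output goes to make.log and make.err rather than the terminal, a failed build is easy to miss. A small check along these lines confirms that the binary was produced. It is a sketch only: check_build is a made-up helper, and it assumes the binary ends up in bin/LINUX64/nwchem under NWCHEM_TOP, which is the location the ~/.bashrc section of this post points to.

```shell
# check_build is a hypothetical helper, not part of the nwchem build system.
# It relies on NWCHEM_TOP and NWCHEM_TARGET being set as in the commands above.
check_build() {
    binary="$NWCHEM_TOP/bin/$NWCHEM_TARGET/nwchem"
    if [ -x "$binary" ]; then
        echo "build OK: $binary"
    else
        echo "no executable at $binary" >&2
        # Show the end of the error log, if the build wrote one.
        if [ -f "$NWCHEM_TOP/src/make.err" ]; then
            tail -n 20 "$NWCHEM_TOP/src/make.err" >&2
        fi
        return 1
    fi
}
```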

To compile with OpenBLAS instead of ACML (e.g. if you're compiling on an Intel machine instead of AMD), comment out the two ACML lines (the active BLASOPT and LIBRARY_PATH exports) and uncomment the commented-out OpenBLAS lines directly below them.
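
One way to avoid editing comments by hand is to pick the library through a small function. The following is a sketch only: select_blas is a name invented here, and the library paths are simply the ones from the commands above, so adjust them to your own install.

```shell
# select_blas is a hypothetical helper; it is not something nwchem provides.
# Usage: select_blas acml|openblas
select_blas() {
    case "$1" in
        acml)
            blas_dir=/opt/acml/acml5.3.1/gfortran64_int64/lib
            BLASOPT="-L$blas_dir -lacml"
            ;;
        openblas)
            blas_dir=/opt/openblas/lib
            BLASOPT="-L$blas_dir -lopenblas"
            ;;
        *)
            echo "unknown BLAS flavour: $1" >&2
            return 1
            ;;
    esac
    # Append to LIBRARY_PATH, avoiding a leading ':' when it is still empty.
    LIBRARY_PATH="${LIBRARY_PATH:+$LIBRARY_PATH:}/usr/lib/openmpi/lib:$blas_dir"
    export BLASOPT LIBRARY_PATH
}

# Example: select_blas openblas
```

Call it once, in place of the BLASOPT and LIBRARY_PATH export lines, before running make.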

Setting up your ~/.bashrc
As usual, do
echo 'export NWCHEM_EXECUTABLE=/opt/nwchem/nwchem-6.3-src.2013-05-28/bin/LINUX64/nwchem' >> ~/.bashrc
echo 'export NWCHEM_BASIS_LIBRARY=/opt/nwchem/nwchem-6.3-src.2013-05-28/src/basis/libraries/' >> ~/.bashrc
echo 'export PATH=$PATH:/opt/nwchem/nwchem-6.3-src.2013-05-28/bin/LINUX64' >> ~/.bashrc
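
Note that running those echo lines a second time (for example after a rebuild) appends duplicate entries. A sketch of an idempotent variant, using a hypothetical helper called append_once:

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
# (Hypothetical helper, shown for illustration.)
append_once() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match whole lines only, -F: treat the pattern as a fixed string
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example:
# append_once ~/.bashrc 'export NWCHEM_EXECUTABLE=/opt/nwchem/nwchem-6.3-src.2013-05-28/bin/LINUX64/nwchem'
```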

Also, put a .nwchemrc in your home folder:
nwchem_basis_library /opt/nwchem/nwchem-6.3-src.2013-05-28/src/basis/libraries/
ffield amber
amber_1 /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/amber_s/
amber_2 /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/amber_x/
amber_3 /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/amber_q/
amber_4 /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/amber_u/
amber_5 /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/custom/
spce /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/solvents/spce.rst
charmm_s /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/charmm_s/
charmm_x /opt/nwchem/nwchem-6.3-src.2013-05-28/src/data/charmm_x/
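
Since every path in that file shares the same prefix, the file can also be generated rather than typed. A minimal sketch, assuming the same directory layout as above. The helper write_nwchemrc is hypothetical.

```shell
# write_nwchemrc TOP [OUTFILE]: write a .nwchemrc pointing into the source tree TOP.
# (Hypothetical helper; the entries mirror the file listed above.)
write_nwchemrc() {
    top="$1"
    out="${2:-$HOME/.nwchemrc}"
    cat > "$out" <<EOF
nwchem_basis_library $top/src/basis/libraries/
ffield amber
amber_1 $top/src/data/amber_s/
amber_2 $top/src/data/amber_x/
amber_3 $top/src/data/amber_q/
amber_4 $top/src/data/amber_u/
amber_5 $top/src/data/custom/
spce $top/src/data/solvents/spce.rst
charmm_s $top/src/data/charmm_s/
charmm_x $top/src/data/charmm_x/
EOF
}

# Example:
# write_nwchemrc /opt/nwchem/nwchem-6.3-src.2013-05-28
```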

You're done.

10 June 2013

443. Briefly: Running the QA tests in NWChem

To make sure that everything is working properly and that you get the expected results from your nwchem binaries, you should run the QA tests that come with nwchem.

Here's how to do it with the nwchem 6.3 QA tests.

I'm presuming that you built nwchem with mpi as shown e.g. here: http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html

In this particular case I'm using nwchem linked with openblas on an AMD Phenom II 1055T with 8 GB RAM since it was the only node that was free.

0. Go to the QA directories.
In my case everything is housed in /opt/nwchem/nwchem-6.3-src.patched and the QA tests are in /opt/nwchem/nwchem-6.3-src.patched/QA

1. Run the tests
First set the environment variables, then start the tests. The 6 in './doqmtests.mpi 6' is the number of processes, i.e. cores, to use in parallel.

export NWCHEM_TOP=/opt/nwchem/nwchem-6.3-src.patched
export NWCHEM_TARGET=LINUX64
./doqmtests.mpi 6 |tee doqmtests.mpi.log
======================================================
QM: Running all tests (including some really big ones)
======================================================

Running tests/h2o_opt/h2o_opt

     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)

26.3u 8.8s 0:07.13 492.8% (0t+0ds+0avg+49046max)k 0i+6199464o 18pf 0swaps
     verifying output ... OK

Running tests/c2h4/c2h4

     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)

55.9u 2.1s 0:10.92 532.3% (0t+0ds+0avg+59834max)k 0i+808848o 19pf 0swaps
     verifying output ... OK
[..]

2. Verify

Once the runs are done, go through the log to find out which, if any, failed. In my case, I had
Running tests/autosym/autosym 
 
     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)
 
15.3u 2.5s 0:04.20 426.4% (0t+0ds+0avg+48842max)k 0i+1325784o 17pf 0swaps
     verifying output ... failed

Running tests/dft_s12gh/dft_s12gh 
 
     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)
 
619.4u 4.3s 1:45.33 592.1% (0t+0ds+0avg+76724max)k 2472i+1938952o 26pf 0swaps
     verifying output ... failed
 
Failed
 
Running tests/cosmo_trichloroethene/cosmo_trichloroethene 
 
     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)
 
113.6u 2.3s 0:19.57 592.8% (0t+0ds+0avg+63700max)k 64i+1149456o 20pf 0swaps
     verifying output ... failed

 Running tests/bsse_dft_trimer/bsse_dft_trimer 
 
     cleaning scratch
     copying input and verified output files
     running nwchem (/opt/nwchem/nwchem-6.3-src.patched/bin/LINUX64/nwchem)
 
228.2u 4.8s 0:40.16 580.5% (0t+0ds+0avg+53806max)k 0i+2716224o 21pf 0swaps
     verifying output ... failed
 
Failed

and so on. Not a good start.
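Rather than scrolling through the log by eye, the failed tests can be pulled out with a short awk filter. This is only a sketch: it assumes the log layout shown above (a 'Running tests/<dir>/<name>' line, followed later by a 'verifying output ...' line that contains 'failed'), and the function name list_failed_tests is made up for this example.

```shell
# list_failed_tests LOGFILE
# Print the name of every test whose "verifying output" line reports a failure.
# Assumes the doqmtests log layout shown above; the function name is hypothetical.
list_failed_tests() {
    awk '
        # Remember the most recent test, e.g. tests/autosym/autosym -> autosym
        $1 == "Running" && $2 ~ /^tests\// {
            n = split($2, parts, "/")
            current = parts[n]
        }
        # Report it if the verification line mentions a failure
        /verifying output .*failed/ && current != "" {
            print current
            current = ""
        }
    ' "$1"
}

# Example (not executed here):
#   list_failed_tests doqmtests.mpi.log > fails
```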

3. Troubleshoot the failed tests
To find out whether the failures are significant, we first need to understand how the script is doing the testing.

In runtests.mpi.unix
# Now verify the output

echo -n " verifying output ... "

perl $NWPARSE $STUB.out >& /dev/null
if ($status) then
  echo nwparse.pl failed on test output $STUB.out
  set overall_status = 1
  continue
endif
perl $NWPARSE $STUB.ok.out >& /dev/null
if ($status) then
  echo nwparse.pl failed on verified output $STUB.ok.out
  set overall_status = 1
  continue
endif

diff -w $STUB.ok.out.nwparse $STUB.out.nwparse >& /dev/null
@ diff1status = $status
#
endif
#

if ($diff1status) then
  echo "failed"
  set overall_status = 1
  continue
else

In my case autosym failed:

cd testoutputs
diff autosym.ok.out.nwparse autosym.out.nwparse 
45c45
< Effective nuclear repulsion energy (a.u.) 4265.6221
---
> Effective nuclear repulsion energy (a.u.) 4265.6222
It seems to be a rounding error. As far as I know, the precision at which the data is stored is significantly higher than that at which it is reported, so this doesn't necessarily have to be a problem (it's still not a good thing, though). Note that everything else, such as the thermochemical parameters, is identical.

Continuing:
diff dft_s12gh.ok.out.nwparse dft_s12gh.out.nwparse 
52c52
< The Zero-Point Energy (Kcal/mol) = 21.82496
---
> The Zero-Point Energy (Kcal/mol) = 21.82497
128c128
< H 0.0123 0.0030 0.0000
---
> H 0.0122 0.0030 0.0000
Same thing.
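Since the test script uses a plain 'diff -w', a difference in the last printed digit is enough to flag a test as failed. To separate those cases from real discrepancies, one option is a comparison that tolerates small numeric differences. The following is a rough sketch, not part of the NWChem QA suite: the function name nwparse_close and the default absolute tolerance of 2e-4 are arbitrary choices of mine.

```shell
# nwparse_close FILE_OK FILE_NEW [TOL]
# Exit status 0 if both files have the same fields line by line, where numeric
# fields may differ by at most TOL (absolute; default 0.0002, an arbitrary choice).
# Differences are printed. Hypothetical helper; assumes FILE_OK is not empty.
nwparse_close() {
    awk -v tol="${3:-0.0002}" '
        function isnum(s) { return s ~ /^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/ }
        function abs(x)   { return x < 0 ? -x : x }
        NR == FNR { ref[FNR] = $0; nref = FNR; next }   # first file: remember lines
        {
            nnew = FNR
            na = split(ref[FNR], a, " ")
            nb = split($0, b, " ")
            if (na != nb) { bad = 1; print "line " FNR ": field count differs"; next }
            for (i = 1; i <= na; i++) {
                if (isnum(a[i]) && isnum(b[i])) {
                    if (abs(a[i] - b[i]) > tol) {
                        bad = 1; print "line " FNR ": " a[i] " vs " b[i]
                    }
                } else if (a[i] != b[i]) {
                    bad = 1; print "line " FNR ": " a[i] " vs " b[i]
                }
            }
        }
        END { if (nref != nnew) bad = 1; exit (bad ? 1 : 0) }
    ' "$1" "$2"
}

# Example (not executed here):
#   nwparse_close testoutputs/autosym.ok.out.nwparse testoutputs/autosym.out.nwparse
```

An absolute tolerance is crude (energies and gradients live on very different scales), so treat a pass as "probably rounding" rather than as proof.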

Here's a list of the tests that failed for me (-> indicates that the execution failed -- more details below; * indicates that it is expected to fail):

autosym
dft_s12gh
cosmo_trichloroethene
bsse_dft_trimer
cosmo_h3co
cosmo_h3co_gp
h2o_diag_to_cg_ub3lyp
* oh2
dft_cr2
dft_x
dft_ozone
hess_nh3_ub3lyp
pspw_SiC
paw
-> tddft_h2o_mxvc20
-> tddft_h2o_uhf_mxvc20
tce_cr_eom_t_ch_rohf
hi_zora_sf
o2_zora_so
lys_qmmm
ethane_qmmm
qmmm_opt0
prop_ch3f
ch3f-lc-wpbe
ch3f-lc-wpbeh
ch3radical_rot
ch3radical_unrot
cho_bp_props
-> prop_cg_nh3_b3lyp
acr-camb3lyp-cdfit
acr-camb3lyp-direct
acr_lcblyp
o2_bnl
fh_m06 ???
disp_dimer_ch4
disp_dimer_ch4_cgmin
mep-test
sif_sodft
h2o_raman_3
h2o_raman_4
tropt-ch3nh2
h3_dirdyvtst
h2o_hcons
etf_hcons
cho_bp_props
-> dntmc_h2o_nh3
5h2o_core
co_core
talc
neb-fch3cl
neb-isobutene
nwxc_pspw_1he
nwxc_pspw_1ne
nwxc_pspw_4n
nwxc_pspw_4p
nwxc_pspw_new_1he
nwxc_pspw_new_3he
nwxc_pspw_new_1ne
nwxc_pspw_new_4n
nwxc_pspw_new_1ar
nwxc_pspw_new_4p
nwxc_pspw_new_1kr
nwxc_pspw_new_4as
nwxc_pspw_new_1xe
nwxc_pspw_new_4sb
hess_nh3_dimer
pbo_nesc1e
h2o_selci
hess_biph
-> ch4_zts
-> ch4cl_zts

All the jobs without a '->' or '*' failed due to rounding errors. To quickly go through them I put the list of failed jobs in a file, and then did
cat fails |xargs -I {} diff testoutputs/{}.ok.out.nwparse testoutputs/{}.out.nwparse|less


The jobs that failed outright are listed below:

-> tddft_h2o_mxvc20
tddft_diagon: negative excitation energy 0
------------------------------------------------------------------------
This type of error is most commonly associated with calculations not reaching convergence criteria

-> tddft_h2o_uhf_mxvc20
Last System Error Message from Task 5:: Numerical result out of range
tddft_diagon: negative excitation energy 0

-> prop_cg_nh3_b3lyp
task hessian incompatible with cgmin 0
------------------------------------------------------------------------
A feature requested has not yet been implemented

-> dntmc_h2o_nh3
********** Destroying SubGroups ***********
*******************************************
deleting cloned rtdb
deleting cloned rtdb
Closing subgroup
Closing subgroup
1:1:ga_pgroup_destroy_:Attempt to destroy process group with attached GAs:: 2
(rank:1 hostname:boron pid:1941):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/armci.c:ARMCI_Error():208 cond:0
2:2:ga_pgroup_destroy_:Attempt to destroy process group with attached GAs:: 2
(rank:2 hostname:boron pid:1942):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/armci.c:ARMCI_Error():208 cond:0
Last System Error Message from Task 1:: No such file or directory
Last System Error Message from Task 2:: No such file or directory

-> ch4_zts
scf string failed 0
------------------------------------------------------------------------
This type of error is most commonly associated with calculations not reaching convergence criteria

-> ch4cl_zts
scf string failed 0
------------------------------------------------------------------------
This type of error is most commonly associated with calculations not reaching convergence criteria

It's time to go back and compare with 1. nwchem-6.3/acml and 2. nwchem-6.1.1/openblas and 3. a different processor architecture...

The question is how serious this is. In most cases I think the rounding errors are ok, but errors do accumulate, and they can become significant, especially when numbers of very different magnitude are added or subtracted.
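A tiny illustration of how double precision silently drops a small contribution when it is combined with a much larger number. This uses awk, which does its arithmetic in IEEE doubles on ordinary systems; the function name is made up, and this is of course not NWChem code.

```shell
# lost_digits_demo
# At a magnitude of 1e17 the spacing between neighbouring doubles is 16, so
# adding 1 does not change the stored value and the difference comes out as 0.
lost_digits_demo() {
    awk 'BEGIN { big = 1e17; print (big + 1) - big }'
}
```

The same mechanism, repeated over many operations, is how last-digit differences can grow into visible ones.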

24 May 2013

432. NWChem 6.3 -- COSMO is now fast(er)!

I probably would've been more excited about this a year ago, when I 'believed' in implicit solvation models. (Nothing's perfect, and we use what is practical, so they do fill a strong need. They just aren't very informative for a lot of systems.) But it's still a Good Thing.

COSMO has been done using numerical gradients in nwchem 6.1.1 and earlier versions, which has meant that it's been horrendously slow in many cases, in particular if you need to optimise a structure using implicit solvation. COSMO has been -- and still is -- the only implicit solvation model implemented in NWChem, so slow COSMO puts a bit of a spanner in the solvation energy works. Sometimes the calculation even refuses to converge at all.

In contrast, Gaussian has had a number of implicit solvation models implemented, ranging from the quick and dirty PCM, to slower (and better?) C-PCM and I-PCM.

So this is great news.

A quick example:


The test:
Here's a test job (the default cosmo parameters aren't realistic, but this is for testing purposes):
scratch_dir /scratch
start benzene

geometry units angstroms
 C  0.100  1.396  0.000
 C  1.209  0.698  0.000
 C  1.209 -0.698  0.000
 C  0.000 -1.396  0.000
 C -1.209 -0.698  0.000
 C -1.209  0.698  0.000
 H  0.000  2.479  0.000
 H  2.147  1.240  0.000
 H  2.147 -1.240  0.000
 H  0.000 -2.479  0.000
 H -2.147 -1.240  0.000
 H -2.147  1.240  0.000
end

basis
 H library "6-31+g*"
 c library "6-31+g*"
end

dft
 direct
end

cosmo
end

scf
 maxiter 999
end

task dft
Note that this is the same test job (plus cosmo, minus optimize) as shown here: http://verahill.blogspot.com.au/2013/05/430-briefly-crude-comparison-of.html

The results:
And here is what I see with nwchem 6.1.1 and 6.3 (w/ acml 5.3.1, AMD FX 8150, 32 Gb RAM):
6.1.1 19.4 seconds
6.3   14.3 seconds

The difference isn't significant (in the sense that times are too variable so we can't really tell which is faster for such a short job).

But when we change task dft to task dft optimize we get
6.1.1 Fails after 2600 seconds
6.3   128.3 seconds

6.3 churns through the steps pretty efficiently:
@ Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
@ ---- ---------------- -------- -------- -------- -------- -------- --------
@    0    -230.09337488  0.0D+00  0.07376  0.01302  0.00000  0.00000     18.1
@    1    -230.10523734 -1.2D-02  0.00903  0.00231  0.03627  0.10509     45.7
@    2    -230.10619442 -9.6D-04  0.00491  0.00084  0.01898  0.06082     69.1
@    3    -230.10628696 -9.3D-05  0.00176  0.00030  0.00737  0.02428     93.3
@    4    -230.10629787 -1.1D-05  0.00023  0.00005  0.00219  0.00682    115.8
@    5    -230.10629827 -4.0D-07  0.00004  0.00001  0.00047  0.00136    128.2
@    5    -230.10629827 -4.0D-07  0.00004  0.00001  0.00047  0.00136    128.2
while 6.1.1 drags itself along for over 40 minutes:
@ Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
@ ---- ---------------- -------- -------- -------- -------- -------- --------
@    0    -230.09389924  0.0D+00  0.07389  0.01306  0.00000  0.00000    691.4
@    1    -230.10680306 -1.3D-02  0.01081  0.00197  0.03065  0.10438   1378.3
@    2    -230.10690186 -9.9D-05  0.01000  0.00167  0.00231  0.00803   2092.2
before failing with
6:6:driver: task_gradient failed:: 0
(rank:6 hostname:neon pid:4536):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/armci.c:ARMCI_Error():208 cond:0
------------------------------------------------------------------------
There is an error related to the specified geometry
------------------------------------------------------------------------
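The step tables above are the lines in the output that start with '@', so they can be pulled out of a long output file with a one-line filter. A small sketch, assuming the layout shown above ('@', step number, energy, ...). The function names opt_steps and last_opt_energy are placeholders of mine.

```shell
# opt_steps OUTFILE
# Print the optimisation summary lines (those whose first field is "@").
opt_steps() {
    awk '$1 == "@"' "$1"
}

# last_opt_energy OUTFILE
# Print the energy (third field) of the last numbered "@" step line, if any.
last_opt_energy() {
    awk '$1 == "@" && $2 ~ /^[0-9]+$/ { e = $3 } END { if (e != "") print e }' "$1"
}

# Example (not executed here; the file name is a placeholder):
#   opt_steps benzene.out
#   last_opt_energy benzene.out
```

This also makes it easy to watch a running job, e.g. by re-running opt_steps on the output file every now and then.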

Sure, the optimization takes 128 seconds instead of ca 44 seconds, but for anyone who's used NWCHEM with COSMO in the past, that's actually not too bad.

I ran another job to get a better feeling for how much longer COSMO vs no COSMO takes for optimization. Optimization of Arecoline (available in ECCE as a fragment) at rb3lyp/6-31+G* takes 2h 5 min with COSMO (33 optimization steps). Without COSMO it takes 37 minutes and uses 14 steps.

431. Briefly: a crude comparison of performance of NWChem 6.1, 6.1.1 and 6.3.

Just a simple comparison of different versions of nwchem on different hardware. It's mostly interesting to me as a general guide to how slow my nodes are in relative terms.

I built nwchem as shown here: http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html
and openblas as shown here: http://verahill.blogspot.com.au/2013/05/423-openblas-on-debian-wheezy.html
and installed acml as shown here: http://verahill.blogspot.com.au/2013/05/422-set-up-acml-on-linux.html

I'm using ECCE: http://verahill.blogspot.com.au/2013/01/325-compiling-ecce-64-on-debian-testing.html
and SGE: http://verahill.blogspot.com.au/2012/06/setting-up-sun-grid-engine-with-three.html
I've set up ECCE similarly to what is shown here: http://verahill.blogspot.com.au/2012/06/ecce-in-virtual-machine-step-by-step.html

Test job:
scratch_dir /scratch
Title "opt freq"

Start  biphenyl_cation_twisted

echo

charge 1

geometry autosym units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
end

ecce_print ecce.out

basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
END

dft
  mult 2
  XC b3lyp
  mulliken
end

driver
end

task dft optimize
task dft freq numerical

Results:
The jobs were run using all cores available.

AMD Phenom II X6 1055T, 8 Gb RAM, Openblas. Six cores.
6.1    2461s
6.1.1  2114s
6.3    2044s
6.3    2048s**

**using MKL compiled with ifort (http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html).

AMD FX 8150, 32 Gb RAM, acml 5.3.1 (gfortran, int64, fma4) -- earlier versions of nwchem were compiled against different versions of acml. Eight cores.
6.1    1619s
6.1.1  1588s
6.3    1611s
6.3    1507s**

**using MKL compiled with ifort (http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html).

Intel i5-2400, 16 Gb RAM, openblas. Four cores.
6.1    1689s
6.1.1  1696s
6.3    1652s
6.3    1550s*
6.3    1498s**

*using Intel MKL (see http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html)
**using MKL compiled with ifort (http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html).

AMD Athlon II X3, 4 Gb RAM, acml 5.3.1. Three cores.
6.3    4818s 
6.3    4058s**
**using MKL compiled with ifort (http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html).
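To put the timings above in relative terms, here is a small helper that turns two wall times into a percentage. It is just arithmetic on the numbers in the tables; the function name pct_faster is made up.

```shell
# pct_faster OLD NEW
# Print by how many percent NEW is shorter than OLD (negative means slower).
pct_faster() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", 100 * (old - new) / old }'
}

# Using the 6.3 rows from the tables above:
#   pct_faster 4818 4058   # Athlon II X3, acml vs MKL/ifort: prints 15.8
#   pct_faster 1652 1498   # i5-2400, openblas vs MKL/ifort:  prints 9.3
#   pct_faster 2044 2048   # Phenom II X6, openblas vs MKL/ifort: prints -0.2
```

Keep in mind that these are single runs, so differences of a percent or so are probably within the noise.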

23 May 2013

430. Strange issue with NWChem, openmpi, SGE and ECCE

This one's a bit odd.

Odd in the sense that

  • the math libs (acml) I'm using should be suitable for the processors that I'm using them for.
  • it only happens when I submit with ECCE + SGE. Calcs on the input files are fine if I launch them by hand



The problem:
I'm having issues launching jobs on two nodes where the nwchem 6.3 binaries were compiled against acml 5.3.1 (gfortran, int64). I'm launching the jobs from ECCE, and I've had SGE set up and working for a long time. My two other nodes, one i5-2400 linked against openblas and one AMD FX 8150 linked against acml 5.3.1 (gfortran, fma4, int64), work absolutely fine.

Both binaries were linked with acml using
export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_int64/lib -lacml"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib"

The first node is an AMD Phenom II X6 1055T, while the second one is an ancient, recently-revived AMD Athlon X2 3800+. The acml util cpuid.exe gives
Chip manufacturer: AuthenticAMD
AuthenticAMD family 15 extended family 1 model 10
Model Name: AMD Phenom(tm) II X6 1055T Processor
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip does not support AVX
Chip does not support FMA3
Chip does not support FMA4
and
Model Name: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip does not support AVX
Chip does not support FMA3
Chip does not support FMA4
respectively. On the AMD Phenom II X6 1055T I kept getting
Scaling coordinates for geometry "geometry" by 1.889725989
(inverse scale = 0.529177249)
0:Illegal Instruction error, status=: 4
(rank:0 hostname:boron pid:12386):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigIllHandler():276 cond:0
On the Athlon 64 X2 3800+ the job would just exit at
Directory information
---------------------
0 permanent = .
0 scratch = /home/me/scratch
There would be no other errors (in e.g. .po or .o files).
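An 'Illegal Instruction' error usually means that the binary (or a library it is linked against) tried to execute an instruction that the CPU doesn't have, so it can be worth cross-checking the cpuid.exe output against what the kernel reports in /proc/cpuinfo. A minimal sketch follows. The function name cpu_has_flag is my own, and note that /proc/cpuinfo uses its own flag names: SSE3 shows up as 'pni' and FMA3 as 'fma', while AVX and FMA4 are 'avx' and 'fma4'.

```shell
# cpu_has_flag FLAG [CPUINFO_FILE]
# Exit status 0 if FLAG is listed on a "flags" line, 1 otherwise.
# CPUINFO_FILE defaults to /proc/cpuinfo (Linux only). Hypothetical helper.
cpu_has_flag() {
    awk -v want="$1" '
        $1 == "flags" {
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/cpuinfo}"
}

# Example (not executed here):
#   for f in sse2 pni avx fma fma4; do
#       if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
#   done
```

This only tells you what the CPU supports. It says nothing about which instructions a given acml or openblas build actually uses.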

If I launch the job by hand, e.g.
mpirun -n 6 nwchem nwch.nw
it works fine.



The partial solution
The errors for the AMD Phenom II X6 1055T went away when I used openblas instead of acml:
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib"

See e.g. http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html for general compilation instructions.

The odd thing:
With openblas the AMD Athlon X2 3800+ suddenly gives
Scaling coordinates for geometry "geometry" by 1.889725989
(inverse scale = 0.529177249)
0:Illegal Instruction error, status=: 4
(rank:0 hostname:beryllium pid:9267):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigIllHandler():276 cond:0

19 May 2013

424. NWChem 6.3 on Debian Wheezy

Update 23 May 2013: The execution times are pretty much the same as for 6.1.1 with a new patch. I've updated the instructions below to incorporate this new patch (http://www.nwchem-sw.org/images/Iswtch.patch.gz)

Update 21 May 2013: The execution times can be improved considerably by setting
ARMCI_NETWORK=SOCKETS

They are still ca 30% longer than 6.1.1 though due to slower SCF convergence. See http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id834/Nwchem_6.3_running_2-5_times_slo....html

Update 20 May 2013: I did a bit of basic benchmarking. NWChem 6.3 is incredibly slow (ca 190s vs 40s for the 8 core, 3.6 GHz benchmark in http://verahill.blogspot.com.au/2013/05/414-frequency-vs-cores-crude.html). It's parallelising properly from what I can see (i.e. it is not running 8 serial jobs). I've repeated the calc with an unpatched version of nwchem 6.3, and it is just as slow.
I'll post updates here if I figure this one out.

Original post:
NWChem 6.3 is just out. Here's how to build it for CPU computations.

To build on CentOS 5.6, see http://verahill.blogspot.com.au/2013/05/421-nwchem-63-on-rocks-543centos-56.html


Math library:
Use either openblas (for intel or AMD) or ACML (for AMD).

My GabEdit/Python NWChem patch
This is NOT the patch alluded to in the 23 May update and is optional. It enables python support, and makes the output more verbose so that gabedit can be used as an alternative to ECCE. Hence, it is required if, but only if, you want to enable python and to be able to use GabEdit to open output files.

First create a patch file, e.g. diff.patch.

diff -rupN src.original/config/makefile.h src/config/makefile.h
--- src.original/config/makefile.h 2013-04-15 12:41:45.016853322 +1000
+++ src/config/makefile.h 2013-04-15 12:38:44.933319544 +1000
@@ -2039,7 +2039,7 @@ endif
 
      ifeq ($(BUILDING_PYTHON),python)
 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
-     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
+     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz
   LDOPTIONS = -Wl,--export-dynamic 
      endif
 ifeq ($(NWCHEM_TARGET),CATAMOUNT)
diff -rupN src.original/ddscf/movecs_pr_anal.F src/ddscf/movecs_pr_anal.F
--- src.original/ddscf/movecs_pr_anal.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/movecs_pr_anal.F 2013-04-15 12:23:28.100409225 +1000
@@ -195,7 +195,7 @@ c
  22         format(1x,2('  Bfn.  Coefficient  Atom+Function  ',5x))
             write(LuOut,23)
  23         format(1x,2(' ----- ------------  ---------------',5x))
-            do klo = 0, min(n-1,9), 2
+            do klo = 0, min(n-1,199), 2
                khi = min(klo+1,n-1)
                write(LuOut,2) (
      $              int_mb(k_list+k)+1, 
diff -rupN src.original/ddscf/rohf.F src/ddscf/rohf.F
--- src.original/ddscf/rohf.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/rohf.F 2013-04-15 12:23:28.100409225 +1000
@@ -153,7 +153,7 @@ c
             ilo = 1
             ihi = nmo
          endif
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'ROHF Final Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
diff -rupN src.original/ddscf/scf_vec_guess.F src/ddscf/scf_vec_guess.F
--- src.original/ddscf/scf_vec_guess.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/scf_vec_guess.F 2013-04-15 12:23:28.100409225 +1000
@@ -511,19 +511,19 @@ c
          nprint = min(nclosed+nopen+30,nmo)
          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
             call movecs_print_anal(basis, 1,
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'ROHF Initial Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
          else
             nprint = min(nalpha+20,nmo)
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'UHF Initial Alpha Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs(2), 
+     &           nprint, 0.01d0, g_movecs(2), 
      &           'UHF Initial Beta Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      &           .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/ddscf/uhf.F src/ddscf/uhf.F
--- src.original/ddscf/uhf.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/uhf.F 2013-04-15 12:23:28.096409414 +1000
@@ -144,11 +144,11 @@ C
          enddo
          ihi = max(ihi-1,1)
  9611    continue
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'UHF Final Alpha Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2), 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs(2), 
      $        'UHF Final Beta Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      $        .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/mcscf/mcscf.F src/mcscf/mcscf.F
--- src.original/mcscf/mcscf.F 2013-04-15 12:41:45.000854073 +1000
+++ src/mcscf/mcscf.F 2013-04-15 12:23:23.748613695 +1000
@@ -719,7 +719,7 @@ c
       if (util_print('final vectors analysis', print_default))
      $     call movecs_print_anal(basis, 
      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
-     $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
+     $     0.01d0, g_movecs, 'Analysis of MCSCF natural orbitals',
      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym), 
      $     .true., dbl_mb(k_occ))
 c     
diff -rupN src.original/nwdft/scf_dft/dft_mxspin_ovlp.F src/nwdft/scf_dft/dft_mxspin_ovlp.F
--- src.original/nwdft/scf_dft/dft_mxspin_ovlp.F 2013-04-15 12:41:45.604825677 +1000
+++ src/nwdft/scf_dft/dft_mxspin_ovlp.F 2013-04-15 12:23:28.228403211 +1000
@@ -184,14 +184,14 @@ c
       call ga_sync()
 c
       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
-     & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
+     & ,0.01d0,g_alpha,'Alpha Orbitals without Beta Partners',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
       if (nct.GE.2) then
       do i = 2,nct
       ind = int_mb(k_non+i-1)
       call movecs_print_anal(basis,ind,ind
-     & ,0.15d0,g_alpha,' ',
+     & ,0.01d0,g_alpha,' ',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
       enddo
       endif
@@ -350,7 +350,7 @@ c      endif
 c      endif
 c 9990 format(/,18x,'THERE ARE',i3,1x,'UN-PARTNERED ALPHA ORBITALS')
 c
-       call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
+       call movecs_print_anal(basis, 1, nalp, 0.01d0, g_ualpha,
      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
diff -rupN src.original/nwdft/scf_dft/dft_scf.F src/nwdft/scf_dft/dft_scf.F
--- src.original/nwdft/scf_dft/dft_scf.F 2013-04-15 12:41:45.608825490 +1000
+++ src/nwdft/scf_dft/dft_scf.F 2013-04-15 12:23:28.228403211 +1000
@@ -1774,7 +1774,7 @@ c
             else
                blob='DFT Final Beta Molecular Orbital Analysis' 
             endif
-            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0, 
+            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.01d0, 
      &           g_movecs(ispin), 
      &           blob, 
      &           .true., dbl_mb(k_eval(ispin)), oadapt, 
diff -rupN src.original/nwdft/scf_dft_cg/dft_cg_solve.F src/nwdft/scf_dft_cg/dft_cg_solve.F
--- src.original/nwdft/scf_dft_cg/dft_cg_solve.F 2013-04-15 12:41:45.612825303 +1000
+++ src/nwdft/scf_dft_cg/dft_cg_solve.F 2013-04-15 12:23:28.220403588 +1000
@@ -183,7 +183,7 @@ c
             blob = 'DFT Final Beta Molecular Orbital Analysis'
           endif
           call movecs_fix_phase(g_movecs(ispin))
-          call movecs_print_anal(basis, ilo, ihi, 0.15d0,
+          call movecs_print_anal(basis, ilo, ihi, 0.01d0,
      &         g_movecs(ispin),blob,
      &         .true., dbl_mb(k_eval+(ispin-1)*nbf),
      &         oadapt, int_mb(k_irs+(ispin-1)*nbf),


Compile NWChem
This example uses the ACML libs. See e.g. this post for openblas settings.

sudo apt-get install build-essential gfortran python2.7-dev libopenmpi-dev openmpi-bin
sudo mkdir /opt/nwchem
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem/
wget http://www.nwchem-sw.org/download.php?f=Nwchem-6.3-src.2013-05-17.tar.gz
mv download.php\?f\=Nwchem-6.3-src.2013-05-17.tar.gz Nwchem-6.3-src.2013-05-17.tar.gz
tar xvf Nwchem-6.3-src.2013-05-17.tar.gz
cd nwchem-6.3-src.2013-05-17/
patch -p0 < diff.patch
patching file src/config/makefile.h
patching file src/ddscf/movecs_pr_anal.F
patching file src/ddscf/rohf.F
patching file src/ddscf/scf_vec_guess.F
patching file src/ddscf/uhf.F
patching file src/mcscf/mcscf.F
patching file src/nwdft/scf_dft/dft_mxspin_ovlp.F
patching file src/nwdft/scf_dft/dft_scf.F
patching file src/nwdft/scf_dft_cg/dft_cg_solve.F
cd src/
wget http://www.nwchem-sw.org/images/Iswtch.patch.gz
gzip -d Iswtch.patch
patch -p0 < Iswtch.patch
cd ../

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr

export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_int64/lib -lacml"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib"

export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err

cd $NWCHEM_TOP/contrib
export FC=gfortran
./getmem.nwchem
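Because the make output is redirected into make.log and make.err, a failed build isn't obvious on screen. A quick sanity check is to see whether the binary ended up where the rest of these posts expect it ($NWCHEM_TOP/bin/LINUX64/nwchem). The sketch below does only that. The function name check_nwchem_build is hypothetical.

```shell
# check_nwchem_build TOP [TARGET]
# Report whether an executable exists at TOP/bin/TARGET/nwchem.
# TARGET defaults to LINUX64. Hypothetical helper, not part of NWChem.
check_nwchem_build() {
    top="$1"
    target="${2:-LINUX64}"
    bin="$top/bin/$target/nwchem"
    if [ -x "$bin" ]; then
        echo "OK: $bin"
    else
        echo "No executable at $bin; check $top/src/make.err" >&2
        return 1
    fi
}

# Example (not executed here):
#   check_nwchem_build "$NWCHEM_TOP" "$NWCHEM_TARGET"
```

A binary being present doesn't mean it works, of course. That's what the QA tests are for.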


Settings
Create /opt/nwchem/default.nwchemrc
nwchem_basis_library /opt/nwchem/nwchem-6.3-src.2013-05-17/src/basis/libraries/
ffield amber
amber_1 /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_s/
amber_2 /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_x/
amber_3 /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_q/
amber_4 /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_u/
amber_5 /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/custom/
spce /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/solvents/spce.rst
charmm_s /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/charmm_s/
charmm_x /opt/nwchem/nwchem-6.3-src.2013-05-17/src/data/charmm_x/

Symlink to this file in each user's home:
ln -s /opt/nwchem/default.nwchemrc ~/.nwchemrc
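If there are several users, the ln command can be wrapped so that it never clobbers a .nwchemrc that somebody has already set up. A sketch only: the function name link_nwchemrc is made up, and looping over /home/* assumes that's where your users' home directories live.

```shell
# link_nwchemrc HOME_DIR [RC_FILE]
# Create HOME_DIR/.nwchemrc as a symlink to RC_FILE unless something is
# already there. RC_FILE defaults to /opt/nwchem/default.nwchemrc.
link_nwchemrc() {
    home_dir="$1"
    rc="${2:-/opt/nwchem/default.nwchemrc}"
    target="$home_dir/.nwchemrc"
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "skipping $target (already exists)"
    else
        ln -s "$rc" "$target" && echo "linked $target"
    fi
}

# Example (not executed here; run with sufficient permissions):
#   for h in /home/*; do link_nwchemrc "$h"; done
```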