News


Here is all the latest news from the UCF ARCC:


Fall Maintenance Cycle (December 10-16, 2018)

Stokes and Newton will be taken down for our twice-yearly routine maintenance cycle in mid-December. Specifically, the clusters will be unavailable from the morning of Monday, December 10, through the morning of Monday, December 17.

The primary objective during this downtime is to upgrade our scheduler, SLURM, and to change some of its default options. We will also make some changes to the Python installations to bring more consistency across versions. We will provide more detail in the change log when we bring the system back online.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of each downtime, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it is never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes before the downtime.
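For example, one common way to pull a copy of a project directory off the cluster is rsync, as in the sketch below, run from the destination machine. This is an illustrative example only; the username, hostname, and paths are placeholders, not actual ARCC addresses.

    # Copy ~/myproject from the cluster to a local backup directory.
    # "myuser", the hostname, and both paths are placeholder values.
    rsync -avz myuser@stokes.example.edu:~/myproject /local/backup/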

Changes due to ARCC Spring 2018 Maintenance Cycle

Stokes and Newton have returned to operation! Please remember that we have two such maintenance downtime periods per year, one in late Fall and one in late Spring (the one we just completed).

Please take a moment to read over the changes:

  1. The base OS of the nodes was upgraded to CentOS 7.4, and the resource manager was upgraded to SLURM 17.11.6. This fixed a problem we had been having with GPU reservations on Newton (see the job-script sketch after this list).
  2. The NFS file system that supports our shared applications area was expanded.
  3. The IST external core network switches were repaired, so our external links should be back to 20 Gb/s (they had been limited to 10 Gb/s for several months).
  4. There is a new version of the node matrix for Stokes, and one is now available for Newton.
  5. Newer versions of several software packages were built, including gdal, geos, proj, openbugs, lapack, and openmpi.
  6. The applications in /apps/jags were rebuilt with newer compilers and renamed to fix some inconsistencies with our naming standards; the modules were also renamed. The old distributions are no longer present. This may affect some R users. You can find out more by typing:
      module avail jags/jags
    
  7. We have begun the process of rebuilding and cleaning up the R builds. The old builds and modules still exist for now; however, over the summer we will be transitioning to R 3.5.0 under newer build tools, and by the end of the summer we hope to have moved all users off older versions of R. All library packages currently installed will still be available. If you are an R user and have concerns, please contact us.
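
As a sketch of the GPU reservation fix mentioned in item 1, a minimal SLURM batch script requesting a single GPU might look like the following. This is an illustrative example, not an ARCC-provided template; the job name, time limit, and command are placeholder values.

    #!/bin/bash
    # Minimal SLURM batch script requesting one GPU on one node.
    # The job name, time limit, and command below are placeholders.
    #SBATCH --job-name=gpu-test
    #SBATCH --nodes=1
    #SBATCH --gres=gpu:1
    #SBATCH --time=00:10:00

    nvidia-smi    # reports the GPU that SLURM actually allocated

You would submit this with sbatch and monitor it with squeue, as with any other job.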

We appreciate your attention and wish you the best of luck with your research. If you have any questions or concerns, please submit a ticket.

Spring 2018 Maintenance Cycle (May 19-27, 2018)

Stokes and Newton will be taken down for our twice-yearly routine maintenance cycle in mid-May. Specifically, the clusters will be unavailable from the morning of Saturday, May 19, through the morning of Monday, May 28.

The primary objective during this downtime is to upgrade the underlying operating system of the cluster nodes to CentOS 7.4. We will also make some changes to the R installations to bring more consistency across versions. We will provide more detail in the change log when we bring the system back online.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of each downtime, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it is never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes before the downtime.

End of ARCC Fall 2017 Maintenance Cycle

Stokes and Newton have returned to operation! Please remember that we have two such maintenance downtimes per year, one in late Fall (the one we just completed) and one in late Spring.

Please take a moment to read over the changes:

  1. Eight new nodes were installed (ec[49-56]). They all have new Intel Skylake processors (32 cores per node) and 192 GB of memory.
  2. We replaced our data transfer node with a new machine.
  3. The scheduling software was upgraded to SLURM 17.11.0. All command and script syntax should behave as it did before.
  4. All software built directly against SLURM libraries (e.g., certain MVAPICH2 and OpenMPI builds and software depending on them) is no longer available. Please contact us if you have trouble using an alternative module.
  5. The file system firmware, Lustre server software, and Lustre client software were upgraded. This should not affect users.
  6. Our Python builds had accumulated a number of inconsistencies and errors, so we took this opportunity to clean them up and rebuild everything. There are now only four Python modules, so some of you may have to change your scripts to use the new modules (see the usage sketch after this list). The Python modules are:
    • python/python-2.7.14-gcc-7.1.0
    • python/python-3.6.3-gcc-7.1.0
    • python/python-2.7.14-ic-2017.1.043
    • python/python-3.6.3-ic-2017.1.043
  7. The environment module system was replaced with a new system, Lmod. Its syntax is the same as the old system's, so your scripts should not need to change; however, the new system has much more functionality. For more information, see: http://lmod.readthedocs.io/en/latest/010_user.html
  8. Because we have a new module system, we had to rewrite all of our module scripts. The vast majority of modules were replicated in the new system exactly as they were in the old. If you experience problems with any new modules, please submit a ticket by sending email to req...@ist.ucf.edu. There were some changes to a few modules:
    • apache-maven-3.5.0 was renamed to maven-3.5.0
    • scalapack-2.0.2-mvapich2-2.2-ic-2017.1.043 was removed
    • protobuf-3.1.0-gcc-6.2.0 was removed
    • vasp-5.4-openmpi-1.8.6-ic-2015.3.187 was renamed to vasp-5.4-openmpi-1.8.3-ic-2015.3.187
    • libdrm-2.4.81-ic-2017 was renamed to libdrm-2.4.81-ic-2017.1.043
    • meep-1.3-gcc-4.9.2 was renamed to meep-1.3-openmpi-1.8.3-gcc-4.9.2
    • partitionfinder-1.1.1-ps1 was renamed to partitionfinder-1.1.1-pf1
    • petsc-3.5.2-openmpi-1.8.3-ic-2013 was renamed to petsc-3.5.2-openmpi-1.8.3-gcc-4.9.2
    • qt-4.8.3-gcc-4.9.2 was renamed to qt-4.8.3-gcc-6.2.0
    • qt-5.8.0-ic-2017 was renamed to qt-5.8.0-ic-2017.1.043
    • torch-cuda-7.5.18-openblas-0.2.13-ic-2015.3.187 was renamed to torch-cuda-7.5.18-openblas-0.2.13-gcc-4.9.2
    • openblas-0.2.13-gcc-6.2.0-openmp was renamed to openblas-0.2.13-gcc-6.2.0-useopenmp-build.sh
    • armadillo-7.900-1-gcc-6.2.0 was renamed to armadillo-7.900.1-gcc-6.2.0
    • arlequin-3552 was renamed to arlequin-3522
    • R-3.4.1-openmpi-2.1.1-gcc-7.1.0 was removed
    • pegasus-4.7.4-gcc-7.1.0 was removed
    • All slurm-16.05 module variants were removed, as described above in item #4
    • The Python modules were corrected as described in item #6
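
To illustrate items 6 and 7 above, switching a script to one of the four new Python builds and exploring the module tree under Lmod might look like the following sketch (the module name shown is one of the four listed in item 6):

    # Load one of the four rebuilt Python modules (item 6).
    module load python/python-3.6.3-gcc-7.1.0
    python --version

    # Lmod (item 7) keeps the classic module syntax but adds commands
    # such as "module spider", which searches all available modules.
    module spider python
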
About the UCF ARCC:
The University of Central Florida (UCF) Advanced Research Computing Center is managed by the Institute for Simulation and Training, with subsidies from the UCF Provost and the Vice President for Research and Commercialization, for use by all UCF faculty and their students. Collaboration with other universities and industry is also possible.
Contact Info:
UCF Advanced Research Computing Center
3039 Technology Parkway, Suite 220
Orlando, FL 32826
P: 407-882-1147
Request Help