News


Here is all the latest news from the UCF ARCC:


Spring 2018 Maintenance Cycle (May 19-27, 2018)

Stokes and Newton will be taken down for our twice-yearly routine maintenance cycle in mid-May. Specifically, the clusters will be unavailable from the morning of Saturday, May 19 through the morning of Monday, May 28.

The primary objective during this downtime is to upgrade the clusters' underlying operating systems to CentOS 7.4. We will also make some changes to the R installations to bring more consistency across versions. We will provide more detail in the change log when we bring the system back online.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of such downtimes, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it's never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes prior to the downtime.
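As a minimal sketch of such a pre-downtime backup (all paths here are placeholders, not actual Stokes directories), you might bundle your important files into a dated archive and then copy it to another machine, e.g. with scp:

```shell
# Sketch only: paths and filenames are illustrative placeholders.
# On the cluster, bundle a project directory into a dated archive:
mkdir -p /tmp/demo_home/projects
echo "important results" > /tmp/demo_home/projects/results.csv
tar -czf /tmp/demo_home/backup-2018-05-18.tar.gz -C /tmp/demo_home projects

# Verify the archive contents before relying on it:
tar -tzf /tmp/demo_home/backup-2018-05-18.tar.gz

# Then copy it off the cluster, e.g. (hostname is a placeholder):
#   scp /tmp/demo_home/backup-2018-05-18.tar.gz you@your-local-machine:backups/
```

Listing the archive with `tar -tzf` before the downtime is a cheap way to confirm the backup actually contains what you expect.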

End of ARCC Fall 2017 Maintenance Cycle

Stokes and Newton have returned to operation! Please remember that we have two such maintenance downtimes per year, one in late Fall (the one we just completed) and one in late Spring.

Please take a moment to read over the changes:

  1. Eight new nodes were installed (ec[49-56]). They all have the new Intel Skylake processors (32 cores per node) and 192 GB of memory.
  2. We replaced our data transfer node with a new machine.
  3. The scheduling software was upgraded to SLURM 17.11.0. All command and script syntax should behave as before.
  4. All software built directly against SLURM libraries (e.g., certain MVAPICH2 and OpenMPI libraries and software depending on them) is no longer available. Please contact us if you have trouble using an alternative module.
  5. The file system firmware, Lustre server software, and Lustre client software were upgraded. This should not affect users.
  6. Our Python builds had accumulated inconsistencies and errors, so we took this opportunity to clean them up and rebuild everything. There are now only four Python modules, so some of you may have to change your scripts to use the new modules. The Python modules are:
    • python/python-2.7.14-gcc-7.1.0
    • python/python-3.6.3-gcc-7.1.0
    • python/python-2.7.14-ic-2017.1.043
    • python/python-3.6.3-ic-2017.1.043
  7. The environment module system was replaced with a new system, Lmod. The syntax for this system is the same as the old, so your scripts should not have to change. However, the new system has a lot more functionality. For more information about it, see: http://lmod.readthedocs.io/en/latest/010_user.html
  8. Because we have a new module system, we had to rewrite all our module scripts. The vast majority of modules were replicated in the new system just as they were in the old. If you experience problems with any new modules, please submit a ticket by emailing req...@ist.ucf.edu. There were some changes to a few modules:
    • apache-maven-3.5.0 was renamed to maven-3.5.0
    • scalapack-2.0.2-mvapich2-2.2-ic-2017.1.043 was removed
    • protobuf-3.1.0-gcc-6.2.0 was removed
    • vasp-5.4-openmpi-1.8.6-ic-2015.3.187 was renamed to vasp-5.4-openmpi-1.8.3-ic-2015.3.187
    • libdrm-2.4.81-ic-2017 was renamed to libdrm-2.4.81-ic-2017.1.043
    • meep-1.3-gcc-4.9.2 was renamed meep-1.3-openmpi-1.8.3-gcc-4.9.2
    • partitionfinder-1.1.1-ps1 was renamed partitionfinder-1.1.1-pf1
    • petsc-3.5.2-openmpi-1.8.3-ic-2013 was renamed petsc-3.5.2-openmpi-1.8.3-gcc-4.9.2
    • qt-4.8.3-gcc-4.9.2 was renamed qt-4.8.3-gcc-6.2.0
    • qt-5.8.0-ic-2017 was renamed qt-5.8.0-ic-2017.1.043
    • torch-cuda-7.5.18-openblas-0.2.13-ic-2015.3.187 was renamed to torch-cuda-7.5.18-openblas-0.2.13-gcc-4.9.2
    • openblas-0.2.13-gcc-6.2.0-openmp was renamed openblas-0.2.13-gcc-6.2.0-useopenmp-build.sh
    • armadillo-7.900-1-gcc-6.2.0 renamed to armadillo-7.900.1-gcc-6.2.0
    • arlequin-3552 renamed to arlequin-3522
    • R-3.4.1-openmpi-2.1.1-gcc-7.1.0 was removed
    • pegasus-4.7.4-gcc-7.1.0 was removed
    • All slurm-16.05 module variants were removed, as described above in item #4
    • The python modules were corrected as described in item #6
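Since both the Python rebuilds (item 6) and the switch to Lmod (item 7) change what you type at the prompt, here is a brief illustrative session; the module names come from the list above, but the session itself is a sketch and only works on the clusters:

```shell
# Illustrative session on Stokes/Newton (not runnable elsewhere).
module avail python                         # list the four rebuilt Python modules
module load python/python-3.6.3-gcc-7.1.0   # load one of the new builds
python3 --version                           # confirm the interpreter version

# Lmod also adds functionality the old system lacked, e.g. searching
# every module variant and how to load it:
module spider openmpi
```

If an old script loads a module name that no longer exists (see the renames above), `module spider <name>` is a quick way to find its replacement.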

Fall 2017 Maintenance Cycle (December 9-17, 2017)

Stokes and Newton will be taken down for our twice-yearly routine maintenance cycle in mid-December. Specifically, the clusters will be unavailable from the morning of Saturday, December 9 through the morning of Sunday, December 17.

Changes made during this downtime will only minimally affect users; we will provide more detail in the change log when we bring the system back online. The work will include some minor changes to module names and an upgrade to the latest SLURM release.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of such downtimes, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it's never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes prior to the downtime.

Brief outage of Stokes management server, Friday, August 18 at 10 a.m.

We need to reboot the Stokes server that is responsible for running our resource manager. We plan to do so this coming Friday at 10 a.m. Jobs in the queue will remain queued, jobs running on compute nodes will continue to run, and you will still be able to log in to Stokes and copy files. The only impact will be a brief interruption in your ability to obtain information from the scheduler or to submit jobs (sbatch, squeue, sinfo, and srun will be unavailable).

We appreciate your patience.

About the UCF ARCC:
The University of Central Florida (UCF) Advanced Research Computing Center is managed by the Institute for Simulation and Training, with subsidies from the UCF Provost and the Vice President for Research and Commercialization, for use by all UCF faculty and their students. Collaboration with other universities and industry is also possible.
Contact Info:
UCF Advanced Research Computing Center
3039 Technology Parkway, Suite 220
Orlando, FL 32826
P: 407-882-1147
Request Help