News


Here is all the latest news from the UCF ARCC:


Network outage, Tue.6.Dec, 1 AM-5 AM

Due to building network maintenance beyond our control, there will be a network outage on Tuesday, December 6th from 1:00 AM to 5:00 AM. Stokes itself will not be affected, but users will be unable to access Stokes during the outage.

Fall Maintenance cycle downtime, Dec.12 - Dec.19

Stokes and Newton will be taken down in mid-December for our twice-yearly routine maintenance cycle. Specifically, the clusters will be unavailable from the morning of Monday, December 12 through the morning of Monday, December 19.

Changes made during this downtime will be minimal. The most significant is a slight adjustment to the way groups are handled at the Linux level. We will provide more detail in the change log when we bring the system back online.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of each downtime, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it is never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes before the downtime.
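
If you are unsure how to pull files off the cluster, a simple rsync run from your local machine is one option. This is only a minimal sketch; the hostname, username, and paths below are placeholders, so substitute your own:

     # Copy a project directory from Stokes to a local backup folder.
     # The hostname, username, and paths here are placeholders -- replace
     # them with your own Stokes login host and directories.
     rsync -avz myuser@stokes-login-host:/home/myuser/myproject/ ~/stokes-backup/myproject/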

Changes due to Spring 2016 maintenance cycle

Greetings Stokes users,

Stokes has returned to operation! Newton will remain down until next week. Please remember that we have two such maintenance downtimes per year, one in late Spring (the one we just completed) and one in late Fall. Let us know about any issues you encounter by sending requests to req...@ist.ucf.edu.

Please take a moment to read over the changes:

  1. The scheduling software was upgraded to SLURM 16.05.0. For most of you, all commands and scripts will work the same as before, but we are hoping to eliminate a few problems with the upgrade. Also, the new version will give us better diagnostic capabilities.
  2. Some OpenMPI, MVAPICH2, and OpenFOAM builds link directly against SLURM. These had to be rebuilt, and their module names have changed. If you use one of the modules listed at the bottom of this message, you will have to make some changes to your submission scripts (e.g., load a different module) and may have to rebuild your programs; see the sketch after the module list below. We removed the old modules so there can be no confusion. If you were not using a version of one of those packages with "slurm" in the suffix, you are not affected.
  3. We conducted some diagnostics and firmware updates for components of our file system to address issues related to our outage last January.
  4. We racked, cabled, and configured 24 new nodes with 28 cores and 128 GB of memory each. These will become available very soon.
  5. There were some minor internal server changes relating to how we manage the cluster.
  6. The remaining IBM blades have been permanently removed from the cluster.

Modules that have changed (the xx.xx.x suffix changed from 15.08.3 to 16.05.0):

     mvapich2/mvapich2-2.1.0-ic-2015.3.187-slurm-xx.xx.x
     openfoam/openfoam-3.0.1-openmpi-1.8.6-ic-2015.3.187-slurm-xx.xx.x
     openfoam/openfoam-3.0.1-openmpi-1.8.6-gcc-4.9.2-slurm-xx.xx.x
     openmpi/openmpi-1.8.6-ic-2015.3.187-slurm-xx.xx.x
     openmpi/openmpi-1.8.6-gcc-4.9.2-slurm-xx.xx.x
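
If your submission script loads one of these, only the module line needs to change to the new suffix, though programs compiled against the old module may also need rebuilding. Below is a minimal sketch of an updated script; the job name, task count, and executable are placeholders:

     #!/bin/bash
     #SBATCH --job-name=example        # placeholder job name
     #SBATCH --ntasks=28               # placeholder task count

     # Old line (module removed during the maintenance cycle):
     #   module load openmpi/openmpi-1.8.6-ic-2015.3.187-slurm-15.08.3
     # New line (SLURM suffix updated):
     module load openmpi/openmpi-1.8.6-ic-2015.3.187-slurm-16.05.0

     # Launch under SLURM; rebuild the program against the new module
     # if it was compiled against the old one.
     srun ./my_program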

Glenn Martin & Paul Wiegand.

Spring maintenance cycle downtime, Mon.23.May - Fri.27.May

Stokes and Newton will be taken down in mid-May for our twice-yearly routine maintenance cycle. Specifically, the clusters will be unavailable from the morning of Monday, May 23rd through the evening of Friday, May 27th. Please organize your activities with this downtime in mind.

Unlike the last few maintenance cycles, there will be very few changes that affect Stokes users. We will be upgrading SLURM, installing some new compute resources (20 new 28-core nodes), and performing some simple maintenance on the new file system.

Newton, on the other hand, will change more substantially. Currently, nodes within Newton have a variety of co-processor resources. We will be replacing the existing co-processors so that every node has a pair of Nvidia GTX 980s. This will make the visualization cluster more consistent and upgrade its capabilities.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of each downtime, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it is never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes before the downtime.

Thank you, Paul & Glenn.
About the UCF ARCC:
The University of Central Florida (UCF) Advanced Research Computing Center is managed by the Institute for Simulation and Training, with subsidies from the UCF Provost and Vice President for Research and Commercialization, for use by all UCF faculty and their students. Collaboration with other universities and industry is also possible.
Contact Info:
UCF Advanced Research Computing Center
3039 Technology Parkway, Suite 220
Orlando, FL 32826
P: 407-882-1147