HUGO, September 22, 2015

The first meeting of the HPC User Group of Orlando (HUGO) this semester will be Tuesday, September 22, from 3:00 to 4:30 p.m. in Partnership III, room 233. We will discuss what computational resources are available at the national level via the XSEDE program and how to apply for access to those resources. The UCF Advanced Research Computing Center will provide refreshments.

Partnership III
3039 Technology Pkwy
Orlando, FL

Stokes Returns to Service After Spring Downtime

We are pleased to report that the Stokes HPC is back up and running. We appreciate your patience with our scheduled downtime and would like to take this opportunity to remind you that we now have regular maintenance downtimes at least twice a year (one in late Fall and one in late Spring). This year, we may have another short downtime at the end of Summer, depending on equipment purchases.

While last Fall's maintenance cycle focused on OS and software changes, this cycle concentrated on hardware configuration changes. Consequently, there are very few changes that will affect the way Stokes users access the system. The following is a summary of the changes.

  1. The IBM DDR leaf of the HPC was removed. This means that the blades ec1-ec98 are permanently removed from the system. We now have just over 2,800 cores available, and there are no more 8-core blades. The removal was necessary for two reasons. First, much of the equipment on that leaf of the HPC was reaching end-of-life and beginning to fail (including the DDR IB switch). Second, we needed to make room in our machine room for new purchases at the end of the Summer. In the short term, Stokes has about 25% fewer cores than it had a week ago, so expect higher utilization and somewhat longer queue times.
  2. The main login node was replaced with a larger machine that has more cores and more memory. This allowed us to relax some of the ulimit constraints on users on the login node.
  3. The web server node has been replaced. This should improve the responsiveness of our website.
  4. The amount of memory registrable by the IB driver on each blade was marginally increased. This should not affect most people, but we hope it will reduce the MPI warnings some users see when sending very large messages with OpenMPI.
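
For users who want to see which limits apply to their own session on the login node (item 2 above), the following is a minimal sketch using Python's standard `resource` module. The choice of limits shown is an assumption made for illustration; this announcement does not say which ulimit settings were relaxed, and the values you see will depend on the node.

```python
import resource

# Limits chosen for illustration only; the announcement does not say
# which ulimit settings were changed on the login node.
LIMITS = {
    "cpu time (s)": resource.RLIMIT_CPU,
    "open files": resource.RLIMIT_NOFILE,
    "address space (bytes)": resource.RLIMIT_AS,
    "stack size (bytes)": resource.RLIMIT_STACK,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the current process, as strings."""
    return {
        name: tuple(fmt(v) for v in resource.getrlimit(res))
        for name, res in LIMITS.items()
    }


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:24s} soft={soft:>12s} hard={hard:>12s}")
```

The soft limit is the one currently enforced; a process can raise it up to the hard limit without administrator privileges.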

Stokes Spring maintenance downtime, May 9 - May 17

Stokes will be taken down as part of our twice-yearly routine maintenance cycle during the second week of May. Specifically, the cluster will be taken down on the morning of Saturday, May 9th and returned to operation on the evening of Sunday, May 17th. You should organize your activities with this downtime in mind.

This downtime will focus on hardware-related changes in our machine room. Please be aware that in order to address end-of-life issues with aging infrastructure, we will be retiring nodes ec1-ec98 and the DDR IB switch on which they reside. This will represent about a 25% reduction in the number of computational cores available on Stokes in the short term. However, there is good news! The main purpose of the downtime is to rearrange our machine room to make space for new equipment and compute capacity, which will arrive during the Summer.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of such downtimes, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it's never a bad idea to back up your materials. We suggest you use this opportunity to copy salient data and code off of Stokes prior to the downtime.

Third HPC User Group of Orlando (HUGO) Meeting

The third HPC User Group of Orlando (HUGO) meeting will be held in room 233 of the Partnership III building on Tuesday, February 24th at 3:00 p.m. Alex Balaeff will give a talk about how he makes use of HPC systems, including Stokes, for his research. In addition, Paul Wiegand will give a short talk about how to use Moab tools like "showq" and "checkjob" to diagnose problems with a job on Torque/Moab-based systems such as Stokes. Refreshments will be provided.

Please consider attending and share this with relevant faculty, students, and industry researchers and engineers.
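
As a preview of the kind of diagnosis the Moab talk covers, here is a minimal sketch that runs "showq" and "checkjob" from Python and collects their output. It is not taken from the talk. The helper names (`moab_command`, `run_if_available`, `diagnose`), the job ID, and the user name are hypothetical, and the sketch assumes the Moab client tools are on your PATH, as they would be on a Torque/Moab system.

```python
import shutil
import subprocess


def moab_command(tool, *args):
    """Build an argument list for a Moab client tool (hypothetical helper)."""
    return [tool, *[str(a) for a in args]]


def run_if_available(cmd):
    """Run cmd and return its stdout, or None if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def diagnose(job_id, user):
    """Collect queue status for a user and verbose detail for one job."""
    queue_cmd = moab_command("showq", "-u", user)      # this user's jobs only
    job_cmd = moab_command("checkjob", "-v", job_id)   # verbose job report
    return {" ".join(c): run_if_available(c) for c in (queue_cmd, job_cmd)}


if __name__ == "__main__":
    # Placeholder values; substitute your own job ID and user name.
    for command, output in diagnose(12345, "someuser").items():
        print(f"$ {command}")
        print(output if output is not None else "(tool not found on this machine)")
```

On a machine without the Moab tools, the sketch reports that they are missing instead of raising an error, so it can be tried safely before logging in to the cluster.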

