News


Here is all the latest news from the UCF ARCC:


Stokes Spring maintenance downtime, May 9 - May 17

Stokes will be taken down for our twice-yearly routine maintenance cycle during the second week of May. Specifically, the cluster will be taken down on the morning of Saturday, May 9th and returned to operation on the evening of Sunday, May 17th. Please plan your work with this downtime in mind.

This downtime will focus on hardware-related changes in our machine room. Please be aware that, to address end-of-life issues with aging infrastructure, we will be retiring nodes ec1-ec98 and the DDR IB switch on which they reside. This will reduce the number of computational cores available on Stokes by about 25% in the short term. However, there is good news! The main purpose of the downtime is to rearrange our machine room to make space for new equipment and compute capacity, which will arrive during the summer.

Recall that we now routinely bring the system down twice a year, once in late Fall and once in late Spring. We will notify users in advance of such downtimes, but we recommend you build these expectations into your workflow. Though we anticipate no data loss during this time, it's never a bad idea to back up your materials, so we suggest you use this opportunity to copy important data and code off of Stokes prior to the downtime.
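One way to act on this suggestion is to bundle a project directory into a single compressed archive before copying it elsewhere. The sketch below is illustrative only: the function name archive_dir and the example paths are hypothetical, and it assumes a standard tar with gzip support is available.

```shell
#!/bin/sh
# Sketch: bundle a directory into a compressed archive ahead of the downtime.
# archive_dir and all paths shown here are hypothetical, for illustration only.

# archive_dir SRC_DIR ARCHIVE
# Creates ARCHIVE (a .tar.gz) containing SRC_DIR, then reads it back
# to confirm the archive is usable.
archive_dir() {
    src=$1
    archive=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    # -C switches to the parent directory so the archive stores a relative path.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing the archive is a cheap integrity check.
    tar -tzf "$archive" >/dev/null
}

# Example (hypothetical paths):
#   archive_dir "$HOME/work" "$HOME/work-backup.tar.gz"
```

The resulting archive can then be copied to another machine with a tool such as scp or rsync.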

Third HPC User Group of Orlando (HUGO) Meeting

The third HPC User Group of Orlando (HUGO) meeting will be held in room 233 of the Partnership 3 building on Tuesday, February 24th at 3 p.m. Alex Balaeff will give a talk about how he makes use of HPC systems, including Stokes, for his research. In addition, Paul Wiegand will give a short talk about how to use Moab tools like "showq" and "checkjob" to diagnose problems with a job on Torque/Moab-based systems such as Stokes. Refreshments will be provided.

Please consider attending and share this with relevant faculty, students, and industry researchers and engineers.

Changes due to Stokes Fall 2014 Maintenance Cycle

Greetings Stokes users,

Stokes has returned to operation!  From this point forward, we will have two such maintenance downtimes per year, one in late Fall (the one we just completed) and one in late Spring.

There are several substantive changes, and we expect there will be a few unanticipated issues to resolve.  Please remain patient with us as we iron these out, and let us know about any issues you are having by sending requests to req...@ist.ucf.edu.

Changes include the following:

  1. The login and supporting nodes, as well as all compute nodes, have been upgraded to CentOS 7.0.  Now all nodes in the cluster are running the same OS and have the same base libraries in local directories.
  2. Software and libraries for Stokes that we (the Stokes administrators) build can be found in /apps.  All software built and compiled by users or groups that used to be in /apps has been moved (more on this below in item #7).  There are no longer any user or group related directories in /apps.
  3. Because there was a substantive upgrade in the operating system, applications and libraries in /apps had to be re-built.  When appropriate and possible, we also upgraded these to their most recent versions.  In most cases, we did not re-build old versions of software.  You can use “module avail” to see what is currently available.  You will notice that some software that used to be present no longer appears.  Much of the software did not appear to be in use, so in some cases we chose not to rebuild it.  Moreover, we are still in the process of rebuilding some software, and modules will continue to appear as we complete these.  If there’s something missing on which you rely, please let us know ASAP so we can prioritize it.
  4. Software and module naming conventions have been standardized.  Because of this, and because versions are sometimes different, module names have changed.  You should remove all references to old modules from your scripts and .bashrc files in favor of the newer names.  If you issue the command “module avail”, you will see all the new modules currently available.  Users who have lines such as the following in their .bashrc file should remove them:
    . /usr/local/Modules/3.2.9/init/bash
    module load modules tools/torque moab/7.1.1
  5. All user files have been consolidated under user home directories, and all user home directories are now on fs1.  There are no criss-crossing links between the two file systems anymore.  If you had a link to a “backup” or “work” directory somewhere else in the file system, it should now appear in your home directory as a subdirectory with the name “backup” or “work”, respectively.
  6. Where possible, we’ve tried to make the account names in our resource manager (the account to which you charge your compute hours) consistent with the default unix group, using a consistent nomenclature.  For most users this will mean no change; however, a few users will notice that their default group has a different name.  For the vast majority of users, their default group has the same name as their PI’s account name.  Let us know if there is any confusion.
  7. Group locations have been provided on fs0 and are accessible at the directory /groups/<groupname>.  Software directories that had been in /apps have been moved to /groups/<groupname> for the appropriate groups, as well as any group-level materials that had been in /home or in /gpfs/fs<num>/work.  We tried to name these subdirectories in such a way as to make it clear where each came from (e.g., if “apps” is in the name, it came from /apps; if “fs0” is in the name, it came from fs0, etc.).  *** Note that you will have to rebuild your software because of the OS upgrades. ***
  8. Quota limits have changed.  The default user level quotas on fs1 have been increased to 250 GB of space per user and 115K files.  There are no group level quotas on fs1 anymore.  The default group level quotas on fs0 are 100 GB per group and 115K files.
  9. Our scheduler and resource manager (Moab & Torque) have been upgraded.  Among other things, this means that we should be able to handle reservations properly now.  All commands and submit script syntax are the same as before.
  10. Our previous version of Matlab is gone.  There is now an upgraded version of Matlab, but it is seat-limited.  This was necessary so that we could be compliant with licensing.  This means that users should no longer run Matlab jobs directly, but should instead use Matlab to compile their code and execute the compiled binary.  Instructions on how to do this will be forthcoming for those who need them.
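
The .bashrc cleanup described in item 4 can be sketched as a small shell function. This is only an illustration: the function name strip_old_module_lines is hypothetical, it matches exactly the two example lines quoted in item 4, and it prints a cleaned copy to standard output rather than editing the file in place, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: drop the two legacy module lines quoted in item 4 from a file.
# strip_old_module_lines is a hypothetical helper name, not a site-provided tool.

# strip_old_module_lines FILE
# Prints FILE without the old Modules init line and the old "module load" line.
# '#' is used as the sed pattern delimiter because the patterns contain '/'.
strip_old_module_lines() {
    sed -e '\#^[[:space:]]*\. /usr/local/Modules/3\.2\.9/init/bash[[:space:]]*$#d' \
        -e '\#^[[:space:]]*module load modules tools/torque moab/7\.1\.1[[:space:]]*$#d' \
        "$1"
}

# Example: write a cleaned copy, compare it, then swap it in by hand.
#   strip_old_module_lines ~/.bashrc > ~/.bashrc.new
#   diff ~/.bashrc ~/.bashrc.new
```

Other references to old module names in scripts would still need to be updated by hand against the output of “module avail”.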

Because the time since the last maintenance downtime was so long, there were many things that had to get done this time around.  We’re hoping that, with regular twice-yearly maintenance windows, future efforts will not involve as many changes.
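
For the quota figures in item 8 above, a rough self-check of a directory might look like the following sketch. The function names are hypothetical, the reading of 115K as 115000 files and of 250 GB as 250 * 1024 * 1024 KB is an assumption, and the file system's own quota accounting remains authoritative (it may count directories or blocks differently).

```shell
#!/bin/sh
# Sketch: approximate check of a directory against file-count and size limits.
# count_files, usage_kb and within_quota are hypothetical helper names.

# count_files DIR: number of regular files under DIR
# (file names containing newlines would be miscounted).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# usage_kb DIR: disk usage of DIR in kilobytes.
usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# within_quota DIR MAX_FILES MAX_KB
# Succeeds only if DIR is at or under both limits.
within_quota() {
    [ "$(count_files "$1")" -le "$2" ] && [ "$(usage_kb "$1")" -le "$3" ]
}

# Example with the fs1 user defaults under the assumptions stated above:
#   within_quota "$HOME" 115000 262144000 && echo "under quota"
```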

Note:  Ganglia is not currently available; however, we will notify you when it is back up (hopefully today).

HUGO Meeting (October 2014)

This is a reminder that the second HPC User Group of Orlando (HUGO) meeting will be held in the third floor collaboration area of Partnership 2 at 3 p.m. on October 28, 2014. Gonzalo Vaca-Castano will give a talk about his experience using HPC resources for high-throughput tasks, and Matt Bilskie will give a talk about his experiences using HPC resources for distributed-processing tasks. Rootwork InfoTech LLC will supply the snacks. Please consider attending and share this with relevant faculty, students, and industry researchers and engineers.

Partnership II
3100 Technology Pkwy
Orlando, FL
https://map.ucf.edu/locations/8119/partnership-ii/


About the UCF ARCC:
The University of Central Florida (UCF) Advanced Research Computing Center is managed by the Institute for Simulation and Training, with subsidies from the UCF Provost and Vice President for Research and Commercialization, for use by all UCF faculty and their students. Collaboration with other universities and industry is also possible.
Contact Info:
UCF Advanced Research Computing Center
3039 Technology Parkway, Suite 220
Orlando, FL 32826
P: 407-882-1147