8/22/11 4:08 PM
Hello everyone, I am a new Campus Champion on a campus where there are no resources available for large-dataset computing. We are using R on Kraken, and I was wondering what other resources are available besides R on Kraken. I know there are many of them; I am just trying to get a list.

8/23/11 8:42 PM as a reply to Benjamin Leon Garlington.
You can use the software search mechanism to find most of the sites that have R installed (although you could install it yourself, with assistance from the site, if it isn't already installed). Select "starts with", since a large number of software packages "contain" R in their names.

Steele doesn't yet populate the software search, so it won't show up there, but it also has R installed, and I have directions on how to install the optional RMPI module if you need that as well.

Even where R is installed, not every site will be a good match. It depends on what you want to do with R, how large your data set is, how much I/O is involved, how many processors you will use (if you use RMPI), and how long your job needs to run.

Trestles might be a good match for you because it has flash memory, which is intended to speed up data I/O, and the system is designed for data-intensive jobs.

If your job can only use a single CPU and needs a lot of memory, Blacklight might be a better choice, since it is a shared-memory machine.

Nautilus (at NICS, like Kraken) also has R and is a shared-memory machine like Blacklight, so it might also be a good choice if you need a large amount of shared memory.

If you would like additional help selecting a resource, you can contact me directly and we can discuss the specific requirements of the R jobs you wish to run.

I am here as part of the Campus Champion technical team as well as being a Campus Champion myself for Purdue University, so don't hesitate to contact me if you would like further assistance.

Kim Dillman
