Last update: April 2, 2020
XSEDE provides resources, software, training, and consulting that enable data analysis related activities.
Infrastructure Resources
All XSEDE high-performance computing, high-throughput computing, storage, visualization, and cloud infrastructure resources can be used for data analysis. Specific features that are especially useful for data analysis include:
- GPUs for accelerated data analysis using AI and machine learning techniques
- Data storage resources
Visit the XSEDE Systems Monitor to find GPUs, storage, and other useful features.
Software and Services
Software and services that are especially useful for data analysis include the following.
For data transfer and managing data:
- Managed data transfer with Globus
- Command-line data upload and download
- Data Integrity and Validation
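As an illustration of the data integrity item above, a common way to validate a transfer is to compare a checksum of the original file with a checksum of the copy. The following is a minimal sketch in Python using only the standard library. It is not an XSEDE tool, and the function names `sha256_of_file` and `checksums_match` are hypothetical names chosen for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Hypothetical helper for illustration; chunked reads keep memory use
    flat even for very large data files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(source_path, copy_path):
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of_file(source_path) == sha256_of_file(copy_path)
```

In practice the two digests are usually computed on different systems (one before upload, one after) and then compared, rather than both being computed in one place as `checksums_match` does here.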
For large-scale data analysis:
For data collections:
To discover data analysis related software and services on XSEDE allocated HPC resources:
- Visit the User Portal software search
To discover data analysis related software and services available in any form (installable packages, cloud images, containers, SaaS, etc.) on XSEDE allocated resources, XSEDE un-allocated resources, and non-XSEDE resources:
Some useful Research Software Portal (RSP) searches:
- Data Analysis software on compute resources
- Data Analysis cloud images
- Hadoop software availability
- Spark software availability
To discover XSEDE integrated Science Gateways that may provide domain specific data analysis capabilities:
- Visit the XSEDE Science Gateways List
Training
The XSEDE training page includes data analysis related training on the Matlab and R software tools.
The Cornell Virtual Workshops cover data analysis topics including Python, Matlab, R, HDF5, Large Data Visualization, ParaView, VisIt, Relational Databases, and MapReduce.
Consulting and Extended Support
Users who need help finding and using data analysis related resources, software, services, or training, or who need lightweight consulting, may contact the XSEDE help desk at help@xsede.org. The Novel and Innovative Projects (NIP) area of Extended Collaborative Support Services (ECSS) provides this lightweight assistance. XSEDE allocated projects that need more in-depth consulting on data preparation, data flows, data analytics, data locality, data visualization, and other data techniques can request ECSS assistance. NIP can assist with that request, which is also made via help@xsede.org.
Improving this document
If you have additions, corrections, or other suggestions to improve this documentation, please send them to help@xsede.org.