Science Gateways for Developers and Operators
This page documents required and recommended steps for developers. XSEDE also provides Extended Collaborative Support Services (ECSS) and community mailing lists for gateway developers and administrators who need further assistance.
Science Gateways can democratize access to the cyberinfrastructure that enables cutting-edge science
What is an XSEDE Science Gateway?
An XSEDE Science Gateway is a web or application portal that provides a graphical interface for executing applications and managing data on XSEDE and other resources. XSEDE science gateways are community services offered by XSEDE users to their communities; each gateway is associated with at least one active XSEDE allocation. For an overview of the steps a gateway provider must take to start an XSEDE Science Gateway, see the Gateways for PIs page.
See the Science Gateways Listing for a complete list of current operational gateways.
Science gateway developers and administrators may include PIs as well as their collaborators, staff, and students. The PI should add these team members to the XSEDE allocation; see Manage Users for more details. It is recommended that the allocation have at least one user with the Allocation Manager role, in addition to the PI.
- The PI obtains an XSEDE allocation.
- The PI adds developer and administrator team members to the allocation.
- Register the gateway.
- Request that a community account be added to the allocation. The PI logs on to the XSEDE User Portal and selects "Community Accounts" from the My XSEDE tab.
- Add the XSEDE logo to the gateway. See https://www.xsede.org/web/guest/logos.
- Integrate the user counting scripts with the gateway's submission mechanism.
- Join the XSEDE gateway community mailing list (optional).
Building and Operating
Science gateways can be developed using many different frameworks and approaches. General issues include managing users, remotely executing and managing jobs on diverse XSEDE resources, tracking jobs, and moving data between XSEDE and the user environment. XSEDE-specific issues include tracking users, monitoring resources, and tracking use of the gateway allocation. For a general overview of best practices for building and operating a science gateway, please see the material developed by the Science Gateways Community Institute, an independently funded XSEDE service provider. The Institute provides support for different frameworks that can be used to build science gateways.
XSEDE supports a wide range of gateways and does not require specific middleware; gateways can use team-developed or third-party middleware. Gateways that run jobs and access data on XSEDE resources may be hosted on the PI's local servers or directly on XSEDE resources that support persistent Web services, middleware, and databases; these include Bridges, Comet, and Jetstream.
For gateway teams that would like additional development assistance, XSEDE supports the integration of science gateways with XSEDE resources through Extended Collaborative Support Services (ECSS). ECSS support can be requested as part of an allocation request; PIs can add ECSS support to an existing allocation through a supplemental request.
Managing User Accounts
XSEDE science gateways are community-provided applications. Gateway users are not required to have XSEDE accounts or allocations; instead, XSEDE allows all users' jobs to run under the gateway's community account. Gateways thus map their local user accounts to the gateway's single community account. XSEDE does require quarterly reporting of the number of unique users who executed jobs on XSEDE resources, as described below.
XSEDE Community Accounts
XSEDE allows science gateways that run applications on behalf of users to direct all submission requests to a gateway community user account. Designated gateway operators have direct shell access to their community account, but normal users do not. The community account simplifies administration of the gateway, since the gateway administrators have access to input and output files, logs, etc., for all their users, and users don't need to request individual gateway accounts.
A community account has the following characteristics:
- Only a single community user account (i.e., an XSEDE username/password) is created.
- The Science Gateway uses the single XSEDE community user account to launch jobs on XSEDE.
- The gateway user running under the community account has privileges to run only a limited set of applications.
Requesting a Community Account: The PI or Allocation Manager with a registered gateway can request a community account by logging on to the XSEDE User Portal and selecting "Community Accounts" from the "My XSEDE" tab. Select community accounts on all allocated resources.
Accessing Community Accounts: Administrators access community accounts through SSH and SCP using the community account username and password that are provided with the account. Community accounts cannot be accessed from the XSEDE single sign-on hub.
Community Accounts on Sites with Two-Factor Authentication: Some XSEDE resources, including Stampede and Wrangler, require two-factor authentication. Gateways can request exceptions to this policy for their community accounts by contacting the XSEDE Help Desk. The gateway will need to provide the static IP addresses of the server or servers it uses to connect to the resource.
Unique Science Gateway User Accounts
It is the gateway developer's responsibility, as described below, to implement gateway logins or otherwise uniquely identify users in order to track usage. These accounts can be local to the gateway and do not need to correspond to user accounts on XSEDE. The gateway maps these accounts to the gateway's common community account.
Gateways may optionally use XSEDE's OAuth2-based authentication process, a service provided by Globus Auth. ECSS consultants are available to assist with this integration.
The XSEDE Cyberinfrastructure Integration (XCI) team has completed writing and testing the document "User Authentication Service for XSEDE Science Gateways." This is an introduction to the user authentication service that XSEDE offers for science gateway developers and operators. This service provides a user "login" function so that gateway developers don't need to write their own login code or maintain user password databases.
Connecting to XSEDE Resources
The most common type of XSEDE science gateway allows users to run scientific applications on XSEDE computing resources through a browser interface. This section describes XSEDE policies and requirements for doing this.
Gateways typically provide their users with a community-wide allocation acquired by the PI on behalf of the community. The gateway may implement internal restrictions on how much of this allocation a user can use.
If a user consumes an excessive amount of resources, the gateway may require such "power users" to acquire their own allocations through either the Startup or XRAC allocation process. After obtaining the allocation, the user adds the gateway community account to it. The user's jobs still run under the community account, but they are charged to the user's allocation rather than the gateway PI's. This is implemented by adding the allocation string to the batch script with the standard -A option for the SLURM schedulers used by many XSEDE resources; see examples for Stampede, Comet, and Bridges. Gateway middleware providers may provide this service as a feature.
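As a minimal sketch of how a gateway might select the allocation and place it in the batch script, assuming a SLURM resource: the function names, the resource requests, and the allocation strings used with them are hypothetical placeholders, not part of any XSEDE tool.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and values are hypothetical.

# Print the allocation a job should be charged to: the user's own
# allocation when one is given, otherwise the gateway's allocation.
# Usage: pick_allocation <gateway_allocation> [user_allocation]
pick_allocation() {
    local gateway_alloc="$1"
    local user_alloc="${2:-}"
    if [ -n "$user_alloc" ]; then
        printf '%s\n' "$user_alloc"
    else
        printf '%s\n' "$gateway_alloc"
    fi
}

# Print a minimal SLURM batch script that charges <allocation>
# through the -A option and then runs the given command.
# Usage: render_batch_script <allocation> <command...>
render_batch_script() {
    local alloc="$1"
    shift
    printf '#!/bin/bash\n'
    printf '#SBATCH -A %s\n' "$alloc"
    printf '#SBATCH -N 1\n'
    printf '#SBATCH -t 00:10:00\n'
    printf '%s\n' "$*"
}
```

A gateway could then write the result to a file before submission, for example `render_batch_script "$(pick_allocation "$gateway_alloc" "$user_alloc")" ./my_app > job.sh`.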
Interacting with HPC Resources
Science gateways that run jobs on behalf of their users submit them just like regular users. For XSEDE's HPC resources, this means using the local batch scheduler to submit jobs and monitor them. For an overview, see the XSEDE Getting Started Guide. Gateways execute scheduler commands remotely through SSH and use SCP for basic file transfer. Gateways may choose to work with third party middleware and gateway framework providers to do this efficiently. For more information on third party software providers, consult the Science Gateways Community Institute service provider web site.
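A minimal sketch of the SSH and SCP pattern described above, assuming a SLURM resource. The host name, the account name, the helper functions, and the GW_REMOTE, GW_SSH, and GW_SCP variables are hypothetical and exist only for illustration; setting GW_SSH and GW_SCP to echo gives a dry run that prints the arguments instead of connecting.

```shell
#!/usr/bin/env bash
# Sketch only: all names below are hypothetical placeholders.
GW_REMOTE="${GW_REMOTE:-community_user@login.example.org}"
GW_SSH="${GW_SSH:-ssh}"   # set to "echo" for a dry run
GW_SCP="${GW_SCP:-scp}"   # set to "echo" for a dry run

# Copy a local batch script into a directory on the remote resource.
# Usage: stage_script <local_script> <remote_dir>
stage_script() {
    $GW_SCP "$1" "$GW_REMOTE:$2/"
}

# Run sbatch on the remote resource from the staging directory.
# Paths are assumed to contain no spaces or shell metacharacters.
# Usage: submit_remote <remote_dir> <script_name>
submit_remote() {
    $GW_SSH "$GW_REMOTE" "cd $1 && sbatch $2"
}

# Ask the remote scheduler for the state of one job.
# Usage: job_status <jobid>
job_status() {
    $GW_SSH "$GW_REMOTE" "squeue -j $1"
}
```

Third-party middleware typically adds retries, credential handling, and input validation on top of this basic pattern.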
XSEDE ECSS consultants can assist gateways with HPC integration.
XSEDE Resources for Gateway Hosting
XSEDE includes resources that have special Virtual Machine (VM) and related capabilities for gateways and similar persistent services. These resources are allocated through the standard XSEDE allocation mechanisms.
- Bridges is designed for jobs that need large amounts of shared memory. It also has allocatable VMs that have access to Bridges' large shared file system. From inside their VMs, users can directly run the scheduler command-line tools for Bridges' computing resources.
- Comet, like Bridges, is a computing cluster with co-located Virtual Machines. Users can also request entire, self-contained Virtual Clusters that can run both the gateway services and computing jobs.
- Jetstream is an XSEDE cloud computing resource. Gateway users can get persistent VMs for use in gateway service hosting. They can also get multiple VMs configured as a Virtual Cluster with a private scheduler for running computing jobs.
Science Gateway Usage Metrics: Unique Users per Quarter
XSEDE requires all gateways to report the number of unique users per quarter who have executed jobs on XSEDE resources. This is a key metric that XSEDE in turn reports to the NSF. Compliance with this requirement justifies XSEDE's investment in the science gateway community. XSEDE collects this information through a simple script that is integrated into the job submission process. XSEDE ECSS consultants are available to assist gateway developers to do this.
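A gateway can also keep its own tally as a cross-check on the reported figure. The sketch below is an illustration only: it assumes a hypothetical gateway log with one line per job in the form "YYYY-MM-DD gateway_user jobid", and it assumes calendar quarters, which may not match the reporting periods XSEDE uses.

```shell
#!/usr/bin/env bash
# Sketch only: the log format and function name are hypothetical.

# Count distinct gateway users who submitted jobs in a calendar quarter.
# Reads log lines ("YYYY-MM-DD gateway_user jobid") from standard input.
# Usage: unique_users_in_quarter <year> <quarter 1-4> < gateway.log
unique_users_in_quarter() {
    awk -v year="$1" -v q="$2" '
        {
            split($1, d, "-")          # d[1]=year, d[2]=month, d[3]=day
            month = d[2] + 0
            if (d[1] == year && int((month - 1) / 3) + 1 == q) {
                seen[$2] = 1           # remember each user once
            }
        }
        END {
            n = 0
            for (u in seen) n++
            print n
        }
    '
}
```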
The gateway_submit_attributes package provides a mechanism for collecting science gateway-supplied usernames used to run applications under community accounts on XSEDE resources. In this scenario, the gateway authenticates the external user, sets the username, and provides indirect access to the community account.
The gateway (via SSH) or the job management middleware invokes the script, gateway_submit_attributes, that writes the gateway-supplied username, the local job ID (obtained from the local resource manager), the submission time (also obtained from the local resource manager), and the submission host (configured by the local service provider) to special, restricted tables in the XSEDE Central Database (XDCDB).
The gateway_submit_attributes package provides a Perl client for integration into science gateways. The client is available on XSEDE resources under the module name "gateway-usage-reporting". After connecting to the XSEDE resource through SSH, load the module to access the client:
$ module load gateway-usage-reporting
The gateway_submit_attributes script takes as input 3 command-line parameters in the format:
gateway_submit_attributes -gateway_user <email@example.com> \
    -submit_time <submission_time> -jobid <jobid>
The submission_time should be in the standard ISO format of "YYYY-MM-DD HH:MM:SS TZ" like "1999-01-08 04:05:06 -8:00". For example, after submitting a job on an XSEDE resource, extract the job id, and run the gateway_submit_attributes script as follows:
$ sbatch mpi.job
. . .
Submitted batch job 4939827
$ gateway_submit_attributes -gateway_user firstname.lastname@example.org -submit_time "`date '+%F %T %:z'`" -jobid 4939827
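The submit-then-report sequence can be wrapped in a small helper. The sketch below assumes a SLURM resource whose sbatch prints "Submitted batch job <id>" and GNU date for the %:z format. The function names and the GSA_CMD dry-run variable are hypothetical and are not part of the gateway_submit_attributes package.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and GSA_CMD are hypothetical.

# Print the job ID found in sbatch output such as
# "Submitted batch job 4939827"; return non-zero if none is found.
# Usage: extract_slurm_jobid "<sbatch output>"
extract_slurm_jobid() {
    local id
    id=$(printf '%s\n' "$1" |
        sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p' |
        head -n 1)
    [ -n "$id" ] && printf '%s\n' "$id"
}

# Report the gateway-supplied username for a job, using the current
# time as the submission time. Setting GSA_CMD to
# "echo gateway_submit_attributes" prints the command instead of running it.
# Usage: report_gateway_user <gateway_user> <jobid>
report_gateway_user() {
    local gateway_user="$1" jobid="$2"
    local submit_time
    submit_time=$(date '+%F %T %:z')
    ${GSA_CMD:-gateway_submit_attributes} \
        -gateway_user "$gateway_user" \
        -submit_time "$submit_time" \
        -jobid "$jobid"
}
```

For example, `jobid=$(extract_slurm_jobid "$(sbatch mpi.job)") && report_gateway_user "$gateway_user" "$jobid"` reports only when sbatch actually returned a job ID.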
Please note that for the Gordon resource, you need to use the full string returned from PBS (including the hostname) e.g.,
$ qsub test.sub2
2149587.gordon-fe2.local
$ gateway_submit_attributes -gateway_user email@example.com -submit_time "`date '+%F %T %:z'`" -jobid 2149587.gordon-fe2.local
This command submits the information to a staging table in the XDCDB, where it is later matched with AMIE accounting records coming from the site. To verify that the information is matched correctly, you can run the xdusage command with the "-ja" option as shown in the example below:
$ xdusage -j -ja -s 2015-03-04 -e 2015-03-05 -p TG-STA110011S
. . .
job id= 4939827 resource=stampede.tacc.xsede submit=2015-03-04@17:47:28 start=2015-03-04@17:47:28 end=2015-03-04@17:57:37 nodecount=2 processors=32 queue=normal charge=24.89
job-attr id= 4939827 firstname.lastname@example.org email@example.com
It may take up to a day for the AMIE packets to be sent by the site and for the data to be matched as above.
In case of submission failures due to database errors, the attributes are saved in a log file in the $HOME directory:
If the file already exists, the script exits without overwriting the file. Attributes can later be resubmitted through the gateway_submit_attributes log file option:
gateway_submit_attributes -f <attributes_filename>
A gateway_bulk_submit script is also provided so that gateway operators can submit or resubmit attributes or attribute log files in bulk. To resubmit all previously failed entries, simply use:
Upon successful resubmission, the corresponding log file will be renamed as $HOME/gateway_attributes_log/gateway_attributes_entry.
All submission histories are logged for future reference in files named $HOME/gateway_attributes_log/history/gateway_submit_attributes-
For any questions or issues with the gateway_submit_attributes package, please contact the [XSEDE Help Desk](mailto:firstname.lastname@example.org) and follow the standard XSEDE help desk procedure.
Security and Accounting
XSEDE has specific security and accounting requirements and recommendations for connecting to its resources. Following them helps your gateway prevent and triage security incidents and inadvertent misuse.
Security and Accounting Requirements and Recommendations
The following security and accounting steps are required.
- Required: Notify the XSEDE Help Desk immediately if you suspect the gateway or its community account may be compromised, or call the Help Desk at 1-866-907-2383.
- Required: Keep Science Gateway contact info up to date on the Science Gateways Listing in case XSEDE staff should need to contact you. XSEDE reserves the right to disable a community account in the event of a security incident.
- Required: Use the gateway_submit_attributes tool to submit the gateway username with each job.
Additional recommendations are as follows:
- Collect Accounting Statistics
- Maintain an audit trail (keep a gateway log)
- Provide the ability to restrict job submissions on a per user basis
- Safeguard and validate programs, scripts, and input
- Protect user passwords on the gateway server and over the network
- Do not use passwordless SSH keys.
- Perform Risk and Vulnerability Assessment
- Backup your gateway routinely
- Develop an incident response plan for your gateway; review and update it regularly
- Put a contingency plan in place to prepare for a disaster or security event that could cause the total loss or lock down of the server
- Monitor changes to critical system files, such as those for SSH, with Tripwire or Samhain (open source)
- Make sure the OS and applications of your gateway service are properly patched, and run a vulnerability scanner such as Nessus against them
- Make use of community accounts rather than individual accounts
These are described in more detail below in separate sections. XSEDE ECSS support staff can assist with designing and implementing best practices. The Science Gateways Community Institute service provider also provides information on best practices.
What To Do In Case of a Security Incident
Whether a threat is confirmed or suspected, quick action and immediate communication with the XSEDE Security Working Group are essential. Please contact the XSEDE Help Desk immediately at 1-866-907-2383.