Gateways for Developers
An XSEDE Science Gateway is a web or application portal sponsored by a principal investigator (PI) who has an allocation to use data storage and compute resources provided by XSEDE. The gateway provides access to tools customized to meet the needs of a specific community of researchers and is connected to XSEDE resources. For an overview of the steps a PI must take to start an XSEDE Science Gateway, see the Gateways for PIs section.
Gateway Portal Developer Role
A portal developer or development team may be brought into an XSEDE Science Gateway project during the initial planning by the PI, before any implementation decisions have been made, or may join the project after many requirements have been defined and an allocation has already been obtained. If the developers are involved early in the planning process, they can make valuable contributions to decisions that will affect the community in the future.
An XSEDE Science Gateway developer will want to incorporate the recommended best practices described below and meet the specific requirements of XSEDE standards.
Building the Gateway
Types of Gateways
The first steps in creating a gateway include deciding on the interface type and on the services to which the gateway will connect on the back end. You may choose to build a web-based user interface or a desktop application that is installed directly on the end users' workstations. On the back end, the gateway may connect only to XSEDE services, or it may serve as a bridge to both XSEDE services and other community grids.
Best Practices for Planning and Design
The practices below apply to the design of any web application, and they are worth mentioning here, so that the ease of use of your gateway is considered along with its scientific objectives.
- Create a precise list of requirements your gateway must meet
- Choose technologies based on resources and time
- Select a development team with user interface (UI) experience
- Plan for the long term, i.e., for the lifetime of the gateway
- Use formal design principles to avoid confusing presentation:
  - Structured layouts
  - Focused and uncluttered user interface
  - Easy identification of information categories and relationships
- Develop in stages
- Involve end-users in the design process
- Use mockups to perform usability testing
Desirable Gateway Characteristics
Gateway characteristics are the intrinsic qualities of the portal technologies that will lead to a robust and maintainable system.
- Universal, secure access
- Airtight security
- Based on Open Standards (JSR 168/286, OGSA, etc.)
- Modular, reusable design (use portlets)
- Technologies with a rich API/Abstraction Layer
- Platform independence (web, Java, XML, etc.)
- Ease of integration into existing infrastructure
- Use of commodity software
Software and Sample Codes
Gateway developers have contributed their recommendations for software that they have found helpful for developing gateways and connecting them to XSEDE resources.
- Developer-recommended Software for TeraGrid Science Gateways (TeraGrid Wiki)
Connecting to XSEDE
To address scalability issues, many gateways provide access to XSEDE resources through a community account rather than setting up unique XSEDE accounts for each gateway user.
A community account has the following characteristics:
- Only a single community user account (i.e., an XSEDE username/password) is created.
- The Science Gateway uses the single XSEDE community user account to launch jobs on XSEDE.
- The gateway user running under the community account typically has privileges to run only a limited set of applications.
The chief difference between an individual and a community account is that a community account is essentially a single username on XSEDE shared by many (human) users. While this eliminates the need for individual gateway users to request their own XSEDE accounts, it places additional accounting and security burdens on the gateway developers. To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism.
The gateway developer may create individual logins to the gateway itself. However, after logging in, the user will be unaware that they are running applications on XSEDE through a shared community account. Because the gateway maintains control of the XSEDE allocation, the gateway PI is responsible for ensuring that the NSF computational resources are used in a manner consistent with policies and that reasonable measures and tools are in place to ensure appropriate usage, including monitoring of all usage of the gateway by the community. Developers will want to build usage tracking mechanisms that allow them to attribute XSEDE resource usage to individual gateway users in the case of a security incident.
A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Service Provider sites may choose to restrict community accounts in a variety of ways, for example chroot jails or non-shell, role-based accounts which allow the execution of only selected commands or commands located in specific directories.
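One way to express such user tiers on the gateway side is a small policy table that is consulted before a job is submitted. The sketch below is only an illustration: the tier names, limits, and application names are hypothetical and do not reflect any XSEDE policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Capabilities granted to one class of gateway user (hypothetical values)."""
    max_core_hours_per_job: float
    allowed_apps: frozenset


# Hypothetical tiers; real limits are a gateway policy decision.
TIERS = {
    "demo": Tier(1.0, frozenset({"demo_app"})),
    "researcher": Tier(5000.0, frozenset({"demo_app", "md_sim"})),
}


def may_submit(tier_name: str, app: str, core_hours: float) -> bool:
    """Return True if a user in this tier may run `app` at the requested size."""
    tier = TIERS.get(tier_name)
    if tier is None:
        return False  # unknown tiers get no access
    return app in tier.allowed_apps and core_hours <= tier.max_core_hours_per_job
```

A check like this complements, but does not replace, any restrictions the Service Provider applies to the community account itself.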
If a gateway provides services for use by individuals with their own allocations, users may be able to upload their own credentials and make use of gateway tools, but charge the runs on XSEDE resources to their own individual allocation.
Security and accounting requirements are below.
To request a community account, the PI can log on to the XSEDE User Portal and select "Community Accounts" from the My XSEDE tab.
Connecting to HPC Resources
- Job Submission for Science Gateways - There are many standard mechanisms for launching compute jobs described in the Basic Job Submission section of the How do I use XSEDE? guide. For GRAM 5 information, see the TeraGrid Wiki GRAM 5 Testing pages.
- XSEDE Resource Catalog (XSEDE User Support)
Data Resources and File Spaces
Data storage on XSEDE is categorized by its purpose and its location.
- Allocated storage space for archiving on disk or on tape
- Temporary or long-term storage associated with a compute allocation
- File space for sharing libraries and codes, either through the PI's home directory, a community account home directory, or a community software area that is available by special request
The XSEDE OAuth for MyProxy service (oa4mp.xsede.org) provides an OAuth 2.0 OIDC (OpenID Connect) compliant interface that lets XSEDE science gateways authenticate users and, optionally, obtain user certificates on behalf of XSEDE users on active XSEDE allocations. The user certificate allows a science gateway to act on the user's behalf, e.g., to perform GridFTP transfers and GRAM job submissions under the XSEDE user's account. More information about this service is provided in a TeraGrid 2011 paper (although that paper describes the OAuth 1.0a protocol, many basic elements also apply to the new OAuth 2.0 OIDC service): http://dx.doi.org/10.1145/2016741.2016776
On successful authentication of a user, Clients will, by default, receive the "sub" claim containing the XSEDE username of the user as part of ID Token. Clients can additionally request the "xsede" scope to receive the following additional info in OIDC claims as part of ID Token and userinfo:
given_name, middle_name, family_name, email, xsedeHomeOrganization.
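As a minimal sketch of how a gateway might consume these claims, the function below builds a gateway-side profile from a claims mapping. It assumes the ID Token has already been validated by the OIDC client library; the function name and the "xsede_username" key are hypothetical.

```python
from typing import Mapping

# Claim names released with the "xsede" scope, as listed above.
XSEDE_SCOPE_CLAIMS = ("given_name", "middle_name", "family_name",
                      "email", "xsedeHomeOrganization")


def profile_from_claims(claims: Mapping[str, str]) -> dict:
    """Build a gateway profile from already-validated ID Token claims."""
    if "sub" not in claims:
        raise ValueError("ID Token is missing the 'sub' claim")
    # "sub" carries the XSEDE username of the authenticated user.
    profile = {"xsede_username": claims["sub"]}
    # Claims from the "xsede" scope are optional; copy only those present.
    for name in XSEDE_SCOPE_CLAIMS:
        if name in claims:
            profile[name] = claims[name]
    return profile
```

Copying only the known claims keeps unrelated token fields (issuer, expiry, and so on) out of the gateway's user records.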
All new clients are strongly encouraged to use the OAuth2 OIDC protocol. Current OAuth1.0a clients are encouraged to migrate to the OAuth2 OIDC protocol.
The MyProxy project provides an OAuth for MyProxy OAuth2.0 OIDC Java client for integration into science gateways. Documentation for this client, including download instructions, is provided at: http://grid.ncsa.illinois.edu/myproxy/oauth/client/
Client configuration info, including an example configuration can be found at: http://grid.ncsa.illinois.edu/myproxy/oauth/client/manuals/parameters.xhtml
Be sure to follow the instructions for the OAuth 2.0 version, rather than the (older) OAuth 1.0a version.
Registration info/link can be found at https://oa4mp.xsede.org/oauth2/
For the XSEDE OAuth for MyProxy service instance, the client configuration should include:
Make sure to replace the client secret and "id" with correct values obtained at the time of client registration.
In case science gateway developers want to use their own OAuth 2.0 OIDC implementation with the OAuth for MyProxy service, rather than the Java client provided above, the OAuth 2.0 OIDC compliant protocol for the OAuth for MyProxy service is specified at: https://docs.google.com/a/cilogon.org/document/d/1cs3peO9FxA81KN-1RC6Z-auEFIwRbJpZ-SFuKbQzS50/edit?usp=sharing
This specification document lists the OAuth endpoints, signature method, and message specification for interacting with the XSEDE OAuth for MyProxy service; make sure to substitute oa4mp.xsede.org for myproxy.example.edu.
Operations and Maintenance Practices
Once your gateway is operational, good operations and maintenance practices ensure continued, optimum integration with XSEDE resources.
- Implement new technologies as needed to keep gateway up to date
- Monitor filesystem usage
- Monitor job load
- Keep content current and relevant
- Keep security and accounting functionalities current with XSEDE requirements
- Back up your gateway routinely
- Make sure your OS and applications are properly patched
- Put a contingency plan in place for complete server loss or security incident
- Keep logs and contact the XSEDE Help Desk for troubleshooting
All gateway developers will want to build their gateways using best practices for portal and web application development. In addition, accurate accounting practices will provide statistics to help justify requests by the gateway PI in subsequent proposals.
Security and Accounting for XSEDE gateways
XSEDE has specific security and accounting requirements and recommendations for connecting to its resources. They are meant to prepare your gateway for the prevention and triage of security incidents or inadvertent misuse. In addition, accurate accounting practices will provide statistics to help justify requests by the gateway PI in subsequent proposals.
Security and Accounting Requirements and Recommendations
- Required: Notify the XSEDE Help Desk immediately if you suspect the gateway or its community account may be compromised, or call the Help Desk at 1-866-907-2383. XSEDE reserves the right to disable a community account in the event of a security incident.
- Required: Keep Science Gateway contact info up to date on the gateway list in case XSEDE staff should need to contact you
- Required: Institute a user registry
- Devise a credential management strategy
- Required: Use gateway_submit_attributes tool to submit gateway username with job
- Collect Accounting Statistics
- Maintain an audit trail (keep a gateway log)
- Provide the ability to restrict job submissions on a per user basis
- Safeguard and validate Programs, scripts, and input
- Protect passwords locally and over the network
- Use proper precautions for passwordless ssh keys (not recommended; if they are stolen, anyone can use them)
- Perform Risk and Vulnerability Assessment
- Back up your gateway routinely
- Develop an incident response plan for your gateway; review and update it regularly
- Put a contingency plan in place to prepare for a disaster or security event that could cause the total loss or lock down of the server
- Monitor changes to critical system files such as SSH with Tripwire or Samhain (open source)
- Make sure your OS and applications are properly patched; run a vulnerability scanner such as Nessus against them
- Make use of community accounts rather than individual accounts
What to Do in a Security Incident
Whether a threat is confirmed or suspected, quick action and immediate communication with the XSEDE Security Working Group are essential. Please contact the XSEDE Help Desk immediately at 1-866-907-2383.
Make Use of Community Accounts
Community Accounts are described at length on the main page of the developers section; they are the most common account strategy for XSEDE Science Gateways. Community Accounts present specific security and accounting challenges for the gateway developer, because many end users share access to XSEDE resources through the shared account. Consequently, the developer will typically want to restrict the account's privileges so that it can run only a limited set of applications.
To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism. A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Service Provider sites may choose to restrict community accounts in a variety of ways, for example chroot jails or non-shell, role-based accounts which allow the execution of only selected commands or commands located in specific directories. Some of the techniques for managing community accounts are described below.
Institute a User Registry
Science gateways must implement a user registry that contains contact information for all users accessing XSEDE resources through the gateway. Gateways may provide access to their XSEDE allocated resources for demonstration or class accounts. In cases such as these, capabilities would be very limited.
Collect resource usage information for each registered user.
Provide the ability to restrict job submissions on a per-user basis. This is optional, but it can protect the gateway from the shutdown of an entire community account in the event of a security incident. Furthermore, identifying the researchers who have benefited from the services offered by the gateway will be a fundamental part of future requests for XSEDE resources and may also be useful to the gateway's own funding efforts.
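The three registry duties described in this section (contact information, per-user usage, and per-user submission blocking) can be sketched as a small in-memory class. This is an illustration under stated assumptions, not a required design: the class and field names are hypothetical, and a production gateway would back the registry with a database.

```python
from dataclasses import dataclass


@dataclass
class GatewayUser:
    username: str
    email: str                     # contact information for the user
    blocked: bool = False          # per-user submission restriction
    core_hours_used: float = 0.0   # resource usage attributed to this user


class UserRegistry:
    """In-memory registry of gateway users (hypothetical sketch)."""

    def __init__(self):
        self._users = {}

    def register(self, username: str, email: str) -> GatewayUser:
        if username in self._users:
            raise ValueError(f"user {username!r} is already registered")
        user = GatewayUser(username, email)
        self._users[username] = user
        return user

    def record_usage(self, username: str, core_hours: float) -> None:
        self._users[username].core_hours_used += core_hours

    def block(self, username: str) -> None:
        """Stop one user from submitting without disabling the whole account."""
        self._users[username].blocked = True

    def may_submit(self, username: str) -> bool:
        user = self._users.get(username)
        return user is not None and not user.blocked
```

Blocking a single user this way lets the gateway respond to an incident without losing the community account for everyone else.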
Devise a Credential Strategy
Gateways require X.509 credentials for accessing XSEDE resources securely. Developers need to plan a strategy to fit their credential management scenarios. Users access XSEDE resources via science gateways using either individual credentials (i.e., issued to a single user who is known to the XSEDE Central Database) or community credentials (i.e., issued to the gateway which is responsible for per-user tracking). Individual credentials allow XSEDE service providers (SPs) to track per-person resource usage using standard account-based techniques, based on the XSEDE allocations process. Community credentials provide a more scalable approach, allowing user registration to be outsourced to the gateway, but XSEDE SPs still require the ability to track per-person resource usage for accounting and security reasons.
Gateways can combine both approaches, allowing registered XSEDE users to access resources with their individual credentials and others to access resources via a community credential. For more about scenarios and strategies, see the Science Gateway Credential Management in the XSEDE Wiki.
Maintain An Audit Trail: Keep a Gateway Log
Keeping an audit trail enables traceback to a user engaged in abusive or suspicious activity or a serious security breach. The audit trail consists of a log of all user login and job activity. The audit trail is now being managed by attribute-based authentication, which sends expanded information, including unique user identifiers, in job submission records. These records will be recorded in the XSEDE Central Database (TGCDB) and available for individual management of security. See Attribute-based Authentication below.
Log Authentication and Authorization Activity
Log entries for all login activity, including attempts, successful logins, and further authorization, should include the following data:
- the requesting IP address
- date stamp (Coordinated Universal Time, UTC)
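A minimal sketch of such a log entry, written as one JSON object per line, is shown below. The IP address and UTC date stamp come from the list above; the "gateway_user" and "event" fields and the function name are assumptions added for illustration.

```python
import json
from datetime import datetime, timezone


def auth_log_entry(ip: str, gateway_user: str, event: str, when: datetime) -> str:
    """Format one authentication event as a JSON line with a UTC date stamp.

    `when` must be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps(
        {
            "timestamp": stamp,            # date stamp in UTC
            "ip": ip,                      # requesting IP address
            "gateway_user": gateway_user,  # hypothetical extra field
            "event": event,                # e.g. "login_failed" (hypothetical)
        },
        sort_keys=True,
    )
```

Writing every entry in UTC avoids ambiguity when gateway logs are later compared with records from XSEDE resources in other time zones.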
Map User Activity to Jobs Run on XSEDE Resources
With the adoption of attribute-based authentication, it will no longer be necessary to maintain job information; however, it may be helpful in policing your gateway in case of an infraction in security.
GRAM job submissions:
- GRAM job handle (which usually looks like https://<execHost>:49xxx//xxxxx/xxxxxxxx) for each job submitted by a gateway user for Globus-based gateways. The mapping between this job handle and the scheduler local jobID on the XSEDE compute resource will be provided via an auditing Web Service.
- RSL for each job for GRAM jobs.
Non-GRAM job submissions:
- the remote command(s) executed via SSH, scripts, Web Services, or some other method to initiate a job on an XSEDE compute resource.
- scheduler local jobID to gateway user mapping
For both GRAM and non-GRAM:
- XSEDE resource to which the job was submitted.
- Application(s) launched (especially for canned apps).
- Timestamp for each job launch and termination.
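The items above can be gathered into a single per-job record, which makes traceback from a scheduler job ID to a gateway user a simple lookup. The sketch below is one possible layout; all class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class JobRecord:
    gateway_user: str
    resource: str                          # XSEDE resource the job ran on
    application: str                       # application launched
    local_job_id: str                      # scheduler local jobID
    launched_at: str                       # timestamp of job launch
    terminated_at: Optional[str] = None    # timestamp of job termination
    gram_job_handle: Optional[str] = None  # GRAM submissions only
    rsl: Optional[str] = None              # GRAM submissions only
    remote_command: Optional[str] = None   # non-GRAM submissions only


def user_for_job(records: Iterable[JobRecord], resource: str,
                 local_job_id: str) -> Optional[str]:
    """Trace a scheduler job on a resource back to the gateway user."""
    for record in records:
        if record.resource == resource and record.local_job_id == local_job_id:
            return record.gateway_user
    return None
```

The lookup keys on both resource and job ID because scheduler job IDs are only unique within one resource.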
Safeguard and Validate Programs, Scripts, and Input
Developers will want to consider the security of codes running on XSEDE resources. To the extent possible, gateways must implement safeguards (to be determined jointly with the XSEDE gateways and Security Working Groups) to protect against intentional or unintentional abuse by programs, scripts, and input. For example, input entered through a web form must not cause buffer overflows when a code is run using the resulting input files. Gateways should not let anonymous users upload executable files.
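As a hedged illustration of validating web form input before it is written to an input file, the sketch below accepts only values that match a strict allowlist pattern and a numeric range. The form fields ("job_name", "steps") and their limits are hypothetical; the actual safeguards are to be agreed with the XSEDE working groups as described above.

```python
import re

# Allowlist patterns: anything that does not match is rejected outright.
_JOB_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")
_STEPS = re.compile(r"[0-9]{1,7}")


def validate_form(form: dict) -> dict:
    """Return cleaned values from a web form, or raise ValueError."""
    name = str(form.get("job_name", ""))
    if not _JOB_NAME.fullmatch(name):
        raise ValueError("job_name must be 1-64 letters, digits, '_' or '-'")
    steps_text = str(form.get("steps", ""))
    if not _STEPS.fullmatch(steps_text):
        raise ValueError("steps must be a non-negative integer")
    steps = int(steps_text)
    if not 1 <= steps <= 1_000_000:
        raise ValueError("steps must be between 1 and 1000000")
    return {"job_name": name, "steps": steps}
```

Bounding both the length and the character set of each field means the input file handed to the application always has a predictable size and shape.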
Perform Risk and Vulnerability Assessment
The Gateways Risk and Vulnerability Assessment is designed to provide the information needed by XSEDE for incident prevention and response, to evaluate security threats, to suggest possible mitigations, and to determine whether the unmitigated risk can be accepted as a cost of doing business. For more information, please see the Vulnerability Assessment spreadsheet from the Wiki.
Protect Portal Passwords Locally and Over the Network
Gateways should implement a strong password enforcement mechanism for their users. They must also use SSL or some other encryption mechanism to keep gateway user passwords from being transmitted in the clear.
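For local protection, a common approach is to store only a salted, slow hash of each password rather than the password itself. The sketch below uses PBKDF2 from the Python standard library; the iteration count and function names are assumptions chosen for illustration, and this is not an XSEDE-mandated scheme.

```python
import hashlib
import hmac
import os

# Assumed work factor; tune it to the gateway server's hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest) for storage; the plain password is never stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```

Protection over the network is a separate concern and is handled by serving the login form only over an encrypted connection.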
Collect Accounting (and Other) Statistics
XSEDE must report several important metrics about the gateways it supports:
Gateways must record each gateway user's CPU usage on a quarterly basis. XSEDE is required to report these figures to the NSF, but the information will also be useful for your project reports, requests for future allocations, and for the Security or Accounting Working Groups should they need to investigate abuse or overuse. Once your gateway implements attribute-based authentication, this will be done automatically by XSEDE. Science successes may be more challenging for a gateway to collect because of the disparate nature of the user community. Nevertheless, science successes due to the use of the gateway are important, both for XSEDE resource requests and likely for gateway funding requests as well. Published papers, citations, and science successes can provide useful supporting information.
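A gateway that keeps its own job records can produce the quarterly per-user figures with a simple aggregation. The sketch below assumes calendar quarters and a hypothetical input of (gateway user, job end date, CPU hours) tuples; the reporting periods XSEDE actually uses may differ.

```python
from collections import defaultdict
from datetime import date


def quarter_of(day: date) -> str:
    """Label a date with its calendar quarter, e.g. '2015-Q1' (assumed convention)."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def usage_by_user_and_quarter(jobs):
    """Sum CPU hours per (gateway_user, quarter).

    `jobs` is an iterable of (gateway_user, end_date, cpu_hours) tuples.
    """
    totals = defaultdict(float)
    for user, end_date, cpu_hours in jobs:
        totals[(user, quarter_of(end_date))] += cpu_hours
    return dict(totals)
```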
The gateway_submit_attributes package provides a mechanism for collecting science gateway-supplied usernames used to run applications under community accounts on XSEDE resources. In this scenario, the gateway authenticates the external user, sets the user name, and provides indirect access to the community account.
The gateway (via GSI-SSH) or the job management middleware (Globus GRAM or UNICORE) invokes the script gateway_submit_attributes, which writes the gateway-supplied username, the local job ID (obtained from the local resource manager), the submission time (also obtained from the local resource manager), and the submission host (configured by the local service provider) to special, restricted tables in the XSEDE Central Database (XDCDB).
Science Gateway User Count: Perl Client
The gateway_submit_attributes package provides a Perl client for integration into science gateways. The client is available on XSEDE resources under the module name "gateway-usage-reporting". After ssh'ing into the XSEDE resource, to access the client, simply run
module load gateway-usage-reporting
The gateway_submit_attributes script takes as input 3 command-line parameters in the format:
> gateway_submit_attributes -gateway_user <email@example.com> \
-submit_time <submission_time> -jobid <jobid>
The submission_time should be in the standard ISO format "YYYY-MM-DD HH:MM:SS TZ", such as "1999-01-08 04:05:06 -8:00".
For example, after submitting a job on an XSEDE resource, extract the job ID and run the gateway_submit_attributes script as follows:
> sbatch mpi.job
Submitted batch job 4939827
> gateway_submit_attributes -gateway_user firstname.lastname@example.org \
-submit_time "`date '+%F %T %:z'`" -jobid 4939827
Please note that for the Gordon resource, you need to use the full string returned from PBS (including the hostname) e.g.,
> qsub test.sub2
> gateway_submit_attributes -gateway_user email@example.com \
-submit_time "`date '+%F %T %:z'`" -jobid 2149587.gordon-fe2.local
This command submits the information to a staging table in the XDCDB, where it is later matched with the AMIE accounting records coming from the site. To verify that the information is matched correctly, run the xdusage command with the "-ja" option, as shown in the example below:
$ xdusage -j -ja -s 2015-03-04 -e 2015-03-05 -p TG-STA110011S
job id= 4939827 resource=stampede.tacc.xsede
end=2015-03-04@17:57:37 nodecount=2 processors=32 queue=normal charge=24.89
job-attr id= 4939827 firstname.lastname@example.org email@example.com
It may take up to a day for the AMIE packets to be sent by the site and for the data to be matched as above.
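A gateway written in another language than shell can assemble the same gateway_submit_attributes call programmatically. The sketch below only builds the argument list: it parses the job ID from Slurm's "Submitted batch job" message and formats the submission time the way date '+%F %T %:z' does. The Python function names are hypothetical; the command-line options are the ones shown above.

```python
import re
from datetime import datetime


def parse_sbatch_job_id(output: str) -> str:
    """Extract the job ID from sbatch output such as 'Submitted batch job 4939827'."""
    match = re.search(r"Submitted batch job (\d+)", output)
    if match is None:
        raise ValueError("no job ID found in sbatch output")
    return match.group(1)


def format_submit_time(when: datetime) -> str:
    """Format as 'YYYY-MM-DD HH:MM:SS +HH:MM', like date '+%F %T %:z'."""
    if when.tzinfo is None:
        raise ValueError("submit time must be timezone-aware")
    offset = when.strftime("%z")  # e.g. "-0800"
    return when.strftime("%Y-%m-%d %H:%M:%S ") + offset[:3] + ":" + offset[3:]


def build_command(gateway_user: str, when: datetime, job_id: str) -> list:
    """Build the argument list for gateway_submit_attributes."""
    return ["gateway_submit_attributes",
            "-gateway_user", gateway_user,
            "-submit_time", format_submit_time(when),
            "-jobid", job_id]
```

The resulting list can be passed to subprocess.run on the XSEDE resource once the gateway-usage-reporting module is loaded; passing a list rather than a shell string keeps the gateway-supplied username from being interpreted by the shell.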
In case of submission failures due to database errors, the attributes are saved in a log file in the $HOME directory:
If the file already exists, the script exits without overwriting it. Attributes can later be resubmitted through the gateway_submit_attributes log file option:
gateway_submit_attributes -f <attributes_filename>
A gateway_bulk_submit script is also provided so that gateway operators can submit or resubmit attributes or attribute log files in bulk. To resubmit all previously failed entries, simply use:
Upon successful resubmission, the corresponding log file will be renamed as
$HOME/gateway_attributes_log/gateway_attributes_entry.<job_id>.<submit_time>.delete and can be deleted.
All submission histories are logged for future reference in files named
$HOME/gateway_attributes_log/history/gateway_submit_attributes-<date>.log. Users are encouraged to keep these logs for at least 90 to 120 days for auditing purposes.
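To help apply the retention guidance, a gateway operator could periodically list the history logs that are older than the retention window before archiving or removing them. The sketch below is an assumption-laden illustration: it judges age by file modification time rather than by the <date> in the file name, and the function name and default window are hypothetical.

```python
import time
from pathlib import Path


def logs_past_retention(history_dir, keep_days: int = 120, now: float = None):
    """Return history logs whose modification time is older than `keep_days`."""
    if now is None:
        now = time.time()
    cutoff = now - keep_days * 86400  # retention window in seconds
    return sorted(
        path
        for path in Path(history_dir).glob("gateway_submit_attributes-*.log")
        if path.stat().st_mtime < cutoff
    )
```

The function only reports candidates; deciding whether to archive or delete them is left to the operator.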
How Gateways Can Manage Jobs as Specific Users
Gateways typically submit jobs on behalf of their users using a common community account. However, it is sometimes desirable for gateways to execute remote commands on behalf of the user as that user. This can be done using XSEDE's OAuth for MyProxy service. The following is a short summary of how the service works:
Migrating from JGlobus
For many years, the JGlobus software provided a Java client API for Globus Toolkit services, covering GRAM, GridFTP, MyProxy, and Grid Security Infrastructure (GSI) functionality. Java-based science gateways used JGlobus to interface with XSEDE (and previously TeraGrid). However, JGlobus is now deprecated and not supported by the Globus project or by XSEDE. Science Gateways still using JGlobus 1.x, which relies on outdated PureTLS libraries for security, must promptly migrate to alternative options for support of current security algorithms including SHA-2 certificates. Please read the Migrating From JGlobus guide to migrate your science gateways.