Gateways for Developers

An XSEDE Science Gateway is a web or application portal sponsored by a principal investigator (PI) who has an allocation to use data storage and compute resources provided by XSEDE. The gateway provides access to tools customized to meet the needs of a specific community of researchers and is connected to XSEDE resources. For an overview of the steps a PI must take to start an XSEDE Science Gateway, see the Gateways for PIs section.

Gateway Portal Developer Role

A portal developer or development team may be brought into an XSEDE Science Gateway project during the PI's initial planning, before any implementation decisions have been made, or may join the project after many requirements have been defined and an allocation has already been obtained. If the developers are involved early in the planning process, they can make valuable contributions to decisions that will affect the community in the future.

An XSEDE Science Gateway developer will want to incorporate the recommended best practices described below and to meet the specific requirements of XSEDE standards.

Building the Gateway

Types of Gateways

The first steps in creating a gateway include deciding on the interface type and on the back-end services to which the gateway will connect. You may choose to build a web-based user interface or a desktop application that is installed directly on end users' workstations. On the back end, the gateway may connect to only XSEDE services, or it may serve as a bridge to both XSEDE services and other community grids.

Best Practices for Planning and Design

The practices below apply to the design of any web application; they are worth mentioning here so that the ease of use of your gateway is considered along with its scientific objectives.

  • Create a precise list of requirements your gateway must meet
  • Choose technologies based on resources and time
  • Select a development team with user interface (UI) experience
  • Plan for the long term, i.e., for the lifetime of the gateway
  • Use formal design principles to avoid confusing presentation:
    • Structured layouts
    • Focused and uncluttered user interface
    • Easy identification of information categories and relationships
  • Develop in stages
  • Involve end-users in the design process
  • Use mockups to perform usability testing

Desirable Gateway Characteristics

Gateway characteristics are the intrinsic qualities of the portal technologies that will lead to a robust and maintainable system.

  • Universal, secure access
  • Airtight security
  • Based on open standards (JSR 168/286, OGSA, etc.)
  • Modular, reusable design (use portlets)
  • Technologies with a rich API/Abstraction Layer
  • Platform independence (web, Java, XML, etc.)
  • Ease of integration into existing infrastructure
  • Use of commodity software
  • Extensibility
  • Maintainability
  • Scalability

Software and Sample Codes

Gateway developers have contributed their recommendations for software that they have found helpful for developing gateways and connecting them to XSEDE resources.

Connecting to XSEDE

Community Accounts

To address scalability issues, many gateways provide access to XSEDE resources through a community account rather than setting up a unique XSEDE account for each gateway user.

A community account has the following characteristics:

  • Only a single community user account (i.e., an XSEDE username/password) is created.
  • The Science Gateway uses the single XSEDE community user account to launch jobs on XSEDE.
  • The gateway user running under the community account typically has privileges to run only a limited set of applications.

The chief difference between an individual and a community account is that a community account is essentially a single username on XSEDE shared by many (human) users. While this eliminates the need for individual gateway users to request their own XSEDE accounts, it places additional accounting and security burdens on the gateway developers. To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism.

The gateway developer may create individual logins to the gateway itself. However, after logging in, the user will be unaware that they are running applications on XSEDE through a shared community account. Because the gateway maintains control of the XSEDE allocation, the gateway PI is responsible for ensuring that the NSF computational resources are used in a manner consistent with policies and that reasonable measures and tools are in place to ensure appropriate usage, including monitoring of all usage of the gateway by the community. Developers will want to develop usage tracking mechanisms that allow them to attribute XSEDE resource usage to individual gateway users in the case of a security incident.

A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Service Provider sites may choose to restrict community accounts in a variety of ways, for example with chroot jails or with non-shell, role-based accounts that allow the execution of only selected commands or commands located in specific directories.

If a gateway provides services for use by individuals with their own allocations, users may be able to upload their own credentials and make use of gateway tools, but charge the runs on XSEDE resources to their own individual allocation.

Security and accounting requirements are described below.

To request a community account, the PI can log on to the XSEDE User Portal and select "Community Accounts" from the My XSEDE tab.

Connecting to HPC Resources

Data Resources and File Spaces

Data storage on XSEDE is categorized by its purpose and its location.

  • Allocated storage space for archiving on disk or on tape
  • Temporary or long-term storage associated with a compute allocation
  • File space for sharing libraries and codes, either through the PI's home directory, a community account home directory, or a community software area that is available by special request

OAuth MyProxy Services

The XSEDE OAuth for MyProxy service (oa4mp.xsede.org) provides an OAuth 1.0a-compliant interface through which XSEDE science gateways can obtain user certificates on behalf of XSEDE users on active XSEDE allocations. The user certificate allows science gateways to act on the users' behalf, for example by performing GridFTP transfers and GRAM job submissions to the XSEDE user's account. More information about this service is provided in a TeraGrid 2011 paper: http://dx.doi.org/10.1145/2016741.2016776

Java Client

The MyProxy project provides an OAuth for MyProxy Java client for integration into science gateways. Documentation for this client, including download instructions, is provided at: http://grid.ncsa.illinois.edu/myproxy/oauth/client/

For the XSEDE OAuth for MyProxy service instance, the client configuration should include:

  <serviceUri>https://oa4mp.xsede.org/oauth</serviceUri>

Be sure to follow the instructions for the OAuth 1.0a version used by XSEDE, rather than the (newer) OAuth 2.0 version.

Protocol

In case science gateway developers want to use their own OAuth 1.0a implementation with the OAuth for MyProxy service, rather than the Java client provided above, the OAuth 1.0a compliant protocol for the OAuth for MyProxy service is specified at: http://goo.gl/d37tQv

This specification document lists the OAuth endpoints, signature method, and message specification for interacting with the XSEDE OAuth for MyProxy server.
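For developers writing their own client, the sketch below shows how a generic OAuth 1.0a signature base string is assembled (as described in RFC 5849). It is an illustration only: the endpoint URL and parameter values are hypothetical placeholders, default-port stripping is omitted, and the actual endpoints and signature method for the XSEDE service must be taken from the specification document referenced above.

```python
from urllib.parse import quote, urlsplit


def percent_encode(value: str) -> str:
    # OAuth 1.0a leaves only unreserved characters (letters, digits, - . _ ~) unescaped.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode every name and value, then sort by name and by value.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{name}={value}" for name, value in encoded)
    # Scheme and host are lowercased; query string and fragment are excluded.
    parts = urlsplit(url)
    base_url = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path}"
    return "&".join([method.upper(), percent_encode(base_url), percent_encode(normalized)])


# Hypothetical endpoint and parameters, for illustration only.
example = signature_base_string(
    "get",
    "https://Example.org/oauth/initiate",
    {"oauth_nonce": "abc", "oauth_consumer_key": "my gateway"},
)
```

The resulting string is what the chosen signature method signs; the signing step itself is deliberately left out because it depends on the method named in the specification.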

Support

The discuss@sciencegatewaysecurity.org mailing list provides a discussion forum for OAuth for MyProxy. For details, see: http://www.sciencegatewaysecurity.org/discussion

Operations and Maintenance Practices

Once your gateway is operational, good operations and maintenance practices ensure continued, optimum integration with XSEDE resources.

  • Implement new technologies as needed to keep gateway up to date
  • Monitor filesystem usage
  • Monitor job load
  • Keep content current and relevant
  • Keep security and accounting functionalities current with XSEDE requirements
  • Back up your gateway routinely
  • Make sure your OS and applications are properly patched
  • Put a contingency plan in place for complete server loss or security incident
  • Keep logs and contact the XSEDE Help Desk for troubleshooting
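
As one small illustration of the filesystem monitoring item above, the following sketch reports how full a filesystem is and flags it when a threshold is crossed. The 90 percent threshold and the function names are assumptions chosen for the example, not XSEDE recommendations.

```python
import shutil


def filesystem_usage_percent(path: str) -> float:
    # shutil.disk_usage reports total, used, and free bytes for the filesystem holding `path`.
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def over_threshold(path: str, limit_percent: float = 90.0) -> bool:
    # True when the filesystem is at least `limit_percent` full (assumed default: 90).
    return filesystem_usage_percent(path) >= limit_percent
```

A gateway could call over_threshold() from a scheduled task on its scratch or upload directory and alert an operator when it returns True.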

All gateway developers will want to build their gateways using best practices for portal and web application development. In addition, accurate accounting practices will provide statistics to help justify requests by the gateway PI in subsequent proposals.

Security and Accounting for XSEDE gateways

XSEDE has specific security and accounting requirements and recommendations for gateways that connect to its resources. They are intended to help you prevent and triage security incidents and inadvertent misuse. In addition, accurate accounting practices will provide statistics to help justify requests by the gateway PI in subsequent proposals.

Security and Accounting Requirements and Recommendations

  • Required: Notify the XSEDE Help Desk immediately if you suspect that the gateway or its community account may be compromised; you can call the Help Desk at 1-866-907-2383
  • XSEDE reserves the right to disable a community account in the event of a security incident.
  • Required: Keep Science Gateway contact info up to date on the gateway list in case XSEDE staff should need to contact you
  • Required: Institute a user registry
  • Devise a credential management strategy
  • Required: Use the gateway_submit_attributes tool to submit the gateway username with each job
  • Collect Accounting Statistics
  • Maintain an audit trail (keep a gateway log)
  • Provide the ability to restrict job submissions on a per user basis
  • Safeguard and validate programs, scripts, and input
  • Protect passwords locally and over the network
  • Use proper precautions for passwordless SSH keys (not recommended: if they are stolen, anyone can use them)
  • Perform a risk and vulnerability assessment
  • Back up your gateway routinely
  • Develop an incident response plan for your gateway; review and update it regularly
  • Put a contingency plan in place to prepare for a disaster or security event that could cause the total loss or lockdown of the server
  • Monitor changes to critical system files, such as those for SSH, with Tripwire or Samhain (open source)
  • Make sure your OS and applications are properly patched, and run a vulnerability scanner such as Nessus against them
  • Make use of community accounts rather than individual accounts

What to Do in a Security Incident

Whether a threat is confirmed or suspected, quick action and immediate communication with the XSEDE Security Working Group are essential. Please contact the XSEDE Help Desk immediately at 1-866-907-2383.

Make Use of Community Accounts

Community Accounts are described at length on the main page of the developers section; they are the most common account strategy for XSEDE Science Gateways. Community Accounts present specific security and accounting challenges for the gateway developer, because many end users share access to XSEDE resources through the shared account. Consequently, the developer will typically want to restrict the account's privileges so that it can run only a limited set of applications.

To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism. A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Service Provider sites may choose to restrict community accounts in a variety of ways, for example with chroot jails or with non-shell, role-based accounts that allow the execution of only selected commands or commands located in specific directories. Some of the techniques for managing community accounts are described below.

Institute a User Registry

Science gateways must implement a user registry that contains contact information for all users accessing XSEDE resources through the gateway. Gateways may provide access to their XSEDE allocated resources for demonstration or class accounts. In cases such as these, capabilities would be very limited.

Collect resource usage information for each registered user.

Provide the ability to restrict job submissions on a per-user basis. This is optional, but it can protect the gateway from the shutdown of an entire community account in the event of a security incident. Furthermore, identification of researchers who have benefited from the services offered by the gateway will be a fundamental part of future requests for XSEDE resources and may also be useful to the gateway's own funding efforts.
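
A minimal in-memory sketch of such a registry is shown below. It keeps contact information, records which jobs each user launched, and lets an operator block one user without affecting the rest of the community account. The class and field names are hypothetical, and a real gateway would back this with a persistent database.

```python
from dataclasses import dataclass, field


@dataclass
class GatewayUser:
    username: str
    email: str                      # contact information required by the registry
    may_submit: bool = True         # per-user submission switch
    job_ids: list = field(default_factory=list)


class UserRegistry:
    def __init__(self):
        self._users = {}

    def register(self, username: str, email: str) -> None:
        if username in self._users:
            raise ValueError(f"user already registered: {username}")
        self._users[username] = GatewayUser(username, email)

    def block(self, username: str) -> None:
        # Restrict a single user instead of disabling the whole community account.
        self._users[username].may_submit = False

    def authorize_submission(self, username: str) -> bool:
        user = self._users.get(username)
        return user is not None and user.may_submit

    def record_job(self, username: str, job_id: str) -> None:
        if not self.authorize_submission(username):
            raise PermissionError(f"submission not allowed for: {username}")
        self._users[username].job_ids.append(job_id)

    def jobs_of(self, username: str) -> list:
        return list(self._users[username].job_ids)
```

Calling authorize_submission() before every job launch gives the gateway a single place to enforce per-user restrictions.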

Devise a Credential Strategy

Gateways require X.509 credentials for accessing XSEDE resources securely. Developers need to plan a strategy to fit their credential management scenarios. Users access XSEDE resources via science gateways using either individual credentials (i.e., issued to a single user who is known to the XSEDE Central Database) or community credentials (i.e., issued to the gateway which is responsible for per-user tracking). Individual credentials allow XSEDE service providers (SPs) to track per-person resource usage using standard account-based techniques, based on the XSEDE allocations process. Community credentials provide a more scalable approach, allowing user registration to be outsourced to the gateway, but XSEDE SPs still require the ability to track per-person resource usage for accounting and security reasons.

Gateways can combine both approaches, allowing registered XSEDE users to access resources with their individual credentials and others to access resources via a community credential. For more about scenarios and strategies, see the Science Gateway Credential Management page in the XSEDE Wiki.

Maintain An Audit Trail: Keep a Gateway Log

Keeping an audit trail enables traceback to a user engaged in abusive or suspicious activity or involved in a serious security breach. The audit trail consists of a log of all user login and job activity. The audit trail is now being managed by attribute-based authentication, which sends expanded information, including unique user identifiers, in job submission records. These records are stored in the XSEDE Central Database (XDCDB) and are available for individual management of security. See Attribute-based Authentication below.

Log Authentication and Authorization Activity

Log entries for all login activity, including attempts, successful logins, and further authorization, should include the following data:

  • the requesting IP address
  • date stamp (Coordinated Universal Time, UTC)
  • username
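
One way to capture these three fields is a structured log line per login event, as in the sketch below. The JSON layout and the extra "outcome" field (to separate attempts from successful logins) are assumptions made for the example rather than an XSEDE-defined format.

```python
import json
from datetime import datetime, timezone


def login_log_entry(ip_address: str, username: str, outcome: str, when: datetime = None) -> str:
    # Always record the time in UTC, whatever the local timezone of the gateway server.
    when = when or datetime.now(timezone.utc)
    record = {
        "timestamp": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "ip": ip_address,
        "username": username,
        "outcome": outcome,  # hypothetical values: "attempt", "success", "denied"
    }
    return json.dumps(record, sort_keys=True)
```

Each returned string can be appended as one line to the gateway log, which keeps the audit trail easy to search by username or IP address.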

Map User Activity to Jobs Run on XSEDE Resources

With the adoption of attribute-based authentication, it will no longer be necessary to maintain job information; however, doing so may be helpful in policing your gateway in case of a security infraction.

GRAM jobs:

  • GRAM job handle (which usually looks like https://<execHost>:49xxx//xxxxx/xxxxxxxx) for each job submitted by a gateway user, for Globus-based gateways. The mapping between this job handle and the scheduler local jobID on the XSEDE compute resource will be provided via an auditing Web Service.
  • RSL for each GRAM job.

Non-GRAM job submissions:

  • the remote command(s) executed via SSH, scripts, Web Services, or some other method to initiate a job on an XSEDE compute resource.
  • scheduler local jobID to gateway user mapping

For both GRAM and non-GRAM:

  • XSEDE resource to which the job was submitted.
  • Application(s) launched (especially for canned apps).
  • Timestamp for each job launch and termination.
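
The items above can be gathered into one audit record per job, as sketched below. The record layout and helper names are hypothetical; the point is only that every job can be traced back to the gateway user who launched it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobAuditRecord:
    gateway_user: str
    resource: str                          # XSEDE resource the job was submitted to
    application: str                       # application launched
    local_job_id: str                      # scheduler local jobID
    launched_at: str                       # timestamp of job launch
    terminated_at: Optional[str] = None    # filled in when the job ends
    gram_job_handle: Optional[str] = None  # GRAM jobs only
    rsl: Optional[str] = None              # GRAM jobs only
    remote_command: Optional[str] = None   # non-GRAM jobs only


def user_for_job(records, resource: str, local_job_id: str) -> Optional[str]:
    # Trace a job on a given resource back to the gateway user who launched it.
    for record in records:
        if record.resource == resource and record.local_job_id == local_job_id:
            return record.gateway_user
    return None


def jobs_for_user(records, gateway_user: str) -> list:
    return [r for r in records if r.gateway_user == gateway_user]
```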

Safeguard and Validate Programs, Scripts, and Input

Developers will want to consider the security of codes running on XSEDE resources. To the extent possible, gateways must implement safeguards (to be determined jointly with the XSEDE gateways and Security Working Groups) to protect against intentional or unintentional abuse by programs, scripts, and input. For example, input entered through a web form must not cause buffer overflows when a code is run using the resulting input files. Gateways should not let anonymous users upload executable files.
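
The actual safeguards are to be agreed with XSEDE, so the following is only a sketch of the general idea: check every web form value against an explicit allowlist, a length limit, or a numeric range before it reaches an input file or a command line. The application name, the field names, and the limits are all hypothetical.

```python
import re

# Hypothetical allowlist of applications the community account may run.
ALLOWED_APPLICATIONS = {"demo_app"}

# File names: at most 64 characters, no path separators, no leading dot or dash.
SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}")


def validate_job_request(application: str, input_name: str, steps) -> list:
    """Return a list of problems; an empty list means the request may proceed."""
    errors = []
    if application not in ALLOWED_APPLICATIONS:
        errors.append("application is not on the allowlist")
    if not isinstance(input_name, str) or not SAFE_NAME.fullmatch(input_name):
        errors.append("input file name contains unsafe characters or is too long")
    # Reject non-integers (including booleans) and out-of-range values.
    if type(steps) is not int or not 1 <= steps <= 10000:
        errors.append("steps must be an integer between 1 and 10000")
    return errors
```

Rejecting anything that is not explicitly allowed is generally safer than trying to enumerate every dangerous input.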

Perform Risk and Vulnerability Assessment

The Gateways Risk and Vulnerability Assessment is designed to provide the information needed by XSEDE for incident prevention and response, to evaluate security threats, to suggest possible mitigations, and to determine whether the unmitigated risk can be accepted as a cost of doing business. For more information, please see the Vulnerability Assessment spreadsheet from the Wiki.

Protect Portal Passwords Locally and Over the Network

Gateways should implement a strong password enforcement mechanism for their users. They must also use SSL or some other encryption mechanism so that gateway user passwords are never transmitted in the clear.
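
For local protection, a common approach is to store only a salted, slow hash of each password. The sketch below uses PBKDF2 from the Python standard library together with a simple strength check; the iteration count and the strength policy are assumptions for illustration and are not XSEDE requirements.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def is_strong(password: str) -> bool:
    # Assumed policy: at least 12 characters drawn from at least three character classes.
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(not c.isalnum() for c in password),
    ]
    return len(password) >= 12 and sum(classes) >= 3


def hash_password(password: str, salt: bytes = None):
    # A fresh random salt per user means identical passwords produce different hashes.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```

The gateway stores the salt and digest, never the password itself.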

Collect Accounting (and Other) Statistics

XSEDE must report several important metrics about the gateways it supports:

Gateways must record each gateway user's CPU usage on a quarterly basis. XSEDE is required to report these figures to the NSF, but the information will also be useful for your project reports, for requests for future allocations, and for the Security or Accounting Working Groups should they need to investigate abuse or overuse. Once your gateway implements attribute-based authentication, this will be done automatically by XSEDE. Science successes may be more challenging for a gateway to collect because of the disparate nature of the user community. Nevertheless, science successes due to the use of the gateway are important, both for XSEDE resource requests and likely for gateway funding requests as well. Published papers, citations, and science successes can provide useful supporting information.
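
For gateways that tally usage themselves, the sketch below sums per-user usage by calendar quarter from a list of job records. It approximates CPU usage as wall-clock hours multiplied by processor count, which is an assumption made for the example; the charge actually recorded for a job is determined by the resource it ran on.

```python
from collections import defaultdict
from datetime import datetime


def quarter_of(timestamp: datetime) -> str:
    # Months 1-3 are Q1, 4-6 are Q2, and so on.
    return f"{timestamp.year}-Q{(timestamp.month - 1) // 3 + 1}"


def cpu_hours_by_user_and_quarter(jobs) -> dict:
    """jobs: iterable of (gateway_user, start, end, processors) tuples."""
    totals = defaultdict(float)
    for user, start, end, processors in jobs:
        hours = (end - start).total_seconds() / 3600.0
        # Attribute the whole job to the quarter in which it started.
        totals[(user, quarter_of(start))] += hours * processors
    return dict(totals)
```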

Science Gateway User Count

The gateway_submit_attributes package provides a mechanism for collecting science gateway-supplied usernames used to run applications under community accounts on XSEDE resources. In this scenario, the gateway authenticates the external user, sets the user name, and provides indirect access to the community account. 

The gateway (via GSI-SSH) or the job management middleware (Globus GRAM or UNICORE) invokes the script, gateway_submit_attributes, which writes the gateway-supplied username, the local job ID (obtained from the local resource manager), the submission time (also obtained from the local resource manager), and the submission host (configured by the local service provider) to special, restricted tables in the XSEDE Central Database (XDCDB).

Science Gateway User Count: Perl Client

The gateway_submit_attributes package provides a Perl client for integration into science gateways. The client is available on XSEDE resources under the module name "gateway-usage-reporting". After logging in to the XSEDE resource with ssh, load the module to access the client:

module load gateway-usage-reporting

The gateway_submit_attributes script takes three command-line parameters as input, in the format:

> gateway_submit_attributes -gateway_user <username@mygateway.org> \
-submit_time <submission_time> -jobid <jobid>

The submission_time should be in the standard ISO format of "YYYY-MM-DD HH:MM:SS TZ" like "1999-01-08 04:05:06 -8:00". For example, after submitting a job on an XSEDE resource, extract the job id, and run the gateway_submit_attributes script as follows:

> sbatch mpi.job
...
Submitted batch job 4939827
> gateway_submit_attributes -gateway_user marlon@iu.edu \
-submit_time "`date '+%F %T %:z'`" -jobid 4939827

Please note that for the Gordon resource, you need to use the full string returned from PBS (including the hostname), for example:

> qsub test.sub2
2149587.gordon-fe2.local
> gateway_submit_attributes -gateway_user marlon@iu.edu \
-submit_time "`date '+%F %T %:z'`" -jobid 2149587.gordon-fe2.local
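
A gateway that drives these steps from code might assemble the call as sketched below. The helper names are hypothetical, and the parsing covers only the two scheduler output formats shown in the examples above; the command-line flags are the ones documented for gateway_submit_attributes.

```python
import re
from datetime import datetime


def extract_job_id(scheduler_output: str) -> str:
    # sbatch-style output, as in the example above: "Submitted batch job <id>"
    match = re.search(r"Submitted batch job (\d+)", scheduler_output)
    if match:
        return match.group(1)
    # qsub-style output: the job identifier is printed on its own line.
    # Keep the full string, including the hostname.
    lines = [line.strip() for line in scheduler_output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no job id found in scheduler output")
    return lines[-1]


def format_submit_time(when: datetime) -> str:
    # Mirrors the shell expression date '+%F %T %:z' used above.
    if when.utcoffset() is None:
        raise ValueError("submission time must carry a UTC offset")
    offset = when.strftime("%z")  # for example "-0800"
    return f"{when.strftime('%Y-%m-%d %H:%M:%S')} {offset[:3]}:{offset[3:5]}"


def build_command(gateway_user: str, when: datetime, job_id: str) -> list:
    return [
        "gateway_submit_attributes",
        "-gateway_user", gateway_user,
        "-submit_time", format_submit_time(when),
        "-jobid", job_id,
    ]
```

The resulting list can be passed to subprocess.run(..., check=True) on the XSEDE resource after the gateway-usage-reporting module has been loaded.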

This command submits the information to a staging table in the XDCDB, where it will later be matched with AMIE accounting records coming from the site. To verify that the information is matched correctly, you can run the xdusage command with the "-ja" option, as shown in the example below:

$ xdusage -j -ja -s 2015-03-04 -e 2015-03-05 -p TG-STA110011S
...
job id= 4939827 resource=stampede.tacc.xsede
submit=2015-03-04@17:47:28 start=2015-03-04@17:47:28
end=2015-03-04@17:57:37 nodecount=2 processors=32 queue=normal charge=24.89
job-attr id= 4939827 name=marlon@iu.edu value=marlon@iu.edu
...


It may take up to a day for the AMIE packets to be sent by the site and for the data to be matched as above.
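
To automate this check, a gateway could scan saved xdusage output for the job-attr line of a given job, as in the sketch below. The line format is inferred solely from the sample output above and may differ between xdusage versions, so treat the pattern as an assumption.

```python
import re

# Assumed layout, taken from the sample: "job-attr id= <jobid> name=<name> value=<value>"
JOB_ATTR = re.compile(r"job-attr id=\s*(\S+)\s+name=(\S+)\s+value=(\S+)")


def attributes_for_job(xdusage_output: str, job_id: str) -> dict:
    attributes = {}
    for line in xdusage_output.splitlines():
        match = JOB_ATTR.fullmatch(line.strip())
        if match and match.group(1) == job_id:
            attributes[match.group(2)] = match.group(3)
    return attributes


def attribute_matched(xdusage_output: str, job_id: str, gateway_user: str) -> bool:
    # True when some attribute recorded for the job carries the gateway username.
    return gateway_user in attributes_for_job(xdusage_output, job_id).values()
```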

In case of submission failures due to database errors, the attributes are saved in a log file in the $HOME directory:

$HOME/gateway_attributes_log/gateway_attributes_entry.<job_id>.<submit_time>

If the file already exists, the script exits without overwriting the file. Attributes can later be resubmitted through the log file option of gateway_submit_attributes:

gateway_submit_attributes -f <attributes_filename>

A gateway_bulk_submit script is also provided for the convenience of gateway operators who need to submit or resubmit attributes or attribute log files in bulk. To resubmit all previously failed entries, simply use:

gateway_bulk_submit -resubmit

Upon successful resubmission, the corresponding log file is renamed to $HOME/gateway_attributes_log/gateway_attributes_entry.<job_id>.<submit_time>.delete and can be deleted.

All submission histories are logged for future reference in files named $HOME/gateway_attributes_log/history/gateway_submit_attributes-<date>.log, and users are encouraged to keep these logs for at least 90 to 120 days for auditing purposes.
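
The file naming conventions described above lend themselves to simple housekeeping. The sketch below lists entries still awaiting resubmission, entries already marked ".delete", and history logs older than a retention period; the function names and the 120-day default are assumptions based on the guidance above.

```python
import time
from pathlib import Path

ENTRY_PREFIX = "gateway_attributes_entry."


def pending_entries(log_dir) -> list:
    # Saved attribute files that have not yet been successfully resubmitted.
    return sorted(p for p in Path(log_dir).glob(ENTRY_PREFIX + "*")
                  if p.is_file() and not p.name.endswith(".delete"))


def deletable_entries(log_dir) -> list:
    # Files renamed with the ".delete" suffix after a successful resubmission.
    return sorted(p for p in Path(log_dir).glob(ENTRY_PREFIX + "*.delete") if p.is_file())


def expired_history_logs(log_dir, keep_days: int = 120, now: float = None) -> list:
    # History logs last modified more than keep_days ago.
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    history = Path(log_dir) / "history"
    return sorted(p for p in history.glob("gateway_submit_attributes-*.log")
                  if p.stat().st_mtime < cutoff)
```

An operator would call these with Path.home() / "gateway_attributes_log" and decide what to resubmit or remove; nothing in the sketch deletes files on its own.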

For any questions or issues with the gateway_submit_attributes package, please contact the XSEDE Help Desk or email help@xsede.org and follow the standard XSEDE helpdesk procedure. 

How Gateways Can Manage Jobs as Specific Users

Gateways typically submit jobs on behalf of their users using a common community account. However, it is sometimes desirable for gateways to execute remote commands on behalf of the user as that user. This can be done using XSEDE's OAuth for MyProxy service. The following is a short summary of how the service works:

The general approach promoted by XSEDE is that the gateway should use the XSEDE OAuth service (https://portal.xsede.org/oauth/) to authenticate you. It does this using a standard OAuth 1 flow similar to what you do when you log in to a website using your Google, Twitter, Facebook, or GitHub ID.

 1. You click to log in to the gateway.
 2. The gateway redirects you to the XSEDE OAuth service (https://portal.xsede.org/oauth/).
 3. You log in on the XSEDE OAuth service login page using your XSEDE User Portal username and password.
 4. The XSEDE OAuth service validates your username and password and asks whether you would like to grant the gateway permission to obtain a delegated credential on your behalf.
 5. You click OK and are redirected back to the gateway.
 6. The gateway uses a special token received from the XSEDE OAuth server in the redirect to call a proprietary service that in turn calls the XSEDE MyProxy Server (myproxy.xsede.org) to receive a proxy certificate on your behalf.
 7. The gateway keeps the proxy certificate until it expires, at which point you need to repeat the process (except for the approval step).
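
On the gateway side, the last step amounts to caching each user's proxy certificate and noticing when it has expired. The sketch below shows one way to model that; the class names and the five-minute safety margin are hypothetical, and obtaining the certificate through the OAuth flow is outside its scope.

```python
from datetime import datetime, timedelta, timezone


class DelegatedCredential:
    def __init__(self, username: str, pem: str, expires_at: datetime):
        self.username = username
        self.pem = pem                # proxy certificate received on the user's behalf
        self.expires_at = expires_at  # timezone-aware expiry time

    def is_valid(self, now: datetime = None, margin: timedelta = timedelta(minutes=5)) -> bool:
        # Treat credentials about to expire as already expired (assumed 5-minute margin).
        now = now or datetime.now(timezone.utc)
        return now + margin < self.expires_at


class CredentialStore:
    def __init__(self):
        self._by_user = {}

    def put(self, credential: DelegatedCredential) -> None:
        self._by_user[credential.username] = credential

    def get_valid(self, username: str, now: datetime = None):
        credential = self._by_user.get(username)
        if credential is None or not credential.is_valid(now):
            # Drop stale entries; the caller must send the user through the OAuth flow again.
            self._by_user.pop(username, None)
            return None
        return credential
```

When get_valid() returns None, the gateway redirects the user back to the XSEDE OAuth service as in step 2.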

 

Migrating from JGlobus

For many years, the JGlobus software provided a Java client API for Globus Toolkit services, covering GRAM, GridFTP, MyProxy, and Grid Security Infrastructure (GSI) functionality. Java-based science gateways used JGlobus to interface with XSEDE (and previously TeraGrid). However, JGlobus is now deprecated and not supported by the Globus project or by XSEDE. Science Gateways still using JGlobus 1.x, which relies on outdated PureTLS libraries for security, must promptly migrate to alternative options for support of current security algorithms including SHA-2 certificates. Please read the Migrating From JGlobus guide to migrate your science gateways.