Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Please indicate whether this is for a personal namespace (e.g. primary storage for processing data on Grace), or for a group (shared storage). 
    • For a personal namespace, please indicate
      • your name
      • your e-mail
      • your departmental affiliation
      • what your login name (not your password) and domain you will use to access ROSS's administrative interface, e.g. johndoe@hpc.uams.edu (for personal namespaces we prefer that you use your HPC username,)
    • For a group namespace, please give
      • a name and brief description for the group
      • the primary e-mail contact for the group
      • the departmental affiliation of the group
      • who will be the namespace administrators - we need
        • their names
        • their e-mail address
        • their login name (not their password) and domain e.g. janedoe@ad.uams.edu (for group namespaces we generally prefer campus Active Directory usernames (e.g. the name the namespace administrator might use to login to Outlook Mail)
        • you may ask for more than one namespace administrator
        • if all the members of a particular campus AD group should be namespace administrators, you could also just give us the name and domain of the group instead of their individual names
  • Please estimate approximately how much storage you or your group intend to use in ROSS for the requested namepace, divided into local and replicated amounts.  The HPC Admins will use this information in setting the initial quotas and for capacity planning.  You will be allowed to request increases in quota if needed and space is available. 
  • We would also appreciate a brief description of what you will be using the storage for.  The "what it is used for" assists us in drumming up support (and possibly dollars) for expanding the system. 

If you wish to access an existing group namespace as an object user, please contact the namespace administrator for that namespace and ask to be added as an object user for that namespace.

Using ROSS

A particular username@domain can only be a namespace administrator for one namespace.  If a particular person needs to be the namespace administrator in more than one namespace, for example a personal namespace as well as a group namespace, they must use different login names and domains.  This is why we suggest using your HPC credentials for personal namespaces, and other credentials for group namespaces.

Don't forget that a namespace administrator is not the same as an object user.  See Access to the Research Object Store System (ROSS) for the difference between a namespace administrator and an object user.  An object user has a different API-specific set of credentials.

If you wish to access an existing group namespace as an object user, please contact the namespace administrator for that namespace and ask to be added as an object user for that namespace.

Using ROSS

In experiments we have notices that write times to ROSS are considerably slower than read times, and slower than many POSIX file systems.  However, read times are significantly faster.  In other words, it takes In experiments we have notices that write times to ROSS are considerably slower than read times, and slower than many POSIX file systems.  However, read times are significantly faster.  In other words, it takes longer to store new data into ROSS than to pull existing data out of ROSS.  We also notice (as is typically of most file systems) that transfers of large objects can go significantly faster than tiny objects.  Please keep these facts in mind when planning your use of ROSS - ROSS favors reads over writes and big things over little things.

In theory, a bucket can hold millions of files.  We have noticed, though, that with path prefix searching the more objects that have that a particular prefix, typically related to the number of objects in a bucket, the longer the search takes.  While this is not dissimilar to POSIX filesystems, where the more files there are in a directory. the longer it takes to do a directory lookup, on an object store, being a flat namespace (no directory hierarchy), such searches can be much slowersuch searches can be much slower.  The total number of objects in a bucket also has a mild, though not dramatic, impact on the look up speed particular objects in a bucket.  

What APIs are supported by ROSS

ROSS supports multiple object protocol APIs.  The most common is S3, followed by Swift (from the Open Stack family).  Although ROSS can support the NFS protocol, our experiments show that the native NFS support is slower for many use cases than other options that simply emulate a POSIX file system using object protocols.  ROSS can also support HDFS (used in Hadoop clusters) and a couple of Dell/EMC proprietary protocols (ATMOS and CAS); however, we have not tested any of these other protocols, hence the HPC Admins could only offer limited support for these other options.  The primary protocols that the HPC Admins support is S3 and Swift.  We have more experience with S3, hence most of this article assumes S3.  Later we describe our two favorite tools for accessing ROSS, but you are not required to use our suggested tools..  Although ROSS can support the NFS protocol, our experiments show that the native NFS support is slower for many use cases than other options that simply emulate a POSIX file system using object protocols.  But NFS is available for those who would like to use it.  ROSS can also support HDFS (used in Hadoop clusters) and a couple of Dell/EMC proprietary protocols (ATMOS and CAS); however, we have not tested any of these other protocols, hence the HPC Admins could only offer limited support for these other options. 

The primary protocols that the HPC Admins support is S3 and Swift.  We have more experience with S3, hence most of this article primarily assumes S3.  Note that S3 and Swift can be used interchangeably and simultaneously on the same bucket.

The file-based protocols NFS and HDFS must be enabled on a bucket at creation time, if you intend to use them, since the underlying system adds additional metadata when file access is enabled.  When enabled, NFS and HDFS, too, can be used interchangeably with the S3 and Swift object protocols on the same bucket.  However, a bucket with file access enabled loses some of the object features, such as life cycle management.  So consider carefully before enabling them.  (You can still access a bucket using file system semantics with some external tools even when the bucket is not enabled for file access.)

How do I get access credentials?

In order to use either S3 or Swift, you need credentials.  The credentials are tied to a particular object user, who belongs in a particular namespace.  As discussed in Access to the Research Object Store System (ROSS), the namespace administrator can retrieve an object user's credentials via ROSS's administrative web interface.  Remember, the owner of a personal namespace is their own namespace administrator.

How do I create a bucket or container?

...