...

As the name implies, ROSS is an object store.  An object store does not have a directory tree structure the way conventional POSIX file systems do.  (Grace's storage systems are POSIX file systems.)  Instead, objects have names and live in buckets, essentially a one-level directory hierarchy.  People often name objects using what looks like a POSIX-style file path.  For example, "directory1/directory2/directory3/directory4/filename" might be an object name.  The RESTful APIs used to access ROSS can do a prefix search, listing just the objects whose names start with a particular prefix.  For example, "directory1/directory2/" would pull back all the objects that start with that prefix.  So in a way one can mimic a POSIX file system with object naming conventions that use the equivalent of POSIX path names.  Unfortunately, prefix search is quite inefficient, making the simulated directory lookup slower than a native POSIX file system.  Nevertheless, some tools can mimic a POSIX-style file system using an object store behind the scenes.  With a proper naming convention, it is fairly easy to back up and restore POSIX directory trees to and from an object store efficiently.  This is what makes ROSS an excellent backing storage system for backups of Grace.  (A backing storage system is the larger but slower storage system that sits behind a smaller but faster storage cache in a hierarchical storage model.  Grace's storage system should be considered the cache.)
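
As a concrete illustration of a prefix search, here is a minimal sketch using the S3 API via boto3.  The endpoint URL, credentials, bucket name, and object prefix below are placeholders, not actual ROSS values.

    # Minimal sketch: listing objects by prefix with the S3 API (boto3).
    # The endpoint URL, credentials, and bucket name are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://ross.example.edu",   # placeholder ROSS S3 endpoint
        aws_access_key_id="MY_OBJECT_USER",        # placeholder object-user name
        aws_secret_access_key="MY_SECRET_KEY",     # placeholder secret key
    )

    # "directory1/directory2/" is not a real directory; it is just the leading
    # part of each object's name.  The prefix search returns every object whose
    # name starts with that string.
    response = s3.list_objects_v2(
        Bucket="my-bucket",
        Prefix="directory1/directory2/",
    )
    for obj in response.get("Contents", []):
        print(obj["Key"], obj["Size"])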

Remember, the HPC Admins do not back up Grace's shared storage system, since it is only intended to be a staging or scratch area (i.e., a cache) used to run HPC jobs.  In other words, the primary copy of data should be elsewhere, not on Grace.  ROSS is one option for keeping the primary copy of data.  In fact, the main reason UAMS purchased ROSS is to serve as the primary location for storing research data.

Data in ROSS starts out triple replicated across 3 different storage nodes, then transitions to erasure coding, which provides resilience without the storage cost of full replication.  ROSS currently uses 12/16 erasure coding, meaning that for every 12 blocks of data stored it writes 16 blocks (roughly 1.33 bytes on disk per byte of data, versus 3 bytes per byte for triple replication).  Up to four of those 16 blocks can be lost or damaged and the data is still recoverable; losing a fifth block would make recovery impossible.  In contrast, most RAID systems have only 1 or 2 drives' worth of redundancy.  ROSS scatters the 16 blocks across the storage nodes in a data center, improving the performance of data retrievals and further improving the resilience of the system (less chance that a node failure blocks access).  Data can be accessed from any of the storage nodes.  Currently the two systems (UAMS and UARK) are isolated from each other, so replication between them is not possible, but plans to join the two are in progress.  Eventually all of the content in ROSS, regardless of which campus it physically lives on, would be accessible from either campus.  Of course, if data living on one campus is accessed from the other campus, the access will be somewhat slower due to networking delays and bandwidth limitations between campuses.

...

For data that meets OURRstore requirements (relatively static, STEM related, neither clinical nor regulated), archiving to OURRstore is a very cost-effective option to get resilient, long-term, offsite copies for little extra money (just media and shipping costs).  Other options for backup include the UAMS campus Research NAS or cloud storage such as Azure, Google, Amazon, or IBM.  USB or bare drives are also an option for backup, but are not recommended, as they are quite error prone if not stored and managed properly.

...

Server-side encryption can be turned on either at the namespace level, where all buckets in the namespace are required to be encrypted, or at the bucket level for namespaces that do not have 'encryption required' turned on.  If you want namespace-level encryption, please inform the HPC Admins when requesting a namespace.  The encryption choice must be made at namespace or bucket creation time and cannot be changed afterwards.  (Should a change be needed after bucket creation, one can copy data from an unencrypted bucket to an encrypted one and then destroy the unencrypted bucket, or vice versa.)
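
If you do need to migrate data that way, any S3 tool can do the copy.  Below is a minimal sketch using boto3 and server-side copies; the endpoint, credentials, and bucket names are placeholders, not actual ROSS values.

    # Minimal sketch: copying every object from an unencrypted bucket to an
    # encrypted one using server-side copies (boto3).  Endpoint, credentials,
    # and bucket names are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://ross.example.edu",   # placeholder ROSS S3 endpoint
        aws_access_key_id="MY_OBJECT_USER",
        aws_secret_access_key="MY_SECRET_KEY",
    )

    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket="old-unencrypted-bucket"):
        for obj in page.get("Contents", []):
            # The managed copy() call asks the object store to copy the object
            # server side (using multipart copy for large objects), so the data
            # does not round-trip through the client.
            s3.copy(
                {"Bucket": "old-unencrypted-bucket", "Key": obj["Key"]},
                "new-encrypted-bucket",
                obj["Key"],
            )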

Requesting access to ROSS

...

If you wish to access an existing group namespace as an object user, please contact the namespace administrator for that namespace and ask to be added as an object user for that namespace.

Using ROSS

What APIs are supported by ROSS

ROSS supports multiple object protocol APIs.  The most common is S3, followed by Swift (from the OpenStack family).  Although ROSS can support the NFS protocol, our experiments show that the native NFS support is slower for many use cases than other options that simply emulate a POSIX file system using object protocols.  ROSS can also support HDFS (used in Hadoop clusters) and a couple of Dell/EMC proprietary protocols (ATMOS and CAS); however, we have not tested any of these other protocols, so the HPC Admins can offer only limited support for them.  The primary protocols that the HPC Admins support are S3 and Swift, and we have more experience with S3.  Later we describe our two favorite tools for accessing ROSS, but you are not required to use our suggested tools.
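
As a quick illustration that the same data is reachable through more than one protocol, here is a hypothetical sketch of listing containers over Swift with python-swiftclient.  The auth URL, credentials, and auth version are placeholders and may not match how ROSS's Swift endpoint is actually configured.

    # Hypothetical sketch: listing containers over the Swift protocol with
    # python-swiftclient.  Auth URL, credentials, and auth version are
    # placeholders and may differ from ROSS's actual Swift configuration.
    from swiftclient.client import Connection

    conn = Connection(
        authurl="https://ross.example.edu/auth/v1.0",  # placeholder Swift auth endpoint
        user="MY_OBJECT_USER",                         # placeholder object-user name
        key="MY_SWIFT_PASSWORD",                       # placeholder Swift password
        auth_version="1",
    )

    # get_account() returns the account metadata plus the list of containers
    # (the Swift equivalent of S3 buckets).
    headers, containers = conn.get_account()
    for container in containers:
        print(container["name"], container["count"], "objects")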

How do I get access credentials?

blah

How do I create a bucket or container?

For your personal namespace, you can create buckets (the S3 term) or containers (the equivalent Swift term) from the administration API.

If it is a group namespace (e.g. lab, project, department), the namespace administrator can create buckets with certain characteristics for you.

In either case, object users can also use APIs and tools to create buckets or containers.  However, many of the ECS-specific features (e.g. replication, encryption, file access) are not available when using the bucket-creation options of tools and APIs.  If you need any of those options, it is better to create the bucket or container with ROSS's web interface.  Using the APIs is beyond the scope of this document.
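
For completeness, here is a minimal sketch of plain bucket creation through the S3 API with boto3 (with none of the ECS-specific options).  The endpoint, credentials, and bucket name are placeholders, not actual ROSS values.

    # Minimal sketch: creating a plain bucket through the S3 API (boto3).
    # None of the ECS-specific options (replication, encryption, file access)
    # can be set this way; use ROSS's web interface if you need those.
    # Endpoint, credentials, and bucket name are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://ross.example.edu",   # placeholder ROSS S3 endpoint
        aws_access_key_id="MY_OBJECT_USER",
        aws_secret_access_key="MY_SECRET_KEY",
    )

    s3.create_bucket(Bucket="my-new-bucket")

    # Verify the bucket now shows up in your namespace.
    for bucket in s3.list_buckets()["Buckets"]:
        print(bucket["Name"])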

What tools do you recommend for accessing ROSS

The bottom line - any tool that uses any of the supported protocols theoretically could work with ROSS.  There are a number of free and commercial tools that work nicely with object stores like ROSS.  Like most things in life, there are pros and cons to different tools.  Here is some guidance as to what we look for in tools.

Because ROSS is inherently accessible in parallel (remember the 23+ storage nodes), the best performance is gained when operations proceed in parallel.  Keep that in mind if building your own scripts (e.g. using ECS's variant of s3curl) or when comparing tools.  More parallelism (i.e. multiple transfers happening simultaneously across multiple storage nodes) generally yields better performance, up to a limit.  Eventually the parallel transfers become limited by other factors such as memory or network bandwidth constraints.  Generally there is a 'sweet spot' in the number of parallel transfer threads that can run before they step on each other's toes.  The S3 protocol includes support for multipart upload, where a large object can be broken into multiple pieces that are transferred in parallel.  Tools that support multipart upload will likely have better performance than tools that don't.  In addition, tools that move entire directory trees in parallel will have better performance than tools that move files one object at a time.
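
As an illustration of tuning that parallelism, the sketch below uses boto3's managed transfer with a multipart threshold and a configurable thread count.  The endpoint, credentials, bucket, file name, part size, and thread count are placeholders; the 'sweet spot' for ROSS has to be found by experiment.

    # Minimal sketch: a parallel multipart upload using boto3's managed transfer.
    # Endpoint, credentials, bucket, file name, part size, and thread count are
    # placeholders; tune the thread count for your network and node.
    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client(
        "s3",
        endpoint_url="https://ross.example.edu",   # placeholder ROSS S3 endpoint
        aws_access_key_id="MY_OBJECT_USER",
        aws_secret_access_key="MY_SECRET_KEY",
    )

    config = TransferConfig(
        multipart_threshold=64 * 1024 * 1024,  # use multipart for objects over 64 MB
        multipart_chunksize=64 * 1024 * 1024,  # 64 MB parts
        max_concurrency=8,                     # number of parallel transfer threads
    )

    # upload_file splits the object into parts and uploads the parts in
    # parallel threads once the object exceeds the multipart threshold.
    s3.upload_file("large_dataset.tar", "my-bucket",
                   "archive/large_dataset.tar", Config=config)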

...