...
The following table describes all the global options.
Option | Description |
---|---|
bufferSize | Sets the buffer size (in bytes) to use when streaming data from the source to the target (supported plugins only). Defaults to 128K. |
dbConnectString | We do not use this parameter. Enables the MySQL database engine and specifies the JDBC connect string to connect to the database (e.g. "jdbc:mysql://localhost:3306/ecs_sync?user=foo&password=bar"). A database will make repeat runs and incrementals more efficient. With this database type, you can use the mysql client to interrogate the details of all objects in the sync. |
dbEncPassword | This parameter is only used with the MySQL option, which we do not use. Specifies the encrypted password for the MySQL database. |
dbEnhancedDetailsEnabled | May not work. Specifies whether the DB should include enhanced details, like source/target MD5 checksum, retention durations, etc. Note that this will cause the DB to consume more storage and may add some latency to each copy operation. |
dbFile | We recommend this option. Enables the SQLite database engine and specifies the file to hold the status database. A database will make repeat runs and incrementals more efficient. With this database type, you can use the sqlite3 client to interrogate the details of all objects in the sync. |
dbTable | Specifies the DB table name to use. When using MySQL be sure to provide a unique table name or risk corrupting a previously used table. Default table is "objects". |
deleteSource | Supported source plugins will delete each source object once it is successfully synced (does not include directories). Use this option with care! Be sure log levels are appropriate to capture transferred (source deleted) objects. |
estimationEnabled | By default, the source plugin will query the source storage to crawl and estimate the total amount of data to be transferred. Use this option to disable estimation (e.g. to improve performance). |
forceSync | Force the write of each object, regardless of its state in the target storage. |
ignoreInvalidAcls | If syncing ACL information when syncing objects, ignore any invalid entries (i.e. permissions or identities that don't exist in the target system). |
logLevel | Sets the verbosity of logging (silent, quiet, verbose, or debug). Default is quiet. |
monitorPerformance | Enables performance monitoring for reads and writes on any plugin that supports it. |
perfReportSeconds | Not tested. Report upload and download rates for the source and target plugins every <x> seconds to INFO logging. Default is off (0). |
recursive | Hierarchical storage will sync recursively. |
rememberFailed | Tracks all failed objects and displays a summary of failures when finished. |
retryAttempts | Specifies how many times each object should be retried after an error. Default is 2 retries (total of 3 attempts). |
sourceListFile | Path to a file that supplies the list of source objects to sync. This file must be in CSV format, with one object per line and the absolute identifier (full path or key) as the first value in each line. The entire line is available to each plugin as a raw string. |
sourceListFileRawValues | Whether to treat the lines in the sourceListFile as raw values (do not do any parsing to remove comments, escapes, or trim white space). Default is false. |
syncAcl | Sync ACL information when syncing objects (in supported plugins). |
syncData | Sync object data. |
syncMetadata | Sync metadata. |
syncRetentionExpiration | Sync retention/expiration information when syncing objects (in supported plugins). The target plugin will *attempt* to replicate retention/expiration for each object. Works only on plugins that support retention/expiration. If the target is an Atmos cloud, the target policy must enable retention/expiration immediately for this to work. |
threadCount | Specifies the number of objects to sync simultaneously. Default is 16. |
timingWindow | Sets the window for timing statistics. Every {timingWindow} objects that are synced, timing statistics are logged and reset. Default is 10,000 objects. |
timingsEnabled | Enables operation timings on all plugins that support it. |
verify | After a successful object transfer, the object will be read back from the target system and its MD5 checksum will be compared with that of the source object (generated during transfer). This only compares object data (metadata is not compared) and does not include directories. |
verifyOnly | Similar to verify except that the object transfer is skipped and only read operations are performed (no data is written). |
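
As an illustration of the sourceListFile layout described in the table (CSV, one object per line, absolute identifier first), here is a minimal Python sketch that writes such a list. The helper name `write_source_list` and the example paths are hypothetical; only the file layout comes from the table, and this is not part of ecs-sync itself.

```python
import csv
import sys


def write_source_list(handle, identifiers):
    """Write one source object per line, identifier in the first CSV column."""
    # csv.writer quotes values that contain commas or quotes, so each
    # identifier stays a single first value on its line.
    writer = csv.writer(handle, lineterminator="\n")
    for identifier in identifiers:
        writer.writerow([identifier])


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    write_source_list(
        sys.stdout,
        ["/data/project/a.dat", "/data/project/sub dir/b.dat"],
    )
```

Redirecting the output to a file gives something that could be passed as the sourceListFile value. Whether quoted values need sourceListFileRawValues left at its default is an assumption worth checking against a small test run.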
File System Options
This example shows the file system as a source. It could also be a target. The filesystem storage type plugin reads from or writes to a directory. The example shows typical values, which generally need to be changed to fit your situation.
...
The following table describes the file system options.
Option | Description |
---|---|
deleteCheckScript | When the deleteSource global option is true, add this option to execute an external script to check whether a file should be deleted. If the process exits with return code zero, the file is safe to delete. |
deleteOlderThan | When the deleteSource global option is true, add this option to only delete files that have been modified more than <delete-age> milliseconds ago. |
excluded-paths | A list of regular expressions to search against the full file path. If the path matches, the file will be skipped. Since this is a regular expression, take care to escape special characters. For example, to exclude all .snapshot directories, the pattern would be .*/\.snapshot. Specify multiple entries by repeating the option or using multiple lines. |
followLinks | Instead of preserving symbolic links, follow them and sync the actual files. |
includeBaseDir | By default, the base directory is not included as part of the sync (only its children are). Enable this to sync the base directory. |
modifiedSince | Only look at files that have been modified since the specified date/time. Date/time should be provided in ISO-8601 UTC format (e.g. 2015-01-01T04:30:00Z, which is <yyyy-MM-ddThh:mm:ssZ>). |
relativeLinkTargets | By default, any symbolic link targets that point to an absolute path within the primary source directory will be changed to a (more portable) relative path. Set this option false to keep the target path as-is. |
storeMetadata | When used as a target, stores source metadata in a JSON file, since filesystems have no concept of user metadata. When used as a source, uses the JSON file to restore the metadata in the target. |
useAbsolutePath | When true, uses the absolute path to the file when storing it instead of the relative path from the source dir. |
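
To make the excluded-paths and modifiedSince values in the table more concrete, the following Python sketch checks a path against the `.*/\.snapshot` pattern and formats a timestamp in the ISO-8601 UTC form shown above. It only approximates the behavior: ecs-sync is a Java program, and the use of `re.search` here, along with the helper names `is_excluded` and `to_modified_since`, are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Pattern from the table: the dot is escaped so it matches a literal ".".
EXCLUDED_PATTERNS = [re.compile(r".*/\.snapshot")]


def is_excluded(full_path, patterns=EXCLUDED_PATTERNS):
    """Return True if any pattern is found in the full file path."""
    return any(pattern.search(full_path) for pattern in patterns)


def to_modified_since(moment):
    """Format a timezone-aware datetime as ISO-8601 UTC with a Z suffix."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(is_excluded("/home/user/.snapshot/hourly.0/file.txt"))  # True
    print(is_excluded("/home/user/snapshot/file.txt"))  # False
    print(to_modified_since(datetime(2015, 1, 1, 4, 30, tzinfo=timezone.utc)))
```

Note that this pattern is not anchored at the end, so in this sketch it would also match a name such as `.snapshot_old`; testing a pattern against a few sample paths before a large sync is a cheap safeguard.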
ROSS (ECS S3) Options
ROSS uses the ECS S3 storage system type. This example shows ROSS as a target. It could also be a source. The example shows typical values, some of which must be changed for the sync to work, such as accessKey, bucketName, keyPrefix, and secretKey.
...
The following table describes the options for the ECS S3 (ecs-s3:) plugin.
Option | Description |
---|---|
accessKey | The S3 access ID for ROSS's S3 API, which is the object user's username in the ECS world. |
secretKey | The S3 access secret for ROSS associated with the object user, which can be copied from the ECS portal by the namespace administrator. |
port | The port to use to access ROSS. Should be 9020 if the protocol is set to http, or 9021 if the protocol is set to https. |
protocol | The protocol to use to access ROSS, either http or https. |
bucket | The bucket to write or read objects from. |
create-bucket | By default, the target bucket must exist. If true, this option will create the bucket if it does not exist. |
enableVHosts | Specifies whether virtual-hosted buckets will be used (i.e. the bucket name appears in the hostname instead of in the path). Default is path-style buckets. |
keyPrefix | Specifies a string to be prepended to the name generated for the object. For example, if keyPrefix is set to "prefix/", the source is a file system, and ecs-sync is copying a file with a path "subdir/subdir/filename", then the object name will be "prefix/subdir/subdir/filename". |
apacheClientEnabled | Disabling this will use the native Java HTTP protocol handler, which can be faster in some situations, but is buggy. |
geoPinningEnabled | Enables geo-pinning. This will use a standard algorithm to select a consistent VDC for each object key or bucket name, taking into account where the request is made, and where the VDCs that hold the data are. |
includeVersions | Enable to transfer all versions of every object. NOTE: this will overwrite all versions of each source key in the target system if any exist! |
mpuEnabled | Enables multipart upload (MPU). Large files will be split into multiple streams and (if possible) sent in parallel. |
mpuPartSizeMb | Sets the part size to use when multipart upload is required (objects over 5GB). Default is 128MB, minimum is 4MB. |
mpuThreadCount | The number of threads to use for multipart upload (only applicable for file sources). |
mpuThresholdMb | Sets the size threshold (in MB) when an upload shall become a multipart upload. |
preserveDirectories | If enabled, directories are stored in S3 as empty objects to preserve empty dirs and metadata from the source. |
remoteCopy | If enabled, a remote-copy command is issued instead of streaming the data. Remote-copy can be much faster than the streaming alternative. Remote-copy can only be used when the source and target are the same system (e.g. both are on ROSS). |
resetInvalidContentType | When set to true (the default), any invalid content-type is reset to the default (application/octet-stream). Turn this off to fail these objects (ECS does not allow invalid content-types). |
smartClientEnabled | The smart client is enabled by default. Use this option to turn it off when using a load balancer (which presumably would perform a function similar to smart client) or a fixed set of nodes. |
socketConnectTimeoutMs | Sets the connection timeout in milliseconds (default is 15000ms). |
socketReadTimeoutMs | Sets the read timeout in milliseconds (default is 0ms). |
urlEncodeKeys | Enables URL-encoding of object keys in bucket listings. Use this if a source bucket has illegal XML characters in key names. |
vdcs | Sets which virtual data centers (VDCs), and which nodes in each VDC, ecs-sync will communicate with when copying. The format is "vdc-name(host,..)[,vdc-name(host,..)][,..]". The smart client capability will load balance across the active nodes, skipping the inactive ones. We use this option in preference to the host option; the example above lists the current UAMS VDC and nodes. |
host | An alternative to the vdcs option. The value can be a comma-separated list of host network names or the name of a load balancer. We do not provide an example of how to use the host option, as we prefer the vdcs option. |
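
A few of the values in the table follow simple rules that can be sketched in Python: the port that goes with each protocol, the vdcs string format, and how keyPrefix is prepended to an object name. The helper names and the VDC and node names below are hypothetical and are not part of ecs-sync; the rules themselves are taken from the table.

```python
# Ports listed in the table for each protocol.
DEFAULT_PORTS = {"http": 9020, "https": 9021}


def default_port(protocol):
    """Return the ROSS port that matches the chosen protocol."""
    return DEFAULT_PORTS[protocol]


def format_vdcs(vdcs):
    """Build "vdc-name(host,..)[,vdc-name(host,..)]" from {name: [hosts]}."""
    return ",".join(
        f"{name}({','.join(hosts)})" for name, hosts in vdcs.items()
    )


def object_key(key_prefix, relative_path):
    """Prepend keyPrefix to the name generated for the object."""
    return key_prefix + relative_path


if __name__ == "__main__":
    # Hypothetical VDC and node names, for illustration only.
    print(default_port("https"))
    print(format_vdcs({"vdc1": ["node1", "node2"], "vdc2": ["node3"]}))
    print(object_key("prefix/", "subdir/subdir/filename"))
```

The last call reproduces the keyPrefix example from the table, giving "prefix/subdir/subdir/filename". A prefix without a trailing slash would be joined directly to the first path component in this sketch, so the slash is worth including when a folder-like layout is wanted.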
rclone
Although not as fast or efficient as ecs-sync, the rclone program (http://www.rclone.org) is one of the most versatile tools for accessing object stores such as ROSS. It includes features for browsing various types of object and cloud storage systems, as well as local files, using POSIX file system conventions, with commands such as ls (list objects/files in a path), lsd (list directories/containers/buckets in a path), copy, move, and delete. For some object/cloud storage systems, rclone can mount buckets as if they were network file systems, though with reduced functionality and speed, so that users can use familiar commands to access bucket contents instead of the rclone-specific commands. The most popular feature of rclone, however, is the ability to sync a directory tree to a bucket on an object store using a familiar rsync-like syntax. The rclone site includes documentation on how to install, configure, and use rclone. The rclone program is licensed under the permissive, open-source MIT license, and is therefore free to use and distribute.
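
As a rough illustration of the rsync-like sync syntax mentioned above, the following Python sketch assembles an `rclone sync` command line. The remote name `ross`, the bucket path, the local directory, and the helper name `build_sync_command` are all hypothetical; a remote must first be set up with rclone's own configuration, as described on the rclone site. The sketch only builds and prints the command, and it adds `--dry-run` by default because `rclone sync` makes the destination match the source, which can delete destination objects.

```python
import shlex
import subprocess


def build_sync_command(local_dir, remote, bucket_path, dry_run=True):
    """Return the argument list for syncing a directory tree to a bucket."""
    command = ["rclone", "sync", local_dir, f"{remote}:{bucket_path}"]
    if dry_run:
        # Report what would be transferred or deleted without changing anything.
        command.append("--dry-run")
    return command


def run_sync(command):
    """Run the command; requires rclone to be installed and configured."""
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cmd = build_sync_command("/data/project", "ross", "my-bucket/project")
    print(shlex.join(cmd))
```

Once the dry run reports the expected changes, calling `build_sync_command(..., dry_run=False)` and passing the result to `run_sync` would perform the actual transfer.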