Integrating Starfish for Large-Scale File Management with Storj

Integration guide for connecting Storj to Starfish

Starfish Storage is a versatile service designed for file and object management across a wide range of scales, suitable for everything from small departmental file shares to the largest supercomputing file systems. It stands out for its ability to handle vast quantities of data with efficiency and ease, making it a valuable tool for organizations dealing with large and complex data environments.

Advantages of Starfish with Storj

Integrating Starfish with Storj offers a comprehensive solution for large-scale file management at a competitive cost. Starfish, specializing in metadata-driven file organization and management, integrates smoothly with Storj's S3 compatible API, ensuring efficient storage and retrieval of vast data volumes. This combination leverages Storj's robust security features, giving users confidence in the safety and integrity of their data.

Integration

The integration between Storj and Starfish is achieved through the S3 protocol, which allows Starfish to write and read backup data directly to and from the Storj network. Users can configure Starfish to use Storj as the storage target for their archive jobs.

To integrate Starfish with Storj, you will need:

An active Storj account
A bucket for Starfish in your Storj account
Storj S3 compatible credentials
Starfish account (see here)

For more details, see https://starfishstorage.com/solutions/

Set up Storj

Create an Account

To begin, you will need to create a Storj account.

Navigate to https://storj.io/signup?partner=starfish to sign up, or log in https://storj.io/login if you already have an account.

Create a Bucket

Once you have your Storj account you can create a bucket for your data to be stored in.

Navigate to Browse on the left side menu.
Click on the New Bucket button.
Assign the bucket an easily identifiable name, such as "my-bucket".
Optional: Enable Object Lock (required for immutability in many applications).
- If you enable Object Lock, you can also set a default retention period using either Governance or Compliance Mode
Optional: Enable Object Versioning (note that this will be enabled by default if Object Lock is enabled)
Click Create bucket

Generate S3 credentials

Storj has an Amazon S3 compatible API and you'll need generate S3 credentials to use it. S3 credentials consist of an access key, secret key, and endpoint.

Create S3 credentials in the Storj console:

Navigate to Access Keys on the left side menu.
Click the New Access Key button.
When the New Access dialog comes up, set specifications according to the following guidelines:
- Name: The name of the credentials (e.g. my-access)
- Type: S3 Credentials
Choose Full Access or Advanced
- In most cases, you DO NOT want to choose full access.
Provide Access encryption Information
If you have opted out of Storj-managed passphrases for the project you must unlock the bucket with your passphrase. In order to see the data uploaded to your bucket in the Storj console, you must unlock the bucket with the same encryption passphrase as the credentials.
- Use the current passphrase: this is default option
- Advanced: you may provide a different encryption phrase either your own or generate a new one.
  - Enter a new passphrase: use this option, if you would like to provide your own new encryption phrase
  - Generate 12-word passphrase: use this option, if you would like to generate a new encryption phrase
Select the permissions you want to allow:
- Read
- Write
- List
- Delete
Select the object lock permissions you want to allow
- PutObjectRetention
- GetObjectRetention
- BypassGovernanceRetention
- PutObjectLegalHold
- GetObjectLegalHold
- PutObjectLockConfiguration
- GetObjectLockConfiguration
Choose the buckets you want the access to include:
- All Buckets
- Select Buckets
Set an expiration
Click Create Access to finish creation of your S3 credentials
Your S3 credentials are created. Write them down and store them, or click the Download all button. You will need these credentials for the following steps.

Object Lock Permission Details

Permission Name	Description
PutObjectRetention	Allows you to set retention policies, protecting objects from deletion or modification until the retention period expires.
GetObjectRetention	Allows you to view the retention settings of objects, helping ensure compliance with retention policies.
BypassGovernanceRetention	Allows you to bypass governance-mode retention, enabling deletion of objects before the retention period ends.
PutObjectLegalHold	Allows you to place a legal hold on objects, preventing deletion or modification regardless of retention policies.
GetObjectLegalHold	Allows you to view the legal hold status of objects, which is useful for auditing and compliance purposes.
PutObjectLockConfiguration	Allows you to set retention policies on the specified bucket, automatically applying them to every new object added to that bucket.
GetObjectLockConfiguration	Allows you to view the default retention policies configured for the specified bucket.

Connecting Starfish with Storj

The workflow assumes a bucket called sf-test has previously been created within Storj. The following will add Storj as an target for archive jobs.

Select Targets button from the base UI to create a new target
Select Add target and S3
Fill out the target specific details:
- Type: s3
- Endpoint URL: https://gateway.storjshare.io
- Aws access key id: Enter the access key from the S3 credentials you generated in Storj.
- Aws secret access Key: Enter the secret key from the S3 credentials you generated in Storj.
- Bucket name: Enter the name of the bucket created previously
Select Close

Using the CLI, adjust the max_part_size and default_part_size to 64MiB

These options are not available via UI but can also be configured via the REST API.

Run the following command in a terminal:

$sf archive-target update Storj default_part_size=64MiB max_part_size=64MiB

Archive target updated
id: 44
name: Storj
type: s3
params:
aws_access_key_id=jusgujwkr4wkoeijgfhbtn465ara
aws_secret_access_key=***
bucket_name=sf-test
default_part_size=64MiB
dst_path=km-tst
endpoint_url=https://gateway.storjshare.io
max_part_size=64MiB
retries=8
single_read_on_s3_upload=False
timeout=60

$sf archive-target update Storj default_part_size=64MiB max_part_size=64MiB

Archive target updated
id: 44
name: Storj
type: s3
params:
aws_access_key_id=jusgujwkr4wkoeijgfhbtn465ara
aws_secret_access_key=***
bucket_name=sf-test
default_part_size=64MiB
dst_path=km-tst
endpoint_url=https://gateway.storjshare.io
max_part_size=64MiB
retries=8
single_read_on_s3_upload=False
timeout=60