Data from External S3 Bucket

ZebClient can expose an external S3 bucket as a directory in the filesystem. This is done with the zebclient "inlet" command.

Adding an Inlet

$ zebclient inlet add mysystem \
   --inletname=public-nasa \
   --inode=2 \
   --dirname=nasa \
   --uid=1000 \
   --gid=1000 \
   --uri=https://s3.us-west-2.amazonaws.com \
   --region=us-west-2 \
   --bucket=nasanex
mount request for inlet 'public-nasa' added
$

The --inode value is the inode number of the parent directory in which the new directory will be created. It can be found with "stat":

$ stat /mnt/zebclient/mydir
  File: .
  Size: 4096            Blocks: 8          IO Block: 65536  directory
Device: 0,75    Inode: 2           Links: 2
Access: (0755/drwxr-xr-x)  Uid: ( 1000/   jonas)   Gid: ( 1000/      js)
Access: 2024-02-14 13:39:04.456198031 +0100
Modify: 2024-02-14 13:39:04.456198031 +0100
Change: 2024-02-14 13:39:04.456198031 +0100
 Birth: -
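
When scripting, the inode number can be captured directly instead of being read from the full stat output. This is a minimal sketch, assuming GNU coreutils stat, where -c %i prints only the inode number (BSD and macOS stat use -f %i instead). The PARENT_DIR variable is a hypothetical name, and the temporary-directory default exists only so the snippet runs anywhere; on a real system it would point at a directory under the mount point, such as /mnt/zebclient/mydir.

```shell
# Directory that should become the parent of the inlet directory.
# Assumption: falls back to a temporary directory for illustration only;
# set PARENT_DIR=/mnt/zebclient/mydir on a real system.
parent_dir="${PARENT_DIR:-$(mktemp -d)}"

# GNU stat: -c %i prints only the inode number.
parent_inode=$(stat -c %i "$parent_dir")

# This is the value to pass to "zebclient inlet add".
echo "use --inode=${parent_inode}"
```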

A new directory "nasa" now appears under that parent directory:

$ ls -lh /mnt/zebclient/mydir/nasa
total 36K
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 AVHRR
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 CMIP5
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 Landsat
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 LOCA
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 MAIAC
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 MODIS
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 NAIP
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 NEX-DCP30
drwxr-xr-x 2 jonas js 4.0K Feb 14  2024 NEX-GDDP

This directory is read-only. It is visible from all agents in the cluster.

Command details

The command "zebclient inlet add --help" lists all command-line switches, including those for setting an S3 access key and secret key, changing the directory and file modes, and scoping the inlet to a prefix (subpath) within a bucket.

$ zebclient inlet add --help
NAME:
   zebclient inlet add - Add an inlet, e.g. external S3 bucket

USAGE:
   zebclient inlet add [command options] <name>

DESCRIPTION:
   This commands gives access to external data from the filesystem tree on all zebclient mounts.
   A new directory will appear that reflects read-only access to the external bucket.

   Example:
   $ zebclient inlet add mysystem \
       --inletname=aws-mybucket \
       --inode=42 \
       --dirname=mypictures \
       --dirmode=0755 \
       --filemode=0644 \
       --uid=1000 \
       --gid=1000 \
       --protocol=s3 \
       --uri=https://s3.us-east-1.amazonaws.com \
       --region=us-east-1 \
       --bucket=mybucket \
       --accesskey=AFIOHTGJWERITYNWREMN \
       --secretkey=Xmdfgi283u4sadkmfgklasdm32498u5384sdkmfa


OPTIONS:
   --env-file value   Path to environment variable file
   --inletname value  The name of the inlet
   --dirname value    The name of the directory to create.
   --inode value      The inode parent to the new directory that the filesystem will create.
                      It is not valid to recursively mount an inlet within another inlet.
                      HINT: You can use for instance "stat" on a directory to see the inode number.
   --dirmode value    The mode that directories should have (default: "0755")
   --filemode value   The mode that files should have (default: "0644")
   --uid value        The user ID which the new files/directories should have
   --gid value        Same as uid, but group id
   --protocol value   What protocol to use (default: "s3")
   --uri value        The S3 service URI, e.g. 'https://s3.us-east-1.amazonaws.com'
   --region value     The S3 region, e.g. 'us-east-1'
   --bucket value     The name of the bucket
   --prefix value     Scope to this prefix within the bucket
   --accesskey value  Optional. The S3 access key for the account. If not set then the bucket need to be accessable without access and secret key
   --secretkey value  Optional. The S3 secret key for the account. If not set then the bucket need to be accessable without access and secret key
$
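
The optional switches combine with the basic form shown earlier. The sketch below assembles an "inlet add" command that scopes the inlet to a prefix and appends credentials only when both are set, then prints the command instead of running it. The variable names INLET_PREFIX, S3_ACCESS_KEY and S3_SECRET_KEY, the inlet name, the directory name and the prefix value are all hypothetical; only the switches themselves come from the help text above. Whether a prefix needs a trailing slash is not stated there, so treat the value as a placeholder.

```shell
# Hypothetical inputs. S3_ACCESS_KEY / S3_SECRET_KEY are illustrative
# variable names for this script, not something zebclient reads itself.
config="mysystem"
prefix="${INLET_PREFIX:-NEX-GDDP}"

# Start with the switches that are enough for a public bucket.
set -- zebclient inlet add "$config" \
    --inletname=nasa-gddp \
    --inode=2 \
    --dirname=gddp \
    --uid=1000 \
    --gid=1000 \
    --uri=https://s3.us-west-2.amazonaws.com \
    --region=us-west-2 \
    --bucket=nasanex \
    --prefix="$prefix"

# Append credentials only when both are present (private buckets).
if [ -n "${S3_ACCESS_KEY:-}" ] && [ -n "${S3_SECRET_KEY:-}" ]; then
    set -- "$@" --accesskey="$S3_ACCESS_KEY" --secretkey="$S3_SECRET_KEY"
fi

# Dry run: print the assembled command. Replace the two lines below
# with "$@" to actually execute it.
cmd_line=$(printf '%s ' "$@")
echo "$cmd_line"
```

Printing first makes it easy to review the switches before anything is mounted.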

The first argument to the command is the name of the zebclient configuration, in this case "mysystem". Zebclient looks for configuration variables whose names start with ZM_MYSYSTEM_, such as the following. The help output above also lists --env-file for pointing at a file of environment variables.

ZM_MYSYSTEM_LICENSE_KEY=FEA234-3EDA12-IIJASD-DJIUSJ-9824DX-IJSD89
ZM_MYSYSTEM_ZEBEC_K=1
ZM_MYSYSTEM_ZEBEC_M=0
ZM_MYSYSTEM_META_URI=redis://dbhost:6379/1
ZM_MYSYSTEM_ZCFS_MOUNTPOINT=/mnt/zebclient
ZM_MYSYSTEM_BACKEND_000_XXX=<backend section> ....
ZM_MYSYSTEM_MEMORY_LIMIT="1GiB"
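
To check which variables a given configuration name will pick up, the prefix can be derived from the name and matched against the current environment. This is a sketch that assumes the mapping is simply the upper-cased name placed between ZM_ and _, as the "mysystem" example suggests; how other characters in a name are handled is not covered here.

```shell
config="mysystem"

# Assumption: the prefix is ZM_<NAME>_ with the name upper-cased,
# as in the mysystem -> ZM_MYSYSTEM_ example above.
var_prefix="ZM_$(printf '%s' "$config" | tr '[:lower:]' '[:upper:]')_"
echo "$var_prefix"

# List matching variable names from the current environment.
# Only names are printed, so values such as the license key stay hidden.
env | grep "^${var_prefix}" | cut -d= -f1 || true
```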

Listing Inlets

To list existing inlets:

$ zebclient inlet list mysystem
        INLET        | INODE | DIRNAME | UID  | GID  | DIRMODE | FILEMODE | PROTOCOL |                   URI                   |  REGION   |   BUCKET
---------------------+-------+---------+------+------+---------+----------+----------+-----------------------------------------+-----------+--------------
  lab-19-testbucket1 |     1 | bucket1 | 1000 | 1000 |    0755 |     0644 | s3       | http://test-lab-19.int.zebware.com:9000 | us-east-1 | testbucket
  lab-19-testbucket2 |     1 | bucket2 | 1000 | 1000 |    0755 |     0644 | s3       | http://test-lab-19.int.zebware.com:9000 | us-east-1 | testbucket2
  lab-19-testbucket3 |     1 | bucket3 | 1000 | 1000 |    0755 |     0644 | s3       | http://test-lab-19.int.zebware.com:9000 | us-east-1 | testbucket3
  lab-19-testbucket4 |     1 | bucket4 | 1000 | 1000 |    0755 |     0644 | s3       | http://test-lab-19.int.zebware.com:9000 | us-east-1 | testbucket4
  lab-19-testbucket5 |     1 | bucket5 | 1000 | 1000 |    0755 |     0644 | s3       | http://test-lab-19.int.zebware.com:9000 | us-east-1 | testbucket5
  minio-1            |     1 | minio-1 | 1000 | 1000 |    0755 |     0644 | s3       | http://localhost:9000                   | us-east-1 | mybucket
  public-nasa        |     1 | nasa    | 1000 | 1000 |    0755 |     0644 | s3       | https://s3.us-west-2.amazonaws.com      | us-west-2 | nasanex
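
For scripting, the inlet names can be pulled out of this table. This is a minimal sketch, assuming the output keeps the layout shown above: one header line, one separator line, then rows separated by "|". The inlet_names function is a hypothetical helper, not part of zebclient, and the sample rows are adapted from the listing above (with narrower columns) so the snippet runs without a cluster.

```shell
# Hypothetical helper: print the first column of each data row,
# skipping the header (line 1) and the separator (line 2).
inlet_names() {
    awk -F'|' 'NR > 2 && NF > 1 { gsub(/^[ \t]+|[ \t]+$/, "", $1); print $1 }'
}

# Sample input. On a real system use instead:
#   zebclient inlet list mysystem | inlet_names
names=$(inlet_names <<'EOF'
    INLET     | INODE | DIRNAME | UID  | GID  | DIRMODE | FILEMODE | PROTOCOL | URI | REGION | BUCKET
--------------+-------+---------+------+------+---------+----------+----------+-----+--------+--------
  minio-1     |     1 | minio-1 | 1000 | 1000 |    0755 |     0644 | s3       | http://localhost:9000 | us-east-1 | mybucket
  public-nasa |     1 | nasa    | 1000 | 1000 |    0755 |     0644 | s3       | https://s3.us-west-2.amazonaws.com | us-west-2 | nasanex
EOF
)
echo "$names"
```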

Removing Inlets

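To remove an inlet, pass its name with --inletname. As with the other inlet commands, the first argument names the configuration; the example below was run against a configuration called "00" rather than "mysystem":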

$ zebclient inlet remove 00 --inletname=public-nasa
unmount request for inlet 'public-nasa' added
$
