e2fyi.utils.aws.s3_resource

Provides S3Resource to represent resources in S3 buckets.

Module Contents

Classes

S3Resource

S3Resource represents a resource currently in S3 or a local resource that will be uploaded to S3.

e2fyi.utils.aws.s3_resource.T
e2fyi.utils.aws.s3_resource.StringOrBytes
class e2fyi.utils.aws.s3_resource.S3Resource(filename: str, content_type: str = '', bucketname: str = '', prefix: str = '', protocol: str = 's3a://', stream: S3Stream[StringOrBytes] = None, s3client: boto3.client = None, stats: dict = None, **kwargs)

S3Resource represents a resource currently in S3 or a local resource that will be uploaded to S3. The S3Resource constructor automatically attempts to convert any input into a S3Stream; for more granular control, use S3Stream.from_any to create the S3Stream instead.

S3Resource is a readable stream - i.e. it has read, seek, and close.

Example:

import boto3

from e2fyi.utils.aws import S3Resource, S3Stream

# create custom s3 client
s3client = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY
)

# creates a local copy of s3 resource with S3Stream from a local file
obj = S3Resource(
    # full path should be "prefix/some_file.json"
    filename="some_file.json",
    prefix="prefix/",
    # bucket to download from or upload to
    bucketname="some_bucket",
    # or "s3n://" or "s3://"
    protocol="s3a://",
    # uses default client if not provided
    s3client=s3client,
    # attempts to convert to S3Stream if input is not a S3Stream
    stream=S3Stream.from_file("./some_path/some_file.json"),
    # additional kwargs to pass to the `s3.upload_fileobj` or
    # `s3.download_fileobj` methods
    Metadata={"label": "foo"}
)
print(obj.key)  # prints "prefix/some_file.json"
print(obj.uri)  # prints "s3a://some_bucket/prefix/some_file.json"

# will attempt to fix prefix and filename if incorrect filename is provided
obj = S3Resource(
    filename="subfolder/some_file.json",
    prefix="prefix/"
)
print(obj.filename)     # prints "some_file.json"
print(obj.prefix)       # prints "prefix/subfolder/"
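
The prefix and filename handling shown above can be pictured with a small standard-library sketch. The helpers `normalize` and `build_uri` are hypothetical names used for illustration only; they mimic the printed results of the example and are not the library's actual implementation.

```python
import posixpath


def normalize(filename: str, prefix: str = "") -> tuple:
    """Hypothetical helper: move any folder part of `filename` into `prefix`."""
    # Join first, so "prefix/" + "subfolder/some_file.json" becomes one key.
    key = posixpath.join(prefix, filename) if prefix else filename
    folder, name = posixpath.split(key)
    # Keep a trailing slash on non-empty prefixes, as in the example above.
    new_prefix = folder + "/" if folder else ""
    return new_prefix, name


def build_uri(protocol: str, bucketname: str, prefix: str, filename: str) -> str:
    """Hypothetical helper: compose protocol + bucket + key into a URI."""
    return f"{protocol}{bucketname}/{prefix}{filename}"


prefix, filename = normalize("subfolder/some_file.json", "prefix/")
print(prefix)    # prints "prefix/subfolder/"
print(filename)  # prints "some_file.json"
print(build_uri("s3a://", "some_bucket", "prefix/", "some_file.json"))
# prints "s3a://some_bucket/prefix/some_file.json"
```

The object key is simply the normalized prefix followed by the filename, which is why a folder given in the filename ends up in the prefix.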

Saving to S3:

from e2fyi.utils.aws import S3Resource

# creates a local copy of s3 resource with some python object
obj = S3Resource(
    filename="some_file.txt",
    prefix="prefix/",
    bucketname="some_bucket",
    stream={"some": "dict"},
)

# upload obj to s3 bucket "some_bucket" with the key "prefix/some_file.txt"
# with the JSON string content.
obj.save()

# upload to s3 bucket "another_bucket" instead with a metadata tag.
obj.save("another_bucket", Metadata={"label": "foo"})
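
The save example passes a plain dict as the stream and relies on it being converted to a JSON string. A minimal sketch of that kind of conversion is shown here using only the standard library. The function `to_stream` is a hypothetical stand-in, and the rule that non-string, non-bytes objects are JSON-encoded is an assumption drawn from the comment in the example, not a description of S3Stream.from_any.

```python
import io
import json


def to_stream(data):
    """Hypothetical stand-in for input conversion; not the library's code."""
    if isinstance(data, bytes):
        return io.BytesIO(data)
    if isinstance(data, str):
        return io.StringIO(data)
    # Assumption: other Python objects (dict, list, ...) are JSON-encoded.
    return io.StringIO(json.dumps(data))


stream = to_stream({"some": "dict"})
print(stream.read())  # prints {"some": "dict"}
```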

Reading from S3:

from e2fyi.utils.aws import S3Resource
from pydantic import BaseModel

# do not provide a stream input to the S3Resource constructor
obj = S3Resource(
    filename="some_file.json",
    prefix="prefix/",
    bucketname="some_bucket",
    content_type="application/json"
)

# read the resource like a normal file object from S3
data = obj.read()
print(type(data))       # prints <class 'str'>

# read and load json string into a dict or list
# for content_type == "application/json" only
data_obj = obj.load()
print(type(data_obj))   # prints <class 'dict'> or <class 'list'>


# read and convert into a pydantic model
class Person(BaseModel):
    name: str
    age: int

# automatically unpack the dict
data_obj = obj.load(lambda name, age: Person(name=name, age=age))
# alternatively, do not unpack
data_obj = obj.load(lambda data: Person(**data), unpack=False)
print(type(data_obj))   # prints <class 'Person'>

Creates a new instance of S3Resource, which will use boto3.s3.transfer.S3Transfer under the hood to download/upload the s3 resource.

See https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer

Args:

filename (str): filename of the object.

content_type (str, optional): MIME type of the object. Defaults to “”.

bucketname (str, optional): name of the bucket the object is in or should be uploaded to. Defaults to “”.

prefix (str, optional): prefix to be added to the filename to get the S3 object key. Defaults to “”.

protocol (str, optional): S3 client protocol. Defaults to “s3a://”.

stream (S3Stream[StringOrBytes], optional): data stream. Defaults to None.

s3client (boto3.client, optional): S3 client to use to retrieve the resource. Defaults to None.

Metadata (dict, optional): metadata for the object. Defaults to None.

**kwargs: any additional args to pass to the boto3.s3.transfer.S3Transfer function.

property content_type(self) → str

MIME type of the resource.

property key(self) → str

Key for the resource.

property uri(self) → str

URI to the resource.

property stream(self) → S3Stream[StringOrBytes]

Data stream for the resource.

read(self, size=-1) → StringOrBytes

Duck-typing for a readable stream.
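
To illustrate what duck-typing as a readable stream buys in practice, the sketch below defines a hypothetical `ReadableResource` class (not part of the library) that wraps an in-memory buffer and exposes read, seek, and close. Because `json.load` only calls `.read()` on its argument, such an object can be passed wherever a file object is expected.

```python
import io
import json


class ReadableResource:
    """Hypothetical wrapper exposing read, seek, and close like a file object."""

    def __init__(self, payload: bytes) -> None:
        self._buffer = io.BytesIO(payload)

    def read(self, size: int = -1) -> bytes:
        return self._buffer.read(size)

    def seek(self, offset: int, whence: int = 0) -> int:
        return self._buffer.seek(offset, whence)

    def close(self) -> "ReadableResource":
        self._buffer.close()
        return self


resource = ReadableResource(b'{"name": "Ada"}')
parsed = json.load(resource)  # json.load only needs a .read() method
print(parsed)                 # prints {'name': 'Ada'}
```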

seek(self, offset: int, whence: int = 0) → int

Duck-typing for a readable stream. See https://docs.python.org/3/library/io.html

Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:

SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive

SEEK_CUR or 1 – current stream position; offset may be negative

SEEK_END or 2 – end of the stream; offset is usually negative

Return the new absolute position.
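
The three whence values behave the same way as on any standard-library stream, so they can be tried out on an `io.BytesIO` buffer. This is an illustration of the `io` semantics referenced above, not of S3Resource itself.

```python
import io

stream = io.BytesIO(b"0123456789")

pos_start = stream.seek(4)                  # whence defaults to SEEK_SET
chunk = stream.read(2)                      # reads b"45"; position moves to 6
pos_current = stream.seek(-3, io.SEEK_CUR)  # 6 - 3
pos_end = stream.seek(-2, io.SEEK_END)      # 10 - 2
tail = stream.read()

print(pos_start, chunk, pos_current, pos_end, tail)
# prints 4 b'45' 3 8 b'89'
```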

close(self) → 'S3Resource'

Close the resource stream.

get_value(self) → StringOrBytes

Retrieve the entire contents of the S3Resource.

load(self, constructor: Callable[..., T] = None, unpack: bool = True) → Union[dict, list, T]

Loads the content of the stream into memory using json.loads. If a constructor is provided, it is used to create a new object from the loaded content. When unpack is True, the content is unpacked when it is passed to the constructor (i.e. * for a list, ** for a dict).

Args:
constructor (Callable[..., T], optional): A constructor function. Defaults to None.

unpack (bool, optional): whether to unpack the content when passing it to the constructor. Defaults to True.

Raises:

TypeError: [description]

Returns:

Union[dict, list, T]: the parsed JSON content (a dict or list), or the object created by the constructor when one is provided.
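
The interaction between constructor and unpack can be sketched with the standard library alone. The function `load_json` is a hypothetical stand-in that mirrors the behavior described above; it is not the method's actual source, and a dataclass is used in place of a pydantic model so the sketch has no third-party dependencies.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional


def load_json(
    text: str,
    constructor: Optional[Callable[..., Any]] = None,
    unpack: bool = True,
) -> Any:
    """Hypothetical stand-in mirroring the described behavior of load."""
    content = json.loads(text)
    if constructor is None:
        return content
    if not unpack:
        return constructor(content)
    if isinstance(content, dict):
        return constructor(**content)  # ** for a dict
    if isinstance(content, list):
        return constructor(*content)   # * for a list
    return constructor(content)


@dataclass
class Person:
    name: str
    age: int


raw = '{"name": "Ada", "age": 36}'
print(load_json(raw))          # prints {'name': 'Ada', 'age': 36}
print(load_json(raw, Person))  # prints Person(name='Ada', age=36)
print(load_json(raw, lambda data: Person(**data), unpack=False))
print(load_json('["Ada", 36]', Person))
```

The last two calls also produce `Person(name='Ada', age=36)`: one receives the whole dict as a single argument, the other receives the list items as positional arguments.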

save(self, bucketname: str = None, s3client: boto3.client = None, **kwargs) → 'S3Resource'

Saves the current S3Resource to the provided S3 bucket (given in the constructor or as an argument). Extra args can be passed to boto3.s3.transfer.S3Transfer via keyword arguments of the same name.

See https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS

Args:
bucketname (str, optional): bucket to save the resource to. Overwrites the bucket name provided in the constructor. Defaults to None.

s3client (boto3.client, optional): custom S3 client to use. Defaults to None.

**kwargs: additional args to pass to boto3.s3.transfer.S3Transfer.

Raises:

ValueError: “S3 bucket name must be provided.”

Returns:

S3Resource: S3Resource object.
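
The bucket fallback and keyword forwarding described for save can be sketched as follows. Both `save_stream` and `RecordingClient` are hypothetical names: the client is a stand-in that only records calls, and the sketch assumes the real client exposes boto3's `upload_fileobj(Fileobj, Bucket, Key, ExtraArgs=...)`. It is an illustration of the documented behavior, not the library's implementation.

```python
import io
from typing import Any, Optional


class RecordingClient:
    """Hypothetical stand-in for an S3 client; it only records each call."""

    def __init__(self) -> None:
        self.calls = []

    def upload_fileobj(self, Fileobj, Bucket, Key, ExtraArgs=None):
        self.calls.append((Bucket, Key, ExtraArgs, Fileobj.read()))


def save_stream(
    stream,
    key: str,
    client,
    default_bucket: str = "",
    bucketname: Optional[str] = None,
    **kwargs: Any,
) -> None:
    """Hypothetical sketch of save: pick a bucket, then forward kwargs."""
    # The bucket passed as an argument overrides the constructor's bucket.
    bucket = bucketname or default_bucket
    if not bucket:
        raise ValueError("S3 bucket name must be provided.")
    # Extra keyword arguments travel to the client as ExtraArgs.
    client.upload_fileobj(stream, bucket, key, ExtraArgs=kwargs or None)


client = RecordingClient()
save_stream(io.BytesIO(b"{}"), "prefix/some_file.json", client,
            default_bucket="some_bucket")
save_stream(io.BytesIO(b"{}"), "prefix/some_file.json", client,
            default_bucket="some_bucket", bucketname="another_bucket",
            Metadata={"label": "foo"})
print(client.calls[0][:3])  # prints ('some_bucket', 'prefix/some_file.json', None)
print(client.calls[1][:3])
# prints ('another_bucket', 'prefix/some_file.json', {'Metadata': {'label': 'foo'}})
```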