---

copyright:
  years: 2024, 2026

lastupdated: "2026-04-05"

keywords:

subcollection: cloud-logs

---

{{site.data.keyword.attribute-definition-list}}

# Configuring buckets for long-term storage and search

{: #about-bucket}

{{site.data.keyword.logs_full_notm}} uses {{site.data.keyword.cos_full_notm}} buckets to store data and metrics for long-term storage and search.
{: shortdesc}

## About buckets

{: #about-bucket-ov}

{{site.data.keyword.cos_full_notm}} is a highly available, durable, and secure platform for storing unstructured data. The files that are uploaded into {{site.data.keyword.cos_full_notm}} are called objects. Objects can be anywhere from a few bytes up to 10 TB. They are organized into buckets, which serve as containers for objects and can be configured independently from one another in terms of location, resiliency, billing rates, security, and object lifecycle. For more information, see What is {{site.data.keyword.cos_full_notm}}?.

To manage buckets, your user must be granted permissions to work with buckets on the {{site.data.keyword.cos_full_notm}} instance. For more information about roles, see Identity and Access Management roles.

To create a bucket, choose one of the following options:

| Action | More info |
|--------|-----------|
| Create a bucket through the {{site.data.keyword.cloud_notm}} UI | Learn more |
| Create a bucket through the {{site.data.keyword.cloud_notm}} CLI | Learn more |
| Create a bucket by using cURL | Learn more |
| Create a bucket by using the REST API | Learn more |
| Create a bucket with a different storage class by using the REST API | Learn more |
| Create a bucket with Key Protect or Hyper Protect Crypto Services managed encryption keys (SSE-KP) by using the REST API | Learn more |
| Create a bucket by using Terraform | Learn more |
{: caption="Create bucket requests" caption-side="top"}

For more information, see Getting started with {{site.data.keyword.cos_full_notm}}.
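As a rough illustration of the REST API option in the preceding table, the following Python sketch assembles (but does not send) the PUT request that creates a bucket. The endpoint hostname and the `ibm-service-instance-id` header follow the publicly documented COS S3-compatible API, but treat the exact values as assumptions and confirm them in the COS API reference for your region.

```python
# Hedged sketch: composing (not sending) the request that creates a COS
# bucket through the S3-compatible REST API. Endpoint hostname and header
# names are assumptions based on the public COS API shape.

def build_create_bucket_request(bucket_name: str, region: str,
                                instance_crn: str, token: str) -> dict:
    """Return the pieces of a PUT-bucket request as a plain dict."""
    endpoint = f"https://s3.{region}.cloud-object-storage.appdomain.cloud"
    return {
        "method": "PUT",
        "url": f"{endpoint}/{bucket_name}",
        "headers": {
            "Authorization": f"Bearer {token}",
            # Associates the new bucket with a specific COS service instance.
            "ibm-service-instance-id": instance_crn,
        },
    }

req = build_create_bucket_request("my-logs-bucket", "us-south",
                                  "INSTANCE_CRN", "IAM_TOKEN")
```

Sending the request with any HTTP client (or using the `ibm-cos-sdk` package, which wraps this API) requires a valid IAM bearer token and the CRN of your {{site.data.keyword.cos_full_notm}} instance.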

## About buckets with {{site.data.keyword.logs_full_notm}}

{: #about-bucket-clbuckets}

For each {{site.data.keyword.logs_full_notm}} instance, you can configure one data bucket and one metrics bucket.

You can configure the data and metrics buckets in the same region as your {{site.data.keyword.logs_full_notm}} instance or in a different region. The buckets and the {{site.data.keyword.logs_full_notm}} instance can be in the same account or in different accounts.
{: note}

Create buckets with Cross Region resiliency so that data is stored and accessible across multiple geographical regions, which ensures high availability, durability, and disaster recovery capabilities. See Creating and modifying {{site.data.keyword.cos_full_notm}} buckets.

You can configure the same bucket as your data bucket and your metrics bucket. However, consider the following recommendations:

Use different buckets for data and for metrics in production environments.
{: tip}

Use separate buckets for logs and metrics if you have different data retention requirements for logs and metrics.
{: tip}

You are responsible for the buckets and the data that is uploaded into them. You decide how long to keep the data in a bucket.

  • Compliance, corporate, and industry requirements are key inputs that help define how long to keep the data.

  • In {{site.data.keyword.cos_full_notm}}, you can configure object lifecycle policies, including tag-based filters, to automatically delete files from your buckets.

Deleted data is no longer queryable. Before you remove data, ensure that you no longer require it for any queries or processes.
{: attention}

To use different object lifecycle periods for metrics and logs data, you must use different buckets for your log data and your metrics data, and configure the lifecycle policies appropriately.
{: tip}

To use different lifecycle periods for logs data that is ingested through different data pipelines, you must configure archive retention tags in {{site.data.keyword.logs_full_notm}} and lifecycle policies that filter by tag.
{: tip}

All data that is stored in {{site.data.keyword.cos_full_notm}} is automatically encrypted by using randomly generated keys, but some workloads require that the keys can be rotated, deleted, or otherwise controlled by a key management system (KMS) like {{site.data.keyword.keymanagementservicefull}}. Data at rest is encrypted with automatic provider-side Advanced Encryption Standard (AES) 256-bit encryption and the Secure Hash Algorithm (SHA)-256 hash. Data in motion is secured by using built-in carrier-grade Transport Layer Security/Secure Sockets Layer (TLS/SSL) or SNMPv3 with AES encryption. If you want more control over encryption, you can use {{site.data.keyword.keymanagementservicefull}} to manage generated or bring-your-own keys. For more information, see Encrypting a bucket with {{site.data.keyword.keymanagementservicefull}} and Key Protect COS integration.

Data that is stored in the data bucket includes data across all TCO data pipelines: data from {{site.data.keyword.frequent-search}}, {{site.data.keyword.monitoring}}, and {{site.data.keyword.compliance}}. If the data must be protected by customer-managed encryption only, then configure TCO policies to exclusively process data through the {{site.data.keyword.monitoring}} or {{site.data.keyword.compliance}} data pipelines. For more information, see Configuring the TCO Optimizer.
{: attention}

The {{site.data.keyword.cos_full_notm}} service is billed separately from {{site.data.keyword.logs_full_notm}}. {{site.data.keyword.cos_full_notm}} storage costs are determined by the pricing plan{: external} that you choose for the {{site.data.keyword.cos_full_notm}} instance.

{{site.data.keyword.logs_full_notm}} does not support {{site.data.keyword.cos_full_notm}} buckets that are configured with retention policies, object lock policies, or public access enabled, because {{site.data.keyword.logs_full_notm}} requires deletion permissions on the logs and metrics buckets.
{: restriction}

## IAM service-to-service authorization between {{site.data.keyword.logs_full_notm}} and {{site.data.keyword.cos_full_notm}}

{: #about-bucket-s2s}

You must define a service-to-service (S2S) authorization between {{site.data.keyword.logs_full_notm}} and {{site.data.keyword.cos_full_notm}} to allow {{site.data.keyword.logs_full_notm}} to read and write data into the buckets.

For more information, see the IAM documentation about service-to-service authorizations.

## Data bucket

{: #about-bucket-data}

You can configure a data bucket for an {{site.data.keyword.logs_full_notm}} instance. For more information, see Configuring the data bucket.

  • The data bucket stores and retains logs for as long as you need them.

  • If you have regulatory and compliance requirements, check the location where you can create the bucket. Then, if performance is critical, consider creating the bucket in the same region where the {{site.data.keyword.logs_full_notm}} instance is provisioned.

  • You must configure the direct endpoint as the bucket endpoint.

    Direct endpoints are used for requests that originate from resources within VPCs. Direct endpoints provide better performance than public endpoints and do not incur charges for outgoing or incoming bandwidth, even if the traffic crosses regions or data centers. For more information, see Endpoint Types.

  • You are responsible for the maintenance of the data bucket. In {{site.data.keyword.logs_full_notm}}, you can use {{site.data.keyword.cos_full_notm}} object tags to automatically manage the log data in a bucket. For more information, see Deleting files from the data bucket.
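As a small illustration of the endpoint requirement above, the direct endpoint for a regional bucket can be derived from the region name. The hostname pattern below is an assumption based on the published COS endpoint naming scheme; always confirm the exact endpoint in the Endpoint Types reference.

```python
def direct_endpoint(region: str) -> str:
    """Assumed direct-endpoint pattern; resolves only from within IBM Cloud."""
    return f"s3.direct.{region}.cloud-object-storage.appdomain.cloud"

endpoint = direct_endpoint("us-south")
```

A public endpoint for the same region would typically drop the `direct.` label, which is why the two are billed and routed differently.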

### Files uploaded to the data bucket

{: #about-bucket-cl-data-bucket-objects}

Logs are stored as Parquet files with the following structure:

```text
cx/parquet/v1/team_id=<TEAM>/dt=<DT>/hr=<HR>/UUID.parquet
```
{: codeblock}

Metadata is stored in manifest files with this structure:

```text
cx/parquet/v1/_manifest/team_id=<TEAM>/dt=<DT>/hr=<HR>/_manifest/UUID.manifest
```
{: codeblock}

For example:

```text
cx/parquet/v1/team_id=58/dt=2024-12-18/hr=14/_manifest/df7bda51-9a1a-4c67-9f4d-b17f93ec4fd1.manifest
cx/parquet/v1/team_id=58/dt=2024-12-18/hr=14/710bb5f8-0cfc-4706-8aec-27ec7d993af8.parquet
```
{: screen}
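The keys above follow a Hive-style partition layout (`team_id`, `dt`, `hr`), so tools that read the bucket can prune files by team or time window. A minimal parsing sketch (the helper name is ours, not part of the product):

```python
import re

# Matches both parquet and manifest keys from the layouts shown above;
# the optional _manifest segments cover both documented variants.
KEY_RE = re.compile(
    r"^cx/parquet/v1/(?:_manifest/)?team_id=(?P<team>\d+)"
    r"/dt=(?P<dt>\d{4}-\d{2}-\d{2})/hr=(?P<hr>\d{2})"
    r"/(?:_manifest/)?(?P<name>[^/]+)$"
)

def parse_key(key: str) -> dict:
    """Extract team, date, hour, and file name from a data-bucket key."""
    m = KEY_RE.match(key)
    if not m:
        raise ValueError(f"unexpected key layout: {key}")
    return m.groupdict()

info = parse_key(
    "cx/parquet/v1/team_id=58/dt=2024-12-18/hr=14/"
    "710bb5f8-0cfc-4706-8aec-27ec7d993af8.parquet"
)
```

This is handy when, for example, you list objects with an S3-compatible client and want to select only a given day's files.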

### Deleting files from the data bucket

{: #about-bucket-cl-data-bucket-maintain}

In {{site.data.keyword.cos_full_notm}}, you can define expiration rules (lifecycle policies) on buckets. An expiration rule deletes objects after a defined period from the object creation date. The expiration rules for each bucket are evaluated once every 24 hours. Any object that qualifies for expiration, based on the object's expiration date, is queued for deletion. The deletion of expired objects begins the following day and typically takes less than 24 hours.

  • You can configure expiration rules that limit their scope by using one or more filters, such as an object prefix, object tags, or an object size.
  • You can use tags as a filter so that an expiration rule applies only to objects that have a matching tag. The tag filter is a container that specifies a key string and a value string. The key string must be less than 128 characters.
  • If no prefix, tag, or object size filter is configured, the rule applies to all objects in the bucket. For more information, see Deleting stale data with expiration rules.
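An expiration rule therefore combines a filter with an expiration period. As a sketch, here is what a single rule looks like in the S3-style lifecycle configuration that COS exposes; the field names follow the S3 lifecycle schema, and the prefix and 90-day period are illustrative assumptions:

```python
# Illustrative lifecycle configuration with one expiration rule.
# Field names follow the S3-compatible lifecycle API that COS exposes;
# verify them against the COS lifecycle documentation before use.
lifecycle = {
    "Rules": [
        {
            "ID": "expire-logs-after-90-days",
            "Status": "Enabled",
            # Scope the rule with a prefix filter; a tag or object-size
            # filter can be used instead.
            "Filter": {"Prefix": "cx/parquet/v1/"},
            # Delete objects 90 days after their creation date.
            "Expiration": {"Days": 90},
        }
    ]
}
```

A configuration like this is what an S3-compatible client would submit as the bucket's lifecycle configuration.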

In {{site.data.keyword.cos_full_notm}}, you can configure expiration rules (lifecycle policies) to automatically delete object files based on the number of days since the object creation date. However, if you want more granular control over the data that is kept for search in the data bucket, and want to delete files automatically by using different retention periods, you must configure expiration rules in {{site.data.keyword.cos_full_notm}} that limit their scope by using the object tag `ICL_ARCHIVE_RETENTION` with the tag values that you define in your {{site.data.keyword.logs_full_notm}} instance.
{: note}

To use archive retention tags, you must complete the following steps:

  1. In {{site.data.keyword.logs_full_notm}}, configure {{site.data.keyword.cos_full_notm}} object tags to automatically manage how long log data is available for search in the data bucket.

    • You must configure and enable archive retention tags in your {{site.data.keyword.logs_full_notm}} instance. For more information, see Configuring archive retention tags to manage data retention.
    • You can define up to three custom object tags, which you can use to define three different expiration periods on the log data.
    • You can use the default tag to define a default expiration period that applies to data that is not explicitly managed through a custom object tag.

    After you activate archive retention tags, every file in your data bucket is tagged with the custom tag `ICL_ARCHIVE_RETENTION`. The value of the tag is set to a custom tag value or to `default`. This action cannot be undone: retention tags cannot be deactivated after they are enabled.
    {: attention}

  2. In the lifecycle policies section of your {{site.data.keyword.cos_full_notm}} data bucket, configure expiration rules for each tag, including `default`.

    Use the key `ICL_ARCHIVE_RETENTION`.

    The value string must be less than 256 characters. For example, you can use values like `high`, `medium`, and `low`.

    Make sure that the tag names that you configure in {{site.data.keyword.logs_full_notm}} match the tag values that you set in the expiration rules in your bucket. Tag values are case-sensitive.
    {: attention}

  3. In {{site.data.keyword.logs_full_notm}}, configure one or more TCO policies and define the object tag to use with the data that is selected in the policy. If no tag is configured, the `default` tag is used.

    Data that is sent to the log data bucket is uploaded into object files. Each file has one `ICL_ARCHIVE_RETENTION` object tag and value. For more information, see Retention tags.

Archive retention tags are attached to object files that are uploaded into the data bucket after the tags are defined and enabled in the {{site.data.keyword.logs_full_notm}} instance.
{: note}
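The steps above pair one expiration rule with each tag value. A hedged sketch of what those rules look like in the S3-style lifecycle schema that COS exposes; the tag values (`high`, `medium`, `low`, `default`) and day counts are illustrative assumptions, not product defaults:

```python
# Illustrative retention periods per ICL_ARCHIVE_RETENTION tag value.
RETENTION_DAYS = {"high": 365, "medium": 90, "low": 30, "default": 60}

def build_tag_rules(retention: dict) -> list:
    """Build one S3-style expiration rule per retention tag value."""
    return [
        {
            "ID": f"expire-{tag}",
            "Status": "Enabled",
            # Tag values are case-sensitive and must match the tags
            # configured in the Cloud Logs instance.
            "Filter": {"Tag": {"Key": "ICL_ARCHIVE_RETENTION", "Value": tag}},
            "Expiration": {"Days": days},
        }
        for tag, days in retention.items()
    ]

rules = build_tag_rules(RETENTION_DAYS)
```

Submitting `{"Rules": rules}` as the bucket's lifecycle configuration would then expire each file according to the retention tag that {{site.data.keyword.logs_full_notm}} attached to it.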

{{_include-segments/data-bucket-restrictions.md}}

## Metrics bucket

{: #about-bucket-metrics}

You can configure a metrics bucket for an {{site.data.keyword.logs_full_notm}} instance. For more information, see Configuring the metrics bucket.

  • The metrics bucket stores and retains metrics from your events in a long-term index for as long as you need them.

    When you enable metrics, you can generate metrics from logs. These metrics are stored in the metrics bucket as Prometheus index blocks{: external}.

  • If you have regulatory and compliance requirements, check the location where you can create the bucket. Then, if performance is critical, consider creating the bucket in the same region where the {{site.data.keyword.logs_full_notm}} instance is provisioned.

  • You must configure the direct endpoint as the bucket endpoint.

    Direct endpoints are used for requests to a bucket that originate from resources within VPCs. Direct endpoints provide better performance than public endpoints and do not incur charges for outgoing or incoming bandwidth, even if the traffic crosses regions or data centers. For more information, see Endpoint Types.

  • You are responsible for the maintenance of the metrics bucket. In {{site.data.keyword.cos_full_notm}}, you can define an expiration rule to automatically delete stale data from the metrics bucket. For more information, see Deleting stale data with expiration rules.