OpenTelemetry Collector

The OpenTelemetry Collector offers a vendor-agnostic implementation to receive, process and export telemetry data. Applications instrumented with OpenTelemetry SDKs can use the collector to send telemetry data, like traces and metrics, to SUSE Observability or to another collector (for further processing). By default the collector receives this data via OTLP, the native OpenTelemetry protocol. It can also receive data in other formats produced by other instrumentation SDKs, like Jaeger and Zipkin for traces, and Influx and Prometheus for metrics.

The collector runs close to your application: in the same Kubernetes cluster, on the same virtual machine, etc. This allows SDKs to quickly offload data to the collector, which can then take care of transformations, batching and filtering. A single collector can be used by multiple applications and allows for easy changes to your data processing pipeline.

To install the collector, follow one of the getting started guides. They provide a basic collector configuration to get started; over time you will want to customize the ingestion pipeline and add additional receivers, processors, and exporters to fit your needs.

Configuration

The collector configuration defines pipelines for processing the different telemetry signals. The components in a processing pipeline can be divided into several categories, and each component has its own configuration. This section gives an overview of the different configuration sections and how to use them.

Receivers

Receivers accept telemetry data from instrumented applications, here via OTLP:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

There are many more receivers that accept data via other protocols (for example Zipkin traces), or that actively collect data from various sources, such as:

  • Host metrics

  • Kubernetes metrics

  • Prometheus metrics (OpenMetrics)

  • Databases

Some receivers support all 3 signals (traces, metrics, logs), others support only 1 or 2; the Prometheus receiver, for example, can only collect metrics. The opentelemetry-collector-contrib repository lists all receivers with documentation on their configuration.
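As an illustration, a sketch of a hostmetrics receiver (from the contrib collector) that scrapes CPU and memory metrics; the exact options are described in the receiver's documentation:

```yaml
receivers:
  hostmetrics:
    # How often to collect host metrics
    collection_interval: 30s
    scrapers:
      # Only the CPU and memory scrapers are enabled here
      cpu: {}
      memory: {}
```

Like any receiver, it only becomes active once it is added to the receivers list of a pipeline in service.pipelines.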

Processors

The data from the receivers can be transformed or filtered by processors.

processors:
  batch: {}

The batch processor batches all 3 signals, improving compression and reducing the number of outgoing connections. The opentelemetry-collector-contrib repository lists all processors with documentation on their configuration.

Exporters

To send data to the SUSE Observability backend the collector uses exporters. There are exporters for different protocols, push- or pull-based, and for different backends. Using the OTLP protocol it is also possible to send data to another collector for additional processing.

exporters:
  # The gRPC OTLP exporter
  otlp/suse-observability:
    auth:
      authenticator: bearertokenauth
    # Put in your own OTLP endpoint
    endpoint: <otlp-suse-observability-endpoint>
    # Use snappy compression; if no compression is specified the data will be sent uncompressed
    compression: snappy

The SUSE Observability exporter requires authentication with an API key; this is configured via an authentication extension (see Extensions below). The opentelemetry-collector-contrib repository lists all exporters with documentation on their configuration.

If the gRPC exporter doesn't work for you (see also troubleshooting), you can switch to the slightly less efficient OTLP over HTTP protocol by using the otlphttp exporter instead. Replace all references to otlp/suse-observability with otlphttp/suse-observability (don't forget the references in the pipelines) and update the exporter config to:

exporters:
  # The OTLP over HTTP exporter
  otlphttp/suse-observability:
    auth:
      authenticator: bearertokenauth
    # Put in your own OTLP HTTP endpoint
    endpoint: <otlp-http-suse-observability-endpoint>
    # Use snappy compression; if no compression is specified the data will be sent uncompressed
    compression: snappy

Service pipeline

For each telemetry signal a separate pipeline is configured. The pipelines are configured in the service.pipelines section and define which receivers, processors and exporters should be used, and in which order. Before a component can be used in a pipeline it must first be defined in its configuration section; the batch processor, for example, doesn't have any configuration but still has to be declared in the processors section. Components that are configured but not included in any pipeline are not active at all.

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [debug, spanmetrics, otlp/suse-observability]
    metrics:
      receivers: [otlp, spanmetrics, prometheus]
      processors: [memory_limiter, resource, batch]
      exporters: [debug, otlp/suse-observability]

Extensions

Extensions are not used directly in pipelines for processing data but extend the capabilities of the collector in other ways. For SUSE Observability an extension is used to configure authentication with the API key. Extensions must be defined in a configuration section before they can be used. Similar to the pipeline components, an extension is only active when it is enabled in the service.extensions section.

extensions:
  bearertokenauth:
    scheme: SUSEObservability
    token: "${env:API_KEY}"
service:
  extensions: [ bearertokenauth ]

The opentelemetry-collector-contrib repository lists all extensions with documentation on their configuration.

Transforming telemetry

There are many processors in the opentelemetry-collector-contrib repository. Here we give an overview of commonly used processors and their capabilities; for more details, and many more processors, see the opentelemetry-collector-contrib repository.

Filtering

Some instrumentations or applications may generate a lot of telemetry data that is noisy and unneeded for your use case. The filter processor can be used to drop such data in the collector, to avoid sending it to SUSE Observability. For example, to drop all data of one specific service:

processors:
  filter/ignore-service1:
    error_mode: ignore
    traces:
      span:
        - resource.attributes["service.name"] == "service1"

The filter processor uses the OpenTelemetry Transformation Language (OTTL) to define the filters.

Adding, modifying or deleting attributes

The attributes processor can change attributes of spans, logs or metrics.

processors:
  attributes/accountid:
    actions:
      - key: account_id
        value: 2245
        action: insert

The resource processor can modify the attributes of a resource. For example, to add a Kubernetes cluster name to every resource:

processors:
  resource/add-k8s-cluster:
    attributes:
      - key: k8s.cluster.name
        action: upsert
        value: my-k8s-cluster

For changing metric names and other metric-specific information there is also the metrics transform processor.
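As a sketch, a metrics transform processor configuration that renames a metric could look like this (the metric names are illustrative):

```yaml
processors:
  metricstransform:
    transforms:
      # Rename the metric system.cpu.usage to host.cpu.usage
      - include: system.cpu.usage
        action: update
        new_name: host.cpu.usage
```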

Transformations

The transform processor can be used to, for example, set a span status:

processors:
  transform:
    error_mode: ignore
    trace_statements:
      - set(span.status.code, STATUS_CODE_OK) where span.attributes["http.request.status_code"] == 400

It supports many more transformations, like modifying the span name, converting metric types or modifying log events; see its README for all the possibilities. It uses the OpenTelemetry Transformation Language (OTTL) to define the transformations.

Scrub sensitive data

The collector is the ideal place to remove or obfuscate sensitive data, because it sits right between your applications and SUSE Observability and has processors to filter and transform your data. Next to the filtering and transformation capabilities already discussed, there is also a redaction processor available that can mask attribute values that match a block list. It can also remove attributes that are not on a specified list of allowed attributes; however, using this can quickly result in dropping most attributes, leaving very limited observability capabilities. Note that it does not process resource attributes.

An example that only masks specific attributes and/or values:

processors:
  redaction:
    allow_all_keys: true
    # attributes matching the regexes on the list are masked.
    blocked_key_patterns:
      - ".*token.*"
      - ".*api_key.*"
    blocked_values: # Regular expressions for blocking values of allowed span attributes
      - '4[0-9]{12}(?:[0-9]{3})?' # Visa credit card number
      - '(5[1-5][0-9]{14})' # MasterCard number
    summary: debug

Trying out the collector

The getting started guides show how to deploy the collector to Kubernetes, or using Linux packages, for a production-ready setup. To try out the collector, for example for tests, it can also be run directly as a Docker container:

docker run \
  -p 127.0.0.1:4317:4317 \
  -p 127.0.0.1:4318:4318 \
  -v $(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml \
  ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:latest

This uses the collector contrib image, which includes all contributed components (receivers, processors, etc.). A smaller image is also available, but it has only a very limited set of components:

docker run \
  -p 127.0.0.1:4317:4317 \
  -p 127.0.0.1:4318:4318 \
  -v $(pwd)/config.yaml:/etc/otelcol/config.yaml \
  ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:latest

Note that the Kubernetes installation defaults to the Kubernetes distribution of the collector image, ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s, which has more components than the basic image, but fewer than the contrib image. If you run into missing components with that image you can simply switch to the contrib image, ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib, instead.
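When installing with the OpenTelemetry collector Helm chart, the image can usually be switched via the chart values; a sketch (the exact value names may differ between chart versions):

```yaml
# values.yaml for the OpenTelemetry collector Helm chart
image:
  repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib
```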

Troubleshooting

HTTP requests from the exporter are too big

In some cases requests with telemetry data can become very large and may be refused by SUSE Observability, which has a limit of 4MB per message for the gRPC protocol. If you run into request size limits you can lower the request size by changing the compression algorithm or by limiting the maximum batch size.

HTTP request compression

The getting started guides enable snappy compression on the collector; this does not give the best compression ratio, but it uses less CPU than gzip. If you removed the compression you can enable it again, or you can switch to a compression algorithm that offers a better compression ratio. The same compression types are available for the gRPC and HTTP protocols.
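For example, to trade CPU for a better compression ratio you could switch the exporter from the getting started guides to gzip (reusing the exporter name from the earlier examples):

```yaml
exporters:
  otlp/suse-observability:
    auth:
      authenticator: bearertokenauth
    endpoint: <otlp-suse-observability-endpoint>
    # gzip compresses better than snappy but uses more CPU
    compression: gzip
```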

Max batch size

The HTTP request size can also be reduced by configuring the batch processor to limit the batch size:

processors:
  batch:
    send_batch_size: 8192 # This is the default value
    send_batch_max_size: 10000 # The default is 0, meaning no max size at all

The batch size is defined in number of spans, metric data points, or log records (not in bytes), so you might need some experimentation to find the correct setting for your situation. For more details please refer to the batch processor documentation.

The OpenTelemetry documentation provides many more details on the configuration and alternative installation options:

  • OpenTelemetry Collector configuration: https://opentelemetry.io/docs/collector/configuration/

  • Kubernetes installation of the collector: https://opentelemetry.io/docs/kubernetes/helm/collector/

  • Using the Kubernetes operator instead of the collector Helm chart: https://opentelemetry.io/docs/kubernetes/operator/

  • OpenTelemetry sampling: https://opentelemetry.io/blog/2022/tail-sampling/
