Notifications
Notifications in Conductor help proactively detect potential issues with a cluster's Dapr installation and Dapr-enabled apps by continuously evaluating signals of interest from the monitoring and logging system.
By default, Conductor provides a set of notification rules that, when triggered, propagate events to the Conductor event stream through the three default channels for metrics, logs, and advisories. In Conductor Enterprise, you can configure and deliver custom notifications by creating the following:
- Rules: Define the criteria and conditions for triggering events of specific severities.
- Channels: Act as subscribers connecting the events generated by a rule with an event consumption channel such as the Conductor UI, email, or webhook.
Notification rules
A notification rule continuously monitors Dapr-enabled app and sidecar telemetry and Dapr log signals to decide whether to fire a notification with a specific severity.
Rules are composed of the following:
- Alert type: The category of data to act on (metrics, logs, or advisor)
- Target type: The target resource to monitor (cluster, application, or component)
- Severity: The severity of the event triggered by the rule (critical or warning)
- Name: A descriptive name that helps identify the rule
- Conditions: The conditions which will trigger the rule to fire an event based on the alert type selected
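The fields above can be pictured as a small record. The following is a minimal sketch for illustration only: the `NotificationRule` class, its field names, and the free-form `conditions` string are hypothetical and do not reflect Conductor's actual API or storage format.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Allowed values taken from the list above.
AlertType = Literal["metrics", "logs", "advisor"]
TargetType = Literal["cluster", "application", "component"]
Severity = Literal["critical", "warning"]


@dataclass(frozen=True)
class NotificationRule:
    """Hypothetical in-memory model of a notification rule."""

    name: str
    alert_type: AlertType
    target_type: TargetType
    severity: Severity
    conditions: str  # assumption: free-form text; the real format is not specified

    def __post_init__(self) -> None:
        # Reject values outside the documented sets.
        if self.alert_type not in get_args(AlertType):
            raise ValueError(f"unknown alert type: {self.alert_type}")
        if self.target_type not in get_args(TargetType):
            raise ValueError(f"unknown target type: {self.target_type}")
        if self.severity not in get_args(Severity):
            raise ValueError(f"unknown severity: {self.severity}")


rule = NotificationRule(
    name="App to Dapr HTTP Error Rate",
    alert_type="metrics",
    target_type="application",
    severity="critical",
    conditions="> 30% for > 2m",
)
```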
When you create a new organization in Conductor, a default set of notification rules and channels is set up out of the box. These rules ensure that critical metrics issues, high-impact advisories, and error logs are surfaced automatically.
Default rules
Conductor Enterprise:
Name | Type | Severity | Target | Conditions |
---|---|---|---|---|
Default App Crash | metrics | critical | all apps | App container in CrashLoopBackOff state for > 2 minutes |
Default Dapr Crash | metrics | critical | all apps | Dapr sidecar (Daprd) in CrashLoopBackOff state for > 2 minutes |
Default App to Dapr HTTP Error Rate | metrics | critical | all apps | > 30% for > 2m |
Default App to Dapr gRPC Error Rate | metrics | critical | all apps | > 30% for > 2m |
Default Dapr to App HTTP Error Rate | metrics | critical | all apps | > 30% for > 2m |
Default Dapr to App gRPC Error Rate | metrics | critical | all apps | > 30% for > 2m |
Default Component Error Rate | metrics | critical | all components | > 30% for > 2m |
Default Dapr Operator Unavailable | metrics | critical | all clusters | > 30% for > 2m |
Default Dapr Sentry Unavailable | metrics | critical | all clusters | 0 sentry pods in Running status for > 2m |
Default Dapr Placement Unavailable | metrics | critical | all clusters | 0 placement pods in Running status for > 2m |
Default Dapr Sidecar Injector Unavailable | metrics | critical | all clusters | 0 injector pods in Running status for > 2m |
Default Dapr Sentry Root Cert Expiry | metrics | critical | all clusters | Dapr Cert detected to expire in < 10d |
Default High Impact Advisories | advisor | critical | all clusters | Any advisories with severity high present for > 15m |
Default Logs | logs | warning | all clusters | Any Daprd logs with level fatal, error, or warning |
Conductor Free:
Name | Type | Severity | Target | Conditions |
---|---|---|---|---|
Default App Crash | metrics | critical | all apps | App container in CrashLoopBackOff state for > 2 minutes |
Default Dapr Crash | metrics | critical | all apps | Dapr sidecar (Daprd) in CrashLoopBackOff state for > 2 minutes |
Default Dapr Operator Unavailable | metrics | critical | all clusters | > 30% for > 2m |
Default Dapr Sentry Unavailable | metrics | critical | all clusters | 0 sentry pods running for > 2m |
Default Dapr Placement Unavailable | metrics | critical | all clusters | 0 placement pods running for > 2m |
Default Dapr Sidecar Injector Unavailable | metrics | critical | all clusters | 0 injector pods running for > 2m |
Default Dapr Sentry Root Cert Expiry | metrics | critical | all clusters | Dapr Cert detected to expire in < 10d |
Default App to Dapr HTTP Error Rate | metrics | warning | all apps | > 0% for > 2m |
Default App to Dapr gRPC Error Rate | metrics | warning | all apps | > 0% for > 2m |
Default Dapr to App HTTP Error Rate | metrics | warning | all apps | > 0% for > 2m |
Default Dapr to App gRPC Error Rate | metrics | warning | all apps | > 0% for > 2m |
Default Component Error Rate | metrics | warning | all components | > 0% for > 2m |
Default Logs | logs | warning | all clusters | Any Daprd logs with level fatal, error, or warning |
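Many of the default conditions have the shape "above a threshold for longer than a duration", for example "> 30% for > 2m". The sketch below shows one plausible way to evaluate such a condition over timestamped samples. It is an illustration only: the `Sample` type and `sustained_breach` function are hypothetical, and Conductor's actual evaluation logic is not described in this document.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    timestamp_s: float  # seconds since an arbitrary epoch
    value: float        # e.g. error rate as a percentage


def sustained_breach(samples: Sequence[Sample], threshold: float,
                     min_duration_s: float) -> bool:
    """Return True if the most recent unbroken run of samples above
    `threshold` spans more than `min_duration_s` seconds."""
    if not samples:
        return False
    ordered = sorted(samples, key=lambda s: s.timestamp_s)
    run_start = None
    # Walk backwards from the newest sample until the value drops
    # to or below the threshold.
    for sample in reversed(ordered):
        if sample.value <= threshold:
            break
        run_start = sample.timestamp_s
    if run_start is None:
        return False  # the newest sample is not breaching
    return ordered[-1].timestamp_s - run_start > min_duration_s


# Error rate sampled every 30 seconds, constantly at 35%.
steady = [Sample(t, 35.0) for t in range(0, 180, 30)]
# Same series, but the rate dipped to 10% at t=60.
dipped = [Sample(t, 10.0 if t == 60 else 35.0) for t in range(0, 180, 30)]
```

With a threshold of 30 and a duration of 120 seconds, `steady` breaches while `dipped` does not, because the dip resets the run.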
Custom Rules
Conductor Enterprise supports creating custom rules using the metrics below:
Target | Name | Description |
---|---|---|
cluster | Dapr Control Plane CPU Usage | The CPU usage of the Dapr control plane services |
cluster | Dapr Control Plane Memory Usage | The memory usage of the Dapr control plane services |
cluster | Dapr Sentry Root Cert Expiry | The remaining time until the Dapr sentry root certificate expires |
application | App to Dapr HTTP Latency 95th Percentile | The 95th percentile latency of HTTP requests to the Dapr API |
application | App to Dapr HTTP RPS | The number of HTTP requests per second to the Dapr API |
application | App to Dapr HTTP Error Rate | The error rate of HTTP requests to the Dapr API |
application | Dapr to App HTTP Latency 95th Percentile | The 95th percentile latency of HTTP requests from Dapr |
application | Dapr to App HTTP RPS | The number of HTTP requests per second from Dapr |
application | Dapr to App HTTP Error Rate | The error rate of HTTP requests from Dapr |
application | App to Dapr gRPC Latency 95th Percentile | The 95th percentile latency of gRPC requests to the Dapr API |
application | App to Dapr gRPC RPS | The number of gRPC requests per second to the Dapr API |
application | App to Dapr gRPC Error Rate | The error rate of gRPC requests to the Dapr API |
application | Dapr to App gRPC Latency 95th Percentile | The 95th percentile latency of gRPC requests from Dapr |
application | Dapr to App gRPC RPS | The number of gRPC requests per second from Dapr |
application | Dapr to App gRPC Error Rate | The error rate of gRPC requests from Dapr |
application | App CPU Usage | The CPU usage of the app |
application | App Memory Usage | The memory usage of the app |
application | Dapr CPU Usage | The CPU usage of the Dapr sidecar |
application | Dapr Memory Usage | The memory usage of the Dapr sidecar |
application | App Container Restart Count | The restart count of the app container |
application | Dapr Sidecar Container Restart Count | The restart count of the Dapr sidecar container |
application | Dapr Sidecar Container Crash Status | The crash status of the Dapr sidecar container |
application | App Container Crash Status | The crash status of the app container |
component | App Component RPS | The number of requests per second made to components by the app |
component | App Component Latency 95 Percentile | The 95th percentile latency of requests made to components by the app |
component | App Component Error Rate | The error rate of requests made to components by an app |
In addition, Conductor Enterprise supports creating custom rules based on Advisor rules, which alert you when a selected advisory is triggered:
Target | Name | Description |
---|---|---|
cluster | High Impact Advisory | One or more high impact advisories are triggered |
cluster | Medium Impact Advisory | One or more medium impact advisories are triggered |
cluster | Low Impact Advisory | One or more low impact advisories are triggered |
cluster | App Health Check | Dapr app health checks are not configured |
cluster | JSON App Logs | Dapr sidecars not set to JSON-formatted logs |
cluster | Authenticate App to Dapr | App to Dapr token authentication not configured |
cluster | Automount Service Account Token | Pod specification has automountServiceAccountToken set to true |
cluster | Renew mTLS Certificate | Dapr mTLS certificates close to or past expiration date |
cluster | Component Scopes | Component has no scopes defined |
cluster | Component Nonexistent Scopes | Component scoped to nonexistent app ID(s) |
cluster | Component Secrets | Component metadata contains sensitive information as plain text |
cluster | Binding Component | Bindings component has no direction defined |
cluster | Control Plane Debug Mode | Dapr control plane is in debug mode |
cluster | JSON Control Plane Logs | Dapr control plane not set to JSON-formatted logs |
cluster | Control Plane Log Level | Dapr control plane log level is debug |
cluster | Control Plane Old Version | Current Dapr version out of support policy |
cluster | Control Plane Resource Values | Dapr control plane resource requests and limits not set |
cluster | Control Plane NonRoot | Dapr control plane runAsNonRoot set to false |
cluster | Control plane logs | Dapr control plane deployed in HA mode with on-disk raft logs |
cluster | Authenticate Dapr to App | Dapr to app token authentication not configured |
cluster | Control Plane High Availability | Dapr control plane not deployed in highly available (HA) configuration |
cluster | High Cardinality GrpcClient Metrics | High cardinality Dapr gRPC client metrics detected |
cluster | High Cardinality GrpcServer Metrics | High cardinality Dapr gRPC server metrics detected |
cluster | High Cardinality HttpClient Metrics | High cardinality Dapr HTTP client metrics detected |
cluster | High Cardinality HttpServer Metrics | High cardinality Dapr HTTP server metrics detected |
cluster | High Cardinality Service Invocation Metrics | High cardinality Dapr service invocation metrics detected |
cluster | Control Plane mTLS | Mutual Authentication (mTLS) is disabled |
cluster | Resource Values Optimization | Application resource settings optimization advisories |
cluster | Sidecar API Logs | Sidecar API logging is enabled |
cluster | Sidecar Log Level | Sidecar log level is debug |
cluster | Sidecar Profiling | Dapr sidecar profiling server is enabled |
cluster | Sidecar Resource Values | Dapr sidecar resource requests and limits not set |
cluster | Operator Watchdog | Dapr operator injector watchdog is disabled |
cluster | Subscription Scopes | Subscription has no scopes defined |
cluster | Dapr New Release | New Dapr version available |
cluster | Pub/Sub Topic Scopes | PubSub component has no topic scopes defined |
cluster | HTTPEndpoint Scopes | HTTPEndpoint has no scopes defined |
cluster | Dapr Placement Service Force In Memory Log | Force in-memory log for Dapr Placement Service in HA deployments |
cluster | App Access Control Policy | Dapr configuration access control policy not configured |
cluster | Service Invocation Max HTTP Request Header Size | Service Invocation maximum HTTP request header size is not configured |
cluster | Deprecated version of component | Component version is deprecated |
Custom rules are not supported in Conductor Free.
Notification channels
Notification channels are the subscribers that receive notifications fired by the rules. All notifications with a matching target and severity will be forwarded to the destination specified in the channel.
Channels are composed of the following:
- Channel ID: An auto-generated, unique identifier for a specific channel
- Alert type: The category of notification events to deliver (metrics, logs, or advisor)
- Severities: The severity level(s) that should be delivered upon firing
- Target clusters: The clusters in which this notification will be active
- Destination: The delivery target where the matching events will be sent
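The matching described above (alert type, severity, and target cluster) can be sketched as follows. The `Event` and `Channel` classes, the `route` function, and the destination strings are hypothetical names chosen for illustration, not part of Conductor's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Event:
    alert_type: str   # "metrics", "logs", or "advisor"
    severity: str     # "critical" or "warning"
    cluster: str
    message: str


@dataclass(frozen=True)
class Channel:
    channel_id: str
    alert_type: str
    severities: FrozenSet[str]
    target_clusters: FrozenSet[str]
    destination: str  # assumption: an opaque label such as "conductor-ui"


def matches(channel: Channel, event: Event) -> bool:
    """An event is delivered only if all three criteria match."""
    return (
        event.alert_type == channel.alert_type
        and event.severity in channel.severities
        and event.cluster in channel.target_clusters
    )


def route(event: Event, channels: Iterable[Channel]) -> List[str]:
    """Return the destinations of every channel that matches the event."""
    return [c.destination for c in channels if matches(c, event)]


channels = [
    Channel("ch-1", "metrics", frozenset({"warning", "critical"}),
            frozenset({"prod"}), "conductor-ui"),
    Channel("ch-2", "metrics", frozenset({"critical"}),
            frozenset({"prod", "staging"}), "webhook"),
]
event = Event("metrics", "critical", "prod", "App container in CrashLoopBackOff")
```

Here `route(event, channels)` yields both destinations, while a `warning` event in `staging` would match neither channel.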
Default channels
By default, three notification channels are created in an organization:
- A channel to send metrics events of severity warning and critical to the Conductor events dashboard.
- A channel to send logs events of severity warning and critical to the Conductor events dashboard.
- A channel to send advisor events of severity warning and critical to the Conductor events dashboard.
Conductor Free only provides support for the default channels. To create custom notification channels, use Conductor Enterprise.
Notification events
Conductor provides a dashboard that displays all notifications sent to the Conductor channel across all connected clusters. The dashboard allows filtering on:
- Type - logs, metrics, or advisor event type.
- Log Level - filter based on log levels error, warning, and fatal. This filter only applies to log event types.
- Time Span - filter events that occurred in the last X hours or X days.
- Metric Severity - filter based on severity values critical and warning. This filter only applies to metrics and advisor event types.
- Cluster - events from a specific cluster.
- Namespace - events from a specific cluster namespace.
- Dapr Application Id - events from a specific application specified by its Dapr App ID.
- Message - performs a fuzzy text-based search of the Message field of events.
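To make the scoping of the filters concrete, the sketch below applies a few of them to a list of events. Several things here are assumptions: events are modeled as plain dictionaries with hypothetical keys, a type-scoped filter is assumed to leave events of other types untouched, and a case-insensitive substring match stands in for the dashboard's fuzzy search, whose algorithm is not documented here.

```python
from typing import Dict, Iterable, List, Optional

EventDict = Dict[str, str]


def filter_events(events: Iterable[EventDict],
                  event_type: Optional[str] = None,
                  log_level: Optional[str] = None,
                  severity: Optional[str] = None,
                  cluster: Optional[str] = None,
                  message: Optional[str] = None) -> List[EventDict]:
    selected = []
    for event in events:
        kind = event["type"]
        if event_type is not None and kind != event_type:
            continue
        # Log level is only checked on log events.
        if log_level is not None and kind == "logs" and event.get("level") != log_level:
            continue
        # Severity is only checked on metrics and advisor events.
        if (severity is not None and kind in ("metrics", "advisor")
                and event.get("severity") != severity):
            continue
        if cluster is not None and event.get("cluster") != cluster:
            continue
        # Stand-in for fuzzy search: case-insensitive substring match.
        if message is not None and message.lower() not in event.get("message", "").lower():
            continue
        selected.append(event)
    return selected


events = [
    {"type": "logs", "level": "error", "cluster": "prod",
     "message": "Failed to connect to state store"},
    {"type": "metrics", "severity": "critical", "cluster": "prod",
     "message": "App container in CrashLoopBackOff"},
    {"type": "advisor", "severity": "warning", "cluster": "dev",
     "message": "Component has no scopes defined"},
]
```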
Clicking on a notification entry brings up the details panel for viewing the full notification details and navigating to the source cluster, application or component that generated the event.
Conductor automatically deduplicates repeated log entries and metric alerts for the same app so that they appear at most once per hour in the Notifications Events dashboard. For example, a log entry or metric alert that a Dapr sidecar emits continuously shows up only once an hour in Conductor. For advisor alerts, the deduplication period is 24 hours, so a persistent advisor alert shows up only once every 24 hours.
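The deduplication windows can be illustrated with a small sketch. The `Deduplicator` class and the choice of key (alert type, app ID, and a message fingerprint) are assumptions made for this example; only the one-hour and 24-hour periods come from the text above.

```python
from typing import Dict, Tuple

# Deduplication periods in seconds: hourly for logs and metrics,
# every 24 hours for advisor alerts.
DEDUP_WINDOW_S = {"logs": 3600, "metrics": 3600, "advisor": 24 * 3600}


class Deduplicator:
    """Suppress repeats of the same alert inside its deduplication window."""

    def __init__(self) -> None:
        self._last_shown: Dict[Tuple[str, str, str], float] = {}

    def should_show(self, alert_type: str, app_id: str,
                    fingerprint: str, now_s: float) -> bool:
        # Assumption: a repeat is identified by alert type, app, and fingerprint.
        key = (alert_type, app_id, fingerprint)
        last = self._last_shown.get(key)
        if last is not None and now_s - last < DEDUP_WINDOW_S[alert_type]:
            return False  # still inside the window: suppress
        self._last_shown[key] = now_s
        return True


dedup = Deduplicator()
shown = [
    dedup.should_show("logs", "orders", "state store timeout", now_s=0),
    dedup.should_show("logs", "orders", "state store timeout", now_s=1800),
    dedup.should_show("logs", "orders", "state store timeout", now_s=3600),
]
```

In this run the first and third calls are shown and the second is suppressed, since it falls 30 minutes after the first.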