Metrics

The Nomad agent collects various runtime metrics about the performance of different libraries and subsystems. These metrics are aggregated over ten-second intervals and retained for one minute.

This data can be accessed through an HTTP endpoint or by sending a signal to the Nomad process. Over HTTP, it is available at /v1/metrics. See Metrics for more information.
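
As a rough illustration of the HTTP route, the sketch below builds the request URL and indexes the gauges in a decoded JSON response. The agent address `http://127.0.0.1:4646`, the `/v1/metrics` path, and the `Gauges`, `Name`, and `Value` field names are assumptions about a typical agent rather than details given on this page, so check them against the Metrics API documentation.

```python
import json
import urllib.request

# Assumed address of a local Nomad agent; adjust for your cluster.
DEFAULT_ADDR = "http://127.0.0.1:4646"


def metrics_url(addr: str = DEFAULT_ADDR) -> str:
    """Build the URL of the agent's metrics endpoint (path is an assumption)."""
    return addr.rstrip("/") + "/v1/metrics"


def gauges_by_name(payload: dict) -> dict:
    """Map gauge name -> value from a decoded metrics response.

    Labels are ignored in this sketch, so if a gauge appears more than once
    (for example with different labels) the last value wins.
    """
    return {g["Name"]: g["Value"] for g in payload.get("Gauges") or []}


def fetch_gauges(addr: str = DEFAULT_ADDR) -> dict:
    """Fetch the endpoint and return its gauges; needs a reachable agent."""
    with urllib.request.urlopen(metrics_url(addr), timeout=5) as resp:
        return gauges_by_name(json.load(resp))
```

Only `fetch_gauges` touches the network; the other two helpers can be exercised against a saved response.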

To view this data by signal, send USR1 (on Unix) or BREAK (on Windows) to the Nomad process. Once Nomad receives the signal, it dumps the current telemetry information to the agent's stderr.

This telemetry information can be used for debugging or otherwise getting a better view of what Nomad is doing.

Telemetry information can be streamed to both statsite and statsd by providing the appropriate configuration options.

To configure the telemetry output, see the agent configuration.

Below is sample output of a telemetry dump:

[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_blocked': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.plan.queue_depth': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.malloc_count': 7568.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_runs': 8.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_ready': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.num_goroutines': 56.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.sys_bytes': 3999992.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.heap_objects': 4135.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.heartbeat.active': 1.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_unacked': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_waiting': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.alloc_bytes': 634056.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.free_count': 3433.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_pause_ns': 6572135.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.memberlist.msg.alive': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.serf.member.join': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.barrier': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.apply': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.nomad.rpc.query': Count: 2 Sum: 2.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Query': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.fsm.register_node': Count: 1 Sum: 1.296
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Intent': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.runtime.gc_pause_ns': Count: 8 Min: 126492.000 Mean: 821516.875 Max: 3126670.000 Stddev: 1139250.294 Sum: 6572135.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.leader.dispatchLog': Count: 3 Min: 0.007 Mean: 0.018 Max: 0.039 Stddev: 0.018 Sum: 0.054
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcileMember': Count: 1 Sum: 0.007
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcile': Count: 1 Sum: 0.025
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.fsm.apply': Count: 1 Sum: 1.306
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.get_allocs': Count: 1 Sum: 0.110
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.worker.dequeue_eval': Count: 29 Min: 0.003 Mean: 363.426 Max: 503.377 Stddev: 228.126 Sum: 10539.354
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Event': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.commitTime': Count: 3 Min: 0.013 Mean: 0.037 Max: 0.079 Stddev: 0.037 Sum: 0.110
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.barrier': Count: 1 Sum: 0.071
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.register': Count: 1 Sum: 1.626
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.eval.dequeue': Count: 21 Min: 500.610 Mean: 501.753 Max: 503.361 Stddev: 1.030 Sum: 10536.813
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.memberlist.gossip': Count: 12 Min: 0.009 Mean: 0.017 Max: 0.025 Stddev: 0.005 Sum: 0.204
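
A dump like the one above can be turned into structured records with a small parser. This is a minimal sketch based only on the sample lines shown here; it assumes that the `[G]`, `[C]`, and `[S]` markers denote gauges, counters, and timer samples, and the function name and returned dictionary layout are hypothetical.

```python
import re

# [timestamp][kind] 'metric.name': <rest of line>
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<kind>[GCS])\] '(?P<name>[^']+)': (?P<rest>.*)$"
)
# "Count: 3", "Mean: 0.037", ... pairs on counter and sample lines.
FIELD_RE = re.compile(r"(\w+): (-?\d+(?:\.\d+)?)")


def parse_dump_line(line: str):
    """Parse one telemetry dump line; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    kind, rest = match["kind"], match["rest"]
    if kind == "G":
        # Gauge lines carry a single bare number.
        fields = {"Value": float(rest)}
    else:
        # Counter and sample lines carry "Key: number" pairs.
        fields = {key: float(value) for key, value in FIELD_RE.findall(rest)}
    return {
        "timestamp": match["ts"],
        "kind": kind,
        "name": match["name"],
        "fields": fields,
    }
```

Lines that do not follow the sampled layout return `None`, so the parser can be run over mixed stderr output.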

Metric Types

Type Description Quantiles
Gauge Gauge types report an absolute number at the end of the aggregation interval false
Counter Counts are incremented and flushed at the end of the aggregation interval and then are reset to zero true
Timer Timers measure the time to complete a task and will include quantiles, means, standard deviation, etc per interval. true
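
The semantics in the table above can be sketched with a small in-memory aggregator. This is an illustration only, not Nomad's implementation: the class and method names are hypothetical, and the timer summary is limited to count, min, max, and mean.

```python
import statistics


class IntervalAggregator:
    """Toy aggregator mirroring the gauge, counter, and timer semantics."""

    def __init__(self):
        self.gauges = {}    # name -> last absolute value
        self.counters = {}  # name -> running total for the current interval
        self.timers = {}    # name -> durations seen in the current interval

    def set_gauge(self, name, value):
        # A gauge keeps only the most recent absolute value.
        self.gauges[name] = value

    def incr_counter(self, name, amount=1):
        self.counters[name] = self.counters.get(name, 0) + amount

    def add_sample(self, name, duration_ms):
        self.timers.setdefault(name, []).append(duration_ms)

    def flush(self):
        """Report the interval, then reset counters and timers."""
        report = {
            "gauges": dict(self.gauges),
            "counters": dict(self.counters),
            "timers": {
                name: {
                    "count": len(values),
                    "min": min(values),
                    "max": max(values),
                    "mean": statistics.mean(values),
                }
                for name, values in self.timers.items()
            },
        }
        self.counters.clear()
        self.timers.clear()
        return report
```

After a flush, counters start again from zero, while a gauge keeps reporting its last value until it is set again.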

Tagged Metrics

Nomad emits metrics in a tagged format. Each metric can carry more than one tag, so it is possible to match on a tag, such as a particular datacenter, and return all metrics with that tag. Nomad supports labels for namespaces as well.
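
Matching on tags amounts to filtering datapoints by label. A minimal sketch, assuming each datapoint has already been decoded into a dictionary with `name`, `value`, and `labels` keys; that layout and the function name are hypothetical.

```python
def filter_by_labels(metrics, **wanted):
    """Return the metrics whose labels include every wanted key/value pair."""
    return [
        m
        for m in metrics
        if all(m.get("labels", {}).get(key) == value for key, value in wanted.items())
    ]
```

For example, `filter_by_labels(metrics, datacenter="dc1")` keeps only datapoints tagged with that datacenter, and passing several keyword arguments requires all of them to match.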

Key Metrics

The metrics in the table below are the most important metrics for monitoring the overall health of a Nomad cluster.

When telemetry is streamed to statsite or statsd, "interval" in the table below refers to their flush interval. Otherwise, when metrics are retrieved using the signals described above, the interval can be assumed to be 10 seconds.

Metrics Description Unit Type
nomad.runtime.alloc_bytes Memory utilization # of bytes Gauge
nomad.runtime.heap_objects Number of objects on the heap. General memory pressure indicator # of heap objects Gauge
nomad.runtime.num_goroutines Number of goroutines and general load pressure indicator # of goroutines Gauge
nomad.nomad.broker.total_blocked Evaluations that are blocked until an existing evaluation for the same job completes # of evaluations Gauge
nomad.nomad.broker.total_ready Number of evaluations ready to be processed # of evaluations Gauge
nomad.nomad.broker.total_unacked Evaluations dispatched for processing but incomplete # of evaluations Gauge
nomad.nomad.heartbeat.active Number of active heartbeat timers. Each timer represents a Nomad Client connection # of heartbeat timers Gauge
nomad.nomad.heartbeat.invalidate The length of time it takes to invalidate a Nomad Client due to failed heartbeats ms / Heartbeat Invalidation Timer
nomad.nomad.plan.evaluate Time to validate a scheduler Plan. Higher values cause lower scheduling throughput. Similar to nomad.nomad.plan.submit but does not include RPC time or time in the Plan Queue ms / Plan Evaluation Timer
nomad.nomad.plan.queue_depth Number of scheduler Plans waiting to be evaluated # of plans Gauge
nomad.nomad.plan.submit Time to submit a scheduler Plan. Higher values cause lower scheduling throughput ms / Plan Submit Timer
nomad.nomad.rpc.query Number of RPC queries RPC Queries / interval Counter
nomad.nomad.rpc.request_error Number of RPC requests being handled that result in an error RPC Errors / interval Counter
nomad.nomad.rpc.request Number of RPC requests being handled RPC Requests / interval Counter
nomad.nomad.worker.invoke_scheduler.<type> Time to run the scheduler of the given type ms / Scheduler Run Timer
nomad.nomad.worker.wait_for_index Time waiting for Raft log replication from leader. High delays result in lower scheduling throughput ms / Raft Index Wait Timer
nomad.raft.apply Number of Raft transactions Raft transactions / interval Counter
nomad.raft.leader.lastContact Time since last contact to leader. General indicator of Raft latency ms / Leader Contact Timer
nomad.raft.replication.appendEntries Raft transaction commit time ms / Raft Log Append Timer
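
Counters in the table above are reported per interval, so comparing agents with different flush intervals usually means converting to a per-second rate. A minimal sketch, assuming the 10-second default interval mentioned earlier; the function name is hypothetical.

```python
# Aggregation interval assumed when metrics are read via signal.
DEFAULT_INTERVAL_SECONDS = 10.0


def per_second(count, interval_seconds=DEFAULT_INTERVAL_SECONDS):
    """Convert a per-interval counter value into a per-second rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return count / interval_seconds
```

When streaming to statsite or statsd, pass their flush interval as `interval_seconds` instead of relying on the default.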

Client Metrics

The Nomad client emits metrics related to the resource usage of the allocations and tasks running on it and of the node itself. Operators have to explicitly turn on publishing host and allocation metrics. Publishing allocation and host metrics can be turned on by setting publish_allocation_metrics and publish_node_metrics to true.

By default, the collection interval is 1 second, but it can be changed by setting the collection_interval key in the telemetry configuration block.

Please see the agent configuration page for more details.

As of Nomad 0.9, Nomad emits additional labels for parameterized and periodic jobs. The parent job ID is emitted as a new label, parent_id. The labels dispatch_id and periodic_id are also emitted, containing the ID of the specific invocation of the parameterized or periodic job, respectively. For example, a dispatch job with the ID myjob/dispatch-1312323423423 will have the following labels.

Label Value
job myjob/dispatch-1312323423423
parent_id myjob
dispatch_id 1312323423423
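
The relationship between the job ID and these labels can be sketched as a small helper. It only illustrates how the example values relate to one another and is not how Nomad computes them; the function name is hypothetical, and the `periodic-` marker is an assumption made by analogy with the `dispatch-` marker in the example above.

```python
def child_job_labels(job_id: str) -> dict:
    """Derive the labels for a dispatched or periodic child job ID."""
    labels = {"job": job_id}
    parent, sep, child = job_id.rpartition("/")
    if not sep:
        # No "/" in the ID: treat it as a plain job with no parent.
        return labels
    markers = (
        ("dispatch-", "dispatch_id"),
        ("periodic-", "periodic_id"),  # assumed marker, not shown in the example
    )
    for marker, label in markers:
        if child.startswith(marker):
            labels["parent_id"] = parent
            labels[label] = child[len(marker):]
            break
    return labels
```

Calling `child_job_labels("myjob/dispatch-1312323423423")` reproduces the three labels in the table above.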

Host Metrics

Nomad emits tagged host metrics in the format below:

Metric Description Unit Type Labels
nomad.client.allocated.cpu Total amount of CPU shares the scheduler has allocated to tasks MHz Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocated.memory Total amount of memory the scheduler has allocated to tasks Megabytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocated.disk Total amount of disk space the scheduler has allocated to tasks Megabytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.blocked Number of allocations blocked Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.migrating Number of allocations migrating Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.pending Number of allocations pending Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.running Number of allocations running Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.start Number of allocations starting Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocations.terminal Number of allocations terminal Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.allocs.oom_killed Number of allocations OOM killed Integer Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.cpu.idle CPU utilization in idle state Percentage Gauge cpu, datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.cpu.system CPU utilization in system space Percentage Gauge cpu, datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.cpu.total Total CPU utilization Percentage Gauge cpu, datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.cpu.user CPU utilization in user space Percentage Gauge cpu, datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.disk.available Amount of space which is available Bytes Gauge datacenter, disk, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.disk.inodes_percent Disk space consumed by the inodes Percentage Gauge datacenter, disk, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.disk.size Total size of the device Bytes Gauge datacenter, disk, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.disk.used_percent Percentage of disk space used Percentage Gauge datacenter, disk, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.disk.used Amount of space which has been used Bytes Gauge datacenter, disk, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.memory.available Total amount of memory available to processes which includes free and cached memory Bytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.memory.free Amount of memory which is free Bytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.memory.total Total amount of physical memory on the node Bytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.host.memory.used Amount of memory used by processes Bytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.unallocated.cpu Total amount of CPU shares free for the scheduler to allocate to tasks MHz Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.unallocated.disk Total amount of disk space free for the scheduler to allocate to tasks Megabytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.unallocated.memory Total amount of memory free for the scheduler to allocate to tasks Megabytes Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
nomad.client.uptime Uptime of the host running the Nomad client Seconds Gauge datacenter, host, node_class, node_id, node_scheduling_eligibility, node_status
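
The allocated and unallocated gauges above share a unit, so they can be combined into a utilization figure for a node. A minimal sketch; treating the sum of the two gauges as the node's schedulable capacity is an assumption, and the function name is hypothetical.

```python
def allocated_fraction(allocated, unallocated):
    """Fraction of schedulable capacity already allocated, from 0.0 to 1.0.

    Pass a matching pair of gauges, for example nomad.client.allocated.cpu
    and nomad.client.unallocated.cpu for the same node_id.
    """
    total = allocated + unallocated
    if total <= 0:
        # No schedulable capacity reported; avoid dividing by zero.
        return 0.0
    return allocated / total
```

The same helper applies to the memory and disk pairs, provided both values come from the same node and collection interval.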

Allocation Metrics

The following metrics are emitted for each allocation if allocation metrics are enabled. The allocation metrics available may depend on the task driver; not all task drivers can provide all metrics.

Metric Description Unit Type Labels
nomad.client.allocs.cpu.allocated Total CPU resources allocated by the task across all cores MHz Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.system Total CPU resources consumed by the task in system space Percentage Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.throttled_periods Total number of CPU periods that the task was throttled Integer Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.throttled_time Total time that the task was throttled Nanoseconds Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.total_percent Total CPU resources consumed by the task across all cores Percentage Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.total_ticks CPU ticks consumed by the process in the last collection interval Integer Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.cpu.user Total CPU resources consumed by the task in the user space Percentage Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.allocated Amount of memory allocated by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.cache Amount of memory cached by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.kernel_max_usage Maximum amount of memory ever used by the kernel for this task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.kernel_usage Amount of memory used by the kernel for this task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.max_usage Maximum amount of memory ever used by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.rss Amount of RSS memory consumed by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.swap Amount of memory swapped by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
nomad.client.allocs.memory.usage Total amount of memory used by the task Bytes Gauge alloc_id, host, job, namespace, task, task_group
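
One common use of these gauges is spotting tasks that are close to their memory allocation. The sketch below compares nomad.client.allocs.memory.usage with nomad.client.allocs.memory.allocated for each task. The input layout (a list of dictionaries with `usage`, `allocated`, and `labels` keys), the 90% threshold, and the function name are hypothetical choices for illustration.

```python
def memory_pressure(samples, threshold=0.9):
    """Return (alloc_id, task, ratio) for tasks above the usage threshold.

    Each sample pairs one task's usage and allocated gauges, both in bytes.
    """
    hot = []
    for sample in samples:
        allocated = sample["allocated"]
        if allocated <= 0:
            # Skip tasks with no reported allocation.
            continue
        ratio = sample["usage"] / allocated
        if ratio > threshold:
            labels = sample["labels"]
            hot.append((labels["alloc_id"], labels["task"], ratio))
    return hot
```

Because not all task drivers report every metric, samples missing either gauge should be filtered out before calling this helper.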

Job Summary Metrics

Job summary metrics are emitted by the Nomad leader server.

Metric Description Unit Type Labels
nomad.nomad.job_summary.complete Number of complete allocations for a job Integer Gauge host, job, namespace, task_group
nomad.nomad.job_summary.failed Number of failed allocations for a job Integer Gauge host, job, namespace, task_group
nomad.nomad.job_summary.lost Number of lost allocations for a job Integer Gauge host, job, namespace, task_group
nomad.nomad.job_summary.queued Number of queued allocations for a job Integer Gauge host, job, namespace, task_group
nomad.nomad.job_summary.running Number of running allocations for a job Integer Gauge host, job, namespace, task_group
nomad.nomad.job_summary.starting Number of starting allocations for a job Integer Gauge host, job, namespace, task_group

Job Status Metrics

Job status metrics are emitted by the Nomad leader server.

Metric Description Unit Type Labels
nomad.nomad.job_status.dead Number of dead jobs Integer Gauge host
nomad.nomad.job_status.pending Number of pending jobs Integer Gauge host
nomad.nomad.job_status.running Number of running jobs Integer Gauge host

Server Metrics

The following table includes metrics for overall cluster health in addition to those listed in Key Metrics above.

Metric Description Unit Type Labels
nomad.memberlist.gossip Time elapsed to broadcast gossip messages Nanoseconds Summary host
nomad.nomad.acl.bootstrap Time elapsed for ACL.Bootstrap RPC call Nanoseconds Summary host
nomad.nomad.acl.delete_policies Time elapsed for ACL.DeletePolicies RPC call Nanoseconds Summary host
nomad.nomad.acl.delete_tokens Time elapsed for ACL.DeleteTokens RPC call Nanoseconds Summary host
nomad.nomad.acl.get_policies Time elapsed for ACL.GetPolicies RPC call Nanoseconds Summary host
nomad.nomad.acl.get_policy Time elapsed for ACL.GetPolicy RPC call Nanoseconds Summary host
nomad.nomad.acl.get_token Time elapsed for ACL.GetToken RPC call Nanoseconds Summary host
nomad.nomad.acl.get_tokens Time elapsed for ACL.GetTokens RPC call Nanoseconds Summary host
nomad.nomad.acl.list_policies Time elapsed for ACL.ListPolicies RPC call Nanoseconds Summary host
nomad.nomad.acl.list_tokens Time elapsed for ACL.ListTokens RPC call Nanoseconds Summary host
nomad.nomad.acl.resolve_token Time elapsed for ACL.ResolveToken RPC call Nanoseconds Summary host
nomad.nomad.acl.upsert_policies Time elapsed for ACL.UpsertPolicies RPC call Nanoseconds Summary host
nomad.nomad.acl.upsert_tokens Time elapsed for ACL.UpsertTokens RPC call Nanoseconds Summary host
nomad.nomad.alloc.exec Time elapsed to establish alloc exec Nanoseconds Summary host
nomad.nomad.alloc.get_alloc Time elapsed for Alloc.GetAlloc RPC call Nanoseconds Summary host
nomad.nomad.alloc.get_allocs Time elapsed for Alloc.GetAllocs RPC call Nanoseconds Summary host
nomad.nomad.alloc.list Time elapsed for Alloc.List RPC call Nanoseconds Summary host
nomad.nomad.alloc.stop Time elapsed for Alloc.Stop RPC call Nanoseconds Summary host
nomad.nomad.alloc.update_desired_transition Time elapsed for Alloc.UpdateDesiredTransition RPC call Nanoseconds Summary host
nomad.nomad.blocked_evals.cpu Amount of CPU shares requested by blocked evals Integer Gauge datacenter, host, node_class
nomad.nomad.blocked_evals.memory Amount of memory requested by blocked evals Integer Gauge datacenter, host, node_class
nomad.nomad.blocked_evals.job.cpu Amount of CPU shares requested by blocked evals of a job Integer Gauge host, job, namespace
nomad.nomad.blocked_evals.job.memory Amount of memory requested by blocked evals of a job Integer Gauge host, job, namespace
nomad.nomad.blocked_evals.total_blocked Count of evals in the blocked state Integer Gauge host
nomad.nomad.blocked_evals.total_escaped Count of evals that have escaped computed node classes Integer Gauge host
nomad.nomad.blocked_evals.total_quota_limit Count of blocked evals due to quota limits Integer Gauge host
nomad.nomad.broker.batch_ready Count of batch evals ready to be scheduled Integer Gauge host
nomad.nomad.broker.batch_unacked Count of unacknowledged batch evals Integer Gauge host
nomad.nomad.broker.service_ready Count of service evals ready to be scheduled Integer Gauge host
nomad.nomad.broker.service_unacked Count of unacknowledged service evals Integer Gauge host
nomad.nomad.broker.system_ready Count of system evals ready to be scheduled Integer Gauge host
nomad.nomad.broker.system_unacked Count of unacknowledged system evals Integer Gauge host
nomad.nomad.broker.total_ready Count of evals in the ready state Integer Gauge host
nomad.nomad.broker.total_waiting Count of evals in the waiting state Integer Gauge host
nomad.nomad.client.batch_deregister Time elapsed for Node.BatchDeregister RPC call Nanoseconds Summary host
nomad.nomad.client.deregister Time elapsed for Node.Deregister RPC call Nanoseconds Summary host
nomad.nomad.client.derive_si_token Time elapsed for Node.DeriveSIToken RPC call Nanoseconds Summary host
nomad.nomad.client.derive_vault_token Time elapsed for Node.DeriveVaultToken RPC call Nanoseconds Summary host
nomad.nomad.client.emit_events Time elapsed for Node.EmitEvents RPC call Nanoseconds Summary host
nomad.nomad.client.evaluate Time elapsed for Node.Evaluate RPC call Nanoseconds Summary host
nomad.nomad.client.get_allocs Time elapsed for Node.GetAllocs RPC call Nanoseconds Summary host
nomad.nomad.client.get_client_allocs Time elapsed for Node.GetClientAllocs RPC call Nanoseconds Summary host
nomad.nomad.client.get_node Time elapsed for Node.GetNode RPC call Nanoseconds Summary host
nomad.nomad.client.list Time elapsed for Node.List RPC call Nanoseconds Summary host
nomad.nomad.client.register Time elapsed for Node.Register RPC call Nanoseconds Summary host
nomad.nomad.client.stats Time elapsed for Client.Stats RPC call Nanoseconds Summary host
nomad.nomad.client.update_alloc Time elapsed for Node.UpdateAlloc RPC call Nanoseconds Summary host
nomad.nomad.client.update_drain Time elapsed for Node.UpdateDrain RPC call Nanoseconds Summary host
nomad.nomad.client.update_eligibility Time elapsed for Node.UpdateEligibility RPC call Nanoseconds Summary host
nomad.nomad.client.update_status Time elapsed for Node.UpdateStatus RPC call Nanoseconds Summary host
nomad.nomad.client_allocations.garbage_collect_all Time elapsed for ClientAllocations.GarbageCollectAll RPC call Nanoseconds Summary host
nomad.nomad.client_allocations.garbage_collect Time elapsed for ClientAllocations.GarbageCollect RPC call Nanoseconds Summary host
nomad.nomad.client_allocations.restart Time elapsed for ClientAllocations.Restart RPC call Nanoseconds Summary host
nomad.nomad.client_allocations.signal Time elapsed for ClientAllocations.Signal RPC call Nanoseconds Summary host
nomad.nomad.client_allocations.stats Time elapsed for ClientAllocations.Stats RPC call Nanoseconds Summary host
nomad.nomad.client_csi_controller.attach_volume Time elapsed for Controller.AttachVolume RPC call Nanoseconds Summary host
nomad.nomad.client_csi_controller.detach_volume Time elapsed for Controller.DetachVolume RPC call Nanoseconds Summary host
nomad.nomad.client_csi_controller.validate_volume Time elapsed for Controller.ValidateVolume RPC call Nanoseconds Summary host
nomad.nomad.client_csi_node.detach_volume Time elapsed for Node.DetachVolume RPC call Nanoseconds Summary host
nomad.nomad.deployment.allocations Time elapsed for Deployment.Allocations RPC call Nanoseconds Summary host
nomad.nomad.deployment.cancel Time elapsed for Deployment.Cancel RPC call Nanoseconds Summary host
nomad.nomad.deployment.fail Time elapsed for Deployment.Fail RPC call Nanoseconds Summary host
nomad.nomad.deployment.get_deployment Time elapsed for Deployment.GetDeployment RPC call Nanoseconds Summary host
nomad.nomad.deployment.list Time elapsed for Deployment.List RPC call Nanoseconds Summary host
nomad.nomad.deployment.pause Time elapsed for Deployment.Pause RPC call Nanoseconds Summary host
nomad.nomad.deployment.promote Time elapsed for Deployment.Promote RPC call Nanoseconds Summary host
nomad.nomad.deployment.reap Time elapsed for Deployment.Reap RPC call Nanoseconds Summary host
nomad.nomad.deployment.run Time elapsed for Deployment.Run RPC call Nanoseconds Summary host
nomad.nomad.deployment.set_alloc_health Time elapsed for Deployment.SetAllocHealth RPC call Nanoseconds Summary host
nomad.nomad.deployment.unblock Time elapsed for Deployment.Unblock RPC call Nanoseconds Summary host
nomad.nomad.eval.ack Time elapsed for Eval.Ack RPC call Nanoseconds Summary host
nomad.nomad.eval.allocations Time elapsed for Eval.Allocations RPC call Nanoseconds Summary host
nomad.nomad.eval.create Time elapsed for Eval.Create RPC call Nanoseconds Summary host
nomad.nomad.eval.dequeue Time elapsed for Eval.Dequeue RPC call Nanoseconds Summary host
nomad.nomad.eval.get_eval Time elapsed for Eval.GetEval RPC call Nanoseconds Summary host
nomad.nomad.eval.list Time elapsed for Eval.List RPC call Nanoseconds Summary host
nomad.nomad.eval.nack Time elapsed for Eval.Nack RPC call Nanoseconds Summary host
nomad.nomad.eval.reap Time elapsed for Eval.Reap RPC call Nanoseconds Summary host
nomad.nomad.eval.reblock Time elapsed for Eval.Reblock RPC call Nanoseconds Summary host
nomad.nomad.eval.update Time elapsed for Eval.Update RPC call Nanoseconds Summary host
nomad.nomad.file_system.list Time elapsed for FileSystem.List RPC call Nanoseconds Summary host
nomad.nomad.file_system.logs Time elapsed to establish FileSystem.Logs RPC Nanoseconds Summary host
nomad.nomad.file_system.stat Time elapsed for FileSystem.Stat RPC call Nanoseconds Summary host
nomad.nomad.file_system.stream Time elapsed to establish FileSystem.Stream RPC Nanoseconds Summary host
nomad.nomad.fsm.alloc_client_update Time elapsed to apply AllocClientUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.alloc_update_desired_transition Time elapsed to apply AllocUpdateDesiredTransition raft entry Nanoseconds Summary host
nomad.nomad.fsm.alloc_update Time elapsed to apply AllocUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_acl_policy_delete Time elapsed to apply ApplyACLPolicyDelete raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_acl_policy_upsert Time elapsed to apply ApplyACLPolicyUpsert raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_acl_token_bootstrap Time elapsed to apply ApplyACLTokenBootstrap raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_acl_token_delete Time elapsed to apply ApplyACLTokenDelete raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_acl_token_upsert Time elapsed to apply ApplyACLTokenUpsert raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_csi_plugin_delete Time elapsed to apply ApplyCSIPluginDelete raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_csi_volume_batch_claim Time elapsed to apply ApplyCSIVolumeBatchClaim raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_csi_volume_claim Time elapsed to apply ApplyCSIVolumeClaim raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_csi_volume_deregister Time elapsed to apply ApplyCSIVolumeDeregister raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_csi_volume_register Time elapsed to apply ApplyCSIVolumeRegister raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_deployment_alloc_health Time elapsed to apply ApplyDeploymentAllocHealth raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_deployment_delete Time elapsed to apply ApplyDeploymentDelete raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_deployment_promotion Time elapsed to apply ApplyDeploymentPromotion raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_deployment_status_update Time elapsed to apply ApplyDeploymentStatusUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_job_stability Time elapsed to apply ApplyJobStability raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_namespace_delete Time elapsed to apply ApplyNamespaceDelete raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_namespace_upsert Time elapsed to apply ApplyNamespaceUpsert raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_plan_results Time elapsed to apply ApplyPlanResults raft entry Nanoseconds Summary host
nomad.nomad.fsm.apply_scheduler_config Time elapsed to apply ApplySchedulerConfig raft entry Nanoseconds Summary host
nomad.nomad.fsm.autopilot Time elapsed to apply Autopilot raft entry Nanoseconds Summary host
nomad.nomad.fsm.batch_deregister_job Time elapsed to apply BatchDeregisterJob raft entry Nanoseconds Summary host
nomad.nomad.fsm.batch_deregister_node Time elapsed to apply BatchDeregisterNode raft entry Nanoseconds Summary host
nomad.nomad.fsm.batch_node_drain_update Time elapsed to apply BatchNodeDrainUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.cluster_meta Time elapsed to apply ClusterMeta raft entry Nanoseconds Summary host
nomad.nomad.fsm.delete_eval Time elapsed to apply DeleteEval raft entry Nanoseconds Summary host
nomad.nomad.fsm.deregister_job Time elapsed to apply DeregisterJob raft entry Nanoseconds Summary host
nomad.nomad.fsm.deregister_node Time elapsed to apply DeregisterNode raft entry Nanoseconds Summary host
nomad.nomad.fsm.deregister_si_accessor Time elapsed to apply DeregisterSITokenAccessor raft entry Nanoseconds Summary host
nomad.nomad.fsm.deregister_vault_accessor Time elapsed to apply DeregisterVaultAccessor raft entry Nanoseconds Summary host
nomad.nomad.fsm.node_drain_update Time elapsed to apply NodeDrainUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.node_eligibility_update Time elapsed to apply NodeEligibilityUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.node_status_update Time elapsed to apply NodeStatusUpdate raft entry Nanoseconds Summary host
nomad.nomad.fsm.persist Time elapsed to apply Persist raft entry Nanoseconds Summary host
nomad.nomad.fsm.register_job Time elapsed to apply RegisterJob raft entry Nanoseconds Summary host
nomad.nomad.fsm.register_node Time elapsed to apply RegisterNode raft entry Nanoseconds Summary host
nomad.nomad.fsm.update_eval Time elapsed to apply UpdateEval raft entry Nanoseconds Summary host
nomad.nomad.fsm.upsert_node_events Time elapsed to apply UpsertNodeEvents raft entry Nanoseconds Summary host
nomad.nomad.fsm.upsert_scaling_event Time elapsed to apply UpsertScalingEvent raft entry Nanoseconds Summary host
nomad.nomad.fsm.upsert_si_accessor Time elapsed to apply UpsertSITokenAccessors raft entry Nanoseconds Summary host
nomad.nomad.fsm.upsert_vault_accessor Time elapsed to apply UpsertVaultAccessor raft entry Nanoseconds Summary host
nomad.nomad.job.allocations Time elapsed for Job.Allocations RPC call Nanoseconds Summary host
nomad.nomad.job.batch_deregister Time elapsed for Job.BatchDeregister RPC call Nanoseconds Summary host
nomad.nomad.job.deployments Time elapsed for Job.Deployments RPC call Nanoseconds Summary host
nomad.nomad.job.deregister Time elapsed for Job.Deregister RPC call Nanoseconds Summary host
nomad.nomad.job.dispatch Time elapsed for Job.Dispatch RPC call Nanoseconds Summary host
nomad.nomad.job.evaluate Time elapsed for Job.Evaluate RPC call Nanoseconds Summary host
nomad.nomad.job.evaluations Time elapsed for Job.Evaluations RPC call Nanoseconds Summary host
nomad.nomad.job.get_job_versions Time elapsed for Job.GetJobVersions RPC call Nanoseconds Summary host
nomad.nomad.job.get_job Time elapsed for Job.GetJob RPC call Nanoseconds Summary host
nomad.nomad.job.latest_deployment Time elapsed for Job.LatestDeployment RPC call Nanoseconds Summary host
nomad.nomad.job.list Time elapsed for Job.List RPC call Nanoseconds Summary host
nomad.nomad.job.plan Time elapsed for Job.Plan RPC call Nanoseconds Summary host
nomad.nomad.job.register Time elapsed for Job.Register RPC call Nanoseconds Summary host
nomad.nomad.job.revert Time elapsed for Job.Revert RPC call Nanoseconds Summary host
nomad.nomad.job.scale_status Time elapsed for Job.ScaleStatus RPC call Nanoseconds Summary host
nomad.nomad.job.scale Time elapsed for Job.Scale RPC call Nanoseconds Summary host
nomad.nomad.job.stable Time elapsed for Job.Stable RPC call Nanoseconds Summary host
nomad.nomad.job.validate Time elapsed for Job.Validate RPC call Nanoseconds Summary host
nomad.nomad.job_summary.get_job_summary Time elapsed for Job.Summary RPC call Nanoseconds Summary host
nomad.nomad.leader.barrier Time elapsed to establish a raft barrier during leader transition Nanoseconds Summary host
nomad.nomad.leader.reconcileMember Time elapsed to reconcile a serf peer with the state store Nanoseconds Summary host
nomad.nomad.leader.reconcile Time elapsed to reconcile all serf peers with the state store Nanoseconds Summary host
nomad.nomad.namespace.delete_namespaces Time elapsed for Namespace.DeleteNamespaces RPC call Nanoseconds Summary host
nomad.nomad.namespace.get_namespace Time elapsed for Namespace.GetNamespace RPC call Nanoseconds Summary host
nomad.nomad.namespace.get_namespaces Time elapsed for Namespace.GetNamespaces RPC call Nanoseconds Summary host
nomad.nomad.namespace.list_namespace Time elapsed for Namespace.ListNamespaces RPC call Nanoseconds Summary host
nomad.nomad.namespace.upsert_namespaces Time elapsed for Namespace.UpsertNamespaces RPC call Nanoseconds Summary host
nomad.nomad.periodic.force Time elapsed for Periodic.Force RPC call Nanoseconds Summary host
nomad.nomad.plan.apply Time elapsed to apply a plan Nanoseconds Summary host
nomad.nomad.plan.evaluate Time elapsed to evaluate a plan Nanoseconds Summary host
nomad.nomad.plan.queue_depth Count of evals in the plan queue Integer Gauge host
nomad.nomad.plan.submit Time elapsed for Plan.Submit RPC call Nanoseconds Summary host
nomad.nomad.plan.wait_for_index Time elapsed while the planner waits for the raft index of the plan to be processed Nanoseconds Summary host
nomad.nomad.plugin.delete Time elapsed for CSIPlugin.Delete RPC call Nanoseconds Summary host
nomad.nomad.plugin.get Time elapsed for CSIPlugin.Get RPC call Nanoseconds Summary host
nomad.nomad.plugin.list Time elapsed for CSIPlugin.List RPC call Nanoseconds Summary host
nomad.nomad.scaling.get_policy Time elapsed for Scaling.GetPolicy RPC call Nanoseconds Summary host
nomad.nomad.scaling.list_policies Time elapsed for Scaling.ListPolicies RPC call Nanoseconds Summary host
nomad.nomad.search.prefix_search Time elapsed for Search.PrefixSearch RPC call Nanoseconds Summary host
nomad.nomad.vault.create_token Time elapsed to create Vault token Nanoseconds Gauge host
nomad.nomad.vault.distributed_tokens_revoked Count of revoked tokens Integer Gauge host
nomad.nomad.vault.lookup_token Time elapsed to look up Vault token Nanoseconds Gauge host
nomad.nomad.vault.renew_failed Count of failed attempts to renew Vault token Integer Gauge host
nomad.nomad.vault.renew Time elapsed to renew Vault token Nanoseconds Gauge host
nomad.nomad.vault.revoke_tokens Time elapsed to revoke Vault tokens Nanoseconds Gauge host
nomad.nomad.vault.token_ttl Time to live for Vault token Integer Gauge host
nomad.nomad.vault.undistributed_tokens_abandoned Count of abandoned tokens Integer Gauge host
nomad.nomad.volume.claim Time elapsed for CSIVolume.Claim RPC call Nanoseconds Summary host
nomad.nomad.volume.deregister Time elapsed for CSIVolume.Deregister RPC call Nanoseconds Summary host
nomad.nomad.volume.get Time elapsed for CSIVolume.Get RPC call Nanoseconds Summary host
nomad.nomad.volume.list Time elapsed for CSIVolume.List RPC call Nanoseconds Summary host
nomad.nomad.volume.register Time elapsed for CSIVolume.Register RPC call Nanoseconds Summary host
nomad.nomad.volume.unpublish Time elapsed for CSIVolume.Unpublish RPC call Nanoseconds Summary host
nomad.nomad.worker.create_eval Time elapsed for worker to create an eval Nanoseconds Summary host
nomad.nomad.worker.dequeue_eval Time elapsed for worker to dequeue an eval Nanoseconds Summary host
nomad.nomad.worker.invoke_scheduler_service Time elapsed for worker to invoke the scheduler Nanoseconds Summary host
nomad.nomad.worker.send_ack Time elapsed for worker to send acknowledgement Nanoseconds Summary host
nomad.nomad.worker.submit_plan Time elapsed for worker to submit plan Nanoseconds Summary host
nomad.nomad.worker.update_eval Time elapsed for worker to submit updated eval Nanoseconds Summary host
nomad.nomad.worker.wait_for_index Time elapsed while the worker waits for the raft index of the eval to be processed Nanoseconds Summary host
nomad.raft.appliedIndex Current index applied to FSM Integer Gauge host
nomad.raft.barrier Count of blocking raft API calls Integer Counter host
nomad.raft.commitNumLogs Count of logs enqueued Integer Gauge host
nomad.raft.commitTime Time elapsed to commit writes Nanoseconds Summary host
nomad.raft.fsm.apply Time elapsed to apply write to FSM Nanoseconds Summary host
nomad.raft.fsm.enqueue Time elapsed to enqueue write to FSM Nanoseconds Summary host
nomad.raft.lastIndex Most recent index seen Integer Gauge host
nomad.raft.leader.dispatchLog Time elapsed to write log, mark in flight, and start replication Nanoseconds Summary host
nomad.raft.leader.dispatchNumLogs Count of logs dispatched Integer Gauge host
nomad.raft.replication.appendEntries Raft transaction commit time ms / Raft Log Append Timer
nomad.raft.state.candidate Count of entering candidate state Integer Gauge host
nomad.raft.state.follower Count of entering follower state Integer Gauge host
nomad.raft.state.leader Count of entering leader state Integer Gauge host
nomad.raft.transition.heartbeat_timeout Count of failing to heartbeat and starting election Integer Gauge host
nomad.raft.transition.leader_lease_timeout Count of stepping down as leader after losing quorum Integer Gauge host
nomad.runtime.free_count Count of objects freed from heap by Go runtime GC Integer Gauge host
nomad.runtime.gc_pause_ns Go runtime GC pause times Nanoseconds Summary host
nomad.runtime.sys_bytes Go runtime GC metadata size # of bytes Gauge host
nomad.runtime.total_gc_pause_ns Total elapsed Go runtime GC pause times Nanoseconds Gauge host
nomad.runtime.total_gc_runs Count of Go runtime GC runs Integer Gauge host
nomad.serf.queue.Event Count of memberlist events received Integer Summary host
nomad.serf.queue.Intent Count of memberlist changes Integer Summary host
nomad.serf.queue.Query Count of memberlist queries Integer Summary host
nomad.state.snapshotIndex Current snapshot index Integer Gauge host
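
As a rough illustration of how the rows above might be consumed, the following Python sketch reads a metrics payload, converts Summary means to milliseconds, and looks up a Gauge by name. The payload is a hand-written sample, not real agent output: the `Gauges`, `Samples`, `Name`, `Value`, `Count`, and `Mean` field names and the helper functions `summary_means_ms` and `gauge_value` are assumptions made for this example. The conversion takes the Nanoseconds unit listed in the table at face value; check the unit against your agent's actual output before relying on it.

```python
import json

# Hypothetical payload shaped like a JSON metrics response. The field names
# (Gauges, Samples, Name, Value, Count, Mean) are assumptions for this sketch,
# and the numbers are invented sample values.
SAMPLE_PAYLOAD = json.loads("""
{
  "Gauges": [
    {"Name": "nomad.nomad.plan.queue_depth", "Value": 3}
  ],
  "Samples": [
    {"Name": "nomad.nomad.plan.submit", "Count": 4, "Mean": 2500000.0},
    {"Name": "nomad.nomad.worker.dequeue_eval", "Count": 2, "Mean": 1200000000.0}
  ]
}
""")

NS_PER_MS = 1_000_000  # assumes Summary values are in nanoseconds, as the table lists


def summary_means_ms(payload, prefix):
    """Return {metric name: mean in milliseconds} for Summary metrics under prefix."""
    return {
        sample["Name"]: sample["Mean"] / NS_PER_MS
        for sample in payload.get("Samples", [])
        if sample["Name"].startswith(prefix)
    }


def gauge_value(payload, name):
    """Return the value of the named Gauge, or None if it is absent."""
    for gauge in payload.get("Gauges", []):
        if gauge["Name"] == name:
            return gauge["Value"]
    return None


if __name__ == "__main__":
    means = summary_means_ms(SAMPLE_PAYLOAD, "nomad.nomad.")
    for name, mean_ms in sorted(means.items()):
        print(f"{name}: {mean_ms:.1f} ms")
    depth = gauge_value(SAMPLE_PAYLOAD, "nomad.nomad.plan.queue_depth")
    print("plan queue depth:", depth)
```

Filtering by prefix keeps related rows together (for example, everything under `nomad.nomad.worker.`), which mirrors how the table groups metrics by subsystem.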