Refactor Proxmox and Process Monitor configurations for improved Joanna dispatch logic and update README with new automation references

pull/1719/head
Carlo Costanzo 1 month ago
parent ee5238ce72
commit 11d3050f23

@ -48,13 +48,13 @@ Live collection of plug-and-play Home Assistant packages. Each YAML file in this
| [mariadb_monitoring.yaml](mariadb_monitoring.yaml) | MariaDB health sensors and Lovelace dashboard snippet for recorder stats. | `sensor.mariadb_status`, `sensor.database_size` | | [mariadb_monitoring.yaml](mariadb_monitoring.yaml) | MariaDB health sensors and Lovelace dashboard snippet for recorder stats. | `sensor.mariadb_status`, `sensor.database_size` |
| [docker_infrastructure.yaml](docker_infrastructure.yaml) | Docker host patching telemetry + container/stack Repairs automation, 20-minute Joanna escalation for persistent container outages using stable configured monitor membership, and weekly scheduled prune actions across docker_10/14/17/69. | `sensor.docker_*_apt_status`, `binary_sensor.*_stack_status`, `sensor.docker_stacks_down_count`, `repairs.create`, `script.joanna_dispatch` | | [docker_infrastructure.yaml](docker_infrastructure.yaml) | Docker host patching telemetry + container/stack Repairs automation, 20-minute Joanna escalation for persistent container outages using stable configured monitor membership, and weekly scheduled prune actions across docker_10/14/17/69. | `sensor.docker_*_apt_status`, `binary_sensor.*_stack_status`, `sensor.docker_stacks_down_count`, `repairs.create`, `script.joanna_dispatch` |
| [github_watched_repo_scout.yaml](github_watched_repo_scout.yaml) | Nightly Joanna dispatch that reviews unread notifications from watched GitHub repos, recommends HA-config ideas, refreshes strong-candidate issues, and marks processed watched-repo notifications read. | `automation.github_watched_repo_scout_nightly`, `script.joanna_dispatch`, `script.send_to_logbook` | | [github_watched_repo_scout.yaml](github_watched_repo_scout.yaml) | Nightly Joanna dispatch that reviews unread notifications from watched GitHub repos, recommends HA-config ideas, refreshes strong-candidate issues, and marks processed watched-repo notifications read. | `automation.github_watched_repo_scout_nightly`, `script.joanna_dispatch`, `script.send_to_logbook` |
| [proxmox.yaml](proxmox.yaml) | Proxmox runtime and disk pressure monitoring with Repairs for node degradations plus nightly Frigate reboot. | `binary_sensor.proxmox*_runtime_healthy`, `sensor.proxmox*_disk_used_percentage`, `repairs.create`, `button.qemu_docker2_101_reboot` | | [proxmox.yaml](proxmox.yaml) | Proxmox runtime and disk pressure monitoring with Repairs + Joanna dispatch for sustained node degradations, plus nightly Frigate reboot. | `binary_sensor.proxmox*_runtime_healthy`, `sensor.proxmox*_disk_used_percentage`, `repairs.create`, `script.joanna_dispatch`, `button.qemu_docker2_101_reboot` |
| [synology_dsm.yaml](synology_dsm.yaml) | Synology DSM integration health normalization for Carlo-NAS01 and Carlo-NVR, with Repairs + Joanna dispatch on sustained integration, security, or storage problems. | `binary_sensor.carlo_*_synology_problem`, `sensor.carlo_*_synology_problem_summary`, `repairs.create`, `script.joanna_dispatch` | | [synology_dsm.yaml](synology_dsm.yaml) | Synology DSM integration health normalization for Carlo-NAS01 and Carlo-NVR, with Repairs + Joanna dispatch on sustained integration, security, or storage problems. | `binary_sensor.carlo_*_synology_problem`, `sensor.carlo_*_synology_problem_summary`, `repairs.create`, `script.joanna_dispatch` |
| [infrastructure_observability.yaml](infrastructure_observability.yaml) | Normalized WAN/DNS/backup/domain/cert health + website uptime/latency SLO signals for Infrastructure dashboards. | `binary_sensor.infra_website_uptime_slo_breach`, `binary_sensor.infra_website_latency_degraded`, `binary_sensor.infra_*` | | [infrastructure_observability.yaml](infrastructure_observability.yaml) | Normalized WAN/DNS/backup/domain/cert health + website uptime/latency SLO signals for Infrastructure dashboards. | `binary_sensor.infra_website_uptime_slo_breach`, `binary_sensor.infra_website_latency_degraded`, `binary_sensor.infra_*` |
| [onenote_indexer.yaml](onenote_indexer.yaml) | OneNote indexer health/status monitoring for Joanna, failure-repair automation, and a daily duplicate-delete maintenance request. | `sensor.onenote_indexer_last_job_status`, `binary_sensor.onenote_indexer_last_job_successful` | | [onenote_indexer.yaml](onenote_indexer.yaml) | OneNote indexer health/status monitoring for Joanna, failure-repair automation, and a daily duplicate-delete maintenance request. | `sensor.onenote_indexer_last_job_status`, `binary_sensor.onenote_indexer_last_job_successful` |
| [mqtt_status.yaml](mqtt_status.yaml) | Command-line MQTT broker reachability probe with Spook Repairs escalation and Joanna troubleshooting dispatch on outage. | `binary_sensor.mqtt_status_raw`, `binary_sensor.mqtt_broker_problem`, `repairs.create`, `rest_command.bearclaw_command` | | [mqtt_status.yaml](mqtt_status.yaml) | Command-line MQTT broker reachability probe with Spook Repairs escalation and Joanna troubleshooting dispatch on outage. | `binary_sensor.mqtt_status_raw`, `binary_sensor.mqtt_broker_problem`, `repairs.create`, `rest_command.bearclaw_command` |
| [mariadb.yaml](mariadb.yaml) | MariaDB recorder health and capacity SQL sensors. | `sensor.mariadb_status`, `sensor.database_size` | | [mariadb.yaml](mariadb.yaml) | MariaDB recorder health and capacity SQL sensors. | `sensor.mariadb_status`, `sensor.database_size` |
| [processmonitor.yaml](processmonitor.yaml) | Root filesystem disk-pressure monitoring with early Joanna review at 80% and Repairs + urgent dispatch at 90%. | `sensor.disk_use_percent`, `repairs.create`, `script.joanna_dispatch`, `tts.clear_cache` | | [processmonitor.yaml](processmonitor.yaml) | Root filesystem disk-pressure monitoring with immediate digest/logbook notes at 80%, Joanna review after 10 minutes above 80%, and delayed phone alerts only if the issue stays unresolved after dispatch. | `sensor.disk_use_percent`, `repairs.create`, `script.joanna_dispatch`, `tts.clear_cache` |
| [tugtainer_updates.yaml](tugtainer_updates.yaml) | Tugtainer container update notifications via webhook + persistent alerts, plus event-based Joanna dispatch when reports include `### Available:` (24h cooldown via `mode: single` + delay, no new helpers). | `persistent_notification.create`, `event: tugtainer_available_detected`, `script.joanna_dispatch`, `input_datetime.tugtainer_last_update` | | [tugtainer_updates.yaml](tugtainer_updates.yaml) | Tugtainer container update notifications via webhook + persistent alerts, plus event-based Joanna dispatch when reports include `### Available:` (24h cooldown via `mode: single` + delay, no new helpers). | `persistent_notification.create`, `event: tugtainer_available_detected`, `script.joanna_dispatch`, `input_datetime.tugtainer_last_update` |
| [bearclaw.yaml](bearclaw.yaml) | Joanna/BearClaw bridge automations that forward Telegram commands to codex_appliance, include LLM-first routing context for freeform text, relay replies back, ingest `/api/bearclaw/status` telemetry, and expose dispatch plus QMD/memory-index sensors for Infrastructure dashboards. | `rest_command.bearclaw_*`, `sensor.bearclaw_status_telemetry`, `sensor.joanna_*`, `binary_sensor.joanna_*`, `automation.bearclaw_*`, `script.send_to_logbook` | | [bearclaw.yaml](bearclaw.yaml) | Joanna/BearClaw bridge automations that forward Telegram commands to codex_appliance, include LLM-first routing context for freeform text, relay replies back, ingest `/api/bearclaw/status` telemetry, and expose dispatch plus QMD/memory-index sensors for Infrastructure dashboards. | `rest_command.bearclaw_*`, `sensor.bearclaw_status_telemetry`, `sensor.joanna_*`, `binary_sensor.joanna_*`, `automation.bearclaw_*`, `script.send_to_logbook` |
| [telegram_bot.yaml](telegram_bot.yaml) | Legacy Telegram transport marker for BearClaw; the shared `joanna_send_telegram` helper now forwards through the codex_appliance direct Telegram API. | `rest_command.bearclaw_telegram_send`, `script.joanna_send_telegram` | | [telegram_bot.yaml](telegram_bot.yaml) | Legacy Telegram transport marker for BearClaw; the shared `joanna_send_telegram` helper now forwards through the codex_appliance direct Telegram API. | `rest_command.bearclaw_telegram_send`, `script.joanna_send_telegram` |

@ -8,14 +8,15 @@
# ------------------------------------------------------------------- # -------------------------------------------------------------------
# - Blog: https://www.vcloudinfo.com/2026/04/joanna-agent-engineer-home-assistant-infrastructure-dispatch.html # - Blog: https://www.vcloudinfo.com/2026/04/joanna-agent-engineer-home-assistant-infrastructure-dispatch.html
# Notes: Uses `sensor.disk_use_percent` for the root (`/`) filesystem. # Notes: Uses `sensor.disk_use_percent` for the root (`/`) filesystem.
# Notes: 80% usage triggers cleanup-oriented notification + Joanna review. # Notes: 80% usage writes an immediate activity note; Joanna reviews only after 10 minutes above threshold.
# Notes: Phone alerts happen only after Joanna dispatch and a short unresolved grace period.
# Notes: 90% usage opens a Repairs issue and dispatches Joanna for urgent triage. # Notes: 90% usage opens a Repairs issue and dispatches Joanna for urgent triage.
###################################################################### ######################################################################
automation: automation:
- alias: "Self Heal Disk Use Alarm" - alias: "Self Heal Disk Use Alarm"
id: b16f2155-4688-4c0f-9cf8-b382e294a029 id: b16f2155-4688-4c0f-9cf8-b382e294a029
description: "Warn on elevated root disk usage and request Joanna review before it becomes critical." description: "Log elevated root disk usage immediately so transient pressure shows up in the digest."
mode: single mode: single
trigger: trigger:
- platform: numeric_state - platform: numeric_state
@ -24,36 +25,65 @@ automation:
variables: variables:
mount_path: "/" mount_path: "/"
disk_use: "{{ states('sensor.disk_use_percent') | float(0) | round(1) }}" disk_use: "{{ states('sensor.disk_use_percent') | float(0) | round(1) }}"
trigger_context: "HA automation b16f2155-4688-4c0f-9cf8-b382e294a029 (Self Heal Disk Use Alarm)"
action: action:
- service: script.notify_engine
data:
value1: "Hard Drive Monitor:"
value2: "Your harddrive is running out of Space! {{ mount_path }}:{{ disk_use }}%!"
value3: "Attempting to clean"
who: "carlo"
- service: script.send_to_logbook - service: script.send_to_logbook
data: data:
topic: "SYSTEM" topic: "SYSTEM"
message: "Disk usage exceeded 80% ({{ mount_path }}: {{ disk_use }}%). Attempting to clean." message: "Disk usage exceeded 80% ({{ mount_path }}: {{ disk_use }}%). Monitoring for sustained pressure."
- service: tts.clear_cache - service: tts.clear_cache
- condition: template
value_template: "{{ disk_use | float(0) < 90 }}" - alias: "Self Heal Disk Use Joanna Review"
id: processmonitor_disk_use_joanna_review
description: "Dispatch Joanna when elevated root disk usage remains above 80% for 10 minutes."
mode: single
trigger:
- platform: numeric_state
entity_id: sensor.disk_use_percent
above: 80
for:
minutes: 10
variables:
mount_path: "/"
disk_use: "{{ states('sensor.disk_use_percent') | float(0) | round(1) }}"
trigger_context: "HA automation processmonitor_disk_use_joanna_review (Self Heal Disk Use Joanna Review)"
condition:
- condition: numeric_state
entity_id: sensor.disk_use_percent
below: 90
action:
- service: script.joanna_dispatch - service: script.joanna_dispatch
data: data:
trigger_context: "{{ trigger_context }}" trigger_context: "{{ trigger_context }}"
source: "home_assistant_automation.self_heal_disk_use_alarm" source: "home_assistant_automation.processmonitor_disk_use_joanna_review"
summary: "Home Assistant root disk usage exceeded 80%" summary: "Home Assistant root disk usage remained above 80% for 10 minutes"
entity_ids: entity_ids:
- "sensor.disk_use_percent" - "sensor.disk_use_percent"
diagnostics: >- diagnostics: >-
mount_path={{ mount_path }}, mount_path={{ mount_path }},
disk_use={{ disk_use }}, disk_use={{ disk_use }},
threshold=80 threshold=80,
sustained_for=10m
request: >- request: >-
Review Home Assistant disk growth and recommend safe cleanup actions. Review Home Assistant disk growth and recommend safe cleanup actions.
Check recorder/database size, logs, cache, backups, and temporary files. Check recorder/database size, logs, cache, backups, and temporary files.
Do not restart Home Assistant or remove data unless explicitly requested. Do not restart Home Assistant or remove data unless explicitly requested.
- service: script.send_to_logbook
data:
topic: "SYSTEM"
message: >-
Disk usage remained above 80% for 10 minutes ({{ mount_path }}: {{ disk_use }}%).
Joanna review requested.
- delay: "00:05:00"
- condition: numeric_state
entity_id: sensor.disk_use_percent
above: 80
below: 90
- service: script.notify_engine
data:
value1: "Hard Drive Monitor:"
value2: "Joanna is reviewing sustained Home Assistant disk usage at {{ mount_path }}:{{ states('sensor.disk_use_percent') | float(0) | round(1) }}%."
value3: "No phone alert was sent until the issue stayed unresolved."
who: "carlo"
- alias: "Disk Use Alarm" - alias: "Disk Use Alarm"
id: 1ce3cb43-0e27-4c53-acdd-d672396f3559 id: 1ce3cb43-0e27-4c53-acdd-d672396f3559
@ -69,17 +99,6 @@ automation:
disk_use: "{{ states('sensor.disk_use_percent') | float(0) | round(1) }}" disk_use: "{{ states('sensor.disk_use_percent') | float(0) | round(1) }}"
trigger_context: "HA automation 1ce3cb43-0e27-4c53-acdd-d672396f3559 (Disk Use Alarm)" trigger_context: "HA automation 1ce3cb43-0e27-4c53-acdd-d672396f3559 (Disk Use Alarm)"
action: action:
- service: script.notify_engine
data:
value1: "Hard Drive Monitor:"
value2: "Your harddrive is running out of Space! {{ mount_path }}:{{ disk_use }}%!"
who: "carlo"
- service: script.send_to_logbook
data:
topic: "SYSTEM"
message: >-
Disk usage exceeded 90% ({{ mount_path }}: {{ disk_use }}%).
Repair {{ issue_id }} opened and Joanna investigation requested.
- service: repairs.create - service: repairs.create
data: data:
issue_id: "{{ issue_id }}" issue_id: "{{ issue_id }}"
@ -107,6 +126,22 @@ automation:
Investigate critical Home Assistant disk usage and recommend or perform safe remediation if available. Investigate critical Home Assistant disk usage and recommend or perform safe remediation if available.
Check recorder/database size, logs, cache, backups, and temporary files first. Check recorder/database size, logs, cache, backups, and temporary files first.
Do not restart Home Assistant or prune/delete data unless explicitly requested. Do not restart Home Assistant or prune/delete data unless explicitly requested.
- service: script.send_to_logbook
data:
topic: "SYSTEM"
message: >-
Disk usage exceeded 90% ({{ mount_path }}: {{ disk_use }}%).
Repair {{ issue_id }} opened and Joanna investigation requested.
- delay: "00:05:00"
- condition: numeric_state
entity_id: sensor.disk_use_percent
above: 90
- service: script.notify_engine
data:
value1: "Hard Drive Monitor:"
value2: "Critical Home Assistant disk usage is still active at {{ mount_path }}:{{ states('sensor.disk_use_percent') | float(0) | round(1) }}%."
value3: "Joanna has already been dispatched to investigate."
who: "carlo"
- alias: "Disk Use Alarm Recovery" - alias: "Disk Use Alarm Recovery"
id: processmonitor_disk_use_alarm_recovery id: processmonitor_disk_use_alarm_recovery
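
Reviewer note: the new "Self Heal Disk Use Joanna Review" flow above can be modeled as a small decision function. This is a minimal sketch in Python, not the automation itself. It assumes the `numeric_state` trigger has already fired (usage stayed above 80% for 10 minutes). The function name and action labels are hypothetical and only mirror the order of steps in the YAML.

```python
from typing import List


def review_actions(pct_at_trigger: float, pct_after_delay: float) -> List[str]:
    """Model the action sequence of the 80% review automation.

    pct_at_trigger: disk usage when the 10-minute trigger fires.
    pct_after_delay: disk usage after the 5-minute delay step.
    The labels returned are illustrative names, not real service calls.
    """
    actions: List[str] = []

    # Automation-level condition: "below: 90". At 90% or more the
    # separate "Disk Use Alarm" automation is expected to handle it.
    if not pct_at_trigger < 90:
        return actions

    # Dispatch first, then write the logbook note.
    actions.append("joanna_dispatch")
    actions.append("send_to_logbook")

    # After the delay, the phone alert is sent only if usage is still
    # strictly between 80 and 90 (numeric_state above/below are exclusive).
    if 80 < pct_after_delay < 90:
        actions.append("notify_engine")

    return actions
```

Under this reading, a spike that clears during the grace period produces a dispatch and a logbook entry but no phone alert, which matches the README wording "delayed phone alerts only if the issue stays unresolved after dispatch".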

@ -3,12 +3,13 @@
# For more info visit https://www.vcloudinfo.com/click-here # For more info visit https://www.vcloudinfo.com/click-here
# Original Repo : https://github.com/CCOSTAN/Home-AssistantConfig # Original Repo : https://github.com/CCOSTAN/Home-AssistantConfig
# ------------------------------------------------------------------- # -------------------------------------------------------------------
# Proxmox Host Automations - reboots and update alerts # Proxmox Host Automations - reboots, repairs, and Joanna dispatch
# Nightly Frigate host reboot plus update repair issues. # Nightly Frigate host reboot plus update/runtime/disk health automations.
# ------------------------------------------------------------------- # -------------------------------------------------------------------
# Related Issue: 1584 # Related Issue: 1584
# Notes: Creates HA repair issues when proxmox nodes report updates. # Notes: Creates HA repair issues when proxmox nodes report updates.
# Notes: Adds normalized runtime + disk health signals for dashboard/alerts. # Notes: Adds normalized runtime + disk health signals for dashboard/alerts.
# Notes: Joanna dispatch is reserved for sustained runtime and disk-pressure degradations.
###################################################################### ######################################################################
template: template:
- sensor: - sensor:
@ -148,6 +149,28 @@ automation:
{% else %} {% else %}
proxmox02_runtime_unhealthy proxmox02_runtime_unhealthy
{% endif %} {% endif %}
runtime_entity: >-
{% if 'proxmox1' in trigger.entity_id %}
binary_sensor.proxmox1_runtime_healthy
{% else %}
binary_sensor.proxmox02_runtime_healthy
{% endif %}
status_entity: >-
{% if 'proxmox1' in trigger.entity_id %}
{% if states('binary_sensor.node_proxmox1_status') not in ['unknown', 'unavailable', 'none', ''] %}
binary_sensor.node_proxmox1_status
{% else %}
sensor.node_proxmox1_status
{% endif %}
{% else %}
{% if states('binary_sensor.node_proxmox02_status') not in ['unknown', 'unavailable', 'none', ''] %}
binary_sensor.node_proxmox02_status
{% else %}
sensor.node_proxmox02_status
{% endif %}
{% endif %}
status_value: "{{ states(status_entity) }}"
trigger_context: "HA automation proxmox_runtime_repairs (Proxmox Runtime Repair Issues)"
action: action:
- choose: - choose:
- conditions: "{{ trigger.to_state.state == 'off' }}" - conditions: "{{ trigger.to_state.state == 'off' }}"
@ -164,10 +187,30 @@ automation:
description: > description: >
{{ node_name }} has remained offline for over 2 minutes. {{ node_name }} has remained offline for over 2 minutes.
Check node status in Proxmox and restore runtime. Check node status in Proxmox and restore runtime.
- service: script.joanna_dispatch
data:
trigger_context: "{{ trigger_context }}"
source: "home_assistant_automation.proxmox_runtime_repairs"
summary: "{{ node_name }} runtime has remained degraded for over 2 minutes"
entity_ids:
- "{{ runtime_entity }}"
- "{{ status_entity }}"
diagnostics: >-
issue_id={{ issue_id }},
node_name={{ node_name }},
runtime_entity={{ runtime_entity }},
status_entity={{ status_entity }},
status_value={{ status_value }},
unhealthy_for=2m
request: >-
Investigate {{ node_name }} runtime degradation and restore node availability if possible.
Check host status, cluster connectivity, storage reachability, and recent update activity first.
Do not reboot the host unless explicitly requested.
- service: script.send_to_logbook - service: script.send_to_logbook
data: data:
topic: "PROXMOX" topic: "PROXMOX"
message: "{{ node_name }} runtime is degraded." message: >-
{{ node_name }} runtime is degraded. Repair {{ issue_id }} opened and Joanna investigation requested.
default: default:
- service: repairs.remove - service: repairs.remove
continue_on_error: true continue_on_error: true
@ -188,11 +231,26 @@ automation:
- sensor.proxmox1_disk_used_percentage - sensor.proxmox1_disk_used_percentage
- sensor.proxmox02_disk_used_percentage - sensor.proxmox02_disk_used_percentage
above: 85 above: 85
below: 92
for: "00:15:00" for: "00:15:00"
id: warning
- platform: numeric_state
entity_id:
- sensor.proxmox1_disk_used_percentage
- sensor.proxmox02_disk_used_percentage
above: 92
id: critical
- platform: state - platform: state
entity_id: entity_id:
- sensor.proxmox1_disk_used_percentage - sensor.proxmox1_disk_used_percentage
- sensor.proxmox02_disk_used_percentage - sensor.proxmox02_disk_used_percentage
id: band_change
- platform: numeric_state
entity_id:
- sensor.proxmox1_disk_used_percentage
- sensor.proxmox02_disk_used_percentage
below: 85
id: recovered
variables: variables:
node_name: >- node_name: >-
{% if 'proxmox1' in trigger.entity_id %}Proxmox1{% else %}Proxmox02{% endif %} {% if 'proxmox1' in trigger.entity_id %}Proxmox1{% else %}Proxmox02{% endif %}
@ -202,10 +260,33 @@ automation:
{% else %} {% else %}
proxmox02_disk_pressure proxmox02_disk_pressure
{% endif %} {% endif %}
disk_pct: "{{ states(trigger.entity_id) | float(0) }}" disk_entity: "{{ trigger.entity_id }}"
raw_disk_entity: >-
{% if 'proxmox1' in trigger.entity_id %}
sensor.node_proxmox1_disk_used_percentage
{% else %}
sensor.node_proxmox02_disk_used_percentage
{% endif %}
disk_pct: "{{ states(disk_entity) | float(0) }}"
previous_disk_pct: >-
{% if trigger.from_state is not none and trigger.from_state.state not in ['unknown', 'unavailable', 'none', ''] %}
{{ trigger.from_state.state | float(0) }}
{% else %}
0
{% endif %}
previous_band: >-
{% if previous_disk_pct >= 92 %}
critical
{% elif previous_disk_pct >= 85 %}
warning
{% else %}
normal
{% endif %}
action: action:
- choose: - choose:
- conditions: "{{ disk_pct >= 92 }}" - conditions:
- condition: trigger
id: critical
sequence: sequence:
- service: repairs.create - service: repairs.create
data: data:
@ -216,11 +297,36 @@ automation:
description: > description: >
{{ node_name }} disk usage is critically high. {{ node_name }} disk usage is critically high.
Free disk space or expand storage allocation. Free disk space or expand storage allocation.
- service: script.joanna_dispatch
data:
trigger_context: "HA automation proxmox_disk_pressure_repairs (Proxmox Disk Pressure Repair Issues - Critical)"
source: "home_assistant_automation.proxmox_disk_pressure_repairs.critical"
summary: "{{ node_name }} disk pressure is critical at {{ disk_pct | round(1) }}%"
entity_ids:
- "{{ disk_entity }}"
- "{{ raw_disk_entity }}"
diagnostics: >-
issue_id={{ issue_id }},
node_name={{ node_name }},
disk_entity={{ disk_entity }},
raw_disk_entity={{ raw_disk_entity }},
disk_pct={{ disk_pct | round(1) }},
threshold=92
request: >-
Investigate critical disk pressure on {{ node_name }} and recommend safe remediation.
Check local storage usage, backups, logs, snapshots, and VM or container disk consumers first.
Do not delete VM disks or reboot the host unless explicitly requested.
- service: script.send_to_logbook - service: script.send_to_logbook
data: data:
topic: "PROXMOX" topic: "PROXMOX"
message: "{{ node_name }} disk usage is critical at {{ disk_pct | round(1) }}%." message: >-
- conditions: "{{ disk_pct >= 85 }}" {{ node_name }} disk usage is critical at {{ disk_pct | round(1) }}%.
Repair {{ issue_id }} opened and Joanna investigation requested.
- conditions:
- condition: trigger
id: warning
- condition: template
value_template: "{{ previous_band != 'critical' }}"
sequence: sequence:
- service: repairs.create - service: repairs.create
data: data:
@ -231,12 +337,52 @@ automation:
description: > description: >
{{ node_name }} disk usage has stayed above 85% for 15 minutes. {{ node_name }} disk usage has stayed above 85% for 15 minutes.
Plan cleanup before capacity reaches critical levels. Plan cleanup before capacity reaches critical levels.
- service: script.joanna_dispatch
data:
trigger_context: "HA automation proxmox_disk_pressure_repairs (Proxmox Disk Pressure Repair Issues - Warning)"
source: "home_assistant_automation.proxmox_disk_pressure_repairs.warning"
summary: "{{ node_name }} disk pressure warning at {{ disk_pct | round(1) }}%"
entity_ids:
- "{{ disk_entity }}"
- "{{ raw_disk_entity }}"
diagnostics: >-
issue_id={{ issue_id }},
node_name={{ node_name }},
disk_entity={{ disk_entity }},
raw_disk_entity={{ raw_disk_entity }},
disk_pct={{ disk_pct | round(1) }},
threshold=85,
sustained_for=15m
request: >-
Investigate elevated disk usage on {{ node_name }} and recommend safe cleanup actions before it becomes critical.
Check local storage usage, backups, logs, snapshots, and VM or container disk consumers first.
Do not delete VM disks or reboot the host unless explicitly requested.
- service: script.send_to_logbook - service: script.send_to_logbook
data: data:
topic: "PROXMOX" topic: "PROXMOX"
message: "{{ node_name }} disk usage warning at {{ disk_pct | round(1) }}%." message: >-
default: {{ node_name }} disk usage warning at {{ disk_pct | round(1) }}%.
- service: repairs.remove Repair {{ issue_id }} opened and Joanna investigation requested.
continue_on_error: true - conditions:
data: - condition: trigger
issue_id: "{{ issue_id }}" id: band_change
- condition: template
value_template: "{{ previous_band == 'critical' and disk_pct >= 85 and disk_pct < 92 }}"
sequence:
- service: repairs.create
data:
issue_id: "{{ issue_id }}"
severity: warning
persistent: true
title: "{{ node_name }} disk pressure warning ({{ disk_pct | round(1) }}%)"
description: >
{{ node_name }} disk usage is elevated but no longer critical.
Plan cleanup before capacity reaches critical levels again.
- conditions:
- condition: trigger
id: recovered
sequence:
- service: repairs.remove
continue_on_error: true
data:
issue_id: "{{ issue_id }}"
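
Reviewer note: the reworked `choose` block in the Proxmox disk-pressure automation now keys off trigger IDs plus the previous band, rather than `disk_pct` alone. The Python sketch below restates that routing so the branches are easier to check. It is an illustration under assumptions: the function names and returned labels are hypothetical, and the thresholds (85 and 92) are taken from the diff above.

```python
def band(pct: float) -> str:
    """Classify a disk-usage percentage the way `previous_band` does."""
    if pct >= 92:
        return "critical"
    if pct >= 85:
        return "warning"
    return "normal"


def route(trigger_id: str, previous_pct: float, current_pct: float) -> str:
    """Return an illustrative label for the branch the automation would take.

    trigger_id: one of "warning", "critical", "band_change", "recovered".
    previous_pct: value of trigger.from_state (0 when it is missing).
    current_pct: current value of the disk sensor.
    """
    previous_band = band(previous_pct)

    # Branch 1: crossing above 92 always opens a critical repair and dispatches.
    if trigger_id == "critical":
        return "critical_repair_and_dispatch"

    # Branch 2: sustained 85-92, unless the node is coming down from critical.
    if trigger_id == "warning" and previous_band != "critical":
        return "warning_repair_and_dispatch"

    # Branch 3: any state change that drops from critical into the warning
    # band downgrades the repair without dispatching again.
    if (
        trigger_id == "band_change"
        and previous_band == "critical"
        and 85 <= current_pct < 92
    ):
        return "downgrade_repair_to_warning"

    # Branch 4: crossing below 85 clears the repair.
    if trigger_id == "recovered":
        return "remove_repair"

    # No branch matched; the new choose block has no default sequence.
    return "no_action"
```

One consequence worth confirming in review: because the old `default` branch was removed, ordinary `band_change` events (for example 80% to 81%) fall through with no action, and the repair is cleared only by the `recovered` trigger.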

@ -29,8 +29,7 @@ template:
'binary_sensor.carlo_nas01_drive_3_below_min_remaining_life', 'binary_sensor.carlo_nas01_drive_3_below_min_remaining_life',
'binary_sensor.carlo_nas01_drive_1_exceeded_max_bad_sectors', 'binary_sensor.carlo_nas01_drive_1_exceeded_max_bad_sectors',
'binary_sensor.carlo_nas01_drive_2_exceeded_max_bad_sectors', 'binary_sensor.carlo_nas01_drive_2_exceeded_max_bad_sectors',
'binary_sensor.carlo_nas01_drive_3_exceeded_max_bad_sectors', 'binary_sensor.carlo_nas01_drive_3_exceeded_max_bad_sectors'
'update.carlo_nas01_dsm_update'
] %} ] %}
{% set ns = namespace(problem=false) %} {% set ns = namespace(problem=false) %}
{% for id in ids %} {% for id in ids %}
@ -86,8 +85,7 @@ template:
'binary_sensor.carlo_nvr_drive_1_below_min_remaining_life', 'binary_sensor.carlo_nvr_drive_1_below_min_remaining_life',
'binary_sensor.carlo_nvr_drive_2_below_min_remaining_life', 'binary_sensor.carlo_nvr_drive_2_below_min_remaining_life',
'binary_sensor.carlo_nvr_drive_1_exceeded_max_bad_sectors', 'binary_sensor.carlo_nvr_drive_1_exceeded_max_bad_sectors',
'binary_sensor.carlo_nvr_drive_2_exceeded_max_bad_sectors', 'binary_sensor.carlo_nvr_drive_2_exceeded_max_bad_sectors'
'update.carlo_nvr_dsm_update'
] %} ] %}
{% set ns = namespace(problem=false) %} {% set ns = namespace(problem=false) %}
{% for id in ids %} {% for id in ids %}
@ -422,13 +420,6 @@ automation:
dsm_update: {{ dsm_update_state }} dsm_update: {{ dsm_update_state }}
ssh_alias: {{ ssh_alias }} ssh_alias: {{ ssh_alias }}
dsm_url: {{ dsm_url }} dsm_url: {{ dsm_url }}
- service: script.send_to_logbook
data:
topic: "SYNOLOGY"
message: >-
{{ host_name }} reported a Synology DSM problem for 10 minutes.
Repair {{ issue_id }} opened and Joanna investigation requested.
Summary: {{ problem_summary }}.
- service: script.joanna_dispatch - service: script.joanna_dispatch
data: data:
trigger_context: "{{ trigger_context }}" trigger_context: "{{ trigger_context }}"
@ -450,6 +441,13 @@ automation:
Investigate {{ host_name }} using the Home Assistant Synology DSM entities first, then DSM or SSH if needed. Investigate {{ host_name }} using the Home Assistant Synology DSM entities first, then DSM or SSH if needed.
Review security status, drive health, volume health, and integration availability. Review security status, drive health, volume health, and integration availability.
Do not reboot or shut down the NAS unless explicitly requested. Do not reboot or shut down the NAS unless explicitly requested.
- service: script.send_to_logbook
data:
topic: "SYNOLOGY"
message: >-
{{ host_name }} reported a Synology DSM problem for 10 minutes.
Repair {{ issue_id }} opened and Joanna investigation requested.
Summary: {{ problem_summary }}.
- id: synology_dsm_clear_repair_on_recovery - id: synology_dsm_clear_repair_on_recovery
alias: "Synology DSM - Clear Repair On Recovery" alias: "Synology DSM - Clear Repair On Recovery"

@ -60,9 +60,13 @@ Current automations that kick off automated resolutions (via `script.joanna_disp
| `infra_backup_nightly_verification` | Infrastructure - Backup Nightly Verification | [../packages/infrastructure_observability.yaml](../packages/infrastructure_observability.yaml) | | `infra_backup_nightly_verification` | Infrastructure - Backup Nightly Verification | [../packages/infrastructure_observability.yaml](../packages/infrastructure_observability.yaml) |
| `docker_state_sync_repairs_dynamic` | Docker State Sync - Repairs (Dynamic) | [../packages/docker_infrastructure.yaml](../packages/docker_infrastructure.yaml) | | `docker_state_sync_repairs_dynamic` | Docker State Sync - Repairs (Dynamic) | [../packages/docker_infrastructure.yaml](../packages/docker_infrastructure.yaml) |
| `docker_group_reconcile_weekly_joanna_review` | Docker Group Reconcile - Weekly Joanna Review | [../packages/docker_infrastructure.yaml](../packages/docker_infrastructure.yaml) | | `docker_group_reconcile_weekly_joanna_review` | Docker Group Reconcile - Weekly Joanna Review | [../packages/docker_infrastructure.yaml](../packages/docker_infrastructure.yaml) |
| `tugtainer_dispatch_joanna_for_available_updates` | Tugtainer - Dispatch Joanna For Available Updates | [../packages/tugtainer_updates.yaml](../packages/tugtainer_updates.yaml) |
| `tugtainer_dispatch_joanna_for_home_assistant_core_digest` | Tugtainer - Dispatch Joanna For Home Assistant Core Digest | [../packages/tugtainer_updates.yaml](../packages/tugtainer_updates.yaml) |
| `unifi_ap_no_clients_repair_combined` | Unifi AP Create Repair Issue after 5m of 0 Clients | [../packages/wireless.yaml](../packages/wireless.yaml) | | `unifi_ap_no_clients_repair_combined` | Unifi AP Create Repair Issue after 5m of 0 Clients | [../packages/wireless.yaml](../packages/wireless.yaml) |
| `proxmox_runtime_repairs` | Proxmox Runtime Repair Issues | [../packages/proxmox.yaml](../packages/proxmox.yaml) |
| `proxmox_disk_pressure_repairs` | Proxmox Disk Pressure Repair Issues | [../packages/proxmox.yaml](../packages/proxmox.yaml) |
| `synology_dsm_open_repair_and_dispatch` | Synology DSM - Open Repair And Dispatch | [../packages/synology_dsm.yaml](../packages/synology_dsm.yaml) | | `synology_dsm_open_repair_and_dispatch` | Synology DSM - Open Repair And Dispatch | [../packages/synology_dsm.yaml](../packages/synology_dsm.yaml) |
| `b16f2155-4688-4c0f-9cf8-b382e294a029` | Self Heal Disk Use Alarm | [../packages/processmonitor.yaml](../packages/processmonitor.yaml) | | `processmonitor_disk_use_joanna_review` | Self Heal Disk Use Joanna Review | [../packages/processmonitor.yaml](../packages/processmonitor.yaml) |
| `1ce3cb43-0e27-4c53-acdd-d672396f3559` | Disk Use Alarm | [../packages/processmonitor.yaml](../packages/processmonitor.yaml) | | `1ce3cb43-0e27-4c53-acdd-d672396f3559` | Disk Use Alarm | [../packages/processmonitor.yaml](../packages/processmonitor.yaml) |
### Tips ### Tips
