shturman/docs/specs/plans/08-v0.4-mcu-thermal.md
kk0t9 fb31a288c3 docs(v0.4): MCU/thermal implementation plan (Plan 8)
TDD breakdown of milestone v0.4 (A12/B08/B09/B10) into P8.1–P8.9: ThermalPolicy (hysteresis) →
FSM events (ThermalCleared/FailsafeCut) → protocol + codec (CRC/replay/desync) →
Coprocessor (mock + B09 model + client) → TempSource/Throttler/ThermalMonitor →
service wiring (D-Bus ThermalState/ThermalChanged + spawn_loops + dev-mock SetTemp/
HangSoc) → integration → E2E block → verify + prod gate + seams + finish. Full code in the steps;
self-review passed (test timings corrected). Spec: docs/specs/v0.4-mcu-thermal.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Alexander <akotenev2003@gmail.com>
2026-06-25 15:18:11 +03:00


Plan 8: v0.4 MCU/thermal fail-safe (thermal trigger + MCU protocol + fail-safe timer)

REQUIRED SUB-SKILL: executing-plans (or subagent-driven-development) + TDD. Spec: docs/specs/v0.4-mcu-thermal.md. Steps are checkboxes (- [ ]). P8.7/P8.8/P8.9 are heavy (Lima). Built on top of the live v0.3 FSM (shturman-power).

Goal: close the thermal and MCU seams of domain B: thermal trip → graceful shutdown (reusing v0.3); a SoC↔MCU shutdown protocol with a protected codec; a model of the independent fail-safe timer (the MCU cuts power when the SoC hangs).

Architecture: new modules in shturman-power (no new crate). Pure, unit-testable cores (ThermalPolicy, codec, CoprocessorClient, ThermalMonitor) → abstractions (TempSource/Throttler/Coprocessor traits) → dev-mock. The FSM is fed through apply_event (as in v0.3). The loops (thermal/coprocessor) are background tokio tasks on the monotonic clock, spawned in main after registration on the bus.

Tech Stack: Rust, zbus 4, tokio, CRC16-CCITT, Lima E2E (bash). All timers use CLOCK_MONOTONIC.
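
The monotonic-clock requirement can be illustrated with a small helper. The coprocessor loop in P8.6 calls `monotonic_secs()`, whose definition does not appear in this part of the plan; the sketch below is only one possible shape, assuming that whole seconds since first use are sufficient and that `std::time::Instant` (documented as monotonic, and expected to be backed by CLOCK_MONOTONIC on Linux, which is worth verifying on the target) is acceptable.

```rust
use std::sync::OnceLock;
use std::time::Instant;

/// Hypothetical helper: whole seconds elapsed since the first call.
/// `Instant` never goes backwards, unlike wall-clock time.
fn monotonic_secs() -> u64 {
    // The anchor is captured once, on first use, and shared by all callers.
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_secs()
}
```

Because the value is relative to process start, it is only meaningful for measuring intervals, which is all the fail-safe model needs.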


File Structure

  • Create crates/core/shturman-power/src/thermal.rs: ThermalLevel, ThermalPolicy (pure, with hysteresis), TempSource/Throttler traits + MockTempSource/SysfsTempSource/NoopThrottler/CpufreqThrottler, ThermalMonitor.
  • Create crates/core/shturman-power/src/protocol.rs: SocToMcu/McuToSoc + wire types.
  • Create crates/core/shturman-power/src/codec.rs: CRC16, encode_frame, FrameDecoder (resync + replay).
  • Create crates/core/shturman-power/src/coprocessor.rs: Coprocessor trait, MockCoprocessor (+ B09 model), SerialCoprocessor (stub), CoprocessorClient.
  • Modify crates/core/shturman-power/src/fsm.rs: Event::{ThermalCleared, FailsafeCut}, Action::Cut.
  • Modify crates/core/shturman-power/src/service.rs: ThermalState property, ThermalChanged signal, Action::Cut handler, spawn_loops, dev-mock SetTemp/HangSoc/McuLinkLoss, fsm/thermal_state handles.
  • Modify crates/core/shturman-power/src/main.rs: assemble the sources (mock/prod) + spawn_loops.
  • Modify crates/core/shturman-power/src/lib.rs: pub mod thermal; pub mod protocol; pub mod codec; pub mod coprocessor;.
  • Modify crates/shturman-ipc/src/proxy.rs: Power1Proxy gets the thermal_state property + the thermal_changed signal.
  • Modify crates/core/shturman-power/tests/integration.rs: thermal-trip / abort / failsafe-cut.
  • Modify tests/e2e/run.sh: v0.4 block (after the v0.3 power-safe block).
  • Modify (P8.9, seams §10) docs/domains/b-power-lifecycle.md, docs/contracts/hardware.md, docs/contracts/ipc.md, docs/capability-catalog.md, CLAUDE.md.

P8.1: ThermalPolicy, a pure thermal state machine (A12/B10)

Files: Create crates/core/shturman-power/src/thermal.rs (part 1: the policy); Modify lib.rs.

  • Step 1: thermal.rs (policy + levels + tests):
//! Thermal subsystem (A12/B10, spec v0.4 §5). `ThermalPolicy` is pure (no I/O) and has hysteresis.
//! Sources/throttler/monitor: P8.5 (same file).

/// Thermal state level. `Throttle(u8)` is a band (v0.4 uses level 1; multi-band is for RK3588).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThermalLevel {
    Normal,
    Warn,
    Throttle(u8),
    Critical,
}

impl ThermalLevel {
    pub fn as_str(&self) -> &'static str {
        match self {
            ThermalLevel::Normal => "normal",
            ThermalLevel::Warn => "warn",
            ThermalLevel::Throttle(_) => "throttle",
            ThermalLevel::Critical => "critical",
        }
    }
    fn rank(&self) -> u8 {
        match self {
            ThermalLevel::Normal => 0,
            ThermalLevel::Warn => 1,
            ThermalLevel::Throttle(_) => 2,
            ThermalLevel::Critical => 3,
        }
    }
}

// Thresholds are placeholder constants (°C). Tuning happens on RK3588 (hardware §1a; Tjmax ~100 °C).
pub const WARN_C: i32 = 75;
pub const THROTTLE_C: i32 = 85;
pub const CRITICAL_C: i32 = 95;
pub const HYST_C: i32 = 5;

/// Pure policy: `(previous level, temperature) → level` with hysteresis (Schmitt trigger per band).
pub struct ThermalPolicy;

impl ThermalPolicy {
    fn band_by_entry(t: i32) -> ThermalLevel {
        if t >= CRITICAL_C {
            ThermalLevel::Critical
        } else if t >= THROTTLE_C {
            ThermalLevel::Throttle(1)
        } else if t >= WARN_C {
            ThermalLevel::Warn
        } else {
            ThermalLevel::Normal
        }
    }
    fn band_by_exit(t: i32) -> ThermalLevel {
        // lower (hysteresis) thresholds = entry − HYST
        if t >= CRITICAL_C - HYST_C {
            ThermalLevel::Critical
        } else if t >= THROTTLE_C - HYST_C {
            ThermalLevel::Throttle(1)
        } else if t >= WARN_C - HYST_C {
            ThermalLevel::Warn
        } else {
            ThermalLevel::Normal
        }
    }

    /// Rise by entry thresholds; fall by exit thresholds (entry − HYST) → no oscillation at a boundary.
    pub fn next(prev: ThermalLevel, temp_c: i32) -> ThermalLevel {
        let up = Self::band_by_entry(temp_c);
        if up.rank() >= prev.rank() {
            up
        } else {
            Self::band_by_exit(temp_c)
        }
    }
}

#[cfg(test)]
mod policy_tests {
    use super::*;

    #[test]
    fn rises_by_entry_thresholds() {
        assert_eq!(ThermalPolicy::next(ThermalLevel::Normal, 70), ThermalLevel::Normal);
        assert_eq!(ThermalPolicy::next(ThermalLevel::Normal, 75), ThermalLevel::Warn);
        assert_eq!(ThermalPolicy::next(ThermalLevel::Warn, 85), ThermalLevel::Throttle(1));
        assert_eq!(ThermalPolicy::next(ThermalLevel::Throttle(1), 95), ThermalLevel::Critical);
        // jump up across several bands
        assert_eq!(ThermalPolicy::next(ThermalLevel::Normal, 99), ThermalLevel::Critical);
    }

    #[test]
    fn hysteresis_holds_until_below_exit() {
        // critical holds until < 90 (95 − 5)
        assert_eq!(ThermalPolicy::next(ThermalLevel::Critical, 92), ThermalLevel::Critical);
        assert_eq!(ThermalPolicy::next(ThermalLevel::Critical, 89), ThermalLevel::Throttle(1));
        // warn holds until < 70
        assert_eq!(ThermalPolicy::next(ThermalLevel::Warn, 73), ThermalLevel::Warn);
        assert_eq!(ThermalPolicy::next(ThermalLevel::Warn, 69), ThermalLevel::Normal);
    }

    #[test]
    fn no_oscillation_at_boundary() {
        // at 84 (just below the throttle entry of 85) the result depends on prev (Schmitt), no flapping
        assert_eq!(ThermalPolicy::next(ThermalLevel::Throttle(1), 84), ThermalLevel::Throttle(1));
        assert_eq!(ThermalPolicy::next(ThermalLevel::Warn, 84), ThermalLevel::Warn);
    }
}
  • Step 2 (lib.rs): add pub mod thermal; (next to pub mod fsm;).

  • Step 3: run the tests. Run: cargo test -p shturman-power thermal::policy_tests. Expected: PASS (3 tests).

  • Step 4: commit.

git add crates/core/shturman-power/src/thermal.rs crates/core/shturman-power/src/lib.rs
git commit -s -m "feat(v0.4): pure ThermalPolicy (bands + hysteresis, A12/B10)"
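
The hysteresis rule of P8.1 can be checked by hand on a short temperature series. The block below is a standalone sketch, not part of the plan's code: it restates the policy with bands reduced to numeric ranks (0 = normal, 1 = warn, 2 = throttle, 3 = critical) so that it compiles on its own. `band` and `trace` are hypothetical helpers introduced only for this illustration.

```rust
// Entry thresholds in °C, ascending: WARN_C, THROTTLE_C, CRITICAL_C.
const ENTRY_C: [i32; 3] = [75, 85, 95];
const HYST_C: i32 = 5;

/// Rank of the highest band whose threshold, lowered by `offset`, is reached.
fn band(temp_c: i32, offset: i32) -> u8 {
    ENTRY_C.iter().filter(|&&t| temp_c >= t - offset).count() as u8
}

/// Same rule as `ThermalPolicy::next`: rise by entry thresholds,
/// fall by exit thresholds (entry − HYST).
fn next(prev: u8, temp_c: i32) -> u8 {
    let up = band(temp_c, 0);
    if up >= prev {
        up
    } else {
        band(temp_c, HYST_C)
    }
}

/// Feeds a temperature series through the policy, starting from normal.
fn trace(temps: &[i32]) -> Vec<u8> {
    let mut prev = 0;
    temps
        .iter()
        .map(|&t| {
            prev = next(prev, t);
            prev
        })
        .collect()
}
```

For the series 70, 86, 84, 86, 84, 79, 74, 69 this yields the ranks 0, 2, 2, 2, 2, 1, 1, 0. The readings that hover around 85 °C stay in throttle, and the level only drops once the temperature falls below the exit threshold of the current band (80 °C for throttle, 70 °C for warn).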

P8.2: FSM events ThermalCleared + FailsafeCut (B09/B10)

Files: Modify crates/core/shturman-power/src/fsm.rs.

  • Step 1: extend Event and Action. In fsm.rs, add to enum Event (after GraceExpired):
    ThermalCleared, // temperature returned to normal before the PONR → abort the thermal shutdown (gated on reason == Thermal)
    FailsafeCut,    // MCU-authoritative cut (hung SoC / hold-up expired) → off, irreversible

In enum Action (after Commit):

    Cut, // the MCU removed power (fail-safe): the service logs it and moves to off
  • Step 2: add the transitions to step. In match (self.state, ev), insert before the final _ => vec![]:
            (
                ShuttingDown { phase: Abortable, reason: ShutdownReason::Thermal },
                E::ThermalCleared,
            ) => {
                self.state = Running;
                vec![Action::ShutdownAborted]
            }
            (Off, E::FailsafeCut) => vec![],
            (_, E::FailsafeCut) => {
                self.state = Off;
                vec![Action::Cut]
            }
  • Step 3: tests. In #[cfg(test)] mod tests (fsm.rs), add:
    #[test]
    fn thermal_cleared_aborts_only_thermal_abortable() {
        let mut f = PowerFsm::new(); // Running
        f.step(Event::ThermalTrip); // → ShuttingDown{Abortable, Thermal}
        assert_eq!(f.step(Event::ThermalCleared), vec![Action::ShutdownAborted]);
        assert_eq!(f.state(), State::Running);

        // from an ACC-off shutdown, ThermalCleared is a no-op
        let mut g = PowerFsm::new();
        g.step(Event::AccOff);
        assert_eq!(g.step(Event::ThermalCleared), vec![]);
        assert_eq!(g.power_state(), PowerState::ShuttingDown);
    }

    #[test]
    fn failsafe_cut_forces_off_from_any_nonoff() {
        let mut f = PowerFsm::new(); // Running
        assert_eq!(f.step(Event::FailsafeCut), vec![Action::Cut]);
        assert_eq!(f.state(), State::Off);
        // from off it is a no-op
        assert_eq!(f.step(Event::FailsafeCut), vec![]);
        // even from committed (irreversible shutdown) the cut forces off
        let mut g = PowerFsm::new();
        g.step(Event::AccOff);
        g.step(Event::GraceExpired); // committed
        assert_eq!(g.step(Event::FailsafeCut), vec![Action::Cut]);
        assert_eq!(g.state(), State::Off);
    }

(Uses PowerState/State/Action/Event, already imported in the test module/file.)

  • Step 4: run the tests. Run: cargo test -p shturman-power fsm. Expected: PASS (7 v0.3 tests + 2 new = 9).

  • Step 5: commit.

git add crates/core/shturman-power/src/fsm.rs
git commit -s -m "feat(v0.4): FSM ThermalCleared (abort thermal) + FailsafeCut (MCU cut)"
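
The two gating rules of P8.2 (ThermalCleared aborts only a thermal shutdown that is still abortable; FailsafeCut wins from every state) can be seen in isolation in a toy state machine. This is a deliberately reduced sketch and not the real `PowerFsm`: the `State`, `Reason` and `Event` enums below are simplified stand-ins, and `step` returns the next state instead of a list of actions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reason {
    AccOff,
    Thermal,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Running,
    Abortable(Reason), // shutting down, point of no return not yet reached
    Committed,         // shutting down, irreversible
    Off,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    AccOff,
    ThermalTrip,
    GraceExpired,
    ThermalCleared,
    FailsafeCut,
}

/// Toy transition function (hypothetical, for illustration only).
fn step(state: State, ev: Event) -> State {
    match (state, ev) {
        (State::Running, Event::AccOff) => State::Abortable(Reason::AccOff),
        (State::Running, Event::ThermalTrip) => State::Abortable(Reason::Thermal),
        (State::Abortable(_), Event::GraceExpired) => State::Committed,
        // The abort is gated on the shutdown reason: only a thermal shutdown is undone.
        (State::Abortable(Reason::Thermal), Event::ThermalCleared) => State::Running,
        // The MCU cut wins from every state, including Committed.
        (_, Event::FailsafeCut) => State::Off,
        // Everything else is a no-op in this toy model.
        (s, _) => s,
    }
}
```

The order of the arms matters: the specific `Abortable(Reason::Thermal)` arm has to come before the catch-all, otherwise a cooled-down system would keep shutting down.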

P8.3: protocol + codec (B08, link protection)

Files: Create crates/core/shturman-power/src/protocol.rs, crates/core/shturman-power/src/codec.rs; Modify lib.rs.

  • Step 1: protocol.rs:
//! SoC↔MCU message types (B08, spec v0.4 §6.1). seq is a field of the frame (codec), not of the message.

pub mod wire {
    pub const HEARTBEAT: u8 = 0x01;
    pub const SHUTDOWN_IMMINENT: u8 = 0x02;
    pub const SAFE_TO_CUT: u8 = 0x03;
    pub const ACK: u8 = 0x81;
    pub const ACC: u8 = 0x82;
    pub const VOLTAGE: u8 = 0x83;
    pub const CUT_WARNING: u8 = 0x84;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SocToMcu {
    Heartbeat,
    ShutdownImminent { budget: u8 },
    SafeToCut,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum McuToSoc {
    Ack,
    Acc { on: bool },
    Voltage { mv: u16 },
    CutWarning,
}

impl SocToMcu {
    pub fn wire_type(&self) -> u8 {
        match self {
            SocToMcu::Heartbeat => wire::HEARTBEAT,
            SocToMcu::ShutdownImminent { .. } => wire::SHUTDOWN_IMMINENT,
            SocToMcu::SafeToCut => wire::SAFE_TO_CUT,
        }
    }
    pub fn payload(&self) -> Vec<u8> {
        match self {
            SocToMcu::ShutdownImminent { budget } => vec![*budget],
            _ => vec![],
        }
    }
    pub fn from_wire(t: u8, p: &[u8]) -> Option<Self> {
        match t {
            wire::HEARTBEAT => Some(SocToMcu::Heartbeat),
            wire::SHUTDOWN_IMMINENT => Some(SocToMcu::ShutdownImminent { budget: *p.first()? }),
            wire::SAFE_TO_CUT => Some(SocToMcu::SafeToCut),
            _ => None,
        }
    }
}

impl McuToSoc {
    pub fn wire_type(&self) -> u8 {
        match self {
            McuToSoc::Ack => wire::ACK,
            McuToSoc::Acc { .. } => wire::ACC,
            McuToSoc::Voltage { .. } => wire::VOLTAGE,
            McuToSoc::CutWarning => wire::CUT_WARNING,
        }
    }
    pub fn payload(&self) -> Vec<u8> {
        match self {
            McuToSoc::Acc { on } => vec![*on as u8],
            McuToSoc::Voltage { mv } => mv.to_be_bytes().to_vec(),
            _ => vec![],
        }
    }
    pub fn from_wire(t: u8, p: &[u8]) -> Option<Self> {
        match t {
            wire::ACK => Some(McuToSoc::Ack),
            wire::ACC => Some(McuToSoc::Acc { on: *p.first()? != 0 }),
            wire::VOLTAGE => Some(McuToSoc::Voltage {
                mv: u16::from_be_bytes([*p.first()?, *p.get(1)?]),
            }),
            wire::CUT_WARNING => Some(McuToSoc::CutWarning),
            _ => None,
        }
    }
}
  • Step 2: codec.rs:
//! SoC↔MCU frame + link protection (B08, spec v0.4 §6.2). Frame: [SYNC][LEN][SEQ][TYPE][PAYLOAD..][CRC16].
//! CRC16-CCITT over LEN..=PAYLOAD. Decoder: resync on SYNC, drop on bad CRC, drop replays (seq == last).

pub const SYNC: u8 = 0xA5;

pub fn crc16_ccitt(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &b in data {
        crc ^= (b as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

pub fn encode_frame(seq: u8, msg_type: u8, payload: &[u8]) -> Vec<u8> {
    let mut body = Vec::with_capacity(3 + payload.len());
    body.push(payload.len() as u8); // LEN
    body.push(seq); // SEQ
    body.push(msg_type); // TYPE
    body.extend_from_slice(payload);
    let crc = crc16_ccitt(&body);
    let mut frame = Vec::with_capacity(1 + body.len() + 2);
    frame.push(SYNC);
    frame.extend_from_slice(&body);
    frame.push((crc >> 8) as u8);
    frame.push((crc & 0xff) as u8);
    frame
}

#[derive(Debug, PartialEq, Eq)]
pub struct DecodedFrame {
    pub seq: u8,
    pub msg_type: u8,
    pub payload: Vec<u8>,
}

/// Streaming decoder: accumulates bytes and yields valid frames. Resync and the replay guard live inside.
#[derive(Default)]
pub struct FrameDecoder {
    buf: Vec<u8>,
    last_seq: Option<u8>,
}

impl FrameDecoder {
    pub fn push(&mut self, bytes: &[u8]) -> Vec<DecodedFrame> {
        self.buf.extend_from_slice(bytes);
        let mut out = Vec::new();
        loop {
            // resync: drop garbage up to SYNC
            while !self.buf.is_empty() && self.buf[0] != SYNC {
                self.buf.remove(0);
            }
            if self.buf.len() < 4 {
                break; // need at least SYNC+LEN+SEQ+TYPE
            }
            let len = self.buf[1] as usize;
            let frame_len = 1 + 3 + len + 2;
            if self.buf.len() < frame_len {
                break; // frame not fully received yet
            }
            let body = &self.buf[1..1 + 3 + len];
            let crc_rx = ((self.buf[1 + 3 + len] as u16) << 8) | self.buf[1 + 3 + len + 1] as u16;
            if crc_rx != crc16_ccitt(body) {
                self.buf.remove(0); // bad CRC → shift by one byte, resync
                continue;
            }
            let seq = self.buf[2];
            let msg_type = self.buf[3];
            let payload = self.buf[4..4 + len].to_vec();
            self.buf.drain(0..frame_len);
            let replay = matches!(self.last_seq, Some(l) if l == seq);
            self.last_seq = Some(seq);
            if !replay {
                out.push(DecodedFrame { seq, msg_type, payload });
            }
        }
        out
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn roundtrip() {
        let f = encode_frame(7, 0x02, &[42]);
        let mut d = FrameDecoder::default();
        let out = d.push(&f);
        assert_eq!(out, vec![DecodedFrame { seq: 7, msg_type: 0x02, payload: vec![42] }]);
    }

    #[test]
    fn corruption_dropped_then_resyncs() {
        let mut d = FrameDecoder::default();
        let mut f = encode_frame(1, 0x01, &[]);
        f[4] ^= 0xFF; // flip the CRC high byte (the payload is empty) → bad CRC
        assert_eq!(d.push(&f), vec![]); // dropped
        let g = encode_frame(2, 0x01, &[]);
        assert_eq!(d.push(&g), vec![DecodedFrame { seq: 2, msg_type: 0x01, payload: vec![] }]);
    }

    #[test]
    fn replay_dropped() {
        let mut d = FrameDecoder::default();
        let f = encode_frame(5, 0x01, &[]);
        assert_eq!(d.push(&f).len(), 1);
        assert_eq!(d.push(&f), vec![]); // same seq → replay drop
    }

    #[test]
    fn desync_garbage_before_sync() {
        let mut d = FrameDecoder::default();
        let mut stream = vec![0x00, 0xFF, 0x13]; // garbage
        stream.extend_from_slice(&encode_frame(9, 0x83, &[0x2B, 0x67]));
        let out = d.push(&stream);
        assert_eq!(out, vec![DecodedFrame { seq: 9, msg_type: 0x83, payload: vec![0x2B, 0x67] }]);
    }
}
  • Step 3 (lib.rs): add pub mod protocol; and pub mod codec;.

  • Step 4: run the tests. Run: cargo test -p shturman-power codec. Expected: PASS (4 tests).

  • Step 5: commit.

git add crates/core/shturman-power/src/protocol.rs crates/core/shturman-power/src/codec.rs crates/core/shturman-power/src/lib.rs
git commit -s -m "feat(v0.4): SoC↔MCU protocol + codec (CRC/replay/desync guard, B08)"
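
The CRC parameters used by the codec in P8.3 (polynomial 0x1021, initial value 0xFFFF, MSB first, no final XOR) match the variant commonly catalogued as CRC-16/CCITT-FALSE, whose published check value for the ASCII string "123456789" is 0x29B1. The standalone sketch below restates the CRC and the frame layout so they can be checked outside the crate; `frame_crc_ok` is a hypothetical helper that does not exist in the plan.

```rust
const SYNC: u8 = 0xA5;

/// CRC-16/CCITT-FALSE: polynomial 0x1021, init 0xFFFF, MSB first, no final XOR.
fn crc16_ccitt(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &b in data {
        crc ^= (b as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

/// Same layout as the plan's `encode_frame`:
/// [SYNC][LEN][SEQ][TYPE][PAYLOAD..][CRC16 big-endian].
fn encode_frame(seq: u8, msg_type: u8, payload: &[u8]) -> Vec<u8> {
    let mut frame = vec![SYNC, payload.len() as u8, seq, msg_type];
    frame.extend_from_slice(payload);
    let crc = crc16_ccitt(&frame[1..]); // CRC covers LEN..=PAYLOAD, not SYNC
    frame.extend_from_slice(&crc.to_be_bytes());
    frame
}

/// Hypothetical helper: true if `frame` is exactly one frame with a matching CRC.
fn frame_crc_ok(frame: &[u8]) -> bool {
    if frame.len() < 6 || frame[0] != SYNC || frame.len() != 6 + frame[1] as usize {
        return false;
    }
    // Everything after SYNC splits into the CRC-covered body and the 2 CRC bytes.
    let (body, crc) = frame[1..].split_at(frame.len() - 3);
    crc16_ccitt(body) == u16::from_be_bytes([crc[0], crc[1]])
}
```

With this layout, `encode_frame(7, 0x02, &[42])` is seven bytes long and starts with A5 01 07 02 2A, followed by the two CRC bytes. Flipping any single bit inside the covered region makes `frame_crc_ok` return false.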

P8.4: coprocessor (Coprocessor trait + mock + client + B09 model)

Files: Create crates/core/shturman-power/src/coprocessor.rs; Modify lib.rs.

  • Step 1: coprocessor.rs:
//! SoC↔MCU coprocessor (B08/B09, spec v0.4 §6.3–§6.4). The transport is byte-level (the codec really runs).
//! `MockCoprocessor` models the MCU + the independent fail-safe timer (B09). The prod `SerialCoprocessor` is a stub (UART → HW).

use crate::codec::{encode_frame, FrameDecoder};
use crate::protocol::{McuToSoc, SocToMcu};
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Timings are placeholders (seconds). Hold-up tuning: hardware §3 (RK3588).
pub const HEARTBEAT_SECS: u64 = 1;
pub const FAILSAFE_MISS: u64 = 3; // heartbeat windows missed while running → the SoC is hung
pub const HOLDUP_BUDGET_SECS: u64 = 5; // shutdown without safe-to-cut for longer than this → cut
pub const BROWNOUT_MV: u16 = 11_000; // under-voltage backstop (placeholder, hardware §3)

/// Byte-level link to the MCU. `failsafe_due`/`set_now` model B09 (prod: a real MCU in hardware → defaults).
pub trait Coprocessor: Send + Sync {
    fn tx(&self, bytes: &[u8]); // SoC → MCU
    fn rx(&self) -> Vec<u8>; // MCU → SoC (drain)
    fn failsafe_due(&self) -> bool {
        false
    }
    fn set_now(&self, _secs: u64) {}
}

/// SoC side: heartbeat / shutdown-imminent / safe-to-cut + decoding of incoming MCU→SoC messages.
pub struct CoprocessorClient {
    link: Arc<dyn Coprocessor>,
    seq: u8,
    decoder: FrameDecoder,
}

impl CoprocessorClient {
    pub fn new(link: Arc<dyn Coprocessor>) -> Self {
        Self { link, seq: 0, decoder: FrameDecoder::default() }
    }
    fn send(&mut self, msg: SocToMcu) {
        self.seq = self.seq.wrapping_add(1);
        let f = encode_frame(self.seq, msg.wire_type(), &msg.payload());
        self.link.tx(&f);
    }
    pub fn heartbeat(&mut self) {
        self.send(SocToMcu::Heartbeat);
    }
    pub fn shutdown_imminent(&mut self, budget: u8) {
        self.send(SocToMcu::ShutdownImminent { budget });
    }
    pub fn safe_to_cut(&mut self) {
        self.send(SocToMcu::SafeToCut);
    }
    pub fn poll(&mut self) -> Vec<McuToSoc> {
        let bytes = self.link.rx();
        self.decoder
            .push(&bytes)
            .into_iter()
            .filter_map(|f| McuToSoc::from_wire(f.msg_type, &f.payload))
            .collect()
    }
}

#[derive(Default)]
struct MockState {
    soc_decoder: FrameDecoder,
    mcu_seq: u8,
    out: Vec<u8>,
    last_heartbeat: u64,
    shutdown_at: Option<u64>,
    safe_to_cut: bool,
    hung: bool,
    now: u64,
}

/// Mock MCU: decodes SoC frames (through the real codec), models the B09 timer, emits MCU→SoC.
#[derive(Clone, Default)]
pub struct MockCoprocessor {
    st: Arc<Mutex<MockState>>,
}

impl MockCoprocessor {
    pub fn new() -> Self {
        Self::default()
    }
    /// MCU → SoC (the dev-mock feeds ACC/voltage from here).
    pub fn emit(&self, msg: McuToSoc) {
        let mut s = self.st.lock().unwrap();
        s.mcu_seq = s.mcu_seq.wrapping_add(1);
        let f = encode_frame(s.mcu_seq, msg.wire_type(), &msg.payload());
        s.out.extend_from_slice(&f);
    }
    /// `HangSoc()`: the SoC is "hung", so heartbeats no longer refresh the timer → B09 fires.
    pub fn hang(&self) {
        self.st.lock().unwrap().hung = true;
    }
}

impl Coprocessor for MockCoprocessor {
    fn tx(&self, bytes: &[u8]) {
        let mut s = self.st.lock().unwrap();
        let now = s.now;
        for f in s.soc_decoder.push(bytes) {
            if let Some(msg) = SocToMcu::from_wire(f.msg_type, &f.payload) {
                match msg {
                    SocToMcu::Heartbeat => {
                        if !s.hung {
                            s.last_heartbeat = now;
                        }
                    }
                    SocToMcu::ShutdownImminent { .. } => s.shutdown_at = Some(now),
                    SocToMcu::SafeToCut => s.safe_to_cut = true,
                }
            }
        }
    }
    fn rx(&self) -> Vec<u8> {
        std::mem::take(&mut self.st.lock().unwrap().out)
    }
    fn failsafe_due(&self) -> bool {
        let s = self.st.lock().unwrap();
        match s.shutdown_at {
            // running: heartbeat silence longer than FAILSAFE_MISS windows → the SoC is hung → cut
            None => s.last_heartbeat != 0 && s.now.saturating_sub(s.last_heartbeat) > FAILSAFE_MISS * HEARTBEAT_SECS,
            // shutdown: the budget expired without safe-to-cut → cut
            Some(t0) => !s.safe_to_cut && s.now.saturating_sub(t0) > HOLDUP_BUDGET_SECS,
        }
    }
    fn set_now(&self, secs: u64) {
        self.st.lock().unwrap().now = secs;
    }
}

/// Prod stub: the real UART/I2C is a HW bring-up sub-phase. Not active in v0.4 (there is no prod event source).
#[derive(Default)]
pub struct SerialCoprocessor;
impl SerialCoprocessor {
    pub fn new() -> Self {
        Self
    }
}
impl Coprocessor for SerialCoprocessor {
    fn tx(&self, _bytes: &[u8]) {}
    fn rx(&self) -> Vec<u8> {
        Vec::new()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn client_heartbeat_decoded_by_mock() {
        let mock = Arc::new(MockCoprocessor::new());
        mock.set_now(1);
        let mut client = CoprocessorClient::new(mock.clone());
        client.heartbeat();
        // the heartbeat refreshed the timer → no failsafe
        assert!(!mock.failsafe_due());
    }

    #[test]
    fn mcu_to_soc_roundtrip() {
        let mock = Arc::new(MockCoprocessor::new());
        let mut client = CoprocessorClient::new(mock.clone());
        mock.emit(McuToSoc::Acc { on: true });
        mock.emit(McuToSoc::Voltage { mv: 13_800 });
        assert_eq!(client.poll(), vec![McuToSoc::Acc { on: true }, McuToSoc::Voltage { mv: 13_800 }]);
    }

    #[test]
    fn failsafe_on_soc_hang() {
        let mock = Arc::new(MockCoprocessor::new());
        let mut client = CoprocessorClient::new(mock.clone());
        mock.set_now(1);
        client.heartbeat(); // last_heartbeat = 1
        mock.hang(); // the SoC is hung
        mock.set_now(2);
        client.heartbeat(); // hung → the timer is NOT refreshed
        assert!(!mock.failsafe_due()); // 2 − 1 = 1, ≤ 3
        mock.set_now(5); // 5 − 1 = 4 > FAILSAFE_MISS * HEARTBEAT_SECS (3)
        assert!(mock.failsafe_due());
    }

    #[test]
    fn failsafe_on_holdup_budget() {
        let mock = Arc::new(MockCoprocessor::new());
        let mut client = CoprocessorClient::new(mock.clone());
        mock.set_now(10);
        client.shutdown_imminent(2); // shutdown_at = 10
        mock.set_now(14);
        assert!(!mock.failsafe_due()); // 14 − 10 = 4 ≤ 5
        mock.set_now(16); // > HOLDUP_BUDGET (5)
        assert!(mock.failsafe_due());
        // safe-to-cut clears the failsafe
        mock.set_now(11);
        client.safe_to_cut();
        mock.set_now(20);
        assert!(!mock.failsafe_due());
    }
}
  • Step 2 (lib.rs): add pub mod coprocessor;.

  • Step 3: run the tests. Run: cargo test -p shturman-power coprocessor. Expected: PASS (4 tests).

  • Step 4: commit.

git add crates/core/shturman-power/src/coprocessor.rs crates/core/shturman-power/src/lib.rs
git commit -s -m "feat(v0.4): Coprocessor trait + MockCoprocessor (B09 model) + client (B08)"
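
The B09 decision in `MockCoprocessor::failsafe_due` depends only on three pieces of state and the current time, so its boundaries are easier to see as a pure function. The sketch below restates it under that assumption; `McuView` is a hypothetical struct introduced for this illustration, and the constants repeat the placeholder values from coprocessor.rs.

```rust
const HEARTBEAT_SECS: u64 = 1;
const FAILSAFE_MISS: u64 = 3;
const HOLDUP_BUDGET_SECS: u64 = 5;

/// What the MCU model knows about the SoC (hypothetical, for this sketch only).
#[derive(Clone, Copy)]
struct McuView {
    last_heartbeat: u64,      // 0 = no heartbeat seen yet
    shutdown_at: Option<u64>, // set when shutdown-imminent arrives
    safe_to_cut: bool,        // set when safe-to-cut arrives
}

/// Pure restatement of the fail-safe rule: should the MCU cut power at `now`?
fn failsafe_due(v: McuView, now: u64) -> bool {
    match v.shutdown_at {
        // Running: the SoC has been silent for more than FAILSAFE_MISS heartbeat windows.
        None => {
            v.last_heartbeat != 0
                && now.saturating_sub(v.last_heartbeat) > FAILSAFE_MISS * HEARTBEAT_SECS
        }
        // Shutting down: the hold-up budget ran out before safe-to-cut arrived.
        Some(t0) => !v.safe_to_cut && now.saturating_sub(t0) > HOLDUP_BUDGET_SECS,
    }
}
```

Both comparisons are strict. With a last heartbeat at t = 1 the cut becomes due at t = 5, not t = 4, and with shutdown-imminent at t = 10 it becomes due at t = 16. In this model a SoC that has never sent a heartbeat is not cut while running, because of the `last_heartbeat != 0` guard.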

P8.5: thermal (sources/throttler + ThermalMonitor)

Files: Modify crates/core/shturman-power/src/thermal.rs (part 2).

  • Step 1: extend thermal.rs (after the policy; before #[cfg(test)] mod policy_tests):
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;

/// Temperature source (°C). real = sysfs; VM = mock.
pub trait TempSource: Send + Sync {
    fn read_celsius(&self) -> i32;
}

/// Mock source (dev): the temperature is set by `SetTemp` over dev D-Bus.
#[derive(Clone)]
pub struct MockTempSource {
    temp: Arc<AtomicI32>,
}
impl MockTempSource {
    pub fn new(init_c: i32) -> Self {
        Self { temp: Arc::new(AtomicI32::new(init_c)) }
    }
    pub fn set(&self, c: i32) {
        self.temp.store(c, Ordering::Relaxed);
    }
}
impl TempSource for MockTempSource {
    fn read_celsius(&self) -> i32 {
        self.temp.load(Ordering::Relaxed)
    }
}

/// Prod: max over `/sys/class/thermal/thermal_zone*/temp` (millidegrees). In Lima the zones are static → real numbers come on RK3588.
#[derive(Default)]
pub struct SysfsTempSource;
impl SysfsTempSource {
    pub fn new() -> Self {
        Self
    }
}
impl TempSource for SysfsTempSource {
    fn read_celsius(&self) -> i32 {
        let mut max = i32::MIN;
        if let Ok(rd) = std::fs::read_dir("/sys/class/thermal") {
            for e in rd.flatten() {
                let p = e.path().join("temp");
                if let Ok(s) = std::fs::read_to_string(&p) {
                    if let Ok(milli) = s.trim().parse::<i32>() {
                        max = max.max(milli / 1000);
                    }
                }
            }
        }
        if max == i32::MIN {
            0
        } else {
            max
        }
    }
}

/// Applies the throttle. real = cpufreq (HW); VM = records the level (no-op effect).
pub trait Throttler: Send + Sync {
    fn apply(&self, level: ThermalLevel);
}

/// VM/prod skeleton: logs and remembers the last level (real cpufreq is HW).
#[derive(Default, Clone)]
pub struct NoopThrottler {
    last: Arc<std::sync::Mutex<Option<ThermalLevel>>>,
}
impl NoopThrottler {
    pub fn last(&self) -> Option<ThermalLevel> {
        *self.last.lock().unwrap()
    }
}
impl Throttler for NoopThrottler {
    fn apply(&self, level: ThermalLevel) {
        *self.last.lock().unwrap() = Some(level);
        tracing::info!("thermal: throttle level {} (cpufreq effect is HW)", level.as_str());
    }
}

/// Result of one monitor step: the level + the Critical entry/exit edges (for FSM ThermalTrip/ThermalCleared).
#[derive(Debug, PartialEq, Eq)]
pub struct ThermalObservation {
    pub level: ThermalLevel,
    pub changed: bool,
    pub entered_critical: bool,
    pub left_critical: bool,
}

/// Monitor: keeps the previous level, applies the policy, marks the Critical edges.
pub struct ThermalMonitor {
    prev: ThermalLevel,
}
impl Default for ThermalMonitor {
    fn default() -> Self {
        Self { prev: ThermalLevel::Normal }
    }
}
impl ThermalMonitor {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn observe(&mut self, temp_c: i32) -> ThermalObservation {
        let level = ThermalPolicy::next(self.prev, temp_c);
        let changed = level != self.prev;
        let entered_critical = changed && level == ThermalLevel::Critical;
        let left_critical = changed && self.prev == ThermalLevel::Critical;
        self.prev = level;
        ThermalObservation { level, changed, entered_critical, left_critical }
    }
}
  • Step 2: monitor tests. Add a separate test module to thermal.rs:
#[cfg(test)]
mod monitor_tests {
    use super::*;

    #[test]
    fn marks_critical_edges() {
        let mut m = ThermalMonitor::new();
        let o = m.observe(96);
        assert!(o.entered_critical && o.changed && o.level == ThermalLevel::Critical);
        let o = m.observe(96); // holds, so no edges
        assert!(!o.changed && !o.entered_critical);
        let o = m.observe(80); // < 90 → leaves critical
        assert!(o.left_critical && o.level == ThermalLevel::Throttle(1));
    }

    #[test]
    fn mock_source_and_noop_throttler() {
        let src = MockTempSource::new(20);
        assert_eq!(src.read_celsius(), 20);
        src.set(88);
        assert_eq!(src.read_celsius(), 88);
        let th = NoopThrottler::default();
        th.apply(ThermalLevel::Throttle(1));
        assert_eq!(th.last(), Some(ThermalLevel::Throttle(1)));
    }
}
  • Step 3: run the tests. Run: cargo test -p shturman-power thermal. Expected: PASS (policy 3 + monitor 2).

  • Step 4: commit.

git add crates/core/shturman-power/src/thermal.rs
git commit -s -m "feat(v0.4): TempSource/Throttler abstractions + ThermalMonitor (A12/B10)"
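
`SysfsTempSource` from P8.5 reads a fixed path, which makes it hard to unit-test off the target. One possible way to exercise the same max-over-zones logic is to inject the root directory. The function below is a hypothetical variant written for this illustration, not the plan's API: `max_zone_celsius` and its `Option` return type are assumptions, and unlike the original it filters on the `thermal_zone` name prefix explicitly.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical variant of the sysfs reader with an injectable root directory.
/// Returns `None` when no `thermal_zone*` entry yields a parsable value.
fn max_zone_celsius(root: &Path) -> Option<i32> {
    let mut max: Option<i32> = None;
    for entry in fs::read_dir(root).ok()?.flatten() {
        // Skip cooling_device* and anything else that is not a thermal zone.
        if !entry.file_name().to_string_lossy().starts_with("thermal_zone") {
            continue;
        }
        let Ok(raw) = fs::read_to_string(entry.path().join("temp")) else { continue };
        let Ok(milli) = raw.trim().parse::<i32>() else { continue };
        let celsius = milli / 1000; // sysfs reports millidegrees Celsius
        max = Some(max.map_or(celsius, |m| m.max(celsius)));
    }
    max
}
```

Returning `Option` keeps "no sensor found" distinct from a genuine reading of 0 °C; the plan's `read_celsius` folds both into 0, which the policy treats as normal. Whether that distinction is worth a trait change is a design decision left to the spec.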

P8.6: service wiring (D-Bus + loops + dev-mock)

Files: Modify service.rs, main.rs, crates/shturman-ipc/src/proxy.rs.

  • Step 1 (service.rs): service fields + handles. Add the thermal_state field and the handles to PowerService:
use crate::thermal::ThermalLevel;
// ... in the use block, alongside the existing imports

Replace the struct/Default/new/mock block with:

pub struct PowerService {
    fsm: Arc<Mutex<PowerFsm>>,
    thermal_state: Arc<Mutex<ThermalLevel>>,
}

impl Default for PowerService {
    fn default() -> Self {
        Self {
            fsm: Arc::new(Mutex::new(PowerFsm::new())),
            thermal_state: Arc::new(Mutex::new(ThermalLevel::Normal)),
        }
    }
}

impl PowerService {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn power_state(&self) -> shturman_ipc::types::PowerState {
        self.fsm.lock().unwrap().power_state()
    }
    pub fn ignition(&self) -> shturman_ipc::types::IgnitionState {
        self.fsm.lock().unwrap().ignition()
    }
    pub fn source(&self) -> shturman_ipc::types::PowerSource {
        self.fsm.lock().unwrap().source()
    }
    pub fn fsm_handle(&self) -> Arc<Mutex<PowerFsm>> {
        Arc::clone(&self.fsm)
    }
    pub fn thermal_state_handle(&self) -> Arc<Mutex<ThermalLevel>> {
        Arc::clone(&self.thermal_state)
    }

    #[cfg(feature = "dev-mocks")]
    pub fn mock(
        &self,
        temp: Arc<crate::thermal::MockTempSource>,
        copro: Arc<crate::coprocessor::MockCoprocessor>,
    ) -> PowerMock {
        PowerMock { fsm: Arc::clone(&self.fsm), temp, copro }
    }
}
  • Step 2 (service.rs): Action::Cut in apply_event. Add an arm to match a { ... }:
            Action::Cut => {
                tracing::warn!("power: MCU fail-safe cut (SoC hang / hold-up budget) — forced off");
            }
  • Step 3 (service.rs): D-Bus property + signal. Add to the #[interface(name = "ru.shturman.Power1")] block:
    #[zbus(property)]
    async fn thermal_state(&self) -> String {
        self.thermal_state.lock().unwrap().as_str().to_string()
    }

    #[zbus(signal)]
    async fn thermal_changed(ctx: &SignalContext<'_>, state: &str, celsius: i32) -> zbus::Result<()>;
  • Step 4 (service.rs): spawn_loops (a free function, after apply_event):
use crate::coprocessor::{Coprocessor, CoprocessorClient, BROWNOUT_MV};
use crate::fsm::{Phase, State};
use crate::protocol::McuToSoc;
use crate::thermal::{TempSource, ThermalMonitor, Throttler};

/// v0.4 background loops: thermal monitor + coprocessor (heartbeat/wait/safe-to-cut/B09). Monotonic clock.
#[allow(clippy::too_many_arguments)]
pub fn spawn_loops(
    fsm: Arc<Mutex<PowerFsm>>,
    thermal_state: Arc<Mutex<crate::thermal::ThermalLevel>>,
    temp: Arc<dyn TempSource>,
    throttler: Arc<dyn Throttler>,
    copro: Arc<dyn Coprocessor>,
    ctx: SignalContext<'static>,
) {
    // thermal loop
    {
        let (fsm, ctx) = (Arc::clone(&fsm), ctx.clone());
        tokio::spawn(async move {
            let mut mon = ThermalMonitor::new();
            loop {
                tokio::time::sleep(Duration::from_secs(crate::thermal::POLL_SECS)).await;
                // read the sensor once so the signal reports the same value the policy saw
                let celsius = temp.read_celsius();
                let obs = mon.observe(celsius);
                if !obs.changed {
                    continue;
                }
                throttler.apply(obs.level);
                *thermal_state.lock().unwrap() = obs.level;
                let _ = PowerService::thermal_changed(
                    &ctx,
                    obs.level.as_str(),
                    celsius,
                )
                .await;
                if obs.entered_critical {
                    let _ = apply_event(&fsm, Event::ThermalTrip, &ctx).await;
                }
                if obs.left_critical {
                    let in_thermal_abortable = matches!(
                        fsm.lock().unwrap().state(),
                        State::ShuttingDown { phase: Phase::Abortable, reason: shturman_ipc::types::ShutdownReason::Thermal }
                    );
                    if in_thermal_abortable {
                        let _ = apply_event(&fsm, Event::ThermalCleared, &ctx).await;
                    }
                }
            }
        });
    }
    // coprocessor loop (heartbeat / wait-for-completion / safe-to-cut / B09 fail-safe)
    {
        tokio::spawn(async move {
            let mut client = CoprocessorClient::new(Arc::clone(&copro));
            let mut last_committed = false;
            let mut last_shutting = false;
            loop {
                tokio::time::sleep(Duration::from_secs(crate::coprocessor::HEARTBEAT_SECS)).await;
                copro.set_now(monotonic_secs());
                // incoming MCU→SoC messages → FSM
                for msg in client.poll() {
                    match msg {
                        McuToSoc::Acc { on } => {
                            let ev = if on { Event::AccOn } else { Event::AccOff };
                            let _ = apply_event(&fsm, ev, &ctx).await;
                        }
                        McuToSoc::Voltage { mv } if mv < BROWNOUT_MV => {
                            let _ = apply_event(&fsm, Event::UnderVoltage, &ctx).await;
                        }
                        _ => {}
                    }
                }
                // FSM state edges → protocol
                let st = fsm.lock().unwrap().state();
                let shutting = matches!(st, State::ShuttingDown { .. });
                let committed = matches!(st, State::ShuttingDown { phase: Phase::Committed, .. });
                if shutting && !last_shutting {
                    client.shutdown_imminent(crate::coprocessor::HOLDUP_BUDGET_SECS as u8);
                } else if committed && !last_committed {
                    client.safe_to_cut(); // PONR → MCU cuts immediately
                } else if !shutting {
                    client.heartbeat(); // running/accessory: keepalive
                }
                last_shutting = shutting;
                last_committed = committed;
                // B09: independent fail-safe timer (hung SoC / budget expired)
                if copro.failsafe_due() {
                    let _ = apply_event(&fsm, Event::FailsafeCut, &ctx).await;
                }
            }
        });
    }
}

(Add use std::time::Duration; if it is not there yet; v0.3 already has it. monotonic_secs is already imported.)
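The branch order in the coprocessor loop decides which protocol message goes out on each tick, so it can help to check it in isolation. Below is a minimal sketch of that decision as a pure function. `Cmd`, `Flags` and `next_cmd` are hypothetical names introduced only for this illustration; the real loop calls the `CoprocessorClient` methods directly.

```rust
/// Messages the loop may send on one tick (hypothetical enum, mirrors the client methods).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Cmd {
    ShutdownImminent,
    SafeToCut,
    Heartbeat,
}

/// The two FSM facts the loop looks at (hypothetical helper type).
#[derive(Debug, Default, Clone, Copy)]
struct Flags {
    shutting: bool,
    committed: bool,
}

/// Same branch order as the loop: rising edge of `shutting`, then rising edge of
/// `committed`, then keepalive while not shutting down. `None` means nothing is sent.
fn next_cmd(prev: Flags, now: Flags) -> Option<Cmd> {
    if now.shutting && !prev.shutting {
        Some(Cmd::ShutdownImminent)
    } else if now.committed && !prev.committed {
        Some(Cmd::SafeToCut)
    } else if !now.shutting {
        Some(Cmd::Heartbeat)
    } else {
        None
    }
}

fn main() {
    let running = Flags::default();
    let abortable = Flags { shutting: true, committed: false };
    let committed = Flags { shutting: true, committed: true };

    assert_eq!(next_cmd(running, running), Some(Cmd::Heartbeat));
    assert_eq!(next_cmd(running, abortable), Some(Cmd::ShutdownImminent));
    assert_eq!(next_cmd(abortable, abortable), None);
    assert_eq!(next_cmd(abortable, committed), Some(Cmd::SafeToCut));
    // Both edges inside one tick: only ShutdownImminent is selected.
    assert_eq!(next_cmd(running, committed), Some(Cmd::ShutdownImminent));
}
```

The last assertion shows a corner worth a unit test: if the FSM can go from running to Committed within a single heartbeat period, the loop sends shutdown_imminent, records last_committed = true, and safe_to_cut is never sent. Whether that can happen depends on the grace window configured in the FSM.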

  • Step 5: service.rs, extend PowerMock. Replace the PowerMock struct and add methods:
#[cfg(feature = "dev-mocks")]
pub struct PowerMock {
    fsm: Arc<Mutex<PowerFsm>>,
    temp: Arc<crate::thermal::MockTempSource>,
    copro: Arc<crate::coprocessor::MockCoprocessor>,
}

In the #[interface(name = "ru.shturman.dev.PowerMock1")] impl PowerMock, add (next to the existing SetAcc/…):

    /// Set the temperature (°C); the thermal monitor picks it up on the next poll.
    async fn set_temp(&self, celsius: i32) {
        self.temp.set(celsius);
    }

    /// "SoC hang": heartbeats stop refreshing the B09 timer, so the MCU will cut power.
    async fn hang_soc(&self) {
        self.copro.hang();
    }

    /// Link silence: the SoC side degrades (log only, no self-cut; that is a red line). The MCU cut-vs-hold policy is B §12/HW.
    async fn mcu_link_loss(&self) {
        tracing::warn!("coprocessor: MCU link loss; SoC degrades (cut-vs-hold policy: HW/§12)");
    }
  • Step 6: thermal.rs, add POLL_SECS. Next to the thresholds:
pub const POLL_SECS: u64 = 1; // temperature polling period (monotonic clock)
  • Step 7: main.rs, build the sources + spawn. Replace the body of main with:
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    init_tracing("shturman-power");
    let conn = connect().await?;
    let svc = PowerService::new();
    let fsm = svc.fsm_handle();
    let thermal_state = svc.thermal_state_handle();

    #[cfg(feature = "dev-mocks")]
    let temp = std::sync::Arc::new(shturman_power::thermal::MockTempSource::new(20));
    #[cfg(not(feature = "dev-mocks"))]
    let temp = std::sync::Arc::new(shturman_power::thermal::SysfsTempSource::new());
    let throttler = std::sync::Arc::new(shturman_power::thermal::NoopThrottler::default());
    #[cfg(feature = "dev-mocks")]
    let copro = std::sync::Arc::new(shturman_power::coprocessor::MockCoprocessor::new());
    #[cfg(not(feature = "dev-mocks"))]
    let copro = std::sync::Arc::new(shturman_power::coprocessor::SerialCoprocessor::new());

    #[cfg(feature = "dev-mocks")]
    let mock = svc.mock(temp.clone(), copro.clone());

    conn.object_server().at(names::power::PATH, svc).await?;
    #[cfg(feature = "dev-mocks")]
    conn.object_server().at(names::power::PATH, mock).await?;
    conn.request_name(names::power::NAME).await?;

    let iface = conn
        .object_server()
        .interface::<_, PowerService>(names::power::PATH)
        .await?;
    let ctx = iface.signal_context().to_owned();
    shturman_power::spawn_loops(
        fsm,
        thermal_state,
        temp as std::sync::Arc<dyn shturman_power::thermal::TempSource>,
        throttler as std::sync::Arc<dyn shturman_power::thermal::Throttler>,
        copro as std::sync::Arc<dyn shturman_power::coprocessor::Coprocessor>,
        ctx,
    );

    tracing::info!("ru.shturman.Power1 is on the bus (FSM + thermal + coprocessor)");
    std::future::pending::<()>().await;
    Ok(())
}
  • Step 8: proxy.rs, add to Power1Proxy. Next to the existing property/signal:
    #[zbus(property)]
    fn thermal_state(&self) -> zbus::Result<String>;

    #[zbus(signal)]
    fn thermal_changed(&self, state: String, celsius: i32) -> zbus::Result<()>;
  • Step 9: run unit tests + lint. Run: cargo test -p shturman-power && cargo clippy -p shturman-power --all-targets -- -D warnings. Expected: PASS (all v0.3 + v0.4 unit tests; clippy clean).

  • Step 10: commit.

git add crates/core/shturman-power/src/service.rs crates/core/shturman-power/src/main.rs crates/shturman-ipc/src/proxy.rs crates/core/shturman-power/src/thermal.rs
git commit -s -m "feat(v0.4): wire thermal+coprocessor loops + D-Bus ThermalState/ThermalChanged + dev-mock"
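The thermal loop wired in Step 4 depends on observe reporting changed, entered_critical and left_critical without oscillating around a threshold. The following is a minimal sketch of such a hysteresis policy, not the project's ThermalMonitor. The 85 and 95 °C band edges come from the E2E comment in this plan; the 5 °C hysteresis margin (`HYST_C`) and the `Normal`/`Critical` variant names are assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThermalLevel {
    Normal,
    Throttle,
    Critical,
}

const THROTTLE_C: i32 = 85; // start of the throttle band (from the E2E comment)
const CRITICAL_C: i32 = 95; // end of the throttle band, trip point
const HYST_C: i32 = 5; // assumed margin: a level is left only this far below its threshold

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Observation {
    level: ThermalLevel,
    changed: bool,
    entered_critical: bool,
    left_critical: bool,
}

struct ThermalMonitor {
    level: ThermalLevel,
}

impl ThermalMonitor {
    fn new() -> Self {
        Self { level: ThermalLevel::Normal }
    }

    /// Rising thresholds apply immediately; falling ones are lowered by HYST_C.
    fn observe(&mut self, celsius: i32) -> Observation {
        use ThermalLevel::*;
        let next = match self.level {
            _ if celsius >= CRITICAL_C => Critical,
            Critical if celsius >= CRITICAL_C - HYST_C => Critical,
            Normal if celsius < THROTTLE_C => Normal,
            _ if celsius < THROTTLE_C - HYST_C => Normal,
            _ => Throttle,
        };
        let prev = self.level;
        self.level = next;
        Observation {
            level: next,
            changed: next != prev,
            entered_critical: next == Critical && prev != Critical,
            left_critical: prev == Critical && next != Critical,
        }
    }
}

fn main() {
    let mut mon = ThermalMonitor::new();
    assert_eq!(mon.observe(88).level, ThermalLevel::Throttle);
    // 83 °C is below the throttle threshold but inside the hysteresis margin.
    assert!(!mon.observe(83).changed);
    assert!(mon.observe(99).entered_critical);
    assert!(!mon.observe(92).changed);
    let cooled = mon.observe(20);
    assert!(cooled.left_critical && cooled.level == ThermalLevel::Normal);
}
```

With this shape a sensor hovering at 94 to 96 °C produces one entered_critical edge rather than a trip/clear pair on every poll, which is the property the acceptance list calls "no oscillation".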

P8.7: integration tests (session bus)

Files: Modify crates/core/shturman-power/tests/integration.rs.

  • Step 1: add the tests (after the existing v0.3 ones; they use the same server/client pattern + spawn_loops):
#[tokio::test]
#[ignore = "needs a session bus: just test-integration"]
async fn thermal_trip_then_clear() {
    use futures_util::StreamExt;
    use shturman_power::{coprocessor::MockCoprocessor, thermal::{MockTempSource, NoopThrottler}};
    use std::sync::Arc;

    let svc = PowerService::new();
    let fsm = svc.fsm_handle();
    let thermal_state = svc.thermal_state_handle();
    let temp = Arc::new(MockTempSource::new(20));
    let copro = Arc::new(MockCoprocessor::new());

    let server = zbus::Connection::session().await.unwrap();
    server.object_server().at(names::power::PATH, svc).await.unwrap();
    server.request_name(names::power::NAME).await.unwrap();
    let iface = server.object_server().interface::<_, PowerService>(names::power::PATH).await.unwrap();
    let ctx = iface.signal_context().to_owned();
    shturman_power::spawn_loops(
        fsm,
        thermal_state,
        temp.clone() as Arc<dyn shturman_power::thermal::TempSource>,
        Arc::new(NoopThrottler::default()) as Arc<dyn shturman_power::thermal::Throttler>,
        copro as Arc<dyn shturman_power::coprocessor::Coprocessor>,
        ctx,
    );

    let client = zbus::Connection::session().await.unwrap();
    let power = PowerClient::new(&client).await.unwrap();
    let mut imminent = power.proxy().receive_shutdown_imminent().await.unwrap();
    let mut aborted = power.proxy().receive_shutdown_aborted().await.unwrap();

    // overheat → ShutdownImminent(thermal)
    temp.set(99);
    let sig = imminent.next().await.unwrap();
    assert_eq!(sig.args().unwrap().reason(), "thermal");
    assert_eq!(power.power_state().await.unwrap(), PowerState::ShuttingDown);

    // cooled down before PONR → ShutdownAborted + running
    temp.set(20);
    aborted.next().await.unwrap();
    assert_eq!(power.power_state().await.unwrap(), PowerState::Running);
}

#[tokio::test]
#[ignore = "needs a session bus: just test-integration"]
async fn mcu_failsafe_cuts_on_hang() {
    use shturman_power::{coprocessor::MockCoprocessor, thermal::{MockTempSource, NoopThrottler}};
    use std::sync::Arc;

    let svc = PowerService::new();
    let fsm = svc.fsm_handle();
    let thermal_state = svc.thermal_state_handle();
    let temp = Arc::new(MockTempSource::new(20));
    let copro = Arc::new(MockCoprocessor::new());

    let server = zbus::Connection::session().await.unwrap();
    server.object_server().at(names::power::PATH, svc).await.unwrap();
    server.request_name(names::power::NAME).await.unwrap();
    let iface = server.object_server().interface::<_, PowerService>(names::power::PATH).await.unwrap();
    let ctx = iface.signal_context().to_owned();
    shturman_power::spawn_loops(
        fsm,
        thermal_state,
        temp as Arc<dyn shturman_power::thermal::TempSource>,
        Arc::new(NoopThrottler::default()) as Arc<dyn shturman_power::thermal::Throttler>,
        copro.clone() as Arc<dyn shturman_power::coprocessor::Coprocessor>,
        ctx,
    );

    let client = zbus::Connection::session().await.unwrap();
    let power = PowerClient::new(&client).await.unwrap();
    assert_eq!(power.power_state().await.unwrap(), PowerState::Running);

    // let the coprocessor loop send at least one heartbeat (otherwise last_heartbeat=0 and the guard blocks the cut)
    tokio::time::sleep(std::time::Duration::from_millis(1300)).await;
    copro.hang(); // SoC hung → heartbeats no longer refresh the timer
    // wait until the coprocessor loop (HEARTBEAT=1s) accumulates more than FAILSAFE_MISS windows and issues FailsafeCut
    for _ in 0..10 {
        tokio::time::sleep(std::time::Duration::from_millis(700)).await;
        if power.power_state().await.unwrap() == PowerState::Off {
            break;
        }
    }
    assert_eq!(power.power_state().await.unwrap(), PowerState::Off);
}
  • Step 2: run. Run: just test-integration. Expected: PASS (2 v0.3 tests + 2 v0.4 tests = 4; --test-threads=1).

  • Step 3: commit.

git add crates/core/shturman-power/tests/integration.rs
git commit -s -m "test(v0.4): integration: thermal-trip/abort + MCU fail-safe-cut (session bus)"
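The timing in mcu_failsafe_cuts_on_hang only makes sense against a model of the B09 timer, so here is a minimal sketch of one. It assumes the behaviour described in the test comments: the timer is armed only after the first heartbeat, and a cut is due once more than FAILSAFE_MISS heartbeat windows pass without a refresh. The `FailsafeModel` type and the value `FAILSAFE_MISS = 3` are assumptions for illustration; the plan's MockCoprocessor may differ.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering::Relaxed};

const HEARTBEAT_SECS: u64 = 1; // heartbeat period, as in the test comment
const FAILSAFE_MISS: u64 = 3; // assumed number of windows that may be missed

/// Hypothetical stand-in for the MCU-side timer; `&self` methods so it can sit in an Arc.
#[derive(Default)]
struct FailsafeModel {
    now: AtomicU64,
    last_heartbeat: AtomicU64, // 0 means "never armed"
    hung: AtomicBool,
}

impl FailsafeModel {
    /// Feed the current monotonic time in seconds.
    fn set_now(&self, secs: u64) {
        self.now.store(secs, Relaxed);
    }

    /// A heartbeat refreshes the timer unless the SoC is simulated as hung.
    fn heartbeat(&self) {
        if !self.hung.load(Relaxed) {
            self.last_heartbeat.store(self.now.load(Relaxed), Relaxed);
        }
    }

    /// Simulate a hung SoC: later heartbeats are ignored.
    fn hang(&self) {
        self.hung.store(true, Relaxed);
    }

    /// True once the timer is armed and too many windows passed without a refresh.
    fn failsafe_due(&self) -> bool {
        let last = self.last_heartbeat.load(Relaxed);
        let elapsed = self.now.load(Relaxed).saturating_sub(last);
        last != 0 && elapsed > FAILSAFE_MISS * HEARTBEAT_SECS
    }
}

fn main() {
    let mcu = FailsafeModel::default();
    mcu.set_now(10);
    assert!(!mcu.failsafe_due()); // not armed yet: no heartbeat seen
    mcu.heartbeat();
    mcu.set_now(13);
    assert!(!mcu.failsafe_due()); // exactly FAILSAFE_MISS windows: still alive
    mcu.hang();
    mcu.heartbeat(); // ignored while hung
    mcu.set_now(14);
    assert!(mcu.failsafe_due()); // more than FAILSAFE_MISS windows without a refresh
}
```

Under these assumptions the test's initial 1300 ms sleep is what arms the timer, and the polling loop afterwards has to outlast FAILSAFE_MISS heartbeat periods before the FSM can reach Off.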

P8.8: v0.4 E2E block (Lima, hybrid)

Files: Modify tests/e2e/run.sh.

  • Step 1: v0.4 block in run.sh. Insert after the v0.3 power-safe block (after the line pass "watchdog-конфиг на месте"), before # ---- 8. base-бюджеты:
# ---- v0.4: thermal-trip + throttling + MCU fail-safe (mock sensor/MCU) ----
# Each sub-test starts from a clean running power service (a restart resets MockTempSource→20, FSM→running,
# and the start-limit counter). thermal-abort is covered by the integration test (P8.7); it is not run in E2E with the grace window.
info "v0.4: thermal-trip → ShutdownImminent(thermal); throttling band; MCU fail-safe (hang → cut)"
P_RESTART() {  # clean restart of power → running
  sudo systemctl reset-failed shturman-power.service 2>/dev/null || true
  sudo systemctl restart shturman-power.service
  for _ in $(seq 1 10); do systemctl is-active --quiet shturman-power && break; sleep 1; done
  sleep 1.5  # give the loops (poll/heartbeat ~1s) time to start
}

# thermal-trip: SetTemp ≥ critical → ShutdownImminent(thermal) (monitor polls ~1s; catch it before grace-commit)
P_RESTART
mon=$(mktemp)
# shellcheck disable=SC2024
sudo busctl --system monitor "$P_NAME" >"$mon" 2>&1 & M=$!
sleep 0.4; P_CALL SetTemp i 99; sleep 1.6
sudo kill "$M" 2>/dev/null; wait "$M" 2>/dev/null
grep -q ShutdownImminent "$mon" || { cat "$mon"; rm -f "$mon"; fail "thermal: ShutdownImminent not observed"; }
grep -q thermal "$mon" || { cat "$mon"; rm -f "$mon"; fail "thermal: reason != thermal"; }
rm -f "$mon"
pass "thermal-trip: SetTemp≥critical → ShutdownImminent(thermal)"

# throttling band (85..95) → ThermalState=throttle, WITHOUT shutdown
P_RESTART
P_CALL SetTemp i 88; sleep 2
ts=$(busctl --system get-property "$P_NAME" "$P_PATH" "$P_IFACE" ThermalState 2>/dev/null)
echo "$ts" | grep -q throttle || { echo "ThermalState=$ts"; fail "thermal: ThermalState != throttle at 88°C"; }
busctl --system call "$P_NAME" "$P_PATH" "$P_IFACE" GetPowerState | grep -q running || fail "thermal: throttle must not bring the service down"
pass "throttling band: ThermalState=throttle at 88°C, no shutdown"

# MCU fail-safe: HangSoc → heartbeat lost → MCU cuts (FSM → off) deterministically (B09)
P_RESTART
P_CALL HangSoc
ok=0
for _ in $(seq 1 10); do
  sleep 1
  busctl --system call "$P_NAME" "$P_PATH" "$P_IFACE" GetPowerState 2>/dev/null | grep -q off && { ok=1; break; }
done
[ "$ok" = 1 ] || fail "MCU fail-safe: power is not off after HangSoc (B09 did not fire)"
pass "MCU fail-safe: HangSoc → MCU cut (FSM off), deterministic"
P_RESTART  # clean running state for the following blocks

(P_CALL is already defined in the v0.3 block: busctl --system call "$P_NAME" "$P_PATH" "$P_MOCK" "$@". P_IFACE/P_NAME/P_PATH come from the run.sh preamble.)

  • Step 2: shellcheck. Run: shellcheck -S warning tests/e2e/run.sh. Expected: clean.

  • Step 3: commit.

git add tests/e2e/run.sh
git commit -s -m "test(v0.4): E2E block thermal-trip + throttling + MCU fail-safe (mock sensor/MCU)"

P8.9: verify in Lima + acceptance + prod-gate + seams

  • Step 1: host gate. Run: just ci + just test-integration. Expected: exit 0; all Power unit tests + 4 integration tests are green.

  • Step 2: clean E2E. Run: just vm-reset && just e2e. Expected: exit 0; the v0.4 block is green (thermal-trip→ShutdownImminent(thermal), throttle band, HangSoc→off; recovery→abort is covered by the P8.7 integration test, not by E2E); the v0.1–v0.3 regression is green (power-safe N=3, abort, power-cut; machine-id stable); E2E OK ✅.

  • Step 3: iterate on real failures (monitor/heartbeat timings, start-limit (reset-failed is already in the pattern), busctl types i/SetTemp) systematically: one symptom → one fix, then repeat Step 2.

  • Step 4: prod-build-gate. Run: cargo build -p shturman-power --no-default-features && ! strings target/debug/shturman-power | grep -qE 'PowerMock1|SetTemp|HangSoc'. Expected: the build succeeds; PowerMock1/SetTemp/HangSoc are absent from the binary.

  • Step 5: sync the seams (spec §10): docs/domains/b-power-lifecycle.md (§4 thermal path implemented; §5 MCU protocol + codec + client as software/model, physical MCU/fail-safe → HW; §6 wait-for-completion; §1a thresholds are placeholders; mark A12/B08/B09/B10), docs/contracts/hardware.md §3/§1a (B08/B09 🟡 → HW bring-up), docs/contracts/ipc.md §3 (Power += ThermalState/ThermalChanged), docs/capability-catalog.md (A12/B10 , B08/B09 software+model/🟡 HW), CLAUDE.md (status: v0.4 done → v0.5).

  • Step 6: commit the docs.

git add docs/ CLAUDE.md
git commit -s -m "docs(v0.4): sync MCU/thermal seams + status"
  • Step 7: finishing-a-development-branch (merge/PR: ask the user; do not merge into main without an explicit "ok").

Acceptance (v0.4 spec §9.4)

  • thermal-trip → graceful (ShutdownImminent(thermal)→commit→/data consistent); hysteresis: no oscillation.
  • MCU fail-safe timer: SoC hang/budget → deterministic cut (FSM off); /data consistent.
  • Throttling policy by bands (write/ThermalState; numbers: RK3588).
  • Codec: replay/desync/corruption rejected (unit).
  • ThermalState/ThermalChanged on the bus; timers are monotonic.
  • v0.1–v0.3 regression green; just ci green; prod-build-gate (no PowerMock1/SetTemp/HangSoc); red lines intact (MCU/Power is power only, no CAN/actuator).
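The codec criterion above (replay and corruption rejected) can be illustrated with a small frame check. This is a sketch under stated assumptions, not the project's protocol: the frame layout `[seq][payload…][crc_hi][crc_lo]`, the `Decoder` type and the `DecodeError` variants are hypothetical. The checksum is the CRC-16/CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF); the plan only says "CRC16-CCITT", so the exact variant is also an assumption.

```rust
/// CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no final XOR.
fn crc16_ccitt(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in data {
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 { (crc << 1) ^ 0x1021 } else { crc << 1 };
        }
    }
    crc
}

#[derive(Debug, PartialEq, Eq)]
enum DecodeError {
    Short,   // fewer bytes than seq + CRC
    Corrupt, // CRC mismatch
    Replay,  // sequence number is not newer than the last accepted one
}

/// Hypothetical frame layout: [seq][payload...][crc_hi][crc_lo], CRC over seq + payload.
fn encode(seq: u8, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(payload.len() + 3);
    frame.push(seq);
    frame.extend_from_slice(payload);
    let crc = crc16_ccitt(&frame);
    frame.extend_from_slice(&crc.to_be_bytes());
    frame
}

#[derive(Default)]
struct Decoder {
    last_seq: Option<u8>,
}

impl Decoder {
    fn decode<'a>(&mut self, frame: &'a [u8]) -> Result<&'a [u8], DecodeError> {
        if frame.len() < 3 {
            return Err(DecodeError::Short);
        }
        let (body, tail) = frame.split_at(frame.len() - 2);
        if crc16_ccitt(body).to_be_bytes() != [tail[0], tail[1]] {
            return Err(DecodeError::Corrupt);
        }
        let seq = body[0];
        if let Some(last) = self.last_seq {
            // Wrapping comparison: accept only 1..=127 steps ahead of the last frame.
            if (seq.wrapping_sub(last) as i8) <= 0 {
                return Err(DecodeError::Replay);
            }
        }
        self.last_seq = Some(seq);
        Ok(&body[1..])
    }
}

fn main() {
    assert_eq!(crc16_ccitt(b"123456789"), 0x29B1); // standard check value for this variant

    let mut dec = Decoder::default();
    let frame = encode(1, b"hb");
    assert_eq!(dec.decode(&frame), Ok(&b"hb"[..]));
    assert_eq!(dec.decode(&frame), Err(DecodeError::Replay));

    let mut bad = encode(2, b"hb");
    bad[1] ^= 0x01; // flip one payload bit
    assert_eq!(dec.decode(&bad), Err(DecodeError::Corrupt));
    assert_eq!(dec.decode(&encode(2, b"hb")), Ok(&b"hb"[..]));
}
```

Checking the CRC before the sequence number means a corrupted frame never advances last_seq, so one damaged frame cannot cause later valid frames to be rejected as replays.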