snippetMinor

How did the authors of Spice apply weakest precondition rule on their example car problem?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

machine-learning programming-languages formal-methods cs stackoverflow hoare-logic

problemthecarweakestruledidapplyauthorsexamplehow

Problem

While reading the example given in [1]., I couldn't understand how the authors set up the logic to compute the weakest preconditions (wp) in their car example in section 4.2.

The dynamics of the problem are given by -

$x' = x + 0.1v$

$v' = v + 0.1a + \epsilon$

I don't see the authors working on the former equation, only the latter. In addition, while reading about wp from the book [2], I learned that one needs to come up with a suitable postcondition. I believe the postcondition here is $v <= 1$. Why did the authors start the wp computation using $v_1 <=1 \land v_2 <= 1$? Also, in the next step they do $v_1<=1 \land v_1 + 0.1a_1 + \epsilon_1 <= 1$. Why did they only expand the equation on the right of the conjunction?

Here is how I imagined it should have been done (based on my very limited knowledge of wp computations -

We have the equation $v' = v + 0.1a + \epsilon$ and we have a horizon of 2. Therefore, we write this equation 2 times (because that's how RL algorithms are unrolled) -

$v_1 = v_0 + 0.1a_0 + \epsilon_0$

$v_2 = v_1 + 0.1a_1 + \epsilon_1$

Now we set up a postcondition on $v_2$ such that $v_2<=1$ and then work backwards. My understanding is probably wrong, but I'd like the reader to see where I am getting confused.

Please let me know if any clarification from my side is needed.

Regarding the relation between $x'$ and $x$ and also $v_2$ and $v_1$: My understanding here is that $x'$ represents the subsequent state of the variable $x$. This is because of the following line in the paper - "A policy, in interaction with the environment, generates trajectories (or rollouts)
$x_0,u_0, x_1,u_1, \ldots, u_{n-1}, x_n$ where $x_0 \sim p_0$, each $u_i \sim \pi(x_i)$, and each $x_{i+1} \sim P(x_i, u_i)$." Here we can see that each state can be obtained from the subsequent state except the initial state.

[1]. Anderson, Greg, Swarat Chaudhuri, and Isil Dillig. "Guiding Safe Exploration with Weakest Preconditions." International Conference on Learning Repre

Solution

Let's address your questions/confusions in order:

1.

I don't see the authors working on the former equation, only the latter.

That's because $x$ doesn't matter:

it does not occur in the postcondition ($v \leq 1$) investigated here

it does not occur in the RHS of the equation for $v$

-

Why did the authors start the wp computation using $v_1 \leq 1 \wedge v_2 \leq 1$?

They state that their safety horizon is $2$, that is, they want to express that the safety condition holds both after $1$ step and after $2$ steps. In their notation, where $v_i$ refers to the value of $v$ after $i$ steps, that's exactly expressed by the condition $v_1 \leq 1 \wedge v_2 \leq 1$.

3.

Also, in the next step they do $v_1 \leq 1 \wedge v_1+0.1 a_1 + \epsilon_1 \leq 1$. Why did they only expand the equation on the right of the conjunction?

Because the 2nd iteration of the step function affects the variables subscripted with $2$. The LHS of the conjunction has only a $1$-subscripted variable, $v_1$. It is thus invariant under the substitution $[v_2 \mapsto _1+0.1 a_1 + \epsilon_1]$ performed to compute the weakest precondition.

(from your 1st reply comment)

Why didn't the authors include $v_0$?

(a) They state in the example that $v_0 = 0.9$, trivially satisfying $v_0 \leq 1$ and (b) $v_0$ is not affected by the dynamics.

(from your 2nd reply comment)

Why did the author then work on both sides in the next step.

Because both sides refer to $v_1$, which is assigned to in that step. Whence the substitution applies meaningfully to both.

Context

StackExchange Computer Science Q#165368, answer score: 2

Revisions (0)

No revisions yet.