State tables and state transition graphs

Wed Sep 9 09:43:15 CEST 2009

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lee Winter <lee.j.i.winter at gmail.com> writes:

>> The server periodically tries to run a so-called "checker" program
>> (by default "fping").
>
> From the existing documentation it is clear that you already know
> that this approach is weak.  [...]
> Do you have stronger alternatives in mind for the checker/heartbeat?

What we use here is IPsec (ESP, transport mode) between the machines,
whereby the "fping" method is made secure.  But, since we recognize
that IPsec is not everyone's cup of tea, we made the checker/heartbeat
system configurable to whatever the administrator would feel to be
more secure.  I don't know of any particular heartbeat system which
has similar security, so I haven't looked at any specific
alternatives.

> The doc has to explicitly state somewhere that when the server
> disables a client that state change is permanent and that a manual
> override is required to re-enable the client.  I didn't see anything
> in the doc that addressed that issue.  (So I had to ask).

Hmm, what is stated is that a client is disabled in such-and-such a
case, and it doesn't say anything about a client ever being enabled,
so I guess you could say it's implied.  But sure, it'd be clearer if
it was explicitly stated; I'll add a small sentence about it to the
"CHECKING" section of the "mandos(8)" manual page.

(This will, of course, all change when we get around to releasing the
next version with support for interactively controlling the server,
which is why I'm not planning to do anything sophisticated about this
inconvenience now.)

>> But a failing checker does not trigger a state change.
>
> OK, but it might be worth logging the failure [...]

The server logs it with syslog(3) level LOG_INFO; how this is handled
by the system depends on the configuration of the syslog daemon.  (The
actual disabling of a client is also logged.)

>> But for now, you'll have to restart the Mandos server to enable a
>> disabled client.
>
> So every server reboot re-enables all clients?  That looks to me like
> a security hole.  In connection with the simplicity of the "checker"
> (in clusters this is called the heartbeat, which term you might want to
> consider) I am certain that there is a hole.  And multiple servers
> won't block it.  In fact they make it easier to exploit.

We recognize the problem with this, and we do plan to rectify this in
a future version by having the server save state between restarts.  In
fact, we note this problem, and its clumsy workaround, in the third
paragraph of the "SECURITY" section in the "mandos(8)" manual page.
(But see also below regarding sophisticated attacks.)

> I know that the current threat model is not one that includes a
> sophisticated seizure.

Correct.  This is because we don't know how to solve that problem.
Mandos implements a strategy we came up with to actually *have*
servers with encrypted disks, with what we considered a marked
improvement in security compared to having them non-encrypted, or some
other scheme with encrypted disks but with no actual security
advantage.  We feel that the Mandos system, while not being ideal, is
still an, albeit small, improvement on what was previously possible.

If you have a solution to the general problem, or even an improvement,
you are very welcome to:

1. Tell us about it, and we might implement it, and/or:
2. Implement it yourself, and become rich and famous. :-)

> But it needs to [solve the general problem of non-seizable servers].

We'd really like to do that, but we don't know how.

> So a year or so after mandos makes it into stable the threat model
> will be radically different.  After all this is an open-source
> project, so the adversaries will be able to plan exactly how to
> circumvent the system.

As we discuss in the README file, there are already two other methods
of completely circumventing the security of a Mandos client:

1. The so-called "Cold Boot" attack can be used to read the LUKS
   master key from memory.  There is nothing we can do about this, and
   I'm not sure anyone can ever do anything about this, short of
   getting sticky with epoxy or the like.

2. As long as the timeout is long enough to accommodate a reboot, it's
   almost certainly long enough for someone to extract the physical
   hard drive from the target machine, plug it into a laptop and use
   the OpenPGP key from the disk's initrd image to fake a Mandos
   request to the Mandos server.

Because of this, we feel that *sophisticated* attacks are something
which we cannot really defend against, so we don't consider it an
out-and-out emergency that there is some other sophisticated attack
which might also be possible in some specific circumstance.

However, we certainly *do* want to eliminate attack vectors where
possible.  Case in point: the checker mechanism is, as a whole,
strictly unnecessary to prevent the naive turn-off-and-seize-
everything attack, but we implement it as a slight improvement anyway.
In the case of checkers, we do note the problem of insecure checkers
in the README file, but we don't have any actual alternatives except
IPsec to offer.

> So I suggest that the server has to authenticate the clients during
> checking or the timeout will be ineffective.

This is true, but we don't know of any way to do so that we could
recommend, except for using IPsec.

> Perhaps this could be a more advanced version with a heartbeat daemon
> on the client.

Did you have anything particular in mind?  Or did you mean that we
could write our own, to go with mandos-client which all the clients
should run at all times?  It seems to me that this should be an
already solved problem, so there ought to be something else out there we
could plug in to.

> I believe this issue bears further discussion

If no other alternative to IPsec can be found, maybe we should write
some example shell scripts which SSH into the clients and verify
their identities that way.
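
A minimal sketch of what such an SSH-based checker could look like,
written in Python rather than shell.  It assumes that a zero exit
status means "client is alive", that the server holds a key the client
accepts for login, and that the client's host key is already pinned in
a known_hosts file.  The function names and the known_hosts path are
hypothetical; only standard OpenSSH client options are used.

```python
import subprocess


def ssh_check_command(host, known_hosts_file):
    """Build an ssh command line that succeeds only if `host` presents
    the host key already recorded in `known_hosts_file`."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # never prompt for a password
        "-o", "StrictHostKeyChecking=yes",  # refuse unknown or changed host keys
        "-o", f"UserKnownHostsFile={known_hosts_file}",
        "-o", "ConnectTimeout=10",          # give up on unreachable hosts
        host,
        "true",                             # trivial remote command
    ]


def client_is_alive(host, known_hosts_file):
    """Run the check; True means ssh exited with status 0."""
    result = subprocess.run(
        ssh_check_command(host, known_hosts_file),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Unlike a bare ping, this fails if another machine answers on the
client's address, because that machine would not hold the client's
private host key.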

>> Let's see if I can take a stab at enumerating the different states:
>>
>> 0. Server stopped.
>> 1. Server running, client enabled.
>> 2. Server running, client enabled, checker running.
>> 3. Server running, client disabled.
>>
>> Rules:
>>  i) In state 0, when changing to state 1, start a timer with a
>>     timeout.
>>  ii) In states 1 or 2, a timer timeout will cause a change to state 3.
>> iii) In state 1, wait for a bit and then change to state 2.
>>  iv) In state 2, when a checker completes successfully, reset the
>>     timer before changing to state 1.  If the checker is
>>     unsuccessful, just change to state 1 without touching the timer.

I forgot one rule:

  v) In states 1 and 2, after the client receives its password, reset
     the timer.

(I forgot it because it's a late addition to the code; so far it's
only in the trunk, not in the released version 1.0.11.)
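
The states and rules i) to v) can be written down as a transition
table.  The following is only an illustrative sketch: the names
State, Event, TRANSITIONS and step are made up here, and the choice to
leave the state unchanged for any pair not covered by a rule is an
assumption, not something taken from the server code.

```python
from enum import Enum


class State(Enum):
    STOPPED = 0    # 0. Server stopped.
    ENABLED = 1    # 1. Server running, client enabled.
    CHECKING = 2   # 2. Server running, client enabled, checker running.
    DISABLED = 3   # 3. Server running, client disabled.


class Event(Enum):
    SERVER_START = "server_start"
    TIMEOUT = "timeout"
    INTERVAL = "interval"            # the "wait for a bit" of rule iii
    CHECK_OK = "check_ok"
    CHECK_FAIL = "check_fail"
    PASSWORD_SENT = "password_sent"


# (state, event) -> (next state, whether the timeout timer is (re)started)
TRANSITIONS = {
    (State.STOPPED, Event.SERVER_START): (State.ENABLED, True),     # rule i
    (State.ENABLED, Event.TIMEOUT): (State.DISABLED, False),        # rule ii
    (State.CHECKING, Event.TIMEOUT): (State.DISABLED, False),       # rule ii
    (State.ENABLED, Event.INTERVAL): (State.CHECKING, False),       # rule iii
    (State.CHECKING, Event.CHECK_OK): (State.ENABLED, True),        # rule iv
    (State.CHECKING, Event.CHECK_FAIL): (State.ENABLED, False),     # rule iv
    (State.ENABLED, Event.PASSWORD_SENT): (State.ENABLED, True),    # rule v
    (State.CHECKING, Event.PASSWORD_SENT): (State.CHECKING, True),  # rule v
}


def step(state, event):
    """Return (next_state, reset_timer) for one event.

    Pairs with no rule leave the state unchanged and the timer alone.
    """
    return TRANSITIONS.get((state, event), (state, False))
```

Note that no rule leads out of state 3; in this table, as in the
rules, a disabled client stays disabled until the server is restarted.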

> Rephrased in pseudo code
[...]

Well, not quite.  Let's see if I can do something similar... (I make no
guarantees about whether this corresponds to the actual code as
implemented.)

Counters
- --------
timeout -- Time length from the last known good check until disabling
interval -- How often to start a new checker

Server
- ------
logical thread #1:
        while ( is_enabled( client) ) {
                wait_for_completion( interval );
                if ( ! checker ) {
                        checker = start_checker_asynchronously();
                }
                reset_timer(interval);
        }

logical thread #2:
        while ( is_enabled( client) ){
                wait_for ( checker ); /* wait until checker runs */
                r = result( checker ); /* blocks until checker returns */
                if ( r == SUCCESS ){
                        reset_timer( timeout );
                }
        }

logical thread #3:
        if ( is_enabled(client) ) {
                wait_for_completion( timeout ); /* might never finish */
                disable( client );
                if ( checker ) {
                        kill ( checker );
                }
        }

logical thread #4:
        while ( is_enabled( client) ) {
                rq = get_client_request(); /* blocks until the client
                                             requests its password */
                if ( is_enabled( client) ) {
                        send_password_to( rq );
                        reset_timer( timeout ); /* Unreleased code */
                }
        }

Of course, this is only for *one* client; with the server handling
multiple clients, it gets more complicated - this is just pseudocode.
(And the server isn't really multithreaded, anyway.)
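
Since the server is not really multithreaded, the same logic can be
sketched as a single object driven by a clock.  This is a simplified
illustration, not the server's code: the class ClientTimers and its
methods are invented for the example, the caller supplies the current
time in seconds, and the checker is run synchronously instead of
asynchronously.

```python
class ClientTimers:
    """Tracks one client's deadlines on a caller-supplied clock."""

    def __init__(self, now, timeout, interval):
        self.timeout = timeout
        self.interval = interval
        self.enabled = True
        self.disable_at = now + timeout       # deadline of logical thread #3
        self.next_check_at = now + interval   # timer of logical thread #1

    def tick(self, now, run_checker):
        """Advance to time `now`; run_checker() returns True on success."""
        if not self.enabled:
            return
        if now >= self.disable_at:
            # No successful check (or password request) within `timeout`.
            self.enabled = False
            return
        if now >= self.next_check_at:
            self.next_check_at = now + self.interval
            if run_checker():
                # Only a successful check pushes the deadline forward.
                self.disable_at = now + self.timeout

    def password_sent(self, now):
        """Mirror of the unreleased rule: serving a request resets the timer."""
        if self.enabled:
            self.disable_at = now + self.timeout
```

For example, with timeout=10 and interval=3, one successful check at
time 3 moves the deadline to 13; if every later check fails, the
client is disabled by the first tick at or after 13.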

>> I hope this will be enlightening.
>
> Yes.  More importantly it excludes many alternative possibilities

Good; glad I could help.

>> P.S.  Do I have your permission to re-send your mails and mine to
>> the public mandos-dev mailing list?
>
> Of course.  I did not see where to subscribe.  Can you provide a
> sign-up link?

This one should work:

http://mail.fukt.bsnet.se/cgi-bin/mailman/listinfo/mandos-dev

(It was under the "Support/Contact" heading on the web page.  Perhaps
it's a bit obscure?)

/Teddy Hogeborn

- -- 
The Mandos Project
http://www.fukt.bsnet.se/mandos
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFKp1ykOWBmT5XqI90RAnwWAJ9hyGjI/DzgMESNztlJu3Rv2k8ulgCfVUzk
kVUzZ56fFlWipGCvx3ESNR4=
=6cQ1
-----END PGP SIGNATURE-----