Improvement for the interval parameter

Louis Charreau Louis.Charreau at vadesecure.com
Fri Sep 16 09:39:29 CEST 2022


Hello,

We use Mandos on several hundred servers.

A patch was implemented in version 1.8.10 to spread out the execution of the "checker" processes (by randomizing the "interval" parameter), so that there are not as many simultaneous child processes as there are hosts to check (which happened because the timers for all hosts were initialized at the same time).

The negative side effect of this patch is that some targets are checked far more often than others (almost in a loop for some). In practice, the "interval" parameter now acts as a maximum, with a minimum of 1 ms.
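
To illustrate the effect, here is a minimal sketch in plain Python (no GLib). It assumes, as described above, that each host's repeating timer gets a period drawn once with random.randrange(interval_ms + 1), the same expression that appears in the context lines of the patch below. The function name, the 5-minute interval and the host count are hypothetical and only serve the illustration.

```python
import random


def drawn_periods_ms(interval_ms, hosts, seed=0):
    """Draw one repeating-timer period per host (assumed 1.8.10 behaviour)."""
    rng = random.Random(seed)
    return [rng.randrange(interval_ms + 1) for _ in range(hosts)]


interval_ms = 5 * 60 * 1000  # hypothetical 5-minute "interval"
periods = drawn_periods_ms(interval_ms, hosts=500)

# Every period is bounded above by the configured interval,
# but nothing keeps it away from very small values.
shortest = min(periods)
checked_too_often = sum(1 for p in periods if p < interval_ms // 10)
print(f"shortest period: {shortest} ms")
print(f"hosts checked more than 10x too often: {checked_too_often}")
```

With a uniform draw, roughly one host in ten ends up with a period below a tenth of the configured interval, which matches what we observe on our servers.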

To solve this problem, I propose a patch that randomizes the "interval" parameter only when the checkers are initialized, so that the first checks are spread out over time. From the 2nd check onward, the timer is replaced by one that uses the "interval" parameter as the fixed time between two checks. This way, the servers are not all checked at the same time, yet each one is checked at regular intervals (the same interval for all servers).
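
The intended schedule can be sketched in plain Python as follows. This is only a simulation of the timing, not the Mandos code itself; the function name, the 1-minute interval and the number of hosts are hypothetical.

```python
import random


def check_times_ms(interval_ms, horizon_ms, rng):
    """Return the times (in ms) at which one host is checked up to horizon_ms.

    The first check happens at a random offset in [0, interval_ms];
    every later check follows exactly interval_ms after the previous one.
    """
    first = rng.randrange(interval_ms + 1)
    return list(range(first, horizon_ms + 1, interval_ms))


rng = random.Random(42)
interval_ms = 60_000  # hypothetical 1-minute "interval"
schedules = [check_times_ms(interval_ms, 10 * interval_ms, rng)
             for _ in range(5)]

for host, times in enumerate(schedules):
    # Collect the distinct gaps between consecutive checks of this host.
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}
    print(f"host {host}: first check at {times[0]} ms, gaps: {gaps}")
```

Each host starts at its own offset, so the forks are spread out, while the gap between two checks of the same host is always the configured interval.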

I added a lot of comments to explain the context; they probably don't need to be that verbose.

I used version 1.8.14, which is the one packaged for Debian.


--- mandos.1.8.14 2022-09-14 16:32:31.000000000 +0200
+++ mandos.new    2022-09-14 16:42:41.000000000 +0200
@@ -1058,17 +1058,24 @@
         # and every interval from then on.
         if self.checker_initiator_tag is not None:
             GLib.source_remove(self.checker_initiator_tag)
+        # At the initialization of the checkers, we smooth the execution in time,
+        # using a random of the interval parameter.
+        # At the time of the first execution, the timer is replaced by a new one
+        # based on the interval parameter to ensure that the executions are done
+        # at regular intervals according to the desired configuration.
         self.checker_initiator_tag = GLib.timeout_add(
             random.randrange(int(self.interval.total_seconds() * 1000
                                  + 1)),
-            self.start_checker)
+            self.start_checker, True)
         # Schedule a disable() when 'timeout' has passed
         if self.disable_initiator_tag is not None:
             GLib.source_remove(self.disable_initiator_tag)
         self.disable_initiator_tag = GLib.timeout_add(
             int(self.timeout.total_seconds() * 1000), self.disable)
-        # Also start a new checker *right now*.
-        self.start_checker()
+        # Do not launch a new checker at initialization to avoid forking the children's processes simultaneously.
+        # This is problematic when you have several hundred servers to check.
+        # # Also start a new checker *right now*.
+        # self.start_checker()

     def checker_callback(self, source, condition, connection,
                          command):
@@ -1119,7 +1126,7 @@
     def need_approval(self):
         self.last_approval_request = datetime.datetime.utcnow()

-    def start_checker(self):
+    def start_checker(self, init_timer=False):
         """Start a new checker subprocess if one is not running.

         If a checker already exists, leave it running and do
@@ -1178,6 +1185,14 @@
                 GLib.PRIORITY_DEFAULT, GLib.IO_IN,
                 self.checker_callback, pipe[0], command)
         # Re-run this periodically if run by GLib.timeout_add
+        if init_timer:
+            # Schedule a new checker to be started an 'interval' from now,
+            # and every interval from then on.
+            if self.checker_initiator_tag is not None:
+                GLib.source_remove(self.checker_initiator_tag)
+            self.checker_initiator_tag = GLib.timeout_add(
+                int(self.interval.total_seconds() * 1000),
+                self.start_checker)
         return True

     def stop_checker(self):



Thank you for your support and for developing a solution that is very useful to us.

Louis

Louis Charreau


louis.charreau at vadesecure.com


