MEDIUM: threads: add thread_sync_release() to synchronize steps

This function provides an alternative way to leave a critical section run
under thread_isolate(). Currently, a thread may remain in thread_release()
without having had the time to notice that the rdv mask was released and
taken again by another thread entering thread_isolate() (often the same
one that just released it). This is because threads wait in harmless mode
in the loop, which is compatible with the conditions to enter
thread_isolate(). It's not possible to make them wait with the harmless
bit off, or we could not know when the job is finished for the next thread
to start in thread_isolate(), and if we don't clear the rdv bit when going
there, we create another race on the start point of thread_isolate().

This new synchronous variant of thread_release() makes use of an extra
mask to indicate the threads that want to be synchronously released. In
this case, they will be marked harmless before releasing their sync bit,
and will wait for others to release their bit as well, guaranteeing that
thread_isolate() cannot be started by any of them before they have all
left thread_sync_release(). This makes it possible to build synchronized
blocks like this:

     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()
     /* do something together here */
     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()

And so on. This is particularly useful during initialization where several
steps have to be respected and no thread must start a step before the
previous one is completed by other threads.
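
The step ordering described above can be illustrated with a small
standalone model. This is only a sketch and not the mask-based code added
by this patch: demo_isolate() and demo_sync_release() are hypothetical
stand-ins built on a POSIX mutex and a POSIX barrier, and all other names
(NB_THREADS, run_demo(), the counters) are assumptions made for the
example. It only shows the guarantee that no thread starts the shared part
of a step before every thread has left the previous isolated part.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

#define NB_THREADS 4

/* Hypothetical stand-ins: a mutex models "running alone", a barrier
 * models "everybody leaves together". HAProxy uses bit masks instead.
 */
static pthread_mutex_t alone_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t leave_barrier;

static int step1_done;   /* threads that completed step 1 */
static int step2_done;   /* threads that completed step 2 */
static int order_errors; /* threads that saw step 1 incomplete */

static void demo_isolate(void)
{
	pthread_mutex_lock(&alone_lock);
}

static void demo_sync_release(void)
{
	pthread_mutex_unlock(&alone_lock);
	/* nobody goes on until all threads have left the isolated part */
	pthread_barrier_wait(&leave_barrier);
}

static void *demo_worker(void *arg)
{
	int saw_step1_incomplete;

	(void)arg;

	demo_isolate();
	step1_done++;                 /* do something alone */
	demo_sync_release();

	/* do something together: step 1 must be complete for everyone */
	saw_step1_incomplete = (step1_done != NB_THREADS);

	demo_isolate();
	if (saw_step1_incomplete)
		order_errors++;
	step2_done++;                 /* do something alone */
	demo_sync_release();
	return NULL;
}

/* Runs both steps on NB_THREADS threads. Returns the number of ordering
 * violations seen (expected to be 0), or -1 on setup failure. A real
 * program would also have to clean up on partial thread creation.
 */
int run_demo(void)
{
	pthread_t th[NB_THREADS];
	int i;

	step1_done = step2_done = order_errors = 0;
	if (pthread_barrier_init(&leave_barrier, NULL, NB_THREADS) != 0)
		return -1;
	for (i = 0; i < NB_THREADS; i++)
		if (pthread_create(&th[i], NULL, demo_worker, NULL) != 0)
			return -1;
	for (i = 0; i < NB_THREADS; i++)
		pthread_join(th[i], NULL);
	pthread_barrier_destroy(&leave_barrier);
	return order_errors;
}
```

Calling run_demo() from a main() and linking with -pthread is enough to
try it. The barrier needs to know the number of participants up front,
which is one reason it is only a model of the behaviour and not a
replacement for the masks used here.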

This function must not be called after any call to thread_release(), or
it would risk blocking an earlier call to thread_isolate() which the
current thread managed to leave without waiting for others to complete,
ending up here with the thread's harmless bit cleared and blocking the
others. This might be improved in the future.
diff --git a/include/common/hathreads.h b/include/common/hathreads.h
index 79a6466..a7c8dc9 100644
--- a/include/common/hathreads.h
+++ b/include/common/hathreads.h
@@ -53,6 +53,7 @@
 enum { all_threads_mask = 1UL };
 enum { threads_harmless_mask = 0 };
 enum { threads_want_rdv_mask = 0 };
+enum { threads_sync_mask = 0 };
 enum { tid_bit = 1UL };
 enum { tid = 0 };
 
@@ -228,6 +229,10 @@
 {
 }
 
+static inline void thread_sync_release()
+{
+}
+
 static inline unsigned long thread_isolated()
 {
 	return 1;
@@ -417,6 +422,7 @@
 void thread_harmless_till_end();
 void thread_isolate();
 void thread_release();
+void thread_sync_release();
 void ha_tkill(unsigned int thr, int sig);
 void ha_tkillall(int sig);
 
@@ -439,12 +445,17 @@
 extern volatile unsigned long all_threads_mask;
 extern volatile unsigned long threads_want_rdv_mask;
 extern volatile unsigned long threads_harmless_mask;
+extern volatile unsigned long threads_sync_mask;
 
-/* explanation for threads_want_rdv_mask and threads_harmless_mask :
+/* explanation for threads_want_rdv_mask, threads_harmless_mask, and
+ * threads_sync_mask :
  * - threads_want_rdv_mask is a bit field indicating all threads that have
  *   requested a rendez-vous of other threads using thread_isolate().
  * - threads_harmless_mask is a bit field indicating all threads that are
  *   currently harmless in that they promise not to access a shared resource.
+ * - threads_sync_mask is a bit field indicating the threads which, while
+ *   waiting for others to finish, want to leave synchronized with them and
+ *   as such promise to do so as well using thread_sync_release().
  *
  * For a given thread, its bits in want_rdv and harmless can be translated like
  * this :
@@ -457,6 +468,9 @@
  *       1    |     1    | thread interested in RDV and waiting for its turn
  *       1    |     0    | thread currently working isolated from others
  *  ----------+----------+----------------------------------------------------
+ *
+ * threads_sync_mask only delays the leaving of thread_sync_release() to make
+ * sure that each thread's harmless bit is cleared before leaving the function.
  */
 
 #define ha_sigmask(how, set, oldset)  pthread_sigmask(how, set, oldset)