2019-09-03 (Willy Tarreau, c89f665, 2019-09-06 18:50:32 +0200)

u8 fd.state;
u8 fd.ev;


ev = one of :
   #define FD_POLL_IN      0x01
   #define FD_POLL_PRI     0x02
   #define FD_POLL_OUT     0x04
   #define FD_POLL_ERR     0x08
   #define FD_POLL_HUP     0x10

Could we instead have :

   FD_WAIT_IN      0x01
   FD_WAIT_OUT     0x02
   FD_WAIT_PRI     0x04
   FD_SEEN_ERR     0x08
   FD_SEEN_HUP     0x10
   FD_WAIT_CON     0x20  <<= shouldn't this be in the connection itself in fact ?

=> not needed, covered by the state instead.

What is missing though is :
  - FD_DATA_PENDING -- overlaps with READY_R, OK if passed by pollers only
  - FD_EOI_PENDING
  - FD_ERR_PENDING
  - FD_EOI
  - FD_SHW
  - FD_ERR

fd_update_events() could do that :

   if ((fd_data_pending|fd_eoi_pending|fd_err_pending) && !(fd_err|fd_eoi))
       may_recv()

   if (fd_send_ok && !(fd_err|fd_shw))
       may_send()

   if (fd_err)
       wake()
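
The three tests above can be sketched in C as follows. This is only an
illustration: the notes name the flags but assign them no values, so the bit
values, the ACT_* result bits and the fd_next_actions() helper are all
hypothetical.

```c
#include <stdint.h>

/* Hypothetical bit values: the notes only give the flag names. */
#define FD_DATA_PENDING 0x01
#define FD_EOI_PENDING  0x02
#define FD_ERR_PENDING  0x04
#define FD_EOI          0x08
#define FD_SHW          0x10
#define FD_ERR          0x20
#define FD_SEND_OK      0x40

/* Hypothetical result bits standing for may_recv(), may_send(), wake(). */
#define ACT_MAY_RECV 0x1
#define ACT_MAY_SEND 0x2
#define ACT_WAKE     0x4

/* Returns which of the three calls the status would trigger. */
static unsigned int fd_next_actions(uint8_t st)
{
    unsigned int act = 0;

    /* something is left to read and the read side is not final yet */
    if ((st & (FD_DATA_PENDING | FD_EOI_PENDING | FD_ERR_PENDING)) &&
        !(st & (FD_ERR | FD_EOI)))
        act |= ACT_MAY_RECV;

    /* room to send and the write side is still usable */
    if ((st & FD_SEND_OK) && !(st & (FD_ERR | FD_SHW)))
        act |= ACT_MAY_SEND;

    /* a final error always wakes the owner up */
    if (st & FD_ERR)
        act |= ACT_WAKE;

    return act;
}
```

With these values, FD_ERR|FD_DATA_PENDING|FD_SEND_OK yields ACT_WAKE only: a
final error masks both the receive and the send paths.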

the poller could do that :
   HUP+OUT => always indicates a failed connect(), it should not lack ERR. Is this err_pending ?

   ERR HUP OUT IN
    0   0   0   0  => nothing
    0   0   0   1  => FD_DATA_PENDING
    0   0   1   0  => FD_SEND_OK
    0   0   1   1  => FD_DATA_PENDING|FD_SEND_OK
    0   1   0   0  => FD_EOI (|FD_SHW)
    0   1   0   1  => FD_DATA_PENDING|FD_EOI_PENDING (|FD_SHW)
    0   1   1   0  => FD_EOI |FD_ERR (|FD_SHW)
    0   1   1   1  => FD_EOI_PENDING (|FD_ERR_PENDING) |FD_DATA_PENDING (|FD_SHW)
    1   X   0   0  => FD_ERR | FD_EOI (|FD_SHW)
    1   X   X   1  => FD_ERR_PENDING | FD_EOI_PENDING | FD_DATA_PENDING (|FD_SHW)
    1   X   1   0  => FD_ERR | FD_EOI (|FD_SHW)

    OUT+HUP,OUT+HUP+ERR => FD_ERR

This reorders to:

   IN ERR HUP OUT
    0   0   0   0  => nothing
    0   0   0   1  => FD_SEND_OK
    0   0   1   0  => FD_EOI (|FD_SHW)

    0   X   1   1  => FD_ERR | FD_EOI (|FD_SHW)
    0   1   X   0  => FD_ERR | FD_EOI (|FD_SHW)
    0   1   X   1  => FD_ERR | FD_EOI (|FD_SHW)

    1   0   0   0  => FD_DATA_PENDING
    1   0   0   1  => FD_DATA_PENDING|FD_SEND_OK
    1   0   1   0  => FD_DATA_PENDING|FD_EOI_PENDING (|FD_SHW)
    1   0   1   1  => FD_EOI_PENDING (|FD_ERR_PENDING) |FD_DATA_PENDING (|FD_SHW)
    1   1   X   X  => FD_ERR_PENDING | FD_EOI_PENDING | FD_DATA_PENDING (|FD_SHW)
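
The reordered table maps fairly directly to a small function. The sketch
below is one possible reading of it, not a definitive implementation: the
FD_POLL_* values are the ones listed at the top of these notes, but the
status flag values and the fd_status_from_poll() name are hypothetical, and
the optional flags shown in parentheses in the table (FD_SHW, FD_ERR_PENDING)
are deliberately left out.

```c
#include <stdint.h>

/* Poller event bits, as listed at the top of these notes. */
#define FD_POLL_IN   0x01
#define FD_POLL_PRI  0x02
#define FD_POLL_OUT  0x04
#define FD_POLL_ERR  0x08
#define FD_POLL_HUP  0x10

/* Hypothetical values for the status flags. */
#define FD_DATA_PENDING 0x01
#define FD_EOI_PENDING  0x02
#define FD_ERR_PENDING  0x04
#define FD_EOI          0x08
#define FD_SHW          0x10
#define FD_ERR          0x20
#define FD_SEND_OK      0x40

/* Converts poller events to a status, row by row (IN ERR HUP OUT). */
static uint8_t fd_status_from_poll(uint8_t ev)
{
    int in  = !!(ev & FD_POLL_IN);
    int out = !!(ev & FD_POLL_OUT);
    int err = !!(ev & FD_POLL_ERR);
    int hup = !!(ev & FD_POLL_HUP);

    if (!in) {
        /* nothing left to read: errors and shutdowns are final */
        if (err || (hup && out))
            return FD_ERR | FD_EOI;          /* 0 X 1 1, 0 1 X 0, 0 1 X 1 */
        if (hup)
            return FD_EOI;                   /* 0 0 1 0 */
        return out ? FD_SEND_OK : 0;         /* 0 0 0 1, 0 0 0 0 */
    }

    /* data is readable: errors and shutdowns stay pending behind it */
    if (err)
        return FD_ERR_PENDING | FD_EOI_PENDING | FD_DATA_PENDING; /* 1 1 X X */
    if (hup)
        return FD_DATA_PENDING | FD_EOI_PENDING;     /* 1 0 1 0, 1 0 1 1 */
    return FD_DATA_PENDING | (out ? FD_SEND_OK : 0); /* 1 0 0 0, 1 0 0 1 */
}
```

Testing IN first is what makes the reordering convenient: the same ERR/HUP
information is final without IN and only pending with IN.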

Regarding "|SHW", it's normally useless since it will already have been done,
except on connect() error where this indicates there's no need for SHW.

FD_EOI and FD_SHW could be part of the state (FD_EV_SHUT_R, FD_EV_SHUT_W).
Then all states having one of these bits and another one would be transient
and need to resync. We could then have "fd_shut_recv" and "fd_shut_send" to
switch to these states.

The FD's ev then only needs to update EOI_PENDING, ERR_PENDING, ERR, DATA_PENDING.
With this said, these are not exactly polling states either, as err/eoi/shw are
orthogonal to the other states and are required to update them so that the polling
state really is DISABLED in the end. So we need more of an operational status for
the FD containing EOI_PENDING, EOI, ERR_PENDING, ERR, SHW, CLO?. These could be
classified in 3 categories: read:(OPEN, EOI_PENDING, EOI); write:(OPEN,SHW),
ctrl:(OPEN,ERR_PENDING,ERR,CLO). That would be 2 bits for R, 1 for W, 2 for ctrl,
for a total of 5 bits vs 6 for individual flags, but would be harder to manipulate.
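
To make the "harder to manipulate" point concrete, here is a sketch of the
5-bit packed variant. The layout (bits 0-1 for read, bit 2 for write, bits
3-4 for ctrl), the enum values and the two helpers are hypothetical; only the
state names come from the paragraph above.

```c
#include <stdint.h>

/* Hypothetical encodings of the three categories. */
enum fd_rd { FD_RD_OPEN = 0, FD_RD_EOI_PENDING = 1, FD_RD_EOI = 2 };
enum fd_wr { FD_WR_OPEN = 0, FD_WR_SHW = 1 };
enum fd_ct { FD_CT_OPEN = 0, FD_CT_ERR_PENDING = 1, FD_CT_ERR = 2, FD_CT_CLO = 3 };

/* Hypothetical layout: 2 bits for R, 1 for W, 2 for ctrl. */
#define FD_RD_SHIFT 0
#define FD_RD_MASK  0x03u
#define FD_WR_SHIFT 2
#define FD_WR_MASK  0x04u
#define FD_CT_SHIFT 3
#define FD_CT_MASK  0x18u

/* Replaces one field: clear its bits first, then store the new value. */
static uint8_t fd_field_set(uint8_t st, unsigned int mask,
                            unsigned int shift, unsigned int val)
{
    return (uint8_t)((st & ~mask) | ((val << shift) & mask));
}

/* Extracts one field from the packed status. */
static unsigned int fd_field_get(uint8_t st, unsigned int mask,
                                 unsigned int shift)
{
    return (st & mask) >> shift;
}
```

Moving ctrl to ERR becomes
st = fd_field_set(st, FD_CT_MASK, FD_CT_SHIFT, FD_CT_ERR), a read-modify-write
of a multi-bit field, where individual flags would only need a single OR.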

Proposal:
  - rename fdtab[].state to "polling_state"
  - rename fdtab[].ev to "status"

Note: POLLHUP is also reported if a listen() socket has gone through shutdown()
TEMPORARILY! Thus we may not always consider this as a final error.


Work hypothesis:

SHUT RDY ACT
  0   0   0  => disabled
  0   0   1  => active
  0   1   0  => stopped
  0   1   1  => ready
  1   0   0  => final shut
  1   0   1  => shut pending without data
  1   1   0  => shut pending, stopped
  1   1   1  => shut pending

PB: we can land into final shut if one thread disables the FD while another
    one that was waiting on it reports it as shut. Theoretically it should be
    implicitly ready though, since reported. But if no data is reported, it
    will be reported as shut only. And no event will be reported then. This
    might still make sense since it's not active, thus we don't want events.
    But it will not be enabled later either in this case so the shut really
    risks not being properly reported. The issue is that there's no difference
    between a shut coming from the bottom and a shut coming from the top, and
    we need an event to report activity here. Or we may consider that a poller
    never leaves a final shut by itself (100) and always reports it as
    shut+stop (thus ready) if it was not active. Alternatively, if active is
    disabled, shut should possibly be ignored, then a poller cannot report
    shut. But shut+stopped seems the most suitable as it corresponds to
    disabled->stopped transition.

Now let's add ERR. ERR necessarily implies SHUT as there doesn't seem to be a
valid case of ERR pending without shut pending.

ERR SHUT RDY ACT
  0   0   0   0  => disabled
  0   0   0   1  => active
  0   0   1   0  => stopped
  0   0   1   1  => ready

  0   1   0   0  => final shut, no error
  0   1   0   1  => shut pending without data
  0   1   1   0  => shut pending, stopped
  0   1   1   1  => shut pending

  1   0   X   X  => invalid

  1   1   0   0  => final shut, error encountered
  1   1   0   1  => error pending without data
  1   1   1   0  => error pending after data, stopped
  1   1   1   1  => error pending
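
The table above can be turned into a lookup indexed by the four bits. A
minimal sketch, assuming a hypothetical bit assignment that follows the
column order (ERR is the highest bit, ACT the lowest); the FD_ST_* names and
both helpers are made up for the illustration:

```c
#include <stdint.h>

/* Hypothetical bit assignment matching the columns ERR SHUT RDY ACT. */
#define FD_ST_ACT  0x01
#define FD_ST_RDY  0x02
#define FD_ST_SHUT 0x04
#define FD_ST_ERR  0x08

/* ERR without SHUT is the only invalid combination (rows "1 0 X X"). */
static int fd_state_is_valid(uint8_t st)
{
    return !(st & FD_ST_ERR) || (st & FD_ST_SHUT);
}

/* Returns the description the table gives for a 4-bit state. */
static const char *fd_state_name(uint8_t st)
{
    static const char *const names[16] = {
        [0x0] = "disabled",
        [0x1] = "active",
        [0x2] = "stopped",
        [0x3] = "ready",
        [0x4] = "final shut, no error",
        [0x5] = "shut pending without data",
        [0x6] = "shut pending, stopped",
        [0x7] = "shut pending",
        [0x8] = "invalid",
        [0x9] = "invalid",
        [0xa] = "invalid",
        [0xb] = "invalid",
        [0xc] = "final shut, error encountered",
        [0xd] = "error pending without data",
        [0xe] = "error pending after data, stopped",
        [0xf] = "error pending",
    };

    return names[st & 0x0f];
}
```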

So the algorithm for the poller is:
  - if (shutdown_pending or error) reported and ACT==0,
    report SHUT|RDY or SHUT|ERR|RDY
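
A sketch of that poller rule, under a few assumptions: the FD_ST_* bit values
and fd_report_shut() are hypothetical, the update is shown as a plain
read-modify-write whereas a threaded implementation would presumably need an
atomic one, and what happens when ACT is set (here SHUT/ERR are simply added)
is not spelled out by the rule.

```c
#include <stdint.h>

/* Hypothetical bit assignment for the ERR SHUT RDY ACT flags. */
#define FD_ST_ACT  0x01
#define FD_ST_RDY  0x02
#define FD_ST_SHUT 0x04
#define FD_ST_ERR  0x08

/* Computes the state after the poller reports a shutdown or an error. */
static uint8_t fd_report_shut(uint8_t st, int err)
{
    st |= FD_ST_SHUT;
    if (err)
        st |= FD_ST_ERR;

    /* ACT==0: also report RDY so that the FD never lands in a bare
     * "final shut" that no handler would ever look at.
     */
    if (!(st & FD_ST_ACT))
        st |= FD_ST_RDY;

    return st;
}
```

A disabled FD (0000) thus becomes SHUT|RDY ("shut pending, stopped") rather
than "final shut", which is the disabled->stopped transition described in
the PB paragraph.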

For read handlers :
  - if (!(flags & (RDY|ACT)))
       return
  - if (ready)
       try_to_read
  - if (err)
       report error
  - if (shut)
       read0

For write handlers:
  - if (!(flags & (RDY|ACT)))
       return
  - if (err||shut)
       report error
  - if (ready)
       try_to_write

For listeners:
  - if (!(flags & (RDY|ACT)))
       return
  - if (err||shut)
       pause
  - if (ready)
       try_to_accept
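
The three handler outlines above can be sketched as functions returning the
set of steps to run. Assumptions: the FD_ST_* values and DO_* step bits are
hypothetical, "ready" is read as "the RDY bit is set", and each test is
evaluated independently in the order listed, since the outlines do not say
whether an earlier step should short-circuit the later ones.

```c
#include <stdint.h>

/* Hypothetical bit assignment for the ERR SHUT RDY ACT flags. */
#define FD_ST_ACT  0x01
#define FD_ST_RDY  0x02
#define FD_ST_SHUT 0x04
#define FD_ST_ERR  0x08

/* Hypothetical step bits standing for the actions in the outlines. */
#define DO_IO    0x1   /* try_to_read / try_to_write / try_to_accept */
#define DO_ERROR 0x2   /* report error */
#define DO_READ0 0x4   /* read0 */
#define DO_PAUSE 0x8   /* pause the listener */

static unsigned int fd_read_steps(uint8_t f)
{
    unsigned int steps = 0;

    if (!(f & (FD_ST_RDY | FD_ST_ACT)))
        return 0;                      /* neither ready nor active */
    if (f & FD_ST_RDY)
        steps |= DO_IO;                /* drain pending data first */
    if (f & FD_ST_ERR)
        steps |= DO_ERROR;
    if (f & FD_ST_SHUT)
        steps |= DO_READ0;
    return steps;
}

static unsigned int fd_write_steps(uint8_t f)
{
    unsigned int steps = 0;

    if (!(f & (FD_ST_RDY | FD_ST_ACT)))
        return 0;
    if (f & (FD_ST_ERR | FD_ST_SHUT))
        steps |= DO_ERROR;             /* a shut is an error for writers */
    if (f & FD_ST_RDY)
        steps |= DO_IO;
    return steps;
}

static unsigned int fd_listen_steps(uint8_t f)
{
    unsigned int steps = 0;

    if (!(f & (FD_ST_RDY | FD_ST_ACT)))
        return 0;
    if (f & (FD_ST_ERR | FD_ST_SHUT))
        steps |= DO_PAUSE;
    if (f & FD_ST_RDY)
        steps |= DO_IO;
    return steps;
}
```

Note that a bare SHUT (final shut) fails the first test in all three
handlers, so nothing runs: this is why the poller has to report SHUT|RDY on
an inactive FD.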

Kqueue reports events differently: it says EV_EOF() on READ or WRITE, that
we currently map to FD_POLL_HUP and FD_POLL_ERR. Thus kqueue reports only
POLLRDHUP and not POLLHUP, so for now a direct mapping of POLLHUP to
FD_POLL_HUP does NOT imply write closed with kqueue while it does for others.

Other approach, use the {RD,WR}_{ERR,SHUT,RDY} flags to build a composite
status in each poller and pass this to fd_update_events(). We normally
have enough to be precise, and the latter will then rework the events.
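
A sketch of what building such a composite status could look like. The flag
values and both function names are hypothetical. The first function uses the
standard poll() bits, where POLLHUP is taken to close both directions as
stated above; the second mimics a per-filter poller such as kqueue with plain
integer parameters (no kqueue header is used), where an EOF only concerns the
direction of the filter that reported it.

```c
#include <poll.h>
#include <stdint.h>

/* Hypothetical values for the {RD,WR}_{ERR,SHUT,RDY} flags. */
#define RD_RDY  0x01
#define RD_SHUT 0x02
#define RD_ERR  0x04
#define WR_RDY  0x08
#define WR_SHUT 0x10
#define WR_ERR  0x20

/* poll()-like pollers: HUP and ERR apply to both directions. */
static uint8_t composite_from_poll(short revents)
{
    uint8_t st = 0;

    if (revents & POLLIN)
        st |= RD_RDY;
    if (revents & POLLOUT)
        st |= WR_RDY;
    if (revents & POLLHUP)
        st |= RD_SHUT | WR_SHUT;
    if (revents & POLLERR)
        st |= RD_ERR | WR_ERR;
    return st;
}

/* Per-filter pollers: one event describes a single direction.
 * <is_write> selects the direction, <has_data> says whether I/O is
 * possible, <eof> and <error> stand for the EOF indication and for the
 * error found along with it.
 */
static uint8_t composite_from_filter(int is_write, int has_data,
                                     int eof, int error)
{
    uint8_t st = 0;

    if (has_data)
        st |= is_write ? WR_RDY : RD_RDY;
    if (eof)
        st |= is_write ? WR_SHUT : RD_SHUT;
    if (error)
        st |= is_write ? WR_ERR : RD_ERR;
    return st;
}
```

With this split, an EOF on the read filter never sets WR_SHUT, so
fd_update_events() no longer has to guess what a poller meant by HUP.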

FIXME: Normally on KQUEUE we're supposed to look at kev[].fflags to get the error
on EV_EOF() on read or write.