in the read after write scenario, why not use something like consistency tokens ? and redirect to primary if the secondary detects it has not caught up ?
I spoke about this exact thing at a conference (HPTS’19) a while back. This can work, but introduces modal behaviors into systems that make reasoning about availability very difficult and tends to cause meta stable behaviors and long outages.
The feedback loop is replicas slow -> traffic increases to primary -> primary slows -> replicas slow, etc. The only way out of this loop is to shed traffic.
Nice! I created tuningfork [1] a couple of months ago that proxies traffic through another node for the configured upstream. I wanted to understand networks, so rolled my own thing. And I wanted to bypass age verification laws in UK :)