Requests to Orchestrator API are timed out after 5 seconds #40

@o-fedorov

The Problem

The following commit extracts the transport of the RAFT client and reuses it as the transport of the Orchestrator API reverse proxy: b0aa7b8

Previously, the proxy's transport was http.DefaultTransport with ResponseHeaderTimeout set to 30 seconds. After the mentioned change, ResponseHeaderTimeout is set to config.ActiveNodeExpireSeconds, which is equal to 5 seconds.

A short timeout looks fine for a RAFT client, but it is too short for a general API request.

A practical example: for the infrastructure managed by my team, a call to the graceful-master-takeover API usually takes 10-15 seconds. This means we can never get a successful response from this API endpoint. (Fortunately, only the reverse proxy transport times out; the takeover itself keeps running to completion.)

Related Code

The transport for reverse proxy is set here:

proxy.Transport, err = orcraft.GetRaftHttpTransport()

Right now a single transport instance is defined and cached in GetRaftHttpTransport here:

func GetRaftHttpTransport() (*http.Transport, error) {
	// Checks whether there is a cached httpTransport to return:
	if httpTransport != nil {
		return httpTransport, nil
	}
	httpTimeout := time.Duration(config.ActiveNodeExpireSeconds) * time.Second

Note that config.ActiveNodeExpireSeconds is hardcoded and cannot be changed via a config file.
Also note that most of the RAFT API endpoints do not use the reverse proxy:

orchestrator/go/http/api.go

Lines 3953 to 3964 in 1754ca9

this.registerAPIRequestNoProxy(m, "raft-yield/:node", this.RaftYield)
this.registerAPIRequestNoProxy(m, "raft-yield-hint/:hint", this.RaftYieldHint)
this.registerAPIRequestNoProxy(m, "raft-peers", this.RaftPeers)
this.registerAPIRequestNoProxy(m, "raft-state", this.RaftState)
this.registerAPIRequestNoProxy(m, "raft-leader", this.RaftLeader)
this.registerAPIRequestNoProxy(m, "raft-health", this.RaftHealth)
this.registerAPIRequestNoProxy(m, "raft-status", this.RaftStatus)
this.registerAPIRequestNoProxy(m, "raft-snapshot", this.RaftSnapshot)
this.registerAPIRequestNoProxy(m, "raft-follower-health-report/:authenticationToken/:raftBind/:raftAdvertise", this.RaftFollowerHealthReport)
this.registerAPIRequestNoProxy(m, "reload-configuration", this.ReloadConfiguration)
this.registerAPIRequestNoProxy(m, "hostname-resolve-cache", this.HostnameResolveCache)
this.registerAPIRequestNoProxy(m, "reset-hostname-resolve-cache", this.ResetHostnameResolveCache)

Proposed Solution

To deal with the issue, I would like to make the following changes:

  1. Define a separate transport for raftReverseProxy.
  2. Make the timeout of the raftReverseProxy transport configurable, defaulting to 30 seconds.

This way, RAFT clients will still time out after 5 seconds, and I, as a user, will be able to configure the reverse proxy timeout for regular Orchestrator API requests.

Please let me know whether the proposed solution makes sense, and whether it is OK for me to open a PR with the related changes.
