Channel: WE MOVED to github.com/microsoft/cpprestsdk. This site is not monitored!

Commented Issue: Race condition with timeouts in http_linux.cpp [280]

_Reported by internal customer_

I think we’ve found the root cause of the issue I was asking about below. It looks like there’s a race condition in http_linux.cpp. Around line 415:
```
ctx->set_timer(static_cast<int>(client_config().timeout().count()));

if (ctx->m_connection->socket().is_open())
{
    // If socket is already open (connection is reused), try to write the request directly.
    write_request(ctx);
}
else
{
    // If the connection is new (unresolved and unconnected socket), then start async
    // call to resolve first, leading eventually to request write.
    tcp::resolver::query query(host, utility::conversions::print_string(port));

    m_resolver.async_resolve(query,
                             boost::bind(&linux_client::handle_resolve,
                                         shared_from_this(),
                                         boost::asio::placeholders::error,
                                         boost::asio::placeholders::iterator,
                                         ctx));
}
```
It looks like the timer is set before the socket is necessarily open. The timer-firing code is handle_timeout_timer below. If the timer fires immediately and the socket is not yet open, m_connection->cancel() fails, which raises the “Failed to cancel the socket” error. I believe this is what we’re seeing. (Due to an unrelated issue, we were passing a negative value for client_config().timeout(), which causes the timer to fire immediately.) I’m not sure how you would want to handle negative or zero timeouts (some people might assume that a negative value means “no timeout”), but the same race looks possible even with very small positive timeouts.
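One possible way to rule out the immediate-fire case is to validate the timeout before it ever reaches set_timer(). The sketch below is only an illustration under assumptions: checked_timeout_seconds is a hypothetical helper, not part of the library, and rejecting non-positive values is just one policy (treating them as “no timeout” would be another). It does nothing about the race with very small positive timeouts.

```cpp
#include <chrono>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of cpprestsdk): validates a configured timeout
// before it is used to arm the timer. Policy assumed here: zero and negative
// values are rejected outright instead of producing a timer that fires at once.
inline int checked_timeout_seconds(std::chrono::seconds requested)
{
    if (requested.count() <= 0)
    {
        throw std::invalid_argument("timeout must be a positive number of seconds");
    }

    // Clamp to the range of int so the narrowing cast below cannot overflow.
    const auto max_int =
        static_cast<std::chrono::seconds::rep>(std::numeric_limits<int>::max());
    const auto clamped = requested.count() < max_int ? requested.count() : max_int;
    return static_cast<int>(clamped);
}
```

Assuming timeout() yields a std::chrono::seconds value, the first line of the excerpt could then read ctx->set_timer(checked_timeout_seconds(client_config().timeout())).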

I haven’t checked the uses of reset_timer() to see whether there is a similar issue there. I also don’t have a fix to recommend at the moment. You could check whether the socket is open in the timeout handler, but in this case you’d probably actually want to fail the request. Anyway, for the time being I’ll just make sure that all timeouts in our testing are at least 30 seconds, which should work around this for our development purposes.
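The idea of checking the socket in the timeout handler while still failing the request could look roughly like the sketch below. It uses stand-in types rather than the real client classes: fake_connection, fake_request_context and handle_timeout are hypothetical names, and this is not the fix that was actually committed. The stand-in cancel() throws when the socket is not open, to mimic the failure described above.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the real connection object (hypothetical, for illustration only).
struct fake_connection
{
    bool open = false;
    bool cancelled = false;

    bool is_open() const { return open; }

    // Mimics the reported behavior: cancelling an unopened socket fails.
    void cancel()
    {
        if (!open)
        {
            throw std::runtime_error("Failed to cancel the socket");
        }
        cancelled = true;
    }
};

// Stand-in for the per-request context (hypothetical).
struct fake_request_context
{
    fake_connection connection;
    bool timed_out = false;
    std::string error;
};

// Timeout handler sketch: always mark the request as timed out, but only
// cancel pending socket operations when there is an open socket to cancel.
inline void handle_timeout(fake_request_context& ctx)
{
    ctx.timed_out = true;

    if (ctx.connection.is_open())
    {
        ctx.connection.cancel();
    }
    else
    {
        // Nothing to cancel yet; record the failure so the request is not lost.
        ctx.error = "request timed out before the connection was established";
    }
}
```

In a real client, the resolve and connect handlers would also need to check a flag like timed_out and stop before writing the request; that part is omitted here.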

Comments: Fixed in the development branch; the fix will be in the 2.3.0 release.
