Hi Ganesha,
To give some background information: the task returned from http_client::request(...) completes once the HTTP headers have arrived. This does NOT mean that the entire HTTP response body has arrived; the body could still be in transit in chunks. The tasks returned from http_response::content_ready() or any of the http_response::extract_* functions wait until the entire message body has arrived. It is possible the download is taking up some of the time spent waiting on the task from extract_json(). Parsing the JSON text could also be taking some of the time.
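For illustration, here is a minimal sketch of that timing against a hypothetical endpoint (the URL, path, and function name are placeholders; the rest is the usual cpprestsdk API):

#include <cpprest/http_client.h>
#include <cpprest/json.h>

using namespace web;
using namespace web::http;
using namespace web::http::client;

void fetch_json_example()
{
    http_client client(U("http://example.com")); // hypothetical endpoint
    client.request(methods::GET, U("/data"))
        .then([](http_response response)
        {
            // Headers are available here, but the body may still be in transit.
            return response.extract_json(); // completes only once the entire body has arrived and been parsed
        })
        .then([](json::value body)
        {
            // body is the fully parsed JSON document
        })
        .wait();
}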
Internally all HTTP message bodies are stored in a stream, by default a stream backed by a producer_consumer_buffer. We read data from the socket a chunk at a time; by default the chunk size we use is 64k. If you know the size of your dataset, you can try adjusting the chunk size to minimize the number of read operations performed. For example, you could try setting it to 1MB to avoid repeated reads. You can set the chunk size with the http_client_config::set_chunksize(...) API.
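As a sketch (the 1MB value and the endpoint are just example values):

#include <cpprest/http_client.h>

using namespace web::http::client;

void make_client_with_larger_chunks()
{
    http_client_config config;
    config.set_chunksize(1024 * 1024); // 1MB instead of the 64k default
    http_client client(U("http://example.com"), config); // hypothetical endpoint
    // issue requests with this client as usual
}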
The next improvement you could try is to avoid the unnecessary copying performed when writing to and reading from the internal producer-consumer stream buffer we use by default. If you know the exact size and content of your incoming data you can perform further optimizations. For example, if you just want the HTTP response body as a std::string and you know it is coming across as UTF-8, you can create a stream buffer backed by a std::string. Additionally, you can perform all the heap allocations up front if you know the size. To quickly summarize, this would involve steps like the following (a sketch putting them together follows the list):
- Create a container_buffer<std::string>
- Reserve the capacity to allocate all the memory up front using container_buffer::collection().reserve(size)
- Before sending the HTTP request, set the container buffer that the HTTP response body will be written into with http_request::set_response_stream(...)
- After sending the HTTP request, you can be signaled that the body has entirely arrived with the task returned from http_response::content_ready()
- Access the underlying std::string from the container buffer by moving it out of the buffer, with something like std::string str = std::move(buffer.collection());
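Putting those steps together, a minimal sketch might look like the following (the endpoint, function name, and expected_size are placeholders for your own values):

#include <cpprest/http_client.h>
#include <cpprest/containerstream.h>

using namespace web::http;
using namespace web::http::client;
using namespace concurrency::streams;

void download_body_to_string(size_t expected_size)
{
    http_client client(U("http://example.com")); // hypothetical endpoint

    // Back the response body with a std::string and reserve all the memory up front.
    container_buffer<std::string> buffer;
    buffer.collection().reserve(expected_size);

    http_request request(methods::GET);
    request.set_response_stream(buffer.create_ostream());

    client.request(request)
        .then([](http_response response)
        {
            // Completes once the entire body has been written into the container buffer.
            return response.content_ready();
        })
        .then([&buffer](http_response)
        {
            // Move the string out of the buffer instead of copying it.
            std::string body = std::move(buffer.collection());
        })
        .wait();
}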
Steve