
Requirements and Design #11

Closed · sbordet opened this issue Feb 24, 2017 · 9 comments

Comments

sbordet commented Feb 24, 2017

Loud thoughts about the requirements and the design of this library.

sbordet commented Feb 24, 2017

The important parameters for the client are:

  • t, the number of sending threads
  • c, the number of "channels" (connections for HTTP/1.1, streams for HTTP/2)
  • p, the request period (inverse of the request rate)

Bomb the server

t=1, c=max, p=0

Because p=0, we want the sender thread to burn 100% CPU, and this can only be achieved if there are enough channels so that there is always a channel available for a request to be sent.
If requests start queuing up, there is an artificial client-side latency (between queuing a request and actually sending it) that would be an artifact of the load generator.
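A minimal sketch of such a sender loop, with a hypothetical `Channel` abstraction (a connection for HTTP/1.1, a stream for HTTP/2); none of these names are actual library API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BusySender implements Runnable
{
    // Hypothetical channel abstraction: an HTTP/1.1 connection or an HTTP/2 stream.
    public interface Channel
    {
        void send(); // Fire-and-forget: the response is handled by executor threads.
    }

    private final BlockingQueue<Channel> idleChannels = new LinkedBlockingQueue<>();

    public BusySender(Iterable<Channel> channels)
    {
        channels.forEach(idleChannels::add);
    }

    // A response thread returns the channel to the pool when the response completes.
    public void release(Channel channel)
    {
        idleChannels.offer(channel);
    }

    @Override
    public void run()
    {
        // p=0: no pause between requests, so the single sender thread burns 100% CPU.
        while (!Thread.currentThread().isInterrupted())
        {
            Channel channel = idleChannels.poll();
            if (channel == null)
                continue; // No idle channel: queuing here would add artificial latency.
            channel.send();
        }
    }
}
```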

Using more than 1 thread could be interesting.
However, we should take into account that if 1 core is busy sending, threads from the executor will be used to handle the responses.
A response thread originates from the ManagedSelector via its ExecutionStrategy.
It receives the response and then tries to send a queued request on the same channel, if any.
Therefore, response threads typically run a short task (receive a response + possibly send a request).
With a large number of responses/s and a large number of connections, many tasks per unit of time will be submitted to the executor queue to receive responses.
It would be great if each of these tasks had a thread available, but the limited number of cores means they will compete for a free core anyway. This will generate an artificial client-side latency.
TODO: perhaps it's worth using a ProduceConsume ExecutionStrategy here?

The takeaway here is that we need to monitor both the request queue (to avoid a request sitting in the queue for too long) and the executor (to avoid a response sitting in the queue for too long).
The former can be done via request queued and request begin events (see the sketch below), the latter via - for example - https://github.com/cometd/cometd/blob/3.1.1/cometd-java/cometd-java-benchmark/cometd-java-benchmark-common/src/main/java/org/cometd/benchmark/MonitoringThreadPoolExecutor.java.
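For the request-queue side, here is a sketch using Jetty's client request events (`onRequestQueued` and `onRequestBegin`, from the Jetty 9 client API) to measure how long a request sits in the queue before it is actually sent; the class name and the aggregation are illustrative:

```java
import java.util.concurrent.atomic.LongAdder;

import org.eclipse.jetty.client.api.Request;

public class QueueLatencyMonitor
{
    private final LongAdder totalQueuedNanos = new LongAdder();
    private final LongAdder requestCount = new LongAdder();

    // Attach queued/begin listeners to a request before sending it.
    public Request instrument(Request request)
    {
        long[] queuedAt = new long[1];
        return request
            .onRequestQueued(r -> queuedAt[0] = System.nanoTime())
            .onRequestBegin(r ->
            {
                totalQueuedNanos.add(System.nanoTime() - queuedAt[0]);
                requestCount.increment();
            });
    }

    public double averageQueuedMillis()
    {
        long count = requestCount.sum();
        return count == 0 ? 0 : totalQueuedNanos.sum() / (count * 1_000_000D);
    }
}
```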

sbordet commented Feb 24, 2017

HTTP/1.1 browser simulation

t=1, c=6, p=0
Here the goal is to reduce the number of channels to 6 to simulate the browser connection pool.
Therefore this parameter should be easily configurable in the library.

HTTP/2 browser simulation

t=1, c=100, p=0
Same as above, but there can typically be up to 100 streams for a single HTTP/2 connection.
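For the HTTP/1.1 case, the channel limit maps naturally onto Jetty HttpClient's connection pool (`setMaxConnectionsPerDestination()` is existing Jetty 9 API); for HTTP/2 the concurrent stream limit is negotiated with the server, so it is not shown here. A minimal sketch:

```java
import org.eclipse.jetty.client.HttpClient;

public class BrowserLikeClient
{
    public static void main(String[] args) throws Exception
    {
        HttpClient httpClient = new HttpClient();
        // Simulate the browser's HTTP/1.1 connection pool: at most 6
        // connections towards any single origin server.
        httpClient.setMaxConnectionsPerDestination(6);
        httpClient.start();
    }
}
```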

Browser simulation

For browser simulation we need the concept of a Resource that is composable as a tree.
The interesting data to collect here is the last byte of the whole tree (akin to the load event in browsers) and - as before - the latency and the response time for each node (request) of the tree.

Because we want a restricted number of channels for each "browser", we need to use an HttpClient instance to simulate each "browser".
However, that is orthogonal to the number of sender threads to use.
The executor for the response threads can be shared among all HttpClient instances, provided again that it is monitored to avoid running out of threads and introducing artificial latencies.
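A minimal sketch of the composable Resource tree, with illustrative names rather than the library's eventual API; the root would be the page, and the children the resources it references:

```java
import java.util.Arrays;
import java.util.List;

public class Resource
{
    private final String path;
    private final List<Resource> children;

    public Resource(String path, Resource... children)
    {
        this.path = path;
        this.children = Arrays.asList(children);
    }

    public String getPath()
    {
        return path;
    }

    public List<Resource> getChildren()
    {
        return children;
    }
}
```

For example, an HTML page referencing a stylesheet and two images would be `new Resource("/index.html", new Resource("/style.css"), new Resource("/hero.png"), new Resource("/logo.png"))`; the time of the last byte of the last node is the tree's "load event".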

gregw commented Feb 25, 2017

@sbordet Not sure I understand what you are saying about p=0? Are you saying that in such a state a thread should send as many requests as possible?

If so, that is entirely opposite to the design brief for this generator. I specifically requested a generator where the rate at which requests are generated is not a function of any aspect of how they are processed.

The generator should be able to be set at a given requests rate and must generate that rate of requests regardless of how many threads are needed or how much latency is taken to process each request. Perhaps there can be a monitor to detect long queues or empty thread pools to warn that the generator is above capacity, but it should never wait for a response before sending the next request.

I see only 2 primary configurations: number of channels; request rate (either per channel or in total). Thread pool size is an implementation detail.
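A sketch of that design, where a scheduler fires requests at the configured rate and never waits for responses; `sendAsync()` is a placeholder for a non-blocking send, not actual API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedRateGenerator
{
    public static void main(String[] args)
    {
        int requestsPerSecond = 100;
        long periodMicros = 1_000_000L / requestsPerSecond;

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // scheduleAtFixedRate() keeps the cadence even when a send is slow:
        // the request rate is not a function of how requests are processed.
        scheduler.scheduleAtFixedRate(FixedRateGenerator::sendAsync, 0, periodMicros, TimeUnit.MICROSECONDS);
    }

    private static void sendAsync()
    {
        // Hand the request off to the client without blocking for the response.
    }
}
```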

gregw commented Feb 25, 2017

Note also that channels is probably the wrong concept. Instead it is probably client/user, where an HTTP/1.1 client/user can have 7 connections and an HTTP/2 client/user can have one connection with multiple streams (limited to 100 by default??).

Given that, I'd probably like the request rate to be specified as the total request rate, with the generator doing the maths to work out what the rate for each client/user is. Of course, some total request rates might not be achievable with too few clients for a given server latency.
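For illustration, the maths with made-up numbers:

```java
public class RateMaths
{
    public static void main(String[] args)
    {
        int totalRate = 12_000; // requests/s across all clients/users
        int clients = 240;      // simulated clients/users
        double perClientRate = (double) totalRate / clients;        // 50 requests/s each
        long periodNanos = (long) (1_000_000_000L / perClientRate); // 20 ms between requests
        System.out.printf("per-client rate: %.1f requests/s, period: %d ms%n", perClientRate, periodNanos / 1_000_000);
    }
}
```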

sbordet commented Feb 25, 2017

@gregw, p=0 indeed means as many requests as possible. There is a distinction between how many requests you can generate per second and how many you can actually send (over the network) per second.
Consider the case with just 1 channel and HTTP/1.1: network latency and server processing will dictate the max request rate - no matter what you specify on the client.

There is a relation between the request rate and how requests are processed. The sender threads, the number of channels, and the pause between request generations (t, c and p) are the parameters that allow you to reduce that relation to a minimum; given the right values for those parameters the relation may be absent or very weak, which is what we want - and therefore we need to be able to tune those parameters.

Whether the library will be able to compute those numbers, I'm not sure. One use case is where the load tester limits c (browser simulation), and with that fixed the load tester cannot choose an arbitrary p.
Computing an optimal t may also be challenging (I've done that in the past, and it's difficult to handle spikes and smooth out oscillations that the system may produce).

What I am saying is that I want easy library methods to set t, c and p, because that is what I want to tune. Paired with feedback about request queuing, I will have a clear understanding of what the load generator is capable of, and I will know if it's exceeding its capacity.

gregw commented Feb 25, 2017 via email

sbordet commented Feb 26, 2017

@gregw p=0 is just a possible setting, like any other. A request rate of r=10,000 requests/s is no more meaningful, and it may be - in practice - equivalent to p=0.

We are on the same page with respect to producing a load that is independent of the server.

The problem is that the load tester cannot choose an arbitrary rate r (or, equivalently, an arbitrary p) and assume that the library is able to achieve it.
Not even 10 requests/s.

Of course there will be configurations that are impossible for the generator to achieve.

That is my point: we need feedback from the load generator that tells whether or not it can do what the load tester asked, given the parameters it was configured with.
The human load tester cannot guess anything, not even r=10.

If not, the load tester must be able to change the parameters, which are t, c and p, and try again.
For example, keep the rate but increase the number of channels. Or keep the number of channels and decrease the rate.

I don't see any conflict between the goals of this library and having those 3 parameters.

Reasoning about p=0 is one extreme that we may never want to use (though I don't see why not, given enough connections), but it highlights (at least for me) exactly the issues we want to avoid.

olamy commented Feb 28, 2017

Definitely agree on "that is my point: we need feedback from the load generator that tells whether or not it can do what the load tester asked, given the parameters it was configured with".
I would add something like a monitoring display, every x minutes/seconds, telling the exact request rate achieved.
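A sketch of such a periodic display, with illustrative names; a counter of sent requests is sampled and reset every reporting period:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class RateReporter
{
    private final AtomicLong sent = new AtomicLong();

    // Called by the generator every time a request is sent.
    public void onRequestSent()
    {
        sent.incrementAndGet();
    }

    // Prints the achieved request rate every periodSeconds, to compare with the target.
    public void start(int periodSeconds)
    {
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() ->
        {
            long count = sent.getAndSet(0);
            System.out.printf("achieved rate: %.1f requests/s%n", (double) count / periodSeconds);
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }
}
```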

sbordet commented Feb 2, 2021

I think this has been solved: ReportListener reports both the effective request rate and the response rate, and it's possible to tell whether the load generator is over capacity.

sbordet closed this as completed Feb 2, 2021