
Requirements and Design #11

Closed · sbordet opened this issue Feb 24, 2017 · 9 comments

Comments

sbordet commented Feb 24, 2017

Loud thoughts about the requirements and the design of this library.

sbordet commented Feb 24, 2017

The important parameters for the client are:

  • t, the number of sending threads
  • c, the number of "channels" (connections for HTTP/1.1, streams for HTTP/2)
  • p, the request period (inverse of the request rate)

Bomb the server

t=1, c=max, p=0

Because p=0, we want the sender thread to burn 100% CPU, and this can only be achieved if there are enough channels so that there is always a channel available for a request to be sent.
If requests start queuing up, there is an artificial client-side latency (between queuing a request and actually sending it) that would be an artifact of the load generator.
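A minimal sketch of such a sender loop, with a hypothetical `Channel` abstraction (a connection for HTTP/1.1, a stream for HTTP/2); none of these names are actual library API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BusySender implements Runnable
{
    // Hypothetical channel abstraction: an HTTP/1.1 connection or an HTTP/2 stream.
    public interface Channel
    {
        void send(); // Fire-and-forget: the response is handled by executor threads.
    }

    private final BlockingQueue<Channel> idleChannels = new LinkedBlockingQueue<>();

    public BusySender(Iterable<Channel> channels)
    {
        channels.forEach(idleChannels::add);
    }

    // A response thread returns the channel to the pool when the response completes.
    public void release(Channel channel)
    {
        idleChannels.offer(channel);
    }

    @Override
    public void run()
    {
        // p=0: no pause between requests, so the single sender thread burns 100% CPU.
        while (!Thread.currentThread().isInterrupted())
        {
            Channel channel = idleChannels.poll();
            if (channel == null)
                continue; // No idle channel: queuing here would add artificial latency.
            channel.send();
        }
    }
}
```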

Using more than 1 thread could be interesting.
However, we should take into account that if 1 core is busy sending, threads from the executor will be used to handle the responses.
A response thread originates from the ManagedSelector via its ExecutionStrategy.
It receives the response and then tries to send a queued request on the same channel, if any.
Therefore, response threads typically run a short task (receive a response + possibly send a request).
With a large number of responses/s and a large number of connections, many tasks per unit of time will be submitted to the executor queue to receive responses.
It would be great if each of these tasks had a thread available, but the limited number of cores means they will compete for a free core anyway. This will generate an artificial client-side latency.
TODO: perhaps it's worth using a ProduceConsume ExecutionStrategy here?

The takeaway here is that we need to monitor both the request queue (to avoid a request sitting in the queue for too long) and the executor (to avoid a response sitting in the queue for too long).
The former can be done via request queued and request begin events (see the sketch below), the latter via - for example - https://github.com/cometd/cometd/blob/3.1.1/cometd-java/cometd-java-benchmark/cometd-java-benchmark-common/src/main/java/org/cometd/benchmark/MonitoringThreadPoolExecutor.java.
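For the request-queue side, here is a sketch using Jetty's client request events (`onRequestQueued` and `onRequestBegin`, from the Jetty 9 client API) to measure how long a request sits in the queue before it is actually sent; the class name and the aggregation are illustrative:

```java
import java.util.concurrent.atomic.LongAdder;

import org.eclipse.jetty.client.api.Request;

public class QueueLatencyMonitor
{
    private final LongAdder totalQueuedNanos = new LongAdder();
    private final LongAdder requestCount = new LongAdder();

    // Attach queued/begin listeners to a request before sending it.
    public Request instrument(Request request)
    {
        long[] queuedAt = new long[1];
        return request
            .onRequestQueued(r -> queuedAt[0] = System.nanoTime())
            .onRequestBegin(r ->
            {
                totalQueuedNanos.add(System.nanoTime() - queuedAt[0]);
                requestCount.increment();
            });
    }

    public double averageQueuedMillis()
    {
        long count = requestCount.sum();
        return count == 0 ? 0 : totalQueuedNanos.sum() / (count * 1_000_000D);
    }
}
```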

sbordet commented Feb 24, 2017

HTTP/1.1 browser simulation

t=1, c=6, p=0
Here the goal is to reduce the number of channels to 6 to simulate the browser connection pool.
Therefore this parameter should be easily configurable in the library.

HTTP/2 browser simulation

t=1, c=100, p=0
Same as above, but there can typically be up to 100 streams for a single HTTP/2 connection.
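For the HTTP/1.1 case, the channel limit maps naturally onto Jetty HttpClient's connection pool (`setMaxConnectionsPerDestination()` is existing Jetty 9 API); for HTTP/2 the concurrent stream limit is negotiated with the server, so it is not shown here. A minimal sketch:

```java
import org.eclipse.jetty.client.HttpClient;

public class BrowserLikeClient
{
    public static void main(String[] args) throws Exception
    {
        HttpClient httpClient = new HttpClient();
        // Simulate the browser's HTTP/1.1 connection pool: at most 6
        // connections towards any single origin server.
        httpClient.setMaxConnectionsPerDestination(6);
        httpClient.start();
    }
}
```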

Browser simulation

For browser simulation we need the concept of a Resource that is composable as a tree.
The interesting data to collect here is the last byte of the whole tree (akin to the load event in browsers) and - as before - the latency and the response time for each node (request) of the tree.

Because we want a restricted number of channels for each "browser", we need to use an HttpClient instance to simulate each "browser".
However, that is orthogonal to the number of sender threads to use.
The executor for the response threads can be shared among all HttpClient instances, provided again that it is monitored to avoid running out of threads and introducing artificial latencies.
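A minimal sketch of the composable Resource tree, with illustrative names rather than the library's eventual API; the root would be the page, and the children the resources it references:

```java
import java.util.Arrays;
import java.util.List;

public class Resource
{
    private final String path;
    private final List<Resource> children;

    public Resource(String path, Resource... children)
    {
        this.path = path;
        this.children = Arrays.asList(children);
    }

    public String getPath()
    {
        return path;
    }

    public List<Resource> getChildren()
    {
        return children;
    }
}
```

For example, an HTML page referencing a stylesheet and two images would be `new Resource("/index.html", new Resource("/style.css"), new Resource("/hero.png"), new Resource("/logo.png"))`; the time of the last byte of the last node is the tree's "load event".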

gregw commented Feb 25, 2017

@sbordet Not sure I understand what you are saying about p=0? Are you saying that in such a state a thread should send as many requests as possible?

If so, that is entirely opposite to the design brief for this generator. I specifically requested a generator where the rate at which requests are generated is not a function of any aspect of how they are processed.

The generator should be able to be set at a given requests rate and must generate that rate of requests regardless of how many threads are needed or how much latency is taken to process each request. Perhaps there can be a monitor to detect long queues or empty thread pools to warn that the generator is above capacity, but it should never wait for a response before sending the next request.

I see only 2 primary configurations: number of channels; request rate (either per channel or in total). Thread pool size is an implementation detail.
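A sketch of that design, where a scheduler fires requests at the configured rate and never waits for responses; `sendAsync()` is a placeholder for a non-blocking send, not actual API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedRateGenerator
{
    public static void main(String[] args)
    {
        int requestsPerSecond = 100;
        long periodMicros = 1_000_000L / requestsPerSecond;

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // scheduleAtFixedRate() keeps the cadence even when a send is slow:
        // the request rate is not a function of how requests are processed.
        scheduler.scheduleAtFixedRate(FixedRateGenerator::sendAsync, 0, periodMicros, TimeUnit.MICROSECONDS);
    }

    private static void sendAsync()
    {
        // Hand the request off to the client without blocking for the response.
    }
}
```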

gregw commented Feb 25, 2017

Note also that channels is probably the wrong concept. Instead it is probably client/user, where an HTTP/1.1 client/user can have 7 connections and an HTTP/2 client/user can have one connection with multiple streams (limited to 100 by default??).

Given that, I'd probably like the request rate to be specified as the total request rate, with the generator doing the maths to work out what the rate for each client/user is. Of course, some total request rates might not be achievable with too few clients for a given server latency.
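For illustration, the maths with made-up numbers:

```java
public class RateMaths
{
    public static void main(String[] args)
    {
        int totalRate = 12_000; // requests/s across all clients/users
        int clients = 240;      // simulated clients/users
        double perClientRate = (double) totalRate / clients;        // 50 requests/s each
        long periodNanos = (long) (1_000_000_000L / perClientRate); // 20 ms between requests
        System.out.printf("per-client rate: %.1f requests/s, period: %d ms%n", perClientRate, periodNanos / 1_000_000);
    }
}
```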

sbordet commented Feb 25, 2017

@gregw, p=0 indeed means as many requests as possible. There is a distinction between how many requests you can generate per second and how many you can actually send (over the network) per second.
Consider the case with just 1 channel and HTTP/1.1: network latency and server processing will dictate the max request rate - no matter what you specify on the client.

There is a relation between the request rate and how requests are processed. The sender threads, the number of channels, and the pause between request generations (t, c and p) are the parameters that allow you to reduce that relation to a minimum; given the right values for those parameters the relation may be absent or very weak, which is what we want - and therefore we need to be able to tune those parameters.

Whether the library will be able to compute those numbers, I'm not sure. One use case is where the load tester limits c (browser simulation), and with that fixed the load tester cannot choose an arbitrary p.
Computing an optimal t may also be challenging (I've done that in the past, and it's difficult to handle spikes and smooth out oscillations that the system may produce).

What I am saying is that I want easy library methods to set t, c and p, because that is what I want to tune. Paired with feedback about request queuing, I will have a clear understanding of what the load generator is capable of, and I will know if it's exceeding its capacity.

gregw commented Feb 25, 2017 via email

sbordet commented Feb 26, 2017

@gregw p=0 is just a possible setting, like any other. A request rate of r=10,000 requests/s is no more meaningful, and it may be - in practice - equivalent to p=0.

We are on the same page with respect to producing a load that is independent of the server.

The problem is that the load tester cannot choose an arbitrary rate r (or, equivalently, an arbitrary p) and assume that the library is able to achieve it.
Not even 10 requests/s.

Of course there will be configurations that are impossible for the generator to achieve.

That is my point: we need feedback from the load generator that tells whether or not it can do what the load tester asked, given the parameters it was configured with.
The human load tester cannot guess anything, not even r=10.

If not, the load tester must be able to change the parameters, which are t, c and p, and try again.
For example, keep the rate but increase the number of channels. Or keep the number of channels and decrease the rate.

I don't see any conflict between the goals of this library and having those 3 parameters.

Reasoning about p=0 is one extreme that we may never want to use (though I don't see why not, given enough connections), but it highlights (at least for me) exactly the issues we want to avoid.

olamy commented Feb 28, 2017

Definitely agree on "that is my point: we need feedback from the load generator that tells whether or not it can do what the load tester asked, given the parameters it was configured with".
I would add something like a monitoring display, every x minutes/seconds, telling the exact request rate achieved.
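A sketch of such a periodic display, with illustrative names; a counter of sent requests is sampled and reset every reporting period:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class RateReporter
{
    private final AtomicLong sent = new AtomicLong();

    // Called by the generator every time a request is sent.
    public void onRequestSent()
    {
        sent.incrementAndGet();
    }

    // Prints the achieved request rate every periodSeconds, to compare with the target.
    public void start(int periodSeconds)
    {
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() ->
        {
            long count = sent.getAndSet(0);
            System.out.printf("achieved rate: %.1f requests/s%n", (double) count / periodSeconds);
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }
}
```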

sbordet commented Feb 2, 2021

I think this has been solved: ReportListener reports both the effective request rate and the response rate, and it's possible to tell whether the load generator is over capacity.

sbordet closed this as completed Feb 2, 2021