Analysis of HTTP/1.0 Performance Problems

Simon E Spero (ses@tipper.oit.unc.edu)
Sat, 16 Jul 94 04:47:26 -0400



Ok, pop quiz. There's a server on a bus. It's rigged
to blow if it drops below 50 transactions per second. What
do you do?
WWW-SPEED

This is the first in a series of reports on performance problems in
the web, and how they can be addressed. This paper discusses some
problems with HTTP/1.0. It includes a step by step walk through of
HTTP as seen through the eyes of a packet sniffer, together with
a look at how various changes to the protocol model could improve
performance.

I'll be following this up in a couple of days with a comparative
review of several proposed changes to HTTP.

This paper is a first draft; there are still some parts that need more
work, and I'll be adding some diagrams to the HTML version later. I'll
also be putting a copy of the HTML version up on the WWW-SPEED server
(currently http://elanor.oit.unc.edu/)

Y'all have fun now,

Simon

p.s.

Since this is of reasonably general interest, and since www-speed is a
new list, I'm Cc:ing www-talk. I'll try and avoid this in the future.


Title: Analysis of HTTP Performance Problems

Author: Simon E Spero

Abstract:

This paper is the first in a series on performance issues in the
World Wide Web.

HTTP is a transfer protocol used by the World Wide Web distributed
hypermedia system to retrieve distributed objects. HTTP uses TCP as
a transport layer. Certain design features of HTTP interact badly with
TCP, causing problems with performance and with server scalability.
Latency problems are caused by opening a new connection per request,
which incurs connection setup and slow-start costs. Further avoidable
latency is incurred because the protocol returns only a single object
per request. Scalability problems are caused by TCP requiring a server
to maintain state for all recently closed connections.

------------------------------

Summary of HTTP/1.0
-------------------

HTTP [http] is a transfer protocol used by the World Wide Web to
retrieve information from distributed servers. The HTTP model is
extremely simple; the client establishes a connection to the remote
server, then issues a request. The server then processes the request,
returns a response, and closes the connection.

Requests.

The request format for HTTP is quite simple. The first line specifies
a method, together with the name of the object to apply the method
to. The most commonly used method is "GET", which asks the server to
send a copy of the object to the client.

The client can also send a series of optional headers; these headers are
in RFC-822 format. The most common headers are "Accept", which tells the server
which object types the client can handle, and "User-Agent", which gives the
implementation name of the client.

Request syntax.

<METHOD> <URI> "HTTP/1.0" <crlf>
{<Header>: <Value> <crlf>}*
<crlf>

Example.

GET /index.html HTTP/1.0
Accept: text/plain
Accept: text/html
Accept: */*
User-Agent: Yet Another User Agent
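
To make the request/response cycle concrete, here is a minimal sketch
(in Python; it is an illustration, not part of the protocol definition,
and the host name is hypothetical) of a client performing this exchange
over a TCP socket:

import socket

def fetch(host, path, port=80):
    """Issue one HTTP/1.0 GET and read the reply until the server closes."""
    request = ("GET " + path + " HTTP/1.0\r\n"
               "Accept: text/plain\r\n"
               "Accept: text/html\r\n"
               "Accept: */*\r\n"
               "User-Agent: Yet Another User Agent\r\n"
               "\r\n")
    sock = socket.create_connection((host, port))   # TCP three-way handshake
    sock.sendall(request.encode("ascii"))
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:            # empty read: the server closed the connection
            break
        chunks.append(data)
    sock.close()
    return b"".join(chunks)

print(fetch("www.example.com", "/index.html").decode("latin-1"))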

Responses.

The response format is also quite simple. Responses start with a
status line indicating which version of HTTP the server is running,
together with a result code and an optional message.

This is followed by a series of optional object headers; the most
important of these are "Content-Type", which describes the type of the
object being returned, and "Content-Length", which indicates its
length. The headers are terminated by an empty line.

The server now sends any requested data. After the data have been
sent, the server drops the connection.

Response syntax.

"HTTP/1.0" <result-code> [<message>] <crlf>
{<Header>: <Value> <crlf>}*
<crlf>

Example.

HTTP/1.0 200 OK
Server: MDMA/0.1
MIME-version: 1.0
Content-type: text/html
Last-Modified: Thu Jul 7 00:25:33 1994
Content-Length: 2003

<title>MDMA - Multithreaded Daemon for Multimedia Access</title>
<hr>
....
<hr>
<h2> MDMA - The speed daemon </h2>
<hr>

[Connection closed by foreign host]
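
A matching sketch (again Python, and again only an illustration) of
splitting such a response into its parts, following the syntax above:
a status line, optional RFC-822 style headers, an empty line, then the
data:

def parse_response(raw: bytes):
    head, _, body = raw.partition(b"\r\n\r\n")   # the empty line ends the headers
    status, *header_lines = head.decode("ascii", "replace").split("\r\n")
    parts = status.split(" ", 2)                 # "HTTP/1.0" <result-code> [<message>]
    version, code = parts[0], int(parts[1])
    message = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, message, headers, body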

Access Patterns
---------------

HTTP accesses usually exhibit a common pattern of behaviour. A client
requests a hypertext page, then issues a sequence of requests to
retrieve any icons referenced in the first document. Once the client
has retrieved the icons, the user will typically select a hypertext
link to follow. In the majority of cases, the referenced page will be
on the same server as the original page.

HTTP Illustrated
----------------

The problems with HTTP can best be understood by looking at the
network traffic generated by a typical HTTP transaction. This example
was generated by using Van Jacobson's tcpdump program to monitor a
client at UNC fetching a copy of the NCSA Home page. This page is 1668
bytes long, including response headers. The client at UNC was a Sun
Sparcstation 20/512, running Solaris 2.3. The server at NCSA was a
Hewlett Packard 9000/735 running HP-UX.

The headers used in the request shown were captured from an xmosaic
request. The headers consisted of 42 lines totaling around 1130
bytes; of these lines, 41 were "Accept".

Stage 1: Time = 0

The trace begins with the client sending a connection request to the
http port on the server.

.00000 unc.32840 > ncsa.80: S 2785173504:2785173504(0) win 8760 <mss 1460> (DF)

Stage 2: Time = 0.077

The server responds to the connect request with a connect
response. The client acknowledges the connect response, and sends the
first 536 bytes of the request.

.07769 ncsa.80 > unc.32840: S 530752000:530752000(0) ack 2785173505 win 8192
.07784 unc.32840 > ncsa.80: . ack 1 win 9112 (DF)
.07989 unc.32840 > ncsa.80: P 1:537(536) ack 1 win 9112 (DF)

Stage 3: Time = 0.35

The server acknowledges the first part of the request. The client
then sends the second part, and without waiting for a response,
follows up with the third and final part of the request.

.35079 ncsa.80 > unc.32840: . ack 537 win 8192
.35092 unc.32840 > ncsa.80: . 537:1073(536) ack 1 win 9112 (DF)
.35104 unc.32840 > ncsa.80: P 1073:1147(74) ack 1 win 9112 (DF)

Stage 4: Time = 0.45

The server sends a packet acknowledging the second and third parts of
the request, and containing the first 512 bytes of the response. It
follows this with another packet containing the second 512 bytes.
The client then sends a message acknowledging the first two response
packets.

.45116 ncsa.80 > unc.32840: . 1:513(512) ack 1147 win 8190
.45473 ncsa.80 > unc.32840: . 513:1025(512) ack 1147 win 8190
.45492 unc.32840 > ncsa.80: . ack 1025 win 9112 (DF)

Stage 5: Time = 0.53

The server sends the third and fourth response packet. The fourth
packet also contains a flag indicating that the connection is being
closed. The client acknowledges the data, then sends a message
announcing that it too is closing the connection. From the point of
view of the client program, the transaction is now over.

.52521 ncsa.80 > unc.32840: . 1025:1537(512) ack 1147 win 8190
.52746 ncsa.80 > unc.32840: FP 1537:1853(316) ack 1147 win 8190
.52755 unc.32840 > ncsa.80: . ack 1854 win 9112 (DF)
.52876 unc.32840 > ncsa.80: F 1147:1147(0) ack 1854 win 9112 (DF)

Stage 6: Time = 0.60

The server acknowledges the close.

.59904 ncsa.80 > unc.32840: . ack 1148 win 8189

Very nice, but what does it all mean?
-------------------------------------

This trace is quite revealing. It seems that HTTP spends more time
waiting than it does transferring data. Before we take a look at what
causes these delays, we'll calculate a couple of metrics that will
help us work out what's going on.

One important metric is the Round Trip Time (RTT). This is the time
taken to send a packet from one end of the connection to the other and
back. Looking at the gap between stages one and two, and between stages
five and six, we can see that in our example the RTT is around 70
milliseconds.

Another important metric is the bandwidth of the connection. This is
a measure of how many bits per second our connection can carry. We can
establish a lower bound on the available bandwidth by noting how long
it took to send one of the data packets.

The two data packets in stage four arrive with a gap of 3.57 ms;
since each packet carries 512 bytes of data, this puts a lower limit
on the available bandwidth of about 1.15 Mbps (143,750 bytes per second).
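
Both figures can be read straight off the trace. The following lines
reproduce the calculation (illustrative arithmetic, not new
measurements):

fin_sent, fin_ack = 0.52876, 0.59904   # stage 5 close, stage 6 acknowledgement
rtt = fin_ack - fin_sent               # ~70 ms round trip time

gap = 0.45473 - 0.45116                # 3.57 ms between the two data packets
bandwidth = 512 / gap                  # bytes per second through one segment
print(rtt, bandwidth)                  # ~0.070 s and ~143,400 B/s, i.e. the
                                       # ~1.15 Mbps quoted above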

Connection Establishment.

TCP establishes connections via a three-way handshake. The client
sends a connection request, the server responds, and the client
acknowledges the response. The client can send data along with the
acknowledgement.

Since the client must wait for the server to send its connection
response, this procedure sets a lower bound on the transaction time of
two RTTs.

Data transfer: Segments.

When TCP transfers a stream of data, it breaks up the stream into small
segments. The size of each segment can vary up to a maximum segment
size (MSS). Although the segment size can be negotiated (the UNC host
advertises an MSS of 1460 bytes in the first packet), this negotiation
is optional. For remote connections, the MSS defaults to 536 bytes.

Since the NCSA host did not advertise an MSS, the UNC host uses the
default value of 536 for this session.

Data transfer: Windows and Slow Start.

Instead of having to wait for each packet to be acknowledged, TCP
allows the sender to send out new segments even though it may not have
received acknowledgements for previous segments.

To prevent the sender from overflowing the receiver's buffers, in each
segment the receiver tells the sender how much data it is prepared to
accept without acknowledgments. This value is known as the window
size.

Although the window size tells the sender the maximum amount of
unacknowledged data the receiver is prepared to let it have
outstanding, the receiver cannot know how much data the connecting
networks are prepared to carry. If the network is quite congested,
sending a full window's worth of data will cause even worse
congestion. The ideal transmission rate is one at which
acknowledgements and outgoing packets enter the network at the same
rate.

TCP determines the best rate to use through a process called Slow
Start. With Slow Start, the sender maintains a second window of
unacknowledged segments known as the Congestion Window. When a
connection first starts up, each sender is only allowed to have a
single unacknowledged segment in transit. Every time a segment is
acknowledged without loss, the congestion window is opened; every time
a segment is lost and times out, the window is closed.
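
The cost in round trips is easy to model. The sketch below is a
loss-free simplification (no real TCP stack is this simple): it counts
how many send-and-wait rounds slow start needs to move a given number
of segments, with the congestion window opening by one segment per
acknowledged segment:

def slow_start_rounds(n_segments, cwnd=1, receiver_window=17):
    rounds, sent = 0, 0
    while sent < n_segments:
        burst = min(cwnd, receiver_window, n_segments - sent)
        sent += burst       # send a full congestion window of segments...
        rounds += 1         # ...then wait one round trip for the ACKs
        cwnd += burst       # each ACK opens the window by one segment
    return rounds

print(slow_start_rounds(3))           # the client's 3-segment request: 2 rounds
print(slow_start_rounds(4, cwnd=2))   # the server's 4-segment response: 2 rounds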

This approach is ideal for normal connections; these connections tend
to last a relatively long time, and the effects of slow start are
negligible. However, for short-lived connections like those used in
HTTP, the effect of slow start is devastating.

How Slow Start affects HTTP.

HTTP is hurt by slow start on both the client and server sides.
Because the HTTP headers are longer than the MSS, the client TCP needs
three segments to carry the request. Because the congestion window is
initialised to one, we need to wait for the first segment (Stage 2) to
be acknowledged before we can send the second and third (Stage 3). This
adds an extra RTT onto the minimum time for the transaction.

When the server is ready to send the response, it starts with a
congestion window set to 2. This is because the acknowledgement it
sent in Stage 3 is counted as a successful transmission, allowing the
congestion window to open up a notch before it comes time to send
data.

Although the server's congestion window is already slightly open, it
still isn't open wide enough to send the entire response without
pausing. In Stage 4, the server sends two segments carrying a
combined 1K of data, but then needs to wait for the
acknowledgement from the client before it can send the final two
segments in Stage 5.

Since most pages are larger than 1K in size, slow start in the server
will typically add at least one RTT to the total transaction
time. Longer documents may experience several more slow-start induced
delays, until the congestion window is as big as the receiver's window.

Readers should note that if the server were to support the negotiation
of a larger MSS, this particular transaction would not have incurred
any slow start delays. However, typical document sizes will cause at
least one slow start pause when sending the result.

Latency and Bandwidth
---------------------

Latency and bandwidth are the two keys to protocol
performance. Latency, as measured by the RTT, is a measure of the
fixed costs of a transaction and does not change as documents get
bigger or smaller. Bandwidth, as we discussed earlier, determines
how long it takes to send the data itself.

Future Networks.

Improving available bandwidth is relatively easy: all you need to do
is buy a faster network. Reducing latency is somewhat harder; there
are only two known ways to achieve this: either change the speed of
light, or move. Since placing workstations in particle accelerators
has been known to cause data loss, and since the second approach is in
general contrary to the philosophy of distributed systems, latency is
increasingly becoming the dominant factor in protocol performance.

Effects of opening a new connection per transaction
---------------------------------------------------

The cost of opening a new connection for each transaction can be
clearly seen by estimating the transaction time as it would have been
if we had been reusing an existing connection.

In our example, the total transaction time was 530 ms.

If we were reusing an existing connection, the transaction time could
be calculated as the time to send the request, plus the round trip
time, plus any processing time on the server, plus the time to send
the response. The only figure in this list for which we have no
estimate is the server processing time. This depends heavily upon
machine load, but for our purposes we can take the gap between sending
the request and receiving the response (Stages 3 and 4), and subtract
the round trip time. In this case, the approximate value is about 30 ms.

Using the above method, the expected transaction time for an already
open connection would be

(1,146 / 143,750) + 0.07 + 0.03 + (1,854 / 143,750), or about 120 ms.
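
The same arithmetic, spelled out using the figures derived from the
trace:

bandwidth = 143_750                 # B/s, the lower bound derived above
request, response = 1_146, 1_854    # bytes on the wire in each direction
rtt, processing = 0.070, 0.030      # seconds

t = request / bandwidth + rtt + processing + response / bandwidth
print(round(t * 1000), "ms")        # ~121 ms, matching the ~120 ms above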

Effects Of Requesting A Single Object Per Transaction
-----------------------------------------------------

Because HTTP has no way to ask for multiple objects with a single
request, each object fetched requires its own transaction. With the
current protocol, this requires a new connection to be set up for each
one; however, even if the connection were kept open, each
request/response pair would still incur a separate round trip delay.

As an example, let's consider fetching ten documents, each the same
size as the one we've been considering. For HTTP/1.0, the total time
would be 5,300 ms. For a long-lived connection, the total transaction
time would be 1,200 ms.

Ignoring any possible savings in header size and processing time, if
we were to combine all the requests and responses into a single
message, the total transaction time would be 10 * (3,000/143,750) +
0.07 + (0.03 * 10), or 580 ms. If we assume that the optional headers
are only sent once, the total time falls to about 520 ms. In this
model, processing time is the dominant factor. If we further assume
that a proportion of the processing time is independent of the number
of objects requested, the total time can be as low as 250 ms (150 ms
transfer, 70 ms latency, 30 ms processing time).
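
Collecting the scenarios in one place, the following illustrative
arithmetic (using only figures derived earlier) reproduces the totals
quoted above:

bandwidth, rtt, processing = 143_750, 0.070, 0.030   # B/s, s, s
n, wire_bytes = 10, 3_000           # 1,146 request + 1,854 response bytes each

new_connection_each = n * 0.530     # ten measured HTTP/1.0 transactions
reused_connection   = n * 0.120     # one long-lived connection, one RTT per object
single_message      = n * wire_bytes / bandwidth + rtt + n * processing
print(new_connection_each, reused_connection, round(single_message, 2))
# 5.3 s, 1.2 s, and ~0.58 s respectively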

On a gigabit network, the transfer time becomes insignificant. On
such a network, the total time for the sequence of HTTP/1.0 requests
would be approximately 5,100 ms. If multiple objects per request were
allowed over an already open channel, the total time would be 100 ms.

Other Problems
--------------

One scalability problem caused by the single request per connection
paradigm occurs due to TCP's TIME_WAIT state. When a server closes a
TCP connection, it is required to keep information about that
connection for a period of time, in case a delayed packet turns up and
sabotages a new incarnation of the connection.

The recommended time to keep this information is 240 seconds. Because
of this, a server will leave some resources allocated for every single
connection closed in the past four minutes. On a busy server, this can
add up to thousands of control blocks.
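
A back-of-the-envelope sketch of the scale involved (the request rates
are assumed loads, not measurements; 240 seconds is twice the maximum
segment lifetime):

TIME_WAIT = 240                      # seconds
for rate in (10, 50, 200):           # connections closed per second
    print(rate, "per second ->", rate * TIME_WAIT, "control blocks held")
# at 50 transactions per second, 12,000 control blocks stay allocated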

Summary
-------

HTTP/1.0 interacts badly with TCP. It incurs frequent round-trip
delays due to connection establishment, performs slow start in both
directions for short-duration connections, and incurs heavy latency
penalties due to the mismatch between typical access profiles and the
single request per transaction model. HTTP/1.0 also requires busy
servers to dedicate resources to maintaining TIME_WAIT information for
large numbers of closed connections.

Bibliography
------------

[http]       "HTTP protocol specification," Tim Berners-Lee, CERN.
             http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html

[tcp]        "Transmission Control Protocol," J. Postel, RFC-793,
             September 1981.
             http://sunsite.unc.edu/pub/docs/rfc/rfc793.txt

[slow-start] "Congestion Avoidance and Control," V. Jacobson,
             ACM SIGCOMM-88, August 1988.
