What TCP is: A method of achieving the goals listed above. It describes the bytes to be sent back and forth, the timers to be set, the header formats, and so on.
What TCP is NOT: A set of system calls. A piece of software.
Both TCP and UDP use 16-bit port numbers, between 1 and 2^16 - 1 (port 0 is reserved). However, these port numbers do not conflict. If you open UDP port 2001, that does not affect TCP port 2001.
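One way to convince yourself of this is to bind a TCP socket and a UDP socket to the same port number on the same machine. The sketch below assumes a loopback interface at 127.0.0.1. The helper name is made up for illustration, and the retry loop only covers the unlucky case where some other program already holds the matching UDP port.

```python
import socket

def bind_tcp_and_udp_same_port(host="127.0.0.1", attempts=5):
    """Bind a TCP and a UDP socket to one port number; return both port numbers."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            tcp.bind((host, 0))               # port 0 asks the OS for any free TCP port
            port = tcp.getsockname()[1]
            try:
                udp.bind((host, port))        # same number, separate UDP port space
            except OSError:
                continue                      # that UDP port is taken; try another
            return port, udp.getsockname()[1]
        finally:
            tcp.close()
            udp.close()
    raise OSError("could not find a port free for both TCP and UDP")
```

Both binds succeed even though the numbers are identical, because the TCP and UDP port spaces are independent.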
A TCP connection is defined by four integers: (src IP, src port, dest IP, dest port). There can easily be more than one connection to the same port on the same machine. Telnet and web servers do this all the time. Each of these connections is treated independently.
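As a rough illustration of why several connections to one port do not collide, a host can be pictured as keeping a table keyed by the full four-tuple. The addresses, ports, and labels below are invented for the example, and the table is a teaching model rather than how any particular kernel stores its connections.

```python
from typing import Dict, NamedTuple

class ConnKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

def demultiplex(table: Dict[ConnKey, str], key: ConnKey) -> str:
    """Find the connection that an arriving segment belongs to."""
    return table.get(key, "no such connection")

# Three independent connections, all to port 80 on the same server.
connections = {
    ConnKey("10.0.0.5", 40001, "10.0.0.1", 80): "client A, first connection",
    ConnKey("10.0.0.5", 40002, "10.0.0.1", 80): "client A, second connection",
    ConnKey("10.0.0.9", 40001, "10.0.0.1", 80): "client B",
}
```

Changing any one of the four fields selects a different connection, even though the destination port is the same in every entry.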
When you do an 'accept()', you get back a new file descriptor that identifies one connection. Even if you later close that file descriptor, the listening socket remains available for additional connections.
Each connection has an active and a passive end. The active end is the client, which does the connect(); the passive end is the server, which does the accept().
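The sketch below plays both roles in one process to show the passive end accepting two clients in turn. It assumes a loopback interface at 127.0.0.1. The function name is made up, and the 5-second timeout is there only so the example cannot hang.

```python
import socket

def serve_two_clients(host="127.0.0.1"):
    """Accept two connections, one after the other, on one listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind((host, 0))          # let the OS pick a free port
    listener.listen(5)                # passive end: wait for connections
    addr = listener.getsockname()
    peers = []
    try:
        for _ in range(2):
            client = socket.create_connection(addr, timeout=5)  # active end: connect()
            conn, peer = listener.accept()   # a new descriptor for this one connection
            peers.append(peer)
            conn.close()                     # closes only this connection
            client.close()
    finally:
        listener.close()                     # only now does the port stop accepting
    return addr, peers
```

Closing `conn` after the first client does not stop the second accept() from succeeding, because the listening socket is a separate descriptor.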
The wrong way to achieve reliability: One can achieve reliability by sending a positive ack for every packet and waiting for it before sending the next. If a packet does not get an ack before a timeout period, the packet is resent. However, for systems with large bandwidth and high latency, this system wastes HUGE amounts of time, because the sender sits idle for a full round trip after every packet.
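To put a number on the waste, here is a back-of-the-envelope calculation. It assumes acks are negligibly small and nothing is lost; the function name and the sample figures are illustrative.

```python
def stop_and_wait_throughput(segment_bytes, bandwidth_bytes_per_s, one_way_latency_s):
    """Bytes per second when each segment must be acked before the next is sent."""
    transmit_time = segment_bytes / bandwidth_bytes_per_s   # time to put it on the wire
    round_trip = 2 * one_way_latency_s                      # data out, ack back
    return segment_bytes / (transmit_time + round_trip)

# 1000-byte segments, a 10 Mbit/s link (1,250,000 bytes/s), 50 ms each way.
rate = stop_and_wait_throughput(1000, 1_250_000, 0.050)
utilization = rate / 1_250_000
```

With these figures the sender moves roughly 9,900 bytes per second, which is under 1% of what the link could carry.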
Sliding Window: The idea of sliding window is that each side (sender and receiver) has an idea of the current bounds. The lower bound is the last piece of data ack'ed by the receiver. The upper bound is the lower bound plus the size of the receiver's buffer. The sender is allowed and encouraged to send any and all data within this window.
Correct Window Size: In an ideal world, the sender should be able to keep the network always busy sending data. If the one-way latency is L, and the bandwidth is B, this requires a buffer (and a window) of 2 * L * B bytes. A little extra for slack would be nice. Too much extra can cause the sender to keep transmitting (and therefore waste resources) if the receiver goes down. Later we'll learn that an overly large window can cause congestion problems. Although the TCP window size field is only 16 bits, an option allows it to be scaled by left-shifting up to 14 places, allowing for huge windows.
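The arithmetic can be sketched as follows, taking L in seconds and B in bytes per second. The function names and sample figures are made up for the example, and the second function only picks the smallest shift that makes the window fit in the 16-bit field.

```python
def ideal_window_bytes(one_way_latency_s, bandwidth_bytes_per_s):
    """Window needed to keep the path full: 2 * L * B."""
    return round(2 * one_way_latency_s * bandwidth_bytes_per_s)

def window_scale_shift(window_bytes, max_shift=14):
    """Smallest left shift that lets the window fit in the 16-bit header field."""
    shift = 0
    while (window_bytes >> shift) > 0xFFFF and shift < max_shift:
        shift += 1
    return shift

# A 1 Gbit/s path (125,000,000 bytes/s) with 50 ms latency each way.
window = ideal_window_bytes(0.050, 125_000_000)
shift = window_scale_shift(window)
largest_window = 0xFFFF << 14       # biggest window the scaling option can express
```

For this path the window comes to 12,500,000 bytes, far more than 16 bits can hold, so a shift of 8 is needed. The largest expressible window is a little under 1 GB.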
The sender side algorithm: Start by sending a windowful of data. Then start listening for acks. If any ack comes in, move the window forward. If this allows new data to be sent, send it. If this is the last bit of data, set the FIN bit in the header. If an ack for the last datum arrives, the data has been successfully sent. If a NAK is received for a segment, retransmit that segment.
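A minimal sketch of the sender's bookkeeping, counting in bytes. It leaves out timers, retransmission, and NAK handling, and assumes acks are cumulative (an ack of N means every byte before N has arrived). The class and its names are illustrative only.

```python
class WindowSender:
    def __init__(self, data, window, segment=4):
        self.data = data
        self.window = window      # receiver's buffer size, in bytes
        self.segment = segment    # largest payload per segment
        self.base = 0             # lower bound: first byte not yet acked
        self.next = 0             # next byte that has never been sent

    def sendable(self):
        """Return the (offset, payload, fin) segments the window allows right now."""
        out = []
        limit = min(self.base + self.window, len(self.data))   # upper bound
        while self.next < limit:
            end = min(self.next + self.segment, limit)
            fin = end == len(self.data)       # last data: set the FIN bit
            out.append((self.next, self.data[self.next:end], fin))
            self.next = end
        return out

    def on_ack(self, ack):
        """Slide the window forward and return any newly sendable segments."""
        if ack > self.base:
            self.base = ack
        return self.sendable()

    def done(self):
        return self.base == len(self.data)
```

With 12 bytes of data and an 8-byte window, the first call to sendable() yields two segments. A single ack for byte 8 then opens the window for the final segment, even if an earlier ack for byte 4 was lost along the way.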
The receiver's algorithm: For every packet received, send an ack. This ack should contain the new window size you want the sender to use. If the application requests the data, give the data to the application. When the last packet arrives (the FIN bit will be set) you've received all the data. When the application has received the last of the data, release the buffer.
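The receiver side can be sketched the same way. This version assumes cumulative acks and simply holds out-of-order segments until the gap before them is filled. It ignores segments that only partly overlap data already received, and all names are illustrative.

```python
class WindowReceiver:
    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.expected = 0           # offset of the next in-order byte
        self.pending = {}           # out-of-order segments, keyed by offset
        self.ready = bytearray()    # in-order data the application has not read yet
        self.finished = False       # True once the FIN segment has been delivered

    def on_segment(self, offset, payload, fin=False):
        """Store a segment and return (ack, advertised_window) for the reply."""
        if offset >= self.expected:
            self.pending[offset] = (payload, fin)
        # Move every contiguous segment from pending into the in-order buffer.
        while self.expected in self.pending:
            data, last = self.pending.pop(self.expected)
            self.ready += data
            self.expected += len(data)
            self.finished = self.finished or last
        window = max(0, self.buffer_size - len(self.ready))
        return self.expected, window

    def read(self):
        """Hand all in-order data to the application and free the buffer space."""
        data = bytes(self.ready)
        self.ready.clear()
        return data
```

Every call returns an ack together with the window the receiver is now willing to accept. The window shrinks while data waits to be read and reopens after read().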
The receiver and fragments: If because of fragments a hole appears in the data received (i.e. you have data looking like YYYYYYnnYYYYYYY), you might transmit a NAK for the missing data. If not, the sender must retransmit some data that was correctly received.
Lost Acks: Note that a lost ack doesn't necessarily mean a retransmission. If a second ack arrives acknowledging more data, and it arrives within the original timeout period, then no retransmission need occur.
Why: TCP is built on top of IP, which can suffer long and varying delays due to rerouting or changes in network load. But TCP needs to set a timeout value for each segment sent. If the timeout value is too large, latency will increase if segments are lost. If the timeout is too small, bandwidth will be wasted on unnecessary retransmissions during increases in packet delays.
HOW: Therefore, TCP must make estimates of both the current delay and the variation in delay. The computation looks like this:
DIFFERENCE = SAMPLE - OLD_RTT
NEW_RTT = OLD_RTT + delta * DIFFERENCE
DEVIATION = OLD_DEVIATION + alpha * (abs(DIFFERENCE) - OLD_DEVIATION)
TIMEOUT = NEW_RTT + n * DEVIATION
Parameters:
alpha -- How much each sample affects the current estimate of deviation. (Suggested value 1/4)
delta -- How much each sample affects the current estimate of the round trip time. (Suggested value 1/8)
n -- How "loose" a timeout to set. (Suggested value 2...4)
Alpha and delta should be between 0 (new data has no effect) and 1.0 (old data has no effect).
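The computation translates almost line for line into code. The sketch below uses the suggested parameter values with n = 4. The class name, the initial deviation of 0, and the sample figures are assumptions made for the example.

```python
class RttEstimator:
    def __init__(self, first_rtt, delta=1/8, alpha=1/4, n=4):
        self.rtt = first_rtt       # current round trip estimate, in seconds
        self.deviation = 0.0       # current estimate of the variation
        self.delta = delta
        self.alpha = alpha
        self.n = n

    def update(self, sample):
        """Fold one measured round trip into the estimates; return the new timeout."""
        difference = sample - self.rtt
        self.rtt += self.delta * difference
        self.deviation += self.alpha * (abs(difference) - self.deviation)
        return self.timeout()

    def timeout(self):
        return self.rtt + self.n * self.deviation

est = RttEstimator(first_rtt=0.100)
new_timeout = est.update(0.180)    # one slow sample of 180 ms
```

After the single 180 ms sample the round trip estimate moves only from 100 ms to 110 ms, but the deviation term pushes the timeout out to 190 ms.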
What about Retransmissions? Retransmissions make the delay between sending a packet and receiving an ack difficult to compute. It's impossible to tell if the ack corresponds to the original data or to the retransmission. Therefore, retransmitted segments are not used as samples in the above computations. Instead, the current timeout is doubled for each timeout taken. See page 212 (Karn's Algorithm) for details.
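The two rules (skip ambiguous samples, back off the timer) can be sketched as below. The class name and the 64-second cap on the timeout are assumptions for the example, and the accepted samples are only collected in a list rather than fed into an estimator.

```python
class KarnTimer:
    def __init__(self, timeout, max_timeout=64.0):
        self.timeout = timeout            # current retransmission timeout, in seconds
        self.max_timeout = max_timeout    # assumed upper limit on the backoff
        self.samples = []                 # round trip samples considered trustworthy

    def on_timeout(self):
        """The timer expired: retransmit and double the timeout."""
        self.timeout = min(self.timeout * 2, self.max_timeout)
        return self.timeout

    def on_ack(self, rtt_sample, was_retransmitted):
        """Keep the sample only if the segment was sent exactly once."""
        if was_retransmitted:
            return False                  # ambiguous: which transmission was acked?
        self.samples.append(rtt_sample)
        return True
```

Two timeouts in a row take a 1-second timer to 4 seconds, and an ack for the retransmitted segment leaves the sample list untouched.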
One interesting variant on the standard sliding window used by TCP is that the ack for every packet contains both the data to be acked and the current size of the sliding window. If the sender is sending too fast, the receiver can slow it down by reducing the size of the sliding window. Therefore, a slow receiver can stop a fast sender from overwhelming it.
Why: Suppose the path between the two hosts becomes congested with traffic. This will delay traffic so much that the hosts will think data has been lost. They will respond with retransmissions, which make the problem worse. Therefore, every connection has a 'congestion window' which also controls the data sent. A sender cannot send data unless allowed by both the receiver's window (for flow control) and the congestion window (for congestion control).
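In other words, the effective window is the smaller of the two. A one-line sketch, with a made-up function name and all quantities counted in bytes:

```python
def bytes_allowed(receiver_window, congestion_window, bytes_in_flight):
    """How much new data the sender may put on the wire right now."""
    effective_window = min(receiver_window, congestion_window)
    return max(0, effective_window - bytes_in_flight)
```

With a 65,535-byte receiver window, a 4,096-byte congestion window, and 1,024 bytes already unacked, the sender may send 3,072 more bytes. Here the congestion window is the limit.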
How to detect congestion: There is no congestion detection built into IP. Therefore, TCP makes the conservative assumption that ALL lost packets are caused by congestion.
Setting the Congestion Window Size: Start the congestion window at 1 segment. If a packet is lost, reduce the congestion window by 1/2. For every successful transmission without timeout,
- if the congestion window is less than 1/2 its maximum size, increase the congestion window by 1. This is called slow start, and in fact is quite fast.
- else increase it by 1 only if all segments within a whole window arrive successfully.
Note that the window goes down by a factor, and it goes up by a constant. Having it go both up and down by a factor can result in an oscillating system.
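These rules can be traced with a small counter, measured in segments. The sketch takes the rules exactly as stated in these notes. The `max_window` parameter, the class name, and the floor of 1 segment after a loss are assumptions for the example.

```python
class CongestionWindow:
    def __init__(self, max_window):
        self.max_window = max_window   # assumed fixed maximum, in segments
        self.cwnd = 1                  # start at 1 segment
        self.acked_in_window = 0       # successes counted toward the next increase

    def on_ack(self):
        """One segment was delivered without a timeout."""
        if self.cwnd < self.max_window / 2:
            self.cwnd += 1                         # slow start: grow on every success
        else:
            self.acked_in_window += 1
            if self.acked_in_window >= self.cwnd:  # a whole window got through
                self.cwnd = min(self.cwnd + 1, self.max_window)
                self.acked_in_window = 0
        return self.cwnd

    def on_loss(self):
        """A packet was lost: cut the window in half."""
        self.cwnd = max(1, self.cwnd // 2)
        self.acked_in_window = 0
        return self.cwnd
```

With a maximum of 16, seven successes take the window from 1 to 8. After that it takes a full window of 8 successes to reach 9, and a single loss drops it back to 4.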
A sender can mark some data as urgent. It does so by setting the urgent bit and urgent pointer in the header. A sender can send urgent data regardless of the state of the sliding window. Any host that receives urgent data should tell the application immediately, and offer this data even if preceding data has not yet been received. One example of urgent data is the ^C in telnet.
TCP does not preserve data boundaries. In other words, if you send a 512-byte message four times, you might receive one message of 2048 bytes, one of 1024 and two of 512, or any other combination of sizes that adds up to 2048.
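An application that needs message boundaries therefore has to put them into the byte stream itself. One common approach, shown here only as a sketch and not as anything TCP provides, is to prefix each message with its length. The 4-byte header and all names below are assumptions for the example.

```python
import struct

HEADER = struct.Struct("!I")     # 4-byte unsigned length, network byte order

def frame(message):
    """Prefix a message with its length so the reader can find where it ends."""
    return HEADER.pack(len(message)) + message

class MessageReader:
    def __init__(self):
        self.buffer = bytearray()

    def feed(self, chunk):
        """Add whatever bytes arrived; return every message that is now complete."""
        self.buffer += chunk
        messages = []
        while len(self.buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self.buffer)
            end = HEADER.size + length
            if len(self.buffer) < end:
                break                     # the rest of this message has not arrived
            messages.append(bytes(self.buffer[HEADER.size:end]))
            del self.buffer[:end]
        return messages
```

However the stream is chopped up on arrival, feed() hands back the original messages whole and in order.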