diff --git a/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.xml b/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.xml index ff8f4f4eee..6173905a07 100644 --- a/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.xml +++ b/en_US.ISO8859-1/books/developers-handbook/sockets/chapter.xml @@ -4,29 +4,37 @@ $FreeBSD$ --> -<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" xml:id="sockets"> - <info><title>Sockets</title> +<chapter xmlns="http://docbook.org/ns/docbook" + xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" + xml:id="sockets"> + <info> + <title>Sockets</title> + <authorgroup> - <author><personname><firstname>G. Adam</firstname><surname>Stanislav</surname></personname><contrib>Contributed by </contrib></author> + <author> + <personname> + <firstname>G. Adam</firstname> + <surname>Stanislav</surname> + </personname> + <contrib>Contributed by </contrib> + </author> </authorgroup> </info> - - <sect1 xml:id="sockets-synopsis"> <title>Synopsis</title> <para><acronym>BSD</acronym> sockets take interprocess - communications to a new level. It is no longer necessary for the - communicating processes to run on the same machine. They still - <emphasis>can</emphasis>, but they do not have to.</para> + communications to a new level. It is no longer necessary for + the communicating processes to run on the same machine. They + still <emphasis>can</emphasis>, but they do not have to.</para> <para>Not only do these processes not have to run on the same machine, they do not have to run under the same operating - system. Thanks to <acronym>BSD</acronym> sockets, your FreeBSD + system. 
Thanks to <acronym>BSD</acronym> sockets, your FreeBSD software can smoothly cooperate with a program running on a - &macintosh;, another one running on a &sun; workstation, yet another - one running under &windows; 2000, all connected with an + &macintosh;, another one running on a &sun; workstation, yet + another one running under &windows; 2000, all connected with an Ethernet-based local area network.</para> <para>But your software can equally well cooperate with processes @@ -44,40 +52,40 @@ <title>Networking and Diversity</title> <para>We have already hinted on the <emphasis>diversity</emphasis> - of networking. Many different systems have to talk to each - other. And they have to speak the same language. They also have - to <emphasis>understand</emphasis> the same language the same - way.</para> + of networking. Many different systems have to talk to each + other. And they have to speak the same language. They also + have to <emphasis>understand</emphasis> the same language the + same way.</para> <para>People often think that <emphasis>body language</emphasis> - is universal. But it is not. Back in my early teens, my father - took me to Bulgaria. We were sitting at a table in a park in + is universal. But it is not. Back in my early teens, my father + took me to Bulgaria. We were sitting at a table in a park in Sofia, when a vendor approached us trying to sell us some roasted almonds.</para> <para>I had not learned much Bulgarian by then, so, instead of saying no, I shook my head from side to side, the <quote>universal</quote> body language for - <emphasis>no</emphasis>. The vendor quickly started serving us + <emphasis>no</emphasis>. The vendor quickly started serving us some almonds.</para> <para>I then remembered I had been told that in Bulgaria shaking - your head sideways meant <emphasis>yes</emphasis>. Quickly, I - started nodding my head up and down. The vendor noticed, took - his almonds, and walked away. 
To an uninformed observer, I did + your head sideways meant <emphasis>yes</emphasis>. Quickly, I + started nodding my head up and down. The vendor noticed, took + his almonds, and walked away. To an uninformed observer, I did not change the body language: I continued using the language of - shaking and nodding my head. What changed was the - <emphasis>meaning</emphasis> of the body language. At first, the - vendor and I interpreted the same language as having completely - different meaning. I had to adjust my own interpretation of that - language so the vendor would understand.</para> + shaking and nodding my head. What changed was the + <emphasis>meaning</emphasis> of the body language. At first, + the vendor and I interpreted the same language as having + completely different meaning. I had to adjust my own + interpretation of that language so the vendor would + understand.</para> <para>It is the same with computers: The same symbols may have - different, even outright opposite meaning. Therefore, for - two computers to understand each other, they must not only - agree on the same <emphasis>language</emphasis>, but on the - same <emphasis>interpretation</emphasis> of the language. - </para> + different, even outright opposite meaning. Therefore, for two + computers to understand each other, they must not only agree on + the same <emphasis>language</emphasis>, but on the same + <emphasis>interpretation</emphasis> of the language.</para> </sect1> <sect1 xml:id="sockets-protocols"> @@ -86,7 +94,7 @@ <para>While various programming languages tend to have complex syntax and use a number of multi-letter reserved words (which makes them easy for the human programmer to understand), the - languages of data communications tend to be very terse. Instead + languages of data communications tend to be very terse. Instead of multi-byte words, they often use individual <emphasis>bits</emphasis>. 
There is a very convincing reason for it: While data travels <emphasis>inside</emphasis> your @@ -98,7 +106,7 @@ <emphasis>protocols</emphasis> rather than languages.</para> <para>As data travels from one computer to another, it always uses - more than one protocol. These protocols are + more than one protocol. These protocols are <emphasis>layered</emphasis>. The data can be compared to the inside of an onion: You have to peel off several layers of <quote>skin</quote> to get to the data. This is best @@ -106,11 +114,11 @@ <mediaobject> <imageobject> - <imagedata fileref="sockets/layers"/> + <imagedata fileref="sockets/layers"/> </imageobject> <textobject> - <literallayout class="monospaced">+----------------+ + <literallayout class="monospaced">+----------------+ | Ethernet | |+--------------+| || IP || @@ -131,7 +139,7 @@ </textobject> <textobject> - <phrase>Protocol Layers</phrase> + <phrase>Protocol Layers</phrase> </textobject> </mediaobject> @@ -153,7 +161,7 @@ <para>I think you get the picture...</para> <para>To inform our software how to handle the raw data, it is - encoded as a <acronym>PNG</acronym> file. It could be a + encoded as a <acronym>PNG</acronym> file. It could be a <acronym>GIF</acronym>, or a <acronym>JPEG</acronym>, but it is a <acronym>PNG</acronym>.</para> @@ -161,13 +169,13 @@ <para>At this point, I can hear some of you yelling, <emphasis><quote>No, it is not! It is a file - format!</quote></emphasis></para> + format!</quote></emphasis></para> - <para>Well, of course it is a file format. But from the + <para>Well, of course it is a file format. But from the perspective of data communications, a file format is a protocol: The file structure is a <emphasis>language</emphasis>, a terse one at that, communicating to our <emphasis>process</emphasis> - how the data is organized. Ergo, it is a + how the data is organized. 
Ergo, it is a <emphasis>protocol</emphasis>.</para> <para>Alas, if all we received was the <acronym>PNG</acronym> @@ -179,63 +187,63 @@ <acronym>JPEG</acronym>, or some other image format?</para> <para>To obtain that information, we are using another protocol: - <acronym>HTTP</acronym>. This protocol can tell us exactly that + <acronym>HTTP</acronym>. This protocol can tell us exactly that the data represents an image, and that it uses the - <acronym>PNG</acronym> protocol. It can also tell us some other - things, but let us stay focused on protocol layers here. - </para> + <acronym>PNG</acronym> protocol. It can also tell us some other + things, but let us stay focused on protocol layers here.</para> - <para>So, now we have some data wrapped in the <acronym>PNG</acronym> - protocol, wrapped in the <acronym>HTTP</acronym> protocol. - How did we get it from the server?</para> + <para>So, now we have some data wrapped in the + <acronym>PNG</acronym> protocol, wrapped in the + <acronym>HTTP</acronym> protocol. How did we get it from the + server?</para> <para>By using <acronym>TCP/IP</acronym> over Ethernet, that is - how. Indeed, that is three more protocols. Instead of + how. Indeed, that is three more protocols. Instead of continuing inside out, I am now going to talk about Ethernet, simply because it is easier to explain the rest that way.</para> <para>Ethernet is an interesting system of connecting computers in a <emphasis>local area network</emphasis> (<acronym>LAN</acronym>). Each computer has a <emphasis>network - interface card</emphasis> (<acronym>NIC</acronym>), which has a - unique 48-bit <acronym>ID</acronym> called its - <emphasis>address</emphasis>. No two Ethernet - <acronym>NIC</acronym>s in the world have the same address. - </para> + interface card</emphasis> (<acronym>NIC</acronym>), which has + a unique 48-bit <acronym>ID</acronym> called its + <emphasis>address</emphasis>. 
No two Ethernet + <acronym>NIC</acronym>s in the world have the same + address.</para> <para>These <acronym>NIC</acronym>s are all connected with each - other. Whenever one computer wants to communicate with another + other. Whenever one computer wants to communicate with another in the same Ethernet <acronym>LAN</acronym>, it sends a message - over the network. Every <acronym>NIC</acronym> sees the - message. But as part of the Ethernet + over the network. Every <acronym>NIC</acronym> sees the + message. But as part of the Ethernet <emphasis>protocol</emphasis>, the data contains the address of - the destination <acronym>NIC</acronym> (among other things). So, - only one of all the network interface cards will pay attention - to it, the rest will ignore it.</para> + the destination <acronym>NIC</acronym> (among other things). + So, only one of all the network interface cards will pay + attention to it, the rest will ignore it.</para> - <para>But not all computers are connected to the same - network. Just because we have received the data over our - Ethernet does not mean it originated in our own local area - network. It could have come to us from some other network (which - may not even be Ethernet based) connected with our own network - via the Internet.</para> + <para>But not all computers are connected to the same network. + Just because we have received the data over our Ethernet does + not mean it originated in our own local area network. It could + have come to us from some other network (which may not even be + Ethernet based) connected with our own network via the + Internet.</para> <para>All data is transferred over the Internet using <acronym>IP</acronym>, which stands for <emphasis>Internet - Protocol</emphasis>. Its basic role is to let us know where in - the world the data has arrived from, and where it is supposed to - go to. It does not <emphasis>guarantee</emphasis> we will + Protocol</emphasis>. 
Its basic role is to let us know where + in the world the data has arrived from, and where it is supposed + to go to. It does not <emphasis>guarantee</emphasis> we will receive the data, only that we will know where it came from <emphasis>if</emphasis> we do receive it.</para> <para>Even if we do receive the data, <acronym>IP</acronym> does not guarantee we will receive various chunks of data in the same - order the other computer has sent it to us. So, we can receive + order the other computer has sent it to us. So, we can receive the center of our image before we receive the upper left corner and after the lower right, for example.</para> <para>It is <acronym>TCP</acronym> (<emphasis>Transmission Control - Protocol</emphasis>) that asks the sender to resend any lost + Protocol</emphasis>) that asks the sender to resend any lost data and that places it all into the proper order.</para> <para>All in all, it took <emphasis>five</emphasis> different @@ -248,7 +256,7 @@ <acronym>Ethernet</acronym> protocol.</para> <para>Oh, and by the way, there probably were several other - protocols involved somewhere on the way. For example, if our + protocols involved somewhere on the way. For example, if our <acronym>LAN</acronym> was connected to the Internet through a dial-up call, it used the <acronym>PPP</acronym> protocol over the modem which used one (or several) of the various modem @@ -256,38 +264,38 @@ <para>As a developer you should be asking by now, <emphasis><quote>How am I supposed to handle it - all?</quote></emphasis></para> + all?</quote></emphasis></para> <para>Luckily for you, you are <emphasis>not</emphasis> supposed - to handle it all. You <emphasis>are</emphasis> supposed to - handle some of it, but not all of it. Specifically, you need not - worry about the physical connection (in our case Ethernet and - possibly <acronym>PPP</acronym>, etc). Nor do you need to handle - the Internet Protocol, or the Transmission Control + to handle it all. 
You <emphasis>are</emphasis> supposed to + handle some of it, but not all of it. Specifically, you need + not worry about the physical connection (in our case Ethernet + and possibly <acronym>PPP</acronym>, etc). Nor do you need to + handle the Internet Protocol, or the Transmission Control Protocol.</para> <para>In other words, you do not have to do anything to receive - the data from the other computer. Well, you do have to + the data from the other computer. Well, you do have to <emphasis>ask</emphasis> for it, but that is almost as simple as opening a file.</para> <para>Once you have received the data, it is up to you to figure - out what to do with it. In our case, you would need to + out what to do with it. In our case, you would need to understand the <acronym>HTTP</acronym> protocol and the <acronym>PNG</acronym> file structure.</para> <para>To use an analogy, all the internetworking protocols become a gray area: Not so much because we do not understand how it - works, but because we are no longer concerned about it. The + works, but because we are no longer concerned about it. The sockets interface takes care of this gray area for us:</para> <mediaobject> <imageobject> - <imagedata fileref="sockets/slayers"/> + <imagedata fileref="sockets/slayers"/> </imageobject> <textobject> - <literallayout class="monospaced">+----------------+ + <literallayout class="monospaced">+----------------+ |xxxxEthernetxxxx| |+--------------+| ||xxxxxxIPxxxxxx|| @@ -308,7 +316,7 @@ </textobject> <textobject> - <phrase>Sockets Covered Protocol Layers</phrase> + <phrase>Sockets Covered Protocol Layers</phrase> </textobject> </mediaobject> @@ -325,14 +333,13 @@ <para><acronym>BSD</acronym> sockets are built on the basic &unix; model: <emphasis>Everything is a file.</emphasis> In our example, then, sockets would let us receive an <emphasis>HTTP - file</emphasis>, so to speak. It would then be up to us to + file</emphasis>, so to speak. 
It would then be up to us to extract the <emphasis><acronym>PNG</acronym> file</emphasis> - from it. - </para> + from it.</para> <para>Because of the complexity of internetworking, we cannot just use the <function role="syscall">open</function> system call, or - the <function>open()</function> C function. Instead, we need to + the <function>open()</function> C function. Instead, we need to take several steps to <quote>opening</quote> a socket.</para> <para>Once we do, however, we can start treating the @@ -356,32 +363,30 @@ <title>The Client-Server Difference</title> <para>Typically, one of the ends of a socket-based data - communication is a <emphasis>server</emphasis>, the other is a - <emphasis>client</emphasis>.</para> + communication is a <emphasis>server</emphasis>, the other is a + <emphasis>client</emphasis>.</para> <sect3 xml:id="sockets-common-elements"> - <title>The Common Elements</title> + <title>The Common Elements</title> - <sect4 xml:id="sockets-socket"> + <sect4 xml:id="sockets-socket"> <title><function>socket</function></title> <para>The one function used by both, clients and servers, is - &man.socket.2;. It is declared this way:</para> + &man.socket.2;. It is declared this way:</para> -<programlisting> -int socket(int domain, int type, int protocol); -</programlisting> + <programlisting>int socket(int domain, int type, int protocol);</programlisting> - <para>The return value is of the same type as that of - <function>open</function>, an integer. FreeBSD allocates + <para>The return value is of the same type as that of + <function>open</function>, an integer. FreeBSD allocates its value from the same pool as that of file handles. That is what allows sockets to be treated the same way as files.</para> <para>The <varname>domain</varname> argument tells the system what <emphasis>protocol family</emphasis> you want - it to use. Many of them exist, some are vendor specific, - others are very common. They are declared in + it to use. 
Many of them exist, some are vendor specific, + others are very common. They are declared in <filename>sys/socket.h</filename>.</para> <para>Use <constant>PF_INET</constant> for @@ -422,25 +427,24 @@ int socket(int domain, int type, int protocol); <emphasis>unconnected</emphasis>.</para> <para>This is on purpose: To use a telephone analogy, we - have just attached a modem to the phone line. We have + have just attached a modem to the phone line. We have neither told the modem to make a call, nor to answer if the phone rings.</para> </note> </sect4> - <sect4 xml:id="sockets-sockaddr"> + <sect4 xml:id="sockets-sockaddr"> <title><varname>sockaddr</varname></title> <para>Various functions of the sockets family expect the - address of (or pointer to, to use C terminology) a small - area of the memory. The various C declarations in the - <filename>sys/socket.h</filename> refer to it as - <varname>struct sockaddr</varname>. This structure is - declared in the same file:</para> + address of (or pointer to, to use C terminology) a small + area of the memory. The various C declarations in the + <filename>sys/socket.h</filename> refer to it as + <varname>struct sockaddr</varname>. This structure is + declared in the same file:</para> -<programlisting> -/* + <programlisting>/* * Structure used by kernel to store most * addresses. 
*/ @@ -449,17 +453,16 @@ struct sockaddr { sa_family_t sa_family; /* address family */ char sa_data[14]; /* actually longer; address value */ }; -#define SOCK_MAXADDRLEN 255 /* longest possible addresses */ -</programlisting> +#define SOCK_MAXADDRLEN 255 /* longest possible addresses */</programlisting> - <para>Please note the <emphasis>vagueness</emphasis> with + <para>Please note the <emphasis>vagueness</emphasis> with which the <varname>sa_data</varname> field is declared, just as an array of <constant>14</constant> bytes, with the comment hinting there can be more than <constant>14</constant> of them.</para> - <para>This vagueness is quite deliberate. Sockets is a very - powerful interface. While most people perhaps think of it + <para>This vagueness is quite deliberate. Sockets is a very + powerful interface. While most people perhaps think of it as nothing more than the Internet interface—and most applications probably use it for that nowadays—sockets can be used for just about @@ -473,8 +476,7 @@ struct sockaddr { right before the definition of <varname>sockaddr</varname>:</para> -<programlisting> -/* + <programlisting>/* * Address families. */ #define AF_UNSPEC 0 /* unspecified */ @@ -519,11 +521,9 @@ struct sockaddr { #define AF_SCLUSTER 34 /* Sitara cluster protocol */ #define AF_ARP 35 #define AF_BLUETOOTH 36 /* Bluetooth sockets */ -#define AF_MAX 37 +#define AF_MAX 37</programlisting> -</programlisting> - - <para>The one used for <acronym>IP</acronym> is + <para>The one used for <acronym>IP</acronym> is <symbol>AF_INET</symbol>. 
It is a symbol for the constant <constant>2</constant>.</para> @@ -534,13 +534,12 @@ struct sockaddr { used.</para> <para>Specifically, whenever the <emphasis>address - family</emphasis> is <symbol>AF_INET</symbol>, we can use - <varname>struct sockaddr_in</varname> found in + family</emphasis> is <symbol>AF_INET</symbol>, we can + use <varname>struct sockaddr_in</varname> found in <filename>netinet/in.h</filename>, wherever <varname>sockaddr</varname> is expected:</para> -<programlisting> -/* + <programlisting>/* * Socket address, internet style. */ struct sockaddr_in { @@ -549,18 +548,17 @@ struct sockaddr_in { in_port_t sin_port; struct in_addr sin_addr; char sin_zero[8]; -}; -</programlisting> +};</programlisting> - <para>We can visualize its organization this way:</para> + <para>We can visualize its organization this way:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sain"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sain"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+-----------------+ 0 | 0 | Family | Port | +--------+--------+-----------------+ @@ -570,39 +568,41 @@ struct sockaddr_in { +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>sockaddr_in</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>sockaddr_in</phrase> + </textobject> + </mediaobject> - <para>The three important fields are + <para>The three important fields are <varname>sin_family</varname>, which is byte 1 of the structure, <varname>sin_port</varname>, a 16-bit value found in bytes 2 and 3, and <varname>sin_addr</varname>, a 32-bit integer representation of the <acronym>IP</acronym> address, stored in bytes 4-7.</para> - <para>Now, let us try to fill it out. 
Let us assume we are + <para>Now, let us try to fill it out. Let us assume we are trying to write a client for the <emphasis>daytime</emphasis> protocol, which simply states that its server will write a text string representing the - current date and time to port 13. We want to use + current date and time to port 13. We want to use <acronym>TCP/IP</acronym>, so we need to specify - <constant>AF_INET</constant> in the address family - field. <constant>AF_INET</constant> is defined as - <constant>2</constant>. Let us use the - <acronym>IP</acronym> address of <systemitem class="ipaddress">192.43.244.18</systemitem>, which is the time - server of US federal government (<systemitem class="fqdomainname">time.nist.gov</systemitem>).</para> + <constant>AF_INET</constant> in the address family field. + <constant>AF_INET</constant> is defined as + <constant>2</constant>. Let us use the + <acronym>IP</acronym> address of <systemitem + class="ipaddress">192.43.244.18</systemitem>, which is + the time server of US federal government (<systemitem + class="fqdomainname">time.nist.gov</systemitem>).</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sainfill"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sainfill"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+-----------------+ 0 | 0 | 2 | 13 | +-----------------+-----------------+ @@ -612,59 +612,58 @@ struct sockaddr_in { +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>Specific example of sockaddr_in</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>Specific example of sockaddr_in</phrase> + </textobject> + </mediaobject> - <para>By the way the <varname>sin_addr</varname> field is - declared as being of the <varname>struct in_addr</varname> 
- type, which is defined in - <filename>netinet/in.h</filename>:</para> + <para>By the way the <varname>sin_addr</varname> field is + declared as being of the <varname>struct in_addr</varname> + type, which is defined in + <filename>netinet/in.h</filename>:</para> -<programlisting> -/* + <programlisting>/* * Internet address (a structure for historical reasons) */ struct in_addr { in_addr_t s_addr; -}; -</programlisting> +};</programlisting> - <para>In addition, <varname>in_addr_t</varname> is a 32-bit - integer.</para> + <para>In addition, <varname>in_addr_t</varname> is a 32-bit + integer.</para> - <para>The <systemitem class="ipaddress">192.43.244.18</systemitem> is - just a convenient notation of expressing a 32-bit integer - by listing all of its 8-bit bytes, starting with the + <para>The <systemitem + class="ipaddress">192.43.244.18</systemitem> is just a + convenient notation of expressing a 32-bit integer by + listing all of its 8-bit bytes, starting with the <emphasis>most significant</emphasis> one.</para> - <para>So far, we have viewed <varname>sockaddr</varname> as + <para>So far, we have viewed <varname>sockaddr</varname> as an abstraction. Our computer does not store <varname>short</varname> integers as a single 16-bit - entity, but as a sequence of 2 bytes. Similarly, it stores - 32-bit integers as a sequence of 4 bytes.</para> + entity, but as a sequence of 2 bytes. Similarly, it + stores 32-bit integers as a sequence of 4 bytes.</para> - <para>Suppose we coded something like this:</para> + <para>Suppose we coded something like this:</para> <programlisting>sa.sin_family = AF_INET; sa.sin_port = 13; sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | 18;</programlisting> - <para>What would the result look like?</para> + <para>What would the result look like?</para> - <para>Well, that depends, of course. On a &pentium;, or other - x86, based computer, it would look like this:</para> + <para>Well, that depends, of course. 
On a &pentium;, or + other x86, based computer, it would look like this:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sainlsb"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sainlsb"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 13 | 0 | +--------+--------+--------+--------+ @@ -674,23 +673,22 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>sockaddr_in on an Intel system</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>sockaddr_in on an Intel system</phrase> + </textobject> + </mediaobject> - <para>On a different system, it might look like this: - </para> + <para>On a different system, it might look like this:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sainmsb"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sainmsb"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 0 | 13 | +--------+--------+--------+--------+ @@ -700,21 +698,21 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>sockaddr_in on an MSB system</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>sockaddr_in on an MSB system</phrase> + </textobject> + </mediaobject> - <para>And on a PDP it might look different yet. But the + <para>And on a PDP it might look different yet. 
But the above two are the most common ways in use today.</para> <para>Ordinarily, wanting to write portable code, - programmers pretend that these differences do not - exist. And they get away with it (except when they code in - assembly language). Alas, you cannot get away with it that - easily when coding for sockets.</para> + programmers pretend that these differences do not exist. + And they get away with it (except when they code in + assembly language). Alas, you cannot get away with it + that easily when coding for sockets.</para> <para>Why?</para> @@ -725,37 +723,38 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | (<acronym>LSB</acronym>) first.</para> <para>You might be wondering, <emphasis><quote>So, will - sockets not handle it for me?</quote></emphasis></para> + sockets not handle it for + me?</quote></emphasis></para> <para>It will not.</para> <para>While that answer may surprise you at first, remember - that the general sockets interface only understands the - <varname>sa_len</varname> and <varname>sa_family</varname> - fields of the <varname>sockaddr</varname> structure. You - do not have to worry about the byte order there (of - course, on FreeBSD <varname>sa_family</varname> is only 1 - byte anyway, but many other &unix; systems do not have - <varname>sa_len</varname> and use 2 bytes for - <varname>sa_family</varname>, and expect the data in - whatever order is native to the computer).</para> + that the general sockets interface only understands the + <varname>sa_len</varname> and <varname>sa_family</varname> + fields of the <varname>sockaddr</varname> structure. 
You + do not have to worry about the byte order there (of + course, on FreeBSD <varname>sa_family</varname> is only 1 + byte anyway, but many other &unix; systems do not have + <varname>sa_len</varname> and use 2 bytes for + <varname>sa_family</varname>, and expect the data in + whatever order is native to the computer).</para> <para>But the rest of the data is just - <varname>sa_data[14]</varname> as far as sockets - goes. Depending on the <emphasis>address - family</emphasis>, sockets just forwards that data to its - destination.</para> + <varname>sa_data[14]</varname> as far as sockets goes. + Depending on the <emphasis>address family</emphasis>, + sockets just forwards that data to its destination.</para> <para>Indeed, when we enter a port number, it is because we want the other computer to know what service we are asking - for. And, when we are the server, we read the port number + for. And, when we are the server, we read the port number so we know what service the other computer is expecting - from us. Either way, sockets only has to forward the port - number as data. It does not interpret it in any way.</para> + from us. Either way, sockets only has to forward the port + number as data. It does not interpret it in any + way.</para> <para>Similarly, we enter the <acronym>IP</acronym> address - to tell everyone on the way where to send our data - to. Sockets, again, only forwards it as data.</para> + to tell everyone on the way where to send our data to. + Sockets, again, only forwards it as data.</para> <para>That is why, we (the <emphasis>programmers</emphasis>, not the <emphasis>sockets</emphasis>) have to distinguish @@ -771,8 +770,8 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | over <acronym>IP</acronym> <emphasis><acronym>MSB</acronym> first</emphasis>. 
This, we will refer to as the <emphasis>network byte - order</emphasis>, or simply the <emphasis>network - order</emphasis>.</para> + order</emphasis>, or simply the <emphasis>network + order</emphasis>.</para> <para>Now, if we compiled the above code for an Intel based computer, our <emphasis>host byte order</emphasis> would @@ -780,11 +779,11 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | <mediaobject> <imageobject> - <imagedata fileref="sockets/sainlsb"/> - </imageobject> + <imagedata fileref="sockets/sainlsb"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 13 | 0 | +--------+--------+--------+--------+ @@ -794,24 +793,24 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>Host byte order on an Intel system</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>Host byte order on an Intel system</phrase> + </textobject> + </mediaobject> - <para>But the <emphasis>network byte order</emphasis> - requires that we store the data <acronym>MSB</acronym> - first:</para> + <para>But the <emphasis>network byte order</emphasis> + requires that we store the data <acronym>MSB</acronym> + first:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sainmsb"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sainmsb"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 0 | 13 | +--------+--------+--------+--------+ @@ -821,135 +820,130 @@ sa.sin_addr.s_addr = (((((192 << 8) | 43) << 8) | 244) << 8) | +-----------------------------------+ 12 | 0 | 
+-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>Network byte order</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>Network byte order</phrase> + </textobject> + </mediaobject> - <para>Unfortunately, our <emphasis>host order</emphasis> is + <para>Unfortunately, our <emphasis>host order</emphasis> is the exact opposite of the <emphasis>network - order</emphasis>.</para> + order</emphasis>.</para> - <para>We have several ways of dealing with it. One would be - to <emphasis>reverse</emphasis> the values in our code: - </para> + <para>We have several ways of dealing with it. One would be + to <emphasis>reverse</emphasis> the values in our + code:</para> <programlisting>sa.sin_family = AF_INET; sa.sin_port = 13 << 8; sa.sin_addr.s_addr = (((((18 << 8) | 244) << 8) | 43) << 8) | 192;</programlisting> - <para>This will <emphasis>trick</emphasis> our compiler - into storing the data in the <emphasis>network byte - order</emphasis>. In some cases, this is exactly the way - to do it (e.g., when programming in assembly - language). In most cases, however, it can cause a - problem.</para> + <para>This will <emphasis>trick</emphasis> our compiler into + storing the data in the <emphasis>network byte + order</emphasis>. In some cases, this is exactly the + way to do it (e.g., when programming in assembly + language). In most cases, however, it can cause a + problem.</para> - <para>Suppose, you wrote a sockets-based program in C. You - know it is going to run on a &pentium;, so you enter all - your constants in reverse and force them to the - <emphasis>network byte order</emphasis>. It works - well.</para> + <para>Suppose, you wrote a sockets-based program in C. You + know it is going to run on a &pentium;, so you enter all + your constants in reverse and force them to the + <emphasis>network byte order</emphasis>. 
It works + well.</para> - <para>Then, some day, your trusted old &pentium; becomes a - rusty old &pentium;. You replace it with a system whose - <emphasis>host order</emphasis> is the same as the - <emphasis>network order</emphasis>. You need to recompile - all your software. All of your software continues to - perform well, except the one program you wrote.</para> + <para>Then, some day, your trusted old &pentium; becomes a + rusty old &pentium;. You replace it with a system whose + <emphasis>host order</emphasis> is the same as the + <emphasis>network order</emphasis>. You need to recompile + all your software. All of your software continues to + perform well, except the one program you wrote.</para> - <para>You have since forgotten that you had forced all of - your constants to the opposite of the <emphasis>host - order</emphasis>. You spend some quality time tearing out - your hair, calling the names of all gods you ever heard - of (and some you made up), hitting your monitor with a - nerf bat, and performing all the other traditional - ceremonies of trying to figure out why something that has - worked so well is suddenly not working at all.</para> + <para>You have since forgotten that you had forced all of + your constants to the opposite of the <emphasis>host + order</emphasis>. You spend some quality time tearing + out your hair, calling the names of all gods you ever + heard of (and some you made up), hitting your monitor with + a nerf bat, and performing all the other traditional + ceremonies of trying to figure out why something that has + worked so well is suddenly not working at all.</para> - <para>Eventually, you figure it out, say a couple of swear - words, and start rewriting your code.</para> + <para>Eventually, you figure it out, say a couple of swear + words, and start rewriting your code.</para> - <para>Luckily, you are not the first one to face the - problem. 
Someone else has created the &man.htons.3; and - &man.htonl.3; C functions to convert a - <varname>short</varname> and <varname>long</varname> - respectively from the <emphasis>host byte - order</emphasis> to the <emphasis>network byte - order</emphasis>, and the &man.ntohs.3; and &man.ntohl.3; - C functions to go the other way.</para> + <para>Luckily, you are not the first one to face the + problem. Someone else has created the &man.htons.3; and + &man.htonl.3; C functions to convert a + <varname>short</varname> and <varname>long</varname> + respectively from the <emphasis>host byte order</emphasis> + to the <emphasis>network byte order</emphasis>, and the + &man.ntohs.3; and &man.ntohl.3; C functions to go the + other way.</para> - <para>On <emphasis><acronym>MSB</acronym>-first</emphasis> - systems these functions do nothing. On - <emphasis><acronym>LSB</acronym>-first</emphasis> systems - they convert values to the proper order.</para> - - <para>So, regardless of what system your software is - compiled on, your data will end up in the correct order - if you use these functions.</para> - - </sect4> + <para>On <emphasis><acronym>MSB</acronym>-first</emphasis> + systems these functions do nothing. On + <emphasis><acronym>LSB</acronym>-first</emphasis> systems + they convert values to the proper order.</para> + <para>So, regardless of what system your software is + compiled on, your data will end up in the correct order if + you use these functions.</para> + </sect4> </sect3> <sect3 xml:id="sockets-client-functions"> - <title>Client Functions</title> + <title>Client Functions</title> - <para>Typically, the client initiates the connection to the - server. The client knows which server it is about to call: + <para>Typically, the client initiates the connection to the + server. The client knows which server it is about to call: It knows its <acronym>IP</acronym> address, and it knows the - <emphasis>port</emphasis> the server resides at. 
It is akin + <emphasis>port</emphasis> the server resides at. It is akin to you picking up the phone and dialing the number (the <emphasis>address</emphasis>), then, after someone answers, asking for the person in charge of wingdings (the <emphasis>port</emphasis>).</para> - <sect4 xml:id="sockets-connect"> - <title><function>connect</function></title> + <sect4 xml:id="sockets-connect"> + <title><function>connect</function></title> - <para>Once a client has created a socket, it needs to - connect it to a specific port on a remote system. It uses + <para>Once a client has created a socket, it needs to + connect it to a specific port on a remote system. It uses &man.connect.2;:</para> -<programlisting> -int connect(int s, const struct sockaddr *name, socklen_t namelen); -</programlisting> +<programlisting>int connect(int s, const struct sockaddr *name, socklen_t namelen);</programlisting> - <para>The <varname>s</varname> argument is the socket, i.e., + <para>The <varname>s</varname> argument is the socket, i.e., the value returned by the <function>socket</function> - function. The <varname>name</varname> is a pointer to + function. The <varname>name</varname> is a pointer to <varname>sockaddr</varname>, the structure we have talked - about extensively. Finally, <varname>namelen</varname> + about extensively. Finally, <varname>namelen</varname> informs the system how many bytes are in our <varname>sockaddr</varname> structure.</para> - <para>If <function>connect</function> is successful, it - returns <constant>0</constant>. Otherwise it returns + <para>If <function>connect</function> is successful, it + returns <constant>0</constant>. Otherwise it returns <constant>-1</constant> and stores the error code in <varname>errno</varname>.</para> - <para>There are many reasons why + <para>There are many reasons why <function>connect</function> may fail. 
For example, with an attempt to an Internet connection, the <acronym>IP</acronym> address may not exist, or it may be down, or just too busy, or it may not have a server - listening at the specified port. Or it may outright + listening at the specified port. Or it may outright <emphasis>refuse</emphasis> any request for specific code.</para> - </sect4> - <sect4 xml:id="sockets-first-client"> - <title>Our First Client</title> + <sect4 xml:id="sockets-first-client"> + <title>Our First Client</title> <para>We now know enough to write a very simple client, one - that will get current time from <systemitem class="ipaddress">192.43.244.18</systemitem> and print it to - <filename>stdout</filename>.</para> + that will get current time from <systemitem + class="ipaddress">192.43.244.18</systemitem> and print + it to <filename>stdout</filename>.</para> -<programlisting> -/* + <programlisting>/* * daytime.c * * Programmed by G. Adam Stanislav @@ -987,85 +981,78 @@ int main() { close(s); return 0; -} -</programlisting> +}</programlisting> - <para>Go ahead, enter it in your editor, save it as + <para>Go ahead, enter it in your editor, save it as <filename>daytime.c</filename>, then compile and run it:</para> -<screen>&prompt.user; <userinput>cc -O3 -o daytime daytime.c</userinput> + <screen>&prompt.user; <userinput>cc -O3 -o daytime daytime.c</userinput> &prompt.user; <userinput>./daytime</userinput> 52079 01-06-19 02:29:25 50 0 1 543.9 UTC(NIST) * &prompt.user;</screen> - <para>In this case, the date was June 19, 2001, the time was - 02:29:25 <acronym>UTC</acronym>. Naturally, your results + <para>In this case, the date was June 19, 2001, the time was + 02:29:25 <acronym>UTC</acronym>. Naturally, your results will vary.</para> - </sect4> - </sect3> <sect3 xml:id="sockets-server-functions"> - <title>Server Functions</title> + <title>Server Functions</title> - <para>The typical server does not initiate the - connection. 
Instead, it waits for a client to call it and - request services. It does not know when the client will - call, nor how many clients will call. It may be just sitting - there, waiting patiently, one moment, The next moment, it - can find itself swamped with requests from a number of - clients, all calling in at the same time.</para> + <para>The typical server does not initiate the connection. + Instead, it waits for a client to call it and request + services. It does not know when the client will call, nor + how many clients will call. It may be just sitting there, + waiting patiently, one moment. The next moment, it can find + itself swamped with requests from a number of clients, all + calling in at the same time.</para> <para>The sockets interface offers three basic functions to handle this.</para> - <sect4 xml:id="sockets-bind"> - <title><function>bind</function></title> + <sect4 xml:id="sockets-bind"> + <title><function>bind</function></title> - <para>Ports are like extensions to a phone line: After you + <para>Ports are like extensions to a phone line: After you dial a number, you dial the extension to get to a specific person or department.</para> <para>There are 65535 <acronym>IP</acronym> ports, but a server usually processes requests that come in on only one - of them. It is like telling the phone room operator that + of them. It is like telling the phone room operator that we are now at work and available to answer the phone at a - specific extension. We use &man.bind.2; to tell sockets + specific extension. We use &man.bind.2; to tell sockets which port we want to serve.</para> -<programlisting> -int bind(int s, const struct sockaddr *addr, socklen_t addrlen); -</programlisting> + <programlisting>int bind(int s, const struct sockaddr *addr, socklen_t addrlen);</programlisting> - <para>Beside specifying the port in <varname>addr</varname>, - the server may include its <acronym>IP</acronym> - address.
However, it can just use the symbolic constant + <para>Besides specifying the port in <varname>addr</varname>, + the server may include its <acronym>IP</acronym> address. + However, it can just use the symbolic constant <symbol>INADDR_ANY</symbol> to indicate it will serve all requests to the specified port regardless of what its - <acronym>IP</acronym> address is. This symbol, along with + <acronym>IP</acronym> address is. This symbol, along with several similar ones, is declared in <filename>netinet/in.h</filename></para> -<programlisting> -#define INADDR_ANY (u_int32_t)0x00000000 -</programlisting> + <programlisting>#define INADDR_ANY (u_int32_t)0x00000000</programlisting> - <para>Suppose we were writing a server for the + <para>Suppose we were writing a server for the <emphasis>daytime</emphasis> protocol over - <acronym>TCP</acronym>/<acronym>IP</acronym>. Recall that - it uses port 13.
Our <varname>sockaddr_in</varname> structure would look like this:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/sainserv"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/sainserv"/> + </imageobject> - <textobject> - <literallayout class="monospaced"> 0 1 2 3 + <textobject> + <literallayout class="monospaced"> 0 1 2 3 +--------+--------+--------+--------+ 0 | 0 | 2 | 0 | 13 | +--------+--------+--------+--------+ @@ -1075,78 +1062,73 @@ int bind(int s, const struct sockaddr *addr, socklen_t addrlen); +-----------------------------------+ 12 | 0 | +-----------------------------------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>Example Server sockaddr_in</phrase> - </textobject> - </mediaobject> - </sect4> + <textobject> + <phrase>Example Server sockaddr_in</phrase> + </textobject> + </mediaobject> + </sect4> - <sect4 xml:id="sockets-listen"> - <title><function>listen</function></title> + <sect4 xml:id="sockets-listen"> + <title><function>listen</function></title> - <para>To continue our office phone analogy, after you have + <para>To continue our office phone analogy, after you have told the phone central operator what extension you will be at, you now walk into your office, and make sure your own - phone is plugged in and the ringer is turned on. Plus, you - make sure your call waiting is activated, so you can hear - the phone ring even while you are talking to someone.</para> + phone is plugged in and the ringer is turned on. 
Plus, + you make sure your call waiting is activated, so you can + hear the phone ring even while you are talking to + someone.</para> - <para>The server ensures all of that with the &man.listen.2; - function.</para> + <para>The server ensures all of that with the &man.listen.2; + function.</para> -<programlisting> -int listen(int s, int backlog); -</programlisting> + <programlisting>int listen(int s, int backlog);</programlisting> - <para>In here, the <varname>backlog</varname> variable tells + <para>In here, the <varname>backlog</varname> variable tells sockets how many incoming requests to accept while you are - busy processing the last request. In other words, it + busy processing the last request. In other words, it determines the maximum size of the queue of pending connections.</para> + </sect4> - </sect4> + <sect4 xml:id="sockets-accept"> + <title><function>accept</function></title> - <sect4 xml:id="sockets-accept"> - <title><function>accept</function></title> - - <para>After you hear the phone ringing, you accept the call - by answering the call. You have now established a - connection with your client. This connection remains + <para>After you hear the phone ringing, you accept the call + by answering the call. You have now established a + connection with your client. This connection remains active until either you or your client hang up.</para> <para>The server accepts the connection by using the - &man.accept.2; function.</para> + &man.accept.2; function.</para> -<programlisting> -int accept(int s, struct sockaddr *addr, socklen_t *addrlen); -</programlisting> + <programlisting>int accept(int s, struct sockaddr *addr, socklen_t *addrlen);</programlisting> - <para>Note that this time <varname>addrlen</varname> is a - pointer. This is necessary because in this case it is the - socket that fills out <varname>addr</varname>, the - <varname>sockaddr_in</varname> structure.</para> + <para>Note that this time <varname>addrlen</varname> is a + pointer. 
This is necessary because in this case it is the + socket that fills out <varname>addr</varname>, the + <varname>sockaddr_in</varname> structure.</para> - <para>The return value is an integer. Indeed, the + <para>The return value is an integer. Indeed, the <function>accept</function> returns a <emphasis>new - socket</emphasis>. You will use this new socket to + socket</emphasis>. You will use this new socket to communicate with the client.</para> - <para>What happens to the old socket? It continues to listen + <para>What happens to the old socket? It continues to listen for more requests (remember the <varname>backlog</varname> variable we passed to <function>listen</function>?) until we <function>close</function> it.</para> - <para>Now, the new socket is meant only for - communications. It is fully connected. We cannot pass it - to <function>listen</function> again, trying to accept + <para>Now, the new socket is meant only for communications. + It is fully connected. We cannot pass it to + <function>listen</function> again, trying to accept additional connections.</para> + </sect4> - </sect4> - - <sect4 xml:id="sockets-first-server"> - <title>Our First Server</title> + <sect4 xml:id="sockets-first-server"> + <title>Our First Server</title> <para>Our first server will be somewhat more complex than our first client was: Not only do we have more sockets @@ -1154,7 +1136,7 @@ int accept(int s, struct sockaddr *addr, socklen_t *addrlen); daemon.</para> <para>This is best achieved by creating a <emphasis>child - process</emphasis> after binding the port. The main + process</emphasis> after binding the port. 
The main process then exits and returns control to the <application>shell</application> (or whatever program invoked it).</para> @@ -1163,8 +1145,7 @@ int accept(int s, struct sockaddr *addr, socklen_t *addrlen); starts an endless loop, which accepts a connection, serves it, and eventually closes its socket.</para> -<programlisting> -/* + <programlisting>/* * daytimed - a port 13 server * * Programmed by G. Adam Stanislav @@ -1251,84 +1232,88 @@ int main() { fclose(client); } -} -</programlisting> +}</programlisting> - <para>We start by creating a socket. Then we fill out the + <para>We start by creating a socket. Then we fill out the <varname>sockaddr_in</varname> structure in <varname>sa</varname>. Note the conditional use of <symbol>INADDR_ANY</symbol>:</para> -<programlisting> - if (INADDR_ANY) - sa.sin_addr.s_addr = htonl(INADDR_ANY); -</programlisting> + <programlisting>if (INADDR_ANY) + sa.sin_addr.s_addr = htonl(INADDR_ANY);</programlisting> - <para>Its value is <constant>0</constant>. Since we have + <para>Its value is <constant>0</constant>. Since we have just used <function>bzero</function> on the entire structure, it would be redundant to set it to - <constant>0</constant> again. But if we port our code to + <constant>0</constant> again. But if we port our code to some other system where <symbol>INADDR_ANY</symbol> is perhaps not a zero, we need to assign it to - <varname>sa.sin_addr.s_addr</varname>. Most modern C + <varname>sa.sin_addr.s_addr</varname>. Most modern C compilers are clever enough to notice that - <symbol>INADDR_ANY</symbol> is a constant. As long as it + <symbol>INADDR_ANY</symbol> is a constant. As long as it is a zero, they will optimize the entire conditional statement out of the code.</para> <para>After we have called <function>bind</function> successfully, we are ready to become a <emphasis>daemon</emphasis>: We use - <function>fork</function> to create a child process. In + <function>fork</function> to create a child process. 
In both, the parent and the child, the <varname>s</varname> - variable is our socket. The parent process will not need + variable is our socket. The parent process will not need it, so it calls <function>close</function>, then it returns <constant>0</constant> to inform its own parent it had terminated successfully.</para> <para>Meanwhile, the child process continues working in the background. It calls <function>listen</function> and sets - its backlog to <constant>4</constant>. It does not need a + its backlog to <constant>4</constant>. It does not need a large value here because <emphasis>daytime</emphasis> is not a protocol many clients request all the time, and - because it can process each request instantly anyway.</para> + because it can process each request instantly + anyway.</para> <para>Finally, the daemon starts an endless loop, which performs the following steps:</para> <procedure> - <step><para> Call <function>accept</function>. It waits - here until a client contacts it. At that point, it - receives a new socket, <varname>c</varname>, which it - can use to communicate with this particular client. - </para></step> + <step> + <para>Call <function>accept</function>. It waits here + until a client contacts it. At that point, it + receives a new socket, <varname>c</varname>, which it + can use to communicate with this particular + client.</para> + </step> - <step><para>It uses the C function - <function>fdopen</function> to turn the socket from a - low-level <emphasis>file descriptor</emphasis> to a - C-style <varname>FILE</varname> pointer. This will allow - the use of <function>fprintf</function> later on. - </para></step> - - <step><para>It checks the time, and prints it in the - <emphasis><acronym>ISO</acronym> 8601</emphasis> format - to the <varname>client</varname> <quote>file</quote>. It - then uses <function>fclose</function> to close the - file. That will automatically close the socket as well. 
- </para></step> + <step> + <para>It uses the C function <function>fdopen</function> + to turn the socket from a low-level <emphasis>file + descriptor</emphasis> to a C-style + <varname>FILE</varname> pointer. This will allow the + use of <function>fprintf</function> later + on.</para> + </step> + <step> + <para>It checks the time, and prints it in the + <emphasis><acronym>ISO</acronym> 8601</emphasis> + format to the <varname>client</varname> + <quote>file</quote>. It then uses + <function>fclose</function> to close the file. That + will automatically close the socket as + well.</para> + </step> </procedure> <para>We can <emphasis>generalize</emphasis> this, and use it as a model for many other servers:</para> - <mediaobject> - <imageobject> - <imagedata fileref="sockets/serv"/> - </imageobject> + <mediaobject> + <imageobject> + <imagedata fileref="sockets/serv"/> + </imageobject> - <textobject> - <literallayout class="monospaced">+-----------------+ + <textobject> + <literallayout class="monospaced">+-----------------+ | Create Socket | +-----------------+ | @@ -1354,22 +1339,23 @@ int main() { | +--------+ | | Close | |<--------+</literallayout> - </textobject> + </textobject> - <textobject> - <phrase>Sequential Server</phrase> - </textobject> - </mediaobject> + <textobject> + <phrase>Sequential Server</phrase> + </textobject> + </mediaobject> - <para>This flowchart is good for <emphasis>sequential - servers</emphasis>, i.e., servers that can serve one + <para>This flowchart is good for <emphasis>sequential + servers</emphasis>, i.e., servers that can serve one client at a time, just as we were able to with our - <emphasis>daytime</emphasis> server. This is only possible - whenever there is no real <quote>conversation</quote> - going on between the client and the server: As soon as the - server detects a connection to the client, it sends out - some data and closes the connection. 
The entire operation - may take nanoseconds, and it is finished.</para> + <emphasis>daytime</emphasis> server. This is only + possible whenever there is no real + <quote>conversation</quote> going on between the client + and the server: As soon as the server detects a connection + to the client, it sends out some data and closes the + connection. The entire operation may take nanoseconds, + and it is finished.</para> <para>The advantage of this flowchart is that, except for the brief moment after the parent @@ -1379,40 +1365,40 @@ int main() { resources.</para> <para>Note that we have added <emphasis>initialize - daemon</emphasis> in our flowchart. We did not need to + daemon</emphasis> in our flowchart. We did not need to initialize our own daemon, but this is a good place in the flow of the program to set up any <function>signal</function> handlers, open any files we may need, etc.</para> <para>Just about everything in the flow chart can be used - literally on many different servers. The + literally on many different servers. The <emphasis>serve</emphasis> entry is the exception. We think of it as a <emphasis><quote>black - box</quote></emphasis>, i.e., something you design + box</quote></emphasis>, i.e., something you design specifically for your own server, and just <quote>plug it - into the rest.</quote></para> + into the rest.</quote></para> - <para>Not all protocols are that simple. Many receive a + <para>Not all protocols are that simple. Many receive a request from the client, reply to it, then receive another - request from the same client. Because of that, they do not - know in advance how long they will be serving the - client. Such servers usually start a new process for each - client. While the new process is serving its client, the + request from the same client. Because of that, they do + not know in advance how long they will be serving the + client. Such servers usually start a new process for each + client. 
While the new process is serving its client, the daemon can continue listening for more connections.</para> <para>Now, go ahead, save the above source code as <filename>daytimed.c</filename> (it is customary to end the names of daemons with the letter - <constant>d</constant>). After you have compiled it, try + <constant>d</constant>). After you have compiled it, try running it:</para> -<screen>&prompt.user; <userinput>./daytimed</userinput> + <screen>&prompt.user; <userinput>./daytimed</userinput> bind: Permission denied &prompt.user;</screen> - <para>What happened here? As you will recall, the - <emphasis>daytime</emphasis> protocol uses port 13. But + <para>What happened here? As you will recall, the + <emphasis>daytime</emphasis> protocol uses port 13. But all ports below 1024 are reserved to the superuser (otherwise, anyone could start a daemon pretending to serve a commonly used port, while causing a security @@ -1420,27 +1406,27 @@ bind: Permission denied <para>Try again, this time as the superuser:</para> -<screen>&prompt.root; <userinput>./daytimed</userinput> + <screen>&prompt.root; <userinput>./daytimed</userinput> &prompt.root;</screen> - <para>What... Nothing? Let us try again:</para> + <para>What... Nothing? Let us try again:</para> -<screen>&prompt.root; <userinput>./daytimed</userinput> + <screen>&prompt.root; <userinput>./daytimed</userinput> bind: Address already in use &prompt.root;</screen> - <para>Every port can only be bound by one program at a - time. Our first attempt was indeed successful: It started - the child daemon and returned quietly. It is still running + <para>Every port can only be bound by one program at a time. + Our first attempt was indeed successful: It started the + child daemon and returned quietly. It is still running and will continue to run until you either kill it, or any of its system calls fail, or you reboot the system.</para> - <para>Fine, we know it is running in the background. 
But is + <para>Fine, we know it is running in the background. But is it working? How do we know it is a proper <emphasis>daytime</emphasis> server? Simple:</para> -<screen>&prompt.user; <userinput>telnet localhost 13</userinput> + <screen>&prompt.user; <userinput>telnet localhost 13</userinput> Trying ::1... telnet: connect to address ::1: Connection refused @@ -1451,18 +1437,18 @@ Escape character is '^]'. Connection closed by foreign host. &prompt.user;</screen> - <para><application>telnet</application> tried the new - <acronym>IP</acronym>v6, and failed. It retried with + <para><application>telnet</application> tried the new + <acronym>IP</acronym>v6, and failed. It retried with <acronym>IP</acronym>v4 and succeeded. The daemon works.</para> - <para>If you have access to another &unix; system via + <para>If you have access to another &unix; system via <application>telnet</application>, you can use it to test - accessing the server remotely. My computer does not have a - static <acronym>IP</acronym> address, so this is what I + accessing the server remotely. My computer does not have + a static <acronym>IP</acronym> address, so this is what I did:</para> -<screen>&prompt.user; <userinput>who</userinput> + <screen>&prompt.user; <userinput>who</userinput> whizkid ttyp0 Jun 19 16:59 (216.127.220.143) xxx ttyp1 Jun 19 16:06 (xx.xx.xx.xx) @@ -1475,10 +1461,10 @@ Escape character is '^]'. Connection closed by foreign host. &prompt.user;</screen> - <para>Again, it worked. Will it work using the domain name? - </para> + <para>Again, it worked. Will it work using the domain + name?</para> -<screen>&prompt.user; <userinput>telnet r47.bfm.org 13</userinput> + <screen>&prompt.user; <userinput>telnet r47.bfm.org 13</userinput> Trying 216.127.220.143... Connected to r47.bfm.org. @@ -1487,19 +1473,15 @@ Escape character is '^]'. Connection closed by foreign host. 
&prompt.user;</screen> - <para>By the way, <application>telnet</application> prints + <para>By the way, <application>telnet</application> prints the <emphasis>Connection closed by foreign host</emphasis> - message after our daemon has closed the socket. This shows - us that, indeed, using + message after our daemon has closed the socket. This + shows us that, indeed, using <function>fclose(client);</function> in our code works as advertised.</para> - - </sect4> - + </sect4> </sect3> - </sect2> - </sect1> <sect1 xml:id="sockets-helper-functions"> @@ -1508,38 +1490,34 @@ Connection closed by foreign host. <para>FreeBSD C library contains many helper functions for sockets programming. For example, in our sample client we hard coded the <systemitem class="fqdomainname">time.nist.gov</systemitem> - <acronym>IP</acronym> address. But we do not always know the + <acronym>IP</acronym> address. But we do not always know the <acronym>IP</acronym> address. Even if we do, our software is more flexible if it allows the user to enter the - <acronym>IP</acronym> address, or even the domain name. 
- </para> + <acronym>IP</acronym> address, or even the domain name.</para> <sect2 xml:id="sockets-gethostbyname"> <title><function>gethostbyname</function></title> <para>While there is no way to pass the domain name directly to - any of the sockets functions, the FreeBSD C library comes with - the &man.gethostbyname.3; and &man.gethostbyname2.3; functions, - declared in <filename>netdb.h</filename>.</para> + any of the sockets functions, the FreeBSD C library comes with + the &man.gethostbyname.3; and &man.gethostbyname2.3; + functions, declared in <filename>netdb.h</filename>.</para> -<programlisting> -struct hostent * gethostbyname(const char *name); -struct hostent * gethostbyname2(const char *name, int af); -</programlisting> + <programlisting>struct hostent * gethostbyname(const char *name); +struct hostent * gethostbyname2(const char *name, int af);</programlisting> <para>Both return a pointer to the <varname>hostent</varname> - structure, with much information about the domain. For our - purposes, the <varname>h_addr_list[0]</varname> field of the - structure points at <varname>h_length</varname> bytes of the - correct address, already stored in the <emphasis>network byte - order</emphasis>.</para> + structure, with much information about the domain. For our + purposes, the <varname>h_addr_list[0]</varname> field of the + structure points at <varname>h_length</varname> bytes of the + correct address, already stored in the <emphasis>network byte + order</emphasis>.</para> <para>This allows us to create a much more flexible—and - much more useful—version of our - <application>daytime</application> program:</para> + much more useful—version of our + <application>daytime</application> program:</para> -<programlisting> -/* + <programlisting>/* * daytime.c * * Programmed by G. 
Adam Stanislav @@ -1589,26 +1567,28 @@ int main(int argc, char *argv[]) { close(s); return 0; -} -</programlisting> +}</programlisting> <para>We now can type a domain name (or an <acronym>IP</acronym> - address, it works both ways) on the command line, and the - program will try to connect to its - <emphasis>daytime</emphasis> server. Otherwise, it will still - default to <systemitem class="fqdomainname">time.nist.gov</systemitem>. However, even in - this case we will use <function>gethostbyname</function> - rather than hard coding <systemitem class="ipaddress">192.43.244.18</systemitem>. That way, even if its - <acronym>IP</acronym> address changes in the future, we will - still find it.</para> + address, it works both ways) on the command line, and the + program will try to connect to its + <emphasis>daytime</emphasis> server. Otherwise, it will still + default to <systemitem + class="fqdomainname">time.nist.gov</systemitem>. However, + even in this case we will use + <function>gethostbyname</function> rather than hard coding + <systemitem class="ipaddress">192.43.244.18</systemitem>. + That way, even if its <acronym>IP</acronym> address changes in + the future, we will still find it.</para> <para>Since it takes virtually no time to get the time from your - local server, you could run <application>daytime</application> - twice in a row: First to get the time from <systemitem class="fqdomainname">time.nist.gov</systemitem>, the second time from - your own system. You can then compare the results and see how - exact your system clock is:</para> + local server, you could run <application>daytime</application> + twice in a row: First to get the time from <systemitem + class="fqdomainname">time.nist.gov</systemitem>, the second + time from your own system. 
You can then compare the results + and see how exact your system clock is:</para> -<screen>&prompt.user; <userinput>daytime ; daytime localhost</userinput> + <screen>&prompt.user; <userinput>daytime ; daytime localhost</userinput> 52080 01-06-20 04:02:33 50 0 0 390.2 UTC(NIST) * @@ -1616,61 +1596,54 @@ int main(int argc, char *argv[]) { &prompt.user;</screen> <para>As you can see, my system was two seconds ahead of the - <acronym>NIST</acronym> time.</para> - + <acronym>NIST</acronym> time.</para> </sect2> <sect2 xml:id="sockets-getservbyname"> <title><function>getservbyname</function></title> <para>Sometimes you may not be sure what port a certain service - uses. The &man.getservbyname.3; function, also declared in - <filename>netdb.h</filename> comes in very handy in those - cases:</para> + uses. The &man.getservbyname.3; function, also declared in + <filename>netdb.h</filename>, comes in very handy in those + cases:</para> -<programlisting> -struct servent * getservbyname(const char *name, const char *proto); -</programlisting> +<programlisting>struct servent * getservbyname(const char *name, const char *proto);</programlisting> <para>The <varname>servent</varname> structure contains the - <varname>s_port</varname>, which contains the proper port, - already in <emphasis>network byte order</emphasis>.</para> + <varname>s_port</varname>, which contains the proper port, + already in <emphasis>network byte order</emphasis>.</para> <para>Had we not known the correct port for the - <emphasis>daytime</emphasis> service, we could have found it - this way:</para> + <emphasis>daytime</emphasis> service, we could have found it + this way:</para> -<programlisting> - struct servent *se; + <programlisting>struct servent *se; ... if ((se = getservbyname("daytime", "tcp")) == NULL) { fprintf(stderr, "Cannot determine which port to use.\n"); return 7; } - sa.sin_port = se->s_port; -</programlisting> - - <para>You usually do know the port. 
But if you are developing a - new protocol, you may be testing it on an unofficial - port. Some day, you will register the protocol and its port - (if nowhere else, at least in your - <filename>/etc/services</filename>, which is where - <function>getservbyname</function> looks). Instead of - returning an error in the above code, you just use the - temporary port number. Once you have listed the protocol in - <filename>/etc/services</filename>, your software will find - its port without you having to rewrite the code.</para> + sa.sin_port = se->s_port;</programlisting> + <para>You usually do know the port. But if you are developing a + new protocol, you may be testing it on an unofficial port. + Some day, you will register the protocol and its port (if + nowhere else, at least in your + <filename>/etc/services</filename>, which is where + <function>getservbyname</function> looks). Instead of + returning an error in the above code, you just use the + temporary port number. Once you have listed the protocol in + <filename>/etc/services</filename>, your software will find + its port without you having to rewrite the code.</para> </sect2> - </sect1> <sect1 xml:id="sockets-concurrent-servers"> <title>Concurrent Servers</title> <para>Unlike a sequential server, a <emphasis>concurrent - server</emphasis> has to be able to serve more than one client - at a time. For example, a <emphasis>chat server</emphasis> may + server</emphasis> has to be able to serve more than one client + at a time. 
For example, a <emphasis>chat server</emphasis> may be serving a specific client for hours—it cannot wait till it stops serving a client before it serves the next one.</para> @@ -1678,11 +1651,11 @@ struct servent * getservbyname(const char *name, const char *proto); <mediaobject> <imageobject> - <imagedata fileref="sockets/serv2"/> + <imagedata fileref="sockets/serv2"/> </imageobject> <textobject> - <literallayout class="monospaced">+-----------------+ + <literallayout class="monospaced">+-----------------+ | Create Socket | +-----------------+ | @@ -1718,23 +1691,23 @@ struct servent * getservbyname(const char *name, const char *proto); </textobject> <textobject> - <phrase>Concurrent Server</phrase> + <phrase>Concurrent Server</phrase> </textobject> </mediaobject> <para>We moved the <emphasis>serve</emphasis> from the <emphasis>daemon process</emphasis> to its own <emphasis>server - process</emphasis>. However, because each child process inherits - all open files (and a socket is treated just like a file), the - new process inherits not only the <emphasis><quote>accepted - handle,</quote></emphasis> i.e., the socket returned by the - <function>accept</function> call, but also the <emphasis>top - socket</emphasis>, i.e., the one opened by the top process right - at the beginning.</para> + process</emphasis>. However, because each child process + inherits all open files (and a socket is treated just like a + file), the new process inherits not only the + <emphasis><quote>accepted handle,</quote></emphasis> i.e., the + socket returned by the <function>accept</function> call, but + also the <emphasis>top socket</emphasis>, i.e., the one opened + by the top process right at the beginning.</para> <para>However, the <emphasis>server process</emphasis> does not need this socket and should <function>close</function> it - immediately. Similarly, the <emphasis>daemon process</emphasis> + immediately. 
Similarly, the <emphasis>daemon process</emphasis> no longer needs the <emphasis>accepted socket</emphasis>, and not only should, but <emphasis>must</emphasis> <function>close</function> it—otherwise, it will run out @@ -1743,36 +1716,33 @@ struct servent * getservbyname(const char *name, const char *proto); <para>After the <emphasis>server process</emphasis> is done serving, it should close the <emphasis>accepted - socket</emphasis>. Instead of returning to - <function>accept</function>, it now exits. - </para> + socket</emphasis>. Instead of returning to + <function>accept</function>, it now exits.</para> <para>Under &unix;, a process does not really - <emphasis>exit</emphasis>. Instead, it - <emphasis>returns</emphasis> to its parent. Typically, a parent + <emphasis>exit</emphasis>. Instead, it + <emphasis>returns</emphasis> to its parent. Typically, a parent process <function>wait</function>s for its child process, and - obtains a return value. However, our <emphasis>daemon - process</emphasis> cannot simply stop and wait. That would - defeat the whole purpose of creating additional processes. But + obtains a return value. However, our <emphasis>daemon + process</emphasis> cannot simply stop and wait. That would + defeat the whole purpose of creating additional processes. But if it never does <function>wait</function>, its children will become <emphasis>zombies</emphasis>—no longer functional but still roaming around.</para> <para>For that reason, the <emphasis>daemon process</emphasis> needs to set <emphasis>signal handlers</emphasis> in its - <emphasis>initialize daemon</emphasis> phase. At least a + <emphasis>initialize daemon</emphasis> phase. 
At least a <symbol>SIGCHLD</symbol> signal has to be processed, so the daemon can remove the zombie return values from the system and release the system resources they are taking up.</para> <para>That is why our flowchart now contains a <emphasis>process - signals</emphasis> box, which is not connected to any other box. - By the way, many servers also process <symbol>SIGHUP</symbol>, - and typically interpret as the signal from the superuser that - they should reread their configuration files. This allows us to - change settings without having to kill and restart these - servers.</para> - + signals</emphasis> box, which is not connected to any other + box. By the way, many servers also process + <symbol>SIGHUP</symbol>, and typically interpret it as the signal + from the superuser that they should reread their configuration + files. This allows us to change settings without having to kill + and restart these servers.</para> </sect1> - </chapter>