Monday, February 15, 2010

Day 29

Programming a Java HTTP Socket Server is proving to be quite interesting. I spent several hours today just reading and taking notes on the Hypertext Transfer Protocol and all the complexity of moving information around the internet. Then I actually got some help from the Craftsmanship Articles written by Uncle Bob, which I had read last year. It turns out that Alphonse and Jerry created their own Socket Server in Java, which is quite similar to a part of what I need to do. I also just finished reading a rather interesting article about what it means to program an AI, and it turned the idea of human intelligence on its head, if I may.

There is a plethora of detail in HTTP, and the RFC documents are very dense and a tough read. I found the only way I can really absorb much from them is by taking notes and trying to restate, in my own words, what I learned.

HTTP has gone through a few versions now, starting with HTTP/0.9, which was extremely basic, and evolving into HTTP/1.1, which seems to be the standard these days. In its most basic form, HTTP is just a standard way to ask a program for some data, or to send some data to it.

All HTTP interactions follow a request-response pattern. First a client (a program that connects to a server) sends a request to the server. Once the server receives this request, it processes it, then builds and sends the appropriate response. If the client wants any further information, it must package up and send another request. Often there are several intermediaries as well, such as proxies or gateways, which receive a message from the client and may perform some manipulation before forwarding it to the server.
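To make the request side a bit more concrete, here is a minimal sketch of how a server might pull apart the first line of a request. It assumes the HTTP/1.1 request-line layout of method, target, and version separated by single spaces. The class name `RequestLineSketch` and its fields are my own invention for illustration, not part of any library.

```java
// Hypothetical helper: splits an HTTP request line such as
// "GET /index.html HTTP/1.1" into its three parts.
public class RequestLineSketch {
    public final String method;   // e.g. GET or POST
    public final String target;   // e.g. /index.html
    public final String version;  // e.g. HTTP/1.1

    private RequestLineSketch(String method, String target, String version) {
        this.method = method;
        this.target = target;
        this.version = version;
    }

    public static RequestLineSketch parse(String line) {
        // A well-formed request line has exactly three space-separated parts.
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLineSketch(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLineSketch request = parse("GET /index.html HTTP/1.1");
        System.out.println(request.method + " asks for " + request.target);
    }
}
```

A real server would also need to read the header lines that follow and decide what response to build, which this sketch leaves out.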

Most programs that use HTTP will also have a cache. This is some local storage that holds responses to earlier requests, already packaged and ready to be sent again. Using a cache can save a server or client a lot of processing time and, on a cache hit, the wait for a response from further down the line.
Caches can also be particularly useful if there are several proxies, gateways, or intermediate servers sitting between the client and the origin server. If, for example, a client needs to send a request through proxy A and proxy B before reaching the server, then both proxies can build up a cache to save time. Proxy A could save the client's request along with the response coming back from the origin server. This way, the next time proxy A gets that same request, it can just send the stored response right away without having to wait to hear back from the server.
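The proxy A idea can be sketched in a few lines of Java. This is only an illustration under big simplifying assumptions: the request is treated as a plain string key, the "origin server" is just a function passed in, and nothing ever expires. A real HTTP cache also has to decide whether a stored response is still fresh, which this sketch ignores. The name `ProxyCacheSketch` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical proxy that remembers responses it has already seen.
public class ProxyCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> originServer;
    private int originCalls = 0;

    public ProxyCacheSketch(Function<String, String> originServer) {
        this.originServer = originServer;
    }

    public String handle(String request) {
        // Cache hit: answer right away without contacting the origin server.
        String cached = cache.get(request);
        if (cached != null) {
            return cached;
        }
        // Cache miss: forward the request, then store the response for next time.
        originCalls++;
        String response = originServer.apply(request);
        cache.put(request, response);
        return response;
    }

    // How many requests actually had to go to the origin server.
    public int originCalls() {
        return originCalls;
    }

    public static void main(String[] args) {
        ProxyCacheSketch proxyA = new ProxyCacheSketch(req -> "response to " + req);
        proxyA.handle("GET /index.html");
        proxyA.handle("GET /index.html");
        System.out.println("Origin server was contacted " + proxyA.originCalls() + " time(s)");
    }
}
```

In the `main` example the same request is handled twice, but only the first one reaches the stand-in origin server.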

When I say 'package' a request or response, I mean forming a proper HTTP message. The desired data in any message is called the entity of that message. The entity can't be sent on its own, because the program on the other end would have no way to know what it is or how to read it, so it must be packaged into a formatted HTTP message. These typically consist of a few header fields which define certain basics about the entity, like its length or perhaps its encoding, and then the message body, which contains the entity.
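Here is a rough sketch of what packaging an entity into a response might look like. It assumes the HTTP/1.1 layout of a status line, header fields, a blank line, and then the body, with each line ending in a carriage return and line feed. The class and method names are made up for this example, and only two header fields are included.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that wraps an entity in a minimal HTTP/1.1 response.
public class HttpMessageSketch {
    public static String buildResponse(int statusCode, String reasonPhrase,
                                       String contentType, String body) {
        // Content-Length counts bytes, not characters, so encode before measuring.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + statusCode + " " + reasonPhrase + "\r\n"  // status line
             + "Content-Type: " + contentType + "\r\n"                 // what the entity is
             + "Content-Length: " + length + "\r\n"                    // how long it is
             + "\r\n"                                                  // blank line ends the headers
             + body;                                                   // message body carrying the entity
    }

    public static void main(String[] args) {
        System.out.print(buildResponse(200, "OK", "text/plain", "Hello"));
    }
}
```

The header fields are what let the receiving program make sense of the body: without Content-Length it would not know where the entity ends, and without Content-Type it would not know how to interpret it.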

Typically, if an HTTP formatted message needs to carry anything beyond plain text, it will use MIME types. MIME, short for Multipurpose Internet Mail Extensions, is a standard that was created for email, and HTTP borrows its type names to label what kind of data a message contains. There are a variety of MIME types, and I have much more to learn about them in order to understand how they actually vary.
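As a small illustration of how a server might pick a MIME type, here is a sketch that guesses one from a file name. The handful of types listed are standard registered ones, but the lookup table itself and the class name `MimeTypeSketch` are just assumptions for the example. A real server would know many more types.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup from file extension to MIME type.
public class MimeTypeSketch {
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("html", "text/html");
        TYPES.put("txt", "text/plain");
        TYPES.put("png", "image/png");
        TYPES.put("jpg", "image/jpeg");
    }

    public static String contentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String extension = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Fall back to the generic "arbitrary bytes" type for unknown extensions.
        return TYPES.getOrDefault(extension, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("index.html"));
        System.out.println(contentTypeFor("photo.JPG"));
    }
}
```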

A Socket Server is a type of server that uses Sockets to handle multiple requests. Say you have a server hosting your website which provides some service to clients. The most basic of servers will be able to talk to just one client at a time, and will focus all of its attention on that one client (spending far too much time just waiting for the next request). But say you want your website talking to multiple clients at the same time. You might use a Socket Server, which creates a Server Socket for each port you wish to listen on and then generates a Socket for each new client that connects. A Server Socket is a placeholder that watches for any incoming connection on a port, like port 80 (the standard port for HTTP). Once the Server Socket sees a client trying to talk to your server on its specified port, it creates a Socket for that client to talk to. The Socket then handles any requests from that one client, and sends responses back once it is able to get some processor time to run the program behind your website.
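Here is a minimal sketch of that arrangement using Java's `java.net.ServerSocket` and `java.net.Socket`. It is not Alphonse and Jerry's server or my finished one, just an illustration under a few assumptions: one new thread per client, a fixed plain-text reply, and a connection that closes after a single response. The class name `SocketServerSketch` and port 8080 are arbitrary choices.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SocketServerSketch {
    private final ServerSocket serverSocket;

    public SocketServerSketch(int port) throws IOException {
        // The Server Socket watches one port; port 0 asks for any free port.
        serverSocket = new ServerSocket(port);
    }

    public int port() {
        return serverSocket.getLocalPort();
    }

    public void start() {
        new Thread(() -> {
            while (!serverSocket.isClosed()) {
                try {
                    // accept() blocks until a client connects, then returns a Socket for it.
                    Socket client = serverSocket.accept();
                    // Each client gets its own thread, so a slow one does not block the rest.
                    new Thread(() -> handle(client)).start();
                } catch (IOException e) {
                    // accept() fails once the server socket is closed; the loop then ends.
                }
            }
        }).start();
    }

    private void handle(Socket client) {
        try (Socket socket = client) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String requestLine = in.readLine();
            // Skip the header lines; a blank line marks the end of the headers.
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // headers are ignored in this sketch
            }
            byte[] body = ("You asked for: " + requestLine).getBytes(StandardCharsets.UTF_8);
            String head = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: text/plain\r\n"
                    + "Content-Length: " + body.length + "\r\n"
                    + "Connection: close\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            out.write(body);
            out.flush();
        } catch (IOException e) {
            // One broken client connection should not bring down the server.
        }
    }

    public void stop() throws IOException {
        serverSocket.close();
    }

    public static void main(String[] args) throws IOException {
        SocketServerSketch server = new SocketServerSketch(8080);
        server.start();
        System.out.println("Listening on port " + server.port());
    }
}
```

With this running, pointing a browser at http://localhost:8080/ should show the request line the browser sent. Spawning a thread per client is the simplest way to serve several clients at once. A thread pool would probably be a better fit under heavy load, but that is beyond what I need right now.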

It's been pretty cool learning how to get programs talking over the web, and I am excited to make my own socket server and see it talk with my browser!
