Wednesday, February 17, 2010

Day 31

I am just going to give this post the title below, because I know the connotation it will inevitably evoke.
How the SocketServer services the Client's Socket
I apologize if you find this unnecessary or inappropriate, but I can only imagine what is going through Kichu's head while he reads this.

Creating a multi-threaded system is always interesting. It is extremely difficult to be certain none of your threads will collide. It is virtually impossible to be certain if you have no tests. There are always a multitude of unexpected errors, race conditions, and concurrent modification issues (yes yes, I know this is a Java error name) that can pop out and surprise you... or worse, lurk around until you're not paying attention and then bite you in the a**. This is why it is important to code in a consistently thread-safe manner: use locks and semaphores where you can, test lots of possible variations of how the code can be run, and run your tests several times in a row each time you make a threading change.

For my Socket Server I had to spawn a flurry of threads to run many simultaneous actions. So first, a brief overview of how the Socket Server works, and then I will explain how I remained thread safe.

Some application starts up the Socket Server, which creates a Server Socket to watch a specific port. Presumably this application would like to be able to accomplish other tasks while the Server Socket is monitoring its port, so the Server Socket needs to sit in its own thread, called the ServerSocketThread.

The Server Socket sits in a constant loop, checking whether any Client Sockets have tried to connect on the specified port, for as long as the Server Socket is left open. If a Client Socket attempts to connect on this port, the Server Socket needs to accept the client's request. To accept, the Server Socket must create a server-side socket to pair with the Client Socket, and then instruct this server-side socket, called the SocketServicer, to serve the Client Socket. The SocketServicer has to serve the Client Socket while the Server Socket is still monitoring the port, so a new thread, called a nobleServiceThread, must be created for every single Client Socket/SocketServicer pair.

Once the SocketServicer has finished serving the Client Socket, the nobleServiceThread automatically closes the connection and terminates itself.

If the Server wants to close one of its Server Sockets, the Server Socket has to make sure all of its SocketServicers and nobleServiceThreads are closed and terminated first. There must, therefore, be a list of all of the nobleServiceThreads, and the Server Socket must loop through this list and terminate each thread. Simply killing an active Socket, however, would mean suddenly cutting a client off in the middle of an interaction, and that is bad practice. So each nobleServiceThread must first be given a chance to let its Socket finish and close up: it gets a TimeOut Period, after which it is cut off regardless of status. Once cut off, the inactive thread must be removed from the list.
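The grace period described above maps naturally onto Thread.join with a timeout. Here is a minimal sketch of just that piece, assuming a made-up TIMEOUT_MS value and a made-up helper name (neither comes from my actual server):

```java
import java.util.concurrent.CountDownLatch;

class TimedJoinSketch {
    // Hypothetical grace period in milliseconds; the real timeout value is not shown in this post.
    static final long TIMEOUT_MS = 200;

    /** Waits up to TIMEOUT_MS for the worker, then reports whether it is still running. */
    static boolean stillAliveAfterTimedJoin(Thread worker) throws InterruptedException {
        worker.join(TIMEOUT_MS);  // returns when the worker dies OR when the timeout elapses
        return worker.isAlive();  // true means the grace period ran out first
    }

    public static void main(String[] args) throws InterruptedException {
        final CountDownLatch never = new CountDownLatch(1);
        Thread slow = new Thread(() -> {
            try {
                never.await();            // stands in for a client that never finishes
            } catch (InterruptedException e) {
                // cut off: fall through and let the thread end
            }
        });
        slow.start();

        System.out.println("alive after grace period: " + stillAliveAfterTimedJoin(slow));

        slow.interrupt();                 // the "cut off regardless of status" step
        slow.join();
        System.out.println("alive after cut off: " + slow.isAlive());
    }
}
```

Note that join(timeout) does not say which of the two things happened, which is why the sketch checks isAlive() afterwards.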

Ok, now down to business.

There are a few smaller and somewhat trivial threading issues, some of which involve testing, but I am just going to go over the biggest and most interesting one I encountered.

So we have this list of nobleServiceThreads, each running a Socket which can finish at any time. We also have a Server Socket which might want to close at any time, and which then has to close all of the nobleServiceThreads. Right away we can see the race condition. Can you guess what it is?
If, while we are looping through and closing the list of nobleServiceThreads, a Socket finishes its interactions and decides to close, its thread could remove itself from the list we are iterating through. This means we could be trying to remove a thread that is at the same time removing itself. Let's look at some code:

First, here is the Server Socket's thread/loop which watches a port and accepts incoming Client Sockets:
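private Thread makeSocketThread()
{
    return new Thread(new Runnable()
    {
        public void run()
        {
            // Keep accepting clients for as long as the Server Socket is open.
            while (serverSocketOpen)
            {
                runServerSocket();
            }
        }
    });
}

private void runServerSocket()
{
    try
    {
        // Blocks until a client connects.
        Socket clientSocket = serverSocket.accept();

        // Register the thread before starting it, so it is in the list when it later removes itself.
        Thread servicerThread = new Thread(new ServiceRunner(clientSocket));
        nobleServiceThreads.add(servicerThread);
        servicerThread.start();
    }
    catch (IOException e)
    {
        // accept() throws once the Server Socket is closed; the loop then re-checks serverSocketOpen.
    }
}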


We can see that as long as the Server Socket is open, it will continue to make new threads for incoming Clients.

Here is what the nobleServiceThread does:

private class ServiceRunner implements Runnable
{
    Socket clientSocket;

    public ServiceRunner(Socket clientSocket)
    {
        this.clientSocket = clientSocket;
    }

    public void run()
    {
        try
        {
            applicationServer.serve(clientSocket);
            clientSocket.close();
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        finally
        {
            lastBreath();
        }
    }

    private void lastBreath()
    {
        nobleServiceThreads.remove(Thread.currentThread());
    }
}


I call them nobleServiceThreads because if they are about to be killed, their very last wish is to remove themselves from the list such that they don't burden the ServerSocket with dead weight. How kind.

The first step toward preventing a thread from removing itself from the list while it is being removed elsewhere is to make the nobleServiceThreads list a synchronized list:

private List<Thread> nobleServiceThreads = Collections.synchronizedList(new ArrayList<Thread>());
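One caveat about the synchronized wrapper: it makes each individual call atomic, but it does not protect an iteration, which spans many calls. The sketch below (class and method names are made up for illustration) shows the fail-fast iterator complaining even with a single thread, simply because the list changes in the middle of a for-each loop:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

class SynchronizedListIterationSketch {
    /** Returns true if removing an element mid-iteration triggers a ConcurrentModificationException. */
    static boolean removalDuringIterationFails() {
        List<String> names = Collections.synchronizedList(new ArrayList<String>());
        names.add("a");
        names.add("b");
        names.add("c");
        try {
            for (String name : names) {
                names.remove(name);  // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;             // the iterator noticed the change on its next step
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + removalDuringIterationFails());  // prints "CME thrown: true"
    }
}
```

In the server the removal would come from a different thread instead of from inside the loop, but the iterator cannot tell the difference.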


Next, since we don't want a thread to be potentially modifying our list while we are iterating through it, we don't use an iterator to go through the list. Instead we just loop while the list isn't empty:

public void close() throws IOException, InterruptedException
{
    if (serverSocketOpen)
    {
        serverSocketOpen = false;
        while (nobleServiceThreads.size() > 0)
        {
            nobleServiceThreads.get(0).join();
            nobleServiceThreads.remove(0);
        }
        serverSocket.close();
    }
    else
        serverSocket.close();
}


Now this is pretty good, but we still have an issue, and this one is a little tougher to guess. Suppose a nobleServiceThread is still active when the while (nobleServiceThreads.size() > 0) condition is checked, and then, right before the very next line runs, it removes itself from the list. If it was the very last thread, the list is now empty and nobleServiceThreads.get(0) throws an IndexOutOfBoundsException. Even when other threads remain, the entry at index 0 may no longer be the thread we just looked at. This actually happened in about 1 out of 5 of my test runs.
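To make that interleaving concrete, here is a small sketch that replays it deterministically. The names are invented for the example, and the join() in the middle exists only to force the unlucky ordering that normally happens by chance:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckThenActSketch {
    /** Replays the unlucky interleaving; returns true if get(0) blew up on the emptied list. */
    static boolean replayRace() throws InterruptedException {
        final List<Thread> threads = Collections.synchronizedList(new ArrayList<Thread>());
        Thread worker = new Thread(() -> {
            threads.remove(Thread.currentThread());  // the worker's "last breath"
        });
        threads.add(worker);

        try {
            if (threads.size() > 0) {  // check: the list still has one entry
                worker.start();
                worker.join();          // force the self-removal to land between check and act
                threads.get(0);         // act: the list is empty by now
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("get(0) failed after a passing size() check: " + replayRace());
    }
}
```

Both size() and get(0) are individually synchronized here. The problem is the gap between them.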

This is a tough issue. For a while I was almost tempted to put in 100 if statements checking to see if the thread was still there before I removed it, but of course that wouldn't change much: each check could go stale before the next line ran. So instead, with Micah's help, here is the solution I've got:

public void close() throws IOException, InterruptedException
{
    if (serverSocketOpen)
    {
        serverSocketOpen = false;
        // Closing the Server Socket makes the blocked accept() throw, so the accept loop can exit.
        serverSocket.close();
        serverSocketThread.join();
        while (nobleServiceThreads.size() > 0)
        {
            Thread thread = null;
            synchronized (mutex)
            {
                if (nobleServiceThreads.size() > 0)
                {
                    thread = nobleServiceThreads.get(0);
                }
            }

            if (thread != null)
            {
                // Wait outside the mutex so the thread can still remove itself.
                thread.join(TIMEOUT_PERIOD);

                synchronized (mutex)
                {
                    if (nobleServiceThreads.contains(thread))
                    {
                        // Remove by reference rather than by index, so we can only remove the thread we joined.
                        nobleServiceThreads.remove(thread);
                    }
                }
            }
        }
    }
    else
        serverSocket.close();
}


The mutex is just a regular object that the synchronized block locks on. While one thread is inside a block synchronized on the mutex, any other thread that tries to enter a block synchronized on the same mutex has to wait. Here is how the nobleServiceThread changed:

private void lastBreath()
{
    synchronized (mutex)
    {
        nobleServiceThreads.remove(Thread.currentThread());
    }
}


This prevents the nobleServiceThread from removing itself from the list if the ServerSocket is looking at this same nobleServiceThread to remove it.
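To see the whole pattern in isolation, here is a stripped-down sketch with no sockets at all. The class, the spawn/closeAll names, and the sleep standing in for real work are all invented for the example; only the peek, join, remove shape mirrors the code above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MutexCloseSketch {
    private final Object mutex = new Object();
    private final List<Thread> workers = Collections.synchronizedList(new ArrayList<Thread>());

    /** Starts a worker that "works" for a while and then removes itself under the mutex. */
    void spawn(final long workMillis) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(workMillis);  // stand-in for serving a client
            } catch (InterruptedException e) {
                // fall through to the cleanup below
            } finally {
                synchronized (mutex) {
                    workers.remove(Thread.currentThread());
                }
            }
        });
        workers.add(t);  // register before starting, so the self-removal always finds the entry
        t.start();
    }

    /** Drains the list: peek under the mutex, join outside it, remove under the mutex. */
    void closeAll(long timeoutMillis) throws InterruptedException {
        while (true) {
            Thread next;
            synchronized (mutex) {
                if (workers.isEmpty()) {
                    return;
                }
                next = workers.get(0);
            }
            // Never hold the mutex while waiting: the worker needs it for its own removal.
            next.join(timeoutMillis);
            synchronized (mutex) {
                workers.remove(next);  // a no-op if the worker already removed itself
            }
        }
    }

    int remaining() {
        return workers.size();
    }

    public static void main(String[] args) throws InterruptedException {
        MutexCloseSketch sketch = new MutexCloseSketch();
        for (int i = 0; i < 20; i++) {
            sketch.spawn(i % 5);
        }
        sketch.closeAll(1000);
        System.out.println("workers left: " + sketch.remaining());
    }
}
```

The design point worth noticing is that the join happens between the two synchronized blocks. If it were inside one, a worker trying to take its last breath would wait for the mutex while the closer waited for the worker.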


Now you might ask, "Since the nobleServiceThread's lastBreath is called in the finally{ ... } clause (which will execute in the end, no matter what, when you try to kill the thread), why not just call interrupt on the thread as soon as you've got it?"

This is because the applicationServer.serve() method might be talking to a client. If it is waiting on a response from the client, and is therefore blocked in a read, then the interrupt won't take effect until the read returns. The thread could potentially stay blocked forever while it waits for a response.
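One way to get a thread out of a blocked read is to close the socket it is reading from, since the Socket.close() documentation says a thread blocked in I/O on that socket will get a SocketException. This is not what my server does above; it is only a sketch of that option, using a loopback connection and names made up for the example:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseUnblocksReadSketch {
    /** Returns true if closing the socket made the blocked reader fail with an IOException. */
    static boolean closeUnblocksReader() throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket serverSide = server.accept()) {

            final boolean[] sawIOException = {false};
            Thread reader = new Thread(() -> {
                try {
                    serverSide.getInputStream().read();  // blocks: the client never writes
                } catch (IOException e) {
                    sawIOException[0] = true;            // raised because the socket was closed
                }
            });
            reader.start();

            Thread.sleep(100);   // give the reader a moment to block (not required for correctness)
            serverSide.close();  // pulls the socket out from under the blocked read
            reader.join();       // the reader ends promptly instead of waiting forever
            return sawIOException[0];
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("reader unblocked by close: " + closeUnblocksReader());
    }
}
```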

The interrupt only takes immediate effect in a few situations, most notably when the thread is blocked in wait(), join(), or sleep(), each of which responds by throwing an InterruptedException. (In case you didn't know, a thread is in wait() while it is waiting on an object's monitor for another thread to call notify(), and in join() while it is waiting for another thread to terminate.) If you interrupt a thread while it is busy with some task, only its interrupt flag is set. The next time the thread calls one of those blocking methods it gets the InterruptedException; otherwise it has to check the flag itself. Either way, the JVM never terminates the thread for you. The thread has to notice the interrupt and return from run() on its own.
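Here is a small sketch of those two cases side by side, with hypothetical names: a thread blocked in sleep() that gets the exception, and a busy thread that only notices the interrupt because it polls its own flag.

```java
class InterruptSketch {
    /** Interrupting a thread blocked in sleep(): sleep() throws InterruptedException. */
    static boolean sleepingThreadSeesException() throws InterruptedException {
        final boolean[] caught = {false};
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                caught[0] = true;  // the thread decides what to do next; here it just ends
            }
        });
        sleeper.start();
        sleeper.interrupt();  // if this lands before sleep() starts, sleep() throws right away
        sleeper.join();
        return caught[0];
    }

    /** Interrupting a busy thread only sets a flag; the thread must poll it to react. */
    static boolean busyThreadSeesFlag() throws InterruptedException {
        final boolean[] sawFlag = {false};
        Thread busy = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                // simulated CPU-bound work; nothing here would ever throw InterruptedException
            }
            sawFlag[0] = true;
        });
        busy.start();
        busy.interrupt();
        busy.join();
        return sawFlag[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sleeper got InterruptedException: " + sleepingThreadSeesException());
        System.out.println("busy thread saw its flag: " + busyThreadSeesFlag());
    }
}
```

In both cases the thread ends only because its own code chose to return after noticing the interrupt.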

I must give credit to the Software Craftsmanship Articles 8-11 written by Uncle Bob, since they walk through many of the same steps I list here.
