A distributed simulation game

A description of the design and implementation of a distributed simulation game, running on a variety of local area networks (Novell NetWare, Banyan VINES, NetBIOS). It uses an object-oriented design with separate and distributed publishers and subscribers.

Introduction

Imagine a computer science lab with some 50 students, intently peering at their monitors and engaged in a red-hot debate concerning investments, price settings, and wages, all the time pointing to charts that show how well (or how badly) their company fares in the simulated economy that comprises their companies and a somewhat lax government. This is the economic simulation game in full action.

The economic simulation game is a distributed application that my company developed during the past two years for a Dutch university. Its ultimate object is to teach students the causes and effects of both micro- and macro-economic decisions; its means is a Windows-hosted set of applications that are interconnected through a LAN (Novell NetWare, Banyan VINES, NetBIOS, and a shared-memory network simulation are supported) and that together form a simulated national economy, consisting of some 30-50 companies, a government, and a banking sector.

The game is played in rounds that last anywhere between 1 and 30000 seconds (the former being primarily useful for quick simulation runs, the latter not being useful at all), although a round normally lasts 60-120 seconds. During each round, each company analyzes its position by means of its balance sheet, profit and loss statements, production and inventory data, and spreadsheets and charts that allow display and analysis of the corresponding historical data. Based on this information, its ‘board of directors’ decides on its prices, investments, and so on. At the end of each round, the then-current individual prices, investments, and labor demands are fed into an economic model developed at the same university, and actual sales, labor allocations, and order stocks are distributed over the companies. The process then continues for the next round, normally for 2-3 hours in total (approximately 120 rounds).

Overview of the article

In this article, we’ll discuss the (distributed, object-oriented - phew) design of the game software. After that, we’ll get into the way that the software abstracts the network communications in order to allow it to run on top of several different protocols. Finally, we’ll show a nice and clean way to dynamically load new C++ classes from a DLL.

Overall software design

The game software uses an object-oriented model (implemented in C++) of the game, in which companies, the national economy, and also the game controller itself are all objects. Each object is responsible for the part of the economic model that it represents, which implies that companies keep track of their own inventory, production, balance sheet, etc., while the national economy consolidates summary data from individual companies, and adds its own monetary policies and market distribution mechanisms. The game controller object is responsible for overall control of the game, such as pacing, but also for the login/logout process that participants must go through. To make this happen, a lot of communication is required.

The basic communication pattern between the national economy object on the one hand, and the individual companies on the other, is shown in Figure 1. In this figure, time runs from left to right. At the end of each round, the economy object uses up-to-date information from the company objects to perform the first series of calculations (which involve allocation of sales and labor, among others), and relays this information back to the companies. On receipt of this information, each company finalizes the then expired period, updates its balance sheet, inventory, and so forth, and prepares for the next round. Meanwhile, the economy waits until all companies have finalized the period, after which it proceeds to perform its finalization.

Figure 1: Communication between national economy and company objects

Publishers and subscribers

The pattern shown in Figure 1 is appropriate for objects that live on a single computer, but less so for a truly distributed system. In the latter case, network communications necessitate a more elaborate approach to allow communication between objects living on different machines (or in general, in different address spaces). In the game software, as in many other distributed applications, this is solved by using a form of proxies. Each company object becomes a remote publisher of information, while the economy object communicates with local subscribers to that information. This modified communication pattern is shown in Figure 2, where it should be understood that the economy object and the company subscribers live on one computer, while the company publishers live on several different other computers.

Figure 2: Communication between economy and companies in a distributed environment

In the interest of efficiency, company subscribers cache the most important information from their publishers, which is updated as necessary by those publishers. In the simulation game, this means that information pertaining to the current round is frequently updated as players take decisions regarding their company’s policy, while at the end of a round, the information is frozen and added to the historical data that both publishers and subscribers maintain.

Figure 3: Communication channel between a publisher and its subscribers

Publishers and subscribers are not restricted to a 1-to-1 relation; in general, a publisher can have any number of subscribers. The underlying model is shown in Figure 3. In this model, a publisher owns a publication channel to which subscribers may connect. The channel has a bus-like structure, which allows the publisher to send updates to its subscribers by means of a broadcast (or more precisely, a multicast). This is not the only communication pattern, however. Individual subscribers may also request specific information from their publisher (as in a classical client/server relationship), and conversely, a publisher may request information from one of its subscribers as well. In this situation, the subscriber under consideration acts as an ‘author’ of information. In the game software, this is primarily used when a previous game is being reopened from the central game repository: the company subscribers (acting as authors) that are located on the economy’s computer are used to initialize their publishers to the state where the game was previously left off. Other applications are feasible with this approach, for example a kind of ‘blackboard’ architecture where several parties contribute to a set of common knowledge that is maintained by the publisher.
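
To make the channel model more concrete, here is a minimal in-process sketch of a publisher that multicasts updates to caching subscribers and also answers individual requests. It is an illustration only: the names Publisher, Subscriber, Publish, Lookup, and Request are hypothetical, the "channel" is a plain list of pointers rather than a network bus, and the data is reduced to named numbers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a subscriber: it caches whatever its publisher sends.
class Subscriber {
public:
    // Called when the publisher multicasts an update on the channel.
    void OnUpdate(const std::string& key, double value) { mCache[key] = value; }

    // Reads from the local cache only; returns false if the key was never received.
    bool Lookup(const std::string& key, double& value) const {
        auto it = mCache.find(key);
        if (it == mCache.end()) return false;
        value = it->second;
        return true;
    }

private:
    std::map<std::string, double> mCache;
};

// Hypothetical stand-in for a publisher that owns one publication channel.
class Publisher {
public:
    // A subscriber connects to the channel (assumption: pointers stay valid).
    void Connect(Subscriber* s) { mSubscribers.push_back(s); }

    // Store the new value and multicast it to every connected subscriber.
    void Publish(const std::string& key, double value) {
        mData[key] = value;
        for (Subscriber* s : mSubscribers) s->OnUpdate(key, value);
    }

    // Client/server style request for a single item, bypassing any cache.
    bool Request(const std::string& key, double& value) const {
        auto it = mData.find(key);
        if (it == mData.end()) return false;
        value = it->second;
        return true;
    }

private:
    std::map<std::string, double> mData;
    std::vector<Subscriber*> mSubscribers;
};
```

In this sketch a subscriber answers most queries from its cache and would fall back to Request() only for data that is not broadcast routinely.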

As it turns out, the publisher/subscriber approach is not only useful for the distribution of company information. Each participating computer also displays a summary of the national economic data and, of course, the state of the game at large (current round, progress in the current round, etc.). As with the company information, the distribution of economic and game information is accomplished by the publisher/subscriber mechanism. The national economy object is turned into an economy publisher, all other computers are equipped with economy subscribers, et voilà! The same applies to the game controller, whose subscribers display the simulated game date, time, and news bulletin messages broadcast by the ‘government’. To keep the information from different sources separated, we have as many different publishing channels as there are publishers; in the case of the game software this amounts to n + 2, where n is the number of companies and the two additional channels are for the economy and the game controller.

Figure 4: Distribution of publishers and subscribers over participating computers

All in all, what we have on the various participating computers is depicted in Figure 4. In this figure, we have shown a hypothetical situation with four computers involved. The topmost computer is the one operated by the instructor. It contains publishers for game and economic information, and subscribers for all companies. Below it are two computers of teams that participate in the game; each one has a subscriber for the game information and one for the economic data. In addition, they both house a publisher for their respective companies. Finally, at the bottom of the figure is a computer used by an outside observer, who subscribes to the game, the economic data, and all companies, but publishes nothing of itself.

Publisher and subscriber classes

To capture the commonality of the publisher/subscriber pattern, we have developed a small hierarchy of classes, shown in Figure 5. The top-level class is called Actor and captures the common aspects of both publishers and subscribers. Among other things, this is the ability to communicate across the network through the use of a Port. A Port object represents an endpoint in a network connection, similar to like-named entities in network protocols (sometimes called sockets).

Figure 5: Hierarchy of publisher and subscriber classes

Derived from the Actor base class are the actual Publisher and Subscriber classes. Their different purposes in life are reflected in their interfaces: a Subscriber object is aware of the network address of its Publisher, while the converse is not true.

A representative part of the C++ declarations of these classes is given in Listing 1. Starting with class cActor, you will find that it contains a cPort object that takes care of the actual network communications (discussed later on), a number of functions relating to this port and the associated communication channel, and some functions for datagram management. This last set of functions is in fact central to the purpose of the cActor hierarchy, which is to communicate across the network. At this level, communication is performed by means of a datagram abstraction (not shown here). The datagram abstraction contains fields to indicate the message type (e.g. information update, shutdown notification), its class (e.g. request/response, broadcast, acknowledgement), some routing information, and of course the message contents itself. Internally, datagrams are kept in a pre-allocated pool, which is why cActors and their derivations must explicitly acquire and release the cDatagram instances.

Dispatch of incoming datagrams in the cActor hierarchy is done through message maps similar to the message maps in Borland’s OWL or Microsoft’s MFC frameworks. The cActor class contains the dispatching mechanism proper, as well as a default message map that deals with a few predefined message types (such as diagnostics). Derived classes can add their own message handling or override existing mappings by including message maps in their own class declarations. Messages that do not appear in any map are handled by cActor::UnknownMessage(); by default, this function simply releases the datagram (in the debug version, it also produces a trace message to that effect). A similar function, cActor::IgnoreMessage() can be used to explicitly indicate that the message type is known, but needs no further handling. It too will release the datagram to its pool.
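
For readers unfamiliar with message maps, the following sketch shows one way such a table-driven dispatcher can be built. It is not the macro-based implementation used in the game; the tables are written out by hand, and all names (Actor, Subscriber, MapEntry, MessageMap, the MSG_ constants) are simplified, hypothetical stand-ins. The point it illustrates is the lookup order: the most derived map is searched first, then the base maps, and anything left over goes to UnknownMessage().

```cpp
#include <cassert>
#include <cstddef>

enum { MSG_DIAGNOSTICS = 1, MSG_CHANNEL_DOWN = 2, MSG_UPDATE = 3 };  // hypothetical types

struct Datagram { int type; };

class Actor;
typedef void (Actor::*Handler)(Datagram*);
struct MapEntry   { int type; Handler handler; };
struct MessageMap { const MessageMap* base; const MapEntry* entries; std::size_t count; };

class Actor {
public:
    Actor() : diagnostics(0), unknown(0) {}
    virtual ~Actor() {}

    // Search the most derived map first, then each base map; the first match wins.
    bool DispatchMessage(Datagram* d) {
        for (const MessageMap* m = GetMessageMap(); m != nullptr; m = m->base) {
            for (std::size_t i = 0; i < m->count; ++i) {
                if (m->entries[i].type == d->type) {
                    (this->*(m->entries[i].handler))(d);
                    return true;
                }
            }
        }
        UnknownMessage(d);
        return false;
    }

    int diagnostics;  // counters so the effect of a dispatch can be observed
    int unknown;

protected:
    virtual const MessageMap* GetMessageMap() const { return &sMap; }
    virtual void UnknownMessage(Datagram*) { ++unknown; }
    void OnDiagnostics(Datagram*) { ++diagnostics; }

    static const MapEntry   sEntries[1];
    static const MessageMap sMap;
};

const MapEntry   Actor::sEntries[1] = { { MSG_DIAGNOSTICS, &Actor::OnDiagnostics } };
const MessageMap Actor::sMap = { nullptr, Actor::sEntries, 1 };

class Subscriber : public Actor {
public:
    Subscriber() : channelDown(false) {}
    bool channelDown;

protected:
    virtual const MessageMap* GetMessageMap() const { return &sMap; }
    void OnChannelDown(Datagram*) { channelDown = true; }

    static const MapEntry   sEntries[1];
    static const MessageMap sMap;
};

// Derived handlers are stored as base-class member pointers, as in MFC-style maps.
const MapEntry Subscriber::sEntries[1] = {
    { MSG_CHANNEL_DOWN, static_cast<Handler>(&Subscriber::OnChannelDown) }
};
const MessageMap Subscriber::sMap = { &Actor::sMap, Subscriber::sEntries, 1 };
```

Frameworks normally hide the table declarations and definitions behind macros; the chained tables above are what such macros would typically expand to.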

The derived classes cPublisher and cSubscriber specialize the default behavior by adding support for the maintenance of a publication channel, and the subscription to it, respectively. In particular, functions for the orderly shutdown of a channel are implemented for both parties. Only the cSubscriber class needs a message handler to cover this possibility.

As a final remark, you may have noticed that the actual allocation of publication channels is not covered by one of the cActor-derived classes; instead, this responsibility is delegated to (a helper class of) the game controller which oversees channel management in general and is not part of the cActor hierarchy.

Combining actor classes with domain classes

The cActor hierarchy alone knows nothing about companies, the economy, or whatever domain-specific kind of information. To create actual company (and economy, and game) publishers and subscribers, we must combine the cActor functionality with domain-specific classes. We shall use the company classes as an example; the other domain classes are dealt with in much the same way.

During the design of the simulation game software, it proved useful to introduce two basic company classes, cCompany and cCompanyPlayer (see Figure 6). Class cCompany represents the basic structure of a company, including its historical data. Class cCompanyPlayer extends this functionality by adding detailed information about the machine inventory and the company’s policy, both of which are only available to the companies themselves, not to outside observers. Class cCompany is derived from a document-type base class provided by the application framework that we used to interface to the Windows environment; we used Borland’s OWL, but other frameworks with similar doc/view architectures would also be usable.

To this basic company functionality, the publisher and subscriber functionality was added by using the cPublisher and cSubscriber classes, respectively, as mix-in classes in a multiple inheritance setting. Since there was no chance that both of them would appear as a base class for a further derived class, their common ancestor cActor didn’t need to be virtual. The resulting classes are cCompanyPublisher (a player that is also active as a publisher), and cCompanySubscriber (an observer that tracks updates from its publisher). Each derived class has its own message map, which takes care of properly dispatching initialization and update broadcasts (in case of the subscriber), and information requests (in case of the publisher). In all, 14 different message types are currently defined for the communication across a company publication channel. A similar number applies to both the economy and the game control channels.
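
The mix-in arrangement can be sketched as follows. The classes are stripped-down, hypothetical stand-ins (no message maps, no document base class); the sketch only shows how a domain class and an actor class are combined through multiple inheritance, and why a non-virtual Actor base suffices as long as no class inherits from both Publisher and Subscriber.

```cpp
#include <cassert>
#include <string>

// Simplified actor hierarchy (stand-in for the network-aware classes).
class Actor {
public:
    virtual ~Actor() {}
    virtual bool IsPublisher() const = 0;
};

class Publisher : public Actor {
public:
    virtual bool IsPublisher() const { return true; }
};

class Subscriber : public Actor {
public:
    virtual bool IsPublisher() const { return false; }
};

// Simplified domain hierarchy (stand-in for the company classes).
class Company {
public:
    explicit Company(const std::string& name) : mName(name) {}
    virtual ~Company() {}
    const std::string& Name() const { return mName; }

private:
    std::string mName;
};

class CompanyPlayer : public Company {
public:
    explicit CompanyPlayer(const std::string& name) : Company(name), mPrice(0.0) {}
    void   SetPrice(double p) { mPrice = p; }   // policy data known only to the player
    double Price() const { return mPrice; }

private:
    double mPrice;
};

// Mix-in combinations: each has exactly one Actor subobject, so Actor
// does not need to be a virtual base class.
class CompanyPublisher : public CompanyPlayer, public Publisher {
public:
    explicit CompanyPublisher(const std::string& name) : CompanyPlayer(name) {}
};

class CompanySubscriber : public Company, public Subscriber {
public:
    explicit CompanySubscriber(const std::string& name) : Company(name) {}
};
```

Code that deals with the network can treat either combined class as an Actor, while code that deals with the economics sees only a Company.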

Figure 6: Company class hierarchy

Network communications classes

So far, the actual network communications have only received passing mention. Let’s turn to this part of the system now.

As far as actors are concerned, a Port object is all they need to know about the network. Behind the scenes, however, a lot goes on to make this abstraction work. For one thing, the simulation game has to run on top of several different LAN types (Novell NetWare, Banyan VINES, and generic NetBIOS were the targets, with AppleTalk also being used occasionally) and several different operating systems, although we will restrict ourselves to 16-bit Windows here. The design and most of the implementation of the network layer nevertheless carry over to most of these platforms, provided of course that the protocol is available there. Furthermore, the network protocol is freely selectable (given the presence of the corresponding network) and new protocols may even be added at runtime. How did we accomplish this?

Given our object-oriented inclination, abstraction and encapsulation in a number of classes are once again the key. In effect, we designed our own transport layer that operates in terms of an abstract network protocol. For each of the network protocols that must be supported, we provided a concrete implementation of the abstract protocol in terms of the API of the protocol under consideration. So, for Novell NetWare we used IPX, for Banyan VINES we used IPC, and for both NetBIOS and AppleTalk we used their datagram services. We have not yet implemented a TCP/IP version, but that one would use UDP. Before we get entangled in the details, consider Figure 7 for an overview of the relationships among the classes. The corresponding class declarations may be found in Listing 2.

Figure 7: Class diagram for the network communications

Class cTransportManager takes responsibility for the overall communication management. It exposes its services to the actors by means of the intermediate cPort objects that we first encountered in the cActor hierarchy. As shown in Listing 2, the cPort class offers to its clients the ability to send datagram messages in several ways. Conversely, when a datagram is received, the cPort object will call back its cActor object and let it dispatch the datagram as dictated by the actor’s message maps. Class cTransportManager does not create or destroy cPort objects, since they are normally assumed to be part of other objects, but does provide the means to connect and disconnect them to the network as appropriate. Furthermore, the cPort objects can use the cTransportManager::Send...() functions to forward the datagrams that are submitted by their own clients.
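
The interplay between ports, the transport manager, and actors can be illustrated with a small sketch. It rests on assumptions that go beyond the text: each node has at most one port per channel, datagrams carry their channel number, and the names used (Port, TransportManager, RecordingActor, Dispatch) are hypothetical simplifications rather than the game's actual classes.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Datagram { unsigned channel; std::string payload; };

// Whatever owns a port must be able to dispatch an incoming datagram.
class Actor {
public:
    virtual ~Actor() {}
    virtual void Dispatch(const Datagram& d) = 0;
};

class Port {
public:
    explicit Port(Actor* owner) : mActor(owner), mChannel(0), mConnected(false) {}
    bool     IsConnected() const { return mConnected; }
    unsigned ChannelNo() const { return mChannel; }

private:
    friend class TransportManager;
    // Call back the owning actor; only the transport manager may deliver.
    void Receive(const Datagram& d) { mActor->Dispatch(d); }

    Actor*   mActor;
    unsigned mChannel;
    bool     mConnected;
};

class TransportManager {
public:
    // Attach a port to a channel; fails if the channel is already taken on this node.
    bool ConnectPortAt(Port* port, unsigned channel) {
        if (mPorts.count(channel) != 0) return false;
        mPorts[channel] = port;
        port->mChannel = channel;
        port->mConnected = true;
        return true;
    }

    void DisconnectPort(Port* port) {
        if (!port->mConnected) return;
        mPorts.erase(port->mChannel);
        port->mConnected = false;
    }

    // Route an incoming datagram to the port on its channel, if there is one.
    bool ReceiveDatagram(const Datagram& d) {
        auto it = mPorts.find(d.channel);
        if (it == mPorts.end()) return false;
        it->second->Receive(d);
        return true;
    }

private:
    std::map<unsigned, Port*> mPorts;  // ports are owned by their actors, not by us
};

// Trivial actor that remembers the last payload it was handed.
class RecordingActor : public Actor {
public:
    virtual void Dispatch(const Datagram& d) { last = d.payload; }
    std::string last;
};
```

Note that the manager stores plain pointers: as in the text, it connects and disconnects ports but never creates or destroys them.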

On the network side, cTransportManager uses the abstract interface of class cNetProtocol to get the datagrams from and to the port objects across the physical network. Class cNetProtocol is responsible for implementing some basic network services that are present in all network protocols considered. In the concrete derivations of the abstract cNetProtocol class, functions such as cNetProtocol::SendBroadcast() and cNetProtocol::SendMessage() map almost directly onto the corresponding protocol API functions mentioned above. As an additional feature, we have also implemented a shared-memory network simulation, which allows us to test the network classes on a single computer. Originally, we did this for testing only, but this pseudo-network protocol turned out to be quite useful for stand-alone demonstrations of the simulation game as well and is now a standard part of the software distribution.
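
As an illustration of the idea, the sketch below pairs a cut-down abstract protocol interface with a loopback implementation in which every "node" is just a mailbox in the same process. This is only loosely modeled on the shared-memory simulation: the class names, the use of integer node numbers as addresses, and the Announce() helper are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Datagram { int channel; std::string payload; };

// Cut-down abstract protocol: the two unreliable services every LAN offers.
class NetProtocol {
public:
    virtual ~NetProtocol() {}
    virtual const char* ProtocolName() const = 0;
    virtual bool SendBroadcast(const Datagram& d) = 0;
    virtual bool SendMessage(const Datagram& d, std::size_t node) = 0;
};

// Loopback pseudo-network: each node is an in-memory mailbox.
class LoopbackProtocol : public NetProtocol {
public:
    explicit LoopbackProtocol(std::size_t nodes) : mInbox(nodes) {}

    virtual const char* ProtocolName() const { return "loopback"; }

    // Deliver a copy to every node, including the sender.
    virtual bool SendBroadcast(const Datagram& d) {
        for (std::vector<Datagram>& box : mInbox) box.push_back(d);
        return true;
    }

    // Deliver to one node; an unknown node counts as a failed transmission.
    virtual bool SendMessage(const Datagram& d, std::size_t node) {
        if (node >= mInbox.size()) return false;
        mInbox[node].push_back(d);
        return true;
    }

    const std::vector<Datagram>& Inbox(std::size_t node) const { return mInbox.at(node); }

private:
    std::vector<std::vector<Datagram> > mInbox;
};

// Code written against the abstract interface works with any protocol.
bool Announce(NetProtocol& net, int channel, const std::string& text) {
    Datagram d;
    d.channel = channel;
    d.payload = text;
    return net.SendBroadcast(d);
}
```

Because Announce() sees only NetProtocol, the same calling code could be pointed at an IPX or NetBIOS implementation without change.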

The final two classes in Figure 7 are cDatagram and cDatagramPool. Class cDatagram represents the actual datagram, as mentioned earlier; class cDatagramPool assists cTransportManager in the maintenance of a pool of these objects. There are several reasons for this pool. To start with, datagram buffers must normally be present also during (network) interrupts, since several of the network protocols use some kind of event service routine on receiving or transmitting a datagram. In the Windows 16-bit environment, this implies that those buffers must be page-locked. Since we need them often and without delay, it makes sense to preallocate an ample number of them and let them be managed by a separate class. Instead of new and delete, we then use an acquire/release protocol to manipulate datagram buffers. (We could have overloaded operators new and delete for the purpose, but first of all, they were already overloaded to allocate page-locked memory chunks, and second, we didn’t want calls to constructors and destructors all the time.)
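
A pool with an acquire/release protocol can be sketched in a few lines. The version below is a portable illustration, not the page-locked implementation described above: it assumes a fixed 512-byte buffer, ordinary heap storage, and hypothetical names (Datagram, DatagramPool, Acquire, Release, Available).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Datagram {
    unsigned char data[512];   // assumed fixed buffer size for the sketch
    std::size_t   length;
};

class DatagramPool {
public:
    // All buffers are allocated once, up front; nothing is allocated afterwards.
    explicit DatagramPool(std::size_t count) : mStorage(count) {
        mFree.reserve(count);
        for (std::size_t i = 0; i < count; ++i) mFree.push_back(&mStorage[i]);
    }

    DatagramPool(const DatagramPool&) = delete;             // pointers into mStorage
    DatagramPool& operator=(const DatagramPool&) = delete;  // must not be copied

    // Hand out a free buffer, or nullptr if the pool is exhausted.
    Datagram* Acquire() {
        if (mFree.empty()) return nullptr;
        Datagram* d = mFree.back();
        mFree.pop_back();
        d->length = 0;
        return d;
    }

    // Return a buffer to the pool; no constructor or destructor runs.
    void Release(Datagram* d) {
        if (d != nullptr) mFree.push_back(d);
    }

    std::size_t Available() const { return mFree.size(); }

private:
    std::vector<Datagram>  mStorage;  // never resized after construction
    std::vector<Datagram*> mFree;     // stack of currently unused buffers
};
```

An exhausted pool returns a null pointer instead of allocating more memory, so callers have to be prepared to retry or drop the datagram.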

Implementation of network services

The trio cPort-cTransportManager-cNetProtocol together offers the following classes of datagram transmission:

  • Multicast to all ports connected to a given channel. This is used by publishers to announce updates etc. to all their subscribers at once.
  • Point-to-point request and reply with guaranteed delivery, used by subscribers and publishers in a client/server fashion (where in some cases the publisher itself assumes the role of a client to an authoring subscriber).
  • Point-to-point informational message without reply. This is used in particular during the shutdown of a node to announce its demise to the game control publisher, and for acknowledgement messages (see below).

The network protocol class cNetProtocol only needs to provide two services: broadcast (or multicast) and point-to-point, both of which may be unreliable. Class cTransportManager improves upon this basic quality of service by maintaining queues of pending requests (to retransmit the request if no reply is received) and of recent replies (to respond to re-requests whose replies were accidentally lost). A third queue holds pending transmissions in general, since some network protocols cannot handle more than a few (perhaps 10) datagram submissions at a time, whereas in a heavily loaded game there may be bursts of a few hundred transmissions within a few seconds. The pending transmission queue allows the cTransportManager class to adjust its outgoing pace to the capabilities of the underlying protocol.
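
The pacing role of the pending transmission queue can be sketched as follows. The sketch assumes that the protocol reports completion of each submission through some callback (modeled here as OnSendComplete()) and that datagrams can be represented by integer identifiers; SendQueue and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

class SendQueue {
public:
    // maxInFlight: how many submissions the underlying protocol accepts at once.
    explicit SendQueue(std::size_t maxInFlight)
        : mMaxInFlight(maxInFlight), mInFlight(0) {}

    // Queue a datagram (represented by its id) for transmission.
    void Submit(int datagramId) { mPending.push_back(datagramId); }

    // Hand over as many queued datagrams as the protocol can currently take.
    // 'post' is called once per datagram; returns the number handed over.
    template <class PostFn>
    std::size_t PostSends(PostFn post) {
        std::size_t posted = 0;
        while (mInFlight < mMaxInFlight && !mPending.empty()) {
            post(mPending.front());
            mPending.pop_front();
            ++mInFlight;
            ++posted;
        }
        return posted;
    }

    // The protocol signals that one submission has completed.
    void OnSendComplete() {
        if (mInFlight > 0) --mInFlight;
    }

    std::size_t Pending() const  { return mPending.size(); }
    std::size_t InFlight() const { return mInFlight; }

private:
    std::size_t     mMaxInFlight;
    std::size_t     mInFlight;
    std::deque<int> mPending;   // FIFO, so datagrams leave in submission order
};
```

A burst of submissions simply lengthens the queue; the protocol itself never sees more than maxInFlight outstanding datagrams.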

The request/reply protocol used is a straightforward implementation of a ‘request/reply with acknowledgement’ algorithm with retransmission after a time-out. Textbooks such as [Coulouris et al. 1994] describe this and other algorithms in detail. The idea is to attach a unique identifier and an expiration field to each request, keep transmitted requests around until the corresponding reply is received, and retransmit the request if a time-out period expires without a reply being received. This may be repeated for several expiration periods, after which the other party is deemed unreachable and an error indication (instead of a reply) is returned to the original submitter of the request. If a reply is received, however, the request is satisfied. In that case, an acknowledgement is sent to the replying party, which allows it to release any resources it might hold to cater for retransmissions of the reply. In practice, the length of the time-out period, the maximum number of retries, and the expiration time of replies (in case acknowledgements are lost) are subject to the quality of the underlying network, the overall network load, the desired response times, and the chance one is willing to take to falsely declare a node as being unreachable. In our implementation, these are all parameters that may be preset and that to some extent will adapt dynamically to the network conditions.
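
The bookkeeping on the requesting side can be sketched like this. It is a simplified illustration under several assumptions: time is an abstract tick counter supplied by the caller, the time-out and retry limit are fixed rather than adaptive, and the actual retransmission and acknowledgement are left to the caller. RequestTracker and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct PendingRequest {
    int  retriesLeft;  // retransmissions still allowed
    long deadline;     // tick at which the current attempt times out
};

class RequestTracker {
public:
    RequestTracker(long timeout, int maxRetries)
        : mTimeout(timeout), mMaxRetries(maxRetries), mNextTid(1) {}

    // Register a request transmitted at tick 'now'; returns its unique id.
    int Send(long now) {
        PendingRequest p;
        p.retriesLeft = mMaxRetries;
        p.deadline = now + mTimeout;
        int tid = mNextTid++;
        mPending[tid] = p;
        return tid;
    }

    // A reply arrived. Returns true if it satisfies a pending request (the
    // caller should then send an acknowledgement); duplicates return false.
    bool OnReply(int tid) { return mPending.erase(tid) == 1; }

    // Periodic check: collects ids to retransmit and ids that have failed.
    void CheckTransactions(long now, std::vector<int>& retransmit,
                           std::vector<int>& failed) {
        for (auto it = mPending.begin(); it != mPending.end(); ) {
            if (now < it->second.deadline) { ++it; continue; }
            if (it->second.retriesLeft > 0) {
                --it->second.retriesLeft;
                it->second.deadline = now + mTimeout;
                retransmit.push_back(it->first);
                ++it;
            } else {
                failed.push_back(it->first);   // peer is deemed unreachable
                it = mPending.erase(it);
            }
        }
    }

    std::size_t Outstanding() const { return mPending.size(); }

private:
    long mTimeout;
    int  mMaxRetries;
    int  mNextTid;
    std::map<int, PendingRequest> mPending;
};
```

With a retry limit of 2, a request is transmitted at most three times before the error indication is produced.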

Dynamically loading new classes

One final point still needs clarification. How can we load new network protocol classes at runtime, without them being linked into the code? The answer is simple if you take it logically: place the class code in a DLL. Doesn’t that require a lot of GetProcAddress() calls to obscure (i.e., mangled) member function names? And for that matter, doesn’t that require knowledge of the class name in the first place? Or do we need to make sure that those member functions are always exported at the same ordinal entry points?

Well, actually it’s much simpler. Remember virtual functions? If you do, you must be aware that they are called through a vtable, which is a glorified jump table. Suppose that we knew the address of that jump table, and knew that index #3 would point to one function, #4 to another, and so on... Is it dawning yet? Get this (read slowly): if we implement class cTransportManager in terms of the (virtual) interface of class cNetProtocol, put them both in the application’s executable, then at runtime somehow provide a pointer to a cNetProtocol-derived class in a DLL, we have (a) the pointer to the jump table (which is the vtable, whose address can be found at some offset from where the object’s ‘this’ pointer points to) and (b) the indices of the various functions in that table, since the C++ compiler courteously translates calls to virtual functions to look-up operations in that same jump table.

Too good to be true? Nope. It works like a charm. There is no need to list the member functions in the protocol DLL’s export table. Remember, though, that you must still compile those member functions as exportable, and that the class must be compiled as huge to get full-size vtable pointers and contents. The functions will be called in a situation where DS != SS and that sort of thing, even though you never linked to them or loaded their addresses in any obvious way. Small point: how do we get that pointer to the cNetProtocol-derived object in the first place? O.K., we confess, we do need a conventionally exported function for that. But only one. In our case, we called that function CreateProtocol() and demanded that it take no parameters and return a pointer to cNetProtocol (which at runtime points to an object of a derived class), and that’s it. When we load a DLL for a network protocol, we GetProcAddress() its CreateProtocol() function, call it, and if it returns a nonzero value, we have our network protocol. By virtue of the protocol’s virtual destructor, we don’t even need any further assistance to get rid of it when we’re done. Finally, by placing a list of protocol descriptions, with the names of the corresponding DLLs, in the application’s .INI file, we can add and remove protocols at runtime.
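
The factory technique can be sketched in portable C++ as follows. To keep the example self-contained, the protocol class and its CreateProtocol() function live in the same program, and the function pointer is passed in directly; in a real Windows build that pointer would come from LoadLibrary() and GetProcAddress() on the protocol DLL. The names NetProtocol, DemoProtocol, CreateProtocolFn, and LoadProtocol are hypothetical, and the interface is reduced to two functions.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- Shared header: the only thing host and plug-in have in common ---
class NetProtocol {
public:
    virtual ~NetProtocol() {}
    virtual const char* ProtocolName() const = 0;
    virtual bool SendBroadcast(const std::string& payload) = 0;
};

extern "C" {
    // Signature of the single, conventionally exported factory function.
    typedef NetProtocol* (*CreateProtocolFn)();
}

// --- Plug-in side (would be compiled into the protocol's DLL) ---
namespace {
class DemoProtocol : public NetProtocol {
public:
    DemoProtocol() : mSent(0) {}
    virtual const char* ProtocolName() const { return "demo"; }
    virtual bool SendBroadcast(const std::string&) { ++mSent; return true; }

private:
    int mSent;
};
}  // namespace

extern "C" NetProtocol* CreateProtocol() { return new DemoProtocol; }

// --- Host side ---
// Assumption: 'factory' was obtained by looking up "CreateProtocol" in the
// loaded module; here the caller passes it in to keep the sketch portable.
std::unique_ptr<NetProtocol> LoadProtocol(CreateProtocolFn factory) {
    if (factory == nullptr) return std::unique_ptr<NetProtocol>();
    // From here on, every call goes through the vtable of the returned
    // object; the host never needs the name or address of DemoProtocol.
    return std::unique_ptr<NetProtocol>(factory());
}
```

The host knows nothing about DemoProtocol beyond the abstract interface, and destroying the returned object goes through the virtual destructor in the same way.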

Conclusion

This article has covered rather a lot of ground in a short space. Still, I hope that it has shed some light on yet another distributed computing design, and perhaps also shown some useful patterns and implementation techniques for immediate application.

References

[Coulouris et al. 1994] G. Coulouris, J. Dollimore, T. Kindberg: Distributed Systems: Concepts and Design, second edition, Addison-Wesley, 1994.

Listing 1: Declarations of Publisher/Subscriber classes

// Assume declarations of the following classes:
class cPort;       // Network port abstraction
class cNetAddress; // Generic network address
class cDatagram;   // Datagram message

// Abstract base class for Publisher & Subscriber
class cActor {
public:
    // Public virtual destructor; anyone can delete an Actor.
    virtual ~cActor();

    // Functions to obtain network information
    uint16 ChannelNo() const;
    void NetAddress(cNetAddress &adr) const;

    // Functions to interrogate & change connection state
    void DisconnectPort();
    bool IsConnected() const;
    virtual void Shutdown()=0;
    virtual bool IsPublisher() const=0;

protected:
    // Constructor for use by derived classes
    cActor();

    // Access to the port object for derived classes
    cPort & Port();

    // Functions relating to datagram management
    cDatagram * AcquireDatagram();
    void ReleaseDatagram(cDatagram *);
    virtual void IgnoreMessage(cDatagram *);
    virtual void UnknownMessage(cDatagram *);

    // Implementation of message dispatcher
    bool DispatchMessage(cDatagram *);

    // Default message table
    DECLARE_MESSAGE_TABLE(cActor);

private:
    // Actors own ports for their network communications.
    cPort mPort;
};

// Publisher class
class cPublisher: public cActor {
public:
    cPublisher();

    // Functions to manage the publication channel
    void BroadcastChannelDown();
    virtual void Shutdown();

    // Implementations of other cActor functions
    virtual bool IsPublisher() const { return true; }

    // Signature of the publisher
    uint16 Signature() const;
};

// Subscriber class
class cSubscriber: public cActor {
public:
    cSubscriber();

    // Functions that set the server address of the client.
    const cNetAddress &PublisherNode() const;
    void SetPublisherNode(const cNetAddress &);

    // Implementations of other cActor functions
    virtual bool IsPublisher() const { return false; }
    virtual void Shutdown();

protected:
    // Default message responders
    void OnChannelDown(cDatagram *);
    virtual void ChannelDownAction() {}

    // Subscriber message table
    DECLARE_MESSAGE_TABLE(cSubscriber);

private:
    // We keep the node address of our publisher
    cNetAddress mPublisherNode;
};

Listing 2: Declarations of network communications classes

// Network endpoint abstraction
class cPort {
public:
    cPort(cActor *);
    ~cPort();

    // Access to information regarding this port
    cTransportManager *Manager() const;
    void NetAddress(cNetAddress &) const;
    uint16 ChannelNo() const;

    // Function to check the connection state of the port
    bool IsConnected() const;
    void Disconnect();

    // Functions to send messages to our peers in other nodes.
    void SendBroadcast(cDatagram *);
    void SendInfo(cDatagram *, const cNetAddress &);
    void SendRequest(cDatagram *, const cNetAddress &);
    void SendReply(cDatagram *);

private:
    friend class cTransportManager;

    // cPort instances are managed by the transport manager,
    // organized by channel number.
    cTransportManager *mManager;
    uint16 mChannelNo;

    // Pointer to the actor to be called back by the port.
    cActor * mActor;

    // Function to handle incoming datagrams
    void ReceiveDatagram(cDatagram *);
};

// Transport manager
class cTransportManager {
public:
    cTransportManager();
    ~cTransportManager();

    // Interface to start the network connection.
    int StartGroupAdmin(const char *);
    int StartGroupMember(const cGroupInfo &);

    // Stopping the network occurs in two phases
    void StartShutdown();
    void FinishShutdown();

    // Function to enumerate the active groups.
    int EnumGroups(cGroupList &);
    int LookupGroup(cGroupInfo &);

    // Node-level information functions.
    bool IsActive() const;
    bool IsAdmin() const;
    const char * GroupName() const;

    // Low-level protocol information functions.
    cNetProtocol *Protocol();
    const char * ProtocolName() const;
    void NetAddress(cNetAddress &);
    const cNetAddress *GroupAdmin() const;

    // Interface to attach and detach ports.
    void ConnectPortAt(cPort *, uint16);
    void ConnectNextPort(cPort *);
    void DisconnectPort(cPort *);
    uint16 NextChannelNo();

    // Functions for datagram buffer maintenance.
    cDatagram * AcquireDatagram();
    void ReleaseDatagram(cDatagram *);

    // Functions to send datagrams.
    void SendBroadcast(cDatagram *);
    void SendRequest(cDatagram *, const cNetAddress &);
    void SendInfo(cDatagram *, const cNetAddress &);
    void SendReply(cDatagram *);
    void SendAck(cDatagram *);

    // On reception of a datagram, ReceiveDatagram() is called.
    void ReceiveDatagram(cDatagram *);
    void DispatchDatagram(cDatagram *);

private:
    // Current network protocol
    cNetProtocol *mProtocol;

    // List of connected ports and next available channel
    cPtrArray<cPort> mPortList;
    uint16 mNextChannel;

    // A pool of datagrams is maintained by a subobject.
    cDatagramPool mPoolMgr;

    // Queues for reply and request transactions
    int16 mNodeID; // Unique node ID
    int16 mNextTid; // Transaction counter
    cDataQueue mRequestQ; // Request queue
    cDataQueue mReplyQ; // Reply queue
    cDataQueue mSendQ; // Pending send queue

    // Protocol maintenance
    int OpenProtocol();
    bool CloseProtocol();
    bool IsProtocolOpen() const;

    // Internal port maintenance
    void DisconnectAllPorts();

    // Function to maintain send and receive queues
    void PostSends();
    void CheckTransactions();
};

// Abstract base class for network protocols
class cNetProtocol {
public:
    // Virtual destructor to cater for derivation
    virtual ~cNetProtocol();

    // Functions to initialize and terminate the protocol
    virtual int InitProtocol()=0;
    virtual int TermProtocol()=0;
    virtual bool IsInited() const=0;
    virtual const char *ProtocolName() const=0;

    // Functions to open and close the network connection.
    virtual int OpenConnection(const cNetAddress *=0)=0;
    virtual int CloseConnection()=0;
    virtual bool IsConnected() const=0;
    virtual void NetAddress(cNetAddress &)=0;

    // A node must be able to advertise its address.
    virtual int StartAdvertising(const char *)=0;
    virtual int StopAdvertising(const char *)=0;

    // Function to enumerate the active groups.
    virtual int EnumGroup(cGroupList &)=0;
    virtual int LookupGroup(const char *, cNetAddress &)=0;

    // Functions to send datagram messages.
    virtual bool SendBroadcast(cDatagram *)=0;
    virtual bool SendMessage(cDatagram*, const cNetAddress&)=0;

protected:
    // Back pointer to transport manager
    cTransportManager *mManager;

    // Constructor for derived classes
    cNetProtocol(cTransportManager *);
};