Home / Resources / Redundancy On The Cheap: How to use two servers, FactoryPMI, FactorySQL, an

Redundancy On The Cheap: How to use two servers, FactoryPMI, FactorySQL, an

Introduction

FactoryPMI and FactorySQL have long supported clustering and redundancy, respectively. But their database-centric architecture always left the question: What about the database? A true clustered database solution, like Microsoft SQL Server Clustering or Oracle 11g RAC works great, but can take 4+ extra servers and tens of thousands of dollars worth of software. This whitepaper describes how to set up redundancy on the cheap, using 2 servers and MySQL.

First disclaimer: The solution provided here works well, but you do give up some of the reliability that you'd get with a true clustering solution. If you can afford it, go with a true clustered database. They are great and are built for exactly what you're doing. That said; we understand that budgets aren't infinite, and sometimes you have to do more with less. In that case, read on.

Second disclaimer: The solution here isn't rocket science, but it does demand that the user has a solid grasp on our products, databases, and networking. After all, rocket science probably isn't that difficult if you're a rocket scientist. If you start thinking this might be beyond your technical capability, skip to the bottom where we provide a much simpler solution that very well might be "good enough" for many users.

The Basics

This guide is going to be using some advanced features of FactorySQL, FactoryPMI, and MySQL that you may not be familiar with. Let's go over some of these features first.

FactorySQL Redundancy. FactorySQL Redundancy is a feature that lets two or more FactorySQL services running on different physical machines cooperate, so that if the primary instance goes down due to a machine crash or software fault, one of the secondary instances will take over. FactorySQL uses a common database connection to share project configuration data, as well as monitor who is the currently running master instance.

FactoryPMI Clustering. FactoryPMI Clustering is a feature that lets two or more FactoryPMI Gateway instances running on different physical machines share configuration data and split client load evenly across all running instances, with dynamic client failover. FactoryPMI Clustering uses UDP communication over the local subnet for all clustering communication.

Aggregate Connections / Failover Connections. In FactorySQL, you can create an aggregate connection to a database. This special connection type uses two other database connections to create a single one. If the primary connection goes down, the secondary connection will be used instead. In FactoryPMI, you can specify a failover connection for any datasource connection. In a similar fashion, if that datasource connection fails, the failover connection will be used instead. These two features are very important to the technique presented in this guide, because they let FactorySQL and FactoryPMI use two separate databases as one, which is how we simulate a clustered database.

MySQL Replication. MySQL Replication is a feature of MySQL that lets you specify one database as a master and one as a slave. All changes that are made to the master will be replicated over the network to the slave. Setting up MySQL replication is out of the scope of this guide, but was covered in an earlier article, available here /company/articles/?id=1.

The Architecture

The basic idea is to set up MySQL replication for realtime data (status and control), and FactorySQL datacaching for historical data. When both servers are up, history will be logged to the master database, and realtime status and control will go through the master database. In the event that the master server goes down, history will be cached by the secondary server's FactorySQL, and realtime status and control will be routed through the secondary database. When the master server is brought back up, the secondary FactorySQL will put all cached history into the master database. So, during the failure, history will be unavailable, but status and control will work.


Figure 1: Redundant Architecture Overview
Figure 1: Redundant Architecture Overview (click to zoom)

Figure 1 is a diagram of the architecture this guide is describing. You'll see a "master" and "slave" server. Each server has the full "stack" of software on it, that is: an OPC Server, FactorySQL, a database (MySQL), and FactoryPMI. Don't let the picture frighten you, it's not as bad as it looks. Notice that there is a lot of symmetry here: Both instances of FactorySQL actually have the exact same connections, and the same goes for both of the FactoryPMI Gateways. Let's go over the configuration for each piece of the software stack.

First, we have the OPC server. Both OPC servers will be connected to your devices (PLCs, RTUs, etc). They don't need to be specifically set up for "redundancy", but they do need to be configured in the same manner, so that they have access to the same tags, etc.

Before we go over FactorySQL, let's look at the database. Notice that there are two separate schemas: one for historical data, and one for realtime data. Separating the data into two schemas lets us use different connections for each schema. The master database has both schemas, but the slave server only has the realtime schema. You need to set up MySQL replication to replicate the realtime schema from the master to the slave. Again, see this article for detailed instructions on setting up MySQL replication /company/articles/?id=1.

The FactorySQL configuration will be identical on both the master and the slave. The crux of the configuration here is in the Data Connection Settings. You'll need the following connections:

  • HistoryConnection: Connection to the master's historical schema. Native or DSN connection. Ensure that Data Caching is enabled in the Service Settings.
  • MasterRealtime: Connection to the master's realtime schema. Native or DSN connection.
  • SlaveRealtime: Connection to the slave's realtime schema. Native or DSN connection.
  • RealtimeConnection: Aggregate connection with MasterRealtime as the primary connection, and SlaveRealtime as the secondary connection. Failover mode should be set to "Use secondary until primary is available"
Ensure that SQLTags is enabled and using the RealtimeConnection (unless you aren't using SQLTags). Enable redundancy, and set it to use the RealtimeConnection. Note that FactorySQL redundancy makes your project configuration shared implicitly across your master and slave, but the settings themselves need to be configured individually.

FactoryPMI configuration is fairly simple. You'll want to configure your master, and when it is all configured, you'll join your slave into the master's cluster. The configuration, again, mostly has to do with the datasource connections. You'll need the following connections:

  • HistoryConnection: A connection to the master's historical schema.
  • SlaveRealtime: A connection to the slave's realtime schema.
  • MasterRealtime: A connection to the master's realtime schema, with the SlaveRealtime set as the "Failover Datasource".
In your project, make sure to use the HistoryConnection for all historical queries (e.g. charts, tables, reports etc. Use the MasterRealtime for all realtime information (e.g. SQLTags). To cluster your two FactoryPMI Gateways together, first set the Cluster Name of the master server to something intelligible, such as "FPMICluster". Then set the slave's Cluster Name to the same name. Ensure that you don't have any software firewalls such as Windows Firewall enabled, and that multicast UDP traffic on a single port can pass between the two Gateways.

How It Works

Now that you have your master and slave configured, let's take a look at how it will work in the event of a failure. First, look at Figure 2, which shows the path data takes when both servers are running. Notice that realtime and historical data go through the master database, and the FactoryPMI Gateways share the client load. Data flow is shown in the thick green lines. All other connections are not being used.


Figure 2: Normal Data Flow
Figure 2: Normal Data Flow (click to zoom)

Now let's look at the data flow in the event that the master server has crashed in Figure 3. Again, the active data paths are shown in thick green. Notice that the realtime data is going through the slave's realtime schema, FactorySQL is caching the historical data, and the FactoryPMI clients have all switched to the FactoryPMI Gateway instance running on the slave.


Figure 3: Master Crashed Data Flow
Figure 3: Master Crashed Data Flow (click to zoom)

Conclusion and a Simpler Alternative

There you have it, with some clever partitioning of connections and some MySQL replication, you can create a mission critical HMI/SCADA system that logs seamless history, and never loses control of your application. Of course, there are details to work out, such as backup schemes for your history in case the master server crash cannot be recovered from, but such things can be put in place fairly easily. If you are going to put a system like this in place, we recommend two additional things: purchase the commercial version of MySQL (it is fairly inexpensive) so that you can receive support from them, and practice. Disaster recovery plans and redundant software can be quite complex. If you're serious about a highly available SCADA system, make sure that you regularly practice simulated outages to ensure that everything goes smoothly.

Last but not least, we have promised a simpler alternative, in case this all seems too complex. The alternative is: Have a cold-standby server ready and waiting. Purchase a USB licensing dongle from us (so that you can simply swap the FactoryPMI and FactorySQL license in a physical form), and in the event of a failure, turn on the secondary server and give it the IP address that your master server used to have. Yes, you'll probably be down for 5 to 10 minutes, and you will lose the history during that period, but you'll be able to explain how it works in one sentence. The only tricky part to this is to make sure that your secondary server's project files stay up to date. This can be achieved reliably through well-known procedural guidelines, or through automatic ghosting.

As always, if you'd like to speak to one of our engineers to discuss any of these architectures, or have some new ideas that we haven't thought of, please give us a call at 1-800-266-7798.

Inductive Automation
Inductive Automation pioneered the first full-featured web-launched HMI/SCADA system in the world. Its standards based, database-centric architecture receives accolades from plant managers, IT managers, and system integrators worldwide. With a commitment to software quality and technical support second to none, Inductive Automation is at the forefront of industrial software.

published: 04/30/14