Integrating SQL Databases and SCADA to Maximize Efficiency and Reliability
15 minute readUsing Structured Query Language (SQL) for automation is nothing new. SQL databases have been working in the background of many different systems for decades, but even today there are some who are hesitant to mix SQL with industrial automation software like supervisory control and data acquisition (SCADA). However, as more SCADA software users feel the pressure of keeping up in today’s connected, data-driven world, SQL has received more well-deserved attention.
A database can often be one of the most important components of any control system. Over many years of development, it has become increasingly clear that control systems integrated with databases that use open standards (such as SQL) have a major advantage over systems that use proprietary databases, which is why the combination of SQL and the Ignition industrial application platform is so valuable.
Ignition is a cross-platform, modular, server-based software that centralizes everything on your plant floor, including SCADA, manufacturing execution system (MES), Industrial Internet of Things (IIoT), human-machine interfaces (HMIs), alarming, reporting, and more into one platform. It also allows you to seamlessly connect to all your SQL databases and deploy unlimited clients on desktop, tablets, and mobile devices. By making open-standard and open-source technologies like web services, SQL, MQTT, OPC UA, and others easily accessible, Ignition offers greater scalability and connectivity with your enterprise data.
In this article, we will explore SQL databases, why SQL continues to be so popular and useful, and the limitless potential of integrating SQL databases with your control system, as well as time-series data and its importance.
The SQL Programming Language
According to the American National Standards Institute (ANSI), SQL is the standard programming language for managing data in a relational database management system (RDBMS) and the one of the most widely used languages for relational databases. It was first developed for IBM in the 1970s and has been used consistently for almost 50 years to store and manage data in SQL databases.
Its main function is to quickly and efficiently ask questions of small to very large amounts of data using SQL statements, or "queries," which allow you to store, update, retrieve, search, or remove specific data in a database.
Software engineers construct SQL queries using various components of the SQL language, such as identifiers, variables, and search conditions, to form the correct statement. They can also save SQL queries for later use as "stored procedures'' so you don't have to rewrite frequently used queries. These capabilities are part of what makes SQL so efficient, and even proprietary databases often use the same basic structure.
SQL Relational Databases
While all databases store data for future or continuous use, there are many types of databases, which are differentiated by their structure. The most widely used type are called relational databases. In a relational database, each data point is related to its surrounding data, and its function is to organize and retrieve relational data.
“SQL databases are one of the most important technologies that are used for data storage and access retrieval,” says Inductive Automation Chief Technology Architect Kevin McClusky.
You can think of a SQL database as having two main functions: access retrieval, which is performed using SQL; and storage, which is performed using the database engine. SQL databases organize data into files and specific storage areas on a physical hard disk or online servers in a cloud-based SQL database.
A SQL database is structured similarly to Excel tables with rows, columns, and cells, which makes organizing and retrieving any amount of data easy. This structure is especially useful for real-time and time-series (or historical) data. It helps to organize data that’s related to other data into tables by commonality, showing a picture of an event or data set.
One of the most powerful aspects of a SQL database is its ability to "query" or ask questions of data and establish relationships with it. SQL queries allow you to perform analysis, answer complex questions, and manipulate and aggregate data without many steps so you can retrieve specific data from multiple tables, associate records with common field types, leverage the relationships within your data, and control precisely where data goes and what data is excluded.
For example, a company could have two different tables for their employees — one for their personal data, such as name, address, and birth date, and another table that contains details about each employee’s salary. While the employees and salaries tables exist separately within the database, their data can be associated with one another, and potentially retrieved together. This association is accomplished by establishing a unique identifier (commonly known as a primary key) in one table that can be referenced in the other table. For example, the employee’s unique ID number could be a primary key in the employees table, and that ID could be used as a foreign identifier (commonly known as a foreign key) in the salaries table. This allows the data for an employee in the employees table to be associated with their salary data in the salaries table, all through the single value of their unique employee ID. The ability to create this association, or relation, is the reason these types of databases are referred to as relational databases.
By creating these relationships across tables, queries can then be written to summarize aggregate results from multiple tables such as employee salaries by department. If the SQL database is connected to your plant floor, you could find the average temperature of a room or several rooms, when or if the temperature control machines went down or how long they worked for, the standard deviation between datasets from multiple downtime events, and much more, in real time.
SQL queries can range from being simple with just one or two lines of code querying a single table, to extremely complex with hundreds of lines of code querying many tables together. Generally though, the shorter and succinct the query is, the better. “The queries are really only as complex as you need them to be,” says McClusky. The range and flexibility of SQL queries is part of what makes relational SQL databases one of the most powerful and versatile tools for data retrieval, storage, and management.
Why is SQL Still so Popular?
Many technology trends come and go every year. Yet after nearly five decades, SQL is still going strong. Here are four principal reasons behind SQL’s continued prominence: simplicity, scalability, accessibility, and its variety of options.
Simplicity
SQL is highly developed and very easy to learn. There are numerous free resources for learning SQL, and finding an IT person or engineer familiar with SQL is usually not a problem since so many use it. Besides, as a beginner, you only need to understand a few basic actions to retrieve, update, remove, or aggregate data.
Scalability
SQL databases are not overly expensive, unlike many proprietary databases, and can handle a large amount of data, making them easily scalable for any business. Some of the most popular SQL databases are even free to use, like MySQL and Postgres.
Accessibility
Open-standard technologies like SQL databases are available to everyone and easily accessible, unlike proprietary technologies, which are usually behind a high paywall, and often difficult to extract or migrate data from. You can also integrate SQL databases with many other technologies, and you don't have to stick to a single vendor and use a limited custom-built database.
Many Options
“One other reason why I think it's still going so strong is that there are a lot of different options. SQL is a standard query language that multiple different databases provide for you, and you don't have vendor lock-in with SQL databases. You're not forced to use one type of SQL database. You have many options,” says McClusky.
What About the SQL Alternatives?
At the beginning of the 21st Century, combining SQL databases with a historian was uncommon, which led to issues when dealing with extremely large amounts of data. To help with this scalability problem while continuing to use some of the basic structures of the SQL database, software engineers developed NoSQL in 1998 and then NewSQL in 2011. However, the scalability issues that NoSQL and NewSQL set out to solve are much rarer now that storage space is more affordable and because you can couple a SQL database with a historian to create a high-performance database. NoSQL has grown in popularity in recent years. MongoDB and Amazon Web Services’ DynamoDB are currently two of the most popular NoSQL databases.
Collecting and Storing Time-Series Data
Another major benefit of SQL databases is that they help you get more value out of your organization’s time-series data. Time-series data is a sequence of data points that are time-stamped at successive intervals. These data points can be stored, typically in a database, and then retrieved to display in tables or charts that show any type of time-progressive data, such as temperatures, weights, volumes, and more.
Some think that time-series data is only useful for determining the status of individual data points at a particular point in time, tracking specific trends, and not much else. However, time-series data is actually one of the most valuable products that comes from monitoring machines and processes connected to a SCADA system. Time-series data can potentially help you save millions of dollars in downtime, to make more detailed plans for improving energy efficiency, and a lot more.
However, many organizations don’t collect time-series data because they either don’t have the means to collect it, and/or an application to store and retrieve it. There are basically three options for storing time-series data: in PLCs, in proprietary process historians, or in SQL databases.
Storing data in PLCs is probably the worst thing you can do. The more data a PLC stores, the less efficient it becomes, inevitably slowing down the overall operation. It is also difficult to retrieve data stored in a PLC if there is no connection to a database because you have to physically plug in to retrieve it.
Process historians are better than PLCs, but they can be exorbitantly expensive, are not as versatile as SQL databases, and don't allow you to easily upgrade your database. “You might find a great solution that's out there, but that solution is just inside this locked-in ecosystem that is from a specific vendor, and that's not going to do you any favors in the future when you come to your next upgrade cycle,” says McClusky.
SQL databases are typically the best choice for storing time-series data. They are well-equipped to handle large amounts of time-series data for SCADA systems, and they enrich time-series data by putting it into a simple format that your entire organization can read.
Many large enterprise systems have used SQL databases for years. “Those databases are sometimes IT databases that are managed and have information from other sections of the company overall. Sometimes those are tied to enterprise resource planning (ERP) software,” says McClusky. Customer relationship management (CRM) and MES software also have a long history of using SQL databases. By connecting a SCADA system or application to a SQL database, you can share this information quickly and efficiently and make your data much more accessible to your other systems.
Once in an SQL database, time-series data can tell multi-layered stories about every aspect of your production, or reveal unseen trends that can save your organization a fortune. By overlaying this information, you can see how each system affects the other in real time and find new ways to improve efficiency and increase profitability.
For example, let's say you're a sheet metal manufacturer, and you notice that one of your production lines is running out of material faster than another, even though both lines are working fine. You check your SQL databases connected to the material inventories for each line and find that they receive the same amount of raw material. Then you check the SQL database connected to your SCADA system collecting time-series data from your machines' PLCs and sensors.
After comparing the two production lines, you notice that one produces sheet metal a few micrometers thicker than the other. It could be a measurement problem with the machine, the machine could be worn out and in need of replacement, or maybe it wasn't set up correctly. Even if the extra few micrometers are within an acceptable range, fixing the problem will save a large amount of inventory and capital for the future — all as a result of storing your time-series data in easily accessible SQL databases.
Connecting OT and IT
At this point, SQL may sound like a dream come true, but not everyone in the automation industry was originally on board with SQL databases. OT professionals, and SCADA users in particular, were skeptical of using SQL for time-series data, preferring proprietary databases specifically designed for the task. However, IT professionals typically view proprietary databases with their proprietary code and code translators as another hurdle to getting at valuable data and communicating effectively with other parts of the business.
It's never easy to get experts on opposite sides of a problem to see eye to eye, but fortunately, tools like Ignition and SQL can get both sides of the OT-IT divide working together again. Connecting your SCADA system to a SQL database is a big step toward a mutual understanding between IT and OT departments. “The SCADA system is going to acquire that data. It needs somewhere to put that data, and if you put it in a SQL database, you have all the advantages of the SQL database,” says McClusky.
By integrating these technologies, there would be no need to worry about connecting proprietary databases that could cause problems and hinder the rest of the system. In addition, most IT staff are familiar with SQL, so for those modernizing their SCADA systems with SQL databases and industrial automation platforms such as Ignition, it’s usually not necessary to teach staff how to use it.
Although some reluctance to pair SQL with SCADA remains, SQL and other open standard technologies are becoming much more popular.
“I'd say for folks who are modernizing their infrastructures and used to have an older setup with their manufacturing and they're moving over to Ignition, absolutely SQL is increasing and it should be increasing,” says McClusky. This is because SQL databases are compatible with any new technology or application that uses these open standards. Integrating SQL databases and SCADA offers many advantages as technology is changing very quickly, and SQL can keep up with the demand because of its interoperability.
Ignition’s SQL Modules
As mentioned earlier, Ignition is a modular software platform. Specific Ignition modules enable connection to SQL databases, such as the SQL Bridge and Tag Historian modules. With the ability to connect to your existing SQL databases and couple them with Ignition's native historian, you don't have to purchase an expensive proprietary database or perform time-consuming programming language conversions.
The SQL Bridge Module
The ability to integrate your PLCs and your SQL databases may seem like an obvious necessity, but it is by no means a given for most control systems. The SQL Bridge Module acts as a bridge between OPC data and SQL databases in the Ignition automation platform, which allows you to move data bidirectionally, log large amounts of data, scan barcodes, manage recipes and downtime, and sequence and track production.
Read about some of the most effective ways to use PLCs and SQL databases with Ignition’s SQL Bridge Module in this blog post.
The Tag Historian Module
The Tag Historian Module allows you to transform a SQL database into a high-performance time-series data historian at a fraction of the price of a proprietary historian. The module doesn't require any complex configurations or data modeling to work with a SQL database, and it allows you to create, move data to, and manage data in individual tables, charts, and graphs; compress, partition, insert, and combine data; use data with Ignition without writing SQL queries; and make data available to applications with database access.
SQL databases are great for most situations, but due to their simple structure, there can be drawbacks when storing large amounts of records in a single table, which is one of the most common criticisms of SQL databases. While this is not a problem for most SQL database users, it can be a problem for those dealing with extraordinarily large quantities of data.
“The indigenous historian takes care of that automatically because it sets up these partitions which basically spin up a new table for different time periods … so Ignition just automatically creates new tables there for the tag historian,” says McClusky. This reduces the amount of data per table and increases efficiency.
Saving Money and Future-Proofing with SQL and Ignition
By pairing the Ignition platform with SQL databases, your system will be able to handle technological changes and developments for much less than what proprietary databases cost, and avoid future issues related to data extraction or migration.
Instead of storing data in PLCs or spreadsheets and needing hours or days to cross-reference data, you’ll have a modern and affordable alternative. With the ability to track time-series data, you can put every success and failure into context and learn from it.
You won't have to worry about your disconnected historian slowing everything down or your system becoming outdated. Also, industrial automation platforms like Ignition allow you to constantly find new and better ways to use your existing SQL databases.
With the increased acceptance of open-standard technologies, the future continues to look bright for SQL databases and for SCADA systems that can leverage them effectively.
Want to stay up-to-date with us?
Sign up for our weekly News Feed.