Using Ignition with Machine Learning Libraries
51 min video / 48 minute read
Kevin McClusky, Chief Technology Architect & VP of Sales
Kathy Applebaum, Software Engineering Group Manager
Using Ignition and machine learning libraries can be a powerful combination. Inductive Automation's machine learning experts will lead conference attendees through practical applications for ML, along with typical ML setups that Ignition users could implement on their own systems.
Kevin McClusky: Hello everyone, and welcome to “Using Ignition With Machine Learning Libraries.” We're happy to be here together with you today; thank you for joining us on the second day of the conference. This is the session right before the Build-a-Thon, which I'm also very excited about; that's gonna be right here after this. Machine learning is something that we care a lot about and have done a number of sessions on, but we wanted to give you a little bit of a different session today. If you don't know machine learning, it'll still take you through some of the basics, but from a conceptual standpoint, and then get into the meat and potatoes: how do you actually do this? And we have a little bit of a surprise for you. My name is Kevin McClusky; if you joined yesterday at the keynote, you heard that I have recently been promoted to Chief Technology Architect, which I'm very... Thank you.
Kevin McClusky: And with me is Kathy Applebaum, who is also a manager, right? You wanna talk a little bit about your position?
Kathy Applebaum: Yeah, I'm now the Manager of the Software Engineering Department. I also have a Master's degree in machine learning; I used support vector machines to classify about 250,000 images in that project.
Kevin McClusky: So Kathy has a lot of... Yes, please.
Kevin McClusky: So Kathy has a lot of experience when it comes to all of this. I have a lot of hands-on practical experience with customers who have been doing different things in machine learning. Kathy has a lot of the computer science side: models, tuning parameters, how you do different things, what algorithms are best, that type of experience. So we're coming together here to talk about all of this. So why are we here? Why are you here? Well, we know why you're here: it's because of outcomes, right? You're not here necessarily because you wanna know what a k-means++ algorithm is; you're here because you wanna know, how do I take the stuff that I have, whatever that stuff is, and get something out of it? How does machine learning bring value to me? As we promised in the description for this session, we're gonna cover practical applications, and we're gonna cover typical ML setups as well. Those are two different sections: practical applications have to do with what you'd actually do with it, and the typical ML setups are a little bit more technical. We are gonna talk about algorithms and we're gonna talk about outcomes, but you need the algorithms to get there. So we have it broken up a little bit here, and this is an advanced session, so it's part of the advanced track.
Kevin McClusky: Python experience in Ignition is recommended if you're joining this session. If you don't have it, there should still be enough here that's valuable that you'll be able to get something out of it. But if you know Python, if you understand scripting inside Ignition, if you understand how to develop things inside Ignition, you're gonna get even more out of this session than you would otherwise. So I'll just bring this up for a second. We're actually going to go through the things that are in the outline, so we don't need to cover it in a lot of detail right here. But we're gonna go through the intro, goals and outcomes, the steps for how you get there, building a model, the program, and then concepts for how all of that works. Then we're gonna jump over to the practical applications and typical ML setups, and we have some exciting things to tell you about there. And Kathy and I have one slide here and we... Oh, we do have the other piece there, I didn't even realize that.
Kathy Applebaum: Yeah, yeah.
Kevin McClusky: Okay, yeah. So we split these up; both of us could talk about most of this, so this one's my slide as well. So, anomaly detection is one of those practical applications, right? This is something for when you wanna take a look at a system and see if it's running the same way that it's been running. Let's say that you have patties of something, like beef patties or that incredible amazing burger, whatever they call that thing, or you're doing veggie burgers, whatever it is. You've got these patties going along a production line, and you might want to tell if they're different somehow. You can have vision systems that measure things like the width, you can have weight sensors, you can have things that are measuring temperature. Vision systems could measure things like chroma, which gives you a variation from a specific color point that you wanted. All of those things could have anomalies in them, and a computer is good at telling what those are. Before machine learning, you'd probably need to program in limits and all of that; machine learning allows that type of thing to happen more automatically.
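As a rough illustration of the limit-free anomaly check being described, here is a minimal sketch in plain Python. The field names (weight, chroma), the sample data, and the 3-sigma threshold are invented for the patty example; a real setup would pull readings from Ignition tag history and could use one of the bundled ML libraries instead of simple z-scores.

```python
# Hypothetical sketch: learn a baseline from past sensor readings, then
# flag any new reading that sits far outside it, instead of hand-coding
# fixed limits per field.
from statistics import mean, stdev

def fit_baseline(history):
    """Per-field (mean, stdev) learned from a list of reading dicts."""
    fields = history[0].keys()
    return {f: (mean(r[f] for r in history), stdev(r[f] for r in history))
            for f in fields}

def is_anomaly(reading, baseline, threshold=3.0):
    """True if any field is more than `threshold` std deviations out."""
    for field, (mu, sigma) in baseline.items():
        if sigma > 0 and abs(reading[field] - mu) / sigma > threshold:
            return True
    return False

# Fabricated history: weights cycle 90..94, chroma cycles 0.50..0.52.
history = [{"weight": 90 + i % 5, "chroma": 0.50 + (i % 3) * 0.01}
           for i in range(30)]
baseline = fit_baseline(history)
print(is_anomaly({"weight": 92, "chroma": 0.51}, baseline))   # False
print(is_anomaly({"weight": 140, "chroma": 0.51}, baseline))  # True
```

The point of the sketch is the shape of the workflow: the "limits" come out of the data itself, so adding a new sensor just means adding a field to the history.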
Kevin McClusky: Preventative maintenance and predictive failure, this is the big one that folks are doing a lot. They wanna be able to take a look at a motor, and over time, as that motor wears out, the amp draw on that motor is gonna increase. You can program that in individually, but there are other things that maybe you can't. So vibration: which ones are good vibrations, which ones are bad vibrations? And yeah, some of you probably harken back to the “Good Vibrations” song there. Those are things that computers are actually pretty good at detecting if you train them right, if you ask them the right questions; they can be pretty good at giving you good answers. Reject and bad-product prediction kind of goes back to anomaly detection like I was talking about earlier: if these things are varying quite a bit, then you're gonna know that there might be a problem. Anomaly detection is just a general big-picture view; reject and bad-product prediction is a bit more specific, right? So you can have some other algorithms feeding right into that which are useful for it. Process tuning would be if you have some variables or set points that are used; maybe an engineer is setting these on a regular basis, and they're saying, "I wanna set the speed of this process to 23, because that's what it was when it was initially set up, that's what the OEM recommended for this thing."
Kevin McClusky: And then they change it a little bit over time. If you sweep that a little bit and then you let a machine learning model take a look at it, process tuning could potentially tell you, "Well, if you ran at 24 you're gonna have big problems. If you ran at 25 things are gonna break. But if you ran at 23.2 instead of running at 23, you're actually gonna get a 1% increase in production out of the system." And if you have several different parameters, it can do things where it's finding intersections, basically; there are regression algorithms and a number of other algorithms in here that can suggest what the best process tuning parameters might be. Digital twin simulation and prediction is another important one; lots of folks are doing digital twins inside Ignition. They're using Ignition SFCs or scripting to create simulated values, and they're using memory tags, UDTs, and the data types inside Ignition to basically have a twin of something that's in the real world.
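The "run at 23.2 instead of 23" idea can be sketched with the simplest regression there is: fit a line from setpoint to throughput, then compare predictions at nearby setpoints. All the numbers below are fabricated for illustration; a real process tuning model would use many parameters and a richer algorithm, as the talk notes.

```python
# Illustrative sketch only: ordinary least squares relating one speed
# setpoint to measured throughput, then predicting the gain from a tweak.
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

setpoints  = [22.6, 22.8, 23.0, 23.2, 23.4]     # swept speed values
throughput = [98.0, 99.0, 100.0, 101.0, 102.0]  # units/hour, made up
a, b = fit_line(setpoints, throughput)

def predict(x):
    return a * x + b

# Estimated production gain from running at 23.2 instead of 23.0:
print(round(predict(23.2) - predict(23.0), 2))
```

With several interacting setpoints you would move to multivariate regression, but the workflow is the same: sweep, fit, then ask the model "what if?" before touching the line.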
Kevin McClusky: But when you're doing that just through scripting, with maybe some random values for noise that you're putting on the line, or you're using the Ignition simulator, you can only get so far. That's not doing a full simulation of the physics behind a real system. Of course, you could use a physics simulator, but you can also sometimes use machine learning to help do simulation, because it can take a look at past data from the real system and predict what real data might look like going into the future. You change a few parameters, it spits out the rest of them for you, and you get some of the process parameters from your digital twin, some of your status coming back there.
Kathy Applebaum: Okay, so machine learning is really all about models. So what are models? Models are just something that helps us understand a complex system, but more importantly, a model will let us make a prediction about that complex system. Thousands of years ago, we used a lot of physical models; maybe we made a model of the terrain of a battlefield and where the armies were so that we could try to figure out what type of strategy would be best for winning that war. Or maybe we made a model of a bridge to try to understand how much weight that bridge design could hold. But physical models only take us so far. As time went on and our mathematics got better, we went more and more to mathematical models. One example would be economic modeling. We might want to look at, for example, what our competitors are doing, what alternatives there are to our product, how price-sensitive customers are, maybe even the time of year, in order to understand what a price change would do to our sales of that item. So, the thing to understand here is that models make a prediction about the future, but that prediction is only going to be as accurate as our input, among other factors.
Kathy Applebaum: So the input, the way we design the model, the things that we put into the model and the things we choose not to put in, all are going to affect our accuracy here. So how do we create models? Well, the very first way was by hand, pen and paper, but more importantly, it was a human doing the analysis. A human would say, "I think that maybe this should be weighted three times and this should be weighted 0.5 times, and maybe this is a square root, and I'll combine them all together. And I think I've got a decent model from what I understand." So a human is trying to figure out these correlations between our inputs.
Kevin McClusky: And one of the important things about doing it by hand is that that's where the name model came from, this is where it all started, right? We've got folks who are creating these, you mentioned economic models, right?
Kathy Applebaum: Yeah.
Kevin McClusky: So that's a great example, you got folks who are sitting at their desks and they're... The entities are asking them, “What's gonna happen if this happens? What's gonna happen if this happens?” and as they gather more information they figure out, "Well, we think if the price of oil goes up to here or if it goes down to here then that's going to affect this, that's gonna affect this, that's gonna affect this." The model could have a bunch of lines between different areas, it could have cause and effect, if/then along the lines, it could just be mathematical, as you just mentioned. So you can have polynomial equations that we're saying it's gonna affect these things and this is gonna be affected by 0.3, this is gonna be affected by 28.4. And all of these models, super useful, people have been using them for years and years and years before computers even existed.
Kathy Applebaum: Yeah. But then we got computers, and so it became a programmer writing a program. But really this is just an automated form of our by-hand model; it's just that the computer is doing the calculation for us. It's still a person trying to figure out the correlation between the various inputs, and which inputs are important and which aren't. But the computer did a couple of important things: it let us try a lot of different weights a little faster, and it became much easier to use large amounts of data to form our models. And then we get machine learning. Next slide. So how is machine learning different from just a program? They're both running on computers; they're really both programs. The difference is the human is not deciding the weights. The human is going to give it training data, the human is going to suggest a type of model to use, and we'll talk about that in a second. But it's the machine itself that is deciding what the weights are, how to take this input and get the right output. The human is no longer involved in that part.
Kathy Applebaum: So again, we determine the inputs, we determine the algorithm, and the computer builds the model. This is something a lot of people don't understand; I hear the press and lay people talking about machine learning, and they think the computer is in charge of everything. A lot of it is human, and a lot of the errors in machine learning come from us humans, unfortunately. So models are based on algorithms. Algorithms are near and dear to my heart because I'm really a mathematician at heart, and mathematicians have come up with a lot of different ways to solve the same problem. So imagine that you have a drawer of socks, and you've just thrown all your socks in there after doing the laundry, and now you're trying to get out the door to go to work and you need two matching socks. One algorithm is: you just grab two socks; if they don't match, you put them back and grab two more random socks. Not a very efficient algorithm, but eventually you might get two matching socks. We can change this algorithm to be a little more efficient: grab two socks, only put one back, and try to find another sock that matches. Again, not super efficient, but a little better. Or we can keep that second non-matching sock, grab a third sock, see if it matches either of the first two, and keep going until we get a matching sock.
Kathy Applebaum: Or we could sort all of the socks into piles, and now we know we have matching socks; plus, tomorrow you've already solved your problem, you have matching socks. So these are all different algorithms to solve the “I need two matching socks at work” problem, and they're pretty decent algorithms for that. But these algorithms don't work for every problem. You noticed that some of the algorithms were better for this problem than others, but none of these algorithms is going to, for example, predict the value of a sock, the price of a sock. So there are two large classes of problems: one is categorization or classification; the other is predicting the value of something, whether that's a cost or a setting or whatever it is. Predicting a number.
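The best of the sock strategies above, keeping every sock you've drawn and checking each new one against them, can be written in a few lines. The color names are made up; the point is that grouping as you go finds a pair in one pass, versus repeatedly grabbing random pairs.

```python
# Sketch of the "keep what you've drawn and match against it" algorithm
# from the sock example: one pass through the drawer, O(n) comparisons.
from collections import defaultdict

def first_matching_pair(socks):
    """Return a matching pair as soon as any sock appears twice."""
    seen = defaultdict(int)
    for sock in socks:
        seen[sock] += 1
        if seen[sock] == 2:
            return (sock, sock)
    return None  # a drawer of all-singleton socks has no pair

drawer = ["red", "blue", "green", "blue", "red"]
print(first_matching_pair(drawer))  # ('blue', 'blue')
```

Note this whole family of algorithms only *classifies* socks into match/no-match groups; none of them predicts a number like the price of a sock, which is exactly the classification-versus-regression split described above.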
Kevin McClusky: So, we'll jump over to the other side here for a second and talk about how people are doing this today, right? That's the foundation for what a model is and what an algorithm is, so that you'll have enough background to understand the rest of what we're going through. So, being Inductive Automation, we talk to a lot of folks who are using Ignition, and folks who are using Ignition are doing different types of machine learning models. Often the models created from these algorithms live inside frameworks: software companies have created these machine learning algorithms, they've put them together inside frameworks, and people are using those frameworks for all of that. Ignition itself has some built-in machine learning libraries; we have about a dozen that are built into Ignition directly. If you've gone to a previous machine learning session, you're already familiar with that; if not, you can take a look, we have some resources. And for years, we've had resources on the [Ignition] Exchange that show how to use some of these models; they're part of the ML section of the Apache Commons Math 3 library.
Kevin McClusky: So it's just embedded into Ignition and available for anybody. You can access all of that through Python inside Ignition: anywhere you have scripting, you can access these algorithms and create models. There are Python 3 options too. We have a number of folks who are using TensorFlow, a number of folks who are using PyTorch, and there are other ones like scikit-learn; there's a half a dozen out there in Python 3. Honestly, TensorFlow is probably the most popular and most commonly used. Folks using Python 3 will often have Python installed right alongside Ignition, and they'll invoke their local Python 3 execution environment, whatever they have there, directly from inside Ignition with system.util.execute or some of the other functions from the system library. We have a guide out there as well on Python, Jython, and Python inside Ignition, and it has steps for how you invoke that on the outside. That's right on our knowledge base, and it's linked from our forum and a number of other places. So if you get a chance, you can read through that, and it'll orient you on how to invoke Python 3 from inside Ignition if that's something you're wanting or needing to do.
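The "invoke a local Python 3 environment from Ignition" pattern boils down to launching the external interpreter as a subprocess, handing it data, and reading a result back. Inside Ignition's Jython you might use `system.util.execute` (which is fire-and-forget) or a process API that captures output; the plain-Python sketch below uses `subprocess` as a stand-in for that call, and the inline one-liner stands in for a real TensorFlow or PyTorch scoring script.

```python
# Sketch of the external-interpreter pattern: serialize inputs to JSON,
# run a Python 3 child process, parse its JSON reply. The child script
# here is a trivial stand-in; a real one would load a trained model.
import json
import subprocess
import sys

payload = {"weight": 92.0, "radius": 50.1}

# Normally this would be a .py file on disk next to the gateway.
child = ("import json, sys; "
         "d = json.load(sys.stdin); "
         "print(json.dumps({'ok': d['weight'] < 100}))")

proc = subprocess.run([sys.executable, "-c", child],
                      input=json.dumps(payload),
                      capture_output=True, text=True)
result = json.loads(proc.stdout)
print(result)  # {'ok': True}
```

Exchanging JSON over stdin/stdout keeps the gateway-side script simple and avoids tying Ignition's embedded Jython to the external environment's library versions.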
Kevin McClusky: And then there are cloud options out there. There's SageMaker if we're talking AWS; on Azure, there's Azure Machine Learning. That's kind of an umbrella term: they have their machine learning studio and a number of other products under Azure ML. Most of these are accessed over web services, so you can set up web services for doing the things that you need to do any time you're doing machine learning. So when you want to create a new model, and you're going to pick an algorithm, train it, go through the process, evaluate it, and see if it's working right, all of those things, these have options and tools for that. And there are a few things that take advantage of machine learning inside these cloud platforms too. For example, AWS has an image recognition service called Rekognition, which is basically machine learning models wrapped up in a nice image-processing API. Microsoft has something similar inside Azure. These things with machine learning built in, normally you'll also interact with them over web services. What I wanna focus on right now and share with you, and what I'm particularly excited about, is these built-in machine learning libraries.
Kevin McClusky: The reason that I'm excited about these isn't just that they're part of Ignition and mean you don't have to use anything else outside of Ignition, though that's nice, right? It's also that they're available today. If you came to a previous session, you'd have seen a high-code type of environment. We're announcing that we have this Ignition Machine Learning Manager resource on the Exchange, which we just published today and is available for everybody to use.
Kevin McClusky: You guys probably want to actually see this thing, right? So we're going to do a quick demo. We're going to pull it up. I'll not necessarily do a fresh install from the Exchange, but it's basically blank for the copy that I have on my laptop right here. And we'll walk you through what this looks like, creating machine learning models, going through picking algorithms. And then I want to give you a little bit of a real-world example here too, as part of what I'm using for the data set that is being fed in. So I'll pull that up right over here, I think we can... We don't need to switch screens because it's in the same system here. So let me just pop over and pop over to another tab in the browser. So if we start from scratch, so this is just the Inductive Automation website. ICC, you're here. You can come to the virtual conference too. It says “register for free.” Anyway, we're not talking about that. So the Ignition Exchange, we click over to here. That's where you find it. If you've ever... If you've never gone here, I think almost everybody has. We have at this point tens of thousands, maybe even hundreds of thousands of downloads. It's been something that's been instrumental for a lot of folks in terms of their projects.
Kevin McClusky: And if you scroll down, here's the Machine Learning Manager. Once again, this is an Exchange resource, so it's not something that's just part of the Ignition install when you download Ignition. This is something that our team has created, and our Sales Engineering team specifically is going to keep improving, helping out with, and supporting as we go along. If we come in here, this is the Exchange interface. There are a few other people who have discovered it since we uploaded it, which is great; we've got three downloads. I'm sure that's going to pop up after this session. You click “Download,” and if you've never installed an Exchange resource before, it's super easy to do. You just go to your Ignition install, this is my localhost install, go over to Config, hit “Ignition Exchange” and “Import package file.” You just drag it in and there you go. So I imported it, as I said, before this session, so I'm not going to import it again. But this is how you do that. And then once you have it in place, you're going to have a project, if you've created one.
Kevin McClusky: That is, in my case, “machine-learning.” That's what I called the project right there. I also have it loaded up in a couple of places. It uses a database connection, so that's an important piece: you want to make sure that you have a database connection set up and connected. In my case, I'm using MySQL; it supports Microsoft SQL Server as well, and we'll probably add Postgres and Oracle over the next few days. Sorry, Phil [Turmel]. We were scrambling to get it done; we'd prefer to have all of them, but we will shortly. After you have that in place, you just load up the project and set up the default database connection, which I have done already; that's just under your project properties. Pop in here, give it the default database, whatever database you want to use. And we made it in a way that you can import it into an existing project if you want to as well, so it doesn't have to be a standalone project.
Kevin McClusky: You can have this as part of your bigger overall setup. But I imported it into its own project in this case. And I just set up a little page configuration so I'd be able to launch it. And that's really it. So I'm going to come over here, go to my tools, launch Perspective session. Pops up a session. And this is the home screen. Right now, you can see it's blank. There's not much there. But this is a machine learning manager, and it is intended to manage models that you're able to create. So you can see models right here. It lists that out. And if I hit “Create” down at the bottom right, which you probably expect to do this, it takes you right through. So it's got something that walks you through across the top. So create description, select type, set parameters, configure the training data. I'll come in and I'll take a look. So I created some example items inside here for product inspection. And we've got some values that are basically simulated out of all of this. And so I mentioned a few things earlier. You'll notice that these are similar to the example that I gave. So it's weight, radius, temperature, chroma for something that's being inspected as it moves down a line.
Kevin McClusky: So I'll just call this “Inspection” and I'll give this a description: “Categorization.” Categorization. I can never type when I'm in front of an audience, I apologize. “Categorization for patties.” Let's say meat patties or whatever type of patties they are. I'll hit “Next.” And then we get our options here. These are our first options for what we actually want to do; they're the algorithms. Algorithms are broadly classified into types: there are clustering algorithms, decision-tree algorithms, neural-network algorithms, and regression algorithms, and those are the four major types that you have. You'll notice, as I go through here, that we've done a lot of work; we've created a lot of configurations for these different algorithms that you might want to use. A special callout to our team, who has been working heavily on this: Connor and Mitchell and Madiha have done a great job. If we can give them a quick round of applause.
Kevin McClusky: Thanks, folks. So I'll go into clustering. And clustering is intended for putting things into different groups, into different categories. So if you know that most of your things are going to go over here, but maybe you have another group over here and another group over here. Clustering can be really good at that. So in the patty example, let's say they're going down the line and if you graph it out later, you can see most of them based on those variables are sitting right about here. But then you have some that are off to the left and some that are off to the right. And you know that these are going to be bad because there are certain things that are moving out of range, that could be a good example for where it could be useful to do a clustering algorithm. So I'll use k-means in this case, k-means++. You don't have to know the etymology of these algorithms. They just have names that sometimes are kind of funny. There's always a reason, but doesn't really matter for when you're using them from a practical standpoint. And I'm going to hit “Next” here. I get to pick out the number of clusters.
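For a feel of what the clustering step is doing under the hood, here is a toy k-means sketch on one-dimensional values. It is deliberately simplified: real k-means++ seeds its centroids probabilistically (this sketch spreads them evenly over the sorted data so the result is deterministic), and the actual demo relies on the bundled Apache Commons Math implementation rather than hand-rolled code. The reject/good reading values are invented.

```python
# Minimal Lloyd's-algorithm k-means on 1-D readings, k >= 2.
# Deterministic, evenly spread initial centroids stand in for the
# probabilistic k-means++ seeding used by real implementations.
def kmeans(points, k, iterations=20):
    pts = sorted(points)
    centroids = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in pts:
            # assign each point to its nearest centroid
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # move each centroid to the mean of its assigned points
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Three obvious groups: low rejects, good product, high rejects.
readings = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
print([round(c, 2) for c in kmeans(readings, 3)])  # [1.03, 5.0, 9.0]
```

Picking "3" for the number of clusters here mirrors the demo: one good group plus the distinct reject groups you already know your process produces.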
Kevin McClusky: We tried to give you a little bit of intelligent information here in this info bubble. In my case, I happen to know that the process I'm simulating has three different basic types: good, and then different types of rejects. So I'll give it three clusters. Max iterations, this is for if you expect it to take a really long time; you can basically limit how many times it'll go through. And if you don't know what these things right here are, distance measure, empty cluster strategy, what am I going to do for these? You don't have to know. You could just hit “Next.” Or if you want to inspect it or go into it a little bit, we actually have this little link up on the right, a “view documentation” link, that takes you right over to documentation around this whole thing. I actually dropped that on the desktop in case I didn't have access to it quickly, but it comes along with the resource. This is it here, “Machine Learning Manager Documentation.” It talks about the database columns. I didn't mention it, but it autocreates the tables inside the database for you, which is why we don't have Postgres yet; the CREATE TABLE syntax is a little bit different, but we'll be adding it shortly.
Kevin McClusky: As I mentioned, you probably don't care about Oracle. Most people don't run Oracle inside their industrial applications, but a few do. Anyway, all of these tables are autocreated, autogenerated for you as soon as you launch it. This is describing the structure there. And then if we're talking about k-means, it gives you a link over to k-means clustering, and you can get as deep as you want with all of this. You can take a look at what the different parameters actually mean in the algorithm and learn to your heart's content. We try to make all that information available. You go through here; I'm happy with those options, so I'm going to hit “Next.” And then this is where you get the data to train the model. Every model needs to know what's happened in the past so it can predict the future, or needs to have some sense of something to look at so it can do whatever it needs to do as a model. So in our case, you can either upload a CSV or retrieve tag history; those are the options at the bottom here in this resource. I'm going to do tag history. I think most people will end up doing tag history for these things, since we have a great system that stores history and you normally need history for these types of things.
Kevin McClusky: I'm going to give it a start date, which will happen to be today, and I'll give it an end date of today. We'll just pull in a few records here so that we can actually take a look at them, and I can walk you through some of how that works too. So we'll go from 11:00 AM to noon, let's say. Maybe I'll even pull in a smaller amount; I'd like to show you some of the data cleansing that we have along the way too. Let's do noon to maybe 12:30, and then I can filter out bad data if I want to. No interpolation, load that data. And I can see all these values here.
Kevin McClusky: So I've got a handful of pages of data; I actually ended up with 15 pages. All of this data is just sitting here for me to use, and maybe I'm happy with this. I've got my chroma, radius, temperature, and inspection weight right there. I'm just going to hit “Save.” Create a new model? Yes. One of the really fancy things this system does for you behind the scenes is this: once a model is created, it has to be held onto in memory somewhere, or else you have to recreate it from that training data set every time you want to use it, which is not a good way to do things if you have big training data sets. So what this resource does is create that for you and use system.util.getGlobals to store it in a globally accessible area in memory inside the gateway. Then that model stays accessible going into the future.
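The caching pattern just described, train once, then keep the model object in a gateway-wide dictionary, looks roughly like this. In Ignition the dictionary comes from `system.util.getGlobals()`; a plain module-level dict stands in for it below, and the trivial "model" and its trainer are hypothetical.

```python
# Sketch: cache an expensive trained model in a shared dict so every
# later call reuses it instead of retraining from the data set.
_globals = {}  # stand-in for system.util.getGlobals() in Ignition

def get_model(name, train):
    """Return the cached model `name`, training it only on first request."""
    cache = _globals.setdefault("ml_models", {})
    if name not in cache:
        cache[name] = train()          # the expensive step happens once
    return cache[name]

training_runs = []
def train_inspection_model():
    training_runs.append(1)            # count how often training happens
    return {"centroids": [1.0, 5.0, 9.0]}  # fabricated trained model

m1 = get_model("inspection", train_inspection_model)
m2 = get_model("inspection", train_inspection_model)
print(m1 is m2, len(training_runs))  # True 1
```

Because the dict lives in gateway memory, any scope with scripting (tag events, Perspective sessions, timers) can fetch the same already-trained object by name.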
Kevin McClusky: If you restart your Ignition gateway, as long as you're using this Exchange resource and you've imported it, there's a project startup script inside that will reinstantiate all the models you've already created, automatically, as it gets going again. So you don't have to worry about any of that. So I have this, it's trained now; I'm going to hit “Inspect” and pop over here. And that looks interesting, right? You might not really know what you're looking at, but this isn't what I would expect. I intentionally did something wrong here. The idea is that you want to get some visuals from your model, take a look at it, and understand, “Oh, is this right, is this wrong?” and then adjust as you need to. In this case, I've got cluster populations here, and I can see my big thing right there is T_Stamp, timestamp. You don't want to use a timestamp when you're trying to cluster things. So I'm going to hit “Update” right here, come right back over, and we've got some data cleansing tools. I'm going to hit “Delete column,” and that'll just drop that right out.
Kevin McClusky: And then some of the other data cleansing tools will let us do things like delete rows. Maybe I know that my zero values are bad for my data set; those aren't actually valid values. So I could come through and say, alright, I'm going to toss this one, I'm going to toss that one, and maybe I wanna toss that row as well. And yeah, there are actually quite a few zeros here, which is interesting. In any case, I've cleaned that up, and I hit “Save.” This is going to look a lot different now. So there we go. That's what a good model looks like. And you can see right here, based on the values that I have coming through, these guys are right about midrange, these guys are the high ones, these guys are the low ones, and you've got these different clusters. Cluster three is basically the good-product cluster over here. Clusters one and two, I know if something drops into those, then that probably isn't a great thing, and those might end up being rejects.
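Once the clusters exist, using them is just a nearest-centroid lookup: a new reading belongs to whichever cluster center it sits closest to, and anything outside the "good" cluster is a likely reject. The centroid values and the choice of cluster three as "good" below are made up to match the narration; a real setup would read them from the trained model.

```python
# Sketch: score a new reading against learned 1-D cluster centroids and
# flag it as a reject if it doesn't land in the good-product cluster.
centroids = {1: 1.0, 2: 9.0, 3: 5.0}  # cluster id -> centroid (fabricated)
GOOD_CLUSTER = 3                       # cluster three = good product

def assign(value):
    """Id of the cluster whose centroid is nearest to `value`."""
    return min(centroids, key=lambda cid: abs(value - centroids[cid]))

def is_reject(value):
    return assign(value) != GOOD_CLUSTER

print(assign(5.2), is_reject(5.2))  # 3 False  (near the good centroid)
print(assign(8.7), is_reject(8.7))  # 2 True   (near a reject centroid)
```

This is the piece you would wire into a tag so downstream logic can act on it, which is exactly where the demo goes next.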
Kevin McClusky: And I can do things like take a product that's going down the line and kick it off early in order to save some work. Beef patties aren't a great example of something where that's going to make a big difference, but electronics production is. If you're doing semiconductors and you've got work in progress, some of those processes literally have 1,000 process steps; some of them have three or four hundred. If you pull a wafer off at step 27 instead of running it all the way through, you're literally saving thousands of dollars for each one that you might be running through. And there's everything in between. So for these clusters right here, of course, the next step now that I have this model in place is that I want to be able to use it. We have a section here, and just to make the most of our time and leave a few minutes for Q&A at the end, I'll go through this quickly. You can individually, manually put this information in. So you can say, I'm going to type these in and see which cluster it goes into, and feel free to play around with that.
Kevin McClusky: Or, what most folks are probably going to end up doing, you can automatically determine which cluster those values fall into. So I have these tags right here. Wouldn't it be nice if I had another tag that I could just alarm on, so that if these are going to go into cluster one or two, I know I have a problem? I want to send an alarm on that. I want to affect the system, I want to have a tag change script on it that feeds out to something else over web services, or to a maintenance management system, or whatever it happens to be, right? Once you have that value in a tag, you can do anything with it inside Ignition. So I'll go ahead and create a new tag. This tag is going to be an expression tag in this case. And once again, to make the most of my time, instead of typing it out I'll copy and paste it; I've got it right here as well, and I'll walk you through the basics anyway. So inside this expression tag, I'll come in and drop the expression right here.
Kevin McClusky: And so I'm basically calling this process-model-by-tag-path script. I'm going to refresh it every second because I want that information back right away, and I happen to know this model runs quickly. You can easily overwhelm a system if you're trying to refresh models that run very slowly, so try not to do that too much. And of course, monitor your Ignition gateway and your CPU utilization and all of that as you're going through anything like this. Sometimes folks will put this processing on a separate Ignition gateway entirely, so that you're not affecting a main production system if you're going to use this type of thing heavily. If you're just going to use it lightly, it's normally fine to put it on your existing gateways. And so then this is sending out to the model. I'm going to give it the model name, whatever I called this guy, so I called this guy “Inspect,” and I'll put “Inspect” right here. And then these are the tag paths, so I simply copy and paste tag paths: just right-click here, copy path, and paste it right in here. And I'm going to hit “Apply” and hit “Okay” right there.
Kevin McClusky: As long as I didn't do anything wrong, I would expect that to pop up. Of course, this is a live demo, and there's a possibility that I did get something wrong along the way. Let me just check my model name here. It's called “Inspection” right there. I know that I had another model in here called “Test,” so I can feed...
Kathy Applebaum: You called it “Inspect.”
Kevin McClusky: Oh, inspect.
Kathy Applebaum: So you need to change it to “Inspection.”
Kevin McClusky: Oh, I did. Yeah. Alright. Kathy, you're really using your machine learning expertise and putting it to good use there.
Kathy Applebaum: And my developer experience.
Kevin McClusky: And beautiful. Yeah, there we go. So the category is three; the category is now one, and you saw that change on the fly right there. This is all based on the values feeding in right there: chroma, radius, temperature, weight. And then I can take that category, set up an alarm on it, process anything that I want off of it, and that just feeds right back in there. This model, as I mentioned, has to be trained, so every time Ignition restarts, it'll automatically retrain itself and be available, and that category will keep being computed from it going into the future. And most of the time it's category three, it looks like, which is great. That's exactly what we want to see. But as soon as it drops into category one or category two, do something with it.
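The alarm logic Kevin describes can be sketched in plain Python. The names here (`BAD_CLUSTERS`, `should_alarm`) are purely illustrative, not part of the Exchange resource's API; in Ignition this check would live in a Jython tag change script reading the expression tag's new value.

```python
# Illustrative names only; these are not part of the Exchange resource's API.
BAD_CLUSTERS = {1, 2}  # clusters that correspond to likely rejects in the demo
GOOD_CLUSTER = 3       # the "good product" cluster

def should_alarm(predicted_cluster):
    """Return True when the model's predicted cluster warrants an alarm."""
    return predicted_cluster in BAD_CLUSTERS

# In Ignition, this check would live in a tag change script on the expression
# tag; on True you might write to a status tag, raise an alarm, or call out
# to a maintenance management system.
print(should_alarm(1), should_alarm(3))  # -> True False
```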
Kevin McClusky: Alright, let me jump to the rest of the presentation so we can leave a couple of minutes for Q&A at the end. One really nice thing about this resource is that it's built to be extensible. You do need a Python programmer, somebody who at least knows what a class is in Python and knows what an interface is. But we created interfaces that are available that you can wrap other things with. So you could wrap things that actually go into TensorFlow or PyTorch. It'd take a little bit of work to set that up, but the model is extensible, and you can get those models directly inside Ignition. As long as you add them from Ignition, you'd get them right there, you'd be able to interact with them and do whatever you want over whatever interface you set up. Same thing for the cloud: it should be possible to wrap RESTful interfaces and web services to do cloud models from this same framework that we have. And if we go back to those practical applications, the question really is going to be, “How do I get there from here?” You've seen how to create a model. But if you're just taking a look at your models and asking, “How do I know if I've got to do something with this product as it's going down the line?” Sure, you might try out k-means++ because I just showed it to you. But what's the right option?
Kevin McClusky: And what if you wanna do process tuning, and what if you wanna do parametrization for different things? I demonstrated anomaly detection, but each one of these has different potential practical approaches.
Kathy Applebaum: Yeah, so k-means++ is actually my go-to for anomaly detection, because anomaly detection is a categorization type of thing: we're trying to categorize into normal and abnormal. But sometimes we just don't have enough information to do it well, or we may wanna try another approach. For example, k-means++ does take a while to train models if you have a lot of data. If you have millions or billions of data points, it's gonna be a slow process. So there are some other algorithms that are not included in our built-in libraries that you might look at. One is called k-nearest neighbor, which does not generally require pretraining. Instead, it takes your new data point, finds the nearest data points to it, and says your new one is probably like these others. Another one would be a random forest, some kind of decision tree, which can work surprisingly well at finding anomalies. For preventative maintenance and predictive failure, here we're usually trying to estimate a value: either how long it's going to be before this thing fails or needs maintenance, or, for a given window, say next month, what is the probability that it's going to fail? And so we want one of the regression algorithms. Polynomial regression works really well.
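Kathy's k-nearest-neighbor description, where there's no pretraining and all the work happens at prediction time, can be sketched in a few lines of plain Python. The data here is made up; a production system would use a library implementation.

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (features, label) pairs. There is no training step:
    k-nearest neighbor is "lazy," so all the work happens at prediction time."""
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up data: two sensor readings per sample, labeled normal/abnormal.
history = [
    ((1.0, 1.1), "normal"), ((0.9, 1.0), "normal"), ((1.1, 0.9), "normal"),
    ((5.0, 5.2), "abnormal"), ((4.8, 5.1), "abnormal"),
]
print(knn_classify(history, (1.05, 1.0)))  # -> normal
```

Note the trade-off Kathy mentions: skipping pretraining means every prediction scans the whole data set, which is why k-nearest neighbor gets slow at prediction time on very large histories.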
Kathy Applebaum: There's also a version of decision forest that predicts values, and again, that can often work very, very well on something like this. Rejecting bad products: again, this is a categorization type thing, and for categorization my go-tos are gonna be k-means, k-nearest neighbor, and random forest. Another one that works surprisingly well is support vector machines. Support vector machines transform your data and try to put it on one side or another of a hyperplane. We can just think of a hyperplane as a line: there's a dividing line between good data and bad data, or category one and category two. So that's another one I would try if you're not doing really well with the others. For process tuning, we're trying to predict a value, a setting. If you've got fruit that you're drying, how long and at what temperature do you wanna dry it? It's gonna depend on the size of your fruit, the sugar content, and today's temperature and humidity. So regression, polynomial regression, is really great for that. And again, decision forest works really well for so many things. If you try nothing else, try that first for anything, because it almost always works.
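The hyperplane picture Kathy describes can be sketched like this. The weights below are hand-picked for illustration; finding a good separating hyperplane from labeled data is exactly the training problem an SVM library solves.

```python
def hyperplane_side(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0
    the point x falls on. This is how a trained SVM makes its decision;
    learning good w and b from labeled data is the training step itself."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hand-picked "line" for two features; a real SVM would learn these values.
w, b = (1.0, 1.0), -10.0
print(hyperplane_side(w, b, (3.0, 4.0)))  # 3 + 4 - 10 < 0 -> -1 (one category)
print(hyperplane_side(w, b, (7.0, 6.0)))  # 7 + 6 - 10 > 0 -> +1 (the other)
```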
Kathy Applebaum: And the digital twin simulation. Boy, you could do any of these things with a digital twin. You could do anomaly detection, preventative maintenance, process tuning, so you're probably gonna be throwing a lot of different algorithms at your digital twin to kind of see how much you can glean from what you're getting back.
Kevin McClusky: So we wanted to take all that information that's in Kathy's head, get it out of Kathy's head, and make it available to you. One thing that we're working on finishing up right now, and that we're going to have available probably right when the virtual ICC runs, is a companion guide for everything that we're talking about here. How do you get from here to there? How do you approach certain problems? What algorithms do you look at? How do you walk through process tuning? What kind of data sanitization do you do along the way? We didn't really talk about data cleansing much other than showing you a couple of things, but that's a really important part of the process: making sure you have good data, making sure that the data is consistent and synchronized. I pulled in some values that were prescaled, but you wanna make sure that they're scaled appropriately. For k-means or any of these clustering algorithms, if you have one column that's completely off from the scale of the others, say a timestamp with a huge scale while everything else is small, it's not gonna work very well for you. So making sure that everything is scaled in the right way, and that you've cleaned up your data in a way these algorithms can use, is an important piece of it as well. So the...
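Kevin's timestamp example comes down to scaling. A minimal min-max scaling sketch, with made-up numbers, shows how a raw epoch timestamp column would otherwise swamp every other feature in a distance calculation:

```python
def min_max_scale(rows):
    """Scale each column of `rows` to the range [0, 1] so that a
    large-magnitude column (like a raw epoch timestamp) can't dominate
    distance-based algorithms such as k-means."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi != lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

# A timestamp column (huge scale) next to a temperature column (small scale).
raw = [(1700000000, 71.2), (1700000600, 70.9), (1700001200, 98.5)]
scaled = min_max_scale(raw)
# After scaling, both columns run 0..1, so the unusual temperature in the
# third row shows up in a distance calculation instead of being drowned out.
```

Dropping the timestamp column entirely, as Kevin did in the demo, is the other valid fix: a monotonically increasing clock carries no clustering information about product quality.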
Kathy Applebaum: Data cleansing is probably the most important step.
Kevin McClusky: Yeah. So the guide will include information about that, and then a bit of a step-by-step. Not necessarily “click this thing and do this exact thing,” but you're all engineers, you're all intelligent, you can follow a guide that doesn't tell you exactly what to click or where to go. It'll have the general overall guidance: set up this type of model, train it with this type of data, watch out for these things, especially if you're just getting started. So that's it. We've got a couple of minutes for questions and answers. I know that there are mic runners, so if you have anything, go ahead and raise your hand and they'll bring a mic around to you.
Audience Member 1: Hello. Okay. I just wanted to ask, does your Machine Learning Manager support testing data sets if you wanna verify the accuracy and precision of your model that you end up with?
Kevin McClusky: Yeah, yeah, that's a very good question. I might be able to pull it back up real quick. So I had talked about testing individual values; it does the same type of thing with data sets as well, which is really nice. So if I come into my inspection model, hit “Inspect” on it, and go to process, this is where you do that.
Kevin McClusky: So for test data, you can do the manual data entry right here, but you can also upload a CSV or pull in additional tag history, and it's gonna give you the predictions for all of that. And then of course, using Ignition's tools for graphing and charting or whatever else, if you don't wanna just read through line by line and you wanna understand more of the overall picture, you've got that information right there too. It'll just come back into a data set that you can do whatever you want with.
Audience Member 1: Thank you.
Kevin McClusky: Yeah.
Audience Member 2: I think I'm up here. Yeah, so along those lines, this seems like a really great tool to implement if you kind of already know what you're looking at. But traditionally you spend way more time doing data cleansing, as well as running models and rechecking, and what I didn't see here was the ability to look at correlations between input parameters, or splitting training data versus test data, and running that iterative process for refining those models, looking at which parameters are significant and how significant they are. And I guess the question is, would you recommend still doing a lot of that frontend work somewhere else, like R, and then bringing in test data once you kind of know what your model looks like, or would you recommend trying to do more and more of that inside the ML manager here?
Kevin McClusky: Yeah, yeah, good question. So if you're already an expert with all of these, actually, both of those methods could work. There's nothing wrong at all with doing all of that inside R or inside one of the other systems that you might be using, pulling back whatever data set you find creates a good model, and then creating that model in Ignition. We've even had a couple of folks integrate Ignition with R models before. You can technically do that; there are a few more hoops to jump through there because it's not easily exposed over web services, but there are some libraries that you can pull in to create an Ignition module. That's kind of beside the point, though. So yes, you can pull those test data sets into this.
Kevin McClusky: This is also brand new, and one of the things that we would love is community involvement. So this is a resource, it's on the Exchange; you can download it, open it up inside the Designer, see how it's built, see what the different script functions are behind the scenes, how the objects are built, all of that. And I would love to have more of the things that you just mentioned, because I've done that in a bunch of other frameworks too, right? There's a workflow, a process that you generally go through when you're going to build out a model, make sure that it works well, and verify it. If anybody here wants to contribute to this resource, absolutely hop in, create a couple of screens, create another step along the way, and send it back over to our team. We're happy to incorporate that back into what we have here and keep making it better.
Kathy Applebaum: And for our non-R experts: you may have noticed that there was a number up above the final model. That's actually a score for how well the model performed. So you can try just deleting a row of data and seeing whether the score improves or gets worse. You can try some of that leaving out data, or adding extra data, right here in the resource.
Kevin McClusky: And there are different visualizations for different models in here. I didn't have time to go through all the different models, but our team spent a fair amount of effort, in the time they had available to put this together, on giving you information that looks different for k-means than it does for a regression algorithm, for example. Alright, I think we've got time for one more question. I know we're technically over, but I think that there is...
Audience Member 3: Yeah, got one here Kevin.
Kevin McClusky: Yeah. Go ahead.
Audience Member 3: I noticed you trained it on tags and didn't go to the UDT tab, is it gonna work if you've got lots of exactly the same machines that you might wanna train it on the whole data set or did you select tags just to make it train faster in this case, on one device?
Kevin McClusky: It's just a little bit hard to hear with the noise coming from the sides, but I think I got what you said. So if you have 20 different machines that are the same type, is it better to train from a set of tags that are from one of those machines and then apply it to all of them, or is it better to pull that data together and train from all of that. Is that what the question was?
Audience Member 3: Yeah, sorry, can you use UDTs essentially, rather than the direct tags to train it?
Kevin McClusky: Right, right. Yes, good question. As of right now, the resource that we have doesn't have anything that would automatically pull those from different instances of the same UDT, but that's something that I've been thinking about as well, and it would be really nice to incorporate in the future. Kind of the same answer as the last one: if you wanna do something like that and contribute it back, we'd love that. It's something that our team might get to at some point in the future regardless, and incorporate back into the overall resource. Yeah. Alright, I see one more hand over here. I think we'll do the last question over here and then we'll call it a session, but yeah, please go ahead.
Audience Member 4: Do any of the algorithms support once you generated the model, tuning the model by hand afterwards, like if you wanted in your patty inspection process, if you wanted to loosen it up while it's running to allow more bad patties through or less quality patties through, or would you have to generate new test data and retrain the model?
Kevin McClusky: Yeah, so retraining a model on the fly as opposed to... So there are a couple of different types. With the traditional models, and I think that's all of the ones that we have here, basically you train, and then you can retrain with a new training data set, or you can take the existing data set, add rows to incorporate the new conditions, and basically use that as a retraining data set. What I showed a few minutes ago was the section that has the ability to retrain: you can click “Retrain,” modify that data set, add to it, and then train on the bigger data set based on the new data that you have. There are some models that have continuous training, right?
Kathy Applebaum: Right. I didn't notice any in here, I think these are all pretty much you throw away the old model and build a new one when you retrain.
Kevin McClusky: Right, right. But some of the examples would be in TensorFlow, right? And in scikit-learn and in some of the ML things that are available in the cloud, right?
Kathy Applebaum: Yeah, for sure.
Kevin McClusky: Are there a couple of examples of models like that that you can think of?
Kathy Applebaum: I'm blanking on them.
Kevin McClusky: Okay. That's fine. But yeah, the continuous-training models aren't used as much, partially because sometimes those will also mask drift. If you train with a good data set and then your motors start wearing out and your data shifts, things still look like they're working, but they're shifting from your baseline. If you retrain with that data, or your continuous training is using that data and calling it good data, then you'll never notice the drift, because your definition of good data drifts right along with it. Things can drift far enough that equipment starts breaking before the model ever predicts that there's gonna be a problem.
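The drift problem Kevin describes can be shown with a toy sketch (made-up readings and threshold): a baseline that is continuously retrained on incoming data follows the drift and never flags it, while a frozen baseline eventually does.

```python
def is_anomaly(value, center, tolerance=2.0):
    """Flag a reading that falls more than `tolerance` away from `center`."""
    return abs(value - center) > tolerance

# Made-up motor vibration slowly creeping up as a bearing wears out.
readings = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7]

frozen_center = readings[0]   # baseline fixed when the model was first trained
rolling_center = readings[0]  # baseline that "continuously retrains" on new data
frozen_flags, rolling_flags = [], []
for r in readings:
    frozen_flags.append(is_anomaly(r, frozen_center))
    rolling_flags.append(is_anomaly(r, rolling_center))
    rolling_center = r  # the rolling baseline follows the drifting data

# The frozen baseline eventually flags the drift; the rolling baseline never
# does, because each new reading looks normal relative to the last one.
print(any(frozen_flags), any(rolling_flags))  # -> True False
```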
Kathy Applebaum: Yeah. Although I would say that, for example, the k-nearest neighbor that I talked about isn't an example of continuous training; it's actually an example of a model that's never trained. It's a lazy model: you're doing everything on the fly at prediction time.
Kevin McClusky: Alright, so I think I'm gonna end the slide show and the session with...
Kevin McClusky: I'm gonna end it with the right slide here, so.
Kathy Applebaum: Thank you all.
Kevin McClusky: That's all. Thank you so much for joining, stick around for the Build-a-Thon if you're not flying out, we're super excited about that too, so thanks everybody.
Kathy Applebaum: Thank you.