Hello, everyone, and please welcome Donald Deuxer to the virtual stage. He is Korelen’s Director of Data Science, and his session today is titled “Case Study on Using Machine Learning in the Aluminum Industry.” Please greet Donald; you are invited to join us here with your camera and slides. Hello.
Hello, Shelly. Thank you for introducing me. Okay, I’ll switch off my camera now, and you can share your slides and take it from here.
That’s fantastic. Can you see my screen now? Yes, that looks good. So, everyone, good afternoon. Today, I’ll discuss a case study on machine learning in the aluminum business.
Before I begin the case study, let me tell you a little bit about my employer, Korelen. We are a major manufacturer of flat-rolled aluminum products. What you see at the top right is our finished product, which is delivered to our customers. We are the world’s largest aluminum recycler. We sell rolled aluminum to four major markets: beverage cans, automotive, aerospace, and specialty applications. Our top clients are listed at the bottom. So, if you’ve ever used a beverage can or bought a car, you’ve probably come into contact with our aluminum. We are a multinational corporation with operations in nine countries and 33 sites employing over 15,000 people.
So, it’s a worldwide manufacturing corporation, and the use of data science is new to this company. We began our digital journey roughly two years ago by establishing a Center of Excellence. The concept behind this Center of Excellence was to use digital technologies to tackle our difficult business problems.
To begin, we concentrated on operations for two reasons. First, it’s our bread and butter, so we can make a big difference by incorporating digital technologies into our operations. Second, we track a large amount of complex data throughout our operations.
To give you an idea, each of our machine centers has close to a thousand sensors that record data at very short intervals, ranging from 10 milliseconds to one minute. So it is big data in the proper sense of the term. We use advanced analytics approaches, primarily machine learning, linear and nonlinear optimization, and statistical modeling. On the statistical modeling side we prefer Bayesian methods, and because we favor open-source tools, we typically build our models and machine learning workflows with Python and Spark.
We use cloud technologies in our facilities and have used several off-the-shelf applications to help plant operational engineers benefit from these technologies.
As a result, our Center of Excellence has a dual mandate. We want to add value to our operations, but in order to do so, we must grow capabilities within our factories. So my team is working on both of those things. We currently have seven data scientists working on data science use cases for our organization across the globe in various factories.
This is an example of a typical workflow for a use case. As I said, we have several machine centers in our factories that track numerous signals at very fine intervals. This data has historically been housed in separate plant databases, and one of the things we are working to fix is that fragmentation, by bringing the data to a common location in our data lake. But before we do that, we normally start with our local data sources and build models that address business challenges using machine learning and other advanced analytics techniques.
We work with our data engineering team to bring relevant data into the cloud, into our data lake, and then we productionize our models into the cloud once we see a successful model being built. Finally, we communicate the predictions back to various interfaces used by plant operations.
I’m going to discuss one of the case studies in which we’ve had fantastic success implementing machine learning in our operations.
As previously said, we are an aluminum recycler, so we create aluminum coils by purchasing and recycling aluminum. We have big melters that melt discarded beverage cans or any other recycled aluminum material. So, forecasting the melting time accurately is critical for our operation to have a high throughput. Previously, we used a thermodynamics-based equation to make this prediction. However, for a variety of reasons, this forecast was inaccurate, resulting in overcooking of the molten metal. This leads to two issues. One, it lengthens the process time to melt the aluminum, and two, it lengthens the wait time before the molten aluminum can be accepted by the following process.
The operators did not trust the model forecasts because they were wrong, and they were constantly opening the furnace doors to take samples and check whether the aluminum had melted. This resulted in three major concerns. One, when you open the door, a lot of heat is lost, which reduces our fuel efficiency. Two, opening the door is highly dangerous, so there is a safety concern in this process. Three, it kept our operators occupied with unnecessary chores; they would not have had to do this job if the model predictions had been correct.
So we chose to concentrate on this use case, with the goal of enhancing the throughput of our melting process by increasing the accuracy of the time to melt and, as a result, saving energy by minimizing heat loss due to door opening. This use case covered one facility with three furnaces. We had some constraints: we didn’t want to physically change the furnace construction, we didn’t want to spend any more money, and we didn’t want to make any big modifications to our operational processes. So we decided to design a model that will use data from numerous sources to accurately alert operators when the aluminum is molten.
This is the process via which the model evolved. As previously stated, we began with a weight-based thermodynamic equation and had a single equation for all of the alloys. That was one of the reasons our predictions were off. So, first and foremost, we updated the equation to include separate equations for different alloys. At the same time, we added the surface area of the molten aluminum to the calculation, which allowed us to reduce the total melting time by three percent.
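The talk does not give the equation itself, so the following is only a minimal sketch of what a weight-based theoretical heat calculation with per-alloy constants could look like. It assumes the usual sensible-heat plus latent-heat form; the property values for pure aluminum are approximate textbook figures, and the second alloy, the names `AlloyProps`, `ALLOYS`, and `theoretical_heat_btu` are hypothetical. The surface-area term mentioned above is left out because the talk does not say how it enters the calculation.

```python
from dataclasses import dataclass

KJ_PER_BTU = 1.055056  # 1 BTU is roughly 1.055 kJ


@dataclass(frozen=True)
class AlloyProps:
    cp_kj_per_kg_k: float    # average specific heat of the solid charge
    latent_kj_per_kg: float  # latent heat of fusion
    melt_temp_c: float       # temperature at which the charge is molten


# Illustrative constants only; a real table would come from metallurgy data.
ALLOYS = {
    "pure_al": AlloyProps(0.90, 397.0, 660.0),  # approximate textbook values
    "alloy_x": AlloyProps(0.88, 390.0, 640.0),  # hypothetical alloy
}


def theoretical_heat_btu(alloy: str, mass_kg: float, start_temp_c: float) -> float:
    """Heat needed to bring a solid charge to its melting point and melt it."""
    props = ALLOYS[alloy]
    sensible_kj = mass_kg * props.cp_kj_per_kg_k * (props.melt_temp_c - start_temp_c)
    latent_kj = mass_kg * props.latent_kj_per_kg
    return (sensible_kj + latent_kj) / KJ_PER_BTU
```

Keeping the constants in a per-alloy table is what separates this from a single equation for all alloys: the formula stays the same, and only the lookup changes with the charge.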
The next stage was to create a fully machine learning-based model, and with that technique, we were able to reduce melting time by 5% on average compared to the baseline. However, the variance was very high, which was unacceptable from a planning standpoint. So, as our final iteration, we took the updated thermodynamic equation as our base and created a machine learning model on top of it, which gave us the best performance. This allowed us to shorten our melting time by 12%.
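One common way to put machine learning on top of a physics equation is to let the learned part predict only the error of the physics estimate. The sketch below illustrates that idea under loud assumptions: a one-feature least-squares fit stands in for the real model, the feature (door openings) is chosen for illustration, and the function names are hypothetical. The talk does not say which learner or which combination scheme was actually used.

```python
def fit_residual_model(physics_btu, door_openings, actual_btu):
    """Fit residual = intercept + slope * door_openings by least squares.

    The residual is the gap between the measured heat and the physics
    estimate, so the learned part only has to explain what physics misses.
    """
    residuals = [a - p for a, p in zip(actual_btu, physics_btu)]
    n = len(residuals)
    mean_x = sum(door_openings) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in door_openings)
    sxy = sum((x - mean_x) * (r - mean_r) for x, r in zip(door_openings, residuals))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_x
    return slope, intercept


def predict_btu(physics_btu, door_openings, model):
    """Physics estimate plus the learned correction."""
    slope, intercept = model
    return physics_btu + intercept + slope * door_openings
```

Because the physics term carries most of the signal, the correction stays small, which is one plausible reason a hybrid of this kind shows less variance than a model learned from scratch.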
I’ll walk you through the model construction process. We use a historian system called PI, which records sensor data at one-minute intervals. Along with that, we have a database that keeps track of all the furnace additions. We combined the two data sources into a single database and engineered additional features. We had 118 candidate features before we started modeling. We also had to adjust our target because, as previously said, we used to overcook the metal, which meant that the recorded BTU exceeded what the metal would have required had it been heated only to a lower temperature.
So we used extrapolation to adjust the BTU measurements and then trained the model on the adjusted values. On the right, I’m showing the feature importance, and you can see that the theoretical heat required accounts for about half of the feature importance. So, the beauty of this model is that the thermodynamics is the foundation, and we have additional features to absorb noise in the process, such as door openings and the addition of other materials, hardeners, and various alloying components. Our prediction accuracy was quite good. We tested this on several batches and achieved an R-squared of 95 percent.
If you come from another industry, this may seem a little high, but keep in mind that for this particular model, the theoretical thermodynamic equation accounts for around half of the prediction.
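Combining the one-minute historian readings with the furnace additions log, as described earlier, amounts to aligning two time series. Here is a minimal sketch that attaches the cumulative charged mass to each sensor reading. The record layouts, field names, and the function `attach_cumulative_additions` are assumptions for illustration; the talk does not describe the actual schemas or which of the 118 features were derived this way.

```python
from bisect import bisect_right
from itertools import accumulate


def attach_cumulative_additions(sensor_rows, additions):
    """Join (minute, temp_c) readings with (minute, mass_kg) additions.

    Each output row carries the total mass charged up to and including
    that minute, which is one example of an engineered feature.
    """
    additions = sorted(additions)
    add_minutes = [minute for minute, _ in additions]
    cumulative = list(accumulate(mass for _, mass in additions))
    rows = []
    for minute, temp_c in sensor_rows:
        idx = bisect_right(add_minutes, minute)  # additions at or before this minute
        charged = cumulative[idx - 1] if idx else 0.0
        rows.append({"minute": minute, "temp_c": temp_c, "charged_kg": charged})
    return rows
```

A backward-looking join like this keeps the features causal: a reading only sees additions that had already happened when it was recorded.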
This model was deployed as a REST API on our plant server. We had planned to deploy it on our cloud resources, but when we discovered that the plant where we were going to use it had bandwidth constraints, we chose to keep the model close to the source, where the decision would be made. This is a direction in which the industry is shifting.
Cloud computing was once the standard, but it is gradually giving way to edge computing. We collect data from the two data sources I mentioned before, and it is sent to a single database known as the automation interface layer, where our feature engineering takes place. The model API is then called every three minutes. When we call the model API, we send the features we created in JSON format. The model uses those features to generate BTU predictions, which it then sends back to the automation interface layer. There, we convert the BTU predictions into the time required to melt the aluminum.
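The round trip described here can be sketched in two small steps: packaging the features as JSON for the model API, and turning the returned BTU prediction into a time. The payload fields, the function names, and the conversion rule (remaining heat divided by a constant firing rate) are all assumptions; the talk gives neither the API schema nor the actual conversion used in the automation interface layer.

```python
import json

POLL_INTERVAL_S = 180  # the talk says the model API is called every three minutes


def build_payload(furnace_id: str, features: dict) -> str:
    """Serialize one feature snapshot as the JSON body of a model request."""
    return json.dumps({"furnace_id": furnace_id, "features": features}, sort_keys=True)


def minutes_to_melt(predicted_btu: float, delivered_btu: float,
                    firing_rate_btu_per_hr: float) -> float:
    """Convert a BTU prediction into remaining melt time.

    Assumes heat arrives at a constant firing rate; clamps at zero once
    the delivered heat reaches the prediction.
    """
    remaining_btu = max(predicted_btu - delivered_btu, 0.0)
    return remaining_btu / firing_rate_btu_per_hr * 60.0
```

Doing the conversion outside the model keeps the model's output in physical units, so the time estimate can follow changes in firing rate without retraining.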
Finally, it is shown to the operators on an HMI. So we display two major elements on the screen. One, how much heat is needed to melt the aluminum, and two, how long it takes to melt the aluminum. A progress indicator depicts the time necessary to melt metal. This allows the operator to look at it and just check it every so often while also performing other jobs. It also prevents them from repeatedly opening the door because they now have an accurate projection of the whole melting time. This greatly aided us; our throughput grew significantly, as did our fuel efficiency, and this use case resulted in multi-million dollar savings for us.
The important takeaways from today’s session are that metal fabrication is a really intriguing business if you are interested in high-dimensional data and big data in every sense of the term. We have a large amount of data being tracked by hundreds of sensors, and you get to conduct some really fascinating analysis and model building on top of that. This business is transitioning from classical statistics and linear modeling to non-linear modeling, so explore metal manufacturing if you want to apply advanced machine learning approaches. My team has learned from this experience that we cannot ignore the basic sciences.
Indeed, our best models are those in which we used physics or thermodynamics as a foundation and then built machine learning on top of it. We are investing a lot in building capabilities within our plants, since advanced analytics is becoming a critical value driver for us. So, within our plants, we offer a variety of curricula and upskilling initiatives. Overall, it’s been a very interesting experience for us, and it’s proving to be a significant value driver for the company.
I’d like to conclude my presentation by thanking you for your time. If you have any questions, please contact me at the email address listed above, and I’ll be pleased to answer them. Thank you very much. Shelly, it’s your turn.
Okay, fantastic; thank you so much. That was a fantastic presentation. Could you please unshare your screen and camera for a moment? Thank you very much for coming, everyone. I hope you had a good time. Have a fantastic time at your next session or at one of our virtual booths.