Objective of the simulation 
The objective of this investigation is to test the capabilities of machine learning tools as applied to process operating data for the purpose of optimization. The technology used is a deep learning neural network model developed in Python with the Keras, TensorFlow, and NumPy libraries. 
It is important to note that any optimization based upon historical operating data will only find optimum parameters which exist in that data. It is very unlikely to discover totally new operating regimes using this approach. However, what is possible is to better understand how different process parameters contribute to the overall optimization, and to more precisely define the operating window that leads to a consistently optimum result. 
A possible outcome of this type of investigation, in a real-life application, is that it becomes possible to quantify the economic benefit of tightening the control of a specific process variable. In this example, we seek to drive the process to a more consistent level of optimization by better identifying the importance of each process variable, and the opportunity window that could exist through more precise process control. 
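The effect of tighter control can be illustrated with a small Monte Carlo estimate. The following is a minimal sketch only, assuming a single variable (pH) with a bell-shaped response around a hypothetical optimum of 7.0 and a hypothetical width of 0.6, mapped onto a 70% to 96% yield range. The function name mean_yield and all numeric constants are assumptions made for illustration and are not taken from the investigation.

```python
import math
import random


def mean_yield(ph_sigma, n_batches=10000, seed=0):
    """Estimate the average batch yield (%) when pH scatters around its optimum.

    ph_sigma is the standard deviation of pH from batch to batch; a smaller
    value represents tighter process control. All constants are hypothetical.
    """
    rng = random.Random(seed)
    optimum, width = 7.0, 0.6  # assumed optimum pH and response width
    total = 0.0
    for _ in range(n_batches):
        ph = rng.gauss(optimum, ph_sigma)
        # Bell-shaped response: 1.0 at the optimum, falling off on either side.
        response = math.exp(-((ph - optimum) / width) ** 2)
        # Map the response onto an assumed 70% to 96% yield range.
        total += 70.0 + 26.0 * response
    return total / n_batches


loose = mean_yield(ph_sigma=0.4)  # wide scatter in pH
tight = mean_yield(ph_sigma=0.1)  # tighter pH control
gain = tight - loose              # yield points gained by tighter control
```

Multiplying gain by the value of one yield point per batch would give a rough economic figure; that value is site-specific and is not assumed here.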
The basis of the simulation 
This simulation assumes that a pharmaceutical product is being produced in a stirred fermentation reactor on a batch basis. Process parameters are collected during the fermentation step and the averages are stored in a table. The matching off-line laboratory analysis on the yield of product from the batch is added to the same table. 
Several variables were used in this simulation. Two variables, called pH and Temperature, exhibit highly non-linear response curves, in keeping with the fact that micro-organisms usually grow best at an optimum value and that growth falls off quickly either above or below that value. Another hypothetical variable, called Conductivity, has a quadratic effect on the product yield. Yet another variable, referred to as Redox potential, has a linear effect on yield within the operating envelope of the investigation. 
Other variables, called Dissolved Oxygen and Agitator Power, have operating data but do not have any influence upon yield within this range of operating parameters. They are only present to see if the neural network can discern that they are not contributing. 
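One common way to check whether a trained model relies on a given input is permutation importance: shuffle one input column and measure how much the prediction error grows. This technique is not described in the investigation itself and is named here only as an illustration. The sketch below uses a hypothetical stand-in function in place of a trained network, and the column names and constants are assumptions.

```python
import math
import random


def stand_in_model(row):
    """Hypothetical stand-in for a trained network: uses pH, ignores dissolved oxygen."""
    return 70.0 + 26.0 * math.exp(-((row["ph"] - 7.0) / 0.6) ** 2)


def mse(model, rows, targets):
    """Mean squared error of the model's predictions against the targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, column, seed=0):
    """Increase in mean squared error after shuffling one input column."""
    rng = random.Random(seed)
    shuffled_values = [r[column] for r in rows]
    rng.shuffle(shuffled_values)
    # Copy each row, replacing only the chosen column with a shuffled value.
    shuffled_rows = [dict(r, **{column: v}) for r, v in zip(rows, shuffled_values)]
    return mse(model, shuffled_rows, targets) - mse(model, rows, targets)


# Hypothetical batch records with one influential and one non-influential input.
data_rng = random.Random(1)
rows = [
    {"ph": data_rng.uniform(6.0, 8.0), "dissolved_oxygen": data_rng.uniform(0.0, 1.0)}
    for _ in range(200)
]
targets = [stand_in_model(r) for r in rows]

ph_importance = permutation_importance(stand_in_model, rows, targets, "ph")
do_importance = permutation_importance(stand_in_model, rows, targets, "dissolved_oxygen")
```

Because the stand-in ignores dissolved oxygen entirely, shuffling that column leaves the error unchanged, while shuffling pH increases it. With a real trained network, a non-contributing input would be expected to show an importance close to zero rather than exactly zero.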
Finally, random noise is also introduced into the simulation. This noise is not an input that the neural network can see; it only affects the final batch yield in a random manner. This was done to reflect the more realistic situation in which not all of the process parameters that affect yield are monitored and recorded. 
The mathematical simulation was carried out using VB.NET to create a file with 200 batch records. The simulation was designed so that yields fell between 70% and 96% of a theoretical maximum. 
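The structure of such a simulation can be sketched in Python. This is not the original VB.NET program: the variable ranges, optimum values, weights, and noise level below are all hypothetical, chosen only to reproduce the described response shapes (bell-shaped for pH and Temperature, quadratic for Conductivity, linear for Redox potential, none for Dissolved Oxygen and Agitator Power) and the 70% to 96% yield range. Clipping the noisy score so that yields stay inside that range is likewise an assumption.

```python
import math
import random


def bell(x, optimum, width):
    """Bell-shaped response: 1.0 at the optimum, dropping quickly on either side."""
    return math.exp(-((x - optimum) / width) ** 2)


def simulate_batches(n_batches=200, seed=0):
    """Generate hypothetical batch records; every constant here is an assumption."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_batches):
        ph = rng.uniform(6.0, 8.0)
        temperature = rng.uniform(30.0, 40.0)
        conductivity = rng.uniform(0.0, 1.0)      # normalized units
        redox = rng.uniform(0.0, 1.0)             # normalized units
        dissolved_oxygen = rng.uniform(0.0, 1.0)  # recorded, no effect on yield
        agitator_power = rng.uniform(0.0, 1.0)    # recorded, no effect on yield

        # Weighted score in [0, 1]; the weights sum to 1.
        score = (
            0.35 * bell(ph, 7.0, 0.6)
            + 0.35 * bell(temperature, 35.0, 3.0)
            + 0.15 * (1.0 - (2.0 * conductivity - 1.0) ** 2)  # quadratic, peak at 0.5
            + 0.15 * redox                                     # linear
        )
        # Unobserved noise affects the yield but is not stored as an input.
        score += rng.gauss(0.0, 0.02)
        score = min(1.0, max(0.0, score))  # keep the yield inside the target range

        records.append({
            "ph": ph,
            "temperature": temperature,
            "conductivity": conductivity,
            "redox": redox,
            "dissolved_oxygen": dissolved_oxygen,
            "agitator_power": agitator_power,
            "yield_pct": 70.0 + 26.0 * score,
        })
    return records


records = simulate_batches()
```

Each record holds the six monitored inputs and the resulting yield, which mirrors the table of batch averages and laboratory results described earlier. The same logic could be written with NumPy arrays; the standard library is used here only to keep the sketch self-contained.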

copyright 2019 Powell Simulation LLC