The acquisition, processing, and utilization of data have become fundamental drivers behind the adoption of emerging technologies such as Cyber-Physical Systems (CPS), the Internet of Things (IoT), predictive maintenance, dynamic scheduling, big data analytics, cloud-based platforms, and digital twins. As a result, efficient data exchange and management are critical—if mishandled, they can become significant obstacles.
In the development of decision support tools, data preparation stands out as the most vital and time-intensive phase. Real-world data often arrives in inconsistent formats, may be incomplete or error-prone, lacks proper structure or traceability, and is frequently contaminated with noise. The core challenge typically lies not with the analytical tools themselves, but with the underlying data communication and integration infrastructure that governs data acquisition and handling.
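To make the data-preparation burden concrete, the sketch below shows the kind of cleaning pass such raw field data typically needs before it can feed a decision support tool. It is a minimal illustration in Python/pandas; the file name and column names (timestamp, sensor_id, value) are hypothetical and not taken from the platform described here.

```python
import pandas as pd

# Hypothetical raw export from a field-level logger; names are illustrative.
raw = pd.read_csv("sensor_log.csv")

# Normalise timestamps that arrive in mixed formats; unparseable ones become NaT.
raw["timestamp"] = pd.to_datetime(raw["timestamp"], errors="coerce")

# Drop rows with an unusable timestamp or a missing reading.
clean = raw.dropna(subset=["timestamp", "value"])

# Remove exact duplicates produced by repeated transmissions.
clean = clean.drop_duplicates()

# Suppress obvious noise with a rolling median filter per sensor.
clean = clean.sort_values("timestamp")
clean["value_filtered"] = (
    clean.groupby("sensor_id")["value"]
    .transform(lambda s: s.rolling(window=5, min_periods=1).median())
)
```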
To enable the effective deployment of dynamic decision support systems (DSSs), next-generation industrial architectures aligned with Industry 4.0 must ensure real-time data consistency, especially in time-sensitive applications. Addressing this need calls for a robust platform capable of automating data acquisition, preprocessing, and management—achieved through the development of an intelligent middleware layer.
This middleware functions as a bridge between field-level devices and higher-level DSSs. It facilitates seamless integration with technologies such as industrial communication protocols, fog computing architectures, and big data storage solutions. A compelling example of this platform's application can be found in pharmaceutical industry quality control labs, where a prototype system integrates an autonomous robotic module for material preparation with a dynamic scheduling decision support tool.
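As an illustration of this bridging role, the following sketch subscribes to field-level messages and persists them for higher-level tools. It assumes an MQTT broker as the industrial communication channel and a local SQLite file standing in for the big data store; the broker address, topic hierarchy, payload fields, and table schema are all hypothetical.

```python
import json
import sqlite3

import paho.mqtt.client as mqtt

# Local store standing in for the big-data back end; schema is illustrative.
db = sqlite3.connect("field_data.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS readings (device TEXT, ts TEXT, value REAL)"
)

def on_message(client, userdata, msg):
    """Persist each field-level reading so higher-level DSS tools can query it."""
    payload = json.loads(msg.payload)
    db.execute(
        "INSERT INTO readings VALUES (?, ?, ?)",
        (payload["device"], payload["ts"], payload["value"]),
    )
    db.commit()

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.local", 1883)   # hypothetical fog-level broker
client.subscribe("plant/+/sensors")    # hypothetical topic hierarchy
client.loop_forever()
```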
The integration of production operations with business planning and logistics relies heavily on the development of data-driven industrial architectures. A key framework for achieving this integration is the automation pyramid, as defined in the ISA-95 standard.
This model consists of five hierarchical layers that collectively enable seamless information flow across industrial environments:
- Field Layer: This base layer includes sensors and actuators, providing real-time data directly from the physical environment. Technologies such as barcodes and Radio-Frequency Identification (RFID) enable resource tracking and monitoring, supplying critical input to intelligent systems. In sectors like pharmaceuticals, smart sensors also support Quality by Design (QbD) approaches through Process Analytical Technologies (PAT).
- Control Layer: At this level, Programmable Logic Controllers (PLCs), robust industrial computers with IP65/67 protection, manage machine-level control with exceptional precision. Their deterministic cycle times (typically under 1 ms) make them ideal for time-sensitive applications such as motion control. A minimal acquisition sketch follows this list.
- Supervisory Layer: The Supervisory Control and Data Acquisition (SCADA) system acts as a bridge between field devices, controllers, and higher-level systems. It is the de facto standard for industrial data acquisition and ensures reliable communication across the automation layers.
- Manufacturing Execution Layer: Manufacturing Execution Systems (MES) monitor, analyze, and report real-time production activities, tracing the full transformation process from raw materials to finished products. As decision-making becomes more decentralized, MES must increasingly integrate with distributed and intelligent systems.
- Enterprise Layer: Enterprise Resource Planning (ERP) systems manage core business functions through modular and integrated applications. With advancements in data science, ERP systems increasingly leverage machine learning and data mining to extract actionable insights from vast datasets.
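The following sketch illustrates how a value could be pulled from the control layer and passed upward, assuming a Modbus/TCP-capable PLC and the pymodbus client library; the PLC address, register map, and scaling factor are hypothetical.

```python
from pymodbus.client import ModbusTcpClient

# Hypothetical PLC address and register map; values here are illustrative only.
PLC_HOST = "192.168.0.10"
TEMPERATURE_REGISTER = 0  # holding register containing a scaled temperature

client = ModbusTcpClient(PLC_HOST)
client.connect()

# Read one holding register from the control layer.
response = client.read_holding_registers(TEMPERATURE_REGISTER, count=1)
if not response.isError():
    # Scale the raw integer into engineering units before passing it upward
    # (e.g., to a SCADA historian or an MES interface).
    temperature_c = response.registers[0] / 10.0
    print(f"Field value forwarded upward: {temperature_c:.1f} °C")

client.close()
```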
Each layer in the automation pyramid plays a distinct role in acquiring, processing, and utilizing data. Traditionally, information flowed hierarchically from bottom to top. However, the rise of mass customization and the need for real-time responsiveness require high-volume, bidirectional data exchange across all layers—continuously, in parallel, and often over internet-based infrastructure.
While the structure of the automation pyramid remains relevant, its static, hierarchical nature is evolving. Emerging technologies are driving a shift toward highly interconnected systems with greater horizontal integration. Direct communication between lower-level devices (e.g., field buses) and higher-level applications now demands enhanced interoperability and standardized communication protocols.
In this transformed landscape, cloud computing has become a central enabler. It offers a globally accessible platform for data storage and management, along with scalable services that enhance flexibility, security, and cost-efficiency. Cloud connectivity reduces the need for local data storage, allowing data to be accessed and analyzed remotely, anytime and anywhere. Cloud-based Industrial IoT (IIoT) platforms such as MindSphere (Siemens) and PI System (OSIsoft) already bring plug-and-play analytics to the process industries.
To complement the cloud, fog computing has emerged as a critical intermediary. It enables local data processing, filters sensor data, and ensures low-latency responses necessary for time-sensitive tasks involving actuators and robotics. Fog nodes act as intelligent gateways between industrial assets and cloud services, managing unstructured data while balancing speed and bandwidth requirements.
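The sketch below illustrates this filtering role: a fog node aggregates raw readings locally and forwards only compact summaries to the cloud. The window size, summary fields, and cloud endpoint are hypothetical.

```python
import statistics
import time
from typing import Iterable

import requests

CLOUD_ENDPOINT = "https://iiot.example.com/ingest"  # hypothetical cloud API

def fog_filter(readings: Iterable[float], window: int = 10) -> None:
    """Aggregate raw sensor values locally and forward only compact summaries,
    trading raw bandwidth for low-latency local processing."""
    buffer = []
    for value in readings:
        buffer.append(value)
        if len(buffer) >= window:
            summary = {
                "ts": time.time(),
                "mean": statistics.fmean(buffer),
                "max": max(buffer),
            }
            # Time-critical checks stay local; only the summary goes upstream.
            requests.post(CLOUD_ENDPOINT, json=summary, timeout=5)
            buffer.clear()
```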
As industrial systems become more interconnected, the need for machines and applications to communicate seamlessly becomes paramount. This requires shared standards, common languages, and unified infrastructure that supports both centralized and decentralized intelligence—ushering in a new era of cyber-physical convergence in the smart factory environment.
USE CASE:
In the pharmaceutical sector, many labor-intensive operations offer significant potential for automation. One particularly complex area is the dynamic scheduling of tasks within Quality Control (QC) laboratories. These labs are essential for ensuring that every drug batch meets strict safety and quality criteria through comprehensive sampling and testing procedures.
Each laboratory test typically comprises six main stages: system preparation, system suitability, sample preparation, analytical run, data processing, and final review. These stages break down into numerous detailed and often customized subtasks, each requiring specific analysts, equipment, or both, depending on the nature of the test.
While production workflows are generally organized through pre-defined schedules, QC operations are often interrupted by the arrival of urgent, high-priority samples that must be addressed immediately. This requires the system to reschedule tasks dynamically, in real time. The challenge lies in coordinating both equipment capabilities and operator skill sets while accounting for resource availability and operational constraints. The result is a dual-resource constrained problem, in which personnel and machines must be scheduled simultaneously.
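A minimal sketch of this dual-resource setting is given below: each subtask needs both a qualified analyst and a suitable instrument, and urgent samples jump the queue under a simple greedy dispatch rule. The data model and the rule are illustrative only and do not reproduce the scheduling algorithm used by the DSS.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass
class Subtask:
    name: str
    duration: float        # hours
    skill: str             # analyst skill required
    instrument: str        # instrument type required
    urgent: bool = False   # urgent samples jump the queue

@dataclass
class Resource:
    name: str
    capabilities: Set[str]
    free_at: float = 0.0   # next time the resource is free (hours from now)

def dispatch(subtasks: List[Subtask],
             analysts: List[Resource],
             instruments: List[Resource]) -> List[Tuple[str, str, str, float]]:
    """Greedy dual-resource dispatch: each subtask waits for both a qualified
    analyst and a suitable instrument; urgent subtasks are considered first."""
    plan = []
    for task in sorted(subtasks, key=lambda t: not t.urgent):
        analyst = min((a for a in analysts if task.skill in a.capabilities),
                      key=lambda a: a.free_at)
        instrument = min((i for i in instruments if task.instrument in i.capabilities),
                         key=lambda i: i.free_at)
        start = max(analyst.free_at, instrument.free_at)
        analyst.free_at = instrument.free_at = start + task.duration
        plan.append((task.name, analyst.name, instrument.name, start))
    return plan

# Example: an urgent assay preparation overtakes a routine one.
analysts = [Resource("Ana", {"HPLC"}), Resource("Ben", {"HPLC", "KF"})]
instruments = [Resource("HPLC-1", {"HPLC"}), Resource("KF-1", {"KF"})]
tasks = [Subtask("routine assay prep", 1.5, "HPLC", "HPLC"),
         Subtask("urgent assay prep", 1.0, "HPLC", "HPLC", urgent=True)]
print(dispatch(tasks, analysts, instruments))
```

When an urgent sample arrives, the not-yet-started subtasks can simply be re-dispatched with the new items flagged as urgent; the actual DSS must handle this at a far richer level of detail, but the dual-resource structure is the same.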
A critical and precision-sensitive component of this workflow is material preparation, which falls under the sample preparation stage. To improve efficiency and reduce manual handling, a prototype automation module has been developed to autonomously manage material preparation tasks. This module is designed to share resources and operate across multiple test sequences concurrently, enhancing throughput, flexibility, and overall laboratory efficiency.
Despite these advancements, several challenges remain. Among the most notable are the complexity of test recipes and the short shelf life of prepared materials, both of which impose additional constraints on automation and scheduling systems.
The prototype system is composed of several integrated components: two industrial identification platforms (one PC-based and one PLC-based), an automation module, a SCADA system, the Middleware for Intelligent Automation (MIA), a dynamic Decision Support System (DSS) with scheduling capabilities, and a Data Warehouse (DW).
The PC-based platform uses a personal computer configured to manage data acquisition from two identification technologies: barcode scanners and the RFH620 RFID system.
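As an illustration, the sketch below captures scans from a scanner exposed as a serial device; the port name and baud rate are hypothetical, and an RFID reader such as the RFH620 would typically be integrated over Ethernet rather than in this way.

```python
import serial  # pyserial

# Hypothetical serial-attached barcode scanner; adjust port and baud rate.
with serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1) as scanner:
    while True:
        line = scanner.readline().decode("ascii", errors="ignore").strip()
        if line:
            # Each scan identifies a material, sample, or container that the
            # middleware can match against LIMS / ERP master data.
            print(f"Scanned identifier: {line}")
```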
The PLC-based platform consists of a PLC, an Extension Terminal, a distributed I/O system, an HMI Basic Panel, and a SCADA PC, all working together to handle data from Siemens barcode and RFID systems. This setup is responsible for real-time monitoring and control within the automated environment.
The MIA platform is hosted on a dedicated PC running SQL Server Enterprise 2017 and a Python environment, serving as the core of data integration and communication between systems. It acts as a bridge linking lower-level control devices with higher-level decision tools.
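A minimal sketch of this integration role, assuming the pyodbc driver for SQL Server, is shown below; the server name, database, table, and columns are illustrative only.

```python
import pyodbc

# Connection details are placeholders for the dedicated MIA host.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=MIA-HOST;DATABASE=MIA;Trusted_Connection=yes;"
)
cursor = conn.cursor()

def record_identification(tag_id: str, source: str, station: str) -> None:
    """Store one barcode/RFID read so the DSS and DW can consume it later."""
    cursor.execute(
        "INSERT INTO dbo.Identifications (TagId, Source, Station, ReadAt) "
        "VALUES (?, ?, ?, SYSUTCDATETIME())",
        tag_id, source, station,
    )
    conn.commit()

record_identification("0008-4711", "RFID", "MaterialPrep")  # example read
```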
The Data Warehouse (DW) plays a crucial role by aggregating and managing data extracted from both the Enterprise Resource Planning (ERP) system and the Laboratory Information Management System (LIMS). In this setup, LIMS functions similarly to a Manufacturing Execution System (MES) but is tailored specifically to meet the needs of laboratory operations, including test tracking, compliance, and quality management.
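The consolidation step can be pictured with the short sketch below, which joins hypothetical ERP and LIMS extracts on a shared batch identifier; the file, table, and column names are assumptions, not the actual DW schema.

```python
import pandas as pd

# Hypothetical extracts; in practice these would come from ERP and LIMS queries.
erp_orders = pd.read_csv("erp_batch_orders.csv")    # batch_id, product, due_date
lims_tests = pd.read_csv("lims_test_results.csv")   # batch_id, test, result, analyst

# Consolidate both sources on the shared batch identifier so the DSS can see
# production context and laboratory status side by side.
dw_fact = erp_orders.merge(lims_tests, on="batch_id", how="left")
dw_fact.to_parquet("dw_qc_fact.parquet", index=False)  # requires pyarrow
```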
Together, these components create a cohesive infrastructure that supports automated identification, real-time data acquisition, intelligent scheduling, and data-driven decision-making in the quality control processes of pharmaceutical laboratories.