Applications in the smart industry domain, such as interaction with collaborative robots using voice commands or machine vision systems, often require the deployment of deep learning (DL) algorithms on heterogeneous low-power computing platforms. Software tools and frameworks that automate the individual design steps can support the effective implementation of DL algorithms on embedded systems, reducing the related effort and cost. An important factor in the acceptance of such a framework is its extensibility, i.e., the capability to accommodate different datasets and define customized preprocessing without requiring advanced skills. This paper presents a modular approach, integrated into the ALOHA tool flow, to support the data preprocessing and transformation pipeline. The approach is realized through customizable plugins and allows the tool flow to be easily extended to new use cases. To demonstrate its effectiveness, we present experimental results for a keyword spotting use case and outline possible extensions to other use cases.