Performance of industrial sensor data persistence in data vault

  • Florian Bachinger
  • Jan Zenisek
  • Lukas Kammerer
  • Martin Stimpfl
  • Gabriel Kronberger
  • a, c, e Josef Ressel Centre for Symbolic Regression, School of Informatics, Communications and Media, University of Applied Sciences Upper Austria, Hagenberg
  • b Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg
  • d Miba AG, Dr. Mitterbauer Str. 3, Postfach 3, A-4663 Laakirchen, Austria
Cite as
Bachinger F., Zenisek J., Kammerer L., Stimpfl M., Kronberger G. (2018). Performance of industrial sensor data persistence in data vault. Proceedings of the 30th European Modeling & Simulation Symposium (EMSS 2018), pp. 226-233. DOI: https://doi.org/10.46354/i3m.2018.emss.031

Abstract

Manufacturing companies today face significant market challenges: demands for flexibility, ever-growing product mixes, small lot sizes, and high competition. Given the advances of the last few years, digitalization and the use of data offer a viable toolset for meeting these conditions. The increasing use of sensor technology and the need to interconnect data from different departments in smart production lead to a surge of recorded data. Persistence and integration of heterogeneous data, generated in a variety of software systems, are key to gaining value from data and its analysis. Accommodating such data requires a highly flexible model. Hence, the data vault modelling approach is a fitting candidate for designing the data warehouse model. In this paper we present a data vault model for factory sensor data. We analyze the performance of the data warehouse with regard to bulk loading of data and common analytic queries.
