• Greenfields Security Enclave, Table View, Cape Town, 7441

  • info@excite-data.tech

  • +27 61 435 6751

Data Engineering

Data engineering is the process of collecting data, preprocessing it, ingesting it, filtering it and storing it in a ready-to-use format for data scientists and analysts. Its most important function is ensuring data availability at scale.

Business data is typically generated by different, unrelated sources. The data engineers' first task is to identify the sources and formats of the data, and to design a pipeline that collects it into a useful format. This usually involves a range of protocols, database systems and file formats including, but not limited to:

Industrial IoT (MQTT, OPC UA, Modbus, etc.)

Relational and NoSQL databases (Microsoft SQL Server, MySQL, MongoDB, etc.)

File formats like CSV, Excel* and text
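As a rough illustration of collecting data from unrelated sources into one useful format, the sketch below reads the same kind of reading from CSV text and from a relational table and maps both onto a single record layout. It is only a minimal example under stated assumptions: the field names (`name`, `value`, `timestamp`), the `readings` table and the helper functions are hypothetical, and an in-memory SQLite database stands in for a production database server.

```python
import csv
import io
import sqlite3


def normalise(source, name, value, timestamp):
    """Map one reading from any source onto a single shared record layout.

    The layout (source, name, value, timestamp) is an assumption made for
    this sketch, not a format defined by the original text.
    """
    return {
        "source": source,
        "name": name,
        "value": float(value),
        "timestamp": timestamp,
    }


def read_csv(text):
    """Yield normalised records from CSV text with name,value,timestamp columns."""
    for row in csv.DictReader(io.StringIO(text)):
        yield normalise("csv", row["name"], row["value"], row["timestamp"])


def read_sql(conn):
    """Yield normalised records from a relational table (SQLite stands in here)."""
    query = "SELECT name, value, timestamp FROM readings ORDER BY timestamp"
    for name, value, timestamp in conn.execute(query):
        yield normalise("sql", name, value, timestamp)


def demo_records():
    """Build two tiny hypothetical sources and collect them into one list."""
    csv_text = "name,value,timestamp\npump_1,3.5,2024-01-01T00:00:00\n"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (name TEXT, value REAL, timestamp TEXT)")
    conn.execute(
        "INSERT INTO readings VALUES ('pump_2', 4.25, '2024-01-01T00:01:00')"
    )
    records = list(read_csv(csv_text)) + list(read_sql(conn))
    conn.close()
    return records


if __name__ == "__main__":
    for record in demo_records():
        print(record)
```

Each source gets its own small reader, while `normalise` is the single place where the shared layout is defined, so adding another source (for example an MQTT subscriber) would mean adding one more reader rather than changing downstream code.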

Once the data ingestion pipelines have been set up, historical data can be ingested, and real-time data can be streamed into the data store, where it becomes searchable. The data ingestion and storage pipelines are designed for cluster deployment, so that the data store can be scaled out for big data.
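The idea of loading historical data, streaming live data into the same store and spreading that store across several nodes can be sketched roughly as follows. This is a toy, in-memory illustration only: `ShardedStore`, `ingest` and `live_feed` are hypothetical names, and hashing the record name to pick a shard is an assumption made for this example, not a description of how the actual cluster distributes data.

```python
import zlib
from collections import defaultdict


class ShardedStore:
    """Toy searchable store split across several in-memory shards.

    Each shard stands in for one node of a cluster; adding shards is the
    (simplified) equivalent of scaling the store out.
    """

    def __init__(self, shard_count):
        self.shards = [defaultdict(list) for _ in range(shard_count)]

    def _shard_for(self, key):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def add(self, record):
        self._shard_for(record["name"])[record["name"]].append(record)

    def search(self, name):
        """Return every stored record for the given name, oldest first."""
        return list(self._shard_for(name).get(name, []))


def ingest(store, records):
    """Write records into the store one by one and return how many were written.

    The same function handles a finite historical batch (a list) and a
    live feed (any iterator that yields records as they arrive).
    """
    count = 0
    for record in records:
        store.add(record)
        count += 1
    return count


def live_feed():
    """Hypothetical stand-in for a real-time source such as a message queue."""
    yield {"name": "pump_1", "value": 3.75}


if __name__ == "__main__":
    store = ShardedStore(shard_count=3)
    history = [{"name": "pump_1", "value": 3.5}, {"name": "pump_2", "value": 4.25}]
    ingest(store, history)      # historical load
    ingest(store, live_feed())  # real-time stream into the same store
    print(store.search("pump_1"))
```

Routing both the batch and the stream through one `ingest` function keeps historical and live records in the same searchable layout, which is what lets a query see both.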

Sample Pipeline