Understanding Feature Engineering (Part 1) — Continuous Numeric Data


Any intelligent system, regardless of complexity, needs to be powered by data. At the heart of such a system are one or more algorithms based on machine learning, deep learning or statistical methods, which consume this data to gather knowledge and provide intelligent insights over a period of time. Algorithms are pretty naive by themselves and cannot work out of the box on raw data. Hence it is of utmost importance to engineer meaningful features from raw data that these algorithms can understand and consume.

Any intelligent system basically consists of an end-to-end pipeline that starts with ingesting raw data and then leverages data processing techniques to wrangle, process and engineer meaningful features and attributes from this data. We then usually leverage statistical or machine learning models to model on these features and, if necessary, deploy the model for future usage, depending on the problem at hand. A typical machine learning pipeline based on the CRISP-DM industry standard process model is depicted below.
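As a rough illustration in code, the stages described above (ingest, wrangle, engineer features, model, deploy) might be sketched as follows. This is only a minimal sketch under stated assumptions: the records, the column meanings (income, age, spend), the log transform and the one-feature least-squares model are all hypothetical choices made for illustration, not something prescribed by CRISP-DM or by any particular library.

```python
import math
from statistics import mean

# Hypothetical raw records: (income, age, spend); None marks a missing value.
RAW = [
    (30000, 25, 1200.0),
    (52000, 40, 2100.0),
    (None, 31, 1500.0),
    (87000, 52, 3300.0),
    (41000, 36, 1700.0),
]

def ingest():
    """Stage 1: ingest raw data (here, just an in-memory list)."""
    return list(RAW)

def wrangle(rows):
    """Stage 2: clean the data by dropping records with missing values."""
    return [r for r in rows if all(v is not None for v in r)]

def engineer(rows):
    """Stage 3: derive a feature (log of income); the target is spend."""
    xs = [math.log(income) for income, _age, _spend in rows]
    ys = [spend for _income, _age, spend in rows]
    return xs, ys

def fit(xs, ys):
    """Stage 4: fit y = a + b * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var = sum((x - x_bar) ** 2 for x in xs)
    b = cov / var
    a = y_bar - b * x_bar
    return a, b

def predict(model, income):
    """Stage 5: 'deployment' - apply the same feature transform, then score."""
    a, b = model
    return a + b * math.log(income)

def run_pipeline():
    """Chain the stages end to end and return the fitted model."""
    xs, ys = engineer(wrangle(ingest()))
    return fit(xs, ys)

model = run_pipeline()
```

Calling `predict(model, 60000)` scores a new, unseen income. Note that the model never sees the raw income directly, only the engineered feature, which is why the same transform has to be reapplied at prediction time.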

Read more at Towards Data Science