Feature engineering is the process of transforming existing features in a dataset, or creating new features to improve the performance of machine learning models. Here, the term “feature” refers to the individual input variables or attributes used to train a machine learning model.

Features are the characteristics or properties of the data that are fed into the model to make predictions or classifications.Here are some practical examples of features in different domains:

  1. In text classification a bag of words represents the presence or frequency of words in a document.
  2. In image classification, the historgram capturing the distribution of colors in an image.
  3. In a tabular data, numerical variables such as age, income, or temperature, or categorical values such as gender, or country.

Feature engineering involves several techniques ranging from simple classification, aggregation, data transformation to more complex mathematical operations such as normalization/scaling, dimensionality reduction, gradient boosting.