from Part One - Fundamentals
Published online by Cambridge University Press: 29 May 2020
This chapter discusses Feature Engineering techniques that look holistically at the feature set, therefore replacing or enhancing the features based on their relation to the whole set of instances and features. Techniques such as normalization, scaling, dealing with outliers and generating descriptive features are covered. Scaling and normalization are the most common, it involves finding the maximum and minimum and changing the values to ensure they will lie in a given interval (e.g., [0, 1] or [−1, 1]). Discretization and binning involve, for example, analyzing a feature that is an integer (any number from -1 trillion to +1 trillion) and realize that it only takes the values 0, 1 and 10 so it can be simplified into a symbolic feature with three values (value0, value1 and value10). Descriptive features is the gathering of information that talks about the shape of the data, the discussion centres around using tables of counts (histograms) and general descriptive features such as maximum, minimum and averages. Outlier detection and treatment refers to looking at the feature values across many instances and realizing some values might present themselves very far from the rest.
To save this book to your Kindle, first ensure no-reply@cambridge.org is added to your Approved Personal Document E-mail List under your Personal Document Settings on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part of your Kindle email address below. Find out more about saving to your Kindle.
Note you can select to save to either the @free.kindle.com or @kindle.com variations. ‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi. ‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.
Find out more about the Kindle Personal Document Service.
To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Dropbox.
To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Google Drive.