Linear Regression
The Linear Regression tool creates a simple model to estimate values or to evaluate relationships between variables, based on a linear relationship. To learn more about the Linear Regression algorithm, see the Scikit-learn documentation.
Before using the tool
Start with an existing workflow. Clean and prepare your dataset first. Once your dataset contains only the data relevant to your business use case, start building a pipeline using the Machine Learning tools.
Add the tool
- Click the Classification tool or the Regression tool in the Machine Learning tool palette and drag it to the workflow canvas, connecting it to your dataset.
- In Algorithm, select the algorithm tool you want to configure.
- Configure the tool.
Configure the tool
Configure the parameters or use the default settings. Parameters are set to ayx-learn defaults to ensure accuracy and reproducibility. Each use case is different, and the default settings do not represent a single global best combination for all use cases. Understand the parameters before changing them. As a best practice, avoid making assumptions and use a test dataset to assess the performance of your model, whether or not your objective is prediction.
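The advice to assess a model on a test dataset can be sketched in plain Python. This is a minimal illustration only, not the tool's implementation: the data, the helper names `fit_line` and `r_squared`, and the choice of held-out rows are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line with an intercept; returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: y is roughly 3x + 1 with a small fixed disturbance.
noise = [0.2, -0.1, 0.1, -0.2, 0.0, 0.1, -0.1, 0.2, -0.2, 0.0]
xs = [float(i) for i in range(10)]
ys = [3.0 * x + 1.0 + e for x, e in zip(xs, noise)]

# Hold out a few rows that the model never sees during fitting.
test_idx = {2, 5, 8}
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]

# Fit on the training rows only, then score on the held-out rows.
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
test_r2 = r_squared([x for x, _ in test], [y for _, y in test], slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} test R^2={test_r2:.4f}")
```

Scoring on rows that were not used for fitting gives a more honest picture of performance than scoring on the training rows themselves.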
To reset to defaults, click the reset icon. To find out more about a parameter, click the parameter's tooltip.
The Fit Intercept parameter determines whether the intercept is calculated for your model. Also known as the constant, the intercept is the expected mean value of y when x is equal to 0.
Options:
- On: The intercept is calculated for your model. Use this parameter setting if you want to normalize your data.
- Off: No intercept is calculated for your model. Use this parameter setting if your data is expected to be centered already.
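The effect of fitting with and without an intercept can be illustrated with a small plain-Python sketch for a single feature. The helper `fit_line` and its `fit_intercept` argument are hypothetical names chosen for illustration (the argument name mirrors the `fit_intercept` parameter of Scikit-learn's `LinearRegression`), and the code is not the tool's implementation.

```python
def fit_line(xs, ys, fit_intercept=True):
    """Least-squares fit for one feature; returns (slope, intercept)."""
    if fit_intercept:
        n = len(xs)
        x_mean = sum(xs) / n
        y_mean = sum(ys) / n
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        sxx = sum((x - x_mean) ** 2 for x in xs)
        slope = sxy / sxx
        return slope, y_mean - slope * x_mean
    # Without an intercept the line is forced through the origin.
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return slope, 0.0


# Hypothetical data lying exactly on y = 2x + 5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [5.0, 7.0, 9.0, 11.0]

slope_on, intercept_on = fit_line(xs, ys, fit_intercept=True)
slope_off, intercept_off = fit_line(xs, ys, fit_intercept=False)

# Centering both variables removes the need for an intercept.
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
xs_centered = [x - x_mean for x in xs]
ys_centered = [y - y_mean for y in ys]
slope_centered, _ = fit_line(xs_centered, ys_centered, fit_intercept=False)

print(slope_on, intercept_on)    # intercept recovered from the data
print(slope_off, intercept_off)  # slope distorted by the missing intercept
print(slope_centered)            # matches slope_on on centered data
```

On the uncentered data, turning the intercept off distorts the slope because the line must pass through the origin. On the centered data, the fit without an intercept recovers the same slope as the fit with one.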
The normalization parameter controls whether the algorithm normalizes your targets.
Normalization adjusts your targets so that you can compare them on a common scale with other data, which helps you identify associations in your data.
Options:
- On: Normalization occurs. You must set Fit Intercept to On.
- Off: No normalization occurs.
Not all data should be normalized. Normalization refers to rescaling real-valued numeric attributes into the range 0 to 1. It is useful to scale the input attributes for a model that relies on the magnitude of values, such as the distance measures used in k-nearest neighbors and the preparation of coefficients in regression.
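Rescaling into the range 0 to 1, as described above, can be sketched as follows. This is a minimal illustration of that definition only; the helper `min_max_scale` and the sample columns are hypothetical, and the normalization the tool applies internally may differ.

```python
def min_max_scale(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical attributes measured on very different scales.
incomes = [20000.0, 50000.0, 80000.0]
ages = [20.0, 35.0, 50.0]

scaled_incomes = min_max_scale(incomes)
scaled_ages = min_max_scale(ages)

print(scaled_incomes)  # [0.0, 0.5, 1.0]
print(scaled_ages)     # [0.0, 0.5, 1.0]
```

After rescaling, both attributes share a common scale, so neither dominates a magnitude-based calculation simply because of its units.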