Create Samples Tool
Use Create Samples to split the input rows into 2 or 3 random samples. In the tool, you specify the percentage of rows to place in each sample. If the percentages total less than 100%, the remaining rows are sent to the holdout (H) anchor.
Configure the Tool
Select the Row Allocation. The Sample 1 and Sample 2 percentages must sum to 100% or less. If they sum to less than 100%, the remaining rows are output to the H anchor. Set the following percentages:
Sample 1: Output to the E anchor. This is the percentage of the data to place in the estimation sample (between 1% and 99%).
Sample 2: Output to the V anchor. This is the percentage of the data to place in the validation sample (between 1% and 99%).
Enter a Random seed: An integer between 1 and 1000 that provides the starting point for generating random numbers. Changing this value changes which sample an individual row is placed in. Unless you have a specific reason to change it, use the default value of 1.
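The effect of these settings can be sketched in Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function name create_samples, its parameter names, the use of whole-number percentages, and the rule of rounding each cut point up are all hypothetical choices made for the example.

```python
import random


def create_samples(rows, pct_estimation, pct_validation, seed=1):
    """Split rows into (estimation, validation, holdout) lists.

    Hypothetical sketch: percentages are whole numbers, and cut points
    are rounded up. The real tool's rounding rule may differ.
    """
    if not (1 <= pct_estimation <= 99 and 1 <= pct_validation <= 99):
        raise ValueError("each sample percentage must be between 1 and 99")
    if pct_estimation + pct_validation > 100:
        raise ValueError("Sample 1 + Sample 2 must not exceed 100%")
    if not 1 <= seed <= 1000:
        raise ValueError("seed must be between 1 and 1000")

    shuffled = list(rows)
    # The same seed always produces the same shuffle, so the split is repeatable.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Integer ceiling of n * pct / 100 (assumed rounding rule).
    cut_e = (n * pct_estimation + 99) // 100
    cut_v = (n * (pct_estimation + pct_validation) + 99) // 100
    return shuffled[:cut_e], shuffled[cut_e:cut_v], shuffled[cut_v:]


estimation, validation, holdout = create_samples(range(10), 60, 20)
print(len(estimation), len(validation), len(holdout))  # 6 2 2
```

Calling the function again with the same seed returns the same three lists, which mirrors the reason for leaving the seed at its default unless a different split is needed.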
View the Output
There are 3 outputs from the Create Samples tool:
E anchor: The Estimation output stream contains a random sample of input rows. The number of rows in this stream corresponds to the Sample 1 percentage of the total input rows.
V anchor: The Validation output stream contains a random sample of input rows. The number of rows in this stream corresponds to the Sample 2 percentage of the total input rows.
H anchor: The Holdout stream includes any leftover rows that weren't placed in either the Estimation or Validation samples.
If the number of input rows is odd and Estimation and Validation are both set to 50%, the E anchor output stream has 1 more row than the V anchor output stream.
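The row counts can be worked through with a little arithmetic. The sketch below assumes whole-number percentages and rounds each cut point up, a rule chosen only because it reproduces the 50%/50% odd-row behavior described above. The tool's rounding in other cases is not documented here, and the function name sample_counts is hypothetical.

```python
def sample_counts(total_rows, pct_estimation, pct_validation):
    """Return (estimation, validation, holdout) row counts.

    Assumed rule: each cumulative cut point is rounded up to a whole row.
    """
    cut_e = (total_rows * pct_estimation + 99) // 100
    cut_v = (total_rows * (pct_estimation + pct_validation) + 99) // 100
    return cut_e, cut_v - cut_e, total_rows - cut_v


print(sample_counts(101, 50, 50))  # (51, 50, 0)
print(sample_counts(100, 60, 20))  # (60, 20, 20)
```

With 101 input rows split 50%/50%, the estimation sample receives the extra row and nothing is left for the holdout.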