The Sample tool limits the data stream to a specified number, percentage, or random set of rows. If columns are selected to group by, the tool applies the selected configuration to each group separately.
Configure the tool
- Select the type of sample. The options are:
- First N rows: Returns every row in the data from the beginning of the data through row N.
- Last N rows: Returns the final N rows of the data, that is, every row from the one N rows before the end through the end of the data.
- Skip 1st N rows: Returns all rows in the data starting after row N.
- 1 of every N rows: Returns the first row of every group of N rows.
- 1 in N chance to include each row: Randomly determines if each row is included in the sample, independent of the inclusion of any other rows. Because each row is evaluated independently, the number of rows returned is only approximately 1/N of the total.
- First N% of rows: Returns the first N percent of rows. Selecting this option requires the data to pass through the tool twice: once to calculate the count of rows and again to return the specified percent of rows.
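
The sample types above can be illustrated with a short Python sketch. This is not the tool's implementation; it only mirrors the described behavior on an in-memory list of rows. The function names are hypothetical, and rounding the percentage down to a whole row count is an assumption, since the text does not say how fractional rows are handled.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def first_n(rows: Sequence[T], n: int) -> List[T]:
    """First N rows: every row from the beginning through row N."""
    return list(rows[:n])


def last_n(rows: Sequence[T], n: int) -> List[T]:
    """Last N rows: the final N rows of the data."""
    return list(rows[-n:]) if n > 0 else []


def skip_first_n(rows: Sequence[T], n: int) -> List[T]:
    """Skip 1st N rows: all rows starting after row N."""
    return list(rows[n:])


def one_of_every_n(rows: Sequence[T], n: int) -> List[T]:
    """1 of every N rows: the first row of every group of N rows."""
    return list(rows[::n])


def one_in_n_chance(rows: Sequence[T], n: int, seed: Optional[int] = None) -> List[T]:
    """1 in N chance: each row is kept independently with probability 1/N,
    so the size of the result is only approximately len(rows) / N."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < 1.0 / n]


def first_n_percent(rows: Sequence[T], percent: int) -> List[T]:
    """First N% of rows: count the rows first, then return the leading share.
    Assumption: the row count is rounded down."""
    total = len(rows)             # first pass: count the rows
    keep = total * percent // 100
    return list(rows[:keep])      # second pass: return that many rows


if __name__ == "__main__":
    data = list(range(1, 11))  # rows numbered 1 through 10
    print(first_n(data, 3))
    print(last_n(data, 3))
    print(skip_first_n(data, 3))
    print(one_of_every_n(data, 3))
    print(first_n_percent(data, 30))
    print(one_in_n_chance(data, 3, seed=0))
```

Only the random option can return a different number of rows from run to run; the other options are fully determined by N and the order of the incoming rows.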