Sample Tool
The Sample tool limits the data stream to a specified number, percentage, or random set of the rows.
Configure the tool
- Select the type of sample. The options are:
  - First N rows: Returns every row in the data from the beginning of the data through row N.
  - Last N rows: Returns the final N rows of the data, that is, every row from the one that is N rows from the end through to the end of the data.
  - Skip 1st N rows: Returns all rows in the data starting after row N.
  - 1 of every N rows: Returns the first row of every group of N rows.
  - 1 in N chance to include each row: Randomly determines if each row is included in the sample, independent of the inclusion of any other rows. This method of selection results in N being an approximation.
  - First N% of rows: Returns the first N percent of rows. Selecting this option requires the data to pass through the tool twice: once to calculate the count of rows and again to return the specified percent of rows.
- Type a number in the N= box to specify the value for N.
- Group by column (optional): If a group or groups are specified, N rows are returned for each group.
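The non-random sample types can be approximated outside the tool with plain list slicing. The following is a minimal Python sketch, not the tool's actual implementation: every function name here is hypothetical, the rounding used for the percent option (round down) is an assumption, and the order in which grouped rows are emitted is also an assumption.

```python
def first_n(rows, n):
    """First N rows: everything from the start through row N."""
    return rows[:n]


def last_n(rows, n):
    """Last N rows: the final N rows of the data."""
    # rows[-0:] would return the whole list, so handle n == 0 explicitly.
    return rows[-n:] if n > 0 else []


def skip_first_n(rows, n):
    """Skip 1st N rows: everything after row N."""
    return rows[n:]


def one_of_every_n(rows, n):
    """1 of every N rows: the first row of each group of N rows."""
    return rows[::n]


def first_n_percent(rows, n):
    """First N% of rows: needs the total row count before slicing."""
    count = len(rows)          # first pass: count the rows
    keep = count * n // 100    # assumption: round down to a whole row
    return rows[:keep]         # second pass: return that many rows


def sample_by_group(rows, key, sampler, n):
    """Apply a sampler to each group, so N rows come back per group."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    result = []
    # Assumption: groups are emitted in order of first appearance.
    for group_rows in groups.values():
        result.extend(sampler(group_rows, n))
    return result


rows = list(range(1, 11))          # rows numbered 1 through 10
print(first_n(rows, 3))            # [1, 2, 3]
print(last_n(rows, 3))             # [8, 9, 10]
print(one_of_every_n(rows, 3))     # [1, 4, 7, 10]
```

With a group-by column, the same sampler runs once per group, so `sample_by_group(rows, key, first_n, 1)` returns the first row of each group rather than the first row of the whole data set.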
For example, if you have 1000 rows, select the random option (1 in N chance to include each row), and specify N as 10, you would expect about 100 rows to be returned; however, you may get anywhere between 75 and 150 rows returned.
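The variation in the random sample size can be illustrated with a short simulation. This is a sketch under stated assumptions, not the tool's algorithm: `one_in_n_chance` is a hypothetical helper that includes each row independently with probability 1/N, using Python's standard `random` module.

```python
import random


def one_in_n_chance(rows, n, rng):
    """Include each row independently with probability 1/n (assumed model)."""
    return [row for row in rows if rng.random() < 1.0 / n]


rng = random.Random(0)  # fixed seed so repeated runs give the same counts
rows = range(1000)

# Run the sample several times and compare the number of rows returned.
counts = [len(one_in_n_chance(rows, 10, rng)) for _ in range(5)]
print(counts)
```

Each run returns a count near 100 rather than exactly 100, because every row is decided on its own and the total is not fixed in advance.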