Sample Tool

The Sample tool limits the data stream to a specified number, percentage, or random set of the rows.

Configure the tool

  1. Select the type of sample. The options are:
    • First N rows: Returns every row in the data from the beginning of the data through row N.
    • Last N rows: Returns the last N rows of the data, through to the end of the data.
    • Skip 1st N rows: Returns all rows in the data starting after row N.
    • 1 of every N rows: Returns the first row of every group of N rows.
    • 1 in N chance to include each row: Randomly determines whether each row is included in the sample, independent of the inclusion of any other row. Because each row is selected at random, the number of rows returned is only approximately 1/N of the input. For example, if you have 1000 rows and specify N as 10, you would expect to get about 100 rows returned; however, you may get anywhere between 75 and 150 rows returned.
    • First N% of rows: Returns the first N percent of rows. Selecting this option requires the data to pass through the tool twice: once to calculate the count of rows and again to return the specified percent of rows.
  2. Type a number in the N= box to specify the value for N.
  3. Group by column (optional): If a group or groups are specified, N rows are returned for each group.
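The sample types and the optional grouping described above can be illustrated with a short Python sketch. This is not the tool's implementation; it only mirrors the documented behavior on an in-memory list. The function names `sample` and `sample_by_group`, the mode strings, rounding down for the percent option, and the order in which grouped output is returned are all assumptions made for illustration.

```python
import random


def sample(rows, mode, n, rng=None):
    """Return a sample of `rows` (a list) for one of the documented sample types.

    Hypothetical helper: the mode names are illustrative, not tool settings.
    """
    if n < 1:
        raise ValueError("N must be at least 1")
    if mode == "first_n":
        return rows[:n]
    if mode == "last_n":
        return rows[-n:]
    if mode == "skip_first_n":
        return rows[n:]
    if mode == "one_of_every_n":
        # First row of every group of N rows: positions 0, N, 2N, ...
        return rows[::n]
    if mode == "one_in_n_chance":
        # Each row is kept independently with probability 1/N,
        # so the size of the result varies from run to run.
        rng = rng or random.Random()
        return [row for row in rows if rng.random() < 1.0 / n]
    if mode == "first_n_percent":
        # Two passes: count the rows, then return the leading share.
        # Rounding down is an assumption of this sketch.
        count = len(rows)
        return rows[: count * n // 100]
    raise ValueError(f"unknown mode: {mode}")


def sample_by_group(rows, key, mode, n, rng=None):
    """Apply `sample` separately within each group defined by `key(row)`.

    Groups are emitted in order of first appearance (an assumption).
    """
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    result = []
    for group_rows in groups.values():
        result.extend(sample(group_rows, mode, n, rng))
    return result


if __name__ == "__main__":
    data = list(range(1, 11))
    print(sample(data, "first_n", 3))
    print(sample(data, "one_of_every_n", 3))
    print(len(sample(list(range(1000)), "one_in_n_chance", 10)))
```

With `one_in_n_chance`, repeated runs on 1000 rows and N = 10 return a different count each time, centered near 100, which is why the result is only approximate for that option.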