Skip to main content

EXAMPLE - Percentile Functions

This example illustrates you to apply percentile functions.

Functions:

Item

Description

MEDIAN Function

Computes the median from all row values in a column or group. Input column can be of Integer or Decimal.

PERCENTILE Function

Computes a specified percentile across all row values in a column or group. Input column can be of Integer or Decimal.

QUARTILE Function

Computes a specified quartile across all row values in a column or group. Input column can be of Integer or Decimal.

APPROXIMATEMEDIAN Function

Computes the approximate median from all row values in a column or group. Input column can be of Integer or Decimal.

APPROXIMATEPERCENTILE Function

Computes an approximation for a specified percentile across all row values in a column or group. Input column can be of Integer or Decimal.

APPROXIMATEQUARTILE Function

Computes an approximation for a specified quartile across all row values in a column or group. Input column can be of Integer or Decimal.

Source:

The following table lists each student's height in inches:

Student

Height

1

64

2

65

3

63

4

64

5

62

6

66

7

66

8

65

9

69

10

66

11

73

12

69

13

69

14

61

15

64

16

61

17

71

18

67

19

73

20

66

Transformation:

Use the following transformations to calculate the median height in inches, a specified percentile and the first quartile.

  • The first function uses a precise algorithm which can be slow to execute across large datasets.

  • The second function uses an appropriate approximation algorithm, which is much faster to execute across large datasets.

    • These approximate functions can use an error boundary parameter, which is set to 0.4 (0.4%) across all functions.

Median: This transformation calculates the median value, which corresponds to the 50th percentile.

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

median(heightIn)

Parameter: New column name

'medianIn'

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

approximatemedian(heightIn, 0.4)

Parameter: New column name

'approxMedianIn'

Percentile: This transformation calculates the 68th percentile.

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

percentile(heightIn, 68, linear)

Parameter: New column name

'percentile68In'

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

approximatepercentile(heightIn, 68, 0.4)

Parameter: New column name

'approxPercentile68In'

Quartile: This transformation calculates the first quartile, which corresponds to the 25th percentile.

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

quartile(heightIn, 1, linear)

Parameter: New column name

'percentile25In'

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

approximatequartile(heightIn, 1, 0.4)

Parameter: New column name

'approxPercentile25In'

Results:

studentId

heightIn

approxPercentile25In

percentile25In

approxPercentile68In

percentile68In

approxMedianIn

medianIn

1

64

64

64

67.1

66.92

66

66

2

65

64

64

67.1

66.92

66

66

3

63

64

64

67.1

66.92

66

66

4

64

64

64

67.1

66.92

66

66

5

62

64

64

67.1

66.92

66

66

6

66

64

64

67.1

66.92

66

66

7

66

64

64

67.1

66.92

66

66

8

65

64

64

67.1

66.92

66

66

9

69

64

64

67.1

66.92

66

66

10

66

64

64

67.1

66.92

66

66

11

73

64

64

67.1

66.92

66

66

12

69

64

64

67.1

66.92

66

66

13

69

64

64

67.1

66.92

66

66

14

61

64

64

67.1

66.92

66

66

15

64

64

64

67.1

66.92

66

66

16

61

64

64

67.1

66.92

66

66

17

71

64

64

67.1

66.92

66

66

18

67

64

64

67.1

66.92

66

66

19

73

64

64

67.1

66.92

66

66

20

66

64

64

67.1

66.92

66

66