Structure of a URL

This section provides information about the structure of a URL.

A valid value for the URL data type can be composed of the following parts.

Example URL:

http://www.app.example.co.uk/support

Note

IP addresses that include the protocol identifier (http://1.2.3.4) do not contain domain identifiers and need to be processed using a different set of methods. It might be easier to remove the protocol identifiers and change the data type to IP Address.
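As a rough illustration outside of Wrangle, the following Python sketch strips the protocol identifier from a value and then checks whether what remains is an IP address. The function names `strip_protocol` and `is_ip_address` are hypothetical. The standard `ipaddress` module is only a stand-in here and is not the product's IP Address data type validation.

```python
import ipaddress
import re

# Matches a leading http:// or https:// protocol identifier (case-insensitive).
PROTOCOL = re.compile(r"^https?://", re.IGNORECASE)


def strip_protocol(value: str) -> str:
    """Return the value with any leading http:// or https:// removed."""
    return PROTOCOL.sub("", value)


def is_ip_address(value: str) -> bool:
    """Return True if the value, minus its protocol identifier, is a bare IP address."""
    try:
        ipaddress.ip_address(strip_protocol(value))
        return True
    except ValueError:
        return False


print(strip_protocol("http://1.2.3.4"))  # 1.2.3.4
print(is_ip_address("http://1.2.3.4"))   # True
print(is_ip_address("http://www.app.example.co.uk/support"))  # False
```

This sketch only recognizes a bare address after the protocol identifier. A value with a trailing path or port would need additional handling.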

The hierarchy of domain names extends from right to left: the top-level domain is the rightmost part of the host, and each lower level is added to its left.

| Element Name | Examples | Wrangle Function | Notes |
| --- | --- | --- | --- |
| Top-level domain | co.uk; com, net, org | SUFFIX Function | Every valid URL must have at least one top-level domain. Note: When the DOMAIN function parses a multi-tiered top-level domain such as co.uk, the output is the first part of the domain value (e.g. co). |
| Second-level domain | example; app.example | DOMAIN Function | This value can be extracted from a valid URL using the DOMAIN function. See DOMAIN Function. |
| Third-level domain | www | SUBDOMAIN Function | This value can be extracted from a valid URL using the SUBDOMAIN function. See SUBDOMAIN Function. |
| Path | /support | | |
| Protocol identifier | http://, https:// | | You can use pattern matching to locate these protocol identifiers. In your Wrangle transforms, use the following pattern: `http%?://`. For an example, see IPTOINT Function. |
| Host | www.app.example.com | HOST Function | The protocol identifier (e.g. http://) is not included in the host value. |
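The parts listed above can be illustrated with a minimal Python sketch that splits the example URL using the standard `urllib.parse` module. This is not how the SUFFIX, DOMAIN, SUBDOMAIN, or HOST functions are implemented, and the helper name `split_url` is hypothetical. Splitting the host on dots only shows the right-to-left hierarchy. It cannot tell that co.uk is a multi-tiered top-level domain, which in general requires a list of known suffixes.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into its protocol identifier, host, path, and host labels."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        # Protocol identifier, e.g. "http://"; empty if the URL has no scheme.
        "protocol": (parts.scheme + "://") if parts.scheme else "",
        # Host without the protocol identifier.
        "host": host,
        "path": parts.path,
        # Host labels ordered from the top of the hierarchy downward.
        "labels_right_to_left": host.split(".")[::-1],
    }


result = split_url("http://www.app.example.co.uk/support")
print(result["protocol"])              # http://
print(result["host"])                  # www.app.example.co.uk
print(result["path"])                  # /support
print(result["labels_right_to_left"])  # ['uk', 'co', 'example', 'app', 'www']
```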