Neural Network Tool
Example workflow
This tool has an example workflow. Visit Sample Workflows to learn how to access this and many other examples directly in Alteryx Designer.
The Neural Network tool creates a feedforward perceptron neural network model with a single hidden layer. The neurons in the hidden layer use a sigmoid activation function, and the output activation function depends on the nature of the target field. Specifically, for binary classification problems (e.g., the probability a customer buys or does not buy), the output activation function is logistic; for multinomial classification problems (e.g., the probability a customer chooses option A, B, or C), it is softmax; and for regression problems (where the target is a continuous, numeric field), a linear activation function is used for the output.
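As a rough illustration of the three output activations named above, the following Python sketch defines each one and selects among them by target type. The function names and the target-type labels ("binary", "multinomial", "regression") are hypothetical and are not taken from the tool or from the nnet package.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Softmax activation: maps a list of scores to probabilities that sum to one."""
    m = max(zs)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(z):
    """Linear (identity) activation for a continuous numeric target."""
    return z

def output_activation(target_type):
    """Pick an output activation by target type (the labels are hypothetical)."""
    choices = {"binary": logistic, "multinomial": softmax, "regression": linear}
    return choices[target_type]
```

For example, `output_activation("multinomial")([1.0, 2.0, 3.0])` returns three probabilities that sum to one, with the largest probability assigned to the largest score.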
Neural networks represent the first machine learning algorithm (as opposed to traditional statistical approaches) for predictive modeling. The motivation behind the method is mimicking the structure of neurons in the brain (hence the method's name). The basic structure of a neural network involves a set of inputs (predictor fields) that feed into one or more "hidden" layers, with each hidden layer having one or more "nodes" (also known as "neurons").
In the first hidden layer, the inputs are linearly combined (with a weight assigned to each input in each node), and an "activation function" is applied to the weighted linear combination of the predictors. In the second and subsequent hidden layers, the outputs from the nodes of the prior hidden layer are linearly combined in each node of the hidden layer (again with weights assigned to each node from the prior hidden layer), and an activation function is applied to the weighted linear combination. Finally, the results from the nodes of the final hidden layer are combined in a final output layer that uses an activation function that is consistent with the target type.
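The layer-by-layer computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the per-node bias terms are an assumption (the text mentions only weights), the function names are hypothetical, and the output activation defaults to the identity, which corresponds to the regression case.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) activation used in the hidden nodes."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases, activation):
    """Each node forms a weighted linear combination of its inputs, then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

def forward(x, hidden_layers, output_layer, output_activation=lambda z: z):
    """Pass x through every hidden layer in turn, then through the output layer."""
    values = x
    for weights, biases in hidden_layers:
        values = layer(values, weights, biases, sigmoid)
    out_weights, out_biases = output_layer
    return layer(values, out_weights, out_biases, output_activation)
```

Here `hidden_layers` is a list of `(weights, biases)` pairs, one per hidden layer. A network with a single hidden layer, as produced by this tool, would pass a list with exactly one pair.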
Estimation (or "learning" in the vocabulary of the neural network literature) involves finding the set of weights, for each input or prior-layer node value, that minimizes the model's objective function. In the case of a continuous numeric field, this means minimizing the sum of the squared errors of the final model's predictions compared to the actual values, while classification networks attempt to minimize an entropy measure for both binary and multinomial classification problems. As indicated above, the Neural Network tool (which relies on the R nnet package) only allows for a single hidden layer (which can have an arbitrary number of nodes), and always uses a logistic transfer function in the hidden layer nodes. Despite these limitations, our research indicates that the nnet package is the most robust neural network package available in R at this time.
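To make the two kinds of objective function concrete, here is a small Python sketch of a sum of squared errors and a binary cross-entropy. The exact entropy measure the tool minimizes is not spelled out above, so the cross-entropy form shown here is an assumption, and the function names are hypothetical.

```python
import math

def sum_squared_errors(actual, predicted):
    """Regression objective: sum of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def binary_cross_entropy(actual, predicted, eps=1e-12):
    """Binary classification objective (assumed form): negative log-likelihood of 0/1 labels."""
    total = 0.0
    for y, p in zip(actual, predicted):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() never receives zero
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total
```

Both functions return smaller values for better predictions, so estimation amounts to searching for the weights that drive the chosen function as low as possible.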
While more modern statistical learning methods (such as models produced by the Boosted, Forest, and Spline Model tools) typically provide greater predictive efficacy relative to neural network models, in some specific applications (which cannot be determined before the fact), neural network models outperform other methods for both classification and regression models. Moreover, in some areas, such as in financial risk assessment, neural network models are considered a "standard" method that is widely accepted.
This tool uses the R tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R tool. Visit Download and Use Predictive Tools.
Configure the tool
Model name: Each model needs a name so that it can be identified later. Model names must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). No other special characters are allowed, and R is case-sensitive.
Select the target variable: Select the field from the data stream that you want to predict. This target must be a string.
Select the predictor variables: Choose the fields from the data stream that you believe cause changes in the value of the target variable. Columns that contain unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
Use sampling weights in model training: Select this check box and choose the weight field from the data stream to use when training the model.
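The naming rule can be checked with a short regular expression. The Python sketch below is only an illustration of the rule as stated; it assumes "letters" means ASCII letters, and the function name is hypothetical.

```python
import re

# Starts with a letter, then any mix of letters, digits, periods, and underscores.
_MODEL_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._]*")

def is_valid_model_name(name):
    """Return True if the whole name follows the stated naming rule."""
    return _MODEL_NAME.fullmatch(name) is not None
```

Under this rule, `NN_model.1` is accepted, while `1model` and `my model` are rejected.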
The number of nodes in the hidden layer: The number of nodes (neurons) in the model's single hidden layer. The default is ten.
Include marginal effect plots?: An option to include in the report plots that show the relationship between the predictor variable and the target, averaging over the effect of other predictor fields. The number of plots to produce is controlled by "The minimal level of importance of a field to be included in the plots," which indicates the percentage of the total predictive power of the model that a particular field must contribute in order to have a marginal effect plot produced for that field. A higher value for this selection reduces the number of marginal effect plots produced.
Custom scaling/normalization...: The numeric methods underlying the optimization of the model's weights can be problematic if the inputs (predictor fields) are on different scales (e.g., income which ranges from seven thousand to one million combined with the number of members present in the household that ranges from one to seven).
None (Default)
Z-score: All predictor fields are scaled so that they have a mean of zero and a standard deviation of one.
Unit interval: All predictor fields are scaled so that they have a minimum value of zero and a maximum value of one, with all other values being between zero and one.
Zero centered: All predictor fields are scaled so that they have a minimum value of negative one and a maximum value of one, with all other values being between negative and positive one.
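The three scaling options listed above can be sketched in Python as follows. This is an illustration of the stated definitions only: the function names are hypothetical, the use of the sample standard deviation (n - 1 denominator) in the z-score is an assumption, and a constant column (where the spread is zero) is not handled.

```python
import math

def z_score(values):
    """Scale to mean zero and standard deviation one."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample standard deviation with an n - 1 denominator.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def unit_interval(values):
    """Scale so the minimum maps to zero and the maximum maps to one."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def zero_centered(values):
    """Scale so the minimum maps to negative one and the maximum maps to one."""
    return [2.0 * u - 1.0 for u in unit_interval(values)]
```

Applied to household sizes such as `[1, 4, 7]`, `unit_interval` gives `[0.0, 0.5, 1.0]` and `zero_centered` gives `[-1.0, 0.0, 1.0]`, which puts the field on the same footing as a rescaled income field.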
The weight decay: The weight decay limits the movement in the new weight values at each iteration (also called "epoch") of the estimation process. The value of the weight decay should be between zero and one; larger values place a greater restriction on the possible movements of the weights. In general, a weight decay value between 0.01 and 0.2 often works well.
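One common way to formulate weight decay is as a penalty, proportional to the sum of the squared weights, that is added to the objective function. The Python sketch below shows that formulation. It is an assumption for illustration and is not confirmed by the description above; the function name is hypothetical.

```python
def penalized_objective(loss, weights, decay):
    """Assumed formulation: the fit criterion plus decay times the sum of squared weights."""
    return loss + decay * sum(w * w for w in weights)
```

With this formulation, a larger decay value makes large weights more costly, which discourages the optimizer from moving the weights far from zero.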
The +/- range of the initial (random) weights around zero: The weights given to the input variables in each hidden node are initialized using random numbers. This option allows the user to set the range of the random numbers used. Generally, the values should be near 0.5. However, smaller values can be better if all the input variables are large in size. A value of 0 is a special value that causes the tool to find a good compromise value given the input data.
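A minimal Python sketch of this kind of initialization draws each weight uniformly from the interval between the negative and positive range value. The uniform distribution, the function name, and the seed parameter are assumptions for illustration, and the special handling of a value of 0 is not modeled.

```python
import random

def initial_weights(n_weights, weight_range=0.5, seed=None):
    """Draw starting weights uniformly from [-weight_range, +weight_range]."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return [rng.uniform(-weight_range, weight_range) for _ in range(n_weights)]
```

Calling `initial_weights(221, 0.5, seed=1)` returns 221 starting values, all between -0.5 and 0.5.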
The maximum number of weights allowed in the model: This option becomes relevant when there are a large number of predictor fields and nodes in the hidden layer. Reducing the number of weights speeds up model estimation, and also reduces the chance that the algorithm finds a local optimum (as opposed to a global optimum) for the weights. Weights excluded from the model are implicitly set to zero.
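To get a feel for how quickly the number of weights grows, the Python sketch below counts the weights in a fully connected network with a single hidden layer. It assumes one bias weight per hidden node and per output node and no direct input-to-output connections; the function name is hypothetical.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Count weights for one hidden layer, assuming a bias weight for every node."""
    input_to_hidden = (n_inputs + 1) * n_hidden    # +1 for each hidden node's bias
    hidden_to_output = (n_hidden + 1) * n_outputs  # +1 for each output node's bias
    return input_to_hidden + hidden_to_output
```

Under these assumptions, 20 predictors with 10 hidden nodes and one output require 221 weights, while 100 predictors with 10 hidden nodes and three outputs require 1043.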
The maximum number of iterations for model estimation: This value controls the number of attempts the algorithm can make to find improvements in the set of model weights relative to the previous set of weights. If no improvements are found in the weights prior to the maximum number of iterations, the algorithm will terminate and return the best set of weights. This option defaults to 100 iterations. In general, given the behavior of the algorithm, it often makes sense to increase this value if needed, at the cost of a longer runtime for model creation.
Plot size: Select inches or centimeters for the size of the plot.
Plot resolution: Select the resolution of the plot in dots per inch: 1x (96 dpi), 2x (192 dpi), or 3x (288 dpi).
Lower resolution creates a smaller file and is best for viewing on a monitor.
Higher resolution creates a larger file with better print quality.
Base font size (points): Select the size of the font in the plot.
View the output
O anchor: Object. Consists of a table of the serialized model with its model name.
R anchor: Report. Consists of the report snippets generated by the Neural Network tool: a basic model summary and main effect plots for each class of the target variable.