Terms in this set (25)
Recurrent Neural Network (RNN) An RNN models sequential interactions through a
hidden state, or memory. It can take up to N inputs
and produce up to N outputs. For example, the
input may be a sentence and the outputs the
part-of-speech tag for each word (N-to-N). The
input could be a sentence and the output a
sentiment classification of that sentence (N-to-1).
The input could be a single image and the output
a sequence of words describing the image (1-to-N).
At each time step, an RNN calculates a new hidden
state ("memory") based on the current input and
the previous hidden state. The term "recurrent"
stems from the fact that at each step the same
parameters are used and the network performs the
same calculations on different inputs.
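The per-step update described above can be sketched in a few lines. This is a minimal illustration, assuming a scalar input and scalar hidden state with a tanh nonlinearity; the names `rnn_step`, `run_rnn`, `w_xh`, `w_hh`, and `b` and their values are hypothetical and not part of the flashcard.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One RNN step: new hidden state from current input and previous state."""
    return math.tanh(w_xh * x + w_hh * h_prev + b)

def run_rnn(inputs, w_xh=0.5, w_hh=0.8, b=0.0):
    """Apply the same step, with the same parameters, to every input in order."""
    h = 0.0  # initial hidden state (assumed to be zero)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)  # parameters are shared across steps
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, -1.0])
per_step_outputs = states      # N-to-N: one output per input
final_summary = states[-1]     # N-to-1: only the last hidden state is used
```

Reading every hidden state gives the N-to-N case, while reading only the last one gives the N-to-1 case.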
LSTM (Long Short-Term Memory) A network invented to mitigate the vanishing
gradient problem in Recurrent Neural Networks by
using a memory gating mechanism. Using LSTM
units to calculate the hidden state in an RNN
helps the network propagate gradients efficiently
and learn long-range dependencies.
How do RNN and LSTM update rules differ? In an LSTM, the cell state is updated in an
additive way: something is added to its previous
value C_{t-1}. This differs from the multiplicative
update rule of a plain RNN.
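The contrast between the two update rules can be sketched as follows. This is a minimal scalar illustration, assuming the standard LSTM gate formulation (forget, input, candidate, output); the function names and the parameter dictionary `p` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rnn_update(h_prev, x, w_h=0.9, w_x=0.5):
    """Plain RNN: the previous state is multiplied by a weight, then squashed."""
    return math.tanh(w_h * h_prev + w_x * x)

def lstm_update(c_prev, h_prev, x, p):
    """Scalar LSTM step; p maps each gate name to (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b
    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the candidate to add
    g = math.tanh(pre("candidate"))  # candidate value
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    c = f * c_prev + i * g           # additive update of the cell state
    h = o * math.tanh(c)
    return c, h

# Gates chosen (as an assumption) so the cell keeps its memory almost unchanged.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
c_new, h_new = lstm_update(c_prev=2.0, h_prev=0.0, x=1.0, p=params)
```

With the forget gate near 1 and the input gate near 0, the cell state passes through almost untouched, which is what lets gradients flow across many steps.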
Gradients for RNNs The gradient computation involves repeated
multiplication by the recurrent weight matrix W.
Multiplying by W at each step has a bad effect.
Think of it this way: if you have a scalar (a
number) and you multiply the gradient by it over
and over again, say 100 times, then if that number
is greater than 1 in magnitude the gradient
explodes, and if it is less than 1 it vanishes
towards 0.
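The scalar thought experiment above is easy to check numerically. This is a minimal sketch; the helper name `repeated_scale` and the factors 1.1 and 0.9 are chosen purely for illustration.

```python
def repeated_scale(grad, w, steps):
    """Multiply a gradient value by the same scalar w, once per time step."""
    for _ in range(steps):
        grad *= w
    return grad

exploding = repeated_scale(1.0, 1.1, 100)  # factor slightly above 1
vanishing = repeated_scale(1.0, 0.9, 100)  # factor slightly below 1
print(exploding, vanishing)
```

After 100 steps the first value is on the order of 10^4 and the second on the order of 10^-5, even though both factors are close to 1.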
What does t-SNE stand for? t-Distributed Stochastic Neighbor Embedding
What is t-SNE? An unsupervised, non-linear technique primarily
used for data exploration and visualization of
high-dimensional data. In short, it gives you a feel
or intuition for how the data is arranged in
high-dimensional space.