diff --git a/_posts/2015-07-12-basic-python-network.markdown b/_posts/2015-07-12-basic-python-network.markdown
index a902885..2a809b4 100755
--- a/_posts/2015-07-12-basic-python-network.markdown
+++ b/_posts/2015-07-12-basic-python-network.markdown
@@ -155,7 +155,7 @@ Output After Training:
l1
-    Second Layer of the Network, otherwise known as the hidden layer
+    Second Layer of the Network, which is our hypothesis, and should approximate the correct answer as we train
syn0
@@ -277,7 +277,7 @@ We are now ready to update our network! Let's take a look at a single training e
In this training example, we're all setup to update our weights. Let's update the far left weight (9.5).

weight_update = input_value * l1_delta

-For the far left weight, this would multiply 1.0 * the l1_delta. Presumably, this would increment 9.5 ever so slightly. Why only a small ammount? Well, the prediction was already very confident, and the prediction was largely correct. A small error and a small slope means a VERY small update. Consider all the weights. It would ever so slightly increase all three.
+For the far left weight, this would multiply 1.0 * the l1_delta. Presumably, this would increment 9.5 ever so slightly. Why only a small amount? Well, the prediction was already very confident, and the prediction was largely correct. A small error and a small slope means a VERY small update. Consider all the weights. It would ever so slightly increase all three.
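To make the arithmetic concrete, here is a tiny plaintext sketch of that update. The error and slope values are assumed for illustration only, not taken from the post's actual run:

# Hypothetical numbers: a confident, mostly-correct prediction has both a
# small error and a small sigmoid slope, so the resulting update is tiny.
input_value = 1.0
error = 0.001                      # assumed: prediction was nearly correct
slope = 0.01                       # assumed: sigmoid is nearly flat when confident
l1_delta = error * slope           # 0.00001
weight = 9.5
weight += input_value * l1_delta   # the far left weight barely moves
print(weight)                      # 9.50001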

@@ -327,7 +327,7 @@ So, now that we've looked at how the network updates, let's look back at our tra

-Thus, in our four training examples below, the weight from the first input to the output would consistently increment or remain unchanged, whereas the other two weights would find themselves both increasing and decreasing across training examples (cancelling out progress). This phenomenon is what causes our network to learn based on correlations between the input and output.
+Thus, in our four training examples above, the weight from the first input to the output would consistently increment or remain unchanged, whereas the other two weights would find themselves both increasing and decreasing across training examples (cancelling out progress). This phenomenon is what causes our network to learn based on correlations between the input and output.
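If you want to see this cancellation effect numerically, here is a minimal sketch based on the single-layer network from the post, using its dataset where the output column simply copies the first input column (a sketch, not the post's exact listing):

import numpy as np

# Dataset from the post: the output equals the first input column.
X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0,0,1,1]]).T

np.random.seed(1)
syn0 = 2*np.random.random((3,1)) - 1

for _ in range(10000):
    l1 = 1/(1+np.exp(-X.dot(syn0)))     # forward pass (sigmoid)
    l1_delta = (y - l1) * (l1*(1-l1))   # error * slope
    syn0 += X.T.dot(l1_delta)           # sum of per-example weight updates

print(syn0)  # the first weight grows large; the other two mostly cancel out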

Part 2: A Slightly Harder Problem

@@ -373,7 +373,7 @@ Thus, in our four training examples below, the weight from the first input to th

-Consider trying to predict the output column given the two input columns. A key takeway should be that neither columns have any correlation to the output. Each column has a 50% chance of predicting a 1 and a 50% chance of predicting a 0.

+Consider trying to predict the output column given the three input columns. A key takeaway should be that no single column has any correlation to the output. Each column has a 50% chance of predicting a 1 and a 50% chance of predicting a 0.
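That 50% claim is easy to check numerically. Here is a minimal sketch using the XOR-style table described in this section (the dataset values are assumed from the post's Part 2 example):

import numpy as np

# Three input columns; the output is 1 when exactly one of the first two columns is 1.
X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([0,1,1,0])

# Each individual column agrees with the output on exactly half of the rows,
# i.e. no single column is predictive on its own.
for i in range(3):
    print("column", i+1, "matches output:", (X[:,i] == y).mean())  # 0.5 each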

So, what's the pattern? It appears to be completely unrelated to column three, which is always 1. However, columns 1 and 2 give more clarity. If either column 1 or 2 are a 1 (but not both!) then the output is a 1. This is our pattern.
@@ -387,7 +387,7 @@ This is considered a "nonlinear" pattern because there isn't a direct one-to-one



-Believe it or not, image recognition is a similar problem. If one had 100 identically sized images of pipes and bicycles, no individual pixel position would directly correlate with the presence of a bicycle or pipe. The pixels might as well be random from a purely statistical point of view. However, certain combinations of pixels are not random, namely the combination that forms the image of a bicycle or a person.

+Believe it or not, image recognition is a similar problem. If one had 100 identically sized images of pipes and bicycles, no individual pixel position would directly correlate with the presence of a bicycle or pipe. The pixels might as well be random from a purely statistical point of view. However, certain combinations of pixels are not random, namely the combination that forms the image of a bicycle or a pipe.

Our Strategy

diff --git a/_posts/2017-03-17-safe-ai.markdown b/_posts/2017-03-17-safe-ai.markdown
index 19309ea..a8a30da 100755
--- a/_posts/2017-03-17-safe-ai.markdown
+++ b/_posts/2017-03-17-safe-ai.markdown
@@ -183,22 +183,6 @@ def encrypt(x,S,m,n,w):
def decrypt(c,S,w):
    return (S.dot(c) / w).astype('int')

-def get_c_star(c,m,l):
-    c_star = np.zeros(l * m,dtype='int')
-    for i in range(m):
-        b = np.array(list(np.binary_repr(np.abs(c[i]))),dtype='int')
-        if(c[i] < 0):
-            b *= -1
-        c_star[(i * l) + (l-len(b)): (i+1) * l] += b
-    return c_star
-
-def get_S_star(S,m,n,l):
-    S_star = list()
-    for i in range(l):
-        S_star.append(S*2**(l-i-1))
-    S_star = np.array(S_star).transpose(1,2,0).reshape(m,n*l)
-    return S_star
-
x = np.array([0,1,2,5])

@@ -238,7 +222,7 @@ And when I run this code in an iPython notebook, I can perform the following ope
  • Given the two formulas above, if the secret key is the identity matrix, the message isn't encrypted.
  • Given the two formulas above, if the secret key is a random matrix, the generated message is encrypted.
  • We can make a matrix M that changes the secret key from one secret key to another.
-  • When the matrix M converts from the identity to a random secret key, it is, by extension, encrypting the message in a one-way encryption.
+  • When the matrix M converts from the identity to a random secret key, it is, by definition, encrypting the message in a one-way encryption.
  • Because M performs the role of a "one way encryption", we call it the "public key" and can distribute it like we would a public key since it cannot decrypt the code.
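The first two bullets are easy to sanity-check against the decrypt() formula shown in the hunk above (x = S·c / w). Here is a minimal plaintext sketch; the noise and scaling values are assumed for illustration, and building c by inverting S is a shortcut, not the post's actual encrypt(), which uses key switching:

import numpy as np

w = 16
x = np.array([0,1,2,5])
e = np.random.randint(0,2,4)                      # small noise, much smaller than w

# Identity secret key: the ciphertext is just the scaled plaintext,
# so anyone who knows w can read it.
S_identity = np.eye(4)
c = w * x + e
print((S_identity.dot(c) / w).astype('int'))      # [0 1 2 5]

# Random-looking (invertible) secret key: now c looks like noise,
# but the holder of S can still recover x.
S = 10*np.eye(4) + np.random.randint(0,4,(4,4))   # diagonally dominant, so invertible
c = np.linalg.inv(S).dot(w * x + e)
print((S.dot(c) / w).astype('int'))               # [0 1 2 5] again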
@@ -420,7 +404,7 @@ def elementwise_vector_mult(x,y,scaling_factor):

-Now, there's one bit that I haven't told you about yet. To save time, I'm pre-computing several keys, , vectors, and matrices and storing them. This includes things like "the vector of all 1s" and one-hot encoding vectors of various lengths. This is useful for the masking operations above as well as some simple things we want to be able to do. For example, the derivive of sigmoid is sigmoid(x) * (1 - sigmoid(x)). Thus, precomputing these variables is handy. Here's the pre-computation step.

+Now, there's one bit that I haven't told you about yet. To save time, I'm pre-computing several keys, vectors, and matrices and storing them. This includes things like "the vector of all 1s" and one-hot encoding vectors of various lengths. This is useful for the masking operations above as well as some simple things we want to be able to do. For example, the derivative of sigmoid is sigmoid(x) * (1 - sigmoid(x)). Thus, precomputing these variables is handy. Here's the pre-computation step.
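As a plaintext illustration of why a precomputed vector of all 1s is handy (a sketch only, not the encrypted version from the post):

import numpy as np

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)); the "1 -" part is just an
# elementwise subtraction from a precomputed vector of all 1s.
ones = np.ones(4)
layer = 1/(1 + np.exp(-np.array([-2.0,-0.5,0.5,2.0])))  # sigmoid outputs
layer_delta = layer * (ones - layer)                    # sigmoid derivative
print(layer_delta)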
     
    @@ -487,7 +471,7 @@ for row in H_sigmoid_txt:
     
     
-If you're looking closely, you'll notice that the H_sigmoid matrix is the matrix we need for the polynomial evaluation of sigmoid. :) Finally, we want to train our neural network with the following. If the neural netowrk parts don't make sense, review A Neural Network in 11 Lines of Python. I've basically taken the XOR network from there and swapped out its operations with the proper utility functions for our encrypted weights.
+If you're looking closely, you'll notice that the H_sigmoid matrix is the matrix we need for the polynomial evaluation of sigmoid. :) Finally, we want to train our neural network with the following. If the neural network parts don't make sense, review A Neural Network in 11 Lines of Python. I've basically taken the XOR network from there and swapped out its operations with the proper utility functions for our encrypted weights.
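If it helps to see what "polynomial evaluation of sigmoid" means concretely, here is a small plaintext sketch. The coefficients below are the low-degree Taylor expansion of sigmoid around 0, used here as an assumed illustration; the post builds its own H_sigmoid matrix for the encrypted version:

import numpy as np

def sigmoid(x):
    return 1/(1 + np.exp(-x))

def sigmoid_poly(x):
    # degree-5 Taylor expansion of sigmoid around 0
    return 0.5 + x/4 - x**3/48 + x**5/480

x = np.linspace(-2, 2, 5)
print(sigmoid(x))       # true sigmoid
print(sigmoid_poly(x))  # close for small |x|, which is all a low-degree polynomial can offer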
     
    @@ -652,7 +636,7 @@ When I train this neural network, this is the output that I see. Tuning was a bi
     
     

    Part 9: Sentiment Classification

-To make this a bit more real, here's the same network training on IMDB sentiment reviews based on a network from Udacity's Deep Learning Nanodegree. You can find the full code here
+To make this a bit more real, here's the same network training on IMDB sentiment reviews based on a network from Udacity's Deep Learning Nanodegree. You can find the full code here