Dealing with non-differentiable loss functions in ML
Let the loss function be the 0-1 loss $$f(y)=\mathbb{1}(h_\theta(x)\ne y)$$
Clearly, it is non-differentiable with respect to the weights $\theta$.
Hence, we resort to policy gradient methods, which are used in reinforcement learning.
Our objective is to minimize the expected loss, i.e. $$\min_{\theta} E_{y \sim P_{\theta}} \left[ f(y) \right]$$
$$\nabla_{\theta} E_{y \sim P_{\theta}}[ f(y)] = \nabla_{\theta} \int P_{\theta}(y) f(y)\,dy = \int P_{\theta}(y) f(y) \frac{\nabla_{\theta} P_{\theta}(y) }{P_{\theta}(y)}\,dy = E_{y \sim P_{\theta}} \left[ f(y) \frac{\nabla_{\theta} P_{\theta}(y)}{P_{\theta}(y)} \right] = E_{y \sim P_{\theta}} \left[ f(y)\, \nabla_{\theta} \ln P_{\theta}(y) \right]$$
Note that we are no longer differentiating the loss itself, only the log-probability of the model's output. This manipulation is sometimes called the log-derivative trick.
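In practice this expectation is approximated with Monte Carlo samples drawn from the model itself: for $N$ sampled outputs $y_i \sim P_{\theta}$,
$$\nabla_{\theta} E_{y \sim P_{\theta}}[f(y)] \approx \frac{1}{N}\sum_{i=1}^{N} f(y_i)\, \nabla_{\theta} \ln P_{\theta}(y_i)$$
This score-function (REINFORCE) estimator is the update that the training loop below implements, with the 0-1 loss replaced by a $\pm 1$ reward and a gradient ascent step.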
In [11]:
import numpy as np
from sklearn.metrics import accuracy_score
In [12]:
%%latex
Let us dive into the code. We will create a synthetic dataset to test whether the method works, where
$$y=\begin{cases}
1 & x> 0.5 \\
0 & x \leq 0.5
\end{cases}
$$
In [13]:
x=np.random.rand(200,1)
y=(x>0.5).astype('int32')
In [14]:
#Setting up predictor function
def sigmoid(z,weights):
    temp=1+np.exp(-(weights[0]*z+weights[1]))
    return 1.0/temp
$$\sigma(z)= \frac{1}{1+e^{-z}}, \qquad P_{\theta}(y=1 \mid x)=\sigma(w_0 x + w_1)$$
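For this Bernoulli model the log-derivative has a simple closed form, and it is what the `grads` lines in the training loop below compute (with $\theta = (w_0, w_1)$):
$$\nabla_{\theta} \ln P_{\theta}(y=1\mid x) = \big(1-\sigma(w_0 x + w_1)\big)\begin{pmatrix} x \\ 1 \end{pmatrix}, \qquad \nabla_{\theta} \ln P_{\theta}(y=0\mid x) = -\,\sigma(w_0 x + w_1)\begin{pmatrix} x \\ 1 \end{pmatrix}$$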
In [15]:
import random
In [16]:
alpha=0.1                               # learning rate
weights=np.random.rand(2,1)*0.01        # [w0, w1], initialised near zero
#print(weights)
num_iter=1000
minibatch_size=8
for h in range(num_iter):
    change=0
    # sample a minibatch of (x, y) pairs
    minibatch=np.array(random.sample(list(np.hstack([x,y])),minibatch_size))
    x_samp=minibatch[:,0]
    y_samp=minibatch[:,1]
    #print(x_samp)
    #print(y_samp)
    for m,l in enumerate(list(x_samp)):
        prob_positive=sigmoid(l,weights)           # probability of class 1
        ypred=np.random.rand()<prob_positive       # sampling the prediction from a Bernoulli
        reward= 2*(ypred==y_samp[m])-1             # reward: +1 if correct, -1 otherwise
        # gradient of log P(ypred | x) with respect to [w0, w1]
        if ypred==1:
            grads=np.array([(1-prob_positive)*l,(1-prob_positive)])
        if ypred==0:
            grads=np.array([-prob_positive*l,-prob_positive])
        change=change+alpha*grads*reward
        #print(grads)
    #print(change/minibatch_size)
    weights=weights+change/minibatch_size          # gradient ascent on the expected reward
    if h%200==0:
        print(h)
#checking the accuracy
pred=[]
for idx,el in enumerate(x):
    positive_prob=sigmoid(el,weights)
    if positive_prob>=0.5:
        t=1
    else:
        t=0
    pred.append(t)
print('Accuracy is {}'.format(accuracy_score(pred,y)))
#print(weights)
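As a quick sanity check, one can also look at where the learned decision boundary falls: the model predicts class 1 whenever $\sigma(w_0 x + w_1) \geq 0.5$, i.e. whenever $w_0 x + w_1 \geq 0$, so the boundary sits at $x = -w_1/w_0$ and should land near the true threshold of 0.5 used to generate the labels. A minimal sketch (not part of the original run):
In [ ]:
# sanity-check sketch (assumption, not in the original notebook):
# recover the implied decision boundary x = -w1/w0 from the learned weights
boundary=(-weights[1]/weights[0]).item()
print('Learned decision boundary at x = {:.3f} (true threshold 0.5)'.format(boundary))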