Sample weights by nikisix · Pull Request #120 · maciejkula/spotlight

nikisix · 2018-07-17T18:12:55Z

Commit corresponds to discussion at #118
Loss functions and implicit factorization done.
Implicit sequence coming.

maciejkula

This looks great, I left a couple of comments. I think once we've settled on a design for the implicit factorization models we can extend to all the other models as well.

Thanks for your help, I really appreciate it!

maciejkula · 2018-07-18T08:00:12Z

-        mask = mask.float()
-        loss = loss * mask
-        return loss.sum() / mask.sum()
+    if sample_weights is not None or mask is not None:


We don't need these checks here: we can always call the base loss function. I think that will make for less code repetition in all the loss functions.
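A minimal sketch of what this comment suggests, using plain Python lists as a stand-in for tensors so it runs without PyTorch. The names `weighted_loss_sketch` and `pointwise_loss_sketch`, and the choice to average over unmasked entries, are assumptions for illustration rather than the PR's code. The helper treats None as "not supplied", so a loss function can call it unconditionally:

```python
import math


def weighted_loss_sketch(loss, sample_weights=None, mask=None):
    """Reduce per-example losses to a scalar; None means "not supplied"."""
    if sample_weights is not None:
        loss = [value * w for value, w in zip(loss, sample_weights)]
    if mask is not None:
        kept = [float(m) for m in mask]
        # Average only over the entries the mask keeps (assumed normalisation).
        return sum(value * k for value, k in zip(loss, kept)) / sum(kept)
    return sum(loss) / len(loss)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_loss_sketch(positive_predictions, negative_predictions,
                          sample_weights=None, mask=None):
    """Hypothetical caller: no `is not None` checks, it just delegates."""
    loss = [(1.0 - sigmoid(p)) + sigmoid(n)
            for p, n in zip(positive_predictions, negative_predictions)]
    return weighted_loss_sketch(loss, sample_weights, mask)
```

With this shape, the None checks live in one place and every loss function ends with the same delegating call.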

maciejkula · 2018-07-18T08:00:17Z



-def pointwise_loss(positive_predictions, negative_predictions, mask=None):
+def base_loss(loss, sample_weights=None, mask=None):


Could we call this _weighted_loss? This would reflect the fact that:

This is an internal function.

Its essence lies in applying weights.

I'm not opposed to this name, but it'd be a misnomer in case they call us with a mask and no weights. Although, conceptually at least, a mask could be thought of as a weight of zero.

_base_loss
vs
_weighted_loss
vs
_modified_loss

¯\_(ツ)_/¯ ?

If, as you suggest, we subsume masks under weights this will all be fine!

maciejkula · 2018-07-18T08:04:12Z

+                raise ValueError('Degenerate epoch loss: {}'
+                                 .format(epoch_loss))
+
+    def fit_weighted(self, interactions, verbose=False):


I think we should keep a single fit method. If weights are present, we use them; if not, we don't.

This could probably be accomplished by changing the minibatch function to keep yielding None for any argument that is None rather than a tensor. That way, when no weights are supplied, it yields None for batch_sample_weights on every batch, which ties in nicely with how the loss functions are changed.

I definitely tried to keep it to a single fit function at first, but I want to avoid consuming RAM for an unnecessary tensor of Nones.
Would this lead to a tensor of Nones (rather than a single None) being passed to the loss functions, unless the tensor is not created by yielding Nones?
We would also have to augment torch_utils.shuffle and add a few if all(weights==None) checks, which may slow it down a bit.

What if we just checked whether weights are specified at the beginning of fit and, if they are, had fit call _fit_weighted? That way we keep the API of the user always calling fit, but still keep the code relatively simple.

Sequence weight question for you: I was thinking about the sequence interactions yesterday, and my guess is to make a parallel sequence tensor full of sample weights, of the same dimensions as the sequence tensor of item-ids. Does this sound right?
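One way to picture the parallel-weights idea in this question, as a sketch only. The function name `pad_sequence_with_weights`, the use of left padding, and `PADDING_IDX = 0` are assumptions made for illustration; plain lists stand in for tensors:

```python
PADDING_IDX = 0  # assumed padding value for item ids


def pad_sequence_with_weights(item_ids, weights, max_length):
    """Left-pad item ids and build a weight sequence of the same length.

    Padded positions get weight 0.0 so they cannot contribute to a loss.
    """
    if len(item_ids) != len(weights):
        raise ValueError('item_ids and weights must have the same length')

    # Keep only the most recent max_length interactions.
    item_ids = list(item_ids[-max_length:])
    weights = list(weights[-max_length:])

    pad = max_length - len(item_ids)
    return [PADDING_IDX] * pad + item_ids, [0.0] * pad + weights
```

Because both outputs have the same length, position i of the weight sequence always refers to position i of the item sequence.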

No, we won't need a tensor of nones. I'll add a comment with a prototype of what I meant.

nikisix · 2018-07-18T22:27:32Z

I notice that the sequence losses already make pretty heavy use of the mask argument via PADDING_IDX.

                loss = self._loss_func(positive_prediction,
                                       negative_prediction,
                                       mask=(sequence_var != PADDING_IDX))

Is it correct for sample weights to override them in the _weighted_loss, or is there something more sophisticated we could be doing?

maciejkula · 2018-07-19T11:02:13Z

This is a good point: we can definitely subsume masks under weights by just setting weights to zero where mask should be false.
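A small sketch of folding a mask into weights, with plain lists standing in for tensors. The helper names and the normalisation by the sum of weights are assumptions for illustration; the thread does not settle how the weighted loss is normalised:

```python
def fold_mask_into_weights(mask, sample_weights=None):
    """Return weights that are zero wherever the mask is False."""
    if sample_weights is None:
        sample_weights = [1.0] * len(mask)
    return [w if keep else 0.0 for w, keep in zip(sample_weights, mask)]


def weights_only_loss(loss, weights):
    """Weighted mean of per-element losses; zero-weight entries drop out."""
    total = sum(weights)
    if total == 0:
        raise ValueError('All weights are zero')
    return sum(value * w for value, w in zip(loss, weights)) / total


PADDING_IDX = 0  # assumed padding value
sequence = [0, 0, 5, 7]
loss = [9.0, 9.0, 1.0, 3.0]
mask = [item != PADDING_IDX for item in sequence]

# No sample weights: identical to the masked mean over the last two entries.
masked_mean = weights_only_loss(loss, fold_mask_into_weights(mask))

# Sample weights on top of the mask: padded positions still contribute nothing.
weighted = weights_only_loss(
    loss, fold_mask_into_weights(mask, [1.0, 1.0, 3.0, 1.0]))
```

Under this assumption the loss function needs only a weights argument, and the padding mask becomes one more source of zero weights.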

maciejkula · 2018-07-19T11:06:18Z

+                raise ValueError('Degenerate epoch loss: {}'
+                                 .format(epoch_loss))
+
+    def _fit_weighted(self, interactions, verbose=False):


I'm still not convinced about having a separate _fit_weighted function, even if it is internal. I think it introduces too much code duplication that will have to be kept in sync.

What about modifying minibatch to look roughly like this:

    def minibatch(*tensors, **kwargs):

        batch_size = kwargs.get('batch_size', 128)

        if len(tensors) == 1:
            tensor = tensors[0]
            for i in range(0, len(tensor), batch_size):
                yield tensor[i:i + batch_size]
        else:
            for i in range(0, len(tensors[0]), batch_size):
                yield tuple(x[i:i + batch_size] if x is not None else None
                            for x in tensors)

This way, it emits tensor slices if an argument is a tensor (as before), but also emits None in the tuple if an argument is None.
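To make the behaviour concrete, here is the prototype run on plain Python lists, which slice the same way as tensors. The example data and variable names are made up for illustration:

```python
def minibatch(*tensors, **kwargs):
    """Yield aligned slices; a None argument yields None in every batch."""
    batch_size = kwargs.get('batch_size', 128)

    if len(tensors) == 1:
        tensor = tensors[0]
        for i in range(0, len(tensor), batch_size):
            yield tensor[i:i + batch_size]
    else:
        # The first argument sets the length, so it must not be None.
        for i in range(0, len(tensors[0]), batch_size):
            yield tuple(x[i:i + batch_size] if x is not None else None
                        for x in tensors)


user_ids = [0, 1, 2, 3, 4]
item_ids = [10, 11, 12, 13, 14]
sample_weights = [1.0, 0.5, 2.0, 1.0, 0.0]

# Same loop shape whether or not weights were supplied.
unweighted = list(minibatch(user_ids, item_ids, None, batch_size=2))
weighted = list(minibatch(user_ids, item_ids, sample_weights, batch_size=2))
```

In the unweighted case the third element of every tuple is the single value None, so no tensor of Nones is ever allocated and a single fit loop can pass it straight to the loss function.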

maciejkula · 2018-07-19T11:07:40Z

No, we won't need a tensor of nones. I'll add a comment with a prototype of what I meant.

maciejkula

I like the idea of making masks simply zero weights.

But I still think we should try to keep only one fit method :)

maciejkula · 2018-08-06T00:21:35Z

Closing in favour of #122

nikisix added 3 commits July 11, 2018 16:15

MAINT: adding spotlight results files to gitignore

6952630

feature: sample weight implementation for loss functions and implicit factorization models

910fb9a

lint: linting changes

f8a8409

maciejkula requested changes Jul 18, 2018

nikisix added 3 commits July 18, 2018 12:16

MAINT: commit based on pr 120 discussion. rename base_loss to _weighted_loss. rename fit_weighted to _fit_weighted and call it from fit to streamline the user-api. add cscope database files to gitignore.

191356b

LINT: BUG:

0ba7164

LINT: BUG:

91a4d22

maciejkula reviewed Jul 19, 2018

maciejkula requested changes Jul 19, 2018

MAINT: augmenting implicit_factorizers fit method to handle sample_weights and removing _fit_weighted.

28faf31

maciejkula closed this Aug 6, 2018


