Step 2: Build Data and Structures for Training and Testing

Next, you will build the data and structures used to train and test the recommender model.

The API divides the data into an 80-20 training/testing split.
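The following is a minimal sketch of what an 80-20 split looks like, not the API's actual implementation. It assumes a random shuffle of the interaction events; the API may split differently (for example, by time or per user). The function name split_events and the sample events are hypothetical.

```python
import random

def split_events(events, train_fraction=0.8, seed=0):
    """Shuffle a copy of the events and split it into (train, test).

    Assumption: a simple random split. The real API may use another strategy.
    """
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (user, item, event type) interactions.
events = [("user%d" % (i % 10), "item%d" % i, "watch") for i in range(100)]
train, test = split_events(events)
print(len(train), len(test))  # prints "80 20"
```

Every event lands in exactly one of the two sets, so the model is tested only on interactions it did not see during training.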

The API call has a ‘contextual’ parameter that influences how the data is grouped.

If ‘contextual’ is false, the data is differentiated only by event type. For example, all users who watched a particular item, or all users who bought a particular item, are grouped together, regardless of how long they watched the item or what quantity they bought.

If ‘contextual’ is true, interactions with the same or similar event types and values are grouped together, which allows better training on the minor differences in the user-item interactions. For example, users who spent a similar amount of time watching a specific item are grouped together, as are users who bought a similar quantity of an item.

For example, on the MovieLens data with ‘contextual’ set to false, the algorithm treats all the movies a given user rated as one monolithic group, irrespective of the rating.

If ‘contextual’ is true, all the movies that a user gave the same rating (an eight, for example) are grouped together. This forces the algorithm to learn minor distinctions in user behaviour and improves accuracy.
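The grouping difference can be sketched in a few lines of Python. This is an illustration only, not the API's internal logic: the function group_interactions, the tuple layout, and the sample ratings are hypothetical, and the sketch groups on exact event values, whereas the API also groups similar values.

```python
from collections import defaultdict

def group_interactions(events, contextual=False):
    """Group item IDs per user by event type.

    When contextual is True, the event value is also part of the group key.
    """
    groups = defaultdict(list)
    for user, item, event_type, value in events:
        key = (user, event_type, value) if contextual else (user, event_type)
        groups[key].append(item)
    return dict(groups)

# Hypothetical (user, item, event type, event value) interactions.
ratings = [
    ("alice", "movie_a", "rate", 8),
    ("alice", "movie_b", "rate", 8),
    ("alice", "movie_c", "rate", 3),
]

flat = group_interactions(ratings, contextual=False)
ctx = group_interactions(ratings, contextual=True)

print(flat)  # one group holding all three movies
print(ctx)   # one group for the movies rated 8, another for the movie rated 3
```

With contextual set to false, all three movies fall into a single group for the user. With contextual set to true, the two movies rated eight form one group and the movie rated three forms another.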

For more information, please refer to the API documentation.