Full datasets and ref here:…

Full datasets and ref here: https://drive.google.com/drive/folders/1EYcXkqDk2cjLAT7rASC3w5K75tPJ6Vlv?usp=sharing

Implement your own version of a collaborative filtering recommender system that uses the K-Nearest-Neighbor (KNN) learning algorithm to predict users' ratings on items. Note that, in contrast to the previous problem, KNN is not a model-based machine learning approach. In other words, there is no separate training component that generates a model. Instead, all the work is done at prediction time: when we want to generate a prediction for a test instance x (in this case a user), we measure the similarity of x to every instance in the training data to identify the K users most similar to x. These are the K nearest neighbors. Your implementation should use the Pearson Correlation Coefficient as the similarity function (to compute similarities between the test instance and the training instances). Be sure to review the lecture notes on K-Nearest-Neighbor Learning and on Classification & Prediction: Basic Concepts (and the associated videos) before starting on this problem.

The basic tasks for this assignment can be described as follows:

Given: a user-item ratings matrix with rows corresponding to user records (ratings given by a user to various items) and columns corresponding to all items in the database. This is the training data.

Also given: a specific target user, u_t, and a target item, i_t. The item i_t is one of the existing items in our item database. The target user u_t may be a new user or an existing user and is represented as a vector of ratings over all items (a row in the ratings matrix).

The task is to compute the predicted rating of user u_t on item i_t (assuming that u_t has not previously rated i_t).
Note also that if u_t is a test user who has an actual rating on item i_t, the predicted rating for i_t can still be generated and compared to the actual rating to measure the prediction error.

As noted above, generating a predicted rating of user u_t on item i_t involves first identifying the K nearest neighbors of u_t among the users in the database. This is accomplished by computing the Pearson Correlation between u_t and all other users (rows in the ratings matrix) and ranking them in decreasing order of similarity. (Note: when computing the correlation between two users, you must only consider the overlapping items, i.e., the items rated by both users.) Once the nearest neighbors of u_t are identified, their ratings on item i_t (if they exist) are used to generate the prediction for user u_t. Specifically, the predicted rating of user u_t on item i_t will be the weighted average of the ratings of the K neighbors on item i_t, with each neighbor's weight being the similarity of that neighbor to u_t. This prediction algorithm is further described in this example. If none of the neighbors of u_t has a rating for the target item, then the returned prediction should be the mean rating on the item across all training users. Note that K (the number of neighbors) is a parameter that can be varied to determine the optimal number of neighbors in different data sets.

In addition to being able to generate a predicted rating for a given user u_t and item i_t, your program should be able to generate a list of recommendations for a given user u_t. Given a user u_t and the number of desired recommendations N, your program should be able to generate the top N recommended items for u_t. Your program will identify all items in the database not previously rated by u_t, and for each such item it will generate u_t's predicted rating on that item using the KNN approach described above. It will then rank these items in decreasing order of predicted rating.
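The neighbor search, weighted-average prediction, and top-N ranking described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not a reference solution. The function names (`pearson`, `predict`, `recommend`) are hypothetical, and several choices are assumptions the assignment does not spell out: the ratings matrix is a list of rows with 0 meaning "not rated", user means are taken over the co-rated items only, neighbors with non-positive similarity are ignored in the weighted average, and 0.0 is returned if no training user rated the item at all. The sketch also does not exclude the target user from the training rows.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation over the items both users rated (0 = no rating)."""
    pairs = [(a, b) for a, b in zip(u, v) if a > 0 and b > 0]
    if len(pairs) < 2:
        return 0.0  # assumption: too little overlap counts as no similarity
    mean_u = sum(a for a, _ in pairs) / len(pairs)
    mean_v = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in pairs)
    var_u = sum((a - mean_u) ** 2 for a, _ in pairs)
    var_v = sum((b - mean_v) ** 2 for _, b in pairs)
    denom = sqrt(var_u * var_v)
    return cov / denom if denom > 0 else 0.0


def predict(train, user, item, k):
    """Predicted rating of `user` on `item` from the k most similar training rows."""
    profile = list(user)
    profile[item] = 0  # hide any known rating on the target item
    # (similarity, neighbor's rating on the target item), most similar first
    scored = [(pearson(profile, row), row[item]) for row in train]
    scored.sort(key=lambda p: p[0], reverse=True)
    # assumption: only neighbors with positive similarity and a rating count
    usable = [(s, r) for s, r in scored[:k] if r > 0 and s > 0]
    if usable:
        return sum(s * r for s, r in usable) / sum(s for s, _ in usable)
    # fallback: mean rating on the item across training users who rated it
    ratings = [row[item] for row in train if row[item] > 0]
    return sum(ratings) / len(ratings) if ratings else 0.0


def recommend(train, user, n, k):
    """Top-n unrated items for `user` as (item index, predicted rating) pairs."""
    unrated = [i for i, r in enumerate(user) if r == 0]
    ranked = sorted(((i, predict(train, user, i, k)) for i in unrated),
                    key=lambda p: p[1], reverse=True)
    return ranked[:n]
```

Slicing `ranked[:n]` returns fewer than N items when the user has fewer than N unrated items. Item indices can then be mapped to names with whatever id-to-name table is available.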
The top N items in this list are returned as recommendations. Your program should allow you to specify a user in the data (e.g., a user's row number in the ratings matrix) and the value of N. The output should provide understandable results, including the names of the recommended items as well as the system's predicted ratings on those items. If the user has fewer than N unrated items, then the recommended set will include only the remaining unrated items.

Finally, write a program or function to evaluate the accuracy of your prediction function using Mean Absolute Error (MAE) as the evaluation metric. You can compute MAE by generating predictions for items already rated by each test user. Given a test user u and a previously rated item t, you will use the remaining ratings of user u to generate a prediction for the test item t being considered. For each of these items you can compute the absolute value of the difference between the predicted and the actual ratings (i.e., the prediction error). You must repeat this process for each rated item across all test users. You can then average these errors across all test cases to obtain the MAE. You can compute MAEs for different values of K to determine the optimal value for the number of neighbors.

Testing your recommender:

To test your program, use this book-ratings dataset. This data represents the ratings of 40 users on 8 books. Ratings for users 0 through 29 comprise the training data. Ratings for users 30 through 39 represent the test data. Only the users in the training data are used to find the K nearest neighbors of a target user and generate predictions. Test instances will be used to evaluate prediction accuracy. For this problem, read this data into a user-item ratings matrix with rows corresponding to user profiles (ratings given by a user to various items) and columns corresponding to items in the database.
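The leave-one-out MAE procedure described above might look like the sketch below. It is only an illustration: `mean_absolute_error`, the `predict_fn(train, user, item, k)` signature, and the stand-in `item_mean_predictor` are hypothetical names, not part of the assignment. In practice `predict_fn` would be your KNN prediction function.

```python
def mean_absolute_error(train, test, k, predict_fn):
    """Average |predicted - actual| over every rated item of every test user."""
    errors = []
    for user in test:
        for item, actual in enumerate(user):
            if actual == 0:
                continue  # 0 means the user never rated this item
            held_out = list(user)
            held_out[item] = 0  # predict from the user's remaining ratings only
            predicted = predict_fn(train, held_out, item, k)
            errors.append(abs(predicted - actual))
    return sum(errors) / len(errors) if errors else 0.0


def item_mean_predictor(train, user, item, k):
    """Stand-in predictor (ignores user and k): mean training rating on the item."""
    ratings = [row[item] for row in train if row[item] > 0]
    return sum(ratings) / len(ratings) if ratings else 0.0


# Tiny made-up matrices, only to show the call shape.
toy_train = [[4, 2], [2, 4]]
toy_test = [[5, 0], [0, 1]]
print(mean_absolute_error(toy_train, toy_test, k=5, predict_fn=item_mean_predictor))
```

With the full 40-row matrix loaded as a list of rows named, say, `ratings`, the split described above would be `train, test = ratings[:30], ratings[30:]`.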
Note that, in practice, it is not scalable to convert the data into a full matrix representation (instead, a sparse matrix representation or an inverted index structure is usually used to handle the ratings); however, for simplicity we will use the standard matrix representation to facilitate the use of basic vector operations. Note that 0 entries in the matrix represent the absence of a rating. The mapping of book ids to names is provided in the file hw3-book-names.csv.

Provide the following results in your submission.

- Show the predicted ratings in each of the following cases:
  - User 35, item 0, K = 5.
  - User 39, item 6, K = 10.
- Show the top 3 recommended items for user 35, using K = 20.
- Show the top 3 recommended items for user 36, using K = 10.
- Using your evaluation function, compute and display the MAE obtained on the test data using K = 5.
- Next, repeat the above evaluation procedure, but use a range of values of K from 1 to 30. Use a table and a plot of the values of K (x-axis) and the MAE values (y-axis) to determine the optimal value of K for this test dataset.
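For the final item, the sweep over K could be organized along these lines. This is a hedged sketch: `sweep_k`, `format_table`, and `plot_results` are hypothetical helper names, and `evaluate` stands for any function that maps a value of K to an MAE (for example, a wrapper around your own evaluation function). The plotting helper assumes matplotlib is installed and is optional.

```python
def sweep_k(evaluate, k_values):
    """Evaluate each K; return [(k, mae), ...] and the K with the lowest MAE."""
    results = [(k, evaluate(k)) for k in k_values]
    best_k, _ = min(results, key=lambda p: p[1])  # first K wins on ties
    return results, best_k


def format_table(results):
    """Plain-text table with one row per (K, MAE) pair."""
    lines = [f"{'K':>3}  {'MAE':>7}"]
    lines += [f"{k:>3}  {mae:>7.4f}" for k, mae in results]
    return "\n".join(lines)


def plot_results(results):
    """Optional line plot of MAE against K (requires matplotlib)."""
    import matplotlib.pyplot as plt
    plt.plot([k for k, _ in results], [mae for _, mae in results], marker="o")
    plt.xlabel("K (number of neighbors)")
    plt.ylabel("MAE")
    plt.show()


# Made-up error curve, only to show the call shape; replace with a real evaluation.
results, best_k = sweep_k(lambda k: abs(k - 7) / 10 + 0.5, range(1, 31))
print(format_table(results))
print("best K:", best_k)
```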
