1) Purpose The purpose of this project is to: Pre-processing -…

1) Purpose
The purpose of this project is to:
Pre-processing – Retrieve & prepare the data:
- Load and explore the dataset referenced in section 4 of this document using techniques learnt during this course.
- Visualize the data and describe it thoroughly, identify correlations, etc.
- Clean the data, transform categorical data and model the dataset using the techniques learnt throughout the course in preparation for building a predictive model.
Model building & fine tuning:
- Build a supervised predictive model using a suitable classification algorithm in Python, utilizing scikit-learn, pandas, numpy, etc., to provide predictions as specified in the project specifications, section 3 of this document.
- Validate, score and evaluate the models and choose the best model.
Model deployment:
- Build an API for the model using the Python Flask framework.
- Build a simple front end to access the API and pass new feature values to the prediction model.

2) Guidelines & Instructions
Be sure to read all the
General:
- Read the textbook, course lecture content, class examples, and additional references provided here. Each team is free to research and use more materials and tools to implement a good solution; just make sure to reference them in your solution and your report.
- All code developed in Python and any other language should be part of the submission.
- The submission should be accompanied by a report prepared as a Microsoft document or a PDF explaining the project and detailing all the assumptions and constraints applied. (Details in section 3.)
- Name the project submission: “Bycycle_theft_Group_Group#_section_section#COMP247Project” where Group# is the assigned group number and section# is the group's section number.
- The submission should be a zipped file containing the code and the written report.
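The pre-processing steps listed under Purpose (explore, check for missing data, clean, transform categorical data) could look roughly like the sketch below. It is only an illustration on a tiny made-up table: the column names `Bike_Type`, `Cost_of_Bike` and `Status`, the values, and the choice of median / "UNKNOWN" fill-ins are all assumptions, not the actual fields of the Toronto Police dataset or a required cleaning strategy. With the real file, the table would come from `pd.read_csv` on the downloaded CSV instead.

```python
import pandas as pd

# Made-up stand-in for the downloaded CSV (column names are hypothetical).
df = pd.DataFrame({
    "Bike_Type": ["MT", "RG", None, "MT", "RC", "RG"],
    "Cost_of_Bike": [500.0, None, 1200.0, 300.0, 2500.0, 800.0],
    "Status": ["STOLEN", "STOLEN", "RECOVERED", "STOLEN", "RECOVERED", "STOLEN"],
})

# Explore: summary statistics, missing values per column, class balance.
summary = df.describe(include="all")
missing = df.isna().sum()
class_counts = df["Status"].value_counts()

# Clean: fill numeric gaps with the median, categorical gaps with a label.
df["Cost_of_Bike"] = df["Cost_of_Bike"].fillna(df["Cost_of_Bike"].median())
df["Bike_Type"] = df["Bike_Type"].fillna("UNKNOWN")

# Transform: one-hot encode the categorical column, build a binary target
# (1 = bike recovered, 0 = not recovered).
features = pd.get_dummies(df[["Bike_Type", "Cost_of_Bike"]], columns=["Bike_Type"])
target = (df["Status"] == "RECOVERED").astype(int)

print(missing)
print(class_counts)
print(features.head())
```

The class counts printed here are the kind of check that shows whether the imbalanced-classes handling mentioned in section 3 is needed.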
3) Project Specifications & deliverables
Both the police department and the “general public” would make use of a software product that can give them an idea of the likelihood of bicycle theft. For the police department, it would assist them in taking better anti-theft measures around certain neighborhoods. For members of the public, it would help them assess the need for additional precautions such as locks.
Based on the dataset described in section 4 below, which is actual data collected over a period of five years by the Toronto police department, you need to build a predictive service that, based on certain features, classifies whether a bike is likely to be returned or not. Please arrange to provide the following deliverables for your project.
1. Data exploration: a complete review and analysis of the dataset including:
- Load and describe data elements (columns), provide descriptions & types, ranges and values of elements as appropriate – use pandas, numpy and any other Python packages.
- Statistical assessments including means, averages, correlations.
- Missing data evaluations – use pandas, numpy and any other Python packages.
- Graphs and visualizations – use pandas, matplotlib, seaborn, numpy and any other Python packages; you can also use Power BI Desktop.
2. Data modelling:
- Data transformations – includes handling missing data, categorical data management, data normalization and standardization as needed.
- Feature selection – use pandas and scikit-learn. (The group needs to justify each feature used and any data columns discarded.)
- Train/test data splitting – use numpy, scikit-learn.
- Managing imbalanced classes if needed. Check here for info: https://elitedatascience.com/imbalanced-classes
- Use the Pipeline class to streamline all the pre-processing transformations.
3. Predictive model building:
- Use logistic regression, decision trees, SVM, random forest and neural networks as a minimum – use scikit-learn.
- Fine tune the model using grid search and randomized grid search.
4. Model scoring and evaluation:
- Present results as accuracy, precision, recall, F1 scores and confusion matrices, and plot the ROC curves of the models – use scikit-learn.
- Select and recommend the best performing model.
5. Deploying the model:
- Using the Flask framework, arrange to turn your selected machine-learning model into an analytics API.
- Using the pickle module, arrange for serialization & deserialization of your model.
- Build a client to test your model API service. Use the test data, which was not previously used to train the model. You can use simple Jinja HTML templates with or without JavaScript, React or any other technology, but at minimum use the Postman API client.
6. Prepare a report explaining your project and detailing all the assumptions and constraints you applied. It should have the following sections:
- Executive summary (to be written once nearing the end of project work; should describe the problem/solution and key findings)
- Overview of your solution (to be written once nearing the end of project work)
- Data exploration and findings (dataset field descriptions, graphs, visualizations, tools and libraries used, etc.)
- Feature selection (tools and techniques used, results of different combinations, etc.)
- Data modeling (data cleaning strategy, results of data cleaning, data wrangling techniques, assumptions and constraints)
- Model building (train/test data, sampling, algorithms tested, results: confusion matrices, etc.)

4) Data Set
This dataset contains actual bicycle theft occurrences from 2014-2019 in the city of Toronto.
(might change to 2020 depending on frequency of update)
In accordance with the Municipal Freedom of Information and Protection of Privacy Act, the Toronto Police Service has taken the necessary measures to protect the privacy of individuals involved in the reported occurrences. No personal information related to any of the parties involved in the occurrence will be released as open data.
The location of crime occurrences have been deliberately offset to the nearest road intersection node to protect the privacy of parties involved in the occurrence. All location data must be considered as an approximate location of the occurrence and users are advised not to interpret any of these locations as related to a specific address or individual.
The reported crime dataset is intended to provide communities with information regarding public safety and awareness. The data supplied to the Toronto Police Service by the reporting parties is preliminary and may not have been fully verified.
https://data.torontopolice.on.ca/datasets/bicycle-thefts
Use the download tab and select spreadsheet to download the dataset as a csv file, also download the Metadata file.
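Once the CSV is loaded, deliverables 2 to 5 in section 3 (pipeline-based transformations, train/test split, grid search, scoring, and pickle serialization) might be wired together roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: the data is randomly generated, the feature names `bike_type` and `cost` are hypothetical, only logistic regression is shown out of the five required algorithms, and the parameter grid and `class_weight="balanced"` setting are arbitrary illustrative choices.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic placeholder data (hypothetical features, not the real dataset).
rng = np.random.default_rng(42)
n = 400
X = pd.DataFrame({
    "bike_type": rng.choice(["MT", "RG", "RC"], size=n),
    "cost": rng.normal(800, 300, size=n),
})
# Label loosely tied to cost so there is a signal to learn (1 = recovered).
y = (X["cost"] + rng.normal(0, 200, size=n) > 1000).astype(int)
X.loc[:9, "cost"] = np.nan  # inject a few missing values for the imputer

# Pre-processing: impute + scale numeric columns, one-hot encode categoricals.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["cost"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["bike_type"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Stratified split keeps the class ratio the same in train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Grid search over the regularization strength, scored by F1.
search = GridSearchCV(model, {"clf__C": [0.1, 1.0, 10.0]}, scoring="f1", cv=5)
search.fit(X_train, y_train)

# Scoring on held-out data.
y_pred = search.predict(X_test)
y_prob = search.predict_proba(X_test)[:, 1]
cm = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)
auc = roc_auc_score(y_test, y_prob)
print(cm)
print(report)

# Serialization and deserialization of the fitted pipeline with pickle.
restored = pickle.loads(pickle.dumps(search.best_estimator_))
```

Because the whole pipeline is pickled, not just the classifier, the restored object accepts raw feature rows. That is the property a Flask route would rely on when it calls `predict` on values posted by the client.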