In this project, I used OpenCV and the Viola-Jones algorithm to build a simple face detection system that works in real time. Problem: I need to detect faces in images and in a webcam feed in real time. Task: Build an algorithm or machine learning model to detect faces in images. Solution: I used Python, OpenCV, and the Viola-Jones algorithm to build a fast face detector. Results: The solution has good accuracy and detects faces fast enough for real-time use.
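Much of the speed of Viola-Jones comes from the integral image (summed-area table), which lets the sum of any rectangle be read with four lookups. The sketch below is a pure-Python illustration of that idea and of one two-rectangle Haar-like feature. It is not the project's code, which relies on OpenCV; the function names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros.

    `img` is a list of equal-length rows of numbers (grayscale intensities).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left, inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a window (the sign convention is an assumption)."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


if __name__ == "__main__":
    # Tiny made-up 2x4 "image": bright on the left, dark on the right.
    sample = [
        [9, 9, 1, 1],
        [9, 9, 1, 1],
    ]
    table = integral_image(sample)
    print(two_rect_feature(table, 0, 0, 4, 2))
```

Each feature costs a constant number of lookups regardless of the window size, which is what makes scanning many windows per frame feasible.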
This project presents the code (kernel) used in a Kaggle competition promoted by Data Science Academy in December 2019. The aim of the competition was to build a model that predicts the probability that a given claim will be approved immediately by the insurance company, based on the resources available at the beginning of the process. This helps the insurance company accelerate the payment release process and thus provide better service to the client.
This project presents the code (kernel) used in a Kaggle competition promoted by Data Science Academy in September 2019. The goal of the competition was to create a Machine Learning model that helps a robot classify the floor surface it is standing on, using data collected by Inertial Measurement Unit (IMU) sensors. About the project: The data used in this competition was collected by the Tampere University Signal Processing Department in Finland.
This project describes how I solved a client's problem: getting dentist data quickly from findadentist.ada.org. Problem: A client needed a lot of information about the dentists listed on findadentist.ada.org. Collecting the data manually was very repetitive and impractical given the number of dentists required. Task: Create a bot to automate the collection (scraping) of the website's data. Solution: I used Python to create a bot that sends requests to the findadentist.
This project presents the code (kernel) used in a Kaggle competition promoted by Data Science Academy in January 2019. The goal of the competition was to create a Machine Learning model to predict the occurrence of diabetes. Data source: National Institute of Diabetes and Digestive and Kidney Diseases. Competition page: https://www.kaggle.com/c/competicao-dsa-machine-learning-jan-2019/ Problem: Predict the probability of the occurrence of diabetes from patient data. Task: Create a Machine Learning model to estimate the probability of the occurrence of diabetes.
This project was built using Scrapy (a scraping and web crawling framework). It contains a set of spiders that gather product data from the Etsy website. Problem: The client needs data (product_id, url, price, rating, number_of_reviews, product_options, count_of_images, images_urls, favorited_by, store_name and description) on thousands of products from etsy.com to perform data analysis. Task: Create an automated and fast solution to navigate the website, find products by search text, extract all the data, and save it in a user-friendly format (CSV and XLSX).
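The CSV export step can be sketched with the standard library alone. This is only an illustration of writing records with the fields listed above, not the project's Scrapy pipeline; the helper name `items_to_csv`, the `|` separator for list-valued fields, and the sample record are assumptions.

```python
import csv
import io

# Column order taken from the field list in the project description.
FIELDS = [
    "product_id", "url", "price", "rating", "number_of_reviews",
    "product_options", "count_of_images", "images_urls",
    "favorited_by", "store_name", "description",
]


def items_to_csv(items, out):
    """Write an iterable of dict-like items to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore", restval="")
    writer.writeheader()
    for item in items:
        row = dict(item)
        for key, value in row.items():
            # Flatten list-valued fields (e.g. image URLs) into a single cell.
            if isinstance(value, (list, tuple)):
                row[key] = "|".join(str(v) for v in value)
        writer.writerow(row)


if __name__ == "__main__":
    # Placeholder record for illustration only, not real scraped data.
    sample = [{"product_id": "0001", "images_urls": ["img-a", "img-b"]}]
    buffer = io.StringIO()
    items_to_csv(sample, buffer)
    print(buffer.getvalue())
```

Missing fields are written as empty cells and unknown keys are ignored, so partially scraped products still produce a well-formed row.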
This project presents an exploratory data analysis of a dataset available on Kaggle. The dataset contains over 370,000 used cars scraped from eBay Kleinanzeigen. The dataset can be downloaded from https://www.kaggle.com/orgesleka/used-cars-database The analysis was driven by several questions, which were answered through tables or graphs. Problem: Answer questions about cars sold on eBay Kleinanzeigen. Questions: What is the distribution of vehicles by year of registration? What is the variation of the price range by type of vehicle?
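The first question, the distribution of vehicles by year of registration, amounts to a frequency count. Below is a minimal standard-library sketch, not the analysis code itself; the function name `count_by_year` and the column name `yearOfRegistration` are assumptions and should be checked against the downloaded file.

```python
from collections import Counter


def count_by_year(rows, year_key="yearOfRegistration"):
    """Count rows per registration year, skipping missing or non-numeric values.

    `rows` is an iterable of dicts, e.g. the output of csv.DictReader.
    """
    counts = Counter()
    for row in rows:
        try:
            year = int(row[year_key])
        except (KeyError, TypeError, ValueError):
            continue  # ignore rows without a usable year
        counts[year] += 1
    # Return the years in ascending order for easier tabulation or plotting.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Placeholder rows for illustration only, not taken from the dataset.
    rows = [
        {"yearOfRegistration": "2004"},
        {"yearOfRegistration": "1999"},
        {"yearOfRegistration": "2004"},
        {"yearOfRegistration": ""},
    ]
    print(count_by_year(rows))
```

The resulting mapping can be printed as a table or passed to a plotting library to draw the distribution.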
This project contains a set of scripts used to scrape eBay product data using the Scrapy web crawling framework. Problem: The client needs data (name, status, price, stars, ratings) on thousands of products from ebay.com to perform data analysis. Task: Create an automated and fast solution to navigate the website, find products by search text, extract all the data, and save it in a user-friendly format (CSV or JSON).
In this project, I built a scraper (bot) to get equipment data from testmart.com using the Scrapy web crawling framework. Problem: The client needs data on thousands of pieces of equipment of the National Instruments corporation type from testmart.com to perform data analysis. Task: Create an automated and fast solution to navigate the website, extract all the data for the National Instruments corporation type, and save it in a user-friendly format (CSV, JSON and XML).