Each section of this page links to a Kaggle notebook where you can find the code along with explanations and analysis.

Breast Cancer Detection using Logistic Regression

I used logistic regression to determine whether the diagnosis is M or B (malignant or benign) by analysing other physical attributes of the person. Visit Kaggle.

Code and Resources Used
Language: Python

Packages used: numpy, pandas, matplotlib, seaborn, and sklearn

Category: Logistic regression

Heat Map: (image)

Results: The accuracy of the model is 96%, estimated from the ROC curve.
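
An ROC curve is commonly summarised by the area under it (AUC). The following is a minimal sketch in plain Python of how that area can be computed from predicted scores. It is not the notebook's code: the function name `roc_auc` and the labels and scores are hypothetical, chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by comparing score pairs.

    labels: 1 for the positive class (malignant), 0 for the negative (benign).
    scores: predicted probability of the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly.
    # A tie counts as half a correctly ranked pair.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores, not taken from the notebook.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

Here 8 of the 9 positive/negative pairs are ranked correctly, so `example_auc` is 8/9. Note that AUC measures ranking quality across all thresholds, which is related to, but not the same as, accuracy at a single threshold.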


Analysis of Donald Trump Speech in Dallas using Text Mining and Text Analysis

I used text analysis to create a word cloud that illustrates the main topics he mentioned in his speech. Visit Kaggle.

Code and Resources Used
Language: R

Packages used: dplyr, tidytext, stringr, wordcloud, knitr, DT, tidyr

Category: Text mining and analysis

Word Cloud: (image)

Results: I created a word cloud and datatable for the speech.
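
A word cloud is driven by word frequencies after tokenising the text and dropping stop words. The notebook does this in R; the sketch below shows the same idea in plain Python under stated assumptions. The stop-word list, the function name `word_frequencies`, and the sample sentence are all hypothetical and are not taken from the speech.

```python
import re
from collections import Counter

# Tiny hypothetical stop-word list; a real analysis would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "we", "our"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Lower-case the text, split it into words, and count non-stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Made-up sample text, for illustration only.
sample_text = "We love our country. Our country is great, and the country is strong."
frequencies = word_frequencies(sample_text)
top_word = frequencies.most_common(1)[0]
```

The resulting counts are what a word cloud renders: the more frequent the word, the larger it is drawn.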


Happiness distribution analysis using Association Rule Technique

I used an Apriori model to find associations among the services available in a place and to predict the likelihood of that place being considered happy. The model is useful for residents who want to predict which housing environment is better. Visit Kaggle.

Code and Resources Used
Language: R

Packages used: arules, arulesViz

Category: Association

Plot of the Rules: (image)

Results: I built two models, with supports of 0.05 and 0.04 and confidence levels of 0.85 and 0.9, respectively.
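
Support and confidence are the two thresholds that decide which association rules are kept. The notebook uses the R package arules; the following plain-Python sketch only illustrates how the two measures are defined. The function names and the small list of places are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical places, each described by its services plus a "happy" label.
places = [
    {"park", "school", "happy"},
    {"park", "hospital", "happy"},
    {"park", "school"},
    {"school", "hospital"},
    {"park", "school", "happy"},
]

# Rule under test: {park, school} => {happy}
rule_support = support(places, {"park", "school", "happy"})
rule_confidence = confidence(places, {"park", "school"}, {"happy"})
```

In this toy data the rule has a support of 0.4 and a confidence of about 0.67. It would pass a minimum support of 0.05 but fail a minimum confidence of 0.85.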


Heart Disease detection using Random Forest Technique

I used a random forest to determine whether a person has a high risk of cardiovascular disease. The model can help doctors make a prediction and recommend the best testing option to the patient. Visit Kaggle.

Code and Resources Used
Language: R

Packages used: caTools, FSelector, randomForest

Category: Classification

Plot of the model: (image)

Results: The trained model was 78% accurate. The accuracy was calculated from the confusion matrix.
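
Accuracy from a confusion matrix is the sum of the diagonal (correct predictions) divided by the total number of predictions. The sketch below shows that calculation in plain Python. The function name and the counts in the matrix are hypothetical, not the notebook's actual results.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return correct / total


# Hypothetical counts: rows are actual classes, columns are predicted classes.
example_matrix = [
    [40, 10],  # actual negative: 40 predicted negative, 10 predicted positive
    [5, 45],   # actual positive: 5 predicted negative, 45 predicted positive
]
example_accuracy = accuracy_from_confusion(example_matrix)
```

With these made-up counts, 85 of 100 predictions fall on the diagonal, so the accuracy is 0.85.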


Diabetes Detection using Decision Tree Technique

I used a decision tree to determine whether a person has diabetes. The model can help doctors make a prediction and recommend the best testing option to the patient. Visit Kaggle.

Code and Resources Used
Language: R

Packages used: caTools, FSelector, party

Category: Classification

Plot of the model: (image)

Results: The trained model was almost 78.5% accurate. The accuracy was calculated from the confusion matrix.


Weather Forecast and Air Quality using Linear Regression Technique

I used linear regression to predict the temperature on a given day. The model's predictions can assist in weather forecasting. Visit Kaggle.

Code and Resources Used
Language: R

Packages used: Amelia, caTools, corrplot, Metrics

Category: Regression

Plot of errors of the model: (image)

Results: The trained model has an error of 5.6, measured as RMSE (Root Mean Squared Error).
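
RMSE is the square root of the mean of the squared differences between actual and predicted values, so it is expressed in the same unit as the target (here, temperature). A minimal plain-Python sketch of the formula follows. The function name and the temperature values are hypothetical, not the notebook's data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical temperatures, for illustration only.
actual_temps = [20.0, 22.0, 25.0, 19.0]
predicted_temps = [18.0, 24.0, 22.0, 20.0]
example_rmse = rmse(actual_temps, predicted_temps)
```

The squared errors here are 4, 4, 9, and 1, giving a mean of 4.5 and an RMSE of roughly 2.12.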