Each section of this page links to a Kaggle notebook where you can find the code along with explanations and analysis.
Breast Cancer Detection using Logistic Regression
I used logistic regression to determine whether a diagnosis is malignant (M) or benign (B) by analysing the person's other physical attributes. Visit Kaggle.
Code and Resources Used | Heat Map | Results |
---|---|---|
Language: Python; Packages used: numpy, pandas, matplotlib, seaborn, and sklearn; Category: Logistic regression | | The accuracy of the model is 96%, estimated from the ROC curve. |
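
As a rough illustration of how accuracy and a ROC curve relate to a classifier's predicted probabilities, here is a minimal pure-Python sketch. It is not the notebook's code: the labels, scores, and function names are hypothetical, and the 0.5 threshold is an assumption.

```python
def accuracy_at_threshold(y_true, y_score, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_points(y_true, y_score):
    """Return (FPR, TPR) pairs, one per distinct threshold, from strict to lenient."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = malignant, 0 = benign) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.95, 0.80, 0.70, 0.65, 0.40, 0.30, 0.20, 0.10]

print(accuracy_at_threshold(y_true, y_score))
print(auc(roc_points(y_true, y_score)))
```

Accuracy depends on the chosen threshold, while the area under the ROC curve summarises performance across all thresholds, so the two numbers are related but not interchangeable.
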
Analysis of Donald Trump Speech in Dallas using Text Mining and Text Analysis
I used text analysis to create a word cloud that illustrates the main topics he mentioned in his speech. Visit Kaggle.
Code and Resources Used | Word Cloud | Results |
---|---|---|
Language: R; Packages used: dplyr, tidytext, stringr, wordcloud, knitr, DT, tidyr; Category: Text mining and analysis | ![]() | I created a word cloud and datatable for the speech. |
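
A word cloud is driven by word frequencies once common stop words are removed. The notebook does this in R; the Python sketch below only illustrates the counting step. The stop-word list, the function name, and the sample sentence are all hypothetical, and the sample is not taken from the speech.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real analysis would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Lower-case the text, split it into words, drop stop words, and count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Placeholder text, not taken from the speech.
sample = "The quick brown fox jumps over the lazy dog. The dog sleeps and the fox runs."
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

The most frequent words are the ones a word cloud draws largest.
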
Happiness distribution analysis using Association Rule Technique
I used an Apriori model to find associations among the services available in a place and to predict the likelihood that the place is considered happy. The model can help residents predict which places offer a better housing environment. Visit Kaggle.
Code and Resources Used | Plot of the Rules | Results |
---|---|---|
Language: R; Packages used: arules, arulesViz; Category: Association | ![]() | I built two models: one with a support of 0.05 and a confidence of 0.85, and one with a support of 0.04 and a confidence of 0.9. |
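
Support and confidence are the two thresholds that decide which association rules are kept. The notebook uses arules in R; the Python sketch below only shows how the two measures are computed for a single rule. The transactions, item names, and function names are hypothetical; only the threshold values come from the results above.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) divided by support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


def passes_thresholds(transactions, antecedent, consequent, min_support, min_confidence):
    """True if the rule antecedent -> consequent meets both thresholds."""
    rule_support = support(transactions, set(antecedent) | set(consequent))
    rule_confidence = confidence(transactions, antecedent, consequent)
    return rule_support >= min_support and rule_confidence >= min_confidence


# Hypothetical places, each described by its services and a "happy" flag.
places = [
    {"park", "school", "happy"},
    {"park", "hospital", "happy"},
    {"park", "school"},
    {"school", "hospital"},
    {"park", "school", "happy"},
]

# Rule {park, school} -> {happy}: support 0.4, confidence about 0.67.
print(passes_thresholds(places, {"park", "school"}, {"happy"}, 0.05, 0.85))
# Rule {park, hospital} -> {happy}: support 0.2, confidence 1.0.
print(passes_thresholds(places, {"park", "hospital"}, {"happy"}, 0.04, 0.9))
```

Lowering the support threshold admits rarer rules, while raising the confidence threshold keeps only the more reliable ones.
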
Heart Disease Detection using Random Forest Technique
I used a random forest to determine whether a person has a high risk of cardiovascular disease. The model can help doctors make a prediction and recommend the best testing option to the patient. Visit Kaggle.
Code and Resources Used | Plot of the model | Results |
---|---|---|
Language: R; Packages used: caTools, FSelector, randomForest; Category: Classification | | The trained model was 78% accurate. The accuracy was calculated from the confusion matrix. |
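
Accuracy follows directly from the four cells of a binary confusion matrix. The notebook is written in R; the Python sketch below only illustrates the calculation, with hypothetical labels, predictions, and function names.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = high risk)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1
        elif actual == 1 and predicted == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(cm):
    """Correct predictions (TP + TN) divided by all predictions."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm))
```
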
Diabetes Detection using Decision Tree Technique
I used a decision tree to determine whether a person has diabetes. The model can help doctors make a prediction and recommend the best testing option to the patient. Visit Kaggle.
Code and Resources Used | Plot of the model | Results |
---|---|---|
Language: R; Packages used: caTools, FSelector, party; Category: Classification | ![]() | The trained model was almost 78.5% accurate. The accuracy was calculated from the confusion matrix. |
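
A decision tree works by repeatedly choosing the split that best separates the classes. As a generic illustration, the Python sketch below scores one candidate split with entropy-based information gain, which is one common criterion and not necessarily the one the R packages in the notebook apply. The feature values, labels, and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on value <= threshold versus value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - weighted


# Hypothetical glucose readings and diabetes labels (1 = diabetic).
glucose = [85, 90, 100, 130, 150, 160]
diabetic = [0, 0, 0, 1, 1, 1]

print(information_gain(glucose, diabetic, 110))  # split separates the classes cleanly
print(information_gain(glucose, diabetic, 87))   # split leaves the classes mixed
```

A split with higher gain leaves purer groups on each side, so the tree prefers it.
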
Weather Forecast and Air Quality using Linear Regression Technique
I used linear regression to predict the temperature on a given day. The model's predictions can assist in weather forecasting. Visit Kaggle.
Code and Resources Used | Plot of errors of the model | Results |
---|---|---|
Language: R; Packages used: Amelia, caTools, corrplot, Metrics; Category: Regression | ![]() | The trained model has an error of 5.6, measured as RMSE (root mean squared error). |
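
RMSE is the square root of the mean of the squared differences between actual and predicted values, so it is expressed in the same units as the temperature. The notebook computes it in R; the Python sketch below only illustrates the formula, with hypothetical temperatures and a hypothetical function name.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual and predicted temperatures.
actual = [20.0, 22.0, 25.0, 30.0]
predicted = [23.0, 18.0, 25.0, 30.0]

print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.
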