Technology, software, data science, machine learning, entrepreneurship, investing, and various other topics. This post is part of a series covering the exercises from Andrew Ng's machine learning class on Coursera. The original code, exercise text, and data files for this post are available here. One of the pivotal moments in my professional development this year came when I discovered Coursera.
I'd heard of the "MOOC" phenomenon but had not had the time to dive in and take a class. I completed the whole thing from start to finish, including all of the programming exercises.
The experience opened my eyes to the power of this type of education platform, and I've been hooked ever since. This blog post will be the first in a series covering the programming exercises from Andrew's class. One aspect of the course that I didn't particularly care for was the use of Octave for assignments.
Since I'm trying to develop my Python skills, I decided to start working through the exercises from scratch in Python. The full source code is available at my IPython repo on Github. You'll also find the data used in these exercises and the original exercise PDFs in sub-folders off the root directory if you're interested. While I can explain some of the concepts involved in this exercise along the way, it's impossible for me to convey all the information you might need to fully comprehend it.
If you're really interested in machine learning but haven't been exposed to it yet, I encourage you to check out the class; it's completely free and there's no commitment whatsoever. With that, let's get started! In the first part of exercise 1, we're tasked with implementing simple linear regression to predict profits for a food truck.
Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities. You'd like to figure out what the expected profit of a new food truck might be given only the population of the city that it would be placed in.
Let's start by examining the data which is in a file called "ex1data1.
First we need to import a few libraries. Now let's get things rolling.
We can use pandas to load the data into a data frame and display the first few rows using the "head" function. Another useful function that pandas provides out-of-the-box is the "describe" function, which calculates some basic statistics on a data set. This is helpful to get a "feel" for the data during the exploratory analysis stage of a project. Examining stats about your data can be helpful, but sometimes you need to find ways to visualize it too.
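As a rough sketch of the loading step described above, assuming the data file is a headerless, comma-separated file with two columns (population, then profit). The column names and the three sample rows are placeholders made up for illustration, not values from the exercise file; in practice you would pass the path of the data file to `pd.read_csv` instead of the in-memory buffer.

```python
import io

import pandas as pd

# Placeholder rows standing in for the exercise file (made-up values, not real data).
sample_csv = io.StringIO("5.0,10.0\n6.5,12.5\n8.0,16.0\n")

# Assumed layout: no header row, two comma-separated columns.
data = pd.read_csv(sample_csv, header=None, names=["Population", "Profit"])

print(data.head())      # first few rows of the data frame
print(data.describe())  # count, mean, std, min, quartiles and max per column
```

Naming the columns up front makes the later plotting and slicing steps easier to read than working with the default integer column labels.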
Fortunately this data set only has one independent variable, so we can toss it in a scatter plot to get a better idea of what it looks like. We can use the "plot" function provided by pandas for this, which is really just a wrapper for matplotlib.
Hi, I am clear up to how to calculate partial derivatives, but I have a doubt after calculating the delta values.
I have got delta-2 values in the dimension 10 X 25 and delta-1 with dimension 25X This is what I have got for the first row of the input layer. So, for the other rows, how will these delta values be calculated?
Just to keep the two implementations separate for easy understanding of users. Your program is not running. When I try to run this program sigmoidGradient. Now what is the error? Hi Sarthak, you should also post the error you are getting.
All of the above programs are working and have been tested by me multiple times. The program is giving the following error when running the nnCostFunction.
Why is this happening, please? Hi, I am not Andrew. I am Akshay.
I think you are doing this assignment in Octave and that's why you are facing this issue. Chethan Bhandarkar has provided a solution for it. You'll be getting this error because you are running your program sigmoidGradient. So run the ex4. In the link I have provided, go and check the comment by "Chethan Bhandarkar"; she has provided the solution for a problem similar to yours. I entered the same code you mentioned for nnCostFunction and submitted it, but the backward propagation part was not submitted.
My first concern is "Why did you copy the code as it is?
Continuing on with the series, we will move on to support vector machines for programming assignment 6. If you noticed, I did not have a write-up for assignment 5, as most of the tasks just require plotting and interpretation of the learning curves. There are two parts in this assignment. Next, we will use SVM on email datasets to try to classify spam emails.
To load the dataset, loadmat from scipy. Plotting of the dataset. We start off with a simple dataset that has a clear linear boundary between the training examples. As recommended in the lecture, we try not to code SVM from scratch but instead make use of a highly optimized library such as sklearn for this assignment.
The official documentation can be found here. Since this is a linear classification problem, we will not be using any kernel for this task. The ravel function here returns an array with size m, which is required for SVC. Next, we will look at a dataset that is not linearly separable. Here is where kernels come into play to provide us with the functionality of a non-linear classifier. For those having difficulties comprehending the concept of kernels, this article I found gives a pretty good intuition and some mathematical explanation of kernels.
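To make the linear case above a bit more concrete, here is a minimal sketch, not the code from the post. The toy points are made up and stand in for the course dataset, and `C=1.0` is an arbitrary choice. In sklearn, "no kernel" is expressed as `kernel="linear"`.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data (not the course dataset).
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [4.0, 4.5], [4.5, 4.0], [5.0, 5.0]])
# Labels stored with shape (m, 1), as they often are when loaded from a .mat file.
y = np.array([[0], [0], [0], [1], [1], [1]])

clf = SVC(kernel="linear", C=1.0)
# ravel() flattens y from shape (m, 1) to (m,), the shape SVC expects.
clf.fit(X, y.ravel())

train_accuracy = clf.score(X, y.ravel())
print(train_accuracy)
```

Passing the labels with shape (m, 1) generally still works but triggers a conversion warning in sklearn, which is why flattening with ravel first is the tidier option.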
For this part of the assignment, we were required to complete the function gaussianKernel to aid in the implementation of SVM with Gaussian kernels. I will be skipping this step, as SVC contains its own Gaussian kernel implementation in the form of the radial basis function (rbf). Here is the Wikipedia page with the equation for rbf; as you can see, it is identical to the Gaussian kernel function from the course. Loading and plotting of example dataset 2. To implement SVM with Gaussian kernels.
With regard to the parameters of SVM with the rbf kernel, it uses gamma instead of sigma. The documentation of the parameters can be found here.
For this dataset, I found that a gamma value of 30 gives the closest resemblance to the result from the optimized parameters in the assignment, where sigma was 0.
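The relationship between the two parameterisations can be sketched as follows. The course-style kernel is exp(-||x1 - x2||^2 / (2 * sigma^2)), while the rbf kernel is written as exp(-gamma * ||x1 - x2||^2), so the two agree when gamma = 1 / (2 * sigma^2). The function names and the sample vectors below are illustrative assumptions, and the sketch uses only NumPy rather than SVC itself.

```python
import numpy as np


def gaussian_kernel(x1, x2, sigma):
    """Course-style Gaussian kernel: exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))


def rbf_kernel_gamma(x1, x2, gamma):
    """RBF kernel in the gamma form: exp(-gamma * ||x1 - x2||^2)."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))


def sigma_to_gamma(sigma):
    """Convert a Gaussian-kernel sigma into the equivalent rbf gamma."""
    return 1.0 / (2.0 * sigma ** 2)


# Illustrative vectors and sigma (assumed values, chosen only for the demo).
x1, x2, sigma = [1, 2, 1], [0, 4, -1], 2.0
k_sigma = gaussian_kernel(x1, x2, sigma)
k_gamma = rbf_kernel_gamma(x1, x2, sigma_to_gamma(sigma))
print(k_sigma, k_gamma)  # the two values should match
```

A gamma found by eye, as in the text, does not have to equal the converted value exactly; the conversion only says which gamma reproduces a given sigma.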
As for the last dataset in this part, we perform simple hyperparameter tuning to determine the best C and gamma values to use. Loading and plotting of example dataset 3. An SVC model is constructed using each combination of parameters and its accuracy on the validation set is computed. Based on the accuracy, the best model is chosen and the respective C and gamma values are returned. The optimal values are 0.
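The search described above can be sketched roughly like this. It is a sketch under stated assumptions, not the post's code: the function name `select_params`, the candidate grid, and the synthetic two-blob data are all made up for illustration, and the real exercise uses its own training and validation sets.

```python
import numpy as np
from sklearn.svm import SVC


def select_params(X, y, X_val, y_val, candidates):
    """Try every (C, gamma) pair and keep the one with the best validation accuracy."""
    best_score, best_C, best_gamma = -1.0, None, None
    for C in candidates:
        for gamma in candidates:
            clf = SVC(kernel="rbf", C=C, gamma=gamma)
            clf.fit(X, y)
            score = clf.score(X_val, y_val)  # accuracy on the validation set
            if score > best_score:
                best_score, best_C, best_gamma = score, C, gamma
    return best_C, best_gamma, best_score


# Synthetic stand-in data: two well-separated Gaussian blobs (not the course data).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y_train = np.array([0] * 20 + [1] * 20)
X_val = np.vstack([rng.normal(-2, 0.5, (10, 2)), rng.normal(2, 0.5, (10, 2))])
y_val = np.array([0] * 10 + [1] * 10)

# Assumed candidate grid, used for both C and gamma.
candidates = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30]
best_C, best_gamma, best_score = select_params(X_train, y_train, X_val, y_val, candidates)
print(best_C, best_gamma, best_score)
```

Because ties are broken by keeping the first best pair found, the order of the candidate list affects which of several equally accurate combinations is returned.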
Moving on to spam email classification. This problem is unique as it focuses more on data preprocessing than the actual modeling process. The emails need to be processed in a way that can be used as input for the model. One way of doing so is to obtain the indices of all the words in an email based on a list of commonly used vocabulary. Loading the data. The vocabulary list and its respective indices were given; I stored the list as a dictionary with the vocabs as keys and indices as values.
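A minimal sketch of the word-to-index idea described above. The four-word vocabulary, the 1-based indexing, and the helper name `email_to_indices` are assumptions made for illustration; the real vocabulary list ships with the exercise, and this sketch skips other preprocessing such as stemming.

```python
import re

# Hypothetical miniature vocabulary; the real list comes with the exercise.
vocab_words = ["buy", "email", "know", "price"]
# Words as keys, indices as values (1-based here, which is an assumption).
vocab = {word: index for index, word in enumerate(vocab_words, start=1)}


def email_to_indices(text, vocab):
    """Lower-case the text, split it into alphanumeric tokens, and look each one up."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Words that are not in the vocabulary are simply skipped.
    return [vocab[token] for token in tokens if token in vocab]


indices = email_to_indices("Know the price? Email me to buy!", vocab)
print(indices)
```

Keeping the words as dictionary keys makes each lookup a constant-time operation, which is the convenience the text alludes to.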
You could probably do it another way but I want to make accessing the vocabs easier eg.
I am a pharmacy undergraduate and had always wanted to do much more than the scope of a clinical pharmacist. I had tried to find some sort of integration between my love for IT and the healthcare knowledge I possess but one would really feel lost in the wealth of information available in this day and age.
Given the advances in data and computing power, using a computer to identify, diagnose, and treat diseases is no longer a dream.
At a more advanced level, computer vision can help identify diseases using radiography images, while at a simpler level, algorithms can detect potentially life-changing drug interactions. With the goal of venturing into the health IT industry, I came up with a data science curriculum for those with a non-technical background, which I showcased here. After 6 months of basic maths and python training, I started this course to step into the world of machine learning.
As many of you know, the course is conducted in Octave or Matlab. Although it is all well and good to learn some Octave programming and complete the programming assignments, I would like to test my knowledge of python and try to complete the assignments in python from scratch.
This article will be a part of a series I will be writing to document my python implementation of the programming assignments in the course.
This is by no means a guide for others, as I am also learning as I move along, but it can serve as a starting point for those who wish to do the same. With that said, I am more than happy to receive some constructive feedback from you guys.
First off will be univariate linear regression using the dataset ex1data1. To start off, I will import all relevant libraries and load the dataset into jupyter notebook. To build up a good habit, I always have a look at the data first to get a good sense of it. Plotting of the data to visualize the relationship between the dependent y and the independent X variable. I am used to this way of plotting graphs but do realize that there is an object-oriented way of using matplotlib; I will be using that in some other graphs within this assignment.
The computeCost function here will give
To make the assignment more complete, I also went ahead and tried to visualize the cost function for a standard univariate case. The block of code above generates the 3d surface plot as shown. As mentioned in the lecture, the cost function is convex and has only one global minimum, so gradient descent with a suitable learning rate will converge to the global minimum.
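For reference, the squared-error cost used in univariate linear regression, J(theta) = 1/(2m) * sum((X @ theta - y)^2), can be sketched as below. This is an illustrative version rather than the post's own computeCost; the snake_case name and the tiny made-up dataset are assumptions.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Squared-error cost J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    errors = X @ theta - y  # prediction minus target for each example
    return float(np.dot(errors, errors)) / (2 * m)


# Made-up data lying exactly on the line y = 2x (not the exercise data).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
X = np.column_stack([np.ones_like(x), x])  # prepend a column of ones for the intercept

cost_at_zero = compute_cost(X, y, np.array([0.0, 0.0]))
cost_at_fit = compute_cost(X, y, np.array([0.0, 2.0]))
print(cost_at_zero, cost_at_fit)
```

Evaluating the same function over a grid of theta values is what produces a bowl-shaped surface like the one described in the text.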
By the way, I used the mplot3d tutorial to help me with the 3d plotting. Plotting the cost function against the number of iterations gave a nice descending trend, indicating that the gradient descent implementation works in reducing the cost function.
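The descending trend mentioned above comes from recording the cost after every update. A minimal sketch of batch gradient descent that keeps such a history follows; the function names, the learning rate, the iteration count, and the toy data are all assumptions chosen for illustration rather than values from the assignment.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Squared-error cost J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    errors = X @ theta - y
    return float(np.dot(errors, errors)) / (2 * len(y))


def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent; returns the final theta and the cost after each step."""
    m = len(y)
    theta = np.asarray(theta, dtype=float).copy()
    cost_history = []
    for _ in range(num_iters):
        gradient = X.T @ (X @ theta - y) / m  # gradient of J with respect to theta
        theta = theta - alpha * gradient      # simultaneous update of all parameters
        cost_history.append(compute_cost(X, y, theta))
    return theta, cost_history


# Made-up data lying exactly on y = 1 + 2x (not the exercise data).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])

theta, cost_history = gradient_descent(X, y, np.zeros(2), alpha=0.05, num_iters=2000)
print(theta, cost_history[0], cost_history[-1])
```

Plotting `cost_history` against the iteration number gives the kind of curve described in the text; if the curve rises instead of falling, the learning rate is usually too large.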
When I run the submit function, it doesn't display the numbered list of available jobs as in the course video (in the course video, after running submit, a bunch of prompts appear, asking which job to submit). When I run submit, there is no prompt message and it submits directly.
I expect that when I run submit, the command line appears the same as in Andrew Ng's machine learning instructional video on Coursera.
After you press enter you should get a plot of your data, etc. But you have to wait some time for the data to be plotted.
As for submitting, that part is not graded. You should follow the instructions written in the PDF and you will see which are the graded parts. You will learn the same materials but with a better programming language. The repo I shared will allow you to use Jupyter notebooks; instructions are written above every cell you have to run.
It is much easier to use than to switch all the time between code and PDF.
How to solve the second week of Andrew Ng's machine learning course, submit assignments on matlab [closed]
Have you done any of the course parts that are evaluated yet? If I remember correctly, when you do any of the optional parts that are not evaluated, submit doesn't do anything.
How did you run the exercise? If you run ex1 you should have the result: Running warmUpExercise Press enter to continue.
The python comment is purely personal opinion rather than fact. Doubly so since most machine learning these days is done via specialised libraries like tensorflow and keras, which tend to provide apis for all popular scientific languages i. Redirecting to a hacky python version does more of a disservice; matlab syntax is easier for a beginner, and I'm sure Andrew Ng chose it for this reason, so that students would focus on the underlying ML, rather than because he 'didn't know enough python'.
I would say, however, that in my experience, some languages seem to be favoured in particular fields, more out of convention than 'suitability'. So matlab tends to be heavily used in economics, symbolic math, and biomedical circles, R in social sciences, Python in physics, engineering and startups, and Julia in computational science and generally niche performance-hungry circles.
So if you were to advise someone to choose a language, I'd tell them to choose the one that is used most in their field, rather than the one that is somehow a more 'suitable' language 'for ML'.