Club Of ProgrammerS
30 May 2019


Hello and welcome to week 3 of CSOC ML. We hope all of you are going through the resources given last week and have submitted the first assignment. For the next two weeks we will be learning about Data Science. The topic is too large to cover in two weeks, but we will try to give you a good idea of the field. The second assignment, on matplotlib and pandas, has been released, and the solution to the first assignment will be released soon. You have until next Wednesday midnight to submit the assignment. Do not forget to fill in the Google form after completing the assignment. Now, on to this week's resources.

What is Data Science?


Data Science is a huge field in itself. It's not just about making complicated models and visualizations; it's about using data to make as much impact as possible. Here is an amazing video in which a data scientist talks about what data science actually is. This blog will also give you a good idea of the various aspects of Data Science.



If you are interested in data science, then Kaggle is the place for you. Kaggle is a platform for competing with others on machine learning tasks. You may know CodeChef, HackerRank, etc.; Kaggle is similar, but the key difference is that its competitions are only about machine learning, data science, deep learning or AI. There is an enormous amount of resources, tutorials, datasets and code available on Kaggle. It's an amazing place to gain knowledge and get started with competitive ML. With time, Kaggle will teach you a lot about Data Science and Machine Learning in general. Sign up on Kaggle and start exploring it. The interface is extremely simple and self-explanatory. This blog will guide you through the various aspects of Kaggle. Your final assignment for the first part of CSOC ML will be related to Kaggle, so get comfortable with the platform.


We hope that all of you are going through the courses given to you last week. After you complete them, go through these:

  • Week 6 (complete), Week 8 (Dimensionality Reduction) and Week 10 (complete) of Andrew Ng’s course.
  • Chapters 10 to 15 (Feature Scaling, Text Learning, Feature Selection, PCA, Validation, Evaluation Metrics) of Intro to Machine Learning - Udacity

These, along with the previous lectures, will give you a clear idea of how to analyze data and how to create and validate models when working with various datasets.
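
To make two of the topics above a little more concrete before you watch the lectures, here is a minimal sketch of feature scaling and k-fold validation written with only the Python standard library. The function names (`standardize`, `k_fold_indices`, `accuracy`) and the toy height values are made up for illustration; they are not taken from either course. In practice you would most likely use a library such as scikit-learn for these steps.

```python
import random
import statistics


def standardize(column):
    """Rescale a list of numbers to zero mean and unit standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, like dealing cards.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data, invented purely for this example.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = standardize(heights_cm)
splits = list(k_fold_indices(len(heights_cm), k=5))
```

The idea is that a model is trained on `train_idx` and scored (for example with `accuracy`) on `val_idx` for every split, and the scores are then averaged. When scaling real data, compute the mean and standard deviation on the training part only and reuse them on the validation part, so that no information leaks from the validation data.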

Here are some amazing resources that highlight the steps to take while solving a data science problem. Go through them in order. They will be very helpful for solving next week’s assignment.

It may be overwhelming to complete so many things and the assignments while looking into Dev and CP at the same time. Don't worry!!! Complete the lectures and go through the other resources as much as possible. You don't have to complete everything in these four weeks. Remember, understanding whatever you have learnt and practising it is more important than trying to hurry through all the material. If you have put in sincere effort, you will definitely be able to do all the assignments in time. We will be back next week with more resources and assignment 3. So until then, ALL THE BEST and enjoy!!!