MA2102

Probability and Statistics

Lecture-4

Nearly everyone these days is familiar with the concept of probability. People discuss the probability of winning a lottery. A gambler wants to know whether a game is fair to bet on by computing the odds of winning. A bank wants to calculate the probability that a loan will be repaid by looking at the applicant's credit history. A drug company calculates the probability of harmful side effects of a drug using data obtained from a series of trials on people.

Mathematics is the logic of certainty, but probability is the logic of uncertainty. Life is uncertain, so we can make better decisions by carrying out probability calculations for everyday events.

Why study probability and statistics as a computer science student?

Machine Learning: Developing algorithms (called machine learning algorithms) that make machines (computers) better at a particular task by learning from data (examples), without explicitly programming rules for the task.

Example: A spam filter (a classification algorithm, which is a form of supervised learning) can classify, using Bayes' theorem, whether an incoming email is spam or not, based on the experience of having seen many examples of spam emails and ham (non-spam) emails.

Analysis of Algorithms: Finding the average-case time complexity of an algorithm, provided we know the probability distribution of its inputs. Probability is also an essential tool for analysing the performance of randomized algorithms (algorithms that make random choices in some part of their logic).

Example of a Probability problem:
Suppose you are about to take a test consisting of 100 General Knowledge (GK) multiple-choice questions (MCQs), each with four options and no negative marking, and you believe you have zero knowledge of GK (you won't even bother to read the questions; you will just pick a random option for each). What is the probability that you will score at least 30? (As we will compute later in this course using the binomial distribution, the answer is approximately 0.15; the value 0.1037872 is the probability of scoring strictly more than 30.)
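The random-guessing problem above can be checked numerically. The following is a minimal sketch, assuming each question is worth one mark and each guess is correct independently with probability 1/4, so that the score X follows a Binomial(100, 0.25) distribution. The function names `binomial_pmf` and `upper_tail` are illustrative and not part of the course material. The sketch evaluates both P(X >= 30) and P(X > 30), since the two differ by the single term P(X = 30).

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k_min: int, n: int, p: float) -> float:
    """Return P(X >= k_min) by summing the pmf term by term."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


# Assumed model: 100 independent guesses, each correct with probability 1/4.
n, p = 100, 0.25

p_at_least_30 = upper_tail(30, n, p)   # P(X >= 30)
p_more_than_30 = upper_tail(31, n, p)  # P(X > 30) = P(X >= 31)

print(f"P(X >= 30) = {p_at_least_30:.7f}")
print(f"P(X >  30) = {p_more_than_30:.7f}")
```

Summing the terms directly is adequate here because n = 100 is small; for much larger n a library routine for the binomial distribution would be a more robust choice.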

Example of a Statistics problem:
Suppose you attempted a test consisting of 100 General Knowledge (GK) multiple-choice questions (MCQs), each with four options and no negative marking, you have some knowledge of GK (so you do read each question before picking an answer), and you scored 60. Using inferential statistics, one draws conclusions (inferences) from this data about how good you are at GK. This means estimating a parameter (a characteristic of your GK ability), say $p$, that represents the probability of your answering a random GK question correctly. From the sampled data (your performance over 100 random GK questions), one can infer that $p\approx0.6$.
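The estimate in this example is simply the observed fraction of correct answers. As a rough sketch, assuming the 100 answers are independent and each is correct with the same probability $p$, the snippet below computes that fraction together with its standard error and an approximate 95% interval. The normal-approximation (Wald) interval is one common choice and is used here only for illustration; it is not necessarily the method developed later in the course.

```python
from math import sqrt

# Observed data from the example: 60 correct answers out of 100 questions.
n_questions = 100
n_correct = 60

# Point estimate of p: the sample proportion.
p_hat = n_correct / n_questions

# Estimated standard error of the sample proportion.
std_err = sqrt(p_hat * (1 - p_hat) / n_questions)

# Approximate 95% interval using the normal quantile 1.96 (an assumption
# of this sketch: the normal approximation to the binomial is adequate).
z = 1.96
lower = p_hat - z * std_err
upper = p_hat + z * std_err

print(f"p_hat = {p_hat:.2f}, standard error = {std_err:.4f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval makes explicit that $p\approx0.6$ is an estimate with uncertainty attached, not an exact value.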

Probability theory is a self-contained mathematical discipline, whereas inferential statistics (which deals with drawing inferences from data pertaining to some random phenomenon) is built upon probability theory. Probability theory is used as the primary tool for analyzing statistical methodologies.

In this course, the first three units are devoted to probability theory and the last two units to statistics. So we begin with probability theory.

Probability is a tool for expressing a degree of confidence or doubt about some proposition in the presence of incomplete information. By convention, probabilities are calibrated on a scale of 0 to 1. The probability statements we make can be based on past experience or on subjective personal judgment; in either case they obey a common set of rules (principles), which can be used to treat probabilities in a mathematical framework.

We are going to think of probability theory as a mathematical model of chance. The idea is to start with a few basic principles (axioms) of chance that are simple enough to be accepted readily (no one is supposed to question them). Once these axioms are accepted, we deduce the mathematical theory from them, and that theory guides us in more complicated situations.