# What is Logistic Regression?

Logistic regression is a statistical model that Is used to determine the probability that an event will happen. It shows the relationship between features, and then calculates the probability of a certain outcome.

Logistic regression is used in machine learning (ML) to help create accurate predictions. It is similar to linear regression, except rather than a graphical outcome, the target variable is binary; the value is either 1, or 0.

There are two types of measurables, the explanatory variables/ features (item being measured) and the response variable/ target binary variable, which is the outcome.

For example, when trying to predict whether a student will pass or fail a test, the hours studied are the feature, and the response variable will have two values - pass or fail.

There are three basic kinds of logistic regression:

1. Binary logistic regression: Here there are only two possible outcomes for the categorical response. As in the example above – a student passes or fails.
2. Multinomial logistic regression: This is where the response variables can include three or more variables, which will not be in any order. An example is predicting whether diners at a restaurant prefer a certain kind of food – vegetarian, meat or vegan.
3. Ordinal logistic regression: Like multinomial regression, there can be three or more variables. However, there is an order the measurements follow. An example is rating a hotel on a scale of 1 to 5.

## Assumptions Used for Logistic Regression

When working with logistic regression, there are certain assumptions that are made.

• In binary logistic regression, it is necessary that the response variable is a binary. The outcome is either one thing, or another.
• The desired outcome should be represented by the factor level 1 of the response variable, the undesired is 0.
• Only variables that are meaningful must be included.
• Independent variables have to essentially be independent of one another. There should be little to no multi-co-linearity.
• Log odds and independent variables have to be linearly related.
• Logistic regression must be applied only to massive sample sizes.
Which DataScience Superhero Are You?
Download this ebook to learn the top six skills you need to set yourself apart as a data scientist.

## Applications of Logistic Regression

There are several fields and ways in which logistic regression can be used and these include almost all fields of medical and social sciences.

### Healthcare

For example, the Trauma and Injury Severity Score (TRISS). This is used across the world to predict fatality in injured patients. This model has been developed with the application of logistic regression. It uses variables such as the revised trauma score, injury severity score, and the age of patient to predict health outcomes. It is a technique that can even be used to predict the possibility of a person being afflicted by a certain disease. For example, ailments like diabetes and heart disease can be predicted based on variables such as age, gender, weight and genetic factors.

### Politics

Logistic regression can also be used to attempt to predict elections. Will a Democrat, Republican or Independent leader come to power in the USA? These predictions are made on the basis of variables such as age, gender, place of residence, social standing and previous voting patterns (variables) to produce a vote prediction (response variable).

### Product testing

Logistic regression can be used in engineering to predict the success or failure of a system that is being tested, or a prototype product.

### Marketing

LR can be used to predict the chances of a customer’s enquiry turning into a sale, the possibility of a subscription being started or terminated, or even potential customer interest in a new product line.

### Financial sector

An example of use in the financial sector is in a credit card company that uses it to predict the likelihood that a customer will default on their payments. The model built could be for the issuance of a credit card to a customer or not. The model can say whether a certain customer will “default” or “not default”. This is known as the “default propensity modeling” in banking terms.

### E-commerce

Much along the same lines, e-commerce companies invest heavily in advertising and promotional campaigns across media. They want to see which campaign is the most effective and the option most likely to get a response from their potential target audience. The model set will categorize the customer as a “responder” or “non responder”. This model is called propensity to respond modeling.

With insights that come from logistic regression outputs, companies are able to optimize their strategies and achieve business goals with reduction in expenses as well as losses. Logistic regressions help to maximize return on investment (ROI) in marketing campaigns, a benefit to the bottom line of a company in the long run.