Regression analysis:
An understanding of probability concepts is essential when working with regression models, because sample data are used to build the model (essentially, to estimate its parameters). Regression models are useful for predicting or estimating the average value of the dependent variable from the value of the independent variable. They are widely used in civil engineering and have applications in all of its branches. In travel demand modeling, specifically in the trip generation stage, the number of trips generated in a particular area is known to be a function of the population, among many other variables. Similarly, the compressive strength of concrete is a function of its density, and rainfall is a function of the vegetation cover in the region. Once a regression model is available, one does not need to count the trips generated from a particular area; substituting the population data into the model gives an estimate of the average number of trips generated from that area.
A simple linear regression model, which captures the variability of a dependent variable Y using the variability of an independent variable X, is explained here. The term linear refers to linearity in the parameters, not in the explanatory variables.
Y = m*X + c + U
Here, U is the variable that takes the value of the error, that is, the difference between the actual (observed) value of Y and the value of Y given by the model (m*X + c). This error may arise from other explanatory variables that influence the dependent variable but are not included in the model. The same model can be written as:
$$E(Y \mid X_i) = m X_i + c$$
$$Y_i = E(Y \mid X_i) + u_i$$
The first equation is nothing but the conditional mean of Y expressed as a function of the explanatory variable X; it is also known as the population regression function. Its left-hand side denotes the average value of the dependent variable at a given value of the independent variable. In the second equation, $u_i$, the value taken by the error term U for the i-th observation, is a random variable, and its expectation for a given $X_i$ (the conditional expectation) is zero.
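This can be shown by taking the conditional expectation, given $X_i$, on both sides of $Y_i = E(Y \mid X_i) + u_i$:
$$E(Y_i \mid X_i) = E(Y \mid X_i) + E(u_i \mid X_i)$$
Since $E(Y_i \mid X_i)$ and $E(Y \mid X_i)$ are the same, it follows that the conditional expectation of the random variable U is zero, that is, $E(u_i \mid X_i) = 0$.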
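The zero conditional mean of the error term can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the parameter values `M_TRUE`, `C_TRUE` and `SIGMA` are hypothetical, and the error is assumed to be normally distributed with zero mean, so the code demonstrates the property rather than proving it. For each fixed X, the average of the simulated Y values should lie close to m*X + c, and the average error should lie close to zero.

```python
import random
from statistics import fmean

# Hypothetical population parameters, chosen only for illustration.
M_TRUE, C_TRUE, SIGMA = 2.0, 10.0, 5.0


def conditional_means(x_values, draws=20000, seed=0):
    """For each fixed X, simulate Y = m*X + c + U and average Y and U.

    Returns a dict mapping X to (mean of simulated Y, mean of simulated U).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = {}
    for x in x_values:
        # Assumption: U is normal with mean zero and standard deviation SIGMA.
        errors = [rng.gauss(0.0, SIGMA) for _ in range(draws)]
        ys = [M_TRUE * x + C_TRUE + u for u in errors]
        results[x] = (fmean(ys), fmean(errors))
    return results


if __name__ == "__main__":
    for x, (mean_y, mean_u) in conditional_means([1.0, 5.0, 10.0]).items():
        line_value = M_TRUE * x + C_TRUE
        print(f"X={x}: mean Y={mean_y:.3f}, m*X+c={line_value:.3f}, mean u={mean_u:.3f}")
```

Increasing `draws` should bring the simulated averages closer to the population regression function, since each average is taken over more values of the error term.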
In practice, regression models are always developed from sample data. Analogous to the population regression function, a sample regression function may be formulated from the sampled data as shown below:
$$\hat{Y}_i = \hat{m} X_i + \hat{c}$$
$$Y_i = \hat{m} X_i + \hat{c} + \hat{u}_i$$
The left-hand side of the sample regression function, $\hat{Y}_i$, denotes the estimated conditional expectation of the dependent variable. $\hat{m}$ and $\hat{c}$ are the estimators (statistics) of the parameters m and c of the population regression function, and $\hat{u}_i$, the residual $Y_i - \hat{Y}_i$, is the estimator of $u_i$. The values taken by the estimators differ from sample to sample. Using the sample data, it is required to find estimates of the population parameters, and it is also necessary to explain how close these estimates are to the population parameters.
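The sample-to-sample variation of the estimators can be sketched in code. The text has not named an estimation method at this point, so the example below assumes ordinary least squares, one common choice, and uses synthetic data generated from hypothetical population parameters (`M_TRUE`, `C_TRUE`, `SIGMA`). The function names `draw_sample`, `fit_line` and `residuals` are illustrative only.

```python
import random
from statistics import fmean

# Hypothetical population parameters used to generate synthetic samples.
M_TRUE, C_TRUE, SIGMA = 2.0, 10.0, 5.0


def draw_sample(rng, n=50):
    """Draw n (X, Y) pairs from the assumed population model Y = m*X + c + U."""
    xs = [rng.uniform(0.0, 20.0) for _ in range(n)]
    ys = [M_TRUE * x + C_TRUE + rng.gauss(0.0, SIGMA) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Return the least-squares estimates (m_hat, c_hat) for one sample."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m_hat = sxy / sxx
    c_hat = y_bar - m_hat * x_bar
    return m_hat, c_hat


def residuals(xs, ys, m_hat, c_hat):
    """Return the residuals u_hat_i = Y_i - (m_hat * X_i + c_hat)."""
    return [y - (m_hat * x + c_hat) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    for sample_no in range(1, 4):
        xs, ys = draw_sample(rng)
        m_hat, c_hat = fit_line(xs, ys)
        print(f"sample {sample_no}: m_hat={m_hat:.3f}, c_hat={c_hat:.3f}")
```

Each sample gives a different pair of estimates scattered around the population values, which is why a measure of how close the estimates are to the population parameters is needed.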