Although deep neural networks play an ever bigger role in scene recognition, classic computer vision methods are still valid and are applied to problems in the autonomous car industry. I'm going to show how to detect lane lines with the OpenCV library and Python using image processing techniques. It's the first project in the series about Self-Driving Cars.
Today, in the middle of 2017, we are no longer surprised that most high-end cars are equipped with systems like lane keeping assistance or lane change support. Lane keep assist can save us from drifting out of the lane by providing feedback such as steering wheel vibrations. Lane change assist is designed to confirm that a lane change is made safely by monitoring the "blind spot" at the same time. These features are achieved by mounting a camera, e.g. beside the rear-view mirror, and constantly analyzing the incoming images. This can be done with well-known image processing techniques like the Canny method for edge detection or the Hough line transform for extracting straight lines from the image.
The goal of the first assignment from the Udacity course "Self-Driving Cars Engineer" is to find lane lines in either single images or a video stream using the OpenCV library and Python. I'm going to describe the whole pipeline of the system, comment on the pros and cons of the approach and present the final output. The extended version of this post, with bits of code for each processing step, can be found on my GitHub.
For those who are eager for the final result - you can see how it works in the video below.
Lane lines detection pipeline
My pipeline consists of 10 steps:
- Reading image or video frame
- Filtering white and yellow colors
- Conversion to grayscale
- Gaussian blurring
- Edge detection
- Region of interest definition
- Hough lines detection
- Filtering Hough lines
- Averaging line segments
- Applying moving average on final lines
Reading image or video frame
Below, there are 3 examples of loaded images. Later, after each step, intermediate results will be shown for these sample images. The third image is the most demanding to process, as there are shadows and the contrast between the yellow line and the road is very small.
Filtering white and yellow colors
This step wouldn't be necessary for the first two, easier images. In the third example, however, proceeding directly to the next step (grayscale conversion) would produce very similar gray levels for the yellow lane line and the bright road surface. We would like to differentiate these two objects somehow, hence the idea of first filtering the 2 key colors that lane markings are painted in. First, the image is converted to the HSL color space. The HSL (Hue, Saturation, Lightness) color space is based on how human vision perceives color. That is why it's easier to distinguish the desired colors (yellow and white) there than in RGB space, even if there are shadows in the image.
To extract white, I kept only pixels with high lightness in the "L" component of the HSL color space. For yellow lanes, I chose a Hue of more or less 30 to select the yellow color and I required the Saturation to be quite high. Below, there are test images after such filtering.
Conversion to grayscale
As in many computer vision applications, the image is converted to grayscale. It's mainly for the simplicity and speed of further operations. For instance, edge detectors find big gradients between adjacent pixels. So, it will be easier to compare pixels only in one dimension (grayscale) than in RGB or HSL color spaces.
Gaussian blurring
To suppress noise and spurious gradients, Gaussian smoothing is applied. Again, this is preparation for the edge detection step. The borders between the lane marking and the road may not be perfectly smooth, and we don't want the edge detector to classify such regions as additional lines. The size of the smoothing kernel defines how blurred the output is and how much time the operation takes.
Edge detection
To detect edges, let's use the popular Canny method. It's called with 2 parameters, a low and a high threshold, which should be found by trial and error. According to the OpenCV documentation:
- If a pixel gradient is higher than the upper threshold, the pixel is accepted as an edge
- If a pixel gradient value is below the lower threshold, then it is rejected.
- If the pixel gradient is between the two thresholds, then it will be accepted only if it is connected to a pixel that is above the upper threshold.
Canny recommended an upper:lower ratio between 2:1 and 3:1. I chose values of 80 and 40, which is a 2:1 ratio. Below, there are outputs of this operation.
Region of interest definition
To filter out unnecessary objects in the image, a region of interest is defined. This mask (here a trapezoid) is then applied to the working image.
Hough lines detection
Now, having edges detected in our area of interest, all straight lines need to be identified. This is done by the Hough transform, explained in another post here. This operation has quite a few parameters which need to be tuned experimentally. At a high level, they define how long or how "straight" a sequence of pixels should be to be classified as one line. There is a nice example in the OpenCV documentation on feature extraction showing the result of the Hough line transform on a sample image.
Below, there are our tested images with found Hough lines plotted in green.
Filtering Hough lines
As we can see above, some line segments are unwanted, for example small horizontal lines or lines appearing on cars that are inside the region of interest. Therefore, for each Hough line we calculate its slope. After some experimentation, only lines inclined between 17 and 56 degrees were kept for further analysis.
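The slope filter can be sketched in plain Python. I assume here that the angle is measured against the horizontal axis and compared by its absolute value, so that both the left and the right lane line pass the same test; the function names are illustrative.

```python
import math

MIN_ANGLE_DEG = 17  # limits taken from the text
MAX_ANGLE_DEG = 56


def segment_angle_deg(x1, y1, x2, y2):
    """Absolute inclination of a segment against the horizontal axis, in [0, 90]."""
    return math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))


def filter_by_slope(segments):
    """Keep only the (x1, y1, x2, y2) segments whose inclination is in range."""
    return [
        s for s in segments
        if MIN_ANGLE_DEG <= segment_angle_deg(*s) <= MAX_ANGLE_DEG
    ]
```

For instance, a nearly horizontal segment such as (0, 0, 100, 5) is dropped, while (0, 100, 100, 40), inclined at about 31 degrees, is kept.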
Below, there are only filtered Hough lines.
Averaging line segments
All the Hough lines found should now be averaged and extrapolated to produce only two lines representing the lanes. The first task is to divide the lines into 2 groups (left and right, deduced from the sign of the slope). Then, one can either use the best linear fit through the points of the line segments or take an average of these lines. I decided to apply a weighted average to calculate the resulting slopes and intercepts. Here, the lengths of the line segments serve as weights: the longer a segment is, the more influence it has on the result. In addition, to amplify the importance of the segment length, the weight is calculated as the square of the length.
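The weighted average might be sketched as follows. This is an illustration under two assumptions of mine: segments are given as (x1, y1, x2, y2) tuples in image coordinates (y grows downward, so the left lane line has a negative slope), and exactly vertical segments are skipped because their slope is undefined. The function name is hypothetical.

```python
import math


def average_lane_lines(segments):
    """Return {'left': (slope, intercept) or None, 'right': (slope, intercept) or None}."""
    groups = {"left": [], "right": []}
    for x1, y1, x2, y2 in segments:
        if x1 == x2:
            continue  # vertical segment, slope undefined
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        weight = math.hypot(x2 - x1, y2 - y1) ** 2  # squared segment length
        side = "left" if slope < 0 else "right"
        groups[side].append((slope, intercept, weight))

    result = {}
    for side, items in groups.items():
        total = sum(w for _, _, w in items)
        if total == 0:
            result[side] = None  # no usable segment on this side
            continue
        result[side] = (
            sum(s * w for s, _, w in items) / total,
            sum(b * w for _, b, w in items) / total,
        )
    return result
```

Because of the squared weights, a short stray segment barely moves the result: averaging a long segment of slope -1 with a ten times shorter one of slope -0.5 still gives a slope very close to -1.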
The final output of the pipeline.
Applying moving average on final lines
While running the pipeline on a video stream, we can observe that the lines flicker. To avoid this, we can apply a cumulative moving average to the line parameters. For each frame, it averages the last n results, including the current one. The cumulative version of this method applies bigger weights to more recent results. By keeping the last averaged result in memory, we can also fall back on it when no line is found in a frame by mistake.
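One possible reading of this smoothing step is the recurrence below, where each new result moves the stored average by 1/n of the difference. This is my assumption about the update rule, not necessarily the one used in the project; under it, the most recent frame has the largest weight and older frames fade out gradually. The class name `LineSmoother` and the default n are hypothetical.

```python
class LineSmoother:
    """Smooths (slope, intercept) pairs across consecutive video frames."""

    def __init__(self, n=10):
        self.n = n           # larger n means smoother but slower to react
        self.average = None  # last averaged (slope, intercept), if any

    def update(self, line):
        """Feed the line found in the current frame (or None) and get the smoothed line."""
        if line is None:
            # Nothing detected in this frame: reuse the last averaged result.
            return self.average
        if self.average is None:
            self.average = tuple(line)
        else:
            self.average = tuple(
                avg + (value - avg) / self.n
                for avg, value in zip(self.average, line)
            )
        return self.average
```

One smoother per lane line is enough: call `update` once per frame with the freshly computed parameters and draw whatever it returns.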
Results and possible improvements
The pipeline successfully performs lane lines detection. It is smooth and stable even in complex scenes with shadows.
However, it detects only straight lines, which could perhaps be overcome by using some kind of higher-order polynomial fit to handle curvy lanes. Also, what would happen when another car appears in our region of interest? It could produce some lines that could be identified as lanes. We should probably detect such cars at the same time to make sure that such an object is definitely not a lane. What if we are driving up or down a slope? Then the predefined region of interest might no longer be valid. I also wonder what happens when white or yellow flat signs are painted on the road. Such special cases, along with driving at night, should be tested, and some additional features should be added to avoid wrong lane line detection.
More details and the complete code can be found on GitHub.
Also published on Medium.