3-D image processing

3-D image processing principles

In a regular 2-D image, the numerical value of each pixel represents the luminance at that point. With two cameras, however, numerical methods can also recover a depth-related parameter. The result is an image with 3-D information: in addition to the regular image data, each pixel also carries its xyz coordinates.

To make this clearer, suppose we have two cameras, A and B, as depicted below:

As the right-hand picture shows, what camera A sees differs from what camera B sees. There is a parallax (a displacement or difference in the apparent position of an object viewed along two different lines of sight) between the images of an object in cameras A and B. The farther the object is from the camera pair, the smaller the parallax. As the picture shows, this difference is smaller for object C than for object D. Finding the parallax is the most important task in the stereovision method; once it is known, the object's distance can be calculated easily.

To calculate the speedometer error of the stereovision method, we first need to review its mathematical principles. For the simplest case, shown in the picture below, suppose two cameras image a single point in 3-D space, so that:

The baseline (b) is the distance between the two cameras. The object is at coordinates (x, y, z) in the 3-D world, where O is the origin. The object is imaged at a different coordinate in each camera's image.

By forming the main equation for both cameras we’ll have:

This leads directly to the essential relation that gives the depth of the object (z) in the 3-D world:

The parallax difference between the object's positions in the two cameras' images is calculated through image processing. This parameter is usually called “disparity”. Many methods are known that determine disparity to an accuracy of about 0.1 pixel.
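For reference, under the standard pinhole model with two parallel cameras, these relations take the following form. The symbols $x_A$ and $x_B$ (horizontal image coordinates in cameras A and B), $f$ (focal length), and $d$ (disparity) are notation assumed here for illustration, and camera A is assumed to sit at the origin O with camera B displaced by b along the x axis:

```latex
\begin{align}
x_A &= \frac{f\,x}{z}, & x_B &= \frac{f\,(x-b)}{z} \\
d &= x_A - x_B = \frac{f\,b}{z} \\
z &= \frac{f\,b}{d}
\end{align}
```

Depth is inversely proportional to disparity, which is why distant objects (small disparity) are measured less precisely than near ones.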

The focal length (f) and the baseline (b) are constants in the formula. The error in estimating the depth of the object (z) can be computed for different values of z (different distances) through conventional error calculation methods. Since this paper does not go through all the numerical calculations in detail, we only present the resulting diagram of the relative distance-measurement error for a stereo vision camera with characteristics suitable for traffic applications:

The chart above shows the relative error of the distance measurement against z (the object's distance from the stereo vision cameras). As shown, the error grows as distance increases. This is why stereo vision cameras are not used in applications with very large depths (on the kilometer scale). However, the maximum error is less than 1% for a vehicle at a distance of 100 meters, which is well suited to speedometer and law enforcement applications.
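The growth of the error with distance follows from the standard stereo depth relation z = f·b/d: a disparity error δd produces, to first order, a relative depth error δz/z = z·δd/(f·b), which is linear in z. The sketch below evaluates this under stated assumptions. The focal length of 2000 pixels is hypothetical, the one-meter baseline and 0.1-pixel disparity accuracy are the figures quoted in this document, and the function name is illustrative:

```python
def relative_depth_error(z_m, focal_px, baseline_m, disparity_err_px):
    """First-order relative depth error for z = f*b/d: dz/z = z*dd/(f*b)."""
    return z_m * disparity_err_px / (focal_px * baseline_m)


# Hypothetical camera: focal length 2000 px (assumed), baseline 1 m,
# disparity accuracy 0.1 px.
for z in (25, 50, 100, 200):
    err = relative_depth_error(z, focal_px=2000, baseline_m=1.0, disparity_err_px=0.1)
    print(f"z = {z:3d} m  relative error = {100 * err:.3f} %")
```

With these assumed values the relative error at 100 m is about 0.5%, and it doubles each time the distance doubles.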

It is worth mentioning that in practice the effect of this error is smaller, since the image of the vehicle covers many pixels and averaging z over all of them decreases the final error. Also, the speed of the vehicle is not calculated only at the maximum distance, and the speedometer error is lower at closer distances. Because the stereo vision method computes velocity as the derivative of displacement, it can do so more accurately when the location is known precisely.
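One way to turn a sequence of measured distances into a speed is to fit a straight line to distance against time and take its slope, which is less sensitive to single-sample noise than differencing two positions. This is a sketch of that idea, not necessarily the method used by any particular product; the function name and the sample values are hypothetical:

```python
def speed_kmh(times_s, distances_m):
    """Least-squares slope of distance vs. time, returned as a speed in km/h."""
    n = len(times_s)
    t_mean = sum(times_s) / n
    z_mean = sum(distances_m) / n
    num = sum((t - t_mean) * (z - z_mean) for t, z in zip(times_s, distances_m))
    den = sum((t - t_mean) ** 2 for t in times_s)
    return abs(num / den) * 3.6  # m/s -> km/h


# Hypothetical track: a vehicle approaching at 25 m/s, sampled every 40 ms.
times = [0.04 * i for i in range(50)]
distances = [100.0 - 25.0 * t for t in times]
print(round(speed_kmh(times, distances), 1))
```

The absolute value is taken so that approaching and receding vehicles both yield a positive speed.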

3-D vision based speedometer cameras

Although the principles of 3-D image processing have been known for many years, the technology has only recently become economical for wide use, thanks to more powerful processors. In the past, the time required to process one frame was greater than the interval between two frames, so a real-time system was impossible to implement. Today, real-time 3-D image processing is a new generation of technology, and we can expect more applications of it in the future. 3-D vision based speedometer cameras have therefore only recently appeared on the international scale, and a few companies are working in this field. (It is worth remembering that LPR based systems, due to their intrinsic issues, have not been used worldwide for recording traffic violations.)

Comparison of speedometer cameras

As we said in the radar speedometer section, not every system that includes a radar is a radar-based system. In many cases such systems are only called radar systems, while their operation is not based on radar at all. The same issue applies to stereovision cameras: not every system with two cameras is necessarily a stereovision system. For 3-D image processing and finding the parallax, it is vital that both cameras capture the same view, so that the algorithms can match pixels in one picture to pixels in the other. Systems that combine a color camera with an infrared camera therefore cannot, in principle, perform 3-D image processing and compute depth, because the two pictures differ too much. The advantages of stereovision cameras are fully evident only in systems that truly consist of them.

Detecting vehicles independent from LPR

Like radar systems, which recognize vehicles based on an independent physical feature (frequency shift), stereovision cameras detect vehicles as moving 3-D objects on the street, and no other assumption is needed to detect them and evaluate their speed. So, regardless of color, shape, appearance, and license plate, every vehicle is detected, its speed is evaluated, and the image needed for LPR is taken; the LPR process then runs as the next step. In other words, the image processing kernel is universal in terms of vehicle detection and speed measurement, and does its job by detecting a moving 3-D object on the street. The ANPR kernel only performs the license plate reading function, independently for each license plate. It is worth mentioning that some stereovision systems base their image processing on ANPR: the system first finds the license plate in the picture and then performs the 3-D depth calculations on the plate. Although these systems locate vehicles and determine their speed accurately, they lack many capabilities of a system based entirely on stereovision, which recognizes vehicles by their 3-D volume. Such systems therefore do not take full advantage of the hardware.

Therefore, a thorough stereovision system based on 3-D recognition provides the capabilities below:

Detecting and imaging all the vehicles with any type of license plate

In addition to national standard license plates, many other permitted license plates can also be found on the roads in Iran, such as:

International transit plates (in different formats)
Iran’s transit plates
Political and diplomatic license plates (in many formats)
Free zone plates (in two formats: Chabahar’s free zone plate and the format of the other free zones)
License plates of neighboring countries such as Iraq, which are common in border cities

Obviously, a system based on detecting Iran’s standard license plate would miss many of these cases within the country. Vehicles without plates or with unreadable plates must also be added to this count. On the other hand, when vehicle detection is separated from the LPR phase, all the cases discussed above are recognized and recorded in the system, together with suitable pictures of the vehicle.

Speed detection regardless of license plate

More accurate license plate recognition

Obviously, each system has its own ANPR kernel (several major, different kernels are in use across the country). But regardless of the power or accuracy of these software kernels, they produce better results when used in a separate system with an independent detection unit. There are several reasons for this:

The ANPR kernel succeeds if it reads just one of the multiple frames that the vehicle detection unit (radar or stereovision) provides for each car.
All the frames related to a vehicle are surely related to that specific car, so there is no need to match different frames to each other according to LPR results.
Stereovision cameras provide some valuable information that can help the ANPR process.
Since the pictures are taken while the vehicle is passing, the system has enough time to read the license plate (even a special one) after the vehicle has passed.
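The first point in the list above can be quantified under a simplifying assumption: if each frame is read correctly with probability p, independently of the others, then at least one of n frames is read with probability 1 − (1 − p)^n. Real frames of the same vehicle are correlated (a dirty plate is dirty in every frame), so this is an optimistic illustration; the values of p and n are hypothetical:

```python
def read_success_probability(p_single, n_frames):
    """Probability that at least one of n independent frames is read correctly."""
    return 1.0 - (1.0 - p_single) ** n_frames


# Hypothetical per-frame read rate of 70%.
for n in (1, 3, 5, 10):
    print(n, round(read_success_probability(0.7, n), 4))
```

Under this assumption, three frames at a 70% per-frame rate already give a combined rate of 97.3%.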

Security, disciplinary, and traffic counting applications

Traffic violation recording systems are not only responsible for detecting and reporting offending drivers. Other security and disciplinary goals are defined for them that are as important as traffic police applications. That is why recording all passing vehicles regardless of license plate type (even in the case of manipulated license plates) is vital for a national system.

High resolution and resistance to overlapping

When it comes to independence from LPR, radar and stereovision are the same, as mentioned before. But as we saw, radars have low resolution. Resolution is exactly the strength of stereovision systems, and this limitation does not affect 3-D image processing. The reason is that radar systems use an array of receivers for resolution, and reveal the location of an object by calculating the phase lag, or the time difference between sending a signal and receiving it back. The small number of these receivers and the short distances (baselines) between them produce only small phase lags. Stereovision, in contrast, obtains an adequate parallax (disparity) by using two high-resolution cameras whose baseline is adjusted for traffic applications (about one meter). (As mentioned earlier, improving radar resolution is still a major issue, and every year products are developed to address this need.) In addition, a camera consists of a very large number of light receivers (pixels), which provide the information essential for higher accuracy.

An example of the 3-D image processing system performing well in heavy traffic with vehicles at nearly the same speed, which is an unfavorable situation for radar

Accuracy and speedometer range

There are three factors affecting speedometer accuracy:

A method based on precise physical and mathematical principles

In the first section, we discussed how the Doppler effect, together with the strength of electronics in measuring frequency shift precisely, ensures an accurate speedometer. We also showed through theoretical calculations that large errors are to be expected for mono-camera LPR based image processing. In the first section of this chapter, it was shown from the 3-D depth relations that the error of the distance measurement, which is the basis of the speedometer, is less than 1%. Thus both radar and 3-D image processing succeed from this point of view.

High depth of performance

When sensors can detect vehicles from a distance of 100 meters, the system has enough time to gather the essential data, so noise or errors in the gathered data can be eliminated easily. Radar and stereovision image processing perform equally from this perspective: they collect more than 100 data points for each passing vehicle, so a few noisy samples among them can be detected and deleted. Mono-camera LPR based systems, however, have only a few data points, among which distinguishing valid data from noise may be hardly possible.
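With on the order of 100 samples per vehicle, a few spurious readings can be removed by a simple robust filter. The sketch below uses the median absolute deviation (MAD) rule, chosen here only as one common illustration; the function name, the threshold k, and the sample values are hypothetical:

```python
from statistics import median


def reject_outliers(samples, k=3.0):
    """Keep samples within k robust standard deviations of the median (MAD rule)."""
    med = median(samples)
    mad = median(abs(s - med) for s in samples)
    threshold = k * 1.4826 * mad  # 1.4826 scales the MAD to a Gaussian sigma
    return [s for s in samples if abs(s - med) <= threshold]


# Hypothetical speed samples (km/h): 100 consistent readings plus two spurious ones.
samples = [90.0 + 0.1 * (i % 5 - 2) for i in range(100)] + [140.0, 30.0]
kept = reject_outliers(samples)
print(len(samples), len(kept))
```

The same filter has little to work with when only a handful of samples exist, which is the difficulty described above for mono-camera systems.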

Multiple measurements

A vehicle’s image consists of many pixels. Consequently, averaging the data from all of them reduces the final error. Although each pixel can have its own (very slight) error, averaging the values from the pixels makes the errors largely cancel each other out (some errors are positive and others negative), so the final outcome is more precise.

Together, the strengths mentioned above provide an average error of less than 1% and an upper speed limit of more than 300 km/h for radar and 3-D image processing technologies. (Note that these numbers apply only to leading technologies; not every product necessarily achieves them.)
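As a rough quantification, assume the per-pixel depth errors are independent, zero-mean, and share a standard deviation $\sigma$ (an idealization, since neighboring pixels in a disparity map are usually correlated). The standard deviation of the average over $N$ pixels is then:

```latex
\sigma_{\bar{z}} = \frac{\sigma}{\sqrt{N}}
```

Under this assumption, averaging over 100 pixels would reduce the error by a factor of 10.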

Considerable vehicle detection rate

In the stereovision method, vehicles are detected without considering factors such as the shape, color, and moving direction (approaching or receding) of the car; detection needs only some pixels of the image. Since the image of a car contains many pixels, each car has a high chance of being recognized. And since stereovision cameras take wide pictures, there are no serious limits on lateral coverage: with only a few optical changes it is possible to cover six or even more lanes.

Even in bad weather or in cases of minor overlap, a few pixels of the vehicle will still be available for the system to recognize it, so the performance of stereovision systems is preserved in bad conditions. Below are some examples of stereovision system outputs:

Simultaneous speed measurement and detection of several vehicles in different lanes by one stereo vision camera

Resistance of the system to overlapping vehicles

System’s functionality in snow and fog

System’s functionality in storm

Recognizing a 3-D body on the road requires neither the registration plate and its location on the car nor the vehicle’s color or shape. This important fact leaves criminals and traffic offenders no choice but to make the vehicle itself disappear. In radar systems, by contrast, one can use jammers, and in LPR based systems one can cover or manipulate the license plate, to stay invisible to the system and pass it without being recorded. Below are some practical outputs of stereovision systems in particular conditions:

3-D vision, vehicle class and moving lane detection

A 3-D vision system sees the scene in 3-D, so XYZ information is available for any point in the image. Consequently, it is possible to determine the width and length of a car and to locate it in the 3-D world from this information. Stereovision thus uses the most accurate method of recognizing a vehicle's weight class, namely measuring the size of the vehicle, which is more precise than the classification by surface reflection rate used by radar systems. Also, by locating the vehicle in 3-D, we can determine its position on the road and recognize which lane it is moving in.
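A minimal sketch of how size and lane could be read from the 3-D points belonging to one vehicle. It assumes the points have already been transformed into a road-aligned frame (x across the road, y up, z along the road) and that the vehicle travels parallel to the road; the 3.5 m lane width, the function names, and the sample points are hypothetical:

```python
def footprint(points_xyz):
    """Width (along x) and length (along z) of the axis-aligned box around the points."""
    xs = [p[0] for p in points_xyz]
    zs = [p[2] for p in points_xyz]
    return max(xs) - min(xs), max(zs) - min(zs)


def lane_index(points_xyz, lane_width_m=3.5, road_left_edge_m=0.0):
    """Zero-based lane number from the mean lateral position of the points."""
    x_mean = sum(p[0] for p in points_xyz) / len(points_xyz)
    return int((x_mean - road_left_edge_m) // lane_width_m)


# Hypothetical 3-D points on one vehicle, in meters.
car = [(4.0, 0.5, 40.0), (5.8, 0.5, 40.0), (4.0, 1.4, 44.5), (5.8, 1.4, 44.5)]
width, length = footprint(car)
print(round(width, 2), round(length, 2), lane_index(car))
```

A size-based class could then be assigned by comparing the measured width and length against thresholds chosen for the vehicle categories of interest.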

Calibration and physical conditions

What matters in the calibration of stereovision cameras is the transform matrix between the two cameras. In other words, stereovision cameras are calibrated during manufacturing with respect to their positions relative to each other and their lens parameters. The key requirement is that no angular variation or linear displacement of the cameras relative to each other occurs as time passes. Calibrating stereovision cameras involves technical subtleties and practical complexities that make the required accuracy hard to achieve. At the same time, all the advantages of a stereovision system vanish if an error occurs during the initial calibration. But once the stereo camera has been manufactured, calibrated, and installed precisely, trembling of the pole and slight angle variations of the whole set cause no problem, as long as the cameras have not been displaced or rotated relative to each other.

In other words, the performance of the whole system is unaffected as long as the angles of both stereo cameras change slightly by the same amount. Problems appear when a slight angle difference arises between the cameras. Since the two stereo cameras can be fixed firmly enough on a predetermined structure, pole vibrations, foundation subsidence, or other factors only change the position of the structure as a whole, without changing the location of each camera relative to the other. In addition, as mentioned earlier for the principles of 3-D image processing, no assumption is made about the surrounding space or the geometry of the road and its curvature; consequently, different road conditions have no effect on the performance of systems based on 3-D image processing.
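The sensitivity to relative rotation can be illustrated with the standard stereo relation z = f·b/d. Near the image center, a small yaw of one camera relative to the other by an angle ε shifts its image by about f·tan(ε) pixels, and this shift adds directly to the measured disparity. The sketch below is a simplified model under that assumption; the focal length of 2000 pixels and the function name are hypothetical, and the sign of the bias depends on the direction of rotation:

```python
import math


def depth_with_relative_yaw(true_z_m, focal_px, baseline_m, relative_yaw_rad):
    """Depth estimated when one camera has yawed slightly relative to the other.

    Near the image center, a yaw of e radians shifts image points by about
    f*tan(e) pixels, which adds directly to the measured disparity.
    """
    true_disparity = focal_px * baseline_m / true_z_m
    biased_disparity = true_disparity + focal_px * math.tan(relative_yaw_rad)
    return focal_px * baseline_m / biased_disparity


# Hypothetical camera: f = 2000 px (assumed), b = 1 m, vehicle at 100 m.
for yaw_deg in (0.0, 0.005, 0.02):
    z = depth_with_relative_yaw(100.0, 2000.0, 1.0, math.radians(yaw_deg))
    print(f"relative yaw {yaw_deg:.3f} deg -> estimated depth {z:.2f} m")
```

In this model a relative yaw of only a few thousandths of a degree already changes the estimated depth at 100 m by about 1%, whereas a rotation shared by both cameras leaves the disparity, and therefore the depth in the camera frame, unchanged.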