Identification and recognition

If your surveillance objectives include identifying or recognizing persons or objects, you will need not only to capture the scene at sufficient resolution, but also to take factors such as illumination, camera positioning and motion into account.

This tutorial focuses on the cameras in a surveillance project that provide the close-up footage required for identification and/or recognition, and shows how you can meet these requirements in your project.
Required resolution

The traditional way of defining requirements for resolution of an analog CCTV system has been by specifying what percentage of the full screen the observed object occupies. Different surveillance objectives require different percentages.

For example, detecting the presence of a person in a scene could require that the person occupies 10% of the view. Recognizing a known person, however, could require that the person occupies 50%, and further identifying that person could require 120% or more.

[Figure panels: 20%, 40%, 140%. Captions: too small for recognition; face turned away from camera; face partly turned away from camera.]


Today, network video cameras offer a wide range of available resolutions. Using the percentage requirements is no longer practical, and pixels are now used when specifying resolution requirements. For a detailed discussion on resolution requirements for identification, recognition and detection, see the Perfect pixel count tutorial.

Other criteria are valid for objects such as license plates, where typical recommendations are that the height of letters should be represented by 15 pixels (corresponding to about 200 pixels/m) to ensure legibility.

It is also important to take legal and regulatory requirements into account when determining the resolution needed in order to be able to use camera footage as evidence in court.
Finding a camera to match resolution requirements

The resolution of a captured scene is determined by the camera resolution and the size of the scene. For example, if you are using a camera that delivers 4CIF (704 x 576 pixels) resolution and you require a linear resolution of 500 pixels/m or more, you can cover a scene that is at most 1.4 m wide. You will need to select a camera and lens whose field of view matches the scene size at the desired distance between the camera and the scene.
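The 4CIF figure above follows from dividing the horizontal pixel count by the required linear resolution. A minimal Python sketch of that arithmetic; the function name and the 500 pixels/m default are illustrative and not part of any Axis tool:

```python
def max_scene_width_m(horizontal_pixels: int, pixels_per_m: float = 500.0) -> float:
    """Widest scene (in metres) that still meets the required linear resolution."""
    if pixels_per_m <= 0:
        raise ValueError("pixels_per_m must be positive")
    return horizontal_pixels / pixels_per_m


# 4CIF is 704 pixels wide, so at 500 pixels/m the scene can be about 1.4 m wide.
print(round(max_scene_width_m(704), 1))  # 1.4
```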


Camera model       Focal length    Horizontal resolution   Maximum distance   Maximum scene width
AXIS P1357         2.8 – 8 mm      2592 pixels             9 m                5.2 m
AXIS P3354 12mm    3.3 – 12 mm     1280 pixels             6 m                2.6 m
AXIS Q1755         5.1 – 51 mm     1920 pixels             41 m               3.8 m
AXIS Q6042-E       3.3 – 119 mm    736 pixels              50 m               1.5 m
AXIS Q6044         4.4 – 132 mm    1280 pixels             67 m               2.6 m

Table 2: Maximum distance for identification (500 pixels/m horizontal linear resolution, 80 pixels face width) for some Axis cameras

Axis Lens Calculators and the Axis Product Selector are useful tools that help you find a suitable camera and focal length. For advanced users, a pixel and distance calculator spreadsheet is also available.
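For a rough idea of what such a calculator does, the maximum distance can be approximated with a simple pinhole model, in which scene width divided by distance equals sensor width divided by focal length. The sketch below is not the Axis tool. The 4.8 mm sensor width is a hypothetical value chosen for illustration, and real lenses will deviate from this idealized model.

```python
def max_distance_m(focal_length_mm: float, scene_width_m: float,
                   sensor_width_mm: float) -> float:
    """Distance at which the horizontal field of view is scene_width_m wide.

    Pinhole approximation: scene_width / distance = sensor_width / focal_length.
    """
    if sensor_width_mm <= 0:
        raise ValueError("sensor_width_mm must be positive")
    return focal_length_mm * scene_width_m / sensor_width_mm


# Assumed inputs: 51 mm focal length, a 1920-pixel-wide image at 500 pixels/m
# (a 3.84 m wide scene) and a hypothetical 4.8 mm wide sensor.
distance = max_distance_m(51.0, 1920 / 500.0, 4.8)
print(f"{distance:.1f} m")  # 40.8 m
```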
Higher camera resolution means fewer cameras and better overview

Since the maximum size of a scene covered at a given resolution only depends on the camera resolution, cameras with higher resolution can cover larger areas. For example, if your 7 m wide scene requires five cameras delivering 4CIF resolution, these can be replaced by two cameras at 1080p HDTV resolution (1920 x 1080 pixels). Also, a camera with higher resolution can be used to give a better overview, by covering a larger scene while maintaining the required linear resolution.
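The camera counts in this example can be reproduced by dividing the pixels needed across the scene by the pixels each camera delivers, and rounding up. A minimal sketch, assuming a 500 pixels/m requirement and ignoring any overlap between adjacent fields of view (the function name is illustrative):

```python
import math


def cameras_needed(scene_width_m: float, horizontal_pixels: int,
                   pixels_per_m: float = 500.0) -> int:
    """Number of side-by-side cameras needed to cover the scene width."""
    required_pixels = scene_width_m * pixels_per_m
    return math.ceil(required_pixels / horizontal_pixels)


print(cameras_needed(7.0, 704))   # 4CIF: 5
print(cameras_needed(7.0, 1920))  # 1080p: 2
```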

Cameras with higher resolution can cover larger areas
Depth of field

The larger the depth of field is, the larger the area where persons or objects are in focus. With a large depth of field, your chances of identification increase. Depth of field is determined by the iris opening, the focal length and the distance to the camera.

The depth of field increases with smaller iris openings. This means that good lighting conditions can help increase depth of field. The P-Iris feature of some Axis cameras will adjust the iris to optimize depth of field for different lighting conditions. You can learn more about P-Iris in the P-Iris white paper.

Using shorter focal lengths will also increase depth of field. Using cameras with higher resolutions will let you capture the scene using shorter focal lengths, while maintaining resolution requirements.
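To see how iris opening and focal length interact, the standard thin-lens depth-of-field formulas can be evaluated directly. This is a sketch under stated assumptions: the circle of confusion of 0.005 mm is a hypothetical value (it depends on sensor and pixel size), and the focal lengths, f-numbers and 5 m subject distance are chosen only for illustration.

```python
import math


def hyperfocal_mm(focal_mm: float, f_number: float, coc_mm: float) -> float:
    """Hyperfocal distance H = f^2 / (N * c) + f, all lengths in mm."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm


def depth_of_field_mm(focal_mm: float, f_number: float, subject_mm: float,
                      coc_mm: float = 0.005) -> tuple[float, float]:
    """Return the (near, far) limits of acceptable focus in mm.

    The far limit is math.inf when the subject is at or beyond the
    hyperfocal distance.
    """
    h = hyperfocal_mm(focal_mm, f_number, coc_mm)
    near = subject_mm * (h - focal_mm) / (h + subject_mm - 2 * focal_mm)
    if subject_mm >= h:
        far = math.inf
    else:
        far = subject_mm * (h - focal_mm) / (h - subject_mm)
    return near, far


# Compare a wide and a small iris opening, then a shorter focal length,
# all focused at 5 m.
for focal, n in [(8.0, 2.0), (8.0, 8.0), (4.0, 2.0)]:
    near, far = depth_of_field_mm(focal, n, 5000.0)
    print(f"f={focal} mm, f/{n}: in focus from {near / 1000:.1f} m to {far / 1000:.1f} m")
```

With these inputs, both the smaller iris opening and the shorter focal length move the near limit closer to the camera and extend the far limit, which matches the behaviour described above.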
Distortion

Most lenses exhibit distortion. Often this is in the form of barrel distortion. Barrel distortion is caused by lens magnification being smaller on the edges of the field-of-view compared to the center of the image. The effect is that objects that are near the edge appear closer to the center compared to an undistorted image. Objects of the same size will cover fewer pixels when they are near the edge, compared to what they would cover if they were closer to the center. This means that objects that are near the edge of the field-of-view need to be closer to the camera in order to fulfill requirements on minimum resolution.

The effect of barrel distortion is often much more pronounced at short focal lengths, making wide angle lenses less suited for identification purposes.
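The loss of resolution toward the edge can be illustrated with a single-coefficient radial distortion model, r_d = r(1 + k1 r²), where k1 is negative for barrel distortion. This model and the coefficient value of -0.1 are assumptions made for the sketch; a real lens would need to be calibrated to obtain its own coefficients.

```python
def radial_scale(r: float, k1: float = -0.1) -> float:
    """Local radial magnification relative to the image centre.

    r is the undistorted radius, normalised so that 1.0 is the image edge.
    Derivative of r_d = r * (1 + k1 * r**2) with respect to r.
    """
    return 1.0 + 3.0 * k1 * r * r


def effective_pixels_per_m(center_ppm: float, r: float, k1: float = -0.1) -> float:
    """Approximate radial linear resolution at radius r, given the value at the centre."""
    return center_ppm * radial_scale(r, k1)


# With the assumed coefficient, 500 pixels/m at the centre drops to
# roughly 350 pixels/m at the edge of the image.
print(f"{effective_pixels_per_m(500.0, 1.0):.0f} pixels/m")  # 350 pixels/m
```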
Illumination

Illumination greatly affects the ability to identify persons or objects. Shadows, high contrasts and backlit scenes all make identification and recognition more difficult compared to when lighting conditions are more favorable. These examples compare good outdoor lighting with more challenging conditions.


At distances of 15 to 20 m, you will need a 50 mm lens to ensure that a face covers around 80 pixels. However, the examples clearly show that even at this resolution, positive identification is not guaranteed at the 100-150 lux illumination that is typical in an office corridor or subway station. Camera features such as wide dynamic range and sensors that perform well in low light can help, but the best results are obtained when these are combined with additional lighting and adjustment of camera positions to avoid backlit situations.

In outdoor surveillance it is important to take into account that the sunlight shifts in intensity and direction through the course of a day. Weather conditions will also affect lighting and reflection. Snow will, for example, intensify the reflected light, while rain and wet tarmac will absorb much of the reflected light. For identification of a human face, balanced illumination in the region of 300-500 lux is recommended. For license plate identification, 150 lux may be sufficient.
Noise

In low light, camera sensors produce significant amounts of noise that can affect the image. This can make identification more difficult. There is always a trade-off between noise, shutter speed, and depth of field at any given level of illumination, where better lighting conditions allow you to improve all of these.

Color fidelity

Color is often an important factor for identification. To ensure color fidelity, camera white balance should be adjusted to suit the color temperature of the light sources used. In outdoor surveillance, the color temperature will change throughout the day, requiring automatic white balancing to keep color fidelity.

Cameras that are compliant with the SMPTE (Society of Motion Picture and Television Engineers) standards for HDTV fulfill stringent requirements on color fidelity.
Camera positioning

In this example, light is favorable both in intensity and direction. The camera is placed level with passing people and the lens offers both focus and depth of field.

Camera placement is critical for successful identification. This is not only for the purpose of avoiding difficult lighting situations, but also to ensure that persons or objects are captured at a favorable angle. If, for example, cameras are placed high above the ground, images will have a bird's-eye perspective, making persons or objects distorted and difficult to identify.

The camera should be firmly fixed in order to minimize blur caused by camera movement. This is of particular importance for PTZ cameras, where maneuvering the camera may induce vibrations that affect image quality.

Stability can be challenging if the camera is mounted on a tall pole and you are using a zoom lens with a long focal length. Then, even small vibrations will translate to large movements in the resulting image.
Motion

Your system design needs to consider motion. For identification purposes, a minimum frame rate of 5 to 8 frames per second is often recommended. Your surveillance objectives may require higher frame rates, for example if you want to get a clearer picture of a series of events. If the captured scene includes persons or objects that cross the field of view at high speed or close to the camera, you will probably want to increase the frame rate to ensure that the camera will not miss any of the action.
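One way to judge whether a frame rate is sufficient is to estimate how many frames are captured while a subject crosses the field of view. A minimal sketch; the scene width of 1.4 m and the running speed of 5 m/s are assumed values for illustration:

```python
def frames_in_view(scene_width_m: float, speed_m_per_s: float, fps: float) -> float:
    """Approximate number of frames captured while a subject crosses the scene."""
    if speed_m_per_s <= 0:
        raise ValueError("speed_m_per_s must be positive")
    time_in_view_s = scene_width_m / speed_m_per_s
    return time_in_view_s * fps


# A person running at an assumed 5 m/s across a 1.4 m wide scene.
print(f"{frames_in_view(1.4, 5.0, 5):.1f} frames at 5 fps")    # 1.4 frames at 5 fps
print(f"{frames_in_view(1.4, 5.0, 25):.1f} frames at 25 fps")  # 7.0 frames at 25 fps
```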

Also, in order to capture sharp footage of fast-moving persons or objects, you will need to use short shutter speeds. Using cameras that support progressive scan eliminates the blur that affects moving objects when using interlaced video.
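The effect of shutter speed can be estimated as the distance the subject moves during the exposure, expressed in pixels. The sketch below assumes motion parallel to the image plane, a walking speed of 1.5 m/s and a linear resolution of 500 pixels/m. These values are illustrative.

```python
def motion_blur_pixels(speed_m_per_s: float, exposure_s: float,
                       pixels_per_m: float = 500.0) -> float:
    """Approximate blur length in pixels for motion across the field of view."""
    return speed_m_per_s * exposure_s * pixels_per_m


def max_exposure_s(speed_m_per_s: float, max_blur_pixels: float,
                   pixels_per_m: float = 500.0) -> float:
    """Longest exposure time that keeps blur within max_blur_pixels."""
    return max_blur_pixels / (speed_m_per_s * pixels_per_m)


# A person walking at an assumed 1.5 m/s, captured at 500 pixels/m.
print(f"{motion_blur_pixels(1.5, 1 / 30):.1f} px at 1/30 s")    # 25.0 px at 1/30 s
print(f"{motion_blur_pixels(1.5, 1 / 500):.1f} px at 1/500 s")  # 1.5 px at 1/500 s
```

Compared with a face width of 80 pixels, a blur of around 25 pixels would remove much of the facial detail, which is why a shorter exposure is needed for moving subjects.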
Compression

Compression can greatly affect the usability of recorded materials for identification and recognition. High compression ratios will introduce blur or pixelation that makes identification difficult. If the compression algorithm uses a bit rate limit, the compression could increase when motion occurs, making otherwise clear footage unusable. When using variable bit rates, on the other hand, the compression remains unchanged, but bandwidth usage will increase when there is motion.

Testing

To ensure that identification and recognition goals are met, it is practical to test the installed cameras by having a test subject acting under realistic conditions. Make sure you use varying levels of lighting when testing. Review the recorded footage in order to verify that you get the image quality you need.

Some typical problems you should look for are:
Camera placement or lens selection that distorts facial features
Difficult lighting conditions that create shaded areas or whiteout effects
Compression settings that cause image blur or pixelation
Motion blur caused by insufficient shutter speed or frame rate
Excessive noise in low light situations
Overlay text placed in a crucial part of the scene

Pixel counter function in action

The pixel counter feature available in some Axis cameras lets you draw a rectangle on the screen around an area of interest. The camera will report the pixel dimensions of the rectangle. Using the pixel counter, you can easily verify that the camera installation fulfills resolution requirements.
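Once the pixel counter has reported a width, the check against a requirement is a simple comparison. A sketch of that check, using the 16 cm face width and the 80-pixel identification target mentioned in this tutorial. The function names are hypothetical and the measured values are made up for illustration.

```python
import math


def required_pixels(real_size_m: float, required_ppm: float) -> int:
    """Pixels an object of real_size_m must cover at required_ppm pixels/m."""
    # The small tolerance guards against floating-point rounding before ceil().
    return math.ceil(real_size_m * required_ppm - 1e-9)


def meets_requirement(measured_pixels: int, real_size_m: float,
                      required_ppm: float) -> bool:
    """True if the measured pixel width satisfies the linear resolution."""
    return measured_pixels >= required_pixels(real_size_m, required_ppm)


# A 0.16 m wide face at 500 pixels/m must cover 80 pixels.
print(required_pixels(0.16, 500))        # 80
print(meets_requirement(85, 0.16, 500))  # True
print(meets_requirement(60, 0.16, 500))  # False
```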
SKL target

For more advanced calibration, a Rotakin device (rotating man) may be used, which simulates object motion and resulting image blur. 

Summary

The ability to identify or recognize persons or objects depends on a number of factors. Some of the more important factors are:
Camera resolution and scene size
Lighting conditions
Camera position
Motion
Compression

Your surveillance objectives determine how many pixels a person or object needs to occupy in the captured footage. Older CCTV recommendations state that a 16 cm wide face should cover at least 40 pixels, which is also the recommendation in the latest European Standard EN 50132-7:2012 by CENELEC. However, Axis suggests 80 pixels or more for identification in challenging conditions. For license plates, the recommendation is that the text should cover 15 pixels vertically. Remember to check legal requirements for footage to be admitted as evidence.
For a given required resolution, the camera resolution determines the maximum size of the captured scene. The more pixels the camera delivers, the larger the scene covered. The camera’s depth of field is important in order to allow identification within a wider range.
Axis Lens Calculator is useful when selecting cameras that fulfill requirements for identification and recognition.
In challenging lighting conditions, identification may not be possible, even if resolution requirements are fulfilled. Camera features such as wide dynamic range and sensitive sensors help, but you also need to consider improved lighting and positioning the camera to avoid backlit situations.
Camera positioning is important in order to capture undistorted images, avoiding pronounced overhead views that make identification difficult.
Most likely your subject will be in motion. You will need to select an appropriate frame rate and shutter speed depending on your surveillance objectives.
Remember, there is no substitute for testing your system under operational conditions to ensure that the installation meets your surveillance objectives. Make sure you review recorded footage to ensure that image quality has not been compromised by compression and that the quality is sufficient for your identification and recognition needs.

About Nguyễn Tiến Cường
