Structure from Motion

3D from standard cameras for automotive applications

The World Health Organization estimates that 1.2 million people die in road traffic accidents worldwide each year. Drivers in the USA spend about five years of their lives in a car, and, due to the high cost of traffic incidents, insurance has risen to about $0.10 per mile. Automotive Advanced Driver Assistance Systems (ADAS) have the potential to make a big impact: saving lives, saving time, and saving cost by aiding the driver in operating the vehicle, or even taking over control of the vehicle completely.
One of the key tasks of an ADAS is to understand the vehicle’s surroundings. Besides recognizing pedestrians, other vehicles, lanes and obstacles, the system should be aware of where these objects are in full 3D space. This 3D information lets the ADAS determine the distance of objects, along with their size, direction and speed, so it can take appropriate action.

It’s common to think that we humans use our two eyes to sense depth. Yet we can easily catch a ball with one eye closed. Research has shown that humans actually rely primarily on monocular vision to sense depth, using motion parallax. This is a depth cue that results from movement: as we move, objects that are closer to us move farther across our field of view than objects that are more distant. The same mechanism, called Structure from Motion, can be used to sense depth with standard video cameras.

There are also ways to sense depth using special cameras. Lidar measures distance by illuminating a target with a laser and analyzing the reflected light. Time-of-flight cameras measure the delay of a light signal between the camera and the subject for each point of the image. Another method is to project a pattern of light onto the scene; capturing this distorted pattern with a camera allows depth information to be extracted. Structure from Motion has a few key advantages over these approaches. First, no active illumination of the scene is required; such active lighting limits range and outdoor use. Second, a standard off-the-shelf camera suffices instead of a specialized depth-sensing camera. This reduces cost, since the existing rear-view or surround-view cameras can be reused and no active lighting components are needed.
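To make the motion-parallax cue described above concrete, here is a toy numerical sketch in Python. The focal length, camera translation and object distances are made-up values for illustration only; it uses the standard pinhole-camera relationship that a sideways camera translation t shifts a point at depth Z by roughly f·t/Z pixels in the image.

```python
# Toy illustration of motion parallax with a pinhole camera model.
# All numbers below are assumptions chosen for illustration.
f = 800.0        # focal length in pixels
baseline = 0.5   # sideways camera translation between frames, in metres

for depth in (2.0, 10.0, 50.0):   # object distances in metres
    # Image shift of a point at distance `depth` for a lateral translation.
    shift = f * baseline / depth
    print(f"object at {depth:5.1f} m moves {shift:6.1f} px between frames")
```

Nearby objects sweep across the image far faster than distant ones, which is exactly the cue Structure from Motion exploits.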

Structure from Motion algorithm

The Structure from Motion algorithm consists of three steps:

  1. Detection of feature points in view
  2. Tracking of feature points from one frame to the next
  3. Robust estimation of the 3D position of these points, based on their motion

The first step is to identify points in the image that can be robustly tracked from one frame to the next. Features in textureless patches, like blank walls, are nearly impossible to localize. Areas with large contrast changes (gradients), like lines, are easier to localize, but lines suffer from the aperture problem: a patch can only be aligned along the direction of the line, not to a single position, which makes lines unsuitable for frame-to-frame tracking as well. Locations with gradients in two significantly different orientations make good feature points that can be tracked from one frame to the next. Such features show up in the image as corners, where two lines come together. Feature detection algorithms have been widely researched in the computer vision community; in our application, we use the Harris feature detector.

The next step is to track these feature points from frame to frame, to find how much they moved in the image. We use the Lucas-Kanade optical flow algorithm for this. The algorithm first builds a multiscale image pyramid, in which each level is a smaller, scaled-down version of the originally captured image. The algorithm searches around the previous frame’s feature point location for a match. When a match is found, its position is reused as the initial estimate at the next, larger image in the pyramid, traveling down the pyramid until the original image resolution is reached. This way, larger displacements can also be tracked. The result is two lists of feature points: one for the previous image and one for the current image.

Based on these point pairs, you can set up and solve a linear system of equations that yields the camera motion and, consequently, the distance of each point from the camera. The result is a sparse 3D point cloud covering the camera’s viewpoint. This point cloud can then be used for different applications such as automated parking, obstacle detection, or even accurate indoor positioning for mobile phones.
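As an illustration of these three steps, here is a minimal sketch in Python using OpenCV. This is not the videantis/Viscoda implementation: the camera intrinsics, file names and parameter values are assumptions chosen for illustration, and the Harris detection, pyramidal Lucas-Kanade tracking and robust motion estimation are approximated with OpenCV’s off-the-shelf functions.

```python
import cv2
import numpy as np

# Assumed pinhole camera intrinsics (focal length and principal point in pixels).
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])

# Two consecutive frames from the camera (file names are placeholders).
prev_gray = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr_gray = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect corner-like feature points using the Harris response.
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                   qualityLevel=0.01, minDistance=7,
                                   useHarrisDetector=True, k=0.04)

# 2. Track the points into the next frame with pyramidal Lucas-Kanade optical flow.
curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None,
                                               winSize=(21, 21), maxLevel=3)
good_prev = prev_pts[status.ravel() == 1]
good_curr = curr_pts[status.ravel() == 1]

# 3. Robustly estimate the camera motion (RANSAC on the essential matrix),
#    then triangulate the tracked points into a sparse 3D point cloud.
E, inliers = cv2.findEssentialMat(good_prev, good_curr, K,
                                  method=cv2.RANSAC, prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, good_prev, good_curr, K, mask=inliers)

P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # previous camera pose (reference)
P1 = K @ np.hstack([R, t])                          # current camera pose (estimated)
pts_h = cv2.triangulatePoints(P0, P1,
                              good_prev.reshape(-1, 2).T,
                              good_curr.reshape(-1, 2).T)
points_3d = (pts_h[:3] / pts_h[3]).T                # sparse point cloud, up to scale

print(f"reconstructed {len(points_3d)} 3D points")
```

Note that with a single moving camera the reconstruction is only determined up to an unknown scale factor; in an automotive setting this scale is typically resolved with additional information such as the known mounting height of the camera above the road or vehicle odometry.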

Vision processor

Videantis has been working together with Viscoda to implement the Structure from Motion algorithm on the videantis v-MP4280HDX vision processor. The Viscoda Structure from Motion algorithm has proven to be very robust in reconstructing a 3D point cloud in a wide variety of situations, whether in low-light conditions or for complex scenes. The videantis vision processor is licensed to semiconductor companies for integration into their systems-on-chips targeting a wide variety of automotive applications. The processor architecture has been specifically optimized to run computer vision algorithms at high performance and with very low power consumption. The multi-core architecture can scale from just a few cores to many cores, enabling it to address different performance points: from chips that can be integrated into low-cost cameras, all the way up to very high-performance applications such as multi-camera systems that run many computer vision algorithms concurrently. The combined solution runs the Viscoda Structure from Motion algorithm on the videantis processor architecture. The resulting implementation is small and low-power enough to be integrated into smart cameras for automotive applications, making our rides safer and enabling us to let go of the wheel and pedals.
