Going Deep

Jeff Bier’s Column: Why Depth Sensing Will Proliferate

If you’ve read recent editions of this column, you know that I believe that embedded vision – enabling devices to understand the world visually – will be a game-changer for many industries. For humans, vision enables many diverse capabilities: reading your spouse’s facial expression, navigating your car through a parking garage, or threading a needle. Similarly, embedded vision is now enabling all sorts of devices to be more autonomous, easier to use, safer, more efficient and more capable.

“Depth is a key aspect of visual perception, but one that’s been out of reach for most product designers.” – Jeff Bier, Embedded Vision Alliance (Image: Embedded Vision Alliance)

When we think about embedded vision (or, more generically, computer vision), we typically think about algorithms for identifying objects: a car, a curb, a pedestrian, etc. And, to be sure, identifying objects is an important part of visual intelligence. But it’s only one part. Particularly for devices that interact with the physical world, it’s important to know not only what objects are in the vicinity, but also where they are. Knowing where things are enables a camera to focus on faces when taking a photo, a vacuum cleaning robot to avoid getting wedged under the sofa, and a factory robot to safely collaborate with humans. Similarly, it’s often useful to know the size and shape of objects – for example, to enable a robot to grasp them.

We live in a 3D world, and the location, size and shape of an object are 3D concepts. It’s sometimes possible to infer the depth dimension from a 2D image (for example, if the size of the object is already known), but in general, it’s much easier to measure the depth directly using a depth sensor. Historically, depth sensors have been bulky and expensive, like the LiDAR sensors seen on top of Google’s self-driving car prototypes. But this is changing fast. The first version of the Microsoft Kinect, introduced in 2010, showed that it was possible – and useful – to incorporate depth sensing into a consumer product. Since then, many companies have made enormous investments to create depth sensors that are more accurate, smaller, less expensive and less power-hungry. Other companies (such as Google with Project Tango and Intel with RealSense) have invested in algorithms and software to turn raw depth sensor data into data that applications can use.

And application developers are finding lots of ways to use this data. One of my favorite examples is 8tree, a start-up that designs easy-to-use handheld devices for measuring surface deformities such as hail damage on car bodies.
And augmented reality games in which computer-generated characters interact with the physical world can be compelling.

There are many types of depth sensors, including stereo cameras, time-of-flight and structured light. Some of these, like stereo cameras, naturally produce a conventional RGB image in addition to depth data. With other depth sensor types, a depth sensor is often paired with a conventional image sensor so that both depth and RGB data are available. This naturally raises the question of how to make best use of both the RGB and the depth data. Perhaps not surprisingly, researchers have recently applied artificial neural networks to this problem with success.

The more our devices know about the world around them, the more effective they can be. Depth is a key aspect of visual perception, but one that’s been out of reach for most product designers. Now, thanks to improvements in depth sensors, algorithms, software and processors, it’s becoming increasingly practical to incorporate depth sensing into even cost- and power-constrained devices like mobile phones. Look, for example, at Apple’s just-announced iPhone 7 Plus, along with other recently introduced dual-camera smartphones such as Huawei’s P9, Lenovo’s Phab2 Pro, LG’s G5 and V20, and Xiaomi’s RedMi Pro.
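
For readers who want to see the geometry behind two of the points above, here is a minimal sketch. It assumes an idealized pinhole camera and, for the stereo case, a rectified camera pair; neither assumption comes from any particular product mentioned in this column. The function names and the example numbers are purely illustrative. The first function shows how depth can be inferred from a single 2D image when the real size of an object is already known; the second shows how a stereo camera turns the pixel shift (disparity) between its two views into depth.

```python
def depth_from_known_size(focal_length_px, real_height_m, image_height_px):
    """Pinhole model: an object of real height H that spans h pixels in the
    image lies at depth Z = f * H / h, with the focal length f in pixels."""
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_length_px * real_height_m / image_height_px


def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Rectified stereo pair: a point seen with disparity d pixels between the
    left and right images lies at depth Z = f * B / d, with baseline B."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    # A 1.7 m tall person spanning 340 px, camera focal length 1000 px.
    print(depth_from_known_size(1000.0, 1.7, 340.0))
    # Two cameras 0.1 m apart, same focal length, 20 px disparity.
    print(depth_from_disparity(1000.0, 0.1, 20.0))
```

Both relations show why a dedicated depth sensor is attractive: the single-image approach only works when the object’s true size is known in advance, while the stereo approach measures depth directly but loses precision as objects get farther away and the disparity shrinks toward zero.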
