The 9th IEEE Embedded Computer Vision Workshop
24 June, 2013
Portland, Oregon, USA
Held in conjunction with IEEE CVPR 2013
Call for Papers: (pdf)
Contact: (b.kisacanin at yahoo dot com)
Schedule: Full day
0830 Welcome Message
Session 1: Keynote
0835 Keynote: Embedded Vision and Hearing: Bio-mimetic Approaches, Richard F. Lyon (Google)
Session 2: Embedded Low Level Vision
- 0930 GPU-SHOT: parallel optimization for real-time 3D local description, Daniele Palossi, Federico Tombari, Samuele Salti, Martino Ruggiero, Luigi Di Stefano, Luca Benini
- 0950 Scalable Frame to Block-based Automatic Convertor for Efficient Embedded Vision Processing, Senthil Yogamani, BH Pawan Prasad, Rajesh Narasimha
1015 Coffee Break
Session 3: System Analysis
- 1045 Invited talk: EVE: A Flexible Processor for Embedded Vision Applications, Jagadeesh Sankaran (Texas Instruments)
- 1120 An Embedded Vision Services Framework for Heterogeneous Accelerators, Eduardo Gudis, Pullan Lu, David Berends, Kevin Kaighn, Gooitzen Van der Wal, Gregory Buchanan, Sek Chai, Michael Piacentino
- 1140 Vision-based Lane Analysis: Exploration of Issues and Approaches for Embedded Realization, Ravi Kumar Satzoda, Mohan Trivedi
1200 Lunch Break
Session 4: Applications I – Detection of Humans
- 1330 Invited talk: Next generation FPGAs and SOCs – How Embedded Systems Can Profit, Felix Eberli (Supercomputing Systems AG)
- 1400 GPU-accelerated Human Detection using Fast Directional Chamfer Matching, David Schreiber, Csaba Beleznai, Michael Rauter
- 1420 Pedestrian Detection at Warp Speed: exceeding 500 detections per second, Floris De Smedt, Kristof Van Beeck, Tinne Tuytelaars, Toon Goedemé
- 1440 FPGA-based Real-Time Pedestrian Detection on High-Resolution Images, Konrad Doll, Ulrich Brunsmann, Michael Hahnle, Matthias Hisung, Frerk Saxen
- 1500 Invited talk: Development and Deployment of Embedded Vision in Industry: An Update, Jeff Bier (BDTI and Embedded Vision Alliance)
1530 Coffee Break
Session 5: Applications II – Stereo Vision
- 1600 Invited Talk: Stereo Vision Algorithms for FPGAs, Stefano Mattoccia (University of Bologna)
- 1630 Efficient GPU-based Graph Cuts for Stereo Matching, Young-kyu Choi, In Kyu Park
- 1650 Ground Truth Evaluation for Event-Based Silicon Retina Stereo Data, Juergen Kogler, Florian Eibensteiner, Martin Humenberger, Margrit Gelautz, Josef Sharinger
- 1710 Invited talk: Consumer robotics: a platform for embedding computer vision in everyday life, Mario Munich (iRobot)
1740 Closing Remarks
Keynote Talk: Embedded Vision and Hearing: Bio-mimetic Approaches
Richard F. Lyon, Google, Inc.
The fields of embedded machine vision and machine hearing are based on a healthy mixture of techniques and ideas from psychophysics, physiology, signal processing, optimization, machine learning, etc. Coming from a background of statistical information and communication theory, I switched to a more bio-mimetic or neuromorphic approach to both hearing and vision, over 30 years ago. My 1980 optical mouse, an early embedded vision chip, included an explicit model of lateral inhibition, a biological technique found in all sensory systems. While others were interpreting lateral inhibition as a linear filter for spatial sharpening, I interpreted it more as a nonlinear mechanism for automatic gain control, or adaptation to a wide stimulus dynamic range – what’s called contrast gain control in the vision field. This same idea has been a cornerstone of the cochlea models that I have developed for machine hearing: adaptation to wide dynamic range via laterally-coupled gain control preserves local contrasts much better than simple compressive nonlinearities do – such as the log compression common in spectral representations of sound. The machine vision field has informed the development of machine hearing in other ways as well, leveraging the common aspects of biological sensory systems. Many modern devices are starting to include embedded sensory systems of one sort or another. I advocate coupling their senses of vision and hearing, and moving them toward more human-like and human-friendly behavior. Steps in these directions are likely to benefit from paying more attention to bio-mimetic approaches.
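The abstract's central contrast, laterally coupled gain control versus simple compressive nonlinearities such as log compression, can be illustrated with a small sketch (not from the talk; the toy stimulus, kernel width, and function names below are illustrative assumptions):

```python
import numpy as np

def log_compress(x, eps=1e-6):
    """Simple compressive nonlinearity: per-channel log compression."""
    return np.log(x + eps)

def lateral_agc(x, spread=3, eps=1e-6):
    """A rough sketch of laterally coupled automatic gain control:
    each channel is divided by a local average over its neighbours,
    so local contrast survives across a wide stimulus dynamic range."""
    kernel = np.ones(2 * spread + 1)
    kernel /= kernel.sum()
    local_mean = np.convolve(x, kernel, mode="same")
    return x / (local_mean + eps)

# Toy stimulus: the same 20% local-contrast pattern repeated at two
# intensity levels a factor of 1000 apart.
pattern = np.array([1.0, 1.2, 1.0, 1.2, 1.0])
x = np.concatenate([pattern, 1000.0 * pattern])

y_log = log_compress(x)   # offsets the loud segment by ~log(1000)
y_agc = lateral_agc(x)    # quiet and loud segments come out nearly equal
```

Log compression maps the two segments to levels separated by log(1000), while the divisive gain control normalizes both segments to nearly the same output, preserving the local contrast pattern itself.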
Richard F. Lyon received the B.S. degree in engineering and applied science from California Institute of Technology in 1974 and the M.S. degree in electrical engineering from Stanford University in 1975. In his early career, he worked on a variety of projects involving communication and information theory, digital system design, analog and digital signal processing, VLSI design and methodologies, and sensory perception at Caltech, Bell Labs, Jet Propulsion Laboratory, Stanford Telecommunications Inc., Xerox PARC, Schlumberger Palo Alto Research, and Apple Computer. He was a visiting associate for 15 years on the Computation and Neural Systems faculty at Caltech, where he worked on sensory modeling research and analog VLSI techniques for signal processing. Next he was chief scientist and vice president of research for Foveon, Inc., which he co-founded in 1997, where he led the advanced development of the Foveon X3 color image sensor technology. Dick presently works in Google Research on machine hearing; at Google, he also led the team that developed the camera systems for Street View and other applications. He is a Fellow of the IEEE and of the ACM.
Development and Deployment of Embedded Vision in Industry: An Update
Jeff Bier (President of BDTI and founder of the Embedded Vision Alliance)
Increasingly powerful, inexpensive programmable processors and image sensors are making it possible to incorporate vision capabilities into a very wide range of electronic products, such as retail point of sale kiosks and signs, personal medical devices, automotive safety systems and smart phones. In this presentation, we’ll provide an update on the development and deployment of embedded vision technology in industry. We’ll highlight some of the newest, most interesting and most promising products incorporating vision capabilities. And we’ll report on important developments in practical enabling technologies, including processors, sensors, development platforms and standards. Finally, we’ll share an early version of a map of the embedded vision industry, under development by the Embedded Vision Alliance.
Jeff Bier is founder of the Embedded Vision Alliance. The Alliance is an industry partnership formed to inspire and empower product creators to build more capable and responsive products through the integration of vision capabilities. The Alliance provides training videos, tutorial articles, code examples, and an array of other resources (all free of charge) on its web site, http://www.Embedded-Vision.com. Jeff is also co-founder and president of BDTI (www.BDTI.com), a trusted resource for independent analysis and specialized engineering services in the realm of embedded digital signal processing technology. Jeff oversees BDTI’s benchmarking and analysis of chips, tools, and other technology. Jeff is also a key contributor to BDTI’s engineering services, which focus on developing optimized software and systems using embedded digital signal processing. Jeff earned his B.S. degree from Princeton University and his M.S. degree from U.C. Berkeley, both in electrical engineering.
Next generation FPGAs and SOCs – How Embedded Systems Can Profit
Felix Eberli, Supercomputing Systems AG, Zurich, Switzerland
New SOCs like the Xilinx Zynq 7045 allow researchers and developers to combine the advantages of writing software for control functionality with accelerators in the FPGA logic for the number crunching. The dual-core ARM Cortex-A9 processor runs at up to 1 GHz, and the FPGA has up to 900 DSP slices, allowing a performance of up to 1,334 GMACs. SCS is porting many algorithms, such as SGM stereo, Stixel clustering, and optical flow, to such devices, allowing new cars to see their environment and react appropriately. The newly developed SCS Zynq 7045 module will allow accelerated development using this technology. As an example project, we will describe the development of a next-generation stereo vision algorithm.
Many real-time stereo vision systems are available on low-power platforms. However, for high-performance global stereo methods, such as those listed in the upper third of the Middlebury database, low-power real-time implementations were still missing. We proposed (ICVS 2009) a real-time implementation of the semi-global matching algorithm, with algorithmic extensions for automotive applications, on a reconfigurable hardware platform (FPGA), resulting in a low power consumption of under 3 W. Today the system is available in a production car.
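For readers unfamiliar with semi-global matching, its core is a per-path cost aggregation recurrence (Hirschmüller's formulation). The NumPy sketch below shows one path of the aggregation; it is an illustrative simplification (array shapes, penalties, and the function name are assumptions), not the speaker's FPGA implementation, which sums several such paths and adds automotive-specific extensions:

```python
import numpy as np

def sgm_aggregate_left_to_right(cost, P1=1.0, P2=8.0):
    """Aggregate a matching-cost volume along one path (left to right),
    following the semi-global matching recurrence:
      L(p,d) = C(p,d) + min(L(p-1,d),
                            L(p-1,d-1)+P1, L(p-1,d+1)+P1,
                            min_k L(p-1,k)+P2) - min_k L(p-1,k)
    cost: (H, W, D) volume; returns the (H, W, D) aggregated volume.
    Full SGM sums this over 8 (or 16) paths; this sketch shows one."""
    H, W, D = cost.shape
    L = np.empty((H, W, D), dtype=np.float64)
    L[:, 0, :] = cost[:, 0, :]
    for x in range(1, W):
        prev = L[:, x - 1, :]                       # (H, D)
        prev_min = prev.min(axis=1, keepdims=True)  # min_k L(p-1,k)
        # neighbouring-disparity terms with the small penalty P1
        up = np.full_like(prev, np.inf); up[:, 1:] = prev[:, :-1] + P1
        dn = np.full_like(prev, np.inf); dn[:, :-1] = prev[:, 1:] + P1
        best = np.minimum(np.minimum(prev, up),
                          np.minimum(dn, prev_min + P2))
        L[:, x, :] = cost[:, x, :] + best - prev_min
    return L

# Sanity check: a constant cost volume is a fixed point of the
# recurrence, so aggregation leaves it unchanged.
flat = np.full((2, 4, 3), 5.0)
L = sgm_aggregate_left_to_right(flat)
```

The subtraction of `min_k L(p-1,k)` keeps the accumulated values bounded, which is exactly what makes fixed-width FPGA datapaths practical for this recurrence.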
Felix Eberli is a Department Head at Supercomputing Systems AG in Zurich, Switzerland (www.scs.ch). After working for Phonak as an ASIC engineer, Felix joined SCS in 2002 and leads the embedded & automotive department. As a contract developer he has expertise in development of next generation driver assistant systems with major automotive OEM and Tier1.
Stereo Vision Algorithms for FPGAs
In recent years, with the advent of cheap and accurate RGBD (RGB + depth) active sensors, such as the Microsoft Kinect and devices based on time-of-flight (ToF) technology, there has been ever-increasing interest in 3D-based applications. At the same time, several effective improvements to passive stereo vision algorithms have been proposed in the literature. Despite these facts, and the frequent deployment of stereo vision in many research activities, this technology is often perceived as bulky, expensive, and not well suited to consumer applications. In this talk, I will review a subset of state-of-the-art stereo vision algorithms that have the potential to fit a target computing architecture based on low-cost field-programmable gate arrays (FPGAs), without additional external devices. Mapping these algorithms onto such an architecture would make RGBD sensors based on stereo vision suitable for a wider class of application scenarios currently not fully addressed by this technology.
Stefano Mattoccia received an M.Sc. in Electronic Engineering and a Ph.D. in Computer Science Engineering from the University of Bologna, Italy, in 1997 and 2002, respectively. Currently he is Assistant Professor at the Department of Computer Science Engineering, School of Engineering and Architecture, University of Bologna. His research interests include computer vision, image processing, and parallel computer architectures, in particular 3D vision and applications for embedded systems. He regularly serves as a reviewer for major international journals and conferences; recently he was co-guest editor for a special issue of the IEEE Journal of Selected Topics in Signal Processing on emerging techniques in 3D, and area chair for the 3D and multimedia track at ICME 2013. He is a member of IEEE and IAPR-GIRPR, and a key member (2010-2014) of the Interest Group (IG) on 3D Rendering, Processing and Communications of the IEEE Multimedia Communication TC.
EVE: A Flexible SIMD Processor for Embedded Vision Applications
Jagadeesh Sankaran, Texas Instruments
The Embedded Vision/Vector Engine (EVE) is a specialized, fully programmable processor for accelerating computer vision algorithms. The architecture’s principal aim is to enable low-latency, low-power, high-performance vision algorithms in cost-sensitive embedded markets. EVE’s memory architecture is unique and differentiated relative to standard processor architectures, allowing a high degree of sustained internal memory bandwidth for compute-intensive algorithms. EVE’s architecture also has built-in features for enhanced safety, which are crucial for developing mission-critical systems. Its custom pipelines and units allow it to accelerate and harness the high levels of data parallelism found in computer vision algorithms. This presentation will review the key processing needs and challenges found in algorithms for the advanced driver assistance systems (ADAS) market. It then motivates the need for a dedicated processor that adds specialized units and pipeline stages to accelerate challenging processing requirements. EVE complements the standard C6000 DSP from Texas Instruments by excelling at low-level and mid-level vision algorithms, freeing the DSP to leverage VLIW and excel at high-level processing algorithms such as classifiers. We will also briefly review the programming paradigms that take advantage of a highly parallel specialized data path from a high-level language, while still giving developers a clear path to developing optimized applications. The combination of DSP and EVE in TI’s SoCs allows developers to harness new levels of performance, drastically reducing the time to market for performance-intensive, safety-critical ADAS applications.
Jagadeesh Sankaran joined Texas Instruments in 1998 as a DSP software and systems application engineer. He has worked on various DSP architectures, including the C6211, C64x architecture and ISA definition, and DM642. He graduated with a doctoral degree in 2003 from the University of Texas at Dallas, with a focus on DSP architectures. His main areas of interest are DSP architectures, instruction sets and memory sub-system behavior, multimedia, video and audio compression algorithms, and computer vision. Since 2009 he has focused on embedded computer vision algorithms, architectures, software, systems, and safety, with a particular focus on advanced driver assistance systems (ADAS) markets. He is a senior member of technical staff (SMTS, 2009), a senior member of the IEEE (2008), and holds 15 granted USPTO patents, with 30 total applications. He currently serves as the principal architect for the Embedded Vision/Vector Engine (EVE), a specialized vision co-processor targeted at embedded computer vision applications such as advanced driver assistance systems in automotive markets.
Consumer robotics: a platform for embedding computer vision in everyday life
Mario E. Munich, iRobot
Advances in the manufacturing of processors, memory, sensors, and mechanical components have empowered a number of low-cost consumer robots. Many of these robots focus on performing very specific, time-consuming, and unappealing tasks such as cleaning your floor, vacuuming your carpet, or cutting your grass. Cameras and vision techniques enable cost-effective solutions to increasingly challenging tasks that will boost the user’s perception of the robot’s intelligence. In this talk, I will present iRobot’s efforts in using vision for accurate simultaneous localization and mapping (SLAM), human-robot interaction, and enhanced perception for consumer robotics. These vision primitives will support a number of novel consumer robotics applications that will make their way into our everyday life.
Mario E. Munich is Vice President of Advanced Development at iRobot in Pasadena, California, where he currently manages the research and advanced development efforts. He was formerly the CTO of Evolution Robotics, a company focused on developing key technology primitives for consumer robotics. He received the degree of Electronic Engineer (with honors) from the National University of Rosario, Argentina, in 1990, and the M.S. and Ph.D. degrees in Electrical Engineering from the California Institute of Technology, Pasadena, in 1994 and 2000, respectively. His Ph.D. work focused on developing novel human-machine interfaces using video technology and computer vision techniques. At Evolution Robotics, he worked on the development of object recognition, mapping, and navigation algorithms for consumer robotics. His research interests include computer vision, autonomous navigation, and human-robot interaction.
Recent years have witnessed a significant increase in the use of embedded systems for vision. Applications range from accurate, performance-centric systems to high-volume, low-cost, lightweight, and energy-efficient consumer devices. Computer vision has been deployed in many applications: for example, video search and annotation, surveillance, computer-aided surgery, gesture and body movement detection in video games, driver assistance in automotive safety, and in-home monitoring of vulnerable persons. Embedded computer vision is part of a growing trend towards developing low-cost “smart sensors” that use local “analytics” to interpret data, passing on relatively high-level alerts or summary information via network connectivity.
Embedded vision applications are built upon advances in vision algorithms, embedded processing architectures, advanced circuit technologies, and new electronic system design methodologies. They are implemented on embedded processing devices and platforms such as field-programmable gate arrays (FPGAs), programmable digital signal processors (DSPs), graphics processing units (GPUs), and various kinds of heterogeneous multi-core devices. They are developed under significant resource constraints on processing, memory, power, size, and communication bandwidth, which pose significant challenges to attaining the required levels of performance and speed, and they frequently exploit the inherent parallelism of the specialized platforms to address these challenges. Given the heterogeneous and specialized nature of these platforms, efficient development methods are an important issue.
The Embedded Vision Workshop (EVW) aims to bring together researchers working on vision problems that share embedded system characteristics. Research papers are solicited in, but not limited to, the following topics:
- Analysis of vision problems specific to embedded systems.
- Analysis of embedded systems problems specific to computer vision.
- Embedded computer vision for robotics
- New trends in programmable processors and their computational models.
- Applications of and algorithms for embedded vision on standard parallelized platforms such as GPUs (PC, embedded and mobile).
- Applications of and algorithms for embedded vision on reconfigurable platforms such as FPGAs.
- Applications of and algorithms for embedded computer vision on programmable platforms such as DSPs and multicore SoCs (e.g., the Cell processor).
- Applications of embedded computer vision on mobile devices including phones.
- Biologically-inspired vision and embedded systems
- Computer vision applications distributed between embedded devices and servers
- Social networking embedded computer vision applications
- Educational methods for embedded computer vision
- User interface designs and CAD tools for embedded computer vision applications
- Hardware enhancements (lens, imager, processor) that impact computer vision applications
- Software enhancements (OS, middleware, vision libraries, development tools) that impact embedded computer vision application
- Methods for standardization and measurement of computer vision functionality as they impact embedded computer vision
- Performance metrics for evaluating embedded systems performance.
- Hybrid embedded systems combining vision and other sensor modalities
Seven previous Workshops on Embedded (Computer) Vision (ECVW and EVW) were held in conjunction with CVPR from 2005 to 2012, except for the fifth, which was held in conjunction with ICCV 2009. These events were very successful. Selected papers from these workshops have been published in two special issues of major journals (EURASIP Journal on Embedded Systems and CVIU) and in a Springer monograph titled Embedded Computer Vision. The workshop has now been renamed Embedded Vision Workshop (EVW) to reflect changes in the field.
Vienna University of Technology
University of Lincoln, UK
Ahmed Nabil Belbachir
AIT Austrian Institute of Technology
University of the West of Scotland
University of Thessaly
Boaz J. Super
Kofi Appiah, Lincoln University
Peter Barnum, Texas Instruments
Sebastiano Battiato, U. di Catania
Faycal Bensaali, University of Hertfordshire
Rita Cucchiara, University of Modena e Reggio Emilia
Orazio Gallo, NVIDIA
Eduardo Gudis, SRI
Antonio Haro, NAVTEQ
Martin Humenberger, AIT Austrian Institute of Technology
David Ilstrup, Honda Research Institute
Rongrong Ji, Columbia University
Kihwan Kim, NVIDIA
Zhu Li, Hong Kong Polytechnic University
Abelardo Lopez-Lagunas, ITESM-Toluca
Larry Matthies, Jet Propulsion Laboratory
Hongying Meng, Brunel University
Darnell Moore, Texas Instruments
Vittorio Murino, Istituto Italiano di Tecnologia, Genova
Rajesh Narasimha, Texas Instruments
Burak Ozer, SVTAnalytics
Hassan Rabah, Nancy University
Bernhard Rinner, University of Klagenfurt, Austria
Sankalita Saha, NASA Ames Research Center
Mainak Sen, Cisco Systems
Vinay Sharma, Texas Instruments
Salvatore Vitabile, University of Palermo
Linda Wills, Georgia Institute of Technology
Ruigang Yang, University of Kentucky
- Paper submission: March 22, 2013 (extended)
- Notification to the authors: April 12, 2013
- Camera ready paper: April 30, 2013
- Workshop: June 24, 2013
Submission and Review Policy
In submitting a manuscript to this workshop, the author(s) acknowledge that no paper substantially similar in content is under review at another conference or workshop during the review period. Please refer to the following files on CVPR 2013 main conference site for detailed formatting instructions:
- LaTeX/Word Templates (tar): http://www.pamitc.org/cvpr13/files/cvpr2013AuthorKit.tgz
- LaTeX/Word Templates (zip): http://www.pamitc.org/cvpr13/files/cvpr2013AuthorKit.zip
A complete paper should be submitted using these blind-submission, review-formatted templates. The page limit is 6 pages, with the option of purchasing up to 2 additional pages. Please use the paper submission website to submit your manuscript. Each paper will be double-blind reviewed by at least two reviewers from the program committee.
Overall Meeting Sponsors