The 10th IEEE Embedded Computer Vision Workshop
28 June, 2014,
Columbus, Ohio, USA
Held in conjunction with IEEE CVPR 2014
Call for Papers: (pdf)
Keynote and Invited Speakers
- AM Keynote: “Project Tango: Giving Mobile Devices a Human-Scale Understanding of Space and Motion”
Johnny Chung Lee, Technical Program Lead, Google
- PM Keynote: “The Sightfield: Visualizing Vision, Sensing Sensors, and seeing their capacity to see”
Steve Mann, Univ. of Toronto
- Invited talk: “Enabling machines to perceive the world like you do or better!”
Eugenio Culurciello, Purdue University
- Invited talk: “OpenVX: The Computer Vision Hardware Abstraction Layer”
Victor Eruhimov, Itseez
- Invited talk: “Embedded Vision Challenges for Implementing Augmented Reality Applications”
Peter Meier, CTO, Metaio
Location: C 220-222
Schedule: Full day
|Time||Session||Speaker and Title|
|8:45-9:00||Opening Remarks||Goksel Dedeoglu and Fridtjof Stein, General Chairs|
|9:00-9:45||AM Keynote||Johnny Chung Lee, Technical Program Lead, Google, “Project Tango: Giving Mobile Devices a Human-Scale Understanding of Space and Motion”|
|10:00-10:20||AM Orals (#1)||F. Conti, A. Pullini, L. Benini, “Brain-inspired Classroom Occupancy Monitoring on a Low-Power Mobile Platform”|
|10:20-10:40||AM Orals (#2)||O. Bilaniuk, E. Fazl-Ersi, R. Laganiere, C. Xu, D. Laroche, C. Moulder, “Fast LBP Face Detection on SIMD Architectures”|
|10:40-11:40||Invited Talk||Eugenio Culurciello, Purdue Univ., Deep Learning Architectures, “Enabling machines to perceive the world like you do or better!”|
|11:40-12:00||Lightning Talks||2-minute-per-poster/demo briefs|
|12:30-2:00||Posters and Demos||POSTERS|
- F. Eibensteiner, J. Scharinger, J. Kogler, “A High-Performance Hardware Architecture for a Frameless Stereo Vision Algorithm Implemented on a FPGA Platform”
- R. Brockers, M. Humenberger, S. Weiss, “Towards autonomous navigation of miniature UAV”
- B. Ozer, M. Wolf, “A Train Station Surveillance System: Challenges and Solutions”
- E. Rainey, J. Villarreal, G. Dedeoglu, K. Pulli, T. Lepley, F. Brill, “Addressing System-Level Optimization with OpenVX Graphs”
- S. Sathyanarayana, Ravi Kumar Satzoda, S. Sathyanarayana, S. Thampibillai, “A Compute-efficient Algorithm for Robust Eyebrow Detection”
- G. Zhou, A. Liu, K. Yang, T. Wang, Z. Li, “An Embedded Solution to Visual Mapping for Consumer Drones”
- B. Zhang, V. Appia, I. Pekkucuksen, Y. Liu, A. Batur, P. Shastry, S. Liu, S. Sivasankaran, K. Chitnis, “A surround view camera solution for embedded systems”
- R. Ladig, K. Shimonomura, “FPGA-based fast response image analysis for autonomous or semi-autonomous indoor flight”
- S. Gehrig, N. Schneider, U. Franke, “Exploiting Traffic Scene Disparity Statistics for Stereo Vision”
- V. Gokhale, J. Jin, A. Dundar, B. Martini, E. Culurciello, “A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks”
- R. K. Satzoda, M. Trivedi, “Efficient Lane and Vehicle detection with Integrated Synergies (ELVIS)”
- V. Gokhale, J. Jin, A. Dundar, B. Martini, E. Culurciello, “A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks”
- Steve, Stephanie, and Christina Mann; Ryan Janzen; Arkin Ai; Rifdhan Nazeer; Mir Adnan Ali; Pete Scourboutakos; Govind Peringod; and Kyle Simmons, “Surveillicopter Drone Swarms and Surveilluminescent Smart Dust Motes for Surveillight-based Surveillometry”, Tinquiry GENIUSchool.
- Johnny Lee and the Project Tango team, “Project Tango – advancing the state of 3D tracking and sensing on mobile devices”
|2:00-3:00||PM Keynote||Steve Mann, Univ. of Toronto, “The Sightfield: Visualizing Vision, Sensing Sensors, and seeing their capacity to see”|
|3:00-3:40||Invited Talk||Victor Eruhimov, Itseez, “OpenVX: The Computer Vision Hardware Abstraction Layer”|
|3:40-4:00||Coffee Break|| |
|4:00-4:20||PM Orals (#1)||L. Baraldi, F. Paci, G. Serra, L. Benini, R. Cucchiara, “Gesture Recognition in Ego-Centric Videos using Dense Trajectories and Hand Segmentation”|
|4:20-4:40||PM Orals (#2)||R. K. Satzoda, M. Trivedi, “Efficient Lane and Vehicle detection with Integrated Synergies (ELVIS)”|
|4:40-5:20||Invited Talk||Peter Meier, CTO, Metaio, “Embedded Vision Challenges for Implementing Augmented Reality Applications”|
|5:20-5:30||Closing Remarks|| |
Keynote and Speaker Information
AM Keynote: Project Tango: Giving Mobile Devices a Human-Scale Understanding of Space and Motion
Johnny Chung Lee, Technical Program Lead, Google
Project Tango is a focused effort to harvest research from the last decade of work in computer vision and robotics and concentrate that technology into a mobile platform. It uses computer vision and advanced sensor fusion to estimate the position and orientation of the device in real time, while simultaneously generating a 3D map of the environment. We will discuss the underlying technologies that make this possible, such as the hardware sensors and some of the software algorithms. We will also show demonstrations of how the technology could be used in both gaming and non-gaming applications. This is just the beginning and we hope you will join us on this journey. We believe it will be one worth taking.
Johnny Lee is a Technical Program Lead at the Advanced Technology and Projects (ATAP) group at Google. He leads Project Tango, which is a focused effort to bring computer vision and advanced sensor fusion to mobile platforms. Previously, he helped Google X explore new projects as Rapid Evaluator and was a core algorithms contributor to the original Xbox Kinect. His YouTube videos demonstrating Wii remote hacks have surpassed 15 million views, and his talk on them became one of the most popular TED talk videos. In 2008, he received his PhD in Human-Computer Interaction from Carnegie Mellon University, and he has been recognized in MIT Technology Review’s TR35.
PM Keynote: The Sightfield: Visualizing Vision, Sensing Sensors, and seeing their capacity to see
Steve Mann, Univ. of Toronto
Computer vision is being embedded in toilets, urinals, handwash faucets, doors, light switches, thermostats, and many other objects that “watch” us. Camera-based motion-sensing streetlights are being installed throughout entire cities, making embedded vision ubiquitous. Technological advancement is leading to increased performance combined with miniaturization that is making vision sensors less visible: vision is “seeing” better while it is becoming harder for us to see it. I will describe the concept of a “sightfield”, a time-reversed lightfield that, when used with time-exposure photography, can make vision (the capacity to see) visible. In particular, I will describe the concept of abakography, and how it can be used to make visible the otherwise invisible rays of sight that emanate from sensing apparatus. The sightfield is to a lightfield as holes are to electrons, or as coldness is to heat, and is a useful physical concept for visualizing, understanding, and quantifying vision.
Steve Mann, PhD (MIT ’97), P. Eng., SMIEEE, tenured full professor, University of Toronto, Electrical Engineering, and Computer Science, is Associate Editor IEEE T&S, General Chair, IEEE ISTAS13, and winner of the 2004 Coram International Sustainable Design Award. He is currently Chief Scientist of Meta-View, makers of Spaceglasses. 35 years ago he invented the Digital Eye Glass and is the inventor of the MannGlass(TM) HDR welding glass, and the EyeTap. Mann is widely recognized as “The father of the wearable computer” (IEEE ISSCC2000) and in the words of MIT Media Lab Director, “brought the seed” in 1991 that founded the MIT Wearable Computing Project, as its first member. He invented HDR (High Dynamic Range) Imaging (U. S. Pat. 5828793). He coined the terms “Natural User Interface” (published 2001) and “Reality User Interface” to describe these new forms of human-computer interaction.
His work has been shown at the Smithsonian Institution, The Science Museum, Museum of Modern Art (MoMA in New York), Stedelijk Museum (Amsterdam), Triennale di Milano, Austin Museum of Art, and San Francisco Art Institute. He has been featured by news organizations including AP News, New York Times, LA-Times, Time, Newsweek, Fortune, WiReD, NBC, ABC, CNN, David Letterman (#6 on Letterman’s Top Ten), CBC-TV, CBS, Scientific American, Scientific American Frontiers, Discovery Channel, Byte, Reuters, New Scientist, Rolling Stone, and BBC. His award-winning documentary cyborglog ShootingBack and the ideas from his recent book “CYBORG: Digital Destiny and Human Possibility in the Age of the Wearable Computer” (Random House Doubleday, 2001) inspired a 35mm feature-length motion picture about his life, said by P.O.V. to be Canada’s most important film of the year. Mann and his students Chris Aimone and James Fung were founding members of InteraXon, a Canadian company commercializing cyborg technology developed by Mann and his students. InteraXon created a large-scale public art installation as the flagship project of the Ontario Pavilion for the duration of the Olympics, February 12-28, 2010.
Invited talk: Enabling machines to perceive the world like you do or better!
Eugenio Culurciello, Purdue University
Soon devices will perceive the world like we do, and they will start to become useful and save us from the tedious tasks we aim to eliminate. They will drive our cars, find interesting things for us, listen to us and see for us. Soon enough they might make us more than human. In this talk we present nn-X: a state-of-the-art vision processor implemented in programmable logic and embedded mobile processors. nn-X gives cellular phones and portable computing devices the ability to visually perceive the environment. In particular, nn-X can accelerate large hierarchical deep neural networks, currently the state-of-the-art model for understanding complex data, and used by all the largest data companies: Google, Facebook, Yahoo, Baidu, etc. We will show measured results on the performance of nn-X implemented in both programmable digital hardware (FPGA) and on custom micro-chips (ASIC). Applications of such a system include smartphones, appliances, computers, robotics, and autonomous cars, to name a few. More information at: http://teradeep.com/.
Eugenio Culurciello (S’97-M’99) received the Ph.D. degree in Electrical and Computer Engineering in 2004 from the Johns Hopkins University, Baltimore, MD. He is an Associate Professor in the Weldon School of Biomedical Engineering and an Associate Professor of Psychological Sciences in the College of Health & Human Sciences at Purdue University, where he directs the ‘e-Lab’ laboratory. His research focuses on artificial vision systems, deep learning, and hardware acceleration of vision algorithms. His research interests include analog and mixed-mode integrated circuits for biomedical instrumentation, synthetic vision, bio-inspired sensory systems and networks, biological sensors, and silicon-on-insulator design. Eugenio Culurciello is the recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE) and the Young Investigator Program award from ONR, is a Distinguished Lecturer of the IEEE (CASS), and is the author of the books “Silicon-on-Sapphire Circuits and Systems, Sensor and Biosensor interfaces”, published by McGraw Hill in 2009, and “Biomedical Circuits and System, Integrated Instrumentation”, published by Lulu in 2013. In 2013 Dr. Culurciello founded TeraDeep, a company focused on the design of mobile co-processors and neural network hardware for understanding images and videos.
Invited talk: OpenVX: The Computer Vision Hardware Abstraction Layer
Victor Eruhimov, Itseez
OpenVX is a new application programming interface (API) from the Khronos Group. OpenVX enables performance- and power-optimized vision algorithms for use cases such as face, body and gesture tracking, smart video surveillance, automatic driver assistance systems, object and scene reconstruction, augmented reality, visual inspection, robotics and more. OpenVX enables significant implementation innovation while maintaining a consistent API for developers. OpenVX can be used directly by applications or to accelerate higher-level middleware with platform portability. OpenVX will have extensive conformance tests to complement a focused, tightly defined, finalized specification, giving consistent and reliable operation across multiple vendors and platforms and making OpenVX an ideal foundation for shipping production vision applications. OpenVX complements the popular OpenCV open source vision library.
Victor Eruhimov is the CEO of Itseez, a computer vision R&D company. He is an entrepreneur with an extensive background in computer vision. Prior to co-founding Itseez, Victor worked as a project manager and senior research scientist at Intel Corporation. He is the author of more than 25 papers in the areas of computer vision and machine learning as well as several US and international patents. Victor has also been involved in several open source projects, including as a developer of the OpenCV library. Since 2012 Victor has served as chairman of the OpenVX working group at Khronos, which develops the open standard for the computer vision industry.
Invited talk: Embedded Vision Challenges for Implementing Augmented Reality Applications
Peter Meier, CTO, Metaio
Augmented Reality (AR) applications hold great promise for mobile users in the near future, but mobile devices cannot yet deliver on this promise. Even the quite substantial processing capabilities of modern mobile devices are not at the level needed for running the latest object-recognition, tracking, or rendering methods – and the resultant power consumption drains the battery within an hour. In order to ensure a great user experience, AR algorithms have to cope with real-world parameters like illumination, jitter, scale, rotation, and noise. The fusion of different optimized AR technologies like 2D or 3D feature tracking, edge detection, gravity awareness or SLAM should be able to handle these issues to the satisfaction of the end-user.
But to implement those technologies, hardware (HW) architectures have to evolve in parallel to provide efficient resources that can keep power consumption at an acceptable level. One answer is an embedded heterogeneous system (HMP) with highly specialized HW blocks and dedicated data buses and memory architectures, like the AREngine, hardware IP designed by Metaio. Though an HMP is a great solution on the HW level, it has to be complemented by intelligent programming frameworks for scheduling and resource management. Combining optimized tracking technologies with efficient HW IP and easy to use software development tools is the foremost challenge of the decade for AR, and has to be solved to ensure seamless application development for various AR applications and across multiple mobile platforms. The talk will highlight our findings in optimizing and combining the above technologies into a production-ready solution for mobile devices which has been the core focus of Metaio’s R&D department.
Peter Meier received his Master’s degree in manufacturing engineering from the Technical University of Munich before founding Metaio in 2002 together with Dr. Thomas Alt. Since then, he has served as CTO and has helped to make Metaio one of the leading companies for the development and licensing of Augmented Reality and mobile vision technologies. Metaio has over 100 employees in its locations in Munich, Dallas and San Francisco. Peter Meier actively takes part in expanding Metaio’s technology and IP portfolio. He is known as one of the top technical experts in AR, involved in over 60 AR patents and many game-changing applications of computer vision for companies like Lego, Volkswagen and IKEA. He shaped junaio, Metaio’s open AR platform, and works hard to make AR and CV an established user interface technology, used daily.
Best Paper Awards
We thank Nvidia for sponsoring the Best Paper Awards, which were given to the best papers presented in the oral and poster sessions of the workshop.
Best oral paper:
F. Conti, A. Pullini, L. Benini, “Brain-inspired Classroom Occupancy Monitoring on a Low-Power Mobile Platform”
Best poster paper:
G. Zhou, A. Liu, K. Yang, T. Wang, Z. Li, “An Embedded Solution to Visual Mapping for Consumer Drones”
- Paper submission: March 17, 2014
- Notification to the authors: April 21, 2014
- Camera ready paper: May 9, 2014 (final paper preparation and submission instructions)
Submission and Review Policy
In submitting a manuscript to this workshop, the author(s) acknowledge that no paper substantially similar in content is under review at another conference or workshop during the review period. Please refer to the following files on the CVPR 2014 main conference site for detailed formatting instructions:
- LaTeX/Word Templates (tar): http://www.pamitc.org/cvpr14/files/cvpr2014AuthorKit.tgz
- LaTeX/Word Templates (zip): http://www.pamitc.org/cvpr14/files/cvpr2014AuthorKit.zip
A complete paper should be submitted using these blind-submission review-formatted templates. The page limit is 6 pages, with the option of purchasing 2 additional pages. Please use the paper submission website (https://cmt.research.microsoft.com/EVW2014/) to submit your manuscript. Each paper will be double-blind reviewed by at least two reviewers from the program committee.
University of Bologna, Italy
Vienna University of Technology
University of Lincoln, UK
Ahmed Nabil Belbachir
AIT Austrian Institute of Technology
University of the West of Scotland
Senyo Apewokin, Texas Instruments
Kofi Appiah, Lincoln U.
Sebastiano Battiato, U. of Catania
Moshe Ben-Ezra, Microsoft
Faycal Bensaali, Qatar University
Terry Boult, U. of Colorado
Xin Chen, Navteq
Rita Cucchiara, U of Modena and Reggio Emilia
Orazio Gallo, nVidia
Antonio Haro, Navteq
Martin Humenberger, AIT
David Ilstrup, Honda Research Institute
Masatoshi Ishikawa, U. of Tokyo
Rongrong Ji, Columbia U.
Kihwan Kim, nVidia
Kevin Koeser, ETH Zurich
Zhu Li, Hong Kong Polytechnic U.
Abelardo Lopez-Lagunas, ITESM
Larry Matthies, JPL
Darnell Moore, Texas Instruments
Andre Morin, Lyrtech
Vittorio Murino, Istituto Italiano di Tecnologia
Rajesh Narasimha, Metaio
Zoran Nikolic, Texas Instruments
Burak Ozer, Verificon
Hassan Rabah, University of Lorraine
Bernhard Rinner, Klagenfurt U.
Sankalita Saha, NASA
Mainak Sen, Cisco Systems
Vinay Sharma, Apple
Dabral Shashank, Texas Instruments
Salvatore Vitabile, U. of Palermo
Linda Wills, Georgia Tech
Ruigang Yang, U. of Kentucky
Call For Papers
Recent years have witnessed a significant increase in the use of embedded systems for vision. Applications range from accurate, performance-centric systems to high-volume, low-cost, lightweight, and energy-efficient consumer devices. Computer vision has been deployed in many applications, for example, in video search and annotation, surveillance, computer-aided surgery, gesture and body movement detection in video games, driver assistance for automotive safety, and in-home monitoring of vulnerable persons. Embedded computer vision is part of a growing trend towards developing low-cost “smart sensors” that use local “analytics” to interpret data, passing on relatively high-level alerts or summary information via network connectivity. Embedded vision applications are built upon advances in vision algorithms, embedded processing architectures, advanced circuit technologies, and new electronic system design methodologies. They are implemented on embedded processing devices and platforms such as field programmable gate arrays (FPGAs), programmable digital signal processors (DSPs), graphics processing units (GPUs), and various kinds of heterogeneous multi-core devices. They are developed under significant resource constraints of processing, memory, power, size, and communication bandwidth that pose significant challenges to attaining required levels of performance and speed, and they frequently exploit the inherent parallelism of the specialized platforms to address these challenges. Given the heterogeneous and specialized nature of these platforms, efficient development methods are an important issue.
The Embedded Vision Workshop (EVW) aims to bring together researchers working on vision problems that share embedded system characteristics. Research papers are solicited in, but not limited to, the following topics:
- Analysis of vision problems specific to embedded systems.
- Analysis of embedded systems problems specific to computer vision.
- Embedded computer vision for robotics
- New trends in programmable processors and their computational models.
- Applications of and algorithms for embedded vision on standard parallelized platforms such as GPUs (PC, embedded and mobile).
- Applications of and algorithms for embedded vision on reconfigurable platforms such as FPGAs.
- Applications of and algorithms for embedded computer vision on programmable platforms such as DSPs and multicore SoCs.
- Applications of embedded computer vision on mobile devices including phones.
- Biologically-inspired vision and embedded systems
- Computer vision applications distributed between embedded devices and servers
- Social networking embedded computer vision applications
- Educational methods for embedded computer vision
- User interface designs and CAD tools for embedded computer vision applications
- Hardware enhancements (lens, imager, processor) that impact computer vision applications
- Software enhancements (OS, middleware, vision libraries, development tools) that impact embedded computer vision applications
- Methods for standardization and measurement of computer vision functionality as they impact embedded computer vision
- Performance metrics for evaluating embedded systems performance.
- Hybrid embedded systems combining vision and other sensor modalities
All of the previous Workshops on Embedded (Computer) Vision (ECVW and EVW) were held in conjunction with CVPR, with the exception of the fifth, which was held in conjunction with the 2009 ICCV. These events were very successful. Selected papers from these workshops have been published in two special issues of major journals (EURASIP Journal on Embedded Systems and CVIU) and in a Springer monograph titled Embedded Computer Vision. The Workshop is now renamed Embedded Vision (EVW) to reflect changes in the field.