Experienced software engineering leader with a track record of driving
innovation and execution in product and platform
teams with a strong applied-research component.
Currently Senior Director of Engineering at Apple, leading the Video Computer Vision (VCV) organization, a centralized applied research and engineering organization responsible for developing real-time on-device Computer Vision and Machine Perception technologies across Apple products.
Previously led the AI Camera group at Facebook, shipping Oculus Quest and Spark AR computer vision technologies. Prior to that, led software development at Microsoft for HoloLens and Kinect (Xbox 360, Xbox One and Windows).
Combining Body Pose, Gaze, and Gesture to Determine Intention to Interact in Vision-Based Interfaces
Julia Schwarz, Charles Marais, Tommer Leyvand, Scott E. Hudson, Jennifer Mankoff
Vision-based interfaces, such as those made popular by the Microsoft
Kinect, suffer from the Midas Touch problem: every user motion can
be interpreted as an interaction. In response, we developed an
algorithm that combines facial features, body pose and motion to
approximate a user’s intention to interact with the system. We show
how this can be used to determine when to pay attention to a user’s
actions and when to ignore them. To demonstrate the value of our
approach, we present results from a 30-person lab study conducted to
compare four engagement algorithms in single and multi-user
scenarios. We found that combining intention to interact with a
“raise an open hand in front of you” gesture yielded the best
results. The latter approach offers a 12% improvement in accuracy
and a 20% reduction in time to engage over a baseline “wave to
engage” gesture currently used on the Xbox 360.
Exemplar-Based Human Action Pose Correction and
Wei Shen, Ke Deng, Xiang Bai, Tommer Leyvand, Baining Guo, and Zhuowen Tu
The launch of Xbox Kinect has built a very successful computer
vision product and made a big impact on the gaming industry; this
sheds light on a wide variety of potential applications related
to action recognition. The accurate estimation of human poses from
the depth image is universally a critical step. However, existing
pose estimation systems exhibit failures when faced with severe
occlusion. In this paper, we propose an exemplar-based method to
learn to correct the initially estimated poses. We learn an
inhomogeneous systematic bias by leveraging the exemplar information
within a specific human action domain. Our algorithm is illustrated on
both joint-based skeleton correction and tag prediction. In the
experiments, significant improvement is observed over the
contemporary approaches, including what is delivered by the current
Kinect Identity: Technology and Experience
Tommer Leyvand, Casey Meekhof, Yi-Chen Wei, Jian Sun, and Baining Guo
IEEE Computer, vol. 44, no. 4, pp. 94-96. 2011.
This IEEE Computer article is a high-level introduction to how
Kinect performs player identity recognition on the Xbox 360, what we
call 'Kinect Identity'.
Additional details and references to related facial-recognition publications are available here. MSR video is available here.
Data-Driven Enhancement of Facial Attractiveness
Tommer Leyvand, Daniel Cohen-Or, Gideon Dror and Dani Lischinski
ACM SIGGRAPH 2008
In this work we focus on the challenging problem of enhancing the aesthetic appeal (or the attractiveness) of human faces in frontal photographs (portraits), while maintaining close similarity with the original. The key component in our approach is an automatic facial attractiveness engine trained on datasets of faces with accompanying facial attractiveness ratings collected from groups of human raters.
Digital Face Beautification SIGGRAPH 2006, Technical Sketch page (here)
A Machine Learning Predictor of Facial Attractiveness Revealing Human-Like Psychophysical Biases
Amit Kagian, Gideon Dror, Tommer Leyvand, Isaac Meilijson, Daniel Cohen-Or, Eytan Ruppin
Vision Research 48 (2008) 235–243
Psychological studies have strongly suggested that humans share
common visual preferences for facial attractiveness. Here, we
present a learning model that automatically extracts measurements of
facial features from raw images and obtains human-level performance
in predicting facial attractiveness ratings. The machine’s ratings
are highly correlated with mean human ratings, markedly improving on
recent machine learning studies of this task. Simulated
psychophysical experiments with virtually manipulated images reveal
preferences in the machine’s judgments that are remarkably similar
to those of humans. Thus, a model trained explicitly to capture a
specific operational performance criterion implicitly captures basic
human psychophysical characteristics.
Daniel Cohen-Or, Olga Sorkine, Ran Gal, Tommer Leyvand and Ying-Qing Xu
ACM SIGGRAPH 2006
Harmonic colors are sets of colors that are aesthetically pleasing in terms of human visual perception. In this paper, we present a method that enhances the harmony among the colors of a given photograph or of a general image, while remaining faithful, as much as possible, to the original colors. Given a color image, our method finds the best harmonic scheme for the image colors. It then allows a graceful shifting of hue values so as to fit the harmonic scheme while considering spatial coherence among colors of neighboring pixels using an optimization technique. The results demonstrate that our method is capable of automatically enhancing the color "look-and-feel" of an ordinary image.
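The two steps described above (fit a harmonic scheme, then shift hues toward it) can be illustrated with a heavily simplified sketch. It assumes a single-sector template of fixed width, a brute-force search over orientations, and a hard clamp of hues onto the sector. The function names, the 90-degree width, and the saturation-weighted cost are illustrative assumptions, not the paper's templates or shifting function, and the spatial-coherence optimization is omitted entirely.

```python
import numpy as np

def angular_diff(a, b):
    """Signed smallest difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def sector_cost(hues, sats, center, width):
    """Saturation-weighted arc distance of each hue to the sector (0 if inside)."""
    d = np.abs(angular_diff(hues, center))
    outside = np.maximum(d - width / 2.0, 0.0)
    return float(np.sum(outside * sats))

def fit_sector(hues, sats, width=90.0, step=1.0):
    """Brute-force search for the sector orientation with the lowest cost."""
    candidates = np.arange(0.0, 360.0, step)
    costs = [sector_cost(hues, sats, c, width) for c in candidates]
    return float(candidates[int(np.argmin(costs))])

def shift_into_sector(hues, center, width=90.0):
    """Clamp hues onto the sector; hues already inside are left unchanged."""
    d = angular_diff(hues, center)
    clamped = np.clip(d, -width / 2.0, width / 2.0)
    return (center + clamped) % 360.0

# Hypothetical input: three warm, saturated hues and one weakly saturated outlier.
hues = np.array([10.0, 20.0, 30.0, 200.0])
sats = np.array([1.0, 1.0, 1.0, 0.1])
center = fit_sector(hues, sats)
harmonized = shift_into_sector(hues, center)
```

A hard clamp like this would produce visible banding on real images, which is one reason a smooth shift combined with spatial coherence between neighboring pixels matters in practice.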
Interactive Object Segmentation in Video by Fitting Splines to Graph Cuts
Iddo Drori, Tommer Leyvand, Daniel Cohen-Or and Hezy Yeshurun
ACM SIGGRAPH 2004 Posters Session
Object segmentation in image sequences is one of the fundamental problems in computer vision and graphics. This problem is usually addressed either by discrete representations which are currently manifested by graph partitioning techniques, or by continuous methods typically referred to as active contours. In this work we take a unified approach by fitting splines to graph cuts. The strengths of this approach stem from the dual discrete and continuous representations and from allowing the user to refine the result of the cut by fitting a new spline to it and modifying its points.
Video Operations in the Gradient Domain
Iddo Drori, Tommer Leyvand, Shachar Fleishman, Daniel Cohen-Or and Hezy Yeshurun
Technical Report, May 2004
Fusion of image sequences is a fundamental operation in numerous video applications and usually consists of segmentation, matting and compositing. We present a unified framework for performing these operations on video in the gradient domain. Our approach consists of 3D graph cut computation followed by reconstruction of a new 3D vector field by solving the Poisson equation. We demonstrate the applicability of smooth video transitions by fusing pairs for video mosaics, video folding, and video texture synthesis, and demonstrate the applicability of sharp video transitions by video segmentation, video trimap extraction and 3D compositing into a new sequence. Our results demonstrate that our method maintains coherence of the video matte and composite, and avoids temporal artifacts.
Ray Space Factorization for From-Region Visibility
Tommer Leyvand, Olga Sorkine and Daniel Cohen-Or
ACM SIGGRAPH 2003
This paper presents a conservative occlusion culling method based on factorizing the 4D from-region visibility problem into horizontal and vertical components. The visibility of the two components is solved asymmetrically: the horizontal component is based on a parameterization of the ray space, and the visibility of the vertical component is solved by incrementally merging umbrae. The technique is designed so that the horizontal and vertical operations can be efficiently realized together by modern graphics hardware.
Advanced Topic in Computer Graphics / Spring 2004: Exercise 1 - Poisson Image Editing
This exercise is an introduction to gradient-domain image editing. We start with the simpler smooth image completion operation (an example input/output pair is on the left). We then describe the Poisson image cloning technique, which clones pixel gradients instead of pixel values and usually results in smoother blending. The exercise material includes the presentation slides and full solution source code.
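As a rough illustration of cloning gradients instead of pixel values, the sketch below solves the discrete Poisson equation on a grayscale image with plain Jacobi iteration. It is not the exercise's solution code: the function names, the iteration count, and the choice of Jacobi over a sparse linear solver are assumptions made for brevity. It also assumes `mask` is a boolean array that does not touch the image border.

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian; border pixels are left at zero."""
    lap = np.zeros_like(img, dtype=float)
    lap[1:-1, 1:-1] = (
        img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    return lap

def poisson_clone(src, dst, mask, iterations=2000):
    """Blend src into dst inside mask by matching the Laplacian of src.

    Pixels outside the mask keep their dst values and act as the
    boundary condition for the masked region.
    """
    src = src.astype(float)
    dst = dst.astype(float)
    guide = laplacian(src)  # target Laplacian taken from the source
    out = dst.copy()
    for _ in range(iterations):
        neighbours = np.zeros_like(out)
        neighbours[1:-1, 1:-1] = (
            out[:-2, 1:-1] + out[2:, 1:-1] + out[1:-1, :-2] + out[1:-1, 2:]
        )
        # Jacobi update for: sum(neighbours) - 4 * f = guide
        updated = (neighbours - guide) / 4.0
        out[mask] = updated[mask]
    return out
```

Jacobi iteration converges slowly on large regions, so a sparse direct or multigrid solver is the usual choice for real images. The fixed iteration count here is only adequate for small masks.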
CityGen - Procedural Urban Model Generator
CityGen is a procedural 3D model generator for creating random urban models. These models are generated from an XML construction file using several simple operations and random inputs. Developed as a side project of my "Ray-Space Factorization for From-Region Visibility" paper.
© 2003-2020 Tommer Leyvand
Last updated April 2020