August 16 & 17, 2019 [ValleyML.ai] State of AI and ML-Summer 2019

August 21, 2019 5G & Beyond: Perspectives on URLLC
Vasuki Narasimha Swamy

Aug 29 to Oct 17, 2019 [ValleyML.ai] Machine Learning and Deep Learning Boot Camp

June 14, 2019 IEEE SPS MIVisionX Inference Tutorial

In this tutorial, we will learn how to run inference efficiently using OpenVX and the OpenVX Extensions. OpenVX is an open, royalty-free standard for cross-platform acceleration of computer vision applications, designed by the Khronos Group to facilitate portable, optimized, and power-efficient processing for vision algorithms. The tutorial will go over each step required to convert a pre-trained neural net model into an OpenVX graph and run this graph efficiently on any target hardware. We will also learn about AMD MIVisionX, which delivers an open-source implementation of OpenVX and the OpenVX Extensions along with a Neural Net Model Compiler & Optimizer.

Tutorial Requirements:
Install the appropriate terminal program on your laptop:
Windows laptop – install MobaXterm – https://mobaxterm.mobatek.net/
Linux laptop – install the SSH client – sudo apt install openssh-client
Mac – install XQuartz – https://www.xquartz.org/
Intro Tutorial


May 21, 2019 Rethink Software
Chris Doig

May 09, 2019 Intelligent Ear-Level Devices for Hearing Enhancement and Health and Wellness Monitoring
Tao Zhang, Ph.D., Director of Signal Processing Research Department, Starkey Hearing Technologies

May 01, 2019 The future of lossy image compression: what machines can learn from humans
Irena Fischer-Hwang

April 25, 2019 Flexible Radios And Flexible Networks
Dr. Alyssa Apsel, Cornell University

April 18, 2019 Plasticine: A Reconfigurable Dataflow Architecture for Machine Learning/Software 2.0
Prof. Kunle Olukotun, Stanford University/SambaNova Systems
Event Photos

April 11, 2019 Enabling Wireless Autonomous Systems Using 5G
Dr. Nageen Himayat, Intelligent Distributed Edge Networks Labs, Intel Corporation

March 20, 2019 Quantization Noise
Prof. Bernard Widrow, Stanford University

March 20, 2019 Deep knockoffs machines for replicable selections
Yaniv Romano

July 31, 2018 Generative Adversarial Network and its Applications to Human Language Processing
Professor Hung-Yi Lee, Assistant Professor of the Department of Electrical Engineering of National Taiwan University

The generative adversarial network (GAN) is a new idea for training models, in which a generator and a discriminator compete against each other to improve the generation quality. Recently, GANs have shown amazing results in image generation, but their applications to text and speech processing are still limited. In this talk, I will demonstrate the applications of GANs to unsupervised abstractive summarization and a sentiment-controllable chatbot. I will also talk about research directions towards unsupervised speech recognition with GANs.
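
For a concrete picture of the generator/discriminator game, here is a minimal adversarial training loop in PyTorch. The architectures, hyperparameters, and stand-in data are illustrative assumptions, not the models from the talk.

    import torch
    import torch.nn as nn

    latent_dim, data_dim = 16, 64
    G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
    D = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    def train_step(real):
        b = real.size(0)
        # Discriminator step: push real toward label 1, generated toward 0.
        fake = G(torch.randn(b, latent_dim)).detach()
        loss_d = bce(D(real), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # Generator step: produce samples the discriminator labels as real.
        loss_g = bce(D(G(torch.randn(b, latent_dim))), torch.ones(b, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    for step in range(1000):
        train_step(torch.randn(32, data_dim) * 0.5 + 1.0)  # stand-in "real" data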

YouTube Playlist


July 19, 2018 Review of LiDAR, Localization and Object Processing for Safe Autonomous Systems
Dr. Kiran Gunnam, Distinguished Engineer, Western Digital

June 7, 2018 LiDAR training data best practices
Mohammad Musa, Founder and CEO at Deepen AI

Accurate LiDAR classification and segmentation are required for developing critical ADAS and autonomous-vehicle components. Mainly, they are required for high-definition mapping and for developing perception and path/motion planning algorithms. This talk will cover best practices for accurately annotating and benchmarking your AV/ADAS models against LiDAR ground-truth training data.
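
When benchmarking segmentation output against annotated ground truth, per-class intersection-over-union over the point labels is a common starting metric. A minimal sketch (an illustrative baseline, not Deepen AI's tooling):

    import numpy as np

    def per_class_iou(pred, gt, n_classes):
        """pred, gt: one integer class label per LiDAR point."""
        ious = []
        for c in range(n_classes):
            inter = np.sum((pred == c) & (gt == c))
            union = np.sum((pred == c) | (gt == c))
            ious.append(inter / union if union else np.nan)  # NaN for absent classes
        return ious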


May 17, 2018 Augmenting Cognition through Data Visualization
Alark Joshi, Associate Professor in the Department of Computer Science at the University of San Francisco


May 1, 2018 Small, Medium, and Big Data: Application of Machine Learning Methods to the Solution of Real-World Imaging and Printing Problems
Jan P. Allebach, Hewlett-Packard Distinguished Professor of Electrical and Computer Engineering, Purdue University
 Slides

To provide a context for the discussion to follow, I will first briefly discuss the general characteristics of machine learning. Then, I will describe a series of problems that illustrate the successful application of machine learning methods to the solution of problems in the printing and imaging space. These problems range from the development of detailed microscale models for printer behavior; to algorithms for print and image quality assessment; to algorithms for predicting aesthetic quality of fashion photographs; to algorithms for detection and recognition of people in home and office settings. The algorithms take a variety of different forms ranging from linear regression, context-dependent linear regression, and context-dependent linear regression augmented by stochastic sample function generation; to maximum likelihood estimation; to support vector machines; to convolutional neural networks. The size of the data sets used to train these algorithms ranges from tens of images to tens of thousands of images.


Apr 11, 2018 Computer Vision at the Edge and in the Cloud: Architectures, Algorithms, Processors, and Tools
Jeff Bier, Founder, Embedded Vision Alliance and President, BDTI
Slides

Computer vision is rapidly becoming ubiquitous. From autonomous robots, vehicles and drones to smart buildings to home assistants that can advise you on your fashion choices, vision is showing up everywhere. A key architectural choice underlies this ubiquity: should vision processing be done at the edge, in the cloud, or a hybrid combination of the two? Jeff Bier, Founder of the Embedded Vision Alliance, will discuss the benefits and trade-offs of edge, cloud, and hybrid models, and when you should consider each option. Within this edge-cloud framework, Jeff will also provide an update on important recent developments in the technologies enabling vision, including processors, sensors, algorithms, development tools, services and standards. Jeff will also highlight some of the most interesting and most promising end-products and applications incorporating vision capabilities.


Mar 15, 2018 50 Years of Computer Architecture: From Mainframe CPUs to DNN TPUs and Open RISC-V
Prof. David Patterson, Google, Mountain View, CA, University of California, Berkeley, CA

Mar 12, 2018 An Exciting Course Ahead in Signal Processing
Chandrakant D. Patel, HP Senior Fellow and Chief Engineer

The 21st century cyber physical age is about the seamless integration of cyber and physical systems. The growth of cyber physical systems is motivated by the need to address the challenges stemming from the socio-economic megatrends such as population increase, changing demographics, rapid urbanization and resource constraints. Cyber physical applications – from healthcare and factories to city scale supply side infrastructure such as power, water, waste, transport – present an exciting opportunity to the signal processing community. With respect to cyber-physical signal processing applications, the key to success lies in taking the learning from today, image signal processing as an example, in combination with the signal processing of the machine age centered around domain knowledge.

Signal processing in the 1980s was about availing structured data streams from physical systems, and using various instruments and analysis techniques, to infer and acting upon the data. The workflow often began with an understanding of the domain – dynamics of physical structures as an example – and running experiments to prove or disprove a hypothesis. In the latter part of 20 th century cyber age, the rise of image processing hardware and the ability to manage large amounts of data, led to exciting advancements in learning from data from visual systems. In this talk, I will make the case that cyber physical systems of tomorrow will require us to take a holistic perspective covering domain knowledge, design and topology associated with computing hardware, together with data mining and knowledge discovery.

We will have to leverage our past to create the future.


Mar 8, 2018 Running Sparse and Low-Precision Neural Networks: An Interactive Play between Software and Hardware
Hai “Helen” Li, Associate Professor, Department of Electrical and Computer Engineering, Duke University

Feb 07, 2018 Building Intuition into Adversarial Examples in Deep Learning
Dr. Simant Dube, AI Winning Solutions

In recent years, deep neural networks have found wide-ranging applications in the general field of AI. However, it has been shown that deep learning suffers from adversarial examples, which have puzzled AI scientists. For acceptance of deep learning based AI solutions, it is important to understand this intriguing phenomenon and to eliminate it. Furthermore, there has been debate on social media about the urgent need to bring rigor to the field of deep learning. Inspired by that, in this talk we present novel results which explain adversarial examples in computer vision and which also pave the way for future progress in AI.
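
For intuition about how such examples arise, here is a sketch of the classic fast gradient sign method (FGSM) of Goodfellow et al., a standard construction rather than the talk's own results:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eps=0.03):
        """Perturb x by eps in the direction that increases the classification loss."""
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), label).backward()
        x_adv = x + eps * x.grad.sign()   # one signed-gradient step often flips the prediction
        return x_adv.clamp(0, 1).detach()

    # Toy usage with a linear classifier on inputs in [0, 1]:
    model = torch.nn.Linear(10, 3)
    x_adv = fgsm(model, torch.rand(1, 10), torch.tensor([2]))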


Dec 09, 2017 Walk with IEEE Fellows
Stanford Dish Loop Trail, meet at the first intersection close to Stanford Ave entrance

This December, we are hosting a “Walk with IEEE Fellows” event, and we would love to invite you to join us for a walk on the Stanford Dish Loop Trail. Besides being good exercise, the walk is a great opportunity to connect with local IEEE Fellows and members, share insights, and get inspired. Located right next to the Stanford campus, in the heart of Silicon Valley, the Stanford Dish Loop Trail gets nice and green in winter. From certain spots, you can overlook the whole Stanford campus and the San Francisco Bay.



Dec 4, 2017 Deep Learning in Biomedicine and Genomics
Dr. Mark DePristo, Google

Nov 16-17, 2017 IEEE Artificial Intelligence Symposium
Santa Clara, CA

Nov 1, 2017 Learning with limited supervision
Dr. Stefano Ermon, Stanford University

Sep 26, 2017 Hebbian Learning and the LMS Algorithm
Prof. Bernard Widrow, Department of Electrical Engineering, Stanford University

Hebb’s learning rule can be summarized as “neurons that fire together wire together.” “Wire together” means that the weight of the synaptic connection between any two neurons is increased when both are firing. Hebb’s rule is a form of unsupervised learning. Hebb introduced the concept of synaptic plasticity, and his rule is widely accepted in the field of neurobiology.
When imagining a neural network trained with this rule, a question naturally arises. What is learned with “fire together wire together,” and what purpose could this rule actually have? Not having a good answer has long kept Hebbian learning from engineering applications. The issue is taken up here and possible answers will be forthcoming.
Strictly following Hebb’s rule, weights could only increase, never decrease. This would eventually cause all weights to saturate, yielding a useless network. When extending Hebb’s rule to make it workable, it was discovered that extended Hebbian learning could be implemented by means of the LMS algorithm. The result was the Hebbian-LMS algorithm.

The LMS (least mean square) algorithm was discovered by Widrow and Hoff in 1959, ten years after Hebb’s classic book first appeared. The LMS algorithm optimizes with gradient descent. It is the most widely used learning algorithm today. It has been applied in telecommunications systems, control systems, signal processing, adaptive noise cancelling, adaptive antenna arrays, etc. It is at the foundation of the backpropagation algorithm of Paul Werbos.

Hebb’s rule notwithstanding, the nature of the learning algorithm(s) that adapt and control the strength of synaptic connections in animal brains is for the most part unknown. The biochemistry of synaptic plasticity is largely understood, but the overall control algorithm is not. A solution to this mystery might be the Hebbian-LMS algorithm, a control process for unsupervised training of neural networks that perform clustering. Considering the structure of neurons, synapses, and neurotransmitters, the electrical and chemical signals necessary for the implementation of the Hebbian-LMS algorithm seem to be all there. Hebbian-LMS seems to be a natural algorithm. It is proving to be a simple, useful algorithm that is easy to make work. Neuron-to-neuron connections are as simple as can be. All this raises a question: could a brain, or a major portion of a brain, be implemented with basic building blocks that perform clustering? Is clustering nature’s fundamental neurological building block?
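
For reference, the supervised LMS update itself is only a few lines; a NumPy sketch of the classic algorithm (not the Hebbian-LMS variant presented in the talk) is:

    import numpy as np

    def lms(x, d, n_taps=8, mu=0.01):
        """Adapt FIR weights so the filter output tracks the desired signal d."""
        w = np.zeros(n_taps)
        y = np.zeros(len(x))
        for n in range(n_taps - 1, len(x)):
            window = x[n - n_taps + 1:n + 1][::-1]  # [x[n], x[n-1], ...]
            y[n] = w @ window
            e = d[n] - y[n]                         # instantaneous error
            w += mu * e * window                    # stochastic gradient step on e^2
        return w, y

    # Example: identify an unknown 4-tap channel from its input and output.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(5000)
    h = np.array([1.0, -0.5, 0.25, 0.1])
    w, _ = lms(x, np.convolve(x, h)[:len(x)])       # w[:4] converges toward h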

On the engineering side, layered neural networks trained with Hebbian-LMS have been simulated. Hidden layers are trained, unsupervised, with Hebbian-LMS, while the output layer is trained with classic LMS, supervised. The hidden layers perform clustering. The output layer is fed clustered inputs, and from this makes the final classification decisions. Networks that are not layered, for example randomly connected, can be implemented with Hebbian-LMS neurons to provide inputs to an output classifier. The same training algorithm could be utilized. The Hebbian-LMS network is a general-purpose trainable classifier and gives performance comparable to a layered network trained with the backpropagation algorithm. The Hebbian-LMS network is much simpler to implement and easier to make work. It is early to predict, but it seems highly likely that Hebbian-LMS will have many engineering applications to clustering, pattern classification, signal processing, control systems, and machine learning.


Sep 21, 2017 Vision and the Deep Learning Explosion
Dr. Chris Rowen, CEO, Cognite Ventures and Stanford University

Everyone sees the excitement about artificial intelligence, but what’s real? Everyone gets pumped up about smart drones and self-driving cars, but what does it take to really harness the potential of deep learning for real products? The academic benchmarks are impressive, but how does research translate into break-out businesses? Why is computer vision embracing deep neural networks so passionately?

This talk looks closely at artificial intelligence (also known as deep learning, neural networks, or cognitive computing) technologies, maps out the affected applications and industries, and dives into the profound impact it is having on one example segment: computer vision. It explores the relationship among vision research, cloud and embedded AI product opportunities, and the global explosion in the number of deep learning startups. Finally, it sketches some of the most important principles that successful startups are following to succeed in the frothy, frenetic and fascinating entrepreneurial game.


Aug 31, 2017 Khronos Standards for Neural Network Acceleration and Deployment
Radhakrishna Giduthuri, AMD
Slides

The Khronos Group is a not-for-profit, member-funded consortium that creates royalty-free open standards for hardware acceleration. OpenVX is an API for computer vision and neural network acceleration, especially important in real-time and safety-critical use cases. The Khronos Group is also readying the NNEF standard interchange format for transferring networks trained in deep learning frameworks to optimized inference engines. This talk gives an overview of Khronos standards related to neural networks and computer vision. A set of examples for neural networks and computer vision mapped to the graph API will also be discussed.


Aug 01, 2017 Video codec standardization update for 360 degree video
Jill Boyce, Intel
Slides

MPEG and ITU-T VCEG are developing new SEI messages for HEVC to standardize coding of omnidirectional spherical video (also called “360° video”), for inclusion in a new version of HEVC in late 2017. These SEI messages can be used with existing HEVC profiles, with projection mapping of the spherical video into a 2D rectangular format. New methods of objective and subjective testing for 360° video have been developed to study the impact of different projection formats on coding efficiency and video quality. The MPEG/VCEG Call for Evidence for a new video coding standard with capability beyond HEVC (e.g., a future H.266) includes a category for 360° video and allows new, specifically targeted coding tools.


Jun 22, 2017 Computational Microscopy
Prof. Laura Waller, UC Berkeley
Slides

Computational imaging involves the joint design of imaging system hardware and software, optimizing across the entire pipeline from acquisition to reconstruction. This talk will describe new methods for computational microscopy with coded illumination, based on a simple and inexpensive hardware modification of a commercial microscope, combined with advanced image reconstruction algorithms. In conventional microscopes and cameras, one must trade off field-of-view and resolution. Our methods allow both simultaneously by using multiple images, resulting in Gigapixel-scale reconstructions with resolution beyond the diffraction limit of the system. Our algorithms are based on large-scale nonlinear non-convex optimization procedures for phase retrieval, with appropriate priors.
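
As background on the phase retrieval ingredient, the flavor of alternating magnitude-constraint projections can be sketched with the classical Gerchberg-Saxton/error-reduction iteration; the talk's large-scale nonconvex solvers and priors are considerably more sophisticated:

    import numpy as np

    def error_reduction(fourier_mag, support, n_iter=200, seed=0):
        """Recover a nonnegative object from Fourier magnitudes and a support mask."""
        rng = np.random.default_rng(seed)
        field = np.exp(1j * rng.uniform(0, 2 * np.pi, fourier_mag.shape))
        for _ in range(n_iter):
            # Fourier domain: keep the current phase, impose the measured magnitude.
            obj = np.fft.ifft2(fourier_mag * np.exp(1j * np.angle(field)))
            # Object domain: impose support and nonnegativity.
            obj = np.where(support, obj.real.clip(min=0), 0.0)
            field = np.fft.fft2(obj)
        return obj

    truth = np.zeros((64, 64)); truth[24:40, 28:36] = 1.0
    support = np.zeros((64, 64), bool); support[20:44, 24:40] = True
    rec = error_reduction(np.abs(np.fft.fft2(truth)), support)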

Visit laurawaller.com for related publications and projects


May 16, 2017 A Universal Low-latency Real-time Optical Flow based Stereoscopic Panoramic Video Communication System for AR/VR
Dr. Jiangtao Wen, Tsinghua University
Slides

This talk introduces an optimized system for real-time, low-latency stereoscopic panoramic video communication that is camera-agnostic. After intelligent camera calibration, the system is capable of stitching inputs from different cameras using a real-time, low-latency optical-flow-based algorithm that learns input video features over time to improve stitch quality. Depth information is also extracted in the process. The resulting stereoscopic panoramic video is then encoded with content-adaptive temporal and/or spatial resolution to achieve a low bitrate while maintaining good video quality. Various aspects of the system, including the optimized stitching algorithm, parallelization and task scheduling, as well as encoding, will be introduced, with demos using conventional (non-panoramic) professional and consumer grade cameras as well as integrated panoramic cameras.


Apr 06, 2017 New immersive and object-based multichannel audio formats for cinema, entertainment and cinematic VR
Dr. Jean-Marc Jot, Senior VP, R&D, DTS Inc.
Slides

In recent years, several audio technology companies and standardization organizations (including Dolby, Auro, DTS, and MPEG) have developed new formats and tools for the creation, archiving and distribution of immersive audio content in the cinema or broadcast industries. These developments extend legacy multi-channel audio formats to support three-dimensional (with height) sound field encoding, along with optional audio object channels accompanied by positional rendering metadata. They enable efficient content delivery to consumer devices and flexible reproduction in multiple consumer playback environments, including headphones and frontal audio projection systems. In this talk, we’ll review and illustrate the state of these developments and discuss perspectives and pending issues, including virtual reality applications.


Mar 23, 2017 Deep Learning in Siri
Dr. Alex Acero, Apple.

Siri, Apple’s personal assistant, first shipped in 2011 as part of iOS and brought conversational agents into the mainstream. Users can access Siri from their iPhone, iPad, Apple Watch, AppleTV and Carplay in 21 languages. Deep learning has revolutionized the field of machine learning, making a big impact in both core algorithms and application areas like speech recognition, critical for Siri. Mixture Density Networks, a particular type of deep learning, now power Apple’s TTS engine, making Siri’s voices more natural, smoother, and allowing Siri’s personality to shine through. Accented speech, always a challenge for speech recognition systems, can be addressed by training deep neural networks and convolutional neural networks with various sources of data properly weighted in order to achieve a robust acoustic model.



Feb 22, 2017 Localizing the Epileptic Seizure Onset Zone via Directed Information Graphs
Dr. Yonathan Morin, Stanford University

Epilepsy is one of the most common neurological disorders affecting about 1% of the world population. While in most cases treating epilepsy with antiepileptic drugs (AED) is successful, about a third of the patients cannot be adequately treated with AEDs. The main treatment for such patients is a surgical procedure for removal of the seizure onset zone (SOZ), the area in the brain from which the seizures originate. The main tool for accurately identifying the SOZ is electrocorticography (ECoG) recordings, taken from grids of electrodes placed on the cortex to allow a direct measurement of the brain’s electric activity. In this talk we will present a novel SOZ localization algorithm, based on ECoG recordings. Our underlying hypothesis is that seizures start in the SOZ and then spread to surrounding areas in the brain. Thus, signals recorded at electrodes close to the SOZ should have a relatively large causal influence on the rest of the recorded signals. To evaluate the statistical causal influence between the recorded signals, we represent the set of electrodes using a directed graph, where the edges’ weights are the pair-wise causal influence, quantified via the information theoretic functional of directed information. The directed information is estimated from the ECoG recording using the nearest-neighbor estimation paradigm. Finally, the SOZ is inferred from the obtained network via a variation of the famous PageRank algorithm. Testing the proposed algorithm on 15 ECoG recordings of epileptic patients, listed in the iEEG portal, shows a close match with the SOZ estimated by expert neurologists.
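
The last step of that pipeline can be illustrated with a plain PageRank iteration on the reversed influence graph, a simplified stand-in for the variation used in the talk (the directed-information matrix is assumed to have been estimated already):

    import numpy as np

    def rank_electrodes(DI, damping=0.85, n_iter=100):
        """DI[i, j]: estimated directed information from electrode i to electrode j."""
        # Reverse the edges so that electrodes exerting strong causal
        # influence on the rest of the network accumulate score.
        W = DI.T / (DI.T.sum(axis=1, keepdims=True) + 1e-12)
        n = W.shape[0]
        score = np.full(n, 1.0 / n)
        for _ in range(n_iter):
            score = (1 - damping) / n + damping * score @ W
        return np.argsort(score)[::-1]   # SOZ candidates first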


Feb 9, 2017 Unsupervised Machine Learning: Application to Data Fusion
Prof. Tülay Adali, Dept. of CSSE, University of Maryland Baltimore County

Fusion of information from multiple sets of data, in order to extract a set of features that are most useful and relevant for the given task, is inherent to many problems we deal with today. Data-driven methods based on source separation minimize the assumptions about the underlying relationships and enable fusion of information by letting multiple datasets fully interact and inform each other. Use of multiple types of diversity – statistical properties – enables maximal use of the available information when achieving source separation. In this talk, a number of powerful models are introduced for fusion of both multiset data – data of the same nature – and multi-modal data, and the importance of diversity in fusion is demonstrated with a number of practical examples in medical imaging and video processing.


Jan 27, 2017 Bowling with IEEE Young Professionals
IEEE Santa Clara Valley Young Professionals

Jan 20, 2017 Deep Learning for Image and Video Processing
Jonathon Shlens & George Toderici, Google Research

Deep learning has profoundly changed the field of computer vision in the last few years. Many computer vision problems have been recast with techniques from deep learning and have in turn achieved state-of-the-art results and become industry standards. In this tutorial we will provide an overview of the central ideas of deep learning as applied to computer vision and survey its many applications to image and video problems. The goal is to teach the core ideas and provide a high-level overview of how deep learning has influenced computer vision.


Dec 13, 2016 Super-resolution Image Reconstruction – Methods and Lessons Learned
Prof. Sally Wood, Thomas J. Bannan Professor in Electrical Engineering, Santa Clara University

Although there is some variation in the interpretation of the term “super-resolution” in different imaging application contexts, for computational methods it typically refers to the use of multiple images acquired at a low spatial resolution to compute a single image with increased spatial resolution. The motivation for this may be to improve the perceptual quality of the image content or to derive more accurate information from the image content, such as the location of features. This may be attractive in situations where a higher resolution camera cannot be used because of size or cost, for example. A potential application, which may be fixed or mobile, is monitoring and surveillance. The additional information used to improve the spatial resolution may be some combination of a-priori assumptions and multiple passively acquired images in which the desired high frequency information is present, but aliased. Performance measures of super-resolution algorithms may be based on measures of image accuracy, measures of image quality, computational efficiency, or robustness in the presence of measurement noise and image acquisition model error. While computational efficiency is relatively unambiguous, the metrics for accuracy and robustness may be debated. This talk will provide an introduction to super-resolution methods and applications, explore the effects of noise and model error on resolution improvement, describe one specific project application, and discuss general lessons learned.
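
A minimal illustration of the multi-image idea is the shift-and-add baseline, assuming the registration problem is already solved and the shifts are integer offsets on the high-resolution grid (far simpler than the methods surveyed in the talk):

    import numpy as np

    def shift_and_add(frames, shifts, factor):
        """frames: list of HxW arrays; shifts: (dy, dx) with 0 <= dy, dx < factor."""
        H, W = frames[0].shape
        acc = np.zeros((H * factor, W * factor))
        cnt = np.zeros_like(acc)
        for img, (dy, dx) in zip(frames, shifts):
            ys = np.arange(H) * factor + dy  # where each LR sample lands on the HR grid
            xs = np.arange(W) * factor + dx
            acc[np.ix_(ys, xs)] += img
            cnt[np.ix_(ys, xs)] += 1
        # Average overlapping samples; unfilled HR pixels stay 0 and would be
        # interpolated (and the blur inverted) in a complete pipeline.
        return np.where(cnt > 0, acc / np.maximum(cnt, 1), 0.0)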


Nov 03, 2016 The Soul of a New Camera: The Design of Facebook’s Surround Open Source 3D-360 Video Camera
Brian Cabral, Director of Engineering, Facebook

Around a year ago we set out to create an open-source reference design for a 3D-360 camera. In nine months, we had designed and built the camera, and published the specs and code. Our team leveraged a series of maturing technologies in this effort. Advances and availability in sensor technology, 20+ years of computer vision algorithm development, 3D printing, rapid design prototyping and computational photography allowed our team to move extremely fast. We will delve into the roles each of these technologies played in the designing of the camera, giving an overview of the system components and discussing the tradeoffs made during the design process. The engineering complexities and technical elements of 360 stereoscopic video capture will be discussed as well. We will end with some demos of the system and its output.


Oct 20, 2016 [Distinguished Lecturer] Demystifying Linear Time Varying Circuits
Prof. Shanthi Pavan, Indian Institute of Technology-Madras

Oct 06, 2016 Indoor and Outdoor Image-Based Localization for Mobile Devices
Prof. Avideh Zakhor, EECS Department, UC Berkeley and Indoor Reality

Image geo-location has a wide variety of applications in GPS-denied environments such as indoors, as well as error-prone outdoor environments where the GPS signal is unreliable. Besides accuracy, an inherent advantage of image-based localization is recovery of orientation as well as position. This can be important in applications such as navigation and augmented reality. In this talk, I describe a number of indoor and outdoor image-based localization approaches and characterize their performance in a variety of scenarios. I start with a basic divide-and-conquer photo-matching strategy for large-area outdoor localization and show its superior performance over compass and GPS on today’s cell phones; I characterize the performance of this system on a 30,000-image database for Oakland, CA, as well as a 5-million-image database for a 10,000-square-km area in Taiwan. Next I describe a fast, automated methodology for Simultaneous Multi-modal fingerprinting And Physical mapping (SMAP) of indoor environments to be used for indoor positioning. The sensor modalities consist of images, WiFi and magnetic. I show that one-shot, static image-based localization has a 50th-percentile error of less than 1 meter and an 85th-percentile error of less than 2 meters. Finally, I describe the associated multi-modal indoor positioning algorithms for dynamic tracking of users and show that they outperform uni-modal schemes based on WiFi alone. Future work consists of demonstrating the scheme on wearable devices such as Glass and the Watch.


Sep 22, 2016 When Machine Learning Takes over Audio Signal Processing
[Distinguished Lecturer] Prof. Paris Smaragdis, University of Illinois at Urbana-Champaign

During the last few years, machine learning has started to permeate the world of audio processing and has produced results that drastically improve over the state of the art. In this talk I’ll touch on some recent approaches taking advantage of a machine learning perspective for attacking audio problems. I will show how traditional signal processing approaches can be reimagined using machine learning tools such as mixture models, matrix factorizations, deep learning regressions, and more.
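
As one concrete example of the matrix-factorization tools mentioned, here is non-negative matrix factorization of a magnitude spectrogram with the classical Lee-Seung multiplicative updates, where W holds spectral templates and H their activations over time:

    import numpy as np

    def nmf(V, k=8, n_iter=200, eps=1e-9):
        """Factor a nonnegative matrix V (e.g., a magnitude spectrogram) as W @ H."""
        rng = np.random.default_rng(0)
        n_freq, n_frames = V.shape
        W = rng.random((n_freq, k)) + eps
        H = rng.random((k, n_frames)) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)  # multiplicative updates for the
            W *= (V @ H.T) / (W @ H @ H.T + eps)  # Frobenius-norm objective
        return W, H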


Aug 25, 2016 Learning Sparsifying Transforms for Signal, Image, and Video Processing
[Distinguished Lecturer] Prof. Yoram Bresler, University of Illinois at Urbana-Champaign

The sparsity of signals and images in a certain transform domain or dictionary has been exploited in many applications in signal and image processing, including compression, denoising, and notably in compressed sensing, which enables accurate reconstruction from undersampled data. These various applications used sparsifying transforms such as DCT, wavelets, curvelets, and finite differences, all of which had a fixed, analytical data-independent form. Recently, sparse representations that are directly adapted to the data have become popular, especially in applications such as image and video denoising and inpainting. While synthesis dictionary learning has enjoyed great popularity and analysis dictionary learning too has been explored, these methods involve a repeated step of sparse coding, which is NP hard, and heuristics for its approximation are computationally expensive. In this talk we describe our work on an alternative approach: sparsifying transform learning, in which a sparsifying transform is learned from data. The method provides efficient computational algorithms with exact closed-form solutions for the alternating optimization steps, and with theoretical convergence guarantees. The method scales better than dictionary learning with problem size and dimension, and in practice provides orders of magnitude speed improvements and better image quality in image processing applications. Variations on the method include the learning of a union of transforms, and online versions. We describe applications to image representation, image and video denoising, and inverse problems in imaging, demonstrating improvements in performance and computation over state of the art methods.
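
A simplified sketch of the alternating scheme, specialized to an orthonormal transform so that both steps have the closed forms mentioned (hard thresholding for sparse coding, an orthogonal Procrustes solution for the transform update); see the speaker's papers for the general formulations:

    import numpy as np

    def learn_transform(X, sparsity, n_iter=50):
        """X: d x N matrix of vectorized patches; keep `sparsity` coefficients per patch."""
        d, N = X.shape
        W = np.eye(d)
        for _ in range(n_iter):
            Z = W @ X
            # Sparse coding: zero all but the largest-magnitude entries per column.
            thresh = -np.sort(-np.abs(Z), axis=0)[sparsity - 1]
            Z[np.abs(Z) < thresh] = 0.0
            # Transform update: minimize ||W X - Z||_F over orthonormal W.
            U, _, Vt = np.linalg.svd(Z @ X.T)
            W = U @ Vt
        return W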


Aug 18, 2016 Enabling and Exploiting Machine Learning in Ultra-low-power Devices
[Distinguished Lecturer] Prof. Naveen Verma, Princeton University

Jun 16, 2016 Convolutional neural network models of the first stages in biological vision
Lane McIntosh, Neurosciences PhD candidate, Stanford University

In order to understand how and why biological vision pathways perform particular computations, we must first know what they do. The first stages of biological vision occur in the retina, and consist of cascaded nonlinear processes like synaptic transmission and spiking dynamics that compress the entire visual scene into the sparse responses of only one million spiking cells. The ubiquity of these dynamic, nonlinear computations in the retina has presented significant obstacles to the goal of learning accurate computational models of circuit responses to natural stimuli from neural recordings. In this talk I will discuss recent work demonstrating that convolutional neural networks (CNNs) are considerably more accurate at capturing retinal responses to held-out natural scene stimuli than pre-existing published models of the retina. Moreover, we find CNNs generalize significantly better across classes of stimuli they were not trained on. Remarkably, analysis of these CNNs reveals internal units selective for visual features on the same small spatial scale as the main excitatory interneurons of the retina, bipolar cells. Overall, this work demonstrates the power of CNNs to not only accurately capture sensory circuit responses to natural scenes, but also uncover the circuit’s internal structure and function.
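
A minimal sketch of a retina-style CNN of this kind, with convolutional feature layers, a softplus readout so rates stay nonnegative, and a Poisson spiking loss (layer sizes here are illustrative assumptions, not the published architecture):

    import torch
    import torch.nn as nn

    class RetinaCNN(nn.Module):
        def __init__(self, n_cells=8):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=15), nn.ReLU(),
                nn.Conv2d(8, 8, kernel_size=9), nn.ReLU())
            self.readout = nn.LazyLinear(n_cells)
            self.rate = nn.Softplus()                 # firing rates must be nonnegative

        def forward(self, stimulus):                  # stimulus: (batch, 1, H, W)
            return self.rate(self.readout(self.features(stimulus).flatten(1)))

    model = RetinaCNN()
    loss_fn = nn.PoissonNLLLoss(log_input=False)      # -log P(spike counts | rates)
    rates = model(torch.randn(4, 1, 50, 50))
    loss_fn(rates, torch.poisson(torch.ones(4, 8))).backward()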


Jun 09, 2016 Noncoherent communications in large antenna arrays
Mainak Chowdhury, PhD Candidate, Wireless Systems Laboratory, Stanford University

Coherent schemes with accurate channel state information are considered to be important to realizing many benefits from massive multiple-input multiple-output (massive MIMO) cellular systems involving large antenna arrays at the base station. In this talk we introduce and describe noncoherent communication schemes, i.e., schemes which do not use any instantaneous channel state information, and find that they have the same scaling behavior of achievable rates as coherent schemes with the number of antennas. This holds true not only for Rayleigh fading, but also for ray tracing models. Analog signal processing architectures for large antenna arrays based on our analyses will be described. We also consider wideband large antenna systems and identify a bandwidth limited regime where having channel state information does not increase scaling laws, and outside of which there is a clear rate penalty. This talk is based on joint work with Alexandros Manolakos, Andrea Goldsmith, Felipe Gomez-Cuba, and Elza Erkip.
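
The concentration effect that makes this possible is easy to simulate with a simple energy detector under i.i.d. Rayleigh fading (an illustration of the idea, not the talk's exact schemes): with many antennas, the average received energy pins down the transmitted power level without any channel estimate.

    import numpy as np

    rng = np.random.default_rng(1)
    M, T = 512, 1000                        # antennas, symbols
    levels = np.array([0.0, 1.0, 2.0])      # power-level "constellation"
    sym = rng.integers(len(levels), size=T)
    x = np.sqrt(levels[sym])

    h = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    n = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    y = h * x + n                           # the receiver never estimates h below

    stat = np.mean(np.abs(y) ** 2, axis=0)  # concentrates around level + noise power
    detected = np.argmin(np.abs(stat[:, None] - (levels + 1.0)), axis=1)
    print("symbol error rate:", np.mean(detected != sym))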


May 12, 2016 Neural Interfaces and How They Use Signal Processing
Dr. Sarah Felix, Independent Consultant

A neural interface is an engineered system that interacts with nerves to study, repair, or enhance the function of the nervous system. Examples that may be familiar are cochlear implants, deep brain stimulation, and EEG brain sensors. More advanced systems are currently in development for closed-loop brain modulation and robotic limb control. Developing neural interfaces involves many disciplines including computational neuroscience, biology, microtechnology, electronics packaging, medicine, and…signal processing! Various classes of signal processing problems arise, from detection to decoding, from filtering to feature extraction. In this talk I hope to expand the audience’s awareness of this exciting field by presenting a sampling of advanced neural interface applications and highlighting signal processing challenges at the heart of emerging therapeutic technologies.


May 10, 2016 The Search for Extraterrestrial Intelligence at the SETI Institute
Dr. Gerald (Gerry) R. Harp, SETI Institute

The SETI Institute is a non-profit research institute in Mountain View, CA, and a world leader in all topics pertaining to the evolution of life in the universe, from interstellar chemistry, to extremophiles on Earth, to the discovery of exoplanets, to the radio search for extraterrestrial intelligence. I will discuss the physical arguments in favor of radio SETI and describe the state of the art for SETI observations to date. The most important challenge for radio SETI at this time is the pervasive presence of human-generated signals that trigger false positives in our SETI detectors. The only reliable weapon to combat such interference is direction-of-arrival estimation that can prove a signal originates from outer space. I will describe the most advanced digital signal processing technologies in use or being developed for the ETI search, including direction-of-arrival estimation and detection of technological signals with arbitrary waveforms. Future observations with the Square Kilometer Array will propel the radio ETI search into a new domain of sensitivity and direction-of-arrival estimation.


Apr 27, 2016 New Approaches to Haptics for Teleoperation and Virtual Reality
Samuel B. Schorr, PhD Candidate, Stanford University

Apr 06, 2016 A Signal-Processing Approach to Modeling Vision, and Applications
Prof. Sheila S. Hemami, Northeastern University

Current state-of-the-art algorithms that process visual information for end use by humans treat images and video as traditional signals and employ sophisticated signal processing strategies to achieve their excellent performance. These algorithms also incorporate characteristics of the human visual system (HVS), but typically in a relatively simplistic manner, and achievable performance is reaching an asymptote. However, large gains are still realizable with current techniques by aggressively incorporating HVS characteristics to a much greater extent than is presently done, combined with a good dose of clever signal processing. Achieving these gains requires HVS characterizations which better model natural image perception ranging from sub-threshold perception (where distortions are not visible) to suprathreshold perception (where distortions are clearly visible). In this talk, I will review results from our lab characterizing the responses of the HVS to natural images, and contrast these results with ‘classical’ psychophysical results. I will also present several examples of signal processing algorithms which have been designed to fully exploit these results.


Apr 01, 2016 OpenVX: A Framework for Accelerating Computer Vision [Tutorial]
Kari Pulli (Intel), Radhakrishna Giduthuri (AMD), Thierry Lepley (NVIDIA)

OpenVX is a royalty-free open standard API released by the Khronos Group in 2014. OpenVX enables performance- and power-optimized computer vision functionality, especially important in embedded and real-time use cases. The course covers both the function-based API and the graph API that enable OpenVX developers to efficiently run computer vision algorithms on heterogeneous computing architectures. A set of example algorithms from computational photography and advanced driver assistance mapped to the graph API will be discussed. Also covered is the relationship between OpenVX and OpenCV, as well as OpenCL. The tutorial includes a hands-on practice session that gets the participants started on solving real computer vision problems using OpenVX. Learning outcomes: understanding the architecture of the OpenVX computer vision API and its relation to the OpenCV, OpenGL, and OpenCL APIs; getting fluent in actually using OpenVX for real-time image processing and computer vision tasks.


Mar 23, 2016 When Your Big Data Seems Too Small: Accurate inferences beyond the empirical distribution
Prof. Gregory Valiant, Stanford

We discuss two problems related to the general challenge of making accurate inferences about a complex distribution, in the regime in which the amount of data (i.e., the sample size) is too small for the empirical distribution of the samples to be an accurate representation of the underlying distribution. The first problem is the basic task of inferring properties of a discrete distribution, given access to independent draws. We show that one can accurately recover the unlabelled vector of probabilities of all domain elements whose true probability is greater than 1/(n log n). Stated differently, one can learn, up to relabelling, the portion of the distribution consisting of elements with probability greater than 1/(n log n). This result has several curious implications, including leading to an optimal algorithm for “de-noising” the empirical distribution of the samples, and implying that one can accurately estimate the number of new domain elements that would be seen given a new larger sample of size up to n log n. (Extrapolation beyond this sample size is provably information-theoretically impossible, without additional assumptions on the distribution.) While these results are applicable generally, we highlight an adaptation of this general approach to some problems in genomics (e.g., quantifying the number of unobserved protein coding variants).
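
For orientation, the classical Good-Turing estimator, the baseline that such results sharpen, estimates the probability mass of unseen elements from the sample's singleton count:

    from collections import Counter

    def missing_mass(samples):
        counts = Counter(samples)
        f1 = sum(1 for c in counts.values() if c == 1)  # elements seen exactly once
        return f1 / len(samples)                        # Good-Turing estimate of P(unseen)

    print(missing_mass("abracadabra"))  # 2/11: only 'c' and 'd' appear once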

The second problem we consider is the task of accurately estimating the eigenvalues of the covariance matrix of a (high-dimensional, real-valued) distribution – the “population spectrum”. (These eigenvalues contain basic information about the distribution, including the presence or lack of low-dimensional structure in the distribution and the applicability of many higher-level machine learning and multivariate statistical tools.) As we show, even in the regime where the sample size is linear or sublinear in the dimensionality of the distribution, and hence the eigenvalues and eigenvectors of the empirical covariance matrix are misleading, accurate approximations to the true population spectrum are possible. This talk is based on three papers, which are joint work with Paul Valiant, James Zou, and Weihao Kong.


Mar 03, 2016 First-Photon Imaging and Other Imaging with Few Photons
Dr. Vivek Goyal, Associate Professor, Boston University

LIDAR systems use single-photon detectors to enable long-range reflectivity and depth imaging. By exploiting an inhomogeneous Poisson process observation model and the typical structure of natural scenes, first-photon imaging demonstrates the possibility of accurate LIDAR with only 1 detected photon per pixel, where half of the detections are due to (uninformative) ambient light. I will explain the simple ideas behind first-photon imaging. Then I will present related subsequent works that enable the use of detector arrays and improve robustness to ambient light.


Feb 25, 2016 Developing Successful Career Relationships: Leveraging Mentoring, Coaching, and Sponsorship
Sarah Kalicin, Senior Statistician, Intel Corporation

As we move up in our careers, mentoring, coaching, and sponsorship become essential career-leveraging relationships for achieving success. This one-hour workshop will discuss what career-minded individuals need to consider for their own personal and career growth; define the nuances among mentoring, coaching, and sponsorship; identify when to best utilize each for particular growth opportunities; and explain how to identify and establish the right relationship fit. Participants will learn how to create a relationship circle, an actionable and adoptable plan for career development. The speaker will discuss, from her personal experience, how she developed and utilized these learnings to navigate her own career.


Feb 24, 2016 Interleaved direct bandpass sampling for software defined radio/radar receivers
Prof. Bernard Levy, Department of Electrical and Computer Engineering, UC Davis
Slides 

Due to their low hardware complexity, direct bandpass sampling front ends have become attractive for software defined radio/radar applications. These front ends require three elements: a tunable filter to select the band of interest, a wideband sample-and-hold to acquire the bandpass signal, and finally an analog-to-digital converter (ADC) to digitize the signal. Unfortunately, due to the overlap of aliased copies of the positive and negative signal spectrum components, if a single ADC is employed, depending on the exact position of the band where the signal is located, it is not always possible to sample a signal of occupied bandwidth B at a sampling rate fs just above the Nyquist rate 2B. Sometimes, much higher rates are needed. For software radio applications, this represents a significant challenge, since one would normally prefer to use a single ADC with a fixed sampling rate to sample all possible signals of interest. A solution to this problem was proposed as early as 1953 by Kohlenberg, who showed that Nyquist-rate sampling can be achieved by using time-interleaved sampling, where two sub-ADCs sample the signal at a rate fs/2 each, but with a relative timing offset d (such that 0 < d < 1 if the offset is measured relative to the sub-ADC sampling period). However, certain offsets are forbidden, since for example d = 1/2 would result in a uniform overall ADC. In this presentation, a method will be described to simultaneously sample and demodulate the bandpass signal of interest. The sampled complex envelope of the bandpass signal is computed entirely in the DSP domain by passing the sub-ADC samples through digital FIR filters, followed by a digital demodulation operation. However, as the quality factor (the ratio of the carrier frequency fc to the signal bandwidth B) of the front-end selection filter increases, the performance of the envelope computation method becomes progressively more sensitive to mismatches between the nominal offset d0 and the actual offset d of the two sampling channels. To overcome this problem, a blind calibration technique to estimate and correct mismatches is presented.


Feb 18, 2016 JPEG Emerging Standards
Prof. Dr. Touradj Ebrahimi, JPEG Convener, EPFL

The JPEG standardization committee has played an important role in the digital revolution of the last quarter century. The legacy JPEG format, which became an international standard 21 years ago, is the dominant picture format in many imaging applications. This dominance does not seem to be slowing down, considering that the number of JPEG images uploaded to social networks alone surpassed 2 billion per day in 2014, compared to less than 1 billion the year before. JPEG 2000, which became an international standard 15 years ago, has been the format of choice in a number of professional applications, among which contribution in broadcast and digital cinema are two examples. This talk starts by providing an overview of a recently developed image format to deal with High Dynamic Range content called JPEG XT. JPEG XT has been defined to be backward compatible with the legacy JPEG format in order to facilitate its use in the current imaging ecosystem. We will then discuss JPEG PLENO, a recent initiative by the JPEG committee to address an emerging modality in imaging, namely plenoptic imaging. “Pleno” is a reference to “plenoptic”, a mathematical representation which not only provides the color information of a specific point in a scene, but also how it changes when observed from different directions and distances. “Pleno” is also the Latin word for “complete”, a reference to the vision of the JPEG committee, which believes future imaging will provide a more complete description of scenes, well beyond what is possible today. The road map for JPEG Pleno follows a path that starts in 2015 and will continue beyond 2020, with the objective of making the same type of impact that the original JPEG format has had on digital imaging over the past 20 years. Several milestones are in the works to approach the ultimate image representation in well-thought-out, precise, and useful steps. Each step could potentially offer an enhanced experience when compared to the previous one, immediately ready to be used in applications. The talk will conclude with a quick overview of two potential standardization initiatives under investigation. The first, referred to as JPEG Privacy & Security, facilitates protection and security in legacy JPEG images, such as coping with privacy concerns. The second, called JPEG XS, puts an emphasis on low latency, low complexity, and transparent quality, as well as low cost, desirable in a number of applications including broadcasting and high-bandwidth links between devices and displays.


Jan 21, 2016 IS ANYBODY OUT THERE? When Will Earthlings Find ET?
Dr. Dan Werthimer, Chief Scientist, Berkeley SETI Research Center
Slides

Video

What is the possibility of other life in the universe, and can we detect radio and optical signals from other civilizations? Current and future SETI projects, including SETI@home and the new $100-million Breakthrough Prize Foundation Listen project, may provide an answer. SETI@home chief scientist Dan Werthimer will describe the rationale for past and future searches, SETI signal processing algorithms, and will show how new technologies are revolutionizing SETI.


Dec 15, 2015 Quantization Noise
Prof. Bernard Widrow, Department of Electrical Engineering, Stanford University

The effect of uniform quantization can often be modeled by an additive noise that is uniformly distributed, uncorrelated with the signal being quantized, and uncorrelated over time: additive white noise having zero mean and mean square of q²/12, where q is the quantum step size. This simple model is statistical and is based on Nyquist sampling theory applied to the probability density of the signal being quantized. Linear Nyquist theory is applied to precisely describe uniform quantization, which indeed is nonlinear. The simple model applies almost everywhere. This talk surveys the theory behind the simple model and discusses the conditions for its validity. The simple model applies to uniform quantization; however, the theory can be extended to apply to nonuniform quantization. This leads to a simple model for floating-point quantization. Conditions for the validity of the floating-point model will be presented.
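
The model is easy to check numerically: quantize a signal with a smooth density and compare the error statistics with the prediction.

    import numpy as np

    rng = np.random.default_rng(0)
    q = 0.1
    x = rng.standard_normal(1_000_000)  # signal with a smooth probability density
    err = q * np.round(x / q) - x       # error of a uniform (mid-tread) quantizer

    print(np.var(err), q**2 / 12)       # both approximately 8.33e-4
    print(np.corrcoef(err, x)[0, 1])    # near 0: error uncorrelated with the signal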


Dec 09, 2015 Ph.D. Elevator Pitch to Professionals
Topics include autonomous driving, computational cameras, and scalable speech coding, with students from Stanford, UC Santa Cruz, and Santa Clara University.

Looking for new local talent? Want to keep yourself and your company up to date on the latest hot topics and technical contributions in signal processing? You can do both by attending the “Ph.D. Elevator Pitch to Professionals” event. The IEEE Signal Processing Chapter of Santa Clara Valley is organizing an event to connect Ph.D. candidates close to graduation, and newly graduated Ph.D.s, with local companies looking for talent and new technologies. A panel of students will explain their Ph.D. contributions and results in the form of an elevator pitch, followed by Q&A and a social event to continue the conversations into a poster session.

Speakers:

  • Alex Acero, President, IEEE Signal Processing Society (Slides)
  • David Held, Computer Science Department, Stanford University (Slides) Video
  • Amin Kheradmand, Department of Electrical Engineering, University of California at Santa Cruz (Slides)
  • Koji Seto, Department of Electrical Engineering, Santa Clara University (Slides) Video

Nov 18, 2015 Multi-Robot Adaptive Navigation
Dr. Christopher Kitts, Director of Robotic Systems Laboratory, Santa Clara University
Slides

Adaptive navigation is the process by which a vehicle determines where to go based on information received while moving through the field of interest. Adaptive sampling is a specific form of this in which that information is environmental data sampled by the robot. This may be beneficial in order to save time/energy compared to a conventional navigation strategy in which the entire field is traversed. Our work in this area focuses on multi-robot gradient-based techniques for the adaptive sampling of a scalar field. To date, we have experimentally demonstrated multi-robot gradient ascent/descent as well as contour following using both wheeled land rovers as well as automated marine surface vessels. In simulation we have verified controllers for ridge descent / valley ascent as well as saddle point detection and loitering. Together, these capabilities establish a set of control primitives that will ultimately allow us to efficiently explore large scale scalar fields. In this talk, we will describe our techniques and present some of our initial experimental results achieved through field operation of our multi-robot systems. 
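
The estimation step at the heart of such gradient-based controllers can be sketched as a least-squares plane fit to the field samples taken by the robot formation (a minimal illustration, not the laboratory's controllers):

    import numpy as np

    def estimate_gradient(positions, readings):
        """positions: N x 2 robot locations; readings: N scalar-field samples, N >= 3."""
        A = np.column_stack([positions, np.ones(len(readings))])
        coef, *_ = np.linalg.lstsq(A, readings, rcond=None)  # field ~ gx*x + gy*y + c
        return coef[:2]

    # Three robots sampling f(x, y) = -x^2 - y^2 near (2, 1):
    pos = np.array([[2.0, 1.0], [2.5, 1.2], [1.8, 1.6]])
    g = estimate_gradient(pos, -pos[:, 0] ** 2 - pos[:, 1] ** 2)
    step = 0.5 * g / np.linalg.norm(g)   # gradient-ascent step toward the field maximum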


Nov 17, 2015 Rate-distortion of sub-Nyquist sampled processes
Alon Kipnis, PhD Candidate, Stanford EE Department

Consider the task of analog-to-digital conversion, in which a continuous-time random process is mapped into a stream of bits. The optimal trade-off between the bitrate and the minimal average distortion in recovering the waveform from its bit representation is described by the Shannon rate-distortion function of the continuous-time source. Traditionally, in solving for the optimal mapping and the rate-distortion function, we assume that the analog waveform has a discrete-time version, as in the case of a band-limited signal sampled above its Nyquist frequency. Such an assumption, however, may not hold in many scenarios due to wideband signaling and A/D technology limitations. A more relevant assumption in such scenarios is that only a sub-Nyquist sampled version of the source can be observed, and that the error in analog-to-digital conversion is due to both sub-sampling and finite bit representation. This assumption gives rise to a combined sampling and source coding problem, in which the quantities of merit are the sampling frequency, the bitrate and the average distortion.

In this talk we will characterize the optimal trade-off among these three parameters. The resulting rate-distortion-sampling frequency function can be seen as a generalization of the classical Shannon-Kotelnikov-Whittaker sampling theorem to the case where finite bit rate representation is required. This characterization also provides us with a new critical sampling rate: the minimal sampling rate required to achieve the rate-distortion function of a Gaussian stationary process for a given rate-distortion pair. That is, although the Nyquist rate is the minimal sampling frequency that allows perfect reconstruction of a band-limited signal from its samples, relaxing perfect reconstruction to a prescribed distortion allows sampling below the Nyquist rate while achieving the same rate-distortion trade-off.
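
For background, the classical reverse water-filling form of R(D) for a Gaussian stationary source, the benchmark that the rate-distortion-sampling characterization extends, is easy to evaluate numerically from the power spectral density:

    import numpy as np

    def rate_distortion(psd, df, theta):
        """One (D, R) point on the curve, parameterized by the water level theta."""
        D = np.sum(np.minimum(theta, psd)) * df                        # distortion
        R = 0.5 * np.sum(np.log2(np.maximum(psd / theta, 1.0))) * df   # bits per sample
        return D, R

    f = np.linspace(-0.5, 0.5, 1001)       # normalized frequency
    psd = 1.0 / (1.0 + (10 * f) ** 2)      # example low-pass spectrum
    for theta in (0.01, 0.1, 0.5):
        print(rate_distortion(psd, f[1] - f[0], theta))  # sweeping theta traces R(D)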


Nov 05, 2015 Scalable Approaches to New Large-Scale Neuroscience
Dr. Alyson “Allie” Fletcher, Assistant Professor of Electrical Engineering at UC Santa Cruz

Slides

Video

Recent technological advances offer unprecedented possibilities for observing and manipulating neural activity. Characterizing the structure of neural circuits is a fundamental problem in brain science; however, unraveling the connectivity and interactions among populations of neurons remains a daunting challenge. In this talk, I address two problems: learning the neural response in retinal ganglion cells, and the estimation of large neuronal networks from calcium imaging data. Utilizing a general, computationally scalable framework for estimation of large, structured, nonlinear dynamical systems, the approach significantly improves over existing methods in both computational cost and performance. This methodology offers provable guarantees on consistency and convergence and is applicable across a wide variety of settings.


Oct 15, 2015 Gathering Light
Dr. Rajiv Laroia, co-founder & CTO, light.co
Slides

Video

With digital cameras in every cell phone, everyone is a photographer. But people still aspire to the better zoom, the lower noise, and the artistic bokeh effects provided by digital SLR cameras, if only those features were available in as convenient and lightweight a package as a cell phone or a thin compact camera. Traditional high-end cameras have a big lens system that enables those features, but the drawback is weight, bulk, and the inconvenience of carrying and switching lenses. In this talk, we discuss an alternative approach: using a heterogeneous array of small cameras to provide those features, and more. Light’s camera technology combines prime lenses that provide the optical zoom equivalent of 35mm, 70mm, and 150mm lenses. Small mirrors allow reconfiguring the cameras to select the right level of zoom and field of view. This talk describes the architecture of this flexible computational camera.


Sep 24, 2015 Android TV Product & Technical Overview
Sascha Prueter, Google

This talk gives an overview of the Android TV platform and its support for OTT devices, smart TVs and pay-TV set-top boxes. It will cover the core platform building blocks, describe customization possibilities, and present some of the TV-specific API layers that have been added to Android specifically for the TV space.


Sep 03, 2015 Optical Character Recognition for Most of the World’s Languages
Dr. Ashok C. Popat, Research Scientist, Google

Slides

Video

Much of interpersonal communication is linguistic, and people exchange linguistic information primarily through speech and through graphical symbolic representation of speech utterances, i.e., writing, printing, typing, etc. In the modern digital age we can represent written communication as sequences of bits grouped into Unicode points, a means capable of representing many if not most of the world’s extant languages. But much of the world’s recorded information is still in visual rather than digital Unicode form; it is in books, newspapers, manuscripts, and letters; on post-its, whiteboards, street signs, or video captions. It may also be in the form of a gesture on a touch pad or mobile phone screen, allowing a method of text entry other than a keyboard. The conversion of all of these representations to Unicode for use in the digital world is generally known as Optical Character Recognition (OCR). How might an OCR system be designed to handle all of the world’s languages? I’ll explain some challenges that make this nontrivial and describe an approach we’re exploring at Google.


Aug 19, 2015 Scene Understanding in the Era of Deep Learning
Dr. Jitendra Malik, Prof. Dept. of EECS, University of California, Berkeley
Slides

Multilayer neural networks trained with stochastic gradient descent on large labeled datasets have recently been shown to yield remarkably good features for computer vision. This is the fourth generation of features in the fifty-year history of the field, starting with edges and corners (1960s-1990s), linear filters such as Gaussian derivatives, Gabors, and Haar wavelets (1990s), and orientation histograms such as SIFT and HOG (2000s). However, it is not the case that the field has to be reinvented with each new generation of features. Insights from the past of computer vision, and the even longer tradition in human visual perception, retain relevance across these choices. For example, I have been arguing for some time now about the importance of the interaction between recognition, reconstruction and re-organization. In this view, recognition of objects is reciprocally linked to re-organization, with bottom-up grouping processes generating candidates, which can be classified using top-down knowledge, following which the segmentations can be refined again. Recognition of 3D objects could benefit from a reconstruction of 3D structure, and 3D reconstruction can benefit from object category-specific priors. I will show results on several such problems in my talk, typically using features derived from “deep learning”. The work was done in collaboration with various students and postdocs at Berkeley, and almost all the papers have been posted on arXiv, in case you are interested in more details.


Aug 06, 2015 Real-world Audio Source Separation
Dr. Gautham J. Mysore, Senior Research Scientist, Adobe Research
Slides

Audio source separation algorithms aim to take a recording of a mixture of sound sources as an input and provide the separated sources as outputs. This is useful for a number of applications such as denoising in the presence of complex noises, processing individual instruments in a mixture, automatic karaoke, extracting dialogue from old films to provide a higher quality soundtrack, and upmixing mono recordings to multi-channel recordings. Algorithmically, this is an ill-posed and challenging problem. However, by making use of easily available high level information, generic training data, and user interaction, we can greatly constrain the problem. In this talk, I will discuss a number of techniques that use such constraints with the goal of performing real-world audio source separation.

Link to the open source interactive source separation editor: https://isse.sourceforge.net/
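
One widely used family of techniques for this kind of constrained separation, and the basis of many interactive tools, is non-negative matrix factorization (NMF) of magnitude spectrograms. The sketch below is illustrative only, assuming the fully supervised case with training data for both sources; it is not necessarily the specific algorithm used in the talk or in ISSE, and the spectrograms are random stand-ins.

```python
import numpy as np

def nmf(V, rank=None, W=None, iters=200, eps=1e-9):
    """Euclidean NMF via multiplicative updates; if W is given, only H adapts."""
    rng = np.random.default_rng(0)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ (W @ H) + eps)
        if update_W:
            W *= (V @ H.T) / ((W @ H) @ H.T + eps)
    return W, H

# Stand-ins for magnitude spectrograms (|STFT|) of training data and a mixture.
V_speech = np.abs(np.random.randn(257, 200))      # clean-speech training data
V_noise  = np.abs(np.random.randn(257, 200))      # noise-only training data
V_mix    = np.abs(np.random.randn(257, 300))      # mixture to be separated

W_s, _ = nmf(V_speech, rank=20)                   # learn a speech dictionary
W_n, _ = nmf(V_noise, rank=20)                    # learn a noise dictionary
W = np.concatenate([W_s, W_n], axis=1)
_, H = nmf(V_mix, W=W)                            # fit activations on the mixture

# Wiener-style mask: fraction of each time-frequency bin explained by speech.
S_hat = (W_s @ H[:20]) / (W @ H + 1e-9) * V_mix
```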


Jun 03, 2015 Computational Imaging: From Photons to Photos
Dr. Peyman Milanfar, Scientist, Google Research
Slides

Fancy cameras used to be the exclusive domain of professional photographers and experimental scientists. Times have changed, but even as recently as a decade ago, consumer cameras were solitary pieces of hardware and glass; disconnected gadgets with little brains, and no software. But now, everyone owns a smartphone with a powerful processor, and every smartphone has a camera. These mobile cameras are simple, costing only a few dollars per unit. And on their own, they are no competition for their more expensive cousins. But coupled with the processing power native to the devices in which they sit, they are so effective that much of the low-end point-and-shoot camera market has already been decimated by mobile photography. Computational imaging is the enabler for this new paradigm in consumer photography. It is the art, science, and engineering of producing a great shot (moving or still) from small form factor, mobile cameras. It does so by changing the rules of image capture – recording information in space, time, and across other degrees of freedom – while relying heavily on post-processing to produce a final result. Ironically, in this respect, mobile imaging devices are now more like scientific instruments than conventional cameras. This has deep implications for the future of consumer photography. In this technological landscape, the ubiquity of devices and open platforms for imaging will inevitably lead to an explosion of technical and economic activity, as was the case with other types of mobile applications. Meanwhile, clever algorithms, along with dedicated hardware architectures, will take center stage and enable unprecedented imaging capabilities in the user’s hands.


May 21, 2015 Perceiving Graphical and Pictorial Information via Hearing and Touch
Dr. Thrasyvoulos N. Pappas, Northwestern University/LLNL
Slides

We explore the use of hearing and touch for conveying graphical and pictorial information to visually impaired people. The main idea is that the user actively explores a two-dimensional layout consisting of one or more objects with the finger on a touch screen. The objects are displayed via sounds and raised-dot tactile patterns. The finger acts as a pointing device and provides kinesthetic feedback. The touch screen is partitioned into regions, each representing an element of a visual scene or graphical display, and each modality a different aspect of the layout. A key element of our research is the use of spatial sound to facilitate the active exploration of the layout. We use the head-related transfer function for rendering sound directionality and variations of sound intensity or tempo for rendering proximity. Our research has addressed object shape and size perception, as well as the perception of a 2-D layout of simple objects (dots). We have also considered the rendering of a simple scene layout consisting of objects in a linear arrangement, each with a distinct tapping sound, which we compare to a “virtual cane.” Subjective experiments with visually-blocked subjects demonstrate the effectiveness of the proposed approaches.


May 07, 2015 Scale-space Processing for Light Field Cameras
Dr. Ivana Tosic, Senior Research Scientist, Ricoh Innovations, Corp.

Recent development of hand-held plenoptic cameras has brought light field acquisition into a range of consumer, industrial, biomedical and medical 3D imaging applications. As light fields record both spatial and angular information of light, their high dimensional structure is notably different from traditional 2D images. Because of this, many established image processing and computer vision algorithms become suboptimal and we have to revisit fundamental signal processing approaches and adapt them to light fields. I will show how scale-space analysis, one of the most widely applied frameworks in computer vision today, can be extended to the case of light fields. This new theory involves the construction of Light field scale-and-depth (Lisad) spaces, which are parametrized both in terms of scale of objects recorded by a light field and in terms of objects’ depth. Such scale-invariant 3D analysis of light fields opens the door to novel computer vision algorithms for plenoptic cameras, out of which I will present two: dense depth estimation and 3D keypoint detection.

Reference Papers

“3D keypoint detection by light field scale-depth space analysis,”
I. Tosic and K. Berkner, Proc. IEEE International Conference on Image Processing (ICIP), 2014, pp. 1927–1931.

“Light Field Scale-Depth Space Transform for Dense Depth Estimation,”
I. Tosic and K. Berkner, Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2014.

“In Vivo Middle Ear Imaging with a Light Field Otoscope,”
N. Bedard, I. Tosic, L. Meng, A. Hoberman, J. Kovacevic, and K. Berkner, Bio-Optics: Design and Application, paper BW3A.3.

“Light Field Otoscope,”
N. Bedard, I. Tosic, L. Meng, and K. Berkner, Imaging Systems and Applications, paper IM3C.6.


Apr 08, 2015 Signal Processing and Communication Challenges for the Internet of Energy
Prof. Anna Scaglione, Dept. Electrical and Computer Engineering, Arizona State University
Slides

In this talk we will discuss signal processing models of the behavior of electric appliances that can support the smart electric power grid. An ecosystem of Electric Vehicles, Smart Thermostats, and Smart Lighting will allow customers to interact directly with the market for electricity, optimizing customer preferences while better exploiting the variable production from renewable energy, from distributed “prosumers” and centralized plants alike. The opportunities for good are immense, but there are several challenges. Unlike the internet, which is managed in a decentralized fashion, power systems are large, vertically integrated infrastructures, and thus the interaction between market forces is hampered by the curse of dimensionality. We will discuss the issue of sifting through big data to decide the schedule and close the loop on a large number of transactions. While the grid is already coping with significant vulnerabilities as is, the Internet of Energy can significantly expand the reach of malicious cyber-attacks. We will touch upon the issues of cyber-security and privacy that arise with the Internet of Things in general and with the Internet of Energy in particular.

Reference Papers

“Reduced-Order Load Models for Large Populations of Flexible Appliances,”
M. Alizadeh, A. Scaglione, A. Applebaum, G. Kesidis, and K. Levitt, IEEE Transactions on Power Systems, vol. PP, no. 99, pp. 1–17.

“Dynamic Incentive Design for Participation in Direct Load Scheduling Programs,”
M. Alizadeh, Y. Xiao, A. Scaglione, and M. van der Schaar, IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 6, pp. 1111–1126, Dec. 2014.

“A Scalable Stochastic Model for the Electricity Demand of Electric and Plug-In Hybrid Vehicles,”
M. Alizadeh, A. Scaglione, J. Davies, and K. S. Kurani, IEEE Transactions on Smart Grid, vol. 5, no. 2, pp. 848–860, March 2014, doi: 10.1109/TSG.2013.2275988.


Mar 05, 2015 Perceptual Audio Coding: An Overview of Basic Principles and Current Standards
Dr. Marina Bosi, Consulting Professor, Stanford University
Slides

Who would have guessed twenty years ago that teenagers and everybody else would be clamoring for devices with MP3/AAC (MPEG Layer III/MPEG Advanced Audio Coding) perceptual audio coders that fit into their pockets? As perceptual audio coders become more and more part of our daily lives, residing within DVDs, mobile devices, broad/webcasting, electronic distribution of music, etc., a natural question to ask is: what made this possible and where is this going? This talk, presented by one of the developers who helped advance the field of perceptual audio coding, will provide a tutorial on the technology employed in perceptual audio coding and a brief overview of past and current standard development.


Feb 17, 2015 The Human Intranet – Where Swarms and Humans Meet
Dr. Jan M. Rabaey, Professor, UC Berkeley

There is no question about it – the Internet of Things (IoT) is happening as we speak. It is radically transforming the information technology platform, and providing an extremely high bandwidth channel between the cyber world (as represented by the Cloud) and the physical and biological world in which we live. The evolution is quite foundational as for the first time it allows for the engineering of systems that tightly interweave the “real” physical and the “imaginary” cyber worlds, often blurring the boundary between the two.

Yet, the IoT concept by itself conjures a picture of a static, internet-centric organization, in contrast to the dynamic and organic nature of many of the cyber-{physical, biological} applications we envision. In such an environment, which we have dubbed the “Swarm”, applications would form by opportunistically marshaling the resources that are available to them at a given time and place. The Berkeley Ubiquitous Swarm Lab is developing a broad range of technologies essential to making the Swarm vision a reality.

Some of the most compelling application domains of the Swarm relate to how humans interact with the world around them and the cyberworld beyond, as well as with their fellow human beings and themselves. While the smartphone has already introduced a fundamental change, most of our interactions are still funneled through a limited set of means (such as displays, headphones, keyboards, touch panels) integrated in a single device. The Swarm has the potential to change all of this. Envision instead a “Human Intranet” that harvests the capabilities of all the devices we carry around us, on us, or inside us to create a single open and integrated platform, opening the door for true innovation and creativity. In this presentation, some of the true opportunities, challenges, and limitations of the Swarm and Human Intranet vision will be addressed.


Feb 11, 2015 Current State of Physically Modeled Musical Instruments on Handheld Mobile Devices
Pat Scandalis, CTO/CEO, moForte.com

Slides

Video

Handheld mobile computing devices are now ubiquitous. These devices are powerful, connected, and equipped with a variety of sensors. Their pervasiveness has created an opportunity to realize parametrically controlled, physically modeled, virtual musical instruments. moForte Inc. was founded to develop a line of sonic and musical applications for handheld devices. We developed “moForte Guitar”, an application for mobile devices that models the guitar family of instruments and enables everyone to experience what it’s like to play these instruments. We will provide a brief history of physically modeled musical instruments and the platforms that these models have run on. We will also give an overview of what is currently possible on handheld mobile devices.
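
For a concrete feel for the underlying idea, the sketch below implements the classic Karplus-Strong plucked-string algorithm, the simplest ancestor of the waveguide-style string models used in guitar synthesis. The moForte engine itself is considerably more sophisticated, and the parameters here are illustrative.

```python
import numpy as np

def karplus_strong(freq_hz, dur_s, fs=44100, decay=0.996):
    """Classic plucked-string synthesis: a noise burst in a lowpassed delay loop."""
    n = int(fs / freq_hz)                  # delay-line length sets the pitch
    buf = np.random.uniform(-1, 1, n)      # initial noise burst models the pluck
    out = np.empty(int(dur_s * fs))
    for i in range(out.size):
        out[i] = buf[i % n]
        # recirculate through a two-point averaging lowpass with slight decay
        buf[i % n] = decay * 0.5 * (buf[i % n] + buf[(i + 1) % n])
    return out

note = karplus_strong(110.0, 2.0)          # two seconds of a low A string
```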


Jan 15, 2015 Hebbian-LMS Learning Algorithm
Dr. Bernard Widrow, Professor of Electrical Engineering, Emeritus, Stanford University
Slides

Hebbian learning is widely accepted in the fields of psychology, neurology, and neurobiology. It is one of the fundamental premises of neuroscience. The LMS (least mean square) algorithm of Widrow and Hoff is the world’s most widely used adaptive algorithm, fundamental in the fields of signal processing, control systems, pattern recognition, and artificial neural networks. These are very different learning paradigms. Hebbian learning is unsupervised. LMS learning is supervised. However, a form of LMS can be constructed to perform unsupervised learning and, as such, LMS can be used in a natural way to implement Hebbian learning. Combining the two paradigms creates a new unsupervised learning algorithm that has practical engineering applications and may provide insight into learning in living neural networks.
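
For concreteness, here is a minimal sketch of the supervised LMS update itself, applied to identifying an unknown FIR system. The Hebbian-LMS algorithm of the talk builds on this rule but generates its error signal internally rather than from an external desired response; those details follow Widrow's papers, not this sketch.

```python
import numpy as np

def lms(x, d, n_taps=8, mu=0.01):
    """Adapt FIR weights w so that w . x_k tracks the desired signal d."""
    w = np.zeros(n_taps)
    y = np.zeros(len(x))
    for k in range(n_taps - 1, len(x)):
        xk = x[k - n_taps + 1 : k + 1][::-1]   # most recent sample first
        y[k] = w @ xk                          # filter output
        e = d[k] - y[k]                        # error = desired - output
        w += 2 * mu * e * xk                   # LMS weight update
    return w, y

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)                  # input signal
plant = np.array([0.4, -0.2, 0.1])             # unknown system to identify
d = np.convolve(x, plant)[: len(x)]            # desired response
w, y = lms(x, d, n_taps=3)                     # w converges toward plant
```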


Dec 04, 2014 Signal Processing-Based Technology Entrepreneurship: Chips, Algorithms, and Startups
Distinguished Lecturer: Prof. Andrew C. Singer, University of Illinois at Urbana-Champaign

Technology commercialization and entrepreneurship are synonymous with engineering in a wide array of contexts, from academic programs at universities, to funding initiatives at the National Science Foundation and discussion of the central role of engineering in the economic future of America by the National Engineering Forum. In this talk, I will discuss the role of engineering entrepreneurship and technology commercialization as a key driver to our economy and as a tremendous source of interesting problems and opportunities for the signal processing community. In the first half of the talk, I will draw on my experience as Director of the Technology Entrepreneur Center at the University of Illinois over the last decade, and discuss the evolution of technology entrepreneurship as a central focus of engineering programs at Illinois and across the country. In the second half of the talk, I will discuss how signal processing played a critical role in a number of startup companies, including my experience in two that emerged from my research group at Illinois. These include an optical semiconductor company that was the first to employ DSP-enhanced receivers for 10Gbps optical communications, leading to the world’s fastest implementation of the Viterbi algorithm; and at the other end of the spectral extreme, an underwater acoustic communications company commercializing the first video-capable deep-sea wireless modems for the oil and gas industry. These two companies were driven by signal processing solutions to some of the most challenging digital communications channels on the planet.


Nov 12, 2014 Intelligent Personal Assistants and Signal Processing Challenges
Dr. Asli Celikyilmaz, Senior Scientist, Natural Language Understanding, Microsoft

Following the rapid proliferation of mobile devices, especially smartphones, multimodal virtual personal assistant (VPA) applications such as Apple Siri, Microsoft Cortana, and Google Now have started to emerge. With the advances in speech recognition, language understanding, and machine learning, coupled with client-side capabilities coming with larger screens that enable multi-touch displays and server-side capabilities based on cloud computing, these applications have begun to move beyond conventional, simple command/control-based speech applications. One of the core technologies in a VPA system is understanding what the users are saying, called spoken language understanding (SLU). In the last decade, a variety of practical goal-oriented spoken dialog systems have been built for limited domains, employing “targeted” SLU capabilities. Given an utterance, SLU in dialog systems extracts predefined semantic information from the output of an automatic speech recognizer (ASR). This semantic information usually contains the intent of the user and associated arguments (slots), matching the back-end capabilities. The dialog manager (DM) then determines the next machine action given the SLU output. In this talk I highlight some of the technical challenges and research efforts for multimodal virtual personal assistant applications, especially focusing on spoken language understanding and dialog aspects, pointing out issues and opportunities in this area.
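
To make the "intent and slots" output concrete, here is an illustrative semantic frame for a single utterance; the schema and field names are hypothetical, not Microsoft's actual format.

```python
# ASR hypothesis for one user turn:
utterance = "find me an italian restaurant near downtown for tonight"

# Targeted SLU maps it to a predefined intent plus argument slots that
# match the back-end's capabilities (all names here are illustrative).
slu_output = {
    "intent": "find_restaurant",
    "slots": {
        "cuisine": "italian",
        "location": "near downtown",
        "time": "tonight",
    },
}

# The dialog manager (DM) then chooses the next machine action from this
# frame, e.g. querying a restaurant back-end or asking a clarifying question.
```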


Oct 20, 2014 Video Processing at YouTube
Dr. Anil Kokaram, Tech Lead, YouTube/Google

Like many streaming video enterprises, YouTube relies on a core pipeline of transcoding and video DSP which has evolved dramatically since YouTube was first established. This talk reviews some of the key aspects of the pipeline and highlights some recent developments in the treatment of partner content as well as 4K.


Oct 08, 2014 Signal Processing Applications: Expanding our World, Bringing Us Closer – A Historical Perspective
Dr. John Treichler, IEEE Fellow, Raytheon

Slides

Video

Many amazing pieces of technology have come out of Silicon Valley over the past 60 years. To some it may appear that they arrived in the marketplace full-blown and complete, but in fact almost all of them had their roots in an application for which there was no reasonable or affordable solution at the time. This presentation explores the evolutionary path followed by most of them, from very expensive, barely-working hero experiments to reliable, low-cost commercial products. This is illustrated with examples that started with national defense needs and then, with the introduction of ever-improving semiconductors and digital signal processing (DSP), turned into commonplace commercial products. From these examples a pattern becomes clear which can reasonably be expected to extend well into the future.


Sep 24, 2014 Information Theory and Signal Processing for the World’s Smallest Computational Video Camera
Dr. David G. Stork, Rambus Fellow, Rambus Labs

We describe a new class of computational optical sensors and imagers that do not rely on traditional refractive or reflective focusing but instead on special diffractive optical elements integrated with CMOS photodiode arrays. Images are not captured, as in traditional imaging systems, but rather computed from raw photodiode signals. Because such imagers forgo the use of lenses, they can be made unprecedentedly small, as small as the cross-section of a human hair. In such a computational imager, signal processing takes on much of the burden of optical processing done by optical elements in traditional cameras, and thus information-theoretic and signal processing considerations become of central importance. In fact, these new imaging systems are best understood as information channels rather than as traditional image-forming devices. As such, these systems present numerous challenges in information theory and signal processing: How does one optimize the effective electro-optical bandwidth given the constraints of optical components? What is the tradeoff between computational complexity and image quality or other metrics? What is the proper electro-optical representation and basis function set? The talk will end with a list of important research opportunities afforded by this new class of computational imager.
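
As a toy illustration of "computing rather than capturing" an image, the sketch below recovers a scene from sensor readings under the assumption that the diffractive optics act as a known linear operator A, via Tikhonov-regularized least squares. The operator here is a random stand-in; the abstract does not specify the actual Rambus reconstruction pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_scene = 256, 64                    # photodiode count, scene unknowns
A = rng.standard_normal((n_pix, n_scene))   # stand-in for the calibrated optics
x_true = rng.random(n_scene)                # unknown scene
y = A @ x_true + 0.01 * rng.standard_normal(n_pix)   # raw photodiode signals

# Tikhonov-regularized least squares: x_hat = (A^T A + lam I)^-1 A^T y
lam = 0.1
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n_scene), A.T @ y)
```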


Sep 23, 2014 The computational array camera
Dan Lelescu, Pelican Imaging
Slides

Computational cameras have become ubiquitous in the research community (though not yet in the consumer space), as various architectures are being considered for creating output features that are not possible, or very difficult, to obtain with “traditional” cameras. Computational cameras reflect our desire to capture more information about the world around us through imaging (whether in the visible spectrum or not), and to exploit that information in applications that enhance our interaction with the imaged scene. In this context, we will introduce the topic by discussing the definition and a taxonomy of computational cameras. Some examples of the trade-offs that computational cameras make to achieve their unique features will be discussed. The unifying view that computational cameras can be seen as imaging codecs will be advanced, with the “encoder” (e.g., modified optics) acting in well-defined ways on the imaged scene signal, and the attendant tightly-coupled digital processing being the “decoder” in charge of producing the output features of interest. This concept will be exemplified using various kinds of computational cameras. In this general framework, we will then follow with a review of a computational array camera developed by Pelican Imaging, Inc. Some of the capabilities of this array camera will be illustrated.


Sep 18, 2014 Control of Multi-Robot Systems: From Formations to Human-Swarm Interactions
Distinguished Lecturer: Prof. Magnus Egerstedt, Georgia Institute of Technology

The last few years have seen significant progress in our understanding of how one should structure multi-robot systems. New control, coordination, and communication strategies have emerged and, in this talk, we discuss some of these developments. In particular, we will show how one can go from global, geometric, team-level specifications to local coordination rules for achieving and maintaining formations, area coverage, and swarming behaviors. One aspect of this concerns how users can interact with networks of mobile robots in order to inject new, global information and objectives. We will also investigate what global objectives are fundamentally implementable in a distributed manner on a collection of spatially distributed and locally interacting agents.
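
The local coordination rules mentioned above can be surprisingly simple. Below is a minimal sketch of the linear consensus protocol on an illustrative four-agent graph: each agent moves toward its neighbors, and the team converges to agreement; adding fixed offsets to the same rule yields formations instead of rendezvous.

```python
import numpy as np

# Undirected line graph on four agents: 0 - 1 - 2 - 3
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = np.array([0.0, 1.0, 5.0, 9.0])          # scalar agent states (e.g., positions)
dt = 0.1

for _ in range(200):
    dx = np.zeros_like(x)
    for i, nbrs in neighbors.items():
        # consensus rule: x_i' = sum over neighbors j of (x_j - x_i)
        dx[i] = sum(x[j] - x[i] for j in nbrs)
    x += dt * dx

# x now approximates the average of the initial states (rendezvous).
```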


Sep 10, 2014 Breast CT Scanner Imaging Advancement and Evolution at UC Davis
Prof. John M. Boone, University of California Davis Medical Center, Sacramento
Slides

The breast tomography project began at the University of California Davis around the year 2000. Since then, four prototype breast CT scanners have been designed, fabricated, integrated, and tested clinically. In this process, over 600 women have had breast CT scans on these platforms. The breast CT systems at UC Davis made use of existing hardware such as the x-ray tube and generator and the motor/encoder systems, but all other aspects were designed in-house, fabricated at a local machine shop under contract, and assembled and integrated in our laboratory. In this presentation, an overview of the hardware evolution will be presented, outlining the improvements in image quality through each prototype scanner. The image processing requirements of the breast CT system will also be discussed, including image correction methods, image preprocessing (prior to reconstruction) for scatter correction and Hounsfield Unit calibration, cone beam reconstruction, and image analysis methods including post-reconstruction flat fielding and the assessment of quantum and anatomical noise of the breast.


Aug 14, 2014 Bayesian Methods for Sparse Signal Recovery and Compressed Sensing
IEEE SPS Distinguished Lecturer: Dr. Bhaskar D. Rao, Ericsson Endowed Chair and Professor, University of California, San Diego
Slides

Video

Compressive sensing (CS) as an approach for data acquisition has recently received much attention. In CS, the signal recovery problem from the observed data requires the solution of a sparse vector from an underdetermined system of equations. The underlying sparse signal recovery problem is quite general with many applications and is the focus of this talk. The main emphasis will be on Bayesian approaches for sparse signal recovery. We will examine sparse priors such as the super-Gaussian and Student-t priors and appropriate MAP estimation methods. In particular, re-weighted l2 and re-weighted l1 methods developed to solve the optimization problem will be discussed. The talk will also examine a hierarchical Bayesian framework and then study in detail an empirical Bayesian method, the Sparse Bayesian Learning (SBL) method. If time permits, we will also discuss Bayesian methods for sparse recovery problems with structure: intra-vector correlation in the context of the block sparse model, and inter-vector correlation in the context of the multiple measurement vector problem.
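
As one concrete instance of the re-weighted l2 family mentioned above, here is a FOCUSS-style iteration that recovers a sparse vector from an underdetermined system by solving a sequence of weighted minimum-norm problems. This is an illustrative sketch, not the SBL algorithm of the talk.

```python
import numpy as np

def reweighted_l2(A, y, iters=30, eps=1e-8):
    """FOCUSS-style sparse recovery via re-weighted minimum-norm solutions."""
    m, n = A.shape
    x = np.ones(n)
    for _ in range(iters):
        W = np.diag(np.abs(x) + eps)        # weights from the previous estimate
        # minimum weighted-norm solution: x = W A^T (A W A^T)^-1 y
        x = W @ A.T @ np.linalg.solve(A @ W @ A.T + eps * np.eye(m), y)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 50))           # underdetermined: 20 equations, 50 unknowns
x_true = np.zeros(50)
x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]      # sparse ground truth
x_hat = reweighted_l2(A, A @ x_true)        # concentrates on the true support
```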


Aug 06, 2014 IEEE Tutorial on LDPC Decoding, VLSI Architectures and Implementations
Dr. Ned Varnica [Marvell Semiconductor], Dr. Kiran Gunnam [Violin Memory]

Module1

Module2

LDPC codes are now being used in hard disk drive read channels, wireless (IEEE 802.11n/IEEE 802.11ac, IEEE 802.16e WiMAX), 10-GB, DVB-S2, and, more recently, in Flash SSDs. The tutorial’s target audience is system engineers and design engineers. The tutorial has two parts: the first module is focused on LDPC decoding, and the second module is focused on VLSI architectures and implementations. IEEE standard draft LDPC codes for Flash memories will also be covered.
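
To make the decoding loop concrete, here is a toy hard-decision bit-flipping decoder for a tiny parity-check code. Production decoders in the applications above use soft-decision sum-product or min-sum message passing with the pipelined VLSI architectures covered in the second module; this sketch only illustrates the syndrome-driven iteration.

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],           # toy parity-check matrix
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])

def bit_flip_decode(H, r, max_iters=20):
    r = r.copy()
    for _ in range(max_iters):
        syndrome = (H @ r) % 2              # which parity checks fail
        if not syndrome.any():
            return r                        # all checks satisfied: done
        votes = H.T @ syndrome              # failing checks each bit touches
        r[votes == votes.max()] ^= 1        # flip the most-suspect bit(s)
    return r

received = np.array([0, 1, 1, 0, 1, 0])     # hard decisions from the channel
decoded = bit_flip_decode(H, received)
```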


Jul 09, 2014 Digital Signal Processing: Core Differentiation in Early Stage Companies
Dr. Steve Goldberg, Partner, Venrock

After the bursting of the first tech bubble of 2000, the world of early-stage venture capital made a strong move away from most things that were capital intensive or those projects that appeared to require scientific breakthroughs to succeed. Additionally, part of the recent decline in Cleantech investing was attributable to perceived excessive science and/or engineering risk. It appeared, for a time, that science and engineering might be playing a declining role in value creation in early-stage technology companies. The good news is that signal processing, algorithm development, and embedded systems have roared back in importance in early-stage venture investing. New markets have burst onto the scene, including big data, data analytics, UAVs, robotics, augmented reality, gesture recognition, and nanosatellites, among others, that not only depend on signal processing but often make it the core product differentiation. This talk will review the state of the industry of early-stage technology investing as it relates to signal processing, algorithm development, and embedded systems.


May 14, 2014 Production & Post-Production Video Compression Standards Delivering Awesome Images for Television & Digital Cinema
Edward Reuss, Independent consultant
Slides

Video compression standards for production and post-production have different requirements than compression standards for consumer applications. Consumer compression standards, such as H.264 and H.265, emphasize very high compression ratios and low-cost decoder solutions in order to satisfy millions of users with relatively small storage and transmission capabilities. Production and post-production systems emphasize image quality and ease of editing. This presentation contrasts these different requirements and presents several video compression standards designed for television and digital cinema professional content management workflows. An analysis of the image processing algorithms used by each of these, along with some of the non-technical constraints (Intellectual property, marketing, etc.), demonstrates their relative advantages and disadvantages.


Mar 17, 2014 DSP – Whence It Came and Where It is Going; A Tour for Non-Specialists
Shiv Balakrishnan, Mobility Semiconductors

This is a review of the field of Digital Signal Processing (DSP) intended for those who do not necessarily use DSP on a daily basis. We look at the key drivers for the field, such as the FFT and digital filters, and show how the evolution of these techniques served growing numbers of important application areas such as communications and multimedia. The rise of programmable Digital Signal Processors (also known, somewhat confusingly, as DSPs) is chronicled, along with the differences between fixed-point and floating-point implementations. The impact of DSP on general-purpose compute architectures is described, along with the growth in hardware implementations in both IC technology and FPGAs. We also look at DSP as a market and see how that view has significantly fragmented by application area in recent times. The impact of DSP on analog design is touched upon, as are a number of newer application areas for DSP technology.
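
To make the two key drivers concrete for non-specialists, the sketch below applies the same FIR lowpass filter two equivalent ways: by direct time-domain convolution and by multiplication in the frequency domain via the FFT.

```python
import numpy as np

fs = 1000                                    # sample rate, Hz
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

h = np.ones(21) / 21                         # 21-tap moving-average lowpass

y_time = np.convolve(x, h)                   # digital filter: direct convolution

n = len(x) + len(h) - 1                      # FFT route: transform, multiply, invert
y_freq = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

assert np.allclose(y_time, y_freq)           # identical output, different cost
```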


Jul 23, 2012 Defying Nyquist in Analog to Digital Conversion
IEEE Distinguished Lecturer: Dr. Yonina Eldar, Dept. of Electrical Engineering, Technion — Israel Institute of Technology

The famous Shannon-Nyquist theorem has become a landmark in the development of digital signal processing. However, in many modern applications, the signal bandwidths have increased tremendously, while the acquisition capabilities have not scaled sufficiently fast. Consequently, conversion to digital has become a serious bottleneck. In this talk a new framework for sampling wideband analog signals at rates far below that dictated by the Nyquist rate will be presented. The focus will be both on the theoretical developments, as well as on actual hardware implementations and considerations that allow realization of sub-Nyquist samplers in practice. Applications to a variety of different problems in communications, bioimaging, and signal processing will also be described.


Dec 13, 2010 Make3D: Learning 3D Models from a Single Still Image
Ashutosh Saxena, Assistant Professor, Computer Science Department, Cornell University

We consider the problem of converting a standard 2D digital picture into a 3D model. This is a challenging problem, since an image is formed by a projection of the 3D scene onto two dimensions, thus losing the depth information. We take a supervised learning approach to this problem, and model the scene depth as a function of the image features. We show that, even on unstructured scenes of a large variety of environments, our algorithm is frequently able to recover accurate 3D models. (See https://make3d.cs.cornell.edu ) We then look at the problem of combining our cues such as stereo (for stereo cameras) or parallax (for videos) for producing 3D views from 2D views.

The last few decades have seen great progress in tackling each of these problems in isolation. Only recently have researchers returned to the difficult task of considering them jointly. We consider learning a set of related models in such a way that they both solve their own problem and help each other. Our method requires only a limited black-box interface with the models, allowing us to use very sophisticated, state-of-the-art classifiers without having to look under the hood. We then apply our methods to robotics applications: (a) vision-based navigation: obstacle avoidance for autonomously driving a small electric car and autonomously flying aerial vehicles, and (b) personal robots, where we develop learning algorithms for robot manipulation. This enables our robot to perform tasks such as opening new doors, clearing up cluttered tables, and unloading items from a dishwasher.


Nov 08, 2010 Making 3D Printing Ideas Real: A Demo and Talk
J. R. Warmkessel, Founder, Bay Area MakerBot Users Group

The 3D printer was first designed for rapid prototyping in the commercial arena. With advances in technology and a significant drop in price, this technology is now available to enthusiasts and hobbyists. The RepRap project is an open source 3D printer designed for at-home use. A RepRap 3D printer heats ABS or PLA plastic and extrudes it in thin layers to build the final product. There are multiple implementations of the RepRap project, with the best-known version being the MakerBot CupCake printer. The CupCake printer is a complete kit that includes everything a user needs to build and use the printer at home. 3D printers use the standard CNC (Computer Numerical Control) phases to design and print products. CNC has three phases: CAD, CAM, and Cut. CAD, or Computer Aided Design, is the process by which parts and assemblies are designed. The final product of this phase is a software file, often STL (Stereo Lithography). Common CAD software includes SolidWorks, Rhino, and Google SketchUp. CAM, or Computer Aided Machining, is the process of converting the STL file into a set of sequential commands that serve as instructions for the printer (called G-code). The MakerBot uses the Skeinforge CAM software. Cut is the process by which the machine (in this case, the 3D printer) creates the final product through execution of the G-code. The MakerBot uses the ReplicatorG Cut software. Discussion will include the benefits of 3D printing and of using the RepRap project for 3D printing; the pitfalls, problems, and limits of 3D printing; and, finally, the future of 3D printing and the RepRap project.
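
To make the CAM-to-Cut hand-off concrete, here is a toy generator that emits G-code for a few square perimeter layers. Real CAM tools such as Skeinforge slice the STL into many such layers and add extrusion control; the commands below are standard G-code, but the coordinates, feed rate, and temperature are illustrative only.

```python
def square_layer(z, size=20.0, feed=1200):
    """G-code for one square perimeter at height z (toy example, no extrusion)."""
    lines = [f"G1 Z{z:.2f} F{feed}"]         # move the nozzle up to this layer
    corners = [(0, 0), (size, 0), (size, size), (0, size), (0, 0)]
    for x, y in corners:
        lines.append(f"G1 X{x:.2f} Y{y:.2f} F{feed}")
    return lines

gcode = ["G21        ; units = millimeters",
         "G90        ; absolute positioning",
         "M104 S220  ; set extruder temperature (illustrative value)"]
for layer in range(3):
    gcode += square_layer(z=0.3 * (layer + 1))
print("\n".join(gcode))
```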


Oct 21, 2010 Tutorial on “Status of knowledge on non-binary LDPC decoders”
Prof. David Declercq (ENSEA in Cergy-Pontoise)

In this tutorial, the iterative decoding techniques for non-binary LDPC codes will be presented, both from the theoretical aspects of Belief Propagation and its analysis and from the more practical aspects of efficient implementation. In the first part, the main differences between iterative BP decoding of binary and non-binary LDPC codes will be highlighted. Then, in the second part, the solutions proposed in the literature to reduce the complexity of non-binary decoders, for both memory storage and computational burden reduction, will be presented. Some directions of research and development on non-binary decoders will be discussed. Finally, the outstanding advantages of generalized non-binary decoders on clustered graphical models of several error-correcting codes will be presented.


Sep 13, 2010 Principles of Canesta CMOS 3D Time of Flight Systems
Cyrus Bamji, Co-founder and CTO of Canesta

Canesta’s 3D electronic perception technology allows electronic devices to perceive their environment by providing accurate 3D information about the world around them. Canesta’s 3D time of flight sensor is a massively parallel LIDAR on a single CMOS chip. When coupled to a light source such as an LED, laser, or VCSEL, each 3D pixel on the chip provides independent distance information to objects in the field of view of the device. The basic operating principle of Canesta’s CMOS phase-based pixel technology is described. An analysis of the factors influencing sensor performance is presented, and an explanation of how to build a time of flight system that meets the requirements of a given application is provided. Applications for 3D time of flight systems are described and advantages of these systems over competing methods are discussed. Canesta’s latest high resolution 3D camera will be demonstrated.
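
The core arithmetic of a phase-based time-of-flight pixel is simple: the phase shift between the emitted and received modulated light maps to distance. A minimal sketch, assuming the phase has already been estimated (real pixels derive it from several integrated samples per modulation period):

```python
import math

C = 299_792_458.0                            # speed of light, m/s

def phase_to_distance(phase_rad, f_mod_hz):
    """Distance from measured phase; the factor of 2 accounts for the round trip."""
    return C * phase_rad / (4 * math.pi * f_mod_hz)

# Example: a quarter-cycle phase shift at 20 MHz modulation is about 1.87 m.
d = phase_to_distance(math.pi / 2, 20e6)
```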


Jun 14, 2010 Next Generation 3D Television – Demo at HDI-US Inc
Ingemar Jansson, CEO of HDI and Edmund Sandburg, CTO of HDI

We are in the early days of 3D television and 3D movie proliferation. While there are many competing technologies for 3D capture and rendering, it is the consumer home where the winners will be voted in or out. Almost all the TV makers are releasing 3D television models with various degrees of technical sophistication, price points and viewing comfort. In this June meeting, we will explore the question of ultimate 3D viewing comfort.

HDI Ltd. is a research and design firm that has perfected laser-driven 3D projection display technology with greater than high-definition resolution. Among the first products to emerge after more than three years of intensive R&D, HDI Ltd.’s laser-driven 100-inch diagonal 2D/3D Switchable Dynamic Video Projection Display delivers a stunningly superior 2D image, with a 50% greater resolution than today’s digital cinemas, and derives its greater-than-high definition stereoscopic 1920 x 1080p 3D image quality from two RGB laser-illuminated Liquid Crystal on Silicon (LCOS) micro display imagers. At full 1080p HD, the HDI Ltd. screen refreshes at 360 fields per-second on each eye, the fastest refresh rate on any mass produced television or projector. HDI’s 2D/3D Switchable Dynamic Video Projection Display, at a mere 10-inches thick, draws 80% less power than existing 2D flat screen plasma monitors of the same size, and HDI projection displays are anticipated to have a street price potentially 60% less than current 2D flat screen plasma displays. HDI Ltd. has completely eliminated the adverse effects, such as migraines, dizziness, and nausea, long associated with substandard 3D display technology. HDI Ltd. delivers the most immersive, comfortable, and natural 3D viewing experience in the world with low-cost and lightweight proprietary polarized glasses. Steve Wozniak, co-founder of Apple Computers, calls HDI Ltd., “Without a doubt, the best demonstration of 3D technology I have ever seen.” Technology journalist Richard Hart states, “The smoothest yet, and smoothness means no headaches.” And Sean Portnoy of ZDNet.com, wrote, “We could be looking at a Holy Grail of sorts for the next generation of television.”


May 10, 2010 Electronic Tagging and Managing Congestion in Electric Power Transmission Systems
Farrokh Albuyeh, OATI

Beginning in 1996, with the Federal Energy Regulatory Commission’s (FERC) orders 888 and 889, the electric power industry began the task of unbundling generation of electricity from transmission, and Independent System Operators were created to manage the operation of the transmission grid to provide equal access to all generation facilities and power producers. At the same time, an Open Access Same-Time Information System (OASIS) was created to help with the management and reservation of transmission facilities to accommodate energy transactions across the North American interconnected transmission system. Soon it became apparent that the resulting increase in energy transactions was exercising the transmission facilities in ways not seen before, with Control Area operators observing congestion in their transmission systems without knowing the source or the cause. The main problem arose from the fact that transmission reservations were made based on agreed-upon Contract Paths, which had little or no correlation with the flow of power on Physical Paths as dictated by the laws of physics. To deal with this problem, an electronic tagging, or e-tagging, system was designed and implemented to provide Control Area operators with information on all energy transactions that flowed through their systems. In addition, an Interchange Distribution Calculation (IDC) system was developed to calculate the physical impact of individual transactions on the power grid and transmission equipment for use in support of Transmission Loading Relief (TLR) procedures that were developed to mitigate congestion in transmission systems. Over the years, e-tagging has evolved to serve as a mechanism for the capture and management of all data related to energy transactions, not only to support congestion management but also to support tracking and management of Renewable Portfolio Standard (RPS) compliance and Greenhouse Gas (GHG) emission monitoring and tracking, among other things. In this presentation, we will provide an overview of the evolution and functionality of e-tagging and will discuss the use of e-tagging and IDC in managing transmission grid congestion in North America.


Apr 12, 2010 Pervasive Learning – Diagnosis and Management of Production Systems
Julia Liu, Palo Alto Research Center

Modern production systems are optimized for productivity, but component faults or deterioration often occur in practice. Diagnosis and health management of a production system are thus important. They can be formulated as a statistical inference problem, where observations are obtained to update the knowledge regarding component conditions. Prior work often halts the production system and switches to a troubleshooting/learning mode to gather observations; other work uses passive observations during production. In this talk, we introduce the novel paradigm of pervasive learning, which constructs informative production plans that simultaneously achieve production goals while uncovering additional information. We show two concrete examples of pervasive learning (diagnosis of single-fault systems, and continuous model adaptation) and explain the information criteria that are used to select production plans.


Mar 15, 2010 An overview of various physical and MAC layer research and developmental issues in 4G cellular systems
Joydeep Acharya, Wireless Systems Research Lab (WSRL), Hitachi America Ltd
Slides

Cellular networks are experiencing rapid growth in terms of services provided and customer base. To enable new features such as faster data rates and better support for video and real-time applications, fundamental changes are needed in the underlying radio-level technology. This talk highlights the key features of some of these technologies. It also explains how multiple companies, after doing R&D, meet and discuss their results, which ultimately leads to the evolution of a generation of cellular systems.


Feb 10, 2010 Developer Opportunities with CLEAR WiMAX 4G
David Rees and Allen Flanagan, Clearwire

Mobile WiMAX as provided by Clearwire (the CLEAR network) is also available from Sprint, Comcast, and Time Warner, under a reseller/ MVNO agreement in the United States. These 4G wireless networks represent an opportunity for a new generation of products and services that take full advantage of mobile broadband capabilities. Clearwire understands that the true value of 4G WiMAX will be realized through new products and services, and with this in mind is focused on enabling 3rd party applications, services and devices for the CLEAR network. At what promises to be a very informative meeting, Clearwire’s David Rees and Allen Flanagan will provide an overview of the CLEAR network, nation-wide rollout, and device plans. They will provide details on CLEAR’s Innovation Network program and how Silicon Valley developers can get discounted aircards and free 4G network access through 2010 to develop and test their 4G ideas. Dave and Allen will also walk through the available and planned platform and device services that CLEAR will be providing including location, network session information, connection management, activation, and provisioning. Some of the use cases of these services will be described, including proactive video optimization and location-enabled services. A lively panel session will follow these presentations. The speakers and moderator will address the role of the application developer in making mobile WiMAX and CLEAR successful. Clearwire’s future plans for enabling 3rd party applications and devices will be discussed along with competition from 3G and 3.5G networks. Looking further out, we’ll get Clearwire’s views on IEEE 802.16m (WiMAX 2.0) vs. LTE.


Jan 11, 2010 Distributed Systems Health Management
Sankalita Saha

As we move towards increasingly autonomous machines, it is important to understand how machines fail and whether catastrophic failures can be prevented or mitigated. An important approach to tackling this problem is condition-based maintenance (CBM), where various components, sub-systems, and, hierarchically, the whole system are monitored to detect and diagnose faults, and maintenance is scheduled accordingly. A key facilitating technology for this approach is prognostics, which refers to the determination of the remaining useful life (RUL) of a component or a system once a fault has been detected and diagnosed. This concept is gradually gaining importance for efficient health management of systems ranging from simple machines like gearboxes, to complex components like power electronics, to large-scale engineered systems like automobiles, aircraft, and spacecraft. The field of prognostics and health management (PHM) comprises multiple complex problems, each replete with daunting challenges. However, recent advances in sensor technologies have given us a handle on tackling some of these problems satisfactorily. In this talk, an overview of prognostics and health management will be presented first, after which a distributed wireless sensor network based architecture for such health management systems will be discussed.


Dec 11, 2009 Next-generation mobile WiMAX (802.16m) Update
Dr. Jong-Kae (J.K.) Fwu, Intel Corporation
Slides

To meet the tremendous demand and growth for mobile Internet and wireless multimedia applications, the IEEE 802.16 Working Group Task Group m (TGm) has been developing a next-generation mobile WiMAX system (4G) since early 2007. The next-generation mobile WiMAX system, as a new amendment of the IEEE 802.16 standard (i.e., IEEE 802.16m), will provide enhancements including higher throughput/mobility, higher user capacity, and lower latency while maintaining full backward compatibility with the existing mobile WiMAX systems (802.16e). This presentation provides a high-level tutorial on the prominent technical features and design of IEEE 802.16m and the ongoing technologies in the evolution toward the next-generation WiMAX network. It focuses on the overall 16m system, with particular emphasis on PHY-related perspectives.


Nov 09, 2009 Digital Compensation of Dynamic Acquisition Errors at the Front-End of High-Performance A/D Converters
Dr. Parastoo Nikaeen, Netlogic Microsystems

In A/D converter applications such as wireless base stations, IF sub-sampling is an attractive method for minimizing component count and system cost. By applying this method, one or more steps of down-conversion are removed from the receiver path and some of the analog front-end signal processing functions can be moved to the digital domain. In such a solution, the ADC’s linearity at high input frequencies becomes a critical issue. Despite the use of a dedicated track-and-hold amplifier (THA), nonlinearities in the circuit’s input network often introduce dynamic errors that limit the performance of the ADC at high input frequencies.

In this talk, I will present a digital enhancement scheme that is specifically tailored to remove high frequency distortion caused by the dynamic nonlinearities at the sampling front-end of ADCs. The basic concept of digital compensation here is to apply the inverse nonlinear function to the digital output of the ADC in order to minimize its error over the desired frequency range. Conceptually, a nonlinear system with memory can be modeled with a Volterra series. However, the inverse Volterra series becomes very complex as the order of nonlinearity and memory in the system increases and it requires intensive computational power that is impractical even in today’s fine-line technology. Our proposed algorithm uses information about sources of nonlinearities and judicious modeling to simplify the digital post processing scheme. Applying the method to a 14-bit, 155-MS/s ADC provides > 83 dB SFDR up to f_in = 470 MHz. The post-processing block is estimated to consume 52 mW and occupy 0.54 mm^2 in 90-nm CMOS.
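
As a sketch of the general idea, not of the specific simplified model in the talk, the snippet below post-corrects ADC samples with a small memory polynomial, a commonly used pruned alternative to a full inverse Volterra series. The coefficients would be fit from calibration data; the values here are placeholders.

```python
import numpy as np

def memory_poly_correct(y, coeffs):
    """x_hat[n] = y[n] + sum over delays q and orders p of c[q][p] * y[n-q]**p."""
    x_hat = y.copy()
    for q, per_delay in enumerate(coeffs):    # q indexes the memory depth
        yq = np.roll(y, q)
        yq[:q] = 0.0                          # delayed copy of the ADC output
        for p, c in per_delay.items():        # p indexes the nonlinearity order
            x_hat += c * yq ** p
    return x_hat

y = np.sin(2 * np.pi * 0.013 * np.arange(4096))   # stand-in for ADC samples
coeffs = [{2: -1e-3, 3: 2e-4},                    # q = 0 terms (placeholder values)
          {2: 5e-4}]                              # q = 1 terms (placeholder values)
x_hat = memory_poly_correct(y, coeffs)
```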


Oct 12, 2009 Detection of Information Flow and Anonymous Networking
IEEE SPS Distinguished Lecturer: Prof. Lang Tong, School of Electrical and Computer Engineering, Cornell University

In a wireless network, transmission activities can be easily monitored using simple devices. Given the record of transmissions from a set of nodes, one may be able to ascertain whether these nodes are engaged in some networking operations. While the content of a wireless transmission can be protected by cryptographic techniques, the acts of transmission may reveal critical information about network operations such as routing and multicasting. In this talk, we consider two related problems. The first is the problem of flow detection: given observations from a set of traffic sensors, to what extent can the presence of an information flow be detected? We present results on the fundamental limit of detectability. The second problem is anonymous networking: to what extent can we hide an information flow? Here we use information-theoretic measures to characterize the tradeoff between anonymity and network throughput.


Sep 21, 2009 Low-Voltage Oversampling Analog-to-Digital Conversion
Dr. Bruce A. Wooley (Robert L. and Audrey S. Hancock Professor of Engineering, Dept. of E.E.) Stanford University

Through the exchange of resolution in time for that in amplitude, oversampling methods are now widely used to enable the realization of high-resolution analog-to-digital converters in scaled CMOS VLSI technologies. So-called oversampling modulators combine coarse quantization at sampling rates well above the Nyquist rate with feedback and subsequent digital signal processing to avoid the need for precision analog circuits. Such modulators were originally conceived in the mid-twentieth century in the form of delta modulators, which digitize the rate of change of a signal rather than the signal itself. However, noise-shaping modulators that directly encode the signal proved to be a more robust approach and have subsequently come into widespread use. In particular, cascades of inherently stable sigma-delta (or, equivalently, delta-sigma) modulators are an efficient means of extending the dynamic range of oversampling converters while remaining largely immune to both analog circuit imperfections and fundamental stability concerns. This presentation begins with an overview of both the architectural and circuit issues associated with the design of noise-shaping modulators, and then presents examples of some approaches to their implementation under increasingly severe constraints on power dissipation and supply voltage.
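
A minimal sketch of the core idea, a first-order sigma-delta modulator: a 1-bit quantizer inside a feedback loop around an integrator, run well above the Nyquist rate, whose coarse bitstream averages to the input. The crude decimator at the end stands in for the digital filtering mentioned above.

```python
import numpy as np

def sigma_delta(x):
    """First-order sigma-delta modulator producing a +/-1 bitstream."""
    integ, fb = 0.0, 0.0
    out = np.empty_like(x)
    for n, xn in enumerate(x):
        integ += xn - fb                     # integrate input minus feedback
        fb = out[n] = 1.0 if integ >= 0 else -1.0   # 1-bit quantizer
    return out

osr = 64                                     # oversampling ratio
t = np.arange(8192)
x = 0.5 * np.sin(2 * np.pi * t / 4096)       # slow input, heavily oversampled
bits = sigma_delta(x)

# Crude decimation: average each block of osr bits into one output sample.
decoded = bits[: bits.size // osr * osr].reshape(-1, osr).mean(axis=1)
```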


Sep 17, 2009 Monitoring video quality inside a network
SPS Distinguished Lecturer: Dr. Amy Reibman, AT&T Labs
Slides

As broadband access connectivity becomes more prevalent, more users are streaming video over the Internet, or watching video that has been transmitted over a network. However, the best-effort service model and shared infrastructure of most networks means that network impairments (such as delays, jitter, congestion, and loss) may affect viewing experiences. Network service providers are increasingly interested in measuring the quality of the video that is provided on their network. This can aid in monitoring compliance of service-level agreements (SLAs) between Internet Service Providers (ISPs), hosting centers, and content providers; alert operators to potential performance problems; and help in root-cause analysis and debugging. We consider the problem of evaluating the quality of transported, compressed video from the perspective of a network service provider. Traditional video quality metrics require original and decoded pixels to be available. However, neither are easily available inside the network. Therefore, we have developed no-reference techniques that estimate visual quality, relying only on (potentially lossy) bitstreams available inside the network. In this talk, we present an overview of the problem with measuring video quality in the network and present two quality metrics: one for broadcast MPEG-2 video and the other for streaming video over the Internet.


Jun 22, 2009 Adaptive Learning in a World of Projections
Prof. Sergios Theodoridis, Technical University of Athens, Greece
Slides

The task of parameter/function estimation has been at the center of scientific attention for a long time, and it comes under different names such as filtering, prediction, beamforming, curve fitting, classification, and regression. In this talk, the estimation task is treated in the context of set-theoretic estimation arguments. Instead of a single optimal point, we search for a set of solutions that are in agreement with the available information, which is provided to us in the form of a set of training points and a set of constraints. The goal of this talk is to present a general tool for parameter/function estimation, under a set of convex constraints, for both classification and regression tasks, in a time-adaptive setting in (infinite-dimensional) Reproducing Kernel Hilbert Spaces (RKHS). Each algorithm consists of a sequence of projections, of linear complexity with respect to the number of unknown parameters. Our theory proves that the algorithm converges to the intersection of all (with the possible exception of a finite number of) the convex sets, where the required solution lies. The work has been carried out in cooperation with Kostas Slavakis and Isao Yamada.
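
A minimal finite-dimensional sketch of the set-theoretic idea, assuming a linear model: each training point (x_k, d_k) defines a convex set (a hyperslab) of parameter vectors consistent with it, and cyclically projecting onto these sets drives the estimate into their intersection. The RKHS machinery and convergence analysis of the talk are not captured here.

```python
import numpy as np

def project_hyperslab(w, x, d, eps):
    """Project w onto the convex set {v : |v . x - d| <= eps}."""
    r = w @ x - d
    if abs(r) <= eps:
        return w                             # already consistent with this point
    return w - ((r - np.sign(r) * eps) / (x @ x)) * x

rng = np.random.default_rng(2)
w_true = rng.standard_normal(5)
X = rng.standard_normal((200, 5))            # training inputs
d = X @ w_true + 0.01 * rng.standard_normal(200)   # noisy targets

w = np.zeros(5)
for _ in range(20):                          # cyclic sweeps over all the sets
    for xk, dk in zip(X, d):
        w = project_hyperslab(w, xk, dk, eps=0.05)
# w now lies (approximately) in the intersection, close to w_true.
```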


Mar 25, 2009 WiMAX Update: IEEE 802.16m and WiMAX future
Hassan Yaghoobi, Intel Corporation
Slides

The purpose of this talk is to provide an update on WiMAX technology. The presentation provides a brief update on productization from the WiMAX Forum and a standardization update on IEEE 802.16m/Rev2 and IMT-Advanced. On the WiMAX Forum side, an update on Release 1.5 and FDD enablement will be provided. On the standardization side, an update on the status of IEEE 802.16m is provided through coverage of the System Requirements, System Description, Evaluation Methodology, and Standard Amendment projects. As IEEE 802.16m is a candidate for IMT-Advanced, a brief section of the presentation is dedicated to an update on IMT-Advanced and the status of the submission of 802.16m and WiMAX Forum Release 2.0 to the ITU.


Feb 09, 2009 The Scalable Communication Core: A Multi-Core Reconfigurable Wireless Baseband Prototype
Dr. Anthony Chun, Intel Corporation

Over the past few years, there has been a rapid increase in the number of wireless standards that have been deployed around the world. To address the consumer demand for ubiquitous communications and computing with low power and area, we have developed the Scalable Communications Core (SCC), a flexible baseband processor for wireless protocols that consists of a heterogeneous set of coarse-grained, programmable accelerators connected via a packet-based 3-ary 2-cube Network-on-Chip (NoC). We will present an overview of the architecture, describe how protocols are mapped to the architecture, and discuss a recently completed prototype test chip that was taped out in a 65nm process and validated for WiFi and WiMAX protocols.


Feb 07, 2009 SPS SCV Workshop on FPGAs for Digital Signal Processing Applications
Dr. Craig Stephens, Dr. Sami Khuri, Dr. Byung-Jun Yoon, and Dr. Ru-Fang Yeh

Jan 12, 2009 Dynamic Graphs
Prof. Dragoslav Siljak, Santa Clara University

Dynamic graphs are defined in a linear space as a one-parameter group of transformations of the graph space into itself. Stability of equilibrium graphs is formulated in the sense of Lyapunov to study motions of positive graphs in the nonnegative orthant of the graph space. Relying on the isomorphism of graphs and adjacency matrices, a new concept of dynamic connective stability of complex systems is introduced. A dynamic interaction coordinator is added to a complex interconnected system to ensure that the desired level of interconnections between subsystems is preserved as a connectively stable equilibrium of the overall system despite uncertain structural perturbations. It is shown how the coordinator can be designed to adaptively adjust the interconnection levels in order to assign a prescribed state of the complex multi-agent system as a stable equilibrium point. The equilibrium assignment is achieved by the action of the coordinator, which solves an optimization problem involving a two-time-scale system; the coordinator action is slow compared to the fast dynamics of the agents. Polytopic connective stability of the multi-agent systems with a coordinator is established by the concept of vector Lyapunov functions and the theory of M-matrices.


Dec 08, 2008 Exploiting Real World Channels for Increased Capacity
Ernest Tsui and Xiaoshu Qian
Slides

Directional channel models are becoming the accepted channel models for an increasing number of standards that require MIMO processing. We will cover exactly what these directional channels are, how they are modeled, and how they relate to reality. We will then discuss the impact of the models on MIMO performance and link budgets.
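
To make the performance impact concrete, here is a hedged sketch (not the speakers' models) that estimates ergodic MIMO capacity under the simple Kronecker correlation model, a common stand-in for directional channels; correlation values and antenna counts are illustrative:

    import numpy as np

    def ergodic_capacity(nt, nr, snr_db, Rt, Rr, trials=3000, seed=1):
        # Mean log2 det(I + SNR/nt * H H^H) under the Kronecker model
        # H = Rr^(1/2) Hw Rt^(1/2), with Hw i.i.d. complex Gaussian.
        rng = np.random.default_rng(seed)
        snr = 10.0 ** (snr_db / 10.0)
        Lt, Lr = np.linalg.cholesky(Rt), np.linalg.cholesky(Rr)
        total = 0.0
        for _ in range(trials):
            Hw = (rng.normal(size=(nr, nt))
                  + 1j * rng.normal(size=(nr, nt))) / np.sqrt(2)
            H = Lr @ Hw @ Lt.conj().T
            M = np.eye(nr) + (snr / nt) * H @ H.conj().T
            total += np.linalg.slogdet(M)[1] / np.log(2)
        return total / trials

    def exp_corr(n, rho):
        # Exponential correlation: small angular spread ~ large rho.
        return rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

    # Rich scattering vs. a highly directional (correlated) channel:
    print(ergodic_capacity(4, 4, 10, exp_corr(4, 0.0), exp_corr(4, 0.0)))
    print(ergodic_capacity(4, 4, 10, exp_corr(4, 0.9), exp_corr(4, 0.9)))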


Nov 03, 2008 Design Techniques and CMOS Implementation of Low Noise Amplifier (LNA)
Prof S. S. Jamuar

The rapid growth of portable RF communication systems across various standards has led to demand for a single chip that covers several standards, such as WCDMA, WLAN, and GSM. This imposes stringent requirements on the RF front-end, which must cover a large range of carrier frequencies for all standards. A receiver system consists of the following circuits: a low noise amplifier, a mixer, a voltage-controlled oscillator (VCO), an intermediate frequency (IF) amplifier, and filters. The low noise amplifier (LNA) is typically the first active stage of the RF front-end. Its main function is to amplify weak signals without adding noise, thus preserving the signal-to-noise ratio (SNR) of the system at low power consumption. Many tradeoffs are involved in designing the LNA, such as noise figure (NF), linearity, gain, impedance matching, and power dissipation. Proper LNA design considerations and techniques are therefore crucial in today’s communications technology. This lecture places an emphasis on improved design techniques for the LNA. DC biasing techniques, impedance matching techniques, noise matching, and stability analysis will be discussed. Voltage-mode and current-mode design techniques will be elaborated. Variable-gain low noise amplifier design techniques will also be discussed. All the design techniques and simulations presented in the tutorial will be based on EDA tools.
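
Why the LNA sits first in the chain follows directly from the Friis cascade formula: noise contributed by later stages is divided by the gain placed in front of them. A small sketch with illustrative stage values (not from the talk):

    import numpy as np

    def cascaded_nf_db(stages):
        # Friis formula: F = F1 + (F2-1)/G1 + (F3-1)/(G1*G2) + ...
        # stages: list of (noise_figure_dB, gain_dB), front to back.
        f_total, g_running = 0.0, 1.0
        for i, (nf_db, g_db) in enumerate(stages):
            f = 10 ** (nf_db / 10)
            f_total += f if i == 0 else (f - 1) / g_running
            g_running *= 10 ** (g_db / 10)
        return 10 * np.log10(f_total)

    # LNA (NF 1.5 dB, gain 15 dB) -> mixer (NF 10 dB) -> IF amp:
    print(cascaded_nf_db([(1.5, 15), (10, 8), (6, 20)]))   # ~2.3 dB
    # Without the LNA's gain up front, the mixer noise dominates:
    print(cascaded_nf_db([(10, 8), (6, 20)]))              # ~10.2 dB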


Oct 20, 2008 Multichip module packaging and its impact on architecture
Dr. Hubert Harrer, Senior Technical Staff Member, Server and Technology Group, IBM

The presentation compares the system packaging and technologies of IBM’s latest System z high-end servers. Starting from the z900, the change in system design toward a blade-like architecture will be explained. The latest system generation, z9, has doubled multiprocessor performance relative to the z990 by maximizing its CPU configuration and increasing the speed of the interconnections. The MCM technology is the key enabler for the high bandwidths between the processor chips and the cache chips. The glass-ceramic module accomplishes this within its 102 layers, resulting in a total wiring length of 545 m. The growth in packaging bandwidth requirements will be compared across recent generations, and the complex board and card technology of the second-level packaging will also be discussed. The system is cooled by a modular refrigeration unit (MRU), which cools the processor chips down to 45°C; this low temperature ensures the highest reliability and reduces chip leakage current. An air-cooled backup mode at a lower frequency ensures that the system does not go down if the MRU fails. The MCM has been designed for a maximum power of 850 W during nominal operation and 1200 W in the air-cooled backup mode. The presentation will focus on the electrical design methodologies for high-end servers, such as power delivery concepts, signal integrity methodologies, and power integrity designs for delivering such high currents.


Sep 22, 2008 Past and Future of Digital Watermarking
Dr. Ton Kalker, Hewlett-Packard Labs
Slides

The term ‘Digital Watermarking’ refers to methods and techniques for adding auxiliary data to multimedia signals. In the mid-1990s digital watermarking was heralded as the solution for all copyright and copy protection issues: in one form or another, the opening paragraph of many papers contained the reasoning ‘copyright protection is important, therefore we need watermarking’. Today, however, more than 10 years later, we find very few deployed applications of digital watermarking. In this talk we will try to explain why digital watermarking has not lived up to its expectations, as well as make an educated guess about the future of digital watermarking.


Aug 30, 2008 SPS SCV Workshop on Bio-informatics and Bio-signal Processing
Dr. Craig Stephens, Dr. Sami Khuri, Dr. Byung-Jun Yoon, and Dr. Ru-Fang Yeh
(1) Molecular Biology Basics,  Dr. Craig Stephens, Santa Clara University

https://www.scu.edu/cas/biology/staffandfaculty/craig-stephens.cfm

(2) Computational Methods in Bioinformatics, Dr. Sami Khuri, San Jose State University

https://www.cs.sjsu.edu/faculty/khuri

(3) Signal Processing Models and Algorithms for RNA Sequence Analysis, Dr. Byung-Jun Yoon, Texas A&M University, College Station, TX

https://www.ece.tamu.edu/~bjyoon

(4) Biostatistics: Statistical analysis of bio-data, Dr. Ru-Fang Yeh, University of California San Francisco

https://www.biostat.ucsf.edu/rufang


Jun 02, 2008 Enhancing Image Fidelity through Spatio-Spectral Design for Color Image Acquisition, Reconstruction, and Display
Keigo Hirakawa, Postdoctoral Research Associate, Harvard University, Department of Statistics
Slides

In the first part of the talk, we consider extending the image denoising problem to the problem of missing or incomplete pixel values, whether due to mechanical design or to distortions. In the context of wavelet-based image processing, missing or incomplete pixels pose a particularly difficult challenge because none of the wavelet coefficients can be observed. In this talk, a unified framework for coupling the EM algorithm with Bayesian hierarchical modeling of transform coefficients is presented. This empirical-Bayes strategy offers a statistically principled and extremely flexible approach to a wide range of pixel estimation problems, including image denoising, image interpolation, super-resolution, and demosaicking.

In the second part of the talk, we consider the “throughput” of color imaging systems. Pixel values are typically sensed or displayed via a spatial subsampling procedure implemented as a color filter array—a physical construction whereby only a single color value is measured or displayed at each pixel location. Owing to the growing ubiquity of acquisition and display devices, much of recent work has focused on the implications of such arrays for subsequent digital processing, including in particular the canonical demosaicking task of reconstructing a full color image from spatially subsampled and incomplete color data acquired under a particular choice of array pattern. In contrast to the majority of the acquisition and display literature, we consider here the problem of color filter array design and its implications for spatial reconstruction quality. We prove the sub-optimality of a wide class of existing array patterns, and provide a constructive method for its solution that yields robust, new panchromatic designs implementable as subtractive colors.


May 12, 2008 Content-Adaptive Efficient Resource Allocation for Packet-Based Video Transmission
Prof. Aggelos K. Katsaggelos, Department of EECS, Northwestern University

Supporting video communication over lossy channels such as wireless networks and the Internet is a challenging task due to the stringent quality of service (QoS) required by video applications and the many channel impairments. Two important QoS characteristics for video are the degree of signal distortion and the transmission delay. Another important consideration is the cost associated with transmission, for example, the energy consumption in the wireless channel case and the cost for differentiated services in the Internet (with DiffServ) case. In this presentation we consider the joint adaptation of the source coding parameters, such as the quantization step-size and prediction mode, along with the physical layer resources, such as the transmission rate and power. Our goal is to provide acceptable QoS while taking into account system constraints such as the energy utilization. We discuss a general framework that allows a number of “resource/distortion” optimal formulations for balancing the requirements of different applications. We conclude the presentation with some of the grand opportunities and challenges in designing and developing video communication systems.
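
The “resource/distortion” formulations the abstract mentions typically reduce, per coding unit, to minimizing a Lagrangian that trades distortion against rate and transmission cost. A minimal, hypothetical sketch (the option values, field names, and multipliers are invented for illustration, not the speaker's formulation):

    from dataclasses import dataclass

    @dataclass
    class Option:
        mode: str          # e.g., quantizer step + prediction mode
        distortion: float  # expected MSE after channel losses
        rate: float        # bits
        energy: float      # transmission energy (power x time)

    def best_option(options, lam_rate, lam_energy):
        # Pick the coding/transmission parameters that minimize the
        # Lagrangian J = D + lam_r * R + lam_e * E; sweeping the
        # multipliers traces out the achievable D-R-E tradeoff.
        return min(options, key=lambda o: o.distortion
                   + lam_rate * o.rate + lam_energy * o.energy)

    opts = [Option("intra/QP=20", 2.0, 900, 1.2),
            Option("inter/QP=28", 4.5, 300, 0.5),
            Option("skip",        9.0,   5, 0.1)]
    print(best_option(opts, lam_rate=0.01, lam_energy=1.0))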


Apr 14, 2008 RF Systems Design :Fundamental Theory and WiMAX Examples
Tony Liu, Ph.D.

The rapidly growing wireless communication market creates high demand for radio frequency (RF) transceivers. While low cost, low power, and small form factor are necessary requirements for a modern RF system, achieving reliable, good-quality communication links is an essential design factor in wireless communications. To accomplish a successful RF transceiver design meeting all these requirements, circuit designers and system engineers need to work closely together. This presentation aims to give an introduction to RF system design, including a wireless communication overview, basic concepts in RF design, RF transceiver architecture, analog-to-digital converters, and a WiMAX RF system design example. Some fundamental RF system design questions, e.g., how much RF gain is needed, how many dB of dynamic range, how many bits of ADC, and which RF transceiver architecture to choose, are illustrated with a WiMAX receiver example.
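
For the “how many bits of ADC” question, a common back-of-the-envelope step is the ideal-quantizer rule SNR ≈ 6.02·N + 1.76 dB plus headroom. A hedged sketch with illustrative margins (not the talk’s numbers):

    import math

    def adc_bits_needed(required_snr_db, papr_db=10.0, agc_margin_db=6.0):
        # Ideal-quantizer rule of thumb: SNR ~= 6.02*N + 1.76 dB at full
        # scale; reserve headroom for the signal's PAPR and AGC error.
        target_db = required_snr_db + papr_db + agc_margin_db
        return math.ceil((target_db - 1.76) / 6.02)

    # E.g., a receiver needing ~25 dB of post-ADC SNR with 16 dB of
    # headroom calls for roughly a 7-bit converter (values illustrative):
    print(adc_bits_needed(25.0))    # 7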


Mar 10, 2008 Digital Fingerprinting for Multimedia Forensics
Prof. Min Wu, Institute of Advanced Computer Studies, University of Maryland, College Park

Technology advancement has made multimedia content widely available and easy to process. These benefits also make it easy to duplicate, manipulate, and redistribute multimedia content without authorization, prompting the need for multimedia forensics research to facilitate evidence gathering in the digital world. Embedded digital fingerprinting is one of the emerging forensic technologies. A unique ID that serves as a digital fingerprint representing the receiving user is inserted into the content, and the fingerprinted content is then delivered to that user. When copies are leaked or misused, the authority can use the embedded fingerprints to trace back to the culprits. For multimedia data, digital fingerprints can be put into the content using conventional robust embedding techniques, which are typically concerned with surviving attacks mounted by an individual. Advances in communications and networking have made it easy for adversaries to work together to generate a new version based on their individual copies. These so-called collusion attacks provide adversaries with a cost-effective way to remove the fingerprints and circumvent the traitor-tracing mechanism. In this talk, I will present our recent research on anti-collusion fingerprinting for multimedia data. By jointly considering the encoding, embedding, and detection of fingerprints, our techniques can help collect digital-domain evidence and pinpoint the sources of a leak among millions of users. Applications of such multimedia forensic tools range from military and government operations to piracy deterrence in Hollywood and the broader entertainment industry. If time permits, I will also give a brief introduction to non-intrusive forensic analysis, which explores intrinsic traces to complement embedded fingerprints in determining the origin and processing history of digital multimedia data.


Feb 11, 2008 Simplified Fast Motion Estimation: Simplified and Unified Multi-Hexagon Search (SUMH) with Context Adaptive Lagrange Multiplier (CALM)
Prof. Nam Ling, Department of Computer Engineering, Santa Clara University
Video

Fast motion estimation is especially important for speeding up the time-consuming encoding process in H.264 video encoding. In this talk, we first present our simplified fast motion estimation method, called Simplified and Unified Multi-Hexagon Search (SUMH), which produces a significant speed-up compared with today’s fast motion estimation methods, yet incurs only a small PSNR degradation relative to full search. In addition, we present a novel method to refine the Lagrange multiplier, called Context Adaptive Lagrange Multiplier (CALM), for rate-constrained motion estimation. Both methods were recently adopted into the H.264/MPEG-4 AVC video coding international standard (including the text document and the JM reference software).

SUMH is based on two principles: partial distortion search (PDS) and dual-halfway-stop (DHS) algorithms. PDS generally produces less video quality degradation in the predicted images than conventional fast block matching algorithms (BMAs); however, the speedup gain of PDS algorithms is usually limited. In this talk, we present an enhancement over a normalized PDS (NPDS) algorithm to further reduce block matching motion estimation complexity and improve video fidelity. The novelty of our algorithm is that, in addition to the halfway-stop technique in NPDS, a dual-halfway-stop (DHS) method based on a dynamic threshold is proposed, so that block matching is not performed against all search points. The dynamic threshold is obtained via a linear model utilizing already computed distortion statistics. An adaptive search range mechanism based on inter-block distortion further constrains the search process. Experimental results show that our proposed method reduces encoding time by about 55% on average compared with state-of-the-art methods, with similar rate-distortion performance. Our SUMH algorithm, making use of DHS-NPDS, consists of two parts: an integer-pel fast search and a sub-pel fast search.

To extend motion estimation further, we propose a new, simple, and efficient method to adjust Lagrange multipliers based on context (CALM), which improves the accuracy of detecting true motion vectors as well as the most efficient encoding modes for luma, which are in turn used for deriving the motion vectors and modes for chroma. Simulation results show that the chroma bit rates can be reduced by 4.36% and 4.80% on average for the U and V chroma components, respectively, compared with the recent JM reference software. In addition, the coding efficiency improvement is comparable to that of the more complicated rate-distortion optimized (RDO) mode decision techniques.
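
The halfway-stop idea at the heart of PDS is simply early termination of the SAD accumulation. A hedged toy sketch of that mechanism (not SUMH's hexagon patterns or dynamic thresholds; names are illustrative):

    import numpy as np

    def sad_with_early_exit(cur, ref, best_so_far):
        # Partial distortion: accumulate SAD row by row and abort as
        # soon as the partial sum already exceeds the best candidate.
        sad = 0
        for row_c, row_r in zip(cur, ref):
            sad += int(np.abs(row_c.astype(int) - row_r.astype(int)).sum())
            if sad >= best_so_far:
                return best_so_far         # halfway stop: reject early
        return sad

    def block_match(cur_blk, ref_frame, cx, cy, search=8):
        # Exhaustive-window matcher using the early-exit SAD; a real
        # SUMH search would also prune candidate points.
        h, w = cur_blk.shape
        best, best_mv = np.inf, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                if cy + dy < 0 or cx + dx < 0:
                    continue               # candidate falls off the frame
                ref = ref_frame[cy + dy: cy + dy + h, cx + dx: cx + dx + w]
                if ref.shape != (h, w):
                    continue
                s = sad_with_early_exit(cur_blk, ref, best)
                if s < best:
                    best, best_mv = s, (dx, dy)
        return best_mv, best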


Jan 07, 2008 An Open Baseband Processing Architecture for Future Mobile Terminal Design
Prof. Willie W. Lu, U.S. Center for Wireless Communications (USCWC)
Slides

Future wireless and mobile communications will shift from transmission-specific technology to interface-specific technology in order to converge with computer system architecture. As Prof. Willie Lu first pointed out at the 2004 World Wireless Congress and at Stanford University, “The future radio is first a computer, then an Open Wireless Architecture (OWA) terminal.” OWA technology offers an optimal solution for opening up the wireless platform from the physical layer through the middle layers, and supports service-oriented architecture and infrastructure in the upper layers of future mobile phone development. Working with the Google Android open application platform and other industry-leading solutions, OWA cores simplify the different radio transmission technologies (RTTs) into interface-based hardware/software modules that are portable, extensible, and transferable among various system platforms. Furthermore, OWA makes baseband signal processing of different common air interfaces extremely simple and performance-efficient, and leaves the future mobile phone essentially open to any wireless transmission standard. This talk introduces an open wireless architecture (OWA) terminal design, focusing on an open baseband processing platform that supports different existing and future wireless communication standards through multi-dimensional open baseband processing modules with open interface parameters and baseband management systems. The talk describes a multi-layer open system architecture that maximizes system flexibility and minimizes terminal power consumption, so as to provide an integrated and converged next-generation wireless and mobile communication terminal system. As a case study, we will also introduce a next-generation iPhone platform supporting WiMAX, WiFi, WCDMA, GSM, and TD-SCDMA powered by OWA technology.

White paper on the talk’s topic 


Dec 10, 2007 Re-Live the Movie “The Matrix”: From Harry Nyquist to Image-Based Rendering
Prof. Tsuhan Chen, Department of Electrical and Computer Engineering, Carnegie Mellon University

In recent years, the field of visual computing has observed a convergence of image processing, computer vision, and computer graphics. Multiview imaging represents one central theme of the convergence. Now widely used in applications ranging from special effects (e.g., in the movie “The Matrix”) to 3D object tracking, multiview imaging has become an essential tool for creating informative visualization and effective 3D analysis. In this talk I will introduce recent research on sampling, reconstructing, and relighting multiview images. I will present our mobile camera array, composed of 48 mobile platforms each carrying a video camera. These mobile cameras respond to 3D scenes and position themselves for the most effective 3D analysis. While discussing the mechanism for sampling the 7-dimensional plenoptic function, we will reveal the connection between multiview imaging and the Sampling Theorem discovered by Harry Nyquist almost 80 years ago!
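
For reference (a standard definition the talk builds on, not material from the abstract itself), the plenoptic function of Adelson and Bergen records the light observed from every viewpoint, in every direction, at every wavelength and time:

\[
  P = P(V_x, V_y, V_z, \theta, \phi, \lambda, t)
\]

A camera array samples this seven-dimensional function on a discrete grid of viewpoints, which is why Nyquist-style sampling arguments govern how densely the cameras must be spaced for alias-free view interpolation.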


Nov 12, 2007 Efficient Techniques for MPEG-2 to H.264 Video Transcoding
Dr. Jun Xin, Xilient Inc.
Slides

MPEG-2 has become the primary format for broadcast video since its development in the early 1990s. The newer video coding standard, referred to as H.264/AVC, promises the same quality as MPEG-2 at about half the data rate. Since the H.264/AVC format has been adopted into new storage format standards such as Blu-ray Disc and HD DVD, H.264/AVC decoders are expected to appear in consumer video recording systems soon. Certainly, as more high-definition content becomes available and the desire to store more content or to record multiple channels simultaneously increases, a long recording mode will become a key feature of future consumer video recorders. To satisfy this need, novel techniques have been developed to transcode MPEG-2 broadcast video to the more compact H.264/AVC format with low complexity. In this talk, transcoding techniques aimed at low-complexity MPEG-2 to H.264/AVC transcoding will be presented. Both intra and inter transcoding architectures and algorithms will be discussed. The key to a successful transcoder design is to take full advantage of the information already available in the MPEG-2 bitstream. Specifically, I am going to talk about efficient motion vector reuse and mode decision algorithms. I will show that the proposed algorithms achieve very good rate-distortion performance with low complexity. Compared with the cascaded decoder-encoder solution, coding efficiency is maintained while complexity is significantly reduced.
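
Motion vector reuse is the core shortcut: instead of a fresh full search, the transcoder seeds each block with the decoded MPEG-2 vector and refines it locally. A hedged toy sketch of that refinement step (names and the one-pel window are illustrative, not the talk's algorithm):

    import numpy as np

    def refine_mv(cur_blk, ref_frame, cx, cy, mv_in, radius=1):
        # Transcoding shortcut: start from the decoded MPEG-2 motion
        # vector instead of a full search, and refine it only within a
        # small (+/- radius) window for the H.264 re-encode.
        h, w = cur_blk.shape
        best, best_mv = np.inf, mv_in
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                mx, my = mv_in[0] + dx, mv_in[1] + dy
                if cy + my < 0 or cx + mx < 0:
                    continue               # refinement fell off the frame
                ref = ref_frame[cy + my: cy + my + h, cx + mx: cx + mx + w]
                if ref.shape != (h, w):
                    continue
                sad = np.abs(cur_blk.astype(int) - ref.astype(int)).sum()
                if sad < best:
                    best, best_mv = sad, (mx, my)
        return best_mv, best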


Oct 08, 2007 Overview of Multimedia Signal Processing on Multi-Core Processors
Yen-Kuang Chen, Ph.D., Principal engineer at Intel Corporation

This talk gives a basic overview of multi-core processors, which represent a major recent development in computing technology. Traditionally, increasing clock frequency was one of the main ways for conventional processors to achieve higher performance. Today, increasing clock frequency has reached a point of diminishing returns, and even negative returns once power is taken into account. Multi-core processors, also known as chip multiprocessors (CMPs), promise a power-efficient way to increase performance and are becoming prevalent in vendors’ solutions, for example, IBM Cell Broadband Engine processors, Intel Core 2 Duo processors, Sun UltraSPARC T1 processors, and so on. Furthermore, placing many powerful computing cores on a single processor opens up a world of important possibilities for next-generation multimedia signal-processing applications and algorithms. Soon we can expect processors with tens or hundreds of cores, e.g., Nvidia Tesla platforms and Intel’s 80-core research prototype. However, the trend toward multi-core processors brings a paradigm shift in application development. To fully exploit the potential of many-core CPUs, GPUs, and DSPs, researchers and application developers must think about parallelism creatively. This talk will also discuss related challenges in application development, with a particular focus on multimedia signal processing applications.


Sep 17, 2007 Transceiver Designs for Multicarrier Transmission
Prof. Yuan-Pei Lin, National Chiao-Tung University, Taiwan

The multicarrier transceiver has found applications in a wide range of wired and wireless transmission channels. It is typically called DMT (discrete multitone) in wired DSL (digital subscriber loop) applications such as ADSL (asymmetric DSL) and VDSL (very-high-speed DSL), and OFDM (orthogonal frequency division multiplexing) in wireless LAN (local area network) and broadcasting applications such as digital audio broadcasting and digital video broadcasting. For wireless transmission, the channel profile is usually not available to the transmitter; the transmitter is typically channel-independent and there is no bit/power allocation. Moreover, a channel-independent transmitter is of vital importance for broadcasting applications, where there are many receivers with different transmission paths. In wired DSL applications, the channel does not vary rapidly. This allows the receiver to send the channel profile back to the transmitter through a reverse channel.

In this lecture, we consider optimal transceiver design for two cases: (i) the channel profile is available at the transmitter; (ii) the channel profile is not available at the transmitter. When the channel profile is not available, the transmitter is channel-independent and the channel-dependent part of the transceiver resides only at the receiver; the optimal transceiver that minimizes bit error rate subject to a fixed transmission power will be designed. When the channel profile is available, bit and power allocation can be used to exploit the disparity among the subchannel noise variances; the optimal transceiver that minimizes transmission power subject to a fixed transmission bit rate and a fixed bit error rate will be derived. Substantial gains can be achieved using the optimal transceiver, especially for a moderate number of subcarriers.
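
When the transmitter does know the channel, a classic way to exploit the disparity among subchannels is greedy bit loading. A hedged sketch in the Hughes-Hartogs style (the SNR gap and subchannel CNRs are illustrative, not the lecture's derivation):

    import numpy as np

    def greedy_bit_loading(cnr, total_bits, gap_db=6.0):
        # Greedy loading: with the channel known at the transmitter,
        # give each extra bit to the subchannel where it costs the
        # least extra power (gap approximation).
        gap = 10 ** (gap_db / 10)
        bits = np.zeros(len(cnr), dtype=int)
        power = np.zeros(len(cnr))
        for _ in range(total_bits):
            # Power needed on subchannel k to carry bits[k] + 1 bits:
            p_next = gap * (2.0 ** (bits + 1) - 1) / cnr
            k = int(np.argmin(p_next - power))
            power[k] = p_next[k]
            bits[k] += 1
        return bits, power.sum()

    cnr = np.array([50.0, 20.0, 5.0, 1.0])   # subchannel SNRs (linear)
    print(greedy_bit_loading(cnr, total_bits=10))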


Sep 10, 2007 Overview of WiMax Technology and Evolution
Hassan Yaghoobi
Slides

Mobile WiMAX technology, based on the IEEE 802.16e-2005 standard, was first commercialized in Korea through the initial offering of WiBro services in mid-2006 and subsequent expansion in early 2007. Sprint also plans to deploy it in the US by the end of 2007. These deployments are based on the WiMAX Forum Mobile System Release 1 profile currently being certified by the WiMAX Forum. As deployments spread around the world, Mobile WiMAX is also evolving to incorporate new technology and meet new demands. For example, IEEE 802.16 TGj and TGm are currently developing Multi-hop Relay enhancements and the next-generation Advanced Air Interface, respectively. The purpose of this presentation is to give the audience an overview of the technology and the different evolution projects, and to provide technical analysis of how these projects support the technology’s evolution target requirements.


May 21, 2007 Tesla Roadster: Embedded microprocessors and Design trade-offs!
Doug Bourn, Senior Electrical Engineer, Tesla Motors, Inc

The Tesla Roadster, like most modern vehicles, relies on embedded microprocessors for safety and performance. From the anti-lock braking system (ABS) to motor control, firmware-defined functions control all aspects of vehicle behavior. For example, control loops in a dozen microprocessors monitor battery environmental parameters, state of charge, and safety interlocks. Three processors interpret driver inputs from the shifter and accelerator to control motor speed and direction. A fourth processor monitors motor and controller temperatures to modulate power to two blower fans for cooling. Four CAN buses connect these and other vehicle subsystems to gather status and coordinate control functions. This presentation will outline the history of the Tesla Roadster and discuss a few of the design tradeoffs behind the decisions made in its implementation.


Mar 21, 2007 A Simulation Model for IEEE 802.11n
Thomas Paul, Electrical Engineering, Santa Clara University

In an effort to improve the performance of Wireless LAN (WLAN) devices, the IEEE (Institute of Electrical and Electronics Engineers), in late 2003, formed a task group, TGn, to work on a new specification: 802.11n. The goal was to deliver speeds of at least 100Mbps, which would more than double the existing maximum rate of 54Mbps provided by the 802.11a and 802.11g amendments. Currently, the 802.11n draft (standard still under development) offers rates up to 600Mbps through the use of MIMO (multiple-input, multiple-output) antenna structures. In this presentation, we discuss the signal processing techniques used to achieve these rates over indoor wireless environments. Techniques discussed include space-time coding, channel estimation, beamforming, and MIMO detection, including linear and ML detectors. A simulation model developed using Matlab/SIMULINK implementing the transmitter-receiver system is also presented.
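
Of the detectors mentioned, the linear and ML varieties differ exactly in how they trade complexity for performance. A hedged toy comparison for a 2x2 QPSK system (parameters illustrative; note the ML search cost grows exponentially with the number of streams):

    import numpy as np
    from itertools import product

    QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

    def detect_zf(H, y):
        # Linear (zero-forcing) detection: invert the channel, then
        # slice each stream independently to the nearest QPSK symbol.
        x_hat = np.linalg.pinv(H) @ y
        return np.array([QPSK[np.argmin(np.abs(QPSK - s))] for s in x_hat])

    def detect_ml(H, y):
        # Maximum-likelihood detection: exhaustive search over all
        # transmit vectors; optimal, but exponential in stream count.
        best, best_x = np.inf, None
        for combo in product(QPSK, repeat=H.shape[1]):
            x = np.array(combo)
            d = np.linalg.norm(y - H @ x)
            if d < best:
                best, best_x = d, x
        return best_x

    rng = np.random.default_rng(7)
    H = (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))) / np.sqrt(2)
    x = QPSK[rng.integers(0, 4, size=2)]
    y = H @ x + 0.1 * (rng.normal(size=2) + 1j * rng.normal(size=2))
    print(detect_zf(H, y), detect_ml(H, y), x)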


Feb 12, 2007 A/D and D/A Converters with Integrated High-speed Compression
Al Wegener, CTO, Samplify Systems, Inc

Various compression methods have become integral components of computer systems (WinZIP), audio distribution (MP3), and video processing (MPEG, H.264). However, effective and efficient compression techniques for high-speed A/D and D/A converters have not been available. This talk describes the Samplify series of algorithms, which provides both lossless and lossy compression of bandlimited, sampled data acquired by A/D converters, or provided to D/A converters, and then further processed by FPGAs or ASICs. Samplify provides a lossless compression mode whose compressed data rate varies, depending on the redundancy present in the signal. Samplify also offers two complementary lossy compression modes, in which users select either a desired compression ratio (such as 2.05:1 or 3.68:1) or a desired dynamic range (such as 65.5 dB). Examples using common, bandlimited signals demonstrate the rate-distortion tradeoffs enabled by the Samplify algorithms. Improved compression ratios can be achieved by combining a training phase, in which the signal’s characteristics are discovered, with a compression phase. The Samplify algorithms require a modest amount of FPGA resources and operate at up to 200 Msamp/sec. Higher sample rates are achieved by instantiating parallel compression and decompression blocks.


May 12, 2006 New Directions in Home Theater Systems
Victor Ramamoorthy, PhD, Infinite Algorithms

We are currently on the verge of stepping into exciting new possibilities in home entertainment systems. New insights into human perceptual systems have resulted in improved designs that can approach our perceptual limits of satisfaction without overload and stress. Fueled by developments in imaging, display, video, audio, and radio technologies, entirely new entertainment designs are emerging. In this talk, we will explore these trends and the technologies that support them: wavelet decomposition ideas have led to very high quality compression designs supporting Digital Cinema; Wave Field Synthesis is driving high-definition spatial audio rendering systems; UWB radio technologies promise high data rates over short distances, eliminating the need for cable connections; and high dynamic range imaging is transforming camera system design. The confluence of these advances portends endless opportunities to innovate and build new products for consumers. Some examples will be illustrated.


Apr 10, 2006 Correcting Distortion in Multi-media Audio Terminals
Dr. Kevin Lashkari, currently consultant to DoCoMo USA Labs

Future multimedia terminals such as videophones will be used at almost arm’s length from the speaker and require high-quality sound at high playback levels. Small loudspeakers in mobile devices introduce severe linear and nonlinear distortions into the sound at high volumes. To enable high-quality multimedia services, signal processing methods are needed to compensate for the distortions of the electroacoustic conversion. Conventional approaches use a predistortion filter placed between the audio signal source and the loudspeaker, based on the p-th order inverse of the Volterra model of the loudspeaker. The p-th order inverse is computationally very intensive and does not yield an exact inverse. A new compensation method based on an exact nonlinear inverse and a novel model of the loudspeaker are introduced. It is shown that the Volterra-Wiener-Hammerstein model of the loudspeaker lends itself to an exact inverse and is a closer match to the loudspeaker response. An experimental setup involving hardware and software was developed to evaluate the performance of the new techniques, as well as those reported in the literature, using real loudspeakers and perceptual metrics. The initial results with real loudspeakers are consistent with the predictions and show significant improvement in sound quality.
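
To see why an exact inverse can beat a p-th order inverse, consider the simplest case: a purely memoryless polynomial nonlinearity. Rather than truncating an inverse Volterra series, one can invert the polynomial numerically to machine precision. A hedged toy sketch (coefficients invented for illustration; a real loudspeaker model also needs its linear/memory part inverted):

    import numpy as np

    def predistort(x, a2=0.1, a3=0.05, iters=20):
        # Exact inverse of a memoryless polynomial f(u) = u + a2*u^2
        # + a3*u^3 via Newton's method: solve f(u) = x per sample, so
        # the nonlinearity cancels rather than being only approximately
        # linearized (the spirit of an exact, non-p-th-order inverse).
        u = x.copy()
        for _ in range(iters):
            f = u + a2 * u**2 + a3 * u**3 - x
            fp = 1 + 2 * a2 * u + 3 * a3 * u**2
            u = u - f / fp
        return u

    t = np.linspace(0, 1, 1000)
    x = 0.8 * np.sin(2 * np.pi * 5 * t)
    u = predistort(x)
    out = u + 0.1 * u**2 + 0.05 * u**3    # the "loudspeaker" nonlinearity
    print(np.max(np.abs(out - x)))        # ~0: distortion cancelled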


Feb 13, 2006 Distributed Wireless Communication: A Shannon-Theoretic Perspective on Fading Multihop Networks
Sumeet Sandhu, PhD & Ozgur Oyman, PhD, Communications Technology Lab, Intel

Distributed communication is an advanced wireless technology that allows cooperative communication by ensembles of wireless devices. Devices located close to the source cooperate by re-encoding and forwarding packets, and devices located close to the destination cooperate by sharing received packets. Such cooperation provides diversity gains against wireless channel impairments such as fading, shadowing, and path loss. It improves performance beyond what is possible with traditional point-to-point links, in a flexible manner, by harvesting diversity in the network. The simplest form of cooperation is a multi-hop network in which nodes cooperate by forwarding packets one at a time. We consider a fading multihop network with a single active source-destination pair connected via multiple hops over a row of intermediate relays. We use Shannon-theoretic tools to analyze the tradeoff between energy efficiency and spectral efficiency (known as the power-bandwidth tradeoff) for a simple communication protocol based on time-division decode-and-forward relaying. It is commonly believed that communication over multiple hops suffers in fading channels due to the worst-link limitation. In contrast, our results indicate that hopping can significantly improve the outage behavior of slow-fading networks and stabilize links against random channel fluctuations. We prove that there exists an optimal number of hops that minimizes the end-to-end outage probability. Finally, we provide numerical performance comparisons based on realistic channel models. The talk also covers a more advanced form of cooperation known as virtual MIMO, its advantages, and related distributed communication protocols.
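
The existence of an optimal hop count can be checked with a few lines of Monte Carlo: more hops shorten each link (raising per-hop SNR through the path-loss exponent) but, under time division, force each hop to carry the end-to-end rate several times over. A hedged sketch with invented parameters (not the talk's protocol or channel models); an interior optimum appears around two hops here:

    import numpy as np

    def outage_prob(n_hops, rate_bps_hz, snr0_db, alpha=4.0, trials=20000):
        # Decode-and-forward over n equal hops with time division: each
        # hop gets 1/n of the time, so it must carry n * rate; shorter
        # hops raise per-hop SNR by n**alpha (path-loss exponent alpha).
        rng = np.random.default_rng(3)
        snr = 10 ** (snr0_db / 10) * n_hops ** alpha
        # Rayleigh fading per hop; end-to-end outage if ANY hop fails.
        h2 = rng.exponential(size=(trials, n_hops))
        hop_rate = np.log2(1 + snr * h2)
        return np.mean((hop_rate < n_hops * rate_bps_hz).any(axis=1))

    for n in (1, 2, 3, 4, 6, 8):
        print(n, outage_prob(n, rate_bps_hz=2.0, snr0_db=10))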


Dec 12, 2005 Mobile WiMAX: True Broadband Wireless Enabled
Aditya Agrawal, Director of Marketing, Beceem Communications

The IEEE 802.16e standard, which enables standards-based true mobile broadband wireless, will be formally ratified by the IEEE in December of this year. 802.16e is also popularly known as Mobile WiMAX. This talk will give an overview of the standard, describe how the WiMAX Forum is working to make it a globally pervasive technology, and present technical information on what makes this standard exciting, including the fact that it is the first wireless access standard based on OFDMA, supports multiple-antenna techniques such as MIMO and beamforming, and has advanced MAC-layer options to support many users with QoS.


Jun 13, 2005 Using Technology to Keep Other Countries Honest
John Treichler, Chief Technical Officer, Applied Signal

This talk presents an historical view of how signal processing technology has been used to provide the intelligence needed to protect nations from one another. Examples from the First World War through the Cold War are used to illustrate the impact that technical intelligence collection can have in the verification of treaties meant to prevent war and, where necessary, in the conduct of war.


Apr 25, 2005 How many antennas does it take to get broadband wireless access? – The story of MIMO
Professor B. Friedlander, Dept of Electrical Eng, Univ of California at Santa Cruz

The use of multiple antennas has a long and successful history in wireless systems. Multiple receive antennas for diversity are standard in cellular and WLAN systems. MIMO systems attempt to improve upon this through the use of multiple antennas at the transmitter as well as at the receiver, employing sophisticated signal processing to extract additional performance gains. Wireless systems employing MIMO are now available for 802.11 (pre-802.11n) and will become available for other WLAN and cellular systems. Progress is being made in standards-body activities on incorporating MIMO into future wireless systems. After decades of research and development, the cost of implementing multi-antenna systems has finally fallen to levels where commercial applications of the technology are feasible. As is often the case with a relatively new and promising technology, expectations of performance improvements are high, while the level of understanding of its limitations is low. This talk will explain the basic concepts, including the tradeoff between diversity gain and spatial multiplexing gain. The ability of MIMO to increase throughput and extend the range of current base stations and access points will be discussed for both fixed and mobile scenarios, and for both indoor and outdoor applications, from home networking to broadband wireless access. Various issues that tend to limit the theoretical performance advantages of MIMO will be addressed, including the impact of channel conditions (e.g., small angular spread) and the costs associated with reliable channel estimation.
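
The diversity-multiplexing tradeoff the talk alludes to was quantified by Zheng and Tse: for an i.i.d. Rayleigh channel with N_t transmit and N_r receive antennas (and sufficiently long code blocks), the optimal diversity order d at integer multiplexing gain r lies on the piecewise-linear curve through

\[
  d^{*}(r) = (N_t - r)(N_r - r), \qquad r = 0, 1, \ldots, \min(N_t, N_r),
\]

so full diversity N_t N_r (at r = 0) and full multiplexing min(N_t, N_r) (at d = 0) are the endpoints of a tradeoff rather than simultaneously achievable operating points.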


Jan 10, 2005 Converting MATLAB Algorithms to FPGA or ASIC Designs
Michael Bohm, CTO, Vice President, AccelChip

In the DSP domain, MATLAB is the domain-specific language of choice, with 97% of DSP designs implemented on dedicated DSP processors. MATLAB provides both an efficient system-level verification environment and an efficient path to implementation. Unfortunately, the process of converting MATLAB to C code to run on the processor is reaching its limits: a DSP processor’s inherent limitation of serial operation is becoming a bottleneck for advanced high-performance algorithms. To solve this problem, a new methodology must be put in place to convert algorithmic MATLAB to a register-transfer level (RTL) description that can be used by industry-standard synthesis and verification tools. Companies that adopt the new methodology will benefit from greater productivity, both from the domain-specific language itself and from the new breed of best-in-class tools it will enable. This presentation will show the process of taking a MATLAB algorithm down to a silicon representation. It will demonstrate a design style and methodology for implementing such an algorithm in either an FPGA or an ASIC.


Dec 13, 2004 Reconfigurable Systems Emerge
Nick Tredennick, PhD, IEEE Fellow, editor of Gilder Technology Report

As the world shifts from tethered to mobile, reconfigurable systems will emerge. After twenty years of progress, the PC is good enough for most consumers. As PC development becomes less profitable, design emphasis shifts to mobile systems such as digital cameras, MP3 players, and cell phones. Mobile systems change the design goal from cost performance to cost-performance-per-watt. Smaller transistors won’t help because they are too expensive and they leak too much. The microprocessor, which has held back advances in hardware design for thirty years, won’t be the workhorse in mobile systems of the future. Microprocessors and DSPs are unsuitable for mobile systems because instruction-based processing is computationally inefficient and because they use too much energy. Today’s memory components are also unsuitable for mobile systems. New programmable logic devices based on next-generation non-volatile memory will enable efficient reconfigurable systems.


Nov 08, 2004 Nonlinear adaptive systems
Tokunbo Ogunfunmi, Ph.D, Associate Professor, Dept of Elect Eng, Santa Clara Univ.

Nonlinear models are the correct models for many naturally occurring phenomena. Many problems encountered in the real world involve noise and distortion due to physical processes that are time-varying and nonlinear, and these cannot be accurately characterized by fixed linear transfer functions. However, engineers have largely avoided the area of nonlinear systems, partly because of the limitations of the analytical tools at their disposal and partly because engineering education emphasizes linear systems, for which a myriad of analytical tools have been developed over the years. The emergence of new analytical tools and faster computer processing power now make nonlinear systems realizable and usable in practice. This talk will discuss the application of the truncated Volterra model for realizing nonlinear adaptive filters, which presents two major drawbacks: first, no exact method exists for isolating the individual Volterra operators when measuring the Volterra kernels of a given system; second, the large eigenvalue spread results in slow convergence and large misadjustment, especially for gradient-based nonlinear adaptive algorithms. This will be followed by applications of nonlinear adaptive systems based on the nonlinear Wiener model, where the particular polynomial used is determined by the characteristics of the input signal to be modeled. The advantages of this method will be discussed, with several examples comparing the performance of both methods.
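
A truncated Volterra filter is linear in its coefficients, so ordinary LMS applies once the regressor is augmented with product terms; those product terms are also what inflate the eigenvalue spread that slows convergence, as the talk notes. A hedged second-order sketch (the system being identified and the step size are invented for illustration):

    import numpy as np

    def volterra_lms(x, d, mem=3, mu=0.01):
        # LMS adaptation of a truncated (2nd-order) Volterra filter:
        # the regressor stacks the linear taps with all quadratic
        # cross-products, so the filter stays linear in its weights.
        idx = [(i, j) for i in range(mem) for j in range(i, mem)]
        w = np.zeros(mem + len(idx))
        y = np.zeros(len(x))
        for n in range(mem - 1, len(x)):
            u = x[n - mem + 1: n + 1][::-1]              # delay line
            phi = np.concatenate([u, [u[i] * u[j] for i, j in idx]])
            y[n] = w @ phi
            w += mu * (d[n] - y[n]) * phi                # LMS update
        return w, y

    # Identify a mild quadratic system from noisy-free data:
    rng = np.random.default_rng(0)
    x = rng.normal(size=5000)
    d = 0.8 * x + 0.2 * np.roll(x, 1) + 0.1 * x * np.roll(x, 1)
    w, y = volterra_lms(x, d, mem=2, mu=0.02)
    print(np.round(w, 2))    # approaches [0.8, 0.2, 0, 0.1, 0]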


Sep 13, 2004 Anytime, Anywhere IP Communications
Marthin De Beer, VP & GM, Cisco Systems, Inc

IP Communications, the convergence of voice, video, and data onto a single infrastructure, is delivering new opportunities for productivity and improved communications to enterprise employees. The same technology is now becoming available to consumers, with a host of VoIP services being offered by the likes of AT&T, Vonage, and Comcast. This, however, is only the beginning of what could be a revolution in the way we work, live, play, and learn.


Jun 14, 2004 Fortran 95, or Matlab meets C++
Matthew Halfant, PhD, VP Advanced Technology, Genesis Microchip, Inc

Some years ago I upgraded a Microsoft Fortran compiler from PowerStation 1 to PowerStation 4, hoping to gain the advantage of a true 32-bit memory model. Quite unexpectedly, the upgrade took me from Fortran 77 to something called Fortran 90, and this ultimately proved far more exciting than the “mere” transition from 16 to 32 bits. Fortran 90, and its current successor Fortran 95, breaks with the rigid formatting conventions of earlier Fortran; it introduces dynamic memory allocation, derived data types (“structs”), operator overloading, and other modern language features. Of greatest value to me personally is the array notation, which is very similar to Matlab’s: this allows a natural expression for array operations, which simplifies coding and simultaneously opens the door to high-performance execution on parallel hardware. I’ve chosen this topic because many of my colleagues have had no inkling of this development — at any mention of Fortran they simply visualize the classical dialect and are understandably puzzled at my enthusiasm. This is too good to be a well-kept secret, so I wish to offer an overview of modern Fortran and illustrate, with examples from my own work, how empowering it has been for me.


Apr 12, 2004 On the Deployment of the Voice Biometric: Challenges and Best Practices
Larry Heck, PhD, Vice President, R&D of Nuance Communications

Wouldn’t it be great not to have to remember all those PINs we are forced to use for transactions over the telephone? As it turns out, the companies that subject us to PINs would like to get rid of them too. Recent analyst estimates show that approximately 30% of all help desk calls to human agents are for resetting forgotten PINs, costing an average of $20 per call and millions per year for large enterprises. The solution is to replace PINs with a biometric: verify people by their physiological or behavioral characteristics (e.g., fingerprints, voice, iris). Among biometrics, voice has a number of distinct advantages for securing telephone-based transactions: it leverages existing infrastructure, uses an inexpensive and ubiquitous input device (the telephone), and is the most intuitive and least obtrusive biometric. So why is it taking so long for the voice biometric to save us from PINs? In this talk, I will address that question and discuss how Nuance has been working to make the voice biometric pervasive on the telephone network.


Mar 08, 2004 Telephony Speech Recognition Application Testing
Zaydoon Jawadi, CEO of CoAssure, Inc

Self-service automated speech-recognition and DTMF telephony applications provide convenient access to real-time information via the telephone and cut costs by eliminating the need for assistance from a call center representative. Application issues can result in customer dissatisfaction and unsuccessful self-service; furthermore, they can cause callers to opt for live agents, undermining the advantages and cost savings expected from the system. Comprehensive testing is therefore essential to the success of such applications. Prior to deployment, and whenever hardware or software is upgraded, functionality and performance must be tested and analyzed. Traditionally, dialog traversal, also called call flow, is verified manually; with automation, however, comprehensive, consistent, and repeatable testing can be achieved, along with cost savings.


Feb 05, 2004 Speech Technology for Computer Assisted Language Learning (CALL)
Yoon Kim, PhD., CEO, NeoSpeech Inc

The evolution of advanced speech input and output technologies now allows us to explore new possibilities in automated language instruction, where the machine presents students with highly interactive lessons for acquiring a new language. This talk will provide an overview of state-of-the-art technologies in computer assisted language learning and introduce several real-world applications. We will also provide an industry-wide overview of how speech technologies are being used now, and how they could be used in the future, in the context of language learning applications. Demonstrations of software products and technologies focusing on automated language education will be given to illustrate the power of using speech input/output technology to solve real, specific problems.