What have I been up to?

librf : C++ port of Random Forests
12/2006
This is a C++ port of Leo Breiman and Adele Cutler's Random Forests classifier. The goal is to create a usable C++ library like LIBSVM, the popular support vector machine package. It is limited to binary classification so far. Project page is here.
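I haven't reproduced librf's actual interface here, but the core algorithm is easy to sketch: train each tree on a bootstrap sample of the data and classify by majority vote. A minimal, hypothetical Python illustration (using depth-1 "stumps" instead of full trees, and omitting per-split feature subsampling):

```python
import random

def train_stump(X, y):
    """Exhaustively pick the (feature, threshold, flip) split with best accuracy."""
    best = None
    for f in range(len(X[0])):
        for t in set(row[f] for row in X):
            preds = [1 if row[f] > t else 0 for row in X]
            acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
            for flip, a in ((False, acc), (True, 1 - acc)):
                if best is None or a > best[0]:
                    best = (a, f, t, flip)
    return best[1:]  # (feature, threshold, flip)

def stump_predict(stump, row):
    f, t, flip = stump
    p = 1 if row[f] > t else 0
    return 1 - p if flip else p

def train_forest(X, y, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    """Majority vote over the ensemble (binary labels 0/1)."""
    votes = sum(stump_predict(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0
```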

Google : blogsearch intern
Summer 2006
Really good times at the NYC office with the blogsearch group.

Test Optimization through Data Mining
1/2006
Microprocessor testing generates a great deal of data, much of which is thrown away. Now that pre-silicon modeling is increasingly inaccurate, it is important to have post-silicon methods for optimizing a test set. I'm currently developing a data-mining framework that identifies redundant and missing test patterns for test-set optimization by collecting test responses from Monte Carlo-generated circuit samples.
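The framework itself isn't shown here, but one way to frame the redundancy question is as set cover: if the faulty samples a test pattern detects are already caught by other patterns, that pattern is redundant. A hypothetical greedy sketch:

```python
def minimal_test_set(detects):
    """Greedy set cover over fault-detection data.

    detects maps a test-pattern name to the set of faulty circuit samples
    (e.g. from Monte Carlo simulation) whose responses that pattern flags.
    Returns a small covering subset; patterns left out are redundant.
    """
    uncovered = set().union(*detects.values())
    chosen = []
    while uncovered:
        # pick the pattern that catches the most still-uncovered samples
        best = max(detects, key=lambda t: len(detects[t] & uncovered))
        chosen.append(best)
        uncovered -= detects[best]
    return chosen
```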

Feedmo: RSS Feed clustering
12/2005
This was an independent project undertaken over Winter break. The idea was to build a Google News-style site for geeky and political blogs, which in theory would cut down on my web browsing time. It was also a chance to learn more about clustering algorithms, information retrieval, Ruby on Rails, Python, and relational databases.
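Feedmo's actual pipeline isn't documented here, but as a hypothetical illustration of the clustering idea: group feed items whose titles are cosine-similar over bag-of-words vectors, so posts covering the same story land together.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def cluster_titles(titles, threshold=0.3):
    """Greedy single-pass clustering: join a title to the first cluster whose
    seed item is similar enough, else start a new cluster."""
    vecs = [Counter(t.lower().split()) for t in titles]
    clusters = []
    for i, v in enumerate(vecs):
        for c in clusters:
            if cosine(v, vecs[c[0]]) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

A real aggregator would use TF-IDF weights and better tokenization, but the grouping logic is the same.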

Bayesian learning for VLSI
11/2005
VLSI design relies upon a hierarchy of models that provide layers of increasing abstraction (from device, to gate, to block, etc.). Unfortunately, the models closest to the physical silicon are costly to characterize and increasingly inaccurate as the manufacturing process shifts. These models are critical for reliable statistical analysis. We propose a method that uses Bayesian inference to learn key parameters from manufactured chips.
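The inference can be as simple as a conjugate update. As a toy example (not our actual model): treat a device parameter as Gaussian with a prior from pre-silicon characterization, then tighten it with noisy measurements from manufactured chips.

```python
def posterior_normal(prior_mean, prior_var, noise_var, measurements):
    """Conjugate Bayesian update for a Normal mean with known noise variance.

    post_var  = 1 / (1/prior_var + n/noise_var)
    post_mean = post_var * (prior_mean/prior_var + sum(x_i)/noise_var)
    """
    n = len(measurements)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var
                            + sum(measurements) / noise_var)
    return post_mean, post_var
```

More measurements shrink the posterior variance, pulling the model toward what the silicon actually does.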

Fotofiti: photo annotation
11/2005
This was a class project for ECE 268: essentially, a Flickr-ish research platform for trying to automate image annotation using contextual metadata. It was a first foray into Ruby on Rails and relational databases.

Freescale (formerly Motorola): Speed-binning
Summer 2005
Determining the maximum functional clock speed of a microprocessor (speed-binning) is traditionally a tricky, ad-hoc process based on functional testing: running many sequences of instructions in the hope of exercising the most timing-critical paths. Running functional tests is a costly endeavor that requires more test pins and test time than running structural (non-instruction-based) tests. Over the summer, I looked at using structural tests to predict the functional clock speed. Using a statistical learning technique called boosting, I devised a method that requires only a few structural tests and improves upon the accuracy of previous methods.
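The details of the actual method aren't reproduced here, but the boosting flavor is easy to sketch with a generic gradient-boosting regressor: start from a constant prediction and repeatedly fit a small "stump" to the current residuals. Data and names below are purely illustrative.

```python
def best_stump(X, resid):
    """Least-squares depth-1 regression stump fit to the residuals."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [r for row, r in zip(X, resid) if row[f] <= t]
            right = [r for row, r in zip(X, resid) if row[f] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - ml) ** 2 for r in left)
                   + sum((r - mr) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    return best[1:]

def stump_value(stump, row):
    f, t, ml, mr = stump
    return ml if row[f] <= t else mr

def fit_boosted(X, y, rounds=20, lr=0.5):
    """Stagewise boosting: each round fits the residuals of the ensemble so far."""
    base = sum(y) / len(y)
    pred = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = best_stump(X, resid)
        stumps.append(s)
        pred = [pi + lr * stump_value(s, xi) for pi, xi in zip(pred, X)]
    return base, stumps

def boosted_predict(model, row, lr=0.5):
    base, stumps = model
    return base + sum(lr * stump_value(s, row) for s in stumps)
```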

Given a large amount of unlabeled, normalized financial records, predict which accounts are "bad" and when they'll go bad. Used support vector machine (SVM) learning with Sean. Learned a lot about pre-processing, feature selection, and principal component analysis. Time-series prediction is hard, but we ended up doing pretty well with classification and won a campus prize.
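As a hypothetical illustration of the principal component analysis step (not our actual pipeline), the top component of a dataset can be found by power iteration on its covariance matrix:

```python
import math

def first_principal_component(X, iters=100):
    """Power iteration on the sample covariance matrix; returns the unit
    vector along which the mean-centered data varies the most."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[sum((X[i][a] - means[a]) * (X[i][b] - means[b])
              for i in range(n)) / n
          for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v
```

Projecting records onto the top few components is a standard way to shrink a feature set before handing it to an SVM.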

Pattern-based Statistical Timing analysis
2004-2005
Statistical timing analysis is about dealing with the realities of chip manufacturing in deep-sub-micron technologies. Due to process variations, worst-case corner-based design is increasingly inefficient, so statistical or probabilistic design is a hot topic. Most statistical timing analysis has so far concentrated on static, pattern-less analysis. We have been researching how to apply the same statistical analyses to actual patterns, which can provide information useful for chip testing.
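The statistical (rather than corner-based) view can be conveyed with a toy Monte Carlo, which is illustrative only, not our analysis: model each stage delay along a path as Gaussian and look at the resulting delay distribution instead of a single worst case.

```python
import random

def mc_path_delay(stage_means, stage_sigmas, trials=20000, seed=1):
    """Sample total path delay as a sum of independent Gaussian stage delays;
    return the sample mean and the 99th-percentile delay."""
    rng = random.Random(seed)
    samples = sorted(
        sum(rng.gauss(m, s) for m, s in zip(stage_means, stage_sigmas))
        for _ in range(trials))
    return sum(samples) / trials, samples[int(0.99 * trials)]
```

Designing to the 99th percentile rather than the sum of per-stage worst cases is exactly where the efficiency gain over corner-based design comes from.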

Pre-placement Wire-length Estimation
Fall 2003
This was a class project for ECE 256A. By looking at the logical netlist of a circuit, there are certain connectivity metrics we can use to try to predict interconnect length. This can be important for feasibility analysis without the cost of placement. I used support vector machine learning on various placement benchmarks and placement algorithms, which led to some conclusions about which metrics are most predictive of interconnect length.
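The project used SVMs, but the underlying idea is just regressing wire length on a netlist metric, which an ordinary least-squares fit illustrates. The metric and data below are made up for the example:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# hypothetical data: x = a net's fanout, y = its routed length after placement
fanout = [1, 2, 3, 4]
length = [10.0, 20.0, 30.0, 40.0]
slope, intercept = fit_line(fanout, length)
```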

NASA Jet Propulsion Laboratory: Low-density Parity-Check codes (LDPC)
Summer 2003
NASA's always interested in error-correcting codes for its space probes, and LDPC codes are hot due to their excellent performance and low complexity. At JPL, I worked with Jason Lee to implement in hardware a particular form of structured LDPC codes proposed by Jeremy Thorpe. We ended up with a functioning, scalable FPGA solution that we characterized with bit-error-rate testing.
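The decoding idea is easy to show at toy scale. Hardware LDPC decoders typically use message passing; the simpler bit-flipping variant below still conveys the parity-check structure, using a small (7,4) Hamming H matrix as a stand-in for a sparse LDPC one:

```python
def syndrome(H, word):
    """One entry per parity check: XOR of the bits that check touches."""
    return [sum(h * c for h, c in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, word, max_iters=10):
    """Gallager-style bit flipping: repeatedly flip the bit involved in the
    most unsatisfied parity checks until all checks pass."""
    word = list(word)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            break  # all parity checks satisfied
        counts = [sum(s[i] for i, row in enumerate(H) if row[j])
                  for j in range(len(word))]
        word[counts.index(max(counts))] ^= 1
    return word
```

Both the syndrome and the flip step are bitwise and local, which is why this family of decoders maps so naturally onto an FPGA.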

Asynchronous VLSI: 8051 Microcontroller project
2002-2003
Asynchronous, or clockless, design is an interesting practice. It's not commonly used in industry due to its complexity and the lack of design tools; however, it may have enormous benefits in terms of extremely low power consumption. In this project, we implemented an asynchronous version of the popular 8051 microcontroller, which involved writing a sequential specification and then decomposing and parallelizing it down to the level of transistor-implementable asynchronous building blocks. I worked on the 8051's register file.

Multi-vehicle Wireless Testbed
Summer 2001
This was an interdisciplinary project that brought together controls, mechanical engineering, electrical engineering, and computer science. It was a neat arrangement: overhead cameras for determining location, a wireless network for communication, and fan-propelled vehicles carrying laptops. I designed a microcontroller board that controlled the two fans and wrote a bunch of networking code to glue the system to the MATLAB-specified controller. We were then able to close the control loop and test the system.