What have I been up to?

librf : C++ port of Random Forests
12/2006
This is a C++ port of Leo Breiman and Adele Cutler's Random Forests classifier. The goal is to create a usable C++ library like LIBSVM, the popular support vector machine package. It is limited to binary classification so far. Project page is here.
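I haven't reproduced librf's actual interface here, but the core algorithm is easy to sketch: train each tree on a bootstrap sample of the data and classify by majority vote. A minimal, hypothetical Python illustration (using depth-1 "stumps" instead of full trees, and omitting per-split feature subsampling):

```python
import random

def train_stump(X, y):
    """Exhaustively pick the (feature, threshold, flip) split with best accuracy."""
    best = None
    for f in range(len(X[0])):
        for t in set(row[f] for row in X):
            preds = [1 if row[f] > t else 0 for row in X]
            acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
            for flip, a in ((False, acc), (True, 1 - acc)):
                if best is None or a > best[0]:
                    best = (a, f, t, flip)
    return best[1:]  # (feature, threshold, flip)

def stump_predict(stump, row):
    f, t, flip = stump
    p = 1 if row[f] > t else 0
    return 1 - p if flip else p

def train_forest(X, y, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    """Majority vote over the ensemble (binary labels 0/1)."""
    votes = sum(stump_predict(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0
```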

Google : blogsearch intern
Summer 2006
Really good times at the NYC office with the blogsearch group.

Test Optimization through Data Mining
1/2006
Microprocessor testing generates a great deal of data, much of which is thrown away. Now that pre-silicon modeling is increasingly inaccurate, it is important to have post-silicon methods for optimizing a test set. I'm currently developing a data-mining framework that identifies redundant and missing test patterns for test-set optimization by collecting test responses from Monte Carlo-generated circuit samples.
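The framework itself isn't shown here, but one way to frame the redundancy question is as set cover: if the faulty samples a test pattern detects are already caught by other patterns, that pattern is redundant. A hypothetical greedy sketch:

```python
def minimal_test_set(detects):
    """Greedy set cover over fault-detection data.

    detects maps a test-pattern name to the set of faulty circuit samples
    (e.g. from Monte Carlo simulation) whose responses that pattern flags.
    Returns a small covering subset; patterns left out are redundant.
    """
    uncovered = set().union(*detects.values())
    chosen = []
    while uncovered:
        # pick the pattern that catches the most still-uncovered samples
        best = max(detects, key=lambda t: len(detects[t] & uncovered))
        chosen.append(best)
        uncovered -= detects[best]
    return chosen
```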

Feedmo: RSS Feed clustering
12/2005
This was an independent project undertaken over Winter break. The idea was to build a Google News-style site for geeky and political blogs, which in theory would cut down on my web browsing time. It was also a chance to learn more about clustering algorithms, information retrieval, Ruby on Rails, Python, and relational databases.
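Feedmo's actual pipeline isn't documented here, but as a hypothetical illustration of the clustering idea: group feed items whose titles are cosine-similar over bag-of-words vectors, so posts covering the same story land together.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def cluster_titles(titles, threshold=0.3):
    """Greedy single-pass clustering: join a title to the first cluster whose
    seed item is similar enough, else start a new cluster."""
    vecs = [Counter(t.lower().split()) for t in titles]
    clusters = []
    for i, v in enumerate(vecs):
        for c in clusters:
            if cosine(v, vecs[c[0]]) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

A real aggregator would use TF-IDF weights and better tokenization, but the grouping logic is the same.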

Bayesian learning for VLSI
11/2005
VLSI design relies upon a hierarchy of models that provide layers of increasing abstraction (from device, to gate, to block, etc.). Unfortunately, the models closest to the physical silicon are costly to characterize and increasingly inaccurate as the manufacturing process shifts. These models are critical for reliable statistical analysis. We propose a method that uses Bayesian inference to learn key parameters from manufactured chips.
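The inference can be as simple as a conjugate update. As a toy example (not our actual model): treat a device parameter as Gaussian with a prior from pre-silicon characterization, then tighten it with noisy measurements from manufactured chips.

```python
def posterior_normal(prior_mean, prior_var, noise_var, measurements):
    """Conjugate Bayesian update for a Normal mean with known noise variance.

    post_var  = 1 / (1/prior_var + n/noise_var)
    post_mean = post_var * (prior_mean/prior_var + sum(x_i)/noise_var)
    """
    n = len(measurements)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var
                            + sum(measurements) / noise_var)
    return post_mean, post_var
```

More measurements shrink the posterior variance, pulling the model toward what the silicon actually does.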

Fotofiti: photo annotation
11/2005
This was a class project for ECE 268: essentially, a Flickr-ish research platform for trying to automate image annotation using contextual metadata. It was a first foray into Ruby on Rails and relational databases.

Freescale (formerly Motorola): Speed-binning
Summer 2005
Determining the maximum functional clock speed of a microprocessor (speed-binning) is traditionally a tricky, ad-hoc process based on functional testing: running many sequences of instructions in the hope of exercising the most timing-critical paths. Running functional tests is a costly endeavor that requires more test pins and test time than running structural (non-instruction-based) tests. Over the summer, I looked at using structural tests to predict the functional clock speed. Using a statistical learning technique called boosting, I devised a method that requires only a few structural tests and improves upon the accuracy of previous methods.
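The details of the actual method aren't reproduced here, but the boosting flavor is easy to sketch with a generic gradient-boosting regressor: start from a constant prediction and repeatedly fit a small "stump" to the current residuals. Data and names below are purely illustrative.

```python
def best_stump(X, resid):
    """Least-squares depth-1 regression stump fit to the residuals."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [r for row, r in zip(X, resid) if row[f] <= t]
            right = [r for row, r in zip(X, resid) if row[f] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - ml) ** 2 for r in left)
                   + sum((r - mr) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    return best[1:]

def stump_value(stump, row):
    f, t, ml, mr = stump
    return ml if row[f] <= t else mr

def fit_boosted(X, y, rounds=20, lr=0.5):
    """Stagewise boosting: each round fits the residuals of the ensemble so far."""
    base = sum(y) / len(y)
    pred = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = best_stump(X, resid)
        stumps.append(s)
        pred = [pi + lr * stump_value(s, xi) for pi, xi in zip(pred, X)]
    return base, stumps

def boosted_predict(model, row, lr=0.5):
    base, stumps = model
    return base + sum(lr * stump_value(s, row) for s in stumps)
```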

Given a large amount of unlabeled, normalized financial records, predict which accounts are "bad" and when they'll go bad. Used support vector machine (SVM) learning with Sean. Learned a lot about pre-processing, feature selection, and principal component analysis. Time-series prediction is hard, but we ended up doing pretty well with classification and won a campus prize.
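As a hypothetical illustration of the principal component analysis step (not our actual pipeline), the top component of a dataset can be found by power iteration on its covariance matrix:

```python
import math

def first_principal_component(X, iters=100):
    """Power iteration on the sample covariance matrix; returns the unit
    vector along which the mean-centered data varies the most."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[sum((X[i][a] - means[a]) * (X[i][b] - means[b])
              for i in range(n)) / n
          for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v
```

Projecting records onto the top few components is a standard way to shrink a feature set before handing it to an SVM.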

Pattern-based Statistical Timing analysis
2004-2005
Statistical timing analysis is about dealing with the realities of chip manufacturing in deep-sub-micron technologies. Due to process variations, worst-case corner-based design is increasingly inefficient, so statistical or probabilistic design is a hot topic. Most statistical timing analysis has so far concentrated on static, pattern-less analysis. We have been researching how to apply the same statistical analyses to actual patterns, which can provide information useful for chip testing.
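The statistical (rather than corner-based) view can be conveyed with a toy Monte Carlo, which is illustrative only, not our analysis: model each stage delay along a path as Gaussian and look at the resulting delay distribution instead of a single worst case.

```python
import random

def mc_path_delay(stage_means, stage_sigmas, trials=20000, seed=1):
    """Sample total path delay as a sum of independent Gaussian stage delays;
    return the sample mean and the 99th-percentile delay."""
    rng = random.Random(seed)
    samples = sorted(
        sum(rng.gauss(m, s) for m, s in zip(stage_means, stage_sigmas))
        for _ in range(trials))
    return sum(samples) / trials, samples[int(0.99 * trials)]
```

Designing to the 99th percentile rather than the sum of per-stage worst cases is exactly where the efficiency gain over corner-based design comes from.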

Pre-placement Wire-length Estimation
Fall 2003
This was a class project for ECE 256A. By looking at the logical netlist of a circuit, there are certain connectivity metrics we can use to try to predict interconnect length. This can be important for feasibility analysis without the cost of placement. I used support vector machine learning on various placement benchmarks and placement algorithms, which led to some conclusions about which metrics are most predictive of interconnect length.
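The project used SVMs, but the underlying idea is just regressing wire length on a netlist metric, which an ordinary least-squares fit illustrates. The metric and data below are made up for the example:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# hypothetical data: x = a net's fanout, y = its routed length after placement
fanout = [1, 2, 3, 4]
length = [10.0, 20.0, 30.0, 40.0]
slope, intercept = fit_line(fanout, length)
```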

NASA Jet Propulsion Laboratory: Low-density Parity-Check codes (LDPC)
Summer 2003
NASA's always interested in error-correcting codes for its space probes, and LDPC codes are hot due to their excellent performance and low complexity. At JPL, I worked with Jason Lee to implement in hardware a particular form of structured LDPC codes proposed by Jeremy Thorpe. We ended up with a functioning, scalable FPGA solution that we characterized with bit-error-rate testing.
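The decoding idea is easy to show at toy scale. Hardware LDPC decoders typically use message passing; the simpler bit-flipping variant below still conveys the parity-check structure, using a small (7,4) Hamming H matrix as a stand-in for a sparse LDPC one:

```python
def syndrome(H, word):
    """One entry per parity check: XOR of the bits that check touches."""
    return [sum(h * c for h, c in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, word, max_iters=10):
    """Gallager-style bit flipping: repeatedly flip the bit involved in the
    most unsatisfied parity checks until all checks pass."""
    word = list(word)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            break  # all parity checks satisfied
        counts = [sum(s[i] for i, row in enumerate(H) if row[j])
                  for j in range(len(word))]
        word[counts.index(max(counts))] ^= 1
    return word
```

Both the syndrome and the flip step are bitwise and local, which is why this family of decoders maps so naturally onto an FPGA.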

Asynchronous VLSI: 8051 Microcontroller project
2002-2003
Asynchronous, or clockless, design is an interesting practice. It's not commonly used in industry due to its complexity and the lack of design tools; however, it may have enormous benefits in terms of extremely low power consumption. In this project, we implemented an asynchronous version of the popular 8051 microcontroller, which involved writing a sequential specification and then decomposing and parallelizing it down to the level of transistor-implementable asynchronous building blocks. I worked on the 8051's register file.

Multi-vehicle Wireless Testbed
Summer 2001
This was an interdisciplinary project that brought together controls, mechanical engineering, electrical engineering, and computer science. It was a neat arrangement: overhead cameras for determining location, a wireless network for communication, and fan-propelled vehicles carrying laptops. I designed a microcontroller board that controlled the two fans and wrote a bunch of networking code to glue the system to the MATLAB-specified controller. We were then able to close the control loop and test the system.