Data vignette: men are worse drivers

This post is based on a notebook I wrote a couple of years ago. I'd like to revisit and expand on it, as well as correct some errors. The original notebook is here. In this post, I analyze traffic collision data from Los Angeles County in January 2012. The analysis is sound, but the conclusion in… Continue reading Data vignette: men are worse drivers


For presentations, focus on narrative

You've collected the data and run the analysis, and now you have to decide how to present it. You've considered it from every angle, and you're preparing a slide deck to match--detailed, lengthy and technical. Is this the right approach? Probably not. Rule of thumb: include no more than one figure per topic. When you're the technical… Continue reading For presentations, focus on narrative

Lead Scoring with Customer Data Using glmnet in R

Lead scoring is an important task for businesses. It means identifying which individuals in a population may convert (purchase) if marketed to, assigning them a probability of converting, or determining how much value each individual may have as a customer. Properly using data to support this task can greatly benefit your… Continue reading Lead Scoring with Customer Data Using glmnet in R

Deep Neural Networks: CS231n & Transfer Learning

Deep learning (the use of deep neural networks) has become a very powerful technique for dealing with very high-dimensional data, e.g. images, audio, and video. As one example, automated image classification has become highly effective. This task consists of assigning an image to one of a fixed number of classes. Look at the results of… Continue reading Deep Neural Networks: CS231n & Transfer Learning

Information Criteria & Cross Validation

A common problem with predictive models is overfitting. This happens when the model is too responsive and picks up trends that are due to quirks in a specific sample rather than general, underlying trends in the data-generating process. The result is a model that doesn't predict new data very well. A model can be made less responsive by regularization--i.e.… Continue reading Information Criteria & Cross Validation

A brief aside: Video Productions

On a slightly different note, I'd like to showcase my award-winning video productions here. These two won a total of $2500 in a competition sponsored by the UCLA Department of Chemical and Biomolecular Engineering: These were produced for the UCLA California NanoSystems Institute. These two bike safety videos each won $200 in a contest sponsored… Continue reading A brief aside: Video Productions