Sunday, April 24, 2016

K-Means from Scratch

by David Lettier


Using plain vanilla JavaScript, we built the K-Means algorithm and the silhouette coefficient metric from scratch. We discussed different mean initialization techniques and went into depth about Fouad Kahn’s approach. We covered the assignment step, where each data point is assigned to its closest mean, and the update step, where each mean is recomputed by averaging the features of its currently assigned data points. To avoid locking up the interface, we used requestAnimationFrame to cycle through the iterations of both K-Means and the silhouette coefficient metric. Lastly, we discussed one way of determining how well the data is clustered, the silhouette coefficient metric, and went over some unintuitive cases.
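As a rough illustration of the assignment and update steps, here is a minimal JavaScript sketch. It is not the code from the post: the function names (squaredDistance, assignToMeans, updateMeans, kMeansStep) are hypothetical, points are assumed to be plain arrays of numbers, and leaving an empty cluster's mean unchanged is an assumption of this sketch.

```javascript
// Squared Euclidean distance between two equal-length feature vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

// Assignment step: for every data point, find the index of its closest mean.
function assignToMeans(points, means) {
  return points.map((point) => {
    let best = 0;
    for (let k = 1; k < means.length; k++) {
      if (squaredDistance(point, means[k]) < squaredDistance(point, means[best])) {
        best = k;
      }
    }
    return best;
  });
}

// Update step: each mean becomes the feature-wise average of its assigned points.
// Assumption of this sketch: a mean with no assigned points stays where it is.
function updateMeans(points, assignments, means) {
  return means.map((mean, k) => {
    const members = points.filter((_, i) => assignments[i] === k);
    if (members.length === 0) return mean.slice();
    return mean.map((_, d) => members.reduce((sum, p) => sum + p[d], 0) / members.length);
  });
}

// One full K-Means iteration: assign, then update.
function kMeansStep(points, means) {
  const assignments = assignToMeans(points, means);
  return { assignments, means: updateMeans(points, assignments, means) };
}
```

In a browser, one plausible way to keep the page responsive is to call kMeansStep once per requestAnimationFrame callback and stop when the assignments no longer change.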

Friday, April 22, 2016

Fibonacci, LCM and GCD in Haskell


by David Lettier

Solving whiteboard problems every now and then can never hurt. We discussed the Fibonacci sequence, LCM, and GCD. All solutions were written in Haskell, but the algorithms translate easily to other languages. We discussed pattern matching, the Maybe monad, filter, map, and head. GCD was defined two ways: the first took an iterative approach, while the second, Euclid’s Algorithm, used a simple recursive method.
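Since the algorithms translate to other languages, here is a small JavaScript sketch of the three problems. It is a translation for illustration only, not the post's Haskell solutions; the function names and the choice of fib(0) = 0 are assumptions, and only the recursive Euclid variant of GCD is shown.

```javascript
// Euclid's algorithm: gcd(a, 0) = |a|, otherwise gcd(a, b) = gcd(b, a mod b).
function gcd(a, b) {
  return b === 0 ? Math.abs(a) : gcd(b, a % b);
}

// LCM from the identity lcm(a, b) * gcd(a, b) = |a * b|.
// Defined here as 0 when either input is 0.
function lcm(a, b) {
  if (a === 0 || b === 0) return 0;
  return Math.abs((a / gcd(a, b)) * b);
}

// n-th Fibonacci number with fib(0) = 0 and fib(1) = 1, computed iteratively.
function fib(n) {
  let [previous, current] = [0, 1];
  for (let i = 0; i < n; i++) {
    [previous, current] = [current, previous + current];
  }
  return previous;
}
```

Dividing by the GCD before multiplying in lcm keeps the intermediate value smaller than computing a * b first.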

Thursday, April 14, 2016

Max Subarray in Haskell

By David Lettier

We defined the max, or maximum, subarray problem as finding the maximum sum over all subarrays of contiguous values in the array. On top of this, we added an extra stipulation: finding the maximum sum of non-contiguous elements. Before we could begin, we had to parse and convert the input into arrays of integers. For the non-contiguous max sum, we used the foldl, or reduce, function. For the max subarray problem, we employed a recursive solution that keeps track of two inputs and whose base case is an empty array.
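Both sums can be sketched in a few lines of JavaScript. This is not the post's Haskell code: the contiguous case below uses Kadane's algorithm written as a reduce rather than the recursive solution described above, though it also carries two running values. The function names are hypothetical, and both functions assume a non-empty array from which at least one element must be chosen.

```javascript
// Maximum sum over non-empty contiguous subarrays (Kadane's algorithm).
// Carries two values: the best sum ending at the current element, and the best overall.
function maxContiguousSum(values) {
  if (values.length === 0) throw new Error("expected a non-empty array");
  const [first, ...rest] = values;
  return rest.reduce(
    ({ endingHere, best }, value) => {
      const next = Math.max(value, endingHere + value);
      return { endingHere: next, best: Math.max(best, next) };
    },
    { endingHere: first, best: first }
  ).best;
}

// Maximum sum over non-empty selections that need not be contiguous:
// add up every positive value, or fall back to the largest value if none is positive.
function maxNonContiguousSum(values) {
  if (values.length === 0) throw new Error("expected a non-empty array");
  const positives = values.filter((value) => value > 0);
  return positives.length > 0
    ? positives.reduce((sum, value) => sum + value, 0)
    : Math.max(...values);
}
```

For [2, -1, 2, 3, 4, -5] the contiguous maximum is 10 (the first five elements) and the non-contiguous maximum is 11 (all positive elements).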

Monday, March 28, 2016

Reelin' and ROCin', Receiver Operating Characteristic


By David Lettier

Starting from a made-up scenario, we built up a binary classification problem. We discussed the ROC curve, which charts the trade-offs between FPR and TPR at various threshold levels of class probability. Looking at the curve, we picked the best threshold for the best classifier.
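To make the threshold sweep concrete, here is a hedged JavaScript sketch that computes one ROC point per threshold. The data shape ({ score, label }) and the function names are hypothetical. The summary does not say how the best threshold was chosen, so selecting the point with the largest TPR minus FPR (Youden's J statistic) is an assumption of this sketch, not necessarily the post's criterion.

```javascript
// Each example is { score, label }: score is the predicted probability of the
// positive class, label is 1 (positive) or 0 (negative).
// Assumes the data contains at least one positive and one negative example.
function rocPoint(examples, threshold) {
  let tp = 0, fp = 0, fn = 0, tn = 0;
  for (const { score, label } of examples) {
    const predictedPositive = score >= threshold;
    if (predictedPositive && label === 1) tp++;
    else if (predictedPositive && label === 0) fp++;
    else if (label === 1) fn++;
    else tn++;
  }
  return { threshold, tpr: tp / (tp + fn), fpr: fp / (fp + tn) };
}

// One ROC point per candidate threshold.
function rocCurve(examples, thresholds) {
  return thresholds.map((threshold) => rocPoint(examples, threshold));
}

// Assumed selection rule: the point that maximizes TPR - FPR (Youden's J).
function bestByYoudenJ(curve) {
  return curve.reduce((best, point) =>
    point.tpr - point.fpr > best.tpr - best.fpr ? point : best
  );
}
```

Plotting fpr on the horizontal axis against tpr on the vertical axis for the points returned by rocCurve gives the ROC curve.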

Monday, February 29, 2016

Triforce Overload, Sierpinski Pyramids

By David Lettier

We defined the HTML canvas and WebGL context. The shaders were loaded and indexes to their variables were gathered. Camera controls were defined and event handling was set in place. The user can pitch and yaw the camera allowing them to look around the scene. Per frame, we rendered the Sierpinski Pyramids subdivided based on the user’s selection. For added realism, we used the Phong Reflection Model for lighting and employed attenuated fog.
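The subdivision step can be sketched independently of WebGL. The snippet below is an assumption about how a Sierpinski pyramid might be subdivided, not the post's code: it treats a pyramid as a tetrahedron given by four [x, y, z] vertices and replaces it with four half-size copies, one at each corner. The names midpoint and subdivide are hypothetical.

```javascript
// Midpoint of two 3D points given as [x, y, z] arrays.
const midpoint = (a, b) => a.map((value, i) => (value + b[i]) / 2);

// Recursively subdivide a tetrahedron (an array of four vertices).
// Each level replaces a tetrahedron with four half-size copies, one per corner,
// so depth n yields 4^n tetrahedra.
function subdivide(tetrahedron, depth) {
  if (depth === 0) return [tetrahedron];
  return tetrahedron.flatMap((corner) => {
    // Shrink the tetrahedron halfway toward this corner.
    const smaller = tetrahedron.map((vertex) => midpoint(corner, vertex));
    return subdivide(smaller, depth - 1);
  });
}
```

The resulting vertex lists would then be flattened into a buffer for rendering, with the depth driven by the user's selection.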

Wednesday, February 17, 2016

Text Mining in R: Death Row Prior Occupations

By David Lettier

We collected, parsed, and mined the prior occupations of current death row inmates. Interesting patterns included the large portion of blue-collar occupations (most notably laborer) and rarer prior occupations such as computer operator and jewelry designer. An interesting hypothesis to test would be whether being on death row is associated with having been a laborer. The SVD computation could be used for information retrieval, allowing one to search for similar current inmates by some prior occupation query. Further analysis could include plotting the number of times each prior occupation appears per year.