Core Developer @ Hudson River Trading
Hazel is a live programming environment with typed holes that serves as a reference implementation of the Hazelnut Live dynamic semantics and the Hazelnut static semantics, both of which tackle the "gap problem." This work attempts to further develop the Hazelnut Live dynamic semantics by implementing the environment model of evaluation (as opposed to the current substitution model) and memoizing several evaluation-related operations to improve performance. Additionally, we provide an implementation-level description and a reference implementation of the fill-and-resume (FAR) performance optimization proposed in Hazelnut Live. We produce a metatheory and reference implementation of the proposed changes. Our implementation is benchmarked against the existing Hazel implementation to show that the results match expectations, although there is room for future improvement in the development of memoization. Finally, we discuss some useful theoretical generalizations that result from this work.
Image vectorization is the process of converting a raster (pixel-based) image to a vector (shape-based) image. While raster images are the dominant mode of image representation, vector graphics may be more efficient for highly geometric images, such as logos, fonts, and maps. Edge tracing methods for vectorization produce clean edges but assume a color-thresholded image. Sampling-based methods work well over color gradients but produce a mesh that may not be well-aligned with edges. We aim to create a hybrid pipeline that combines the benefits of these two methods: performing well over color gradients and producing clean edges. Our sample implementation in C++/Python demonstrates that our method tends to perform better in terms of accuracy (MSE) and visual presentation of edges than the base methods, at the cost of some efficiency of representation.
This project closely follows the tutorial, "Implementing functional languages: a tutorial" by Simon Peyton Jones at Microsoft Research. The goal is to implement a simple untyped version of a lazy functional language with a syntax similar to Haskell or Miranda. This work builds up the compiler front-end (lexer and LL(1) parser) and several back-ends, and an IR that is shared between all of the back-ends. The first implementation is the lazy "Template Instantiation" interpreter; the second is the "G-Machine", which approximates the STG (Spineless Tagless G-Machine) behind Haskell's implementation. The G-Machine compiles to a series of imaginary low-level opcodes that are mapped to real hardware.
CUDB (Cooper Union DataBase; alternatively CUDA++) is a simple document-based, MongoDB-inspired NoSQL database for learning purposes. The database is in the form of a Rust library that supports basic CRUD operations similar to MQL and a document-based data model. CUDB supports basic database features such as indexing using B-trees and persistence, but does not support many advanced features found in commercial DBMSes.
We aim to create a simple static analysis tool that tracks the use of stack-allocated buffers passed to known memory-unsafe functions in C source code, such as the notorious strcpy standard library function. This is achieved using a dataflow analysis and call graph tracing on the unoptimized LLVM IR generated using the clang compiler frontend. We develop a novel dataflow analysis method, called "buffer origin dataflow analysis", that allows us to track assignments of buffers. We evaluate this method and describe its shortcomings. A reference implementation is written using C++/LLVM.
Used SBCL (Steel Bank Common Lisp) to scrape eBay for seller data. This uses the lquery library to make HTTP requests, lparallel to parallelize requests, and cl-mongo to store data into MongoDB. Some data cleaning was necessary to get results in the same format.
Used the Haskell Beam ORM library to interface with a sample PostgreSQL test fixture (schema), and explored the trade-offs of ORMs in a non-OOP language. Explored various RDBMS concepts such as CTEs, correlated subqueries, one-to-many and many-to-many relationships, foreign key constraints, ORM libraries, etc. Also discovered a type error in the Beam library that disallows certain valid queries, as well as a workaround using roughly-equivalent queries.
Two goals are achieved: the extension of a call/cc-like interface to support multiple simultaneous continuations using an internal CPS representation, modeled after the nondeterministic interpreter from Structure and Interpretation of Computer Programs; and a comparative exploration aimed towards students unaccustomed to CPS (e.g., imperative programmers). The exploration also delineates common use cases for continuations and some notes about implementation.
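As a rough illustration of the idea (not the project's code), here is a minimal Python sketch of an amb-style nondeterministic search written in continuation-passing style. Every computation takes a success continuation and a failure continuation; calling the failure continuation backtracks to the most recent choice point. The names amb, require, and find_triples are hypothetical helpers chosen for this sketch.

```python
def amb(choices, succeed, fail):
    """Offer each choice to `succeed`; call `fail` once all are exhausted."""
    def try_next(i):
        if i == len(choices):
            return fail()
        # The second argument is the "retry" continuation for this choice point.
        return succeed(choices[i], lambda: try_next(i + 1))
    return try_next(0)


def require(condition, succeed, fail):
    """Continue with `succeed` if the condition holds, otherwise backtrack."""
    return succeed(fail) if condition else fail()


def find_triples(limit):
    """Collect all Pythagorean triples a <= b <= c <= limit by backtracking."""
    found = []

    def top_fail():
        # No choice points remain: the search is complete.
        return found

    def with_a(a, retry_a):
        def with_b(b, retry_b):
            def with_c(c, retry_c):
                def record(retry):
                    found.append((a, b, c))
                    return retry()  # backtrack to look for further solutions
                return require(a * a + b * b == c * c, record, retry_c)
            return amb(list(range(b, limit + 1)), with_c, retry_b)
        return amb(list(range(a, limit + 1)), with_b, retry_a)

    return amb(list(range(1, limit + 1)), with_a, top_fail)


print(find_triples(5))
```

Python does not eliminate tail calls, so every continuation call adds stack frames and this sketch only suits small search spaces; a Scheme implementation does not have that limitation.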
A color organ is a speaker with lights that light up when certain frequencies are played. We implemented a very simple version, mainly comprising a set of active bandpass filters lighting up LEDs for three frequency bands, and a power amplifier to drive the speakers. (a.k.a. SSSSS: Seventies-Style Sight and Sound System). The built product achieves its role as a school project, but the power design is not well-calculated (leading to excessive heat generation and speaker blowout) and the filtering method (thresholding with momentum for detection and denoising) is naive.
The Canny algorithm is a common multi-stage algorithm for edge detection in image processing. The standard algorithm is implemented on CUDA and performance is compared to a single-threaded CPU version. Several proposed algorithms from Luo and Duraiswami (2008) are implemented (including tiling, factoring, and a localized version of graph search) and compared to the naive versions, achieving the theoretical speedup.
A (poor) C99 compiler for the compilers class. Aims to implement a reasonable chunk of the C99 standard. The compiler was built using the standard compiler tools flex (lex) and bison (yacc). Successive assignments included lexing; parsing (expressions, declarations, statements); quad generation; and target code generation. Builds working but very unoptimized x86_64 target code (no time to learn register allocation, so all temporaries are memory-backed). The compiler generates correct output for several test cases using the supported parts of the C99 grammar.
A convex hull cloud service, built on AWS in order to be scalable, highly-available, and performant. Built serverless on top of S3, Lambda, and API Gateway. Users can upload an .obj 3-D model and download or preview the convex hull of that object. Uses @markus-wa/quickhull-go for a 3-D Golang QuickHull (convex hull) implementation. Front-end powered by Vue, back-end (lambdas) written in Golang. (Also wrote a 2-D quickhull implementation and failed to write a complete 3-D quickhull implementation in the time provided, included in the source repository.)
Graph coloring using the Gebremedhin-Manne distributed-memory speculative coloring algorithm for ECE453 Cloud Computing. The goal is to achieve speedup with parallel (CPU) computation of a real distributed computing algorithm. The first assignment was to implement this on a multi-threaded single-node application, and the second assignment was to extend to multiple nodes. While this particular algorithm scaled well onto multiple threads (a distributed-memory, single-node architecture), high communication costs between nodes and insufficient network speeds prevented a speedup.
Lucy-Richardson is a blind (i.e., without a-priori information of the blur mechanism) iterative deconvolution algorithm used to deblur an image. CUDA version compared with CPU version in terms of accuracy and speed. Deblurring effectiveness was evaluated using a LoG filter (edge detector). CUDA acceleration achieved an order of magnitude speed improvement with a naive implementation. Additional tiling and factoring improved speedups even more.
Parse tree generator using CYK algorithm on a Chomsky-Normal Form (CNF) grammar for ECE467. The CYK algorithm is a dynamic programming approach to generating all valid parses of a CNF grammar. The resulting parses are pretty-printed in a LISP-like tree format. Several test cases are provided on a simple grammar.
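A minimal Python sketch of the CYK dynamic program follows, for illustration only. The toy grammar, the table layout, and the function name cyk_parses are assumptions made for this example and are not taken from the project.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky Normal Form.
# Binary rules: a pair of nonterminals -> the nonterminals that derive that pair.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules: a word -> the nonterminals that derive it.
LEXICAL = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cyk_parses(words, start="S"):
    """Return every parse of `words` rooted at `start` as a Lisp-like string."""
    n = len(words)
    if n == 0:
        return []
    # table[i][j] maps a nonterminal to all of its trees spanning words[i:j].
    table = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for sym in LEXICAL.get(word, ()):
            table[i][i + 1].setdefault(sym, []).append(f"({sym} {word})")
    # Build longer spans from every split point of shorter spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                left, right = table[i][k], table[k][j]
                for b, c in product(left, right):
                    for a in BINARY.get((b, c), ()):
                        for lt, rt in product(left[b], right[c]):
                            table[i][j].setdefault(a, []).append(f"({a} {lt} {rt})")
    return table[0][n].get(start, [])


for tree in cyk_parses("the dog sees the cat".split()):
    print(tree)
```

Storing the trees themselves in each cell keeps the sketch short; a larger grammar would usually store back-pointers and reconstruct the trees afterwards.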
Naive Bayes bag-of-words (supervised) text categorization on three sample corpora. Uses the NLTK word tokenizer and Laplace (+1) smoothing.
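For illustration, a minimal Python sketch of a multinomial Naive Bayes classifier with Laplace (+1) smoothing is shown below. It assumes pre-tokenized input (plain whitespace splitting stands in for the NLTK tokenizer), and the training data and the function names train and predict are made up for this example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: iterable of (tokens, label) pairs. Returns the counts used by predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest smoothed log-posterior."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: ignore words never seen in training
            # Laplace smoothing: add 1 to each count and |V| to the denominator.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data, for illustration only.
model = train([
    ("goal match team".split(), "sports"),
    ("team win goal".split(), "sports"),
    ("vote election party".split(), "politics"),
    ("party vote law".split(), "politics"),
])
print(predict(model, "goal team".split()))
```

Working in log space avoids floating-point underflow when many small probabilities are multiplied.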
Neural network written in Scheme. Uses naive, mostly-functional (linked-list-based) approaches in place of the typical imperative (array-based) ones, at the cost of performance. "Classes" for different layer types (sigmoid, dense, and loss layers) are implemented as a polymorphic (train-lambda, infer-lambda, weights) tuple. Default weights can be specified (for deterministic results) or randomly generated. Using LISP allows interactive debugging and support using the native interpreter (as opposed to a difficult-to-use C "REPL" like gdb). Tested on several small datasets.
We base our methods and intuition off of Li et al. (2018), which attempts to define and estimate an "intrinsic dimension" of a given learning problem in conjunction with the neural architecture (the objective landscape). This is equivalent to approximating the minimum parameterization of an objective landscape, and may have practical application in model compression. We extend these methods with a series of experiments, mostly concerning using nonlinear transformations, in order to find a more minimal intrinsic dimension. For ECE472 final project.
Similar to the projects on digital and analog modulation schemes below, but experimenting with a linear block code (LBC) and error detection/correction. For ECE300.
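The blurb above does not say which linear block code was used, so the following Python sketch uses a systematic Hamming(7,4) code purely as an assumed example of encoding and single-error correction via syndrome decoding. The parity matrix P and the function names are choices made for this sketch.

```python
# Assumed example code: systematic Hamming(7,4) with G = [I | P] and H = [P^T | I].
# The rows of P are distinct and each has weight >= 2, so all seven columns of H
# are distinct and nonzero, which is what makes single-error correction possible.
P = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]


def encode(msg):
    """Append 3 parity bits to a 4-bit message (a list of 0/1 values)."""
    parity = [sum(msg[i] * P[i][j] for i in range(4)) % 2 for j in range(3)]
    return list(msg) + parity


def syndrome(word):
    """Compute H * word over GF(2); all zeros means no detected error."""
    return [
        (sum(word[i] * P[i][j] for i in range(4)) + word[4 + j]) % 2
        for j in range(3)
    ]


def decode(word):
    """Correct up to one flipped bit, then return the 4 message bits."""
    s = syndrome(word)
    corrected = list(word)
    if any(s):
        # A single-bit error yields a syndrome equal to that bit's column of H.
        columns = P + [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
        corrected[columns.index(s)] ^= 1
    return corrected[:4]


codeword = encode([1, 0, 1, 1])
received = list(codeword)
received[2] ^= 1  # simulate a single bit flip in the channel
print(decode(received))
```

With two or more flipped bits this decoder silently returns a wrong message, which is the expected limit of a distance-3 code.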
Similar to the project on analog modulation schemes below, but with digital modulation schemes. Experimented with binary antipodal, binary orthogonal, PSK, DPSK, QAM. Theoretical results are largely reproduced. For ECE300.
Checkers-playing AI using minimax-search with alpha-beta pruning and time-limited iterative deepening. Uses a custom heuristic based off of common Checkers strategy. Both a C++ and a Chez Scheme version were written, using very similar data structures and optimizations; Scheme version reaches roughly the same depth as C++ version. Scheme version attempts to remain mostly functional (e.g., using immutable bytevector state representations, which allow rewalking the "history" of states like in Redux); also includes some LISP-specific features such as a multi-level escape continuation to exit the search when time limit is exceeded. For ECE469.
Created an implementation of the MobileNet architecture described in arXiv:1704.04861, and verified some of the results produced by the author. Used CIFAR-10 rather than ImageNet due to computational limitations. Assignment was to reproduce the findings of a research paper. Midterm project for ECE472.
Used MATLAB to simulate signals and analog modulation schemes (conventional AM, SSB AM, FM, PM), and the results of applying changes to the parameters or of adding noise to the modulated signal. Able to reproduce most theoretical results fairly well. For ECE300.
Exploratory session into Scheme as the first installation of the IEEExACM club How-to series. An early attempt of mine to explain the usefulness of declarative (as opposed to imperative) programming, homoiconicity, a minimal syntax, the cons building block, etc. Presented to an audience of over 20 attendees.
Linear regression w/ L1, L2 regularization
Logistic (binomial and multinomial) regression with stepwise pruning, L1 regularization
Reproducing wrong and correct methods for doing K-fold CV
Regression using gradient-boosted trees (xgboost) and random forests (sklearn)
Comparing feature importance and performance between GBT and RF
Market-basket analysis (mlxtend)
Recommender systems, and an implementation using non-negative matrix factorization (NMF)
Experimenting with various machine learning techniques for ECE475. This includes supervised learning: linear regression/classification, ensemble classification (gradient-boosted trees and random forests), recommender systems; and unsupervised learning: market-basket analysis. Also explored proper validation techniques and regularization techniques: L1, L2, complexity pruning for decision trees.
Explored various web design procedures and technologies, including but not limited to: CSRF injection attack, session cookies, Java Spring framework, React.JS, Bootstrap, HTTP protocols, containerization (w/ Docker), many-to-one relationships (e.g., access control) in PostgreSQL, prepared SQL statements, one-way hashing (w/ bcrypt), SQL foreign keys, etc. For ECE366.
Constructed a theremin (the instrument) using the principles of circuit design and operational amplifiers. Final project for ECE291.
Implemented various common sorting algorithms adapted for provided datasets. Explored quicksort, counting/bin sort, indirect sort, insertion sort.
Engineering design project aimed towards providing a low-cost alternative to diabetes test strips. Uses a chemical that changes color based on glucose composition. Accompanying Android app to help determine glucose composition from color. (First time using React.) For EID101.
Recreated an 8x8 version of the Tron game in hardware as the Digital Logic Design final project. Involves use of CMOS 4000-series logic chips, flip-flops, and various timing optimizations. For ECE150.
Create Performance Task for AP CSP. Worked with Rahul Kiefer. Multiplayer "racing" game written with a Node.JS+socket.io backend, THREE.js 3D library, JS orientation library for smartphone controls. (Much of the same technology as Fruit Sensei.)
Itmine is a tool to help people not lose things. Generate a QR code to put on your belongings; if lost, it will generate a shipping label to the owner and give a small monetary award to the finder. Powered by the MEAN stack and the ShipEngine API. Submitted to the ERP Hackathon hosted by General Assembly.
Map out a region using accelerometer data, using the basic principle that distance is the second integral of acceleration. Won third place project at Stonybrook Local Hack Day 2018.
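The double-integration principle can be sketched in a few lines of Python. This is an illustrative one-dimensional version using the trapezoidal rule, with a made-up sample interval and made-up data; the function name integrate_position is hypothetical and this is not the hackathon code.

```python
def integrate_position(accels, dt, v0=0.0, x0=0.0):
    """Integrate 1-D acceleration samples twice using the trapezoidal rule.

    accels: acceleration samples taken every `dt` seconds.
    Returns the estimated position at each sample time.
    """
    velocity, position = v0, x0
    positions = [position]
    for a_prev, a_next in zip(accels, accels[1:]):
        new_velocity = velocity + 0.5 * (a_prev + a_next) * dt
        position += 0.5 * (velocity + new_velocity) * dt
        velocity = new_velocity
        positions.append(position)
    return positions


# Constant 2 m/s^2 for 1 second, sampled every 0.1 s: expect x = a * t^2 / 2 = 1 m.
samples = [2.0] * 11
print(integrate_position(samples, dt=0.1)[-1])
```

In practice any constant bias in the accelerometer reading is also integrated twice, so the position error grows quadratically with time.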
A gamified community with an education-focused forum. Won Best Use of Algolia award at HackCooper 2018.
Created a number of projects for the MoMath: Expressions 2018 hackathon. See repo for more details.
Use your smartphone as a controller in a fruit-slicing game! Back when Fruit Sensei was popular. Won Best Game at StuyHacks Local Hack Day 2017.
Created a number of projects for the MoMath: Expressions 2017 hackathon. See repo for more details.
First hackathon project (LIHacks 2016)! Worked with Chris Vassallo to create a user-guided music generator using common sequences of notes. Won Most Entrepreneurial award.
Online presence for School of Engineering 2021 EOYS during COVID-19 pandemic. Performed technical advising and support: hosting & domain name (w/ GitHub Pages), Vue/Vite framework setup, Bootstrap integration.
Maroon and Gold Labs incubator for Cooper students. Created on Strikingly. Implemented a DonorBox integration, designed several visual assets, and provided technical and design-oriented advice.
Web app for the newly-founded "Safe Rides" program at JBHS. Provides a centralized management (Node.JS web backend) to communicate between volunteers and clients. Volunteers have to check in each time they reach a checkpoint on the route, and relevant parties are notified and their location shown on a GPS map. Uses Twilio to send SMS messages to relevant parties. Not operational anymore.
Configuration tool (v2) source
Blog post (Driver fundamentals)
Blog post (Button mapping journey)
A Linux driver for VEIKK-brand digitizers/drawing tablets using the usbhid API. Configuration options are exposed in sysfs and configurable with an associated GUI. Connects low-level hardware events to the input subsystem (e.g., libinput), with some processing/mapping logic in between. This draws heavily from the Wacom driver for Linux.
Template for compiling a Groovy application into an uber-JAR using mvn so that you can easily run or export it without requiring a Groovy installation on the client.
Second attempt at making a gallery website. Preceded by Gatollery.
Second attempt at handling Project Euler, but in Java (first time was in JS). Slightly less naive implementations.
First attempt at making a gallery website. Superseded by Catcake.
CSS classes as "color filters" using Sass.
My (current) personal blog. No front end frameworks employed (all JS/CSS is vanilla). This used to be hosted on Heroku at https://everything-is-sheep.herokuapp.com, but now is being hosted on a subdomain of this website.
Anagram finder. Additionally, scrambles words as a game for you to find an anagram.
Simplicity in the new tab page on Google Chrome. For those who don't like clutter.
Messing around with WebRTC for P2P video streaming.
Second iteration of a personal website.
Playing around with what we learned in physics class. Spring/elastic band simulation using p5/processing.js. Adjustable spring constants.
Using what we learned in AP Physics.
Website to keep track of scores for the JBHS bowling club and predict Varsity players for the upcoming week. First time attempting to use Angular Material.
Scrapes random Wikipedia pages and chooses sentences that start with a 5-7-5 syllable pattern (according to Datamuse API).
Generate fractals with the Chaos game (Sierpinski triangle & chaos squares/hexagons), the Thue-Morse sequence (Koch curve), and convergence of complex series (Julia/Mandelbrot sets).
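For illustration, the chaos game for the Sierpinski triangle fits in a short Python sketch: repeatedly jump halfway from the current point toward a randomly chosen vertex and record where you land. The function name chaos_game, the fixed seed, and the vertex coordinates are assumptions made for this example.

```python
import random


def chaos_game(vertices, n_points, ratio=0.5, seed=0):
    """Generate points by repeatedly jumping toward a randomly chosen vertex.

    ratio is the fraction of the distance covered per jump; 0.5 with three
    vertices produces the Sierpinski triangle.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    x, y = vertices[0]
    points = []
    for _ in range(n_points):
        vx, vy = rng.choice(vertices)
        x += (vx - x) * ratio
        y += (vy - y) * ratio
        points.append((x, y))
    return points


TRIANGLE = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]
points = chaos_game(TRIANGLE, 10_000)
print(len(points), points[-1])
```

Plotting the returned points with any scatter-plot tool makes the fractal visible; other vertex sets and jump ratios give different figures.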
Simple memory game in JS.
Simple JS tic tac toe. Doesn't play very well.
Rock paper scissors against a random computer opponent.
A simple tic-tac-toe player in C.
If you copy HTML from a webpage using Ctrl+C, you can paste it into this input box (using Ctrl+V) and get the underlying HTML. Handy little tool.
Save image as text! Also change font family, size, (some) lighting, letter colors, and more! Can be used to imitate simple text-only logos.
My attempt at making a simple debugging console like Firebug, since the browser developer console was slow (at the time).
My second attempt at revamping the original Nutmeg Bowl of Fairfield website. They didn't accept it.
Desktop xkcd viewer with a JavaFX GUI. Requires Java 8+.
Desktop xkcd viewer in the (Linux) terminal. Requires ImageMagick.
Imitation of the original game of the same name.
Real-time coding in the browser (like Google Docs but for code). Similar to the original chat, but experimented with syntax highlighting for inputted text.
First attempt at a blog. MySQL backend. Lazy-loading front-end. All the posts have been migrated to Everything is Sheep. Used to be hosted at thehomeworklife.co.nf.
Fun tool to make strings of text out of periodic element boxes. I used this to make a Christmas tree out of some Christmas song for Honors Chemistry.
Visual aid to help when balancing chemical equations for Honors chemistry.
First iteration of a personal website.
My first Node.js application (!!!). Learned how to use ExpressJS and socket.io to create a real-time chat server. Was also playing around with Chrome extensions at the time and made one that connects to the main chat server.
Aimed to help my sister win the regional Spelling Bee (which she did!). Playing around with the voice using the native JS speech synthesis API.
Fun little visual. Props to Hunter Lightman for the visuals and showing me how to use trig to animate circles with code for the first time.
Naive attempt at Flappy bird. Buggy but kind of fun.
First attempt at Project Euler. Many of these are brute force. Written while attending a summer program for learning C.
First website presented to peers. Calculator tools (in JS), and an explanation of how HTML/CSS/JS/PHP fit together to form a website.
Blog post (understanding memory)
An experimental operating system for learning purposes. Developed for the x86_64 platform using the QEMU virtualization software. At the time of writing, this uses the Limine bootloader and I'm working on developing very simple text-mode terminal/keyboard/screen drivers. Very much a work in progress!
A flexible framework for defining (rule-based) board games. The goal is to provide a framework for digital game creators to define games programmatically, as well as a series of pre-built pluggable components (e.g., card game mechanics) that fit this framework. This provides the usefulness of a framework with predefined API's and components, but also the flexibility of a library to provide custom implementations for these API's, or to override the API's themselves as necessary. To accomplish this, I initially attempted to use Golang, but realized that the type system was not advanced enough; Scala's more advanced type system and JVM support made it the language of choice. (I think the original name was supposed to be "basic game engine", but I'm not sure if that's actually the case.)
© Copyright 2023 Jonathan Lam