Applied Research
At CHAI, all of our growth is driven by better AI.
Optimize
Our focus is on applying proven techniques such as RLHF, SFT, prompt engineering, rejection sampling, and LLM routing. We iterate across approaches such as DPO, LoRA fine-tuning, reward-model training, and embedding-based recommender systems. Only by making the AI better do we see an increase in metrics such as monetization and engagement.
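As a concrete illustration of one of these techniques, here is a minimal sketch of rejection sampling (best-of-N) against a reward model. The `generate` and `score` callables are hypothetical placeholders, not our production stack:

```python
# Minimal sketch of rejection sampling (best-of-N) with a reward model.
# The generator and reward-model interfaces are illustrative placeholders.
from typing import Callable, List

def best_of_n(
    generate: Callable[[str], str],      # samples one completion for a prompt
    score: Callable[[str, str], float],  # reward model: (prompt, reply) -> score
    prompt: str,
    n: int = 8,
) -> str:
    """Sample n candidate replies and keep the one the reward model prefers."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda reply: score(prompt, reply))
```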
Scale
It is well known that scaling up LLMs improves their performance. This scale has many dimensions: parameter count, dataset size, inference compute, context length, and the number of LLMs served. Each dimension creates its own engineering challenge.
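One widely cited way to make "scaling improves performance" concrete is the parametric loss fit from the Chinchilla scaling-law work (Hoffmann et al., 2022), where $N$ is the parameter count and $D$ the number of training tokens:

\[ L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} \]

As $N$ and $D$ grow, the loss decays toward its irreducible floor $E$, which is why scale along either axis pays off.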
Firstly, scaling up tends to increase costs; this drives us to experiment with techniques such as quantization, custom CUDA kernels, Flash Attention, and KV caching (sketched below). Secondly, at a certain scale, out-of-the-box solutions tend to break down; this drives us to build custom infrastructure, such as a self-managed Kubernetes cluster and our own inference engines.
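As a rough illustration of why KV caching cuts inference cost, here is a toy numpy sketch of cached autoregressive attention. The dimensions, projections, and `decode_step` helper are illustrative assumptions, not our production inference engine:

```python
# Toy sketch of KV caching during autoregressive decoding: keys/values for
# past tokens are computed once and reused, so each new token costs one
# projection instead of reprocessing the whole sequence.
import numpy as np

d = 16  # head dimension (toy size)
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

k_cache, v_cache = [], []  # grows by one entry per decoded token

def decode_step(x: np.ndarray) -> np.ndarray:
    """Attend the newest token's query over all cached keys/values."""
    q = x @ Wq
    k_cache.append(x @ Wk)  # cache instead of recomputing past keys
    v_cache.append(x @ Wv)
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ q / np.sqrt(d)           # one query vs. all cached keys
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over past positions
    return weights @ V

for _ in range(5):  # decode five toy tokens
    out = decode_step(rng.standard_normal(d))
```

The cache trades memory for compute: attention per step stays linear in sequence length rather than quadratic over the whole history, which is exactly the cost pressure quantization and Flash Attention attack from other angles.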