Setting another new SOTA on the General AI Assistants (GAIA) benchmark with Trase Agent
We stayed at #1 on the GAIA leaderboard for several months before being bested by Google LangFun and H2O GPTe. Now, we've reclaimed our spot at #1!
Lately I've been working on agents at Trase Systems, and we recently set a new state of the art on the GAIA benchmark using a modified version of Self-Taught Reasoner (STaR). I thought I would share some of the things I learned along the way.
I've been working on helping ketamine therapy patients illustrate their experiences using Stable Diffusion. While the Midjourney Discord bot is generating images, it makes the wait a little more enjoyable by showing you your images at the current step. I was surprised that there wasn't any paid API or simple code on GitHub with this feature, so I built the frontend and backend using SDXL, Diffusers, WebSockets, React, and FastAPI and made the code open source.
My journey into open research with Stability AI MedARC: people were shown images while in an fMRI machine, and we decoded those images using contrastive learning and diffusion models. So basically mind reading. We set a new state of the art and were accepted as a spotlight at the NeurIPS 2023 conference. I had a good time being the oldest person in the Discord channels.
Over the past few months I've trained several DreamBooth models for clients to inject new concepts into various base Stable Diffusion models. In each case, the client wanted to insert several new concepts into the model, as opposed to the typical case of inserting a single concept like a new face. In this post, I'll describe the process I used to train these models and the heuristics I've developed for hyperparameters.
I developed this website to transform Jupyter notebooks into collaborative whiteboards similar to Figma, with the goal of streamlining remote data science meetings and reducing the frequent, "Can you scroll up a bit? No, too far, go back down..." interruptions. It incorporates Figma-esque functionalities such as displaying each collaborator's mouse location in real-time and syncing freehand drawings, text, and sticky notes. Additionally, it offers notebook-specific features like synchronized scroll positions, ensuring that all collaborators are literally on the same page.
Tabular AutoML is where you provide a table (via CSV or whatever) and a column that you want to predict, then a search process finds the pipeline to predict that column most accurately within some time limit. Given the proliferation of open-source tabular AutoML frameworks and the fact that I spent a lot of time working on the closed-source Darwin one at SparkCognition, I thought it would be helpful to have a single API where you can access all of them.
I played paintball competitively for over a decade and was curious whether an off-the-shelf pose tracking algorithm could be used for next-gen paintball stats.
A way to load BibTeX paper citations into your Arxiv Sanity library so that you can find even more papers to read. I'll often cite a bunch of things in an Overleaf paper and then use this to add the references back to Arxiv Sanity so that I can get recommendations based on them.
Zipline is a Python library for backtesting and trading quantitative strategies. TensorBoard is a visualization tool provided with the deep learning library TensorFlow. These two can be used together to create a quick and easy dashboard that monitors and compares Zipline backtests.
Fine-tuning ImageNet Caffe models to classify flowers.