Overview
As we all know, the age of AI slop is truly upon us. You can’t go anywhere online without seeing AI-generated writing trying to pass as human-written. This is an especially critical issue in education, where many students can now earn a degree with very little effort by offloading their thinking to LLMs. I’m not anti-AI overall (it’s definitely beneficial in some parts of education and general productivity), but I wanted to take a shot at solving this problem.
ProofWrite is a proof-of-concept platform for real-time AI authorship verification. It works by tracking a user’s keystrokes over the course of writing a document, computing properties of those keystrokes, and running a search against a vector database of other documents (some AI, some human). Ideally, this approach catches AI-generated text not just when it’s copy-pasted, but also when it’s typed out word for word from another monitor, for example.
Technical Details
The core of the project is keystroke analysis, which is done using an AI agent that has the following tools available:
detect_paste_events: get all the paste events in the keystroke history
get_keystroke_metrics: get a vector representing the keystroke metrics/anomalies
analyze_text_patterns: use a keystroke metrics vector to return a classification (i.e., human or AI) and confidence score using vector search
analyze_linguistic_features: for any selected segment of the document’s text, analyze various linguistic patterns and provide scores
get_text_source_similarity: for any selected segment of the document’s text, compare it to a database of AI/human text and obtain a similarity score for each
generate_report: generate final analysis report
These tools give the agent options for what metrics to rely on for its report.
For example, if a certain paragraph looks suspicious, the agent can provide that paragraph to analyze_linguistic_features and get_text_source_similarity to determine metrics for that paragraph in particular, and describe those findings in the final report.
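As a rough illustration of the keystroke-side tools, here’s a minimal Python sketch of what detect_paste_events and get_keystroke_metrics could compute. The KeyEvent schema, the five metrics, and the 2-second pause threshold are all assumptions made for this example, not the project’s actual implementation.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class KeyEvent:
    """One recorded editor event (hypothetical schema for this sketch)."""
    t_ms: int        # milliseconds since the writing session started
    kind: str        # "key", "backspace", or "paste"
    text: str = ""   # characters inserted by this event, if any


def detect_paste_events(events: list[KeyEvent]) -> list[KeyEvent]:
    """Return every paste event in the keystroke history."""
    return [e for e in events if e.kind == "paste"]


def get_keystroke_metrics(events: list[KeyEvent], pause_ms: int = 2000) -> list[float]:
    """Encode a keystroke history as a small fixed-length vector.

    Assumed layout: [mean gap, gap std dev, long-pause ratio,
    backspace ratio, pasted-character ratio].
    """
    typed = [e for e in events if e.kind in ("key", "backspace")]
    # Time between consecutive typed events (pastes are excluded here).
    gaps = [b.t_ms - a.t_ms for a, b in zip(typed, typed[1:])]
    inserted = sum(len(e.text) for e in events)
    pasted = sum(len(e.text) for e in detect_paste_events(events))
    return [
        float(mean(gaps)) if gaps else 0.0,
        float(pstdev(gaps)) if gaps else 0.0,
        sum(g > pause_ms for g in gaps) / len(gaps) if gaps else 0.0,
        sum(e.kind == "backspace" for e in typed) / len(typed) if typed else 0.0,
        pasted / inserted if inserted else 0.0,
    ]


events = [
    KeyEvent(0, "key", "h"),
    KeyEvent(100, "key", "i"),
    KeyEvent(300, "backspace"),
    KeyEvent(3300, "paste", "hello"),
]
metrics = get_keystroke_metrics(events)
```

The intuition behind metrics like these (again, an assumption rather than a measured result) is that text copied off another screen might show a steadier rhythm, fewer long pauses, and fewer corrections than text composed from scratch.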
Here is what that report looks like in the app:
The app also provides the ability to replay a document’s keystrokes from start to finish, in order to analyze the typing patterns yourself:
Some more detail on the vector search component: vector search usually operates on embeddings with dimensions like 384, 768, 1024, or 1536, which are produced by an embedding model. We instead took a less conventional approach: we built our own vector that encodes the keystroke metrics, then populated a vector database with entries that each correspond to either a human-written or an AI-generated document. When the vector search runs in MongoDB Atlas with the vector of a new document, it finds the closest stored vectors. If most of the nearby vectors belong to AI documents, the classification is AI; if most belong to human documents, the classification is human.
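To illustrate the classification step, here’s a minimal Python sketch of a nearest-neighbor vote on top of an Atlas $vectorSearch aggregation stage. The index name keystroke_metrics_index, the metrics and label field names, the choice of k, and the use of the vote share as the confidence score are all assumptions for this example; the real project may differ.

```python
from collections import Counter


def build_vector_search_pipeline(query_vector: list[float], k: int = 5) -> list[dict]:
    """Build an aggregation pipeline that returns the k nearest stored documents."""
    return [
        {
            "$vectorSearch": {
                "index": "keystroke_metrics_index",  # hypothetical index name
                "path": "metrics",                   # hypothetical vector field
                "queryVector": query_vector,
                "numCandidates": 20 * k,             # candidates considered before the top k
                "limit": k,
            }
        },
        # Keep only the stored label ("ai" or "human") and the similarity score.
        {"$project": {"_id": 0, "label": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


def classify(neighbor_labels: list[str]) -> tuple[str, float]:
    """Majority vote over neighbor labels; confidence is the winning share."""
    if not neighbor_labels:
        raise ValueError("need at least one neighbor to classify")
    label, votes = Counter(neighbor_labels).most_common(1)[0]
    return label, votes / len(neighbor_labels)


pipeline = build_vector_search_pipeline([150.0, 50.0, 0.0, 0.33, 0.71], k=5)
# With a live cluster, the labels would come from something like:
#   [doc["label"] for doc in collection.aggregate(pipeline)]
result = classify(["ai", "ai", "human", "ai", "human"])
```

Using an odd k avoids ties when there are only two labels.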
Hack Midwest 2025
We built this app with the “Best Use of MongoDB Atlas” prize in mind, and I’m so grateful to say we won it!
The app idea originally began at a company hackathon I participated in during my internship at Pinata. The other interns and I whipped up a small-scale prototype of the concept, using the Pinata API to store the keystroke history of documents on IPFS as an immutable record. For Hack Midwest, we didn’t have time to implement IPFS in parallel with MongoDB Atlas (and also decided it might just confuse the judges), but I do think a proper implementation of this concept would track the history of writing on a decentralized public record (like with the Pinata API!).
Repository
anton-3/proofwrite
detect AI-generated text with keystroke analysis
Last updated on December 28, 2025.