Devin review: is it a better AI coding agent than Cursor?

TL;DR

This video compares Devin, an AI coding agent, with Cursor's agent mode, focusing on workflow, capabilities, and overall effectiveness. The presenter runs a range of coding tasks on both platforms and notes the strengths and weaknesses of each. He concludes that, given the current state of AI coding technology, Cursor's incremental approach is more practical and easier to adopt than Devin's more radical, agent-centric one.

Devin operates primarily through Slack, using a remote server and various tools, while Cursor integrates directly into the IDE.
Devin excels at creating plans, writing and debugging code, and running tests, but it can be slow and harder to control.
Cursor offers a more interactive and immediate experience, letting developers stay in control and give real-time feedback.

Introduction to Devin and Cursor [0:00]

The presenter introduces Devin, an AI coding agent with a $500 monthly subscription, and compares it with Cursor agents to judge whether it is worth the price. Devin works primarily through a Slack-based workflow backed by a remote server, a browser, a VS Code editing interface, and a planner. The presenter stresses that Devin is not an IDE: you tag Devin in Slack and ask it for updates or fixes.

Experimenting with Devin: Image Generation and Web UI [0:30]

The presenter asks Devin to generate images using a new model and to build a web-based UI for typing prompts and viewing the results. Devin clones the repository, generates images, and posts progress updates. It also takes notes in a notes.txt file for future reference and creates knowledge entries for information that may be useful later.

Devin's Capabilities and Limitations [1:36]

Devin shows impressive capabilities: it creates plans, writes code, finds and fixes bugs, and runs tests. It responds to feedback and tries to address issues raised in Slack. However, it fails to resolve a deployment issue despite extensive back-and-forth. When the presenter asks how to pull the code locally, the instructions Devin provides are invalid because the pull request they rely on is missing.

Devin's Pull Requests and Feedback Handling [2:15]

The presenter describes an earlier, successful experience in which Devin added a feature to a weather app and incorporated feedback to match iOS styling. The final pull request includes new packages and iOS-like styling, but it also contains a leftover console log and an unnecessary, uninstalled package. Devin generates a deployment with a preview URL to show the updated UI. Although Devin can save UI preferences in its knowledge base, in this instance the presenter has trouble getting it to respond to feedback.

Debugging with Devin and Explanations [3:26]

Devin is asked to fix a bug on a website and opens a pull request with a fix. The pull request also contains unexpected changes, so the presenter asks Devin to explain them. The explanations Devin gives are inaccurate. Even so, the presenter appreciates being able to communicate with Devin as he would with a human colleague and to leave comments requesting updates.

Implementing a Backend Feature with Devin [4:33]

Devin implements a backend feature that reads from and writes to a Comments collection in the GraphQL admin API. The resulting pull request is decent: it follows the resolver structure and adds a comment resolver. Although Devin makes up a couple of fields, the code is generally consistent with the existing backend structure.

Overall Impression of Devin and Comparison with Cursor [5:03]

The presenter says Devin's workflow is not how he prefers to work, citing long wait times and unfamiliar tools. He prefers Cursor's workflow, which allows real-time updates and local debugging inside the IDE. Devin aims to enable asynchronous agent co-workers, but its reliability needs to improve before that vision is practical.

Cursor Agents: Fixing Client-Side Routing Bug [5:54]

The presenter uses Cursor agents to fix a client-side routing bug. Cursor scans the code base, finds the relevant files, and updates the necessary variable. The presenter appreciates being in control and able to provide real-time feedback. He demonstrates how Cursor can delete the variable and all references, providing immediate updates.

Cursor Agents: GraphQL Prompt and Agentic Workflow [7:11]

The presenter runs the same GraphQL prompt through the Cursor agent on a large internal repo and gets results similar to Devin's. He points out that Cursor's agent mode removes the need to specify files. When cloning an image generator model repo, Cursor asks for confirmation before running commands, a more cautious approach than Devin's. Cursor automatically attempts to fix errors, but the presenter's computer freezes before the task is completed.

Final Thoughts and Preference for Cursor [8:11]

The presenter concludes that Devin is unlikely to see the widespread adoption Cursor has, citing its high cost and less intuitive workflow. He favors Cursor's incremental approach over Devin's radical, agent-centric one. He envisions a workflow in which developers iterate with Cursor alongside their teammates, with tools like Builder.io handling design-to-code conversion. He is excited about the competition in the agent coding space and expects further advances.