My opinion after testing some AI code assistants

Posted on 2023-08-21 in Programmation

With all the hype around AI and since I had time to spare, I decided to test some AI coding assistants to form my own opinion about them. I'll start by giving my opinion on each assistant I tried. I will be a bit fuzzy since I didn't intend to write this blog post and didn't take as many notes as I should have. Then I'll wrap up with my general opinion and some advice.

I ran most of my tests while writing a small task manager in React, VueJS, Svelte and Elm. I tried to switch between the assistants from one implementation to the next to see the differences. I tried CodeWhisperer last, on a dedicated test in TypeScript and in Python, not on the task manager project. I ran the same tests with Codeium to have a comparison point. I mostly used VSCode for my tests and used Codeium a bit with PyCharm.

Tested code assistants

AWS CodeWhisperer

AWS CodeWhisperer is AWS's take on the subject. It's tied to your AWS account and to the AWS extension for your editor. It's free for individuals.

According to my tests, it's the worst overall. On the bright side, it did generate a few test cases that were more relevant to my use case than those of the other tools, and it can handle imports automatically. But it suggested the any type way too often in its TS completions, proposed fewer variations between suggestions, proposed more irrelevant suggestions, proposed more complex suggestions and relied too often on line-by-line suggestions (versus block/function suggestions for the other assistants). It's also way slower and felt unresponsive at times.

It's also the only tool that offers a security scanner. I tried it on a piece of Python code and its suggestions were good: I was missing a context manager and had a SQL injection vulnerability, both of which the tool spotted. It was very slow, even on my small file (~10 lines of code). I don't know how it behaves or what it reports on larger and more complex pieces of code.

To conclude: it's definitely a pass for me. In its current state, it's a pass even if you are tied to AWS products. Just code without it; you should have a better time. The security scanner could be a good thing if it works properly on large code bases, though.

GitHub Copilot

GitHub Copilot is, you guessed it, made by GitHub. It's linked to your GitHub account and requires a dedicated extension in your editor. It's not free, even for individuals, but it comes with a one-month trial period (that's what I used).

According to my tests, it proposed the best suggestions of all the tools I tested. It's even good at not messing up parentheses and curly braces too much when completing inside existing code (where all the other tools had more issues). It's also fast to respond. It does the job and does it well (for an AI assistant).

There is also a chat in beta. I couldn't try it: it's only open to business accounts right now. But, from my experience with Codeium, it should be a valuable addition.

Codeium

Codeium is a product made by a startup. It requires an account (you can use your Google account) and a dedicated extension in your editor. The VSCode extension is the most complete since it supports all features (completion and chat), but I tested it in PyCharm and code completion works fine (I missed the chat though). You can test it in their playground before creating an account or installing anything. That's a good thing in my opinion. It's free for individual use. It's also the tool I kept using beyond my small AI assistant tests (because it's good, the chat is quite handy and it's free).

Its suggestions aren't always the most relevant and it tends to fail on parentheses/curly braces completions. When generating test cases, I had to guide it to get something good. It invented a method named toBeUpperCase and, in a forEach loop, instead of using the item, it used the array and the proper index. Weird (but working). Inventions struck me more with this tool than with the others (but it's also the one I used most, so I may be biased). When it offers several proposals, they are good and genuinely different from each other. By default, in TypeScript, it creates fat arrow functions on multiple lines even when the result could be a one-liner. In Python, it tends to propose pass as the body of the function by default (you can then make it write the actual body).

It's also the only assistant I tested that offers more features than mere code completion:

It comes with a chat that is a must-use. Instead of searching Stack Overflow, most of the time I could just ask it how to do something (like a form in Angular) and it would respond. Depending on the answer, I either got a code snippet I could use, got a detailed explanation, or had to ask again (and sometimes go to Stack Overflow).
It proposes tools to refactor code: you select a piece of code, run a default refactoring action or explain what refactoring you want. Then it proposes a code snippet you can apply to take the refactoring into account. I only used it on simple tasks, but it worked well. This sounds really promising since refactoring can be really boring.

My opinion

With the notable exception of CodeWhisperer, I enjoyed working with these tools. I mostly wrote the name of the function I wanted and let the tool complete it. Test cases aside, I don't think I had to explain in natural language what I wanted to get a good result (please note that I used very explicit function names). I tried disabling the assistant to see if I would miss it. And I did. Mostly for writing boring code in my place!

There are a few things to keep in mind while using these tools:

They are not always up to date.
They can propose really stupid things or "hallucinate". What's more surprising (even if you know a bit about how they work) is that they propose both really relevant suggestions and really stupid ones in the same session. It also helps put into perspective our potential replacement as developers (and you still have to explain to the tool what to do!). So for now, developers are safe.
It's a bit frustrating that they get parentheses/curly braces wrong so often.
Depending on the context you use them in (existing file, opened files), results can vary. Try to open relevant files to help them make more relevant completions.
It can be tempting not to refactor the supplied code because it arrived fast and was "good enough". Since the tool allowed me to be more productive, sometimes I just wanted to be a bit too "productive", move on to other parts of the project and leave the code as is.
Beware not to get stuck on the proposed solution for too long: if it doesn't work and you fail to get it working, starting from scratch by yourself may save you time in the end (or chatting with the bot, or searching Stack Overflow).
Solutions (both autocomplete suggestions and answers from the chat) come without context. So you can't easily know (besides relying on your own experience) whether they are good. At least on Stack Overflow you have ratings, comments and other answers to help you (I always read comments and other answers when I'm not sure about the top one, I hope you do too!).

How about learning?

You can use these assistants to help you write code in a language you don't really know. I tried that with Elm. It was more pleasant than doing it only with the documentation and web searches: you can rely on completions to get to your goal more rapidly and with less frustration. Especially with a language as different from the ones I know as Elm.

Did I really learn or practice Elm? No! That's not really a problem in this context, because I didn't really want to learn it. I think it would be a problem if I really wanted to learn: how could I get into the right state of mind and learn by practising if a tool gives me the answers? I could understand the code and edit it a bit thanks to my experience, but overall, I don't really know if the code is actually good.

More generally, I don't think you can truly learn anything with a tool like this: you need to think, fail and write code on your own to learn. I don't think there is an alternative to this, and these tools will make it harder to do it correctly. For instance, I tried to do some type challenges in TS. With the autocomplete on, I just got the answers. Answers I could not have found on my own without trying a bunch of things (and some I just couldn't have found without the solutions since they relied on things I didn't know).

And you also need to know whether the code is good, secure and maintainable.

The chat can be useful to get relevant information right from your editor though. But I think you'll still need to write code by yourself, have experienced developers review your code, and read blogs/courses/books to move forward in your programming journey. So I don't advise using completions when trying to learn: you could feel more productive at the start, but I think it will come back to bite you in the long run.

How about using it on the job?

With all that said, these assistants sound like the perfect tool to help you in your job. There are some points to consider before that though:

Technically, you must pay for them to use them in a professional context. They are not expensive individually, but depending on your company size, the total cost can become significant.
There are many legal considerations to take into account:
- You could violate copyrights: the assistants learned from OSS projects and may generate license-protected code that will end up in your project. It's mostly dangerous for GPL-protected code. While technically you could already do this before, you had to do it consciously: search for the code and use it. Now, it's at the fingertips of millions of developers who will have no idea they are infringing any copyright.
- Who owns the output of an AI tool is not clear.
- This requires sending code to a remote server you don't control. This could result in code/data/secret leakage. You need to make sure your provider won't learn from your code. You need to trust it and read its data policy. More precisely, in a professional context this is too important to accept without understanding it, and you need to get legal involved.

So don't do this on your own: involve your co-workers and management, even if you only use the tool from time to time at your job with a personal account. To avoid having problems over this, you must get management support before using the tool. Otherwise, this too can come back to bite you!

Wrapping up

AI assistants can be very powerful tools to help you refactor code, get answers faster and write code (either tricky or boring).
They are probably not the best tools to learn and practice programming. I think this applies to all other activities.
You still need to rely on your experience to maintain code quality and security (and to make the whole thing work).
Make sure to have management support if you want to use them at work: they bring a lot of non-technical issues you cannot solve on your own.

Other takes on this:

Exploring Generative AI.