Interview: Claudiu Popa – Using Pylint for Python Static Analysis
We are excited to be launching a new series of interviews. The idea is to interview developers and security specialists who make the ecosystem a better place. Today, I’m sitting down with Claudiu Popa, the maintainer of Pylint, to talk about static analysis in Python and the future of Pylint.
Can you introduce yourself?
My name is Claudiu Popa, and I’m the maintainer of Pylint. I’ve been programming in Python for more than eight years and working on Pylint for half of that.
So what is Pylint?
Pylint is a static analysis (SAST) tool for Python. It was created by Sylvain Thénault. It’s used by thousands of developers around the world, and companies like Google use it extensively. It helps find everything from basic linting issues to more advanced errors in Python code.
How did you start working on Pylint?
I started using Pylint over four years ago in a previous job. That’s when I found a few bugs and sent patches for them. As I was sending more and more contributions, I gained commit rights a couple of months later.
And another six months later, I joined Sylvain Thénault and Torsten Marek as a maintainer.
One of my best memories is the time we all spent at EuroPython that year working on Pylint, in a sprint we had planned beforehand. Our schedule was so hectic that we didn’t get to watch many talks.
How do you use Pylint? How do you recommend using it?
It really depends on the project. If someone inherits a big legacy Python project, I would disable everything and enable it in chunks. After fixing all the issues in the first batch, I would enable more and more rules over time. But it also depends on the business priorities and the time that can be allocated to that.
On a new project, I would, of course, enable every check. I personally use it from the command line and integrated into my CI. It’s ok to disable one specific rule that you don’t agree with, but try to avoid ignoring issues just to merge your Pull Request faster.
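As a rough sketch of the incremental approach described above, the snippet below builds Pylint command lines that turn every check off and then enable one batch at a time. The `--disable=all` and `--enable=` options and the message names are Pylint's own. The batch grouping, the `BATCHES` list, the `pylint_args` helper and the `my_package` target are illustrative assumptions, not a recommended ordering.

```python
# Hypothetical rollout plan: a batch is enabled only once the previous one is clean.
BATCHES = [
    ["undefined-variable", "unused-import"],
    ["unused-variable", "unused-argument"],
    ["line-too-long", "invalid-name"],
]


def pylint_args(target, batches_done):
    """Return the command line for running Pylint with the first N batches enabled."""
    enabled = [msg for batch in BATCHES[:batches_done] for msg in batch]
    args = ["pylint", "--disable=all"]
    if enabled:
        args.append("--enable=" + ",".join(enabled))
    args.append(target)
    return args


if __name__ == "__main__":
    # Print the command to run at each stage of the rollout.
    for done in range(1, len(BATCHES) + 1):
        print(" ".join(pylint_args("my_package", done)))
```

The same idea can be kept in a configuration file instead of a script; the point is only that the set of enabled checks grows as earlier batches are fixed.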
Why is static analysis important?
Static analysis (SAST) is key to every programming language, but it’s especially important for dynamic languages. In static languages like C++ or Scala, you have a compiler that will catch a lot of errors. As a developer, you can make mistakes, and it’s ok to make mistakes. Not every project has high test coverage. So static analysis helps you find errors in your code before they turn into bugs.
It’s also important to enforce standards between developers to improve code maintainability. Python’s PEP 8 is something every Python developer should follow.
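To make the point about dynamic languages concrete, here is a small made-up example. The function name and values are invented for illustration. The typo sits on a branch that is rarely executed, so Python only complains when that line actually runs, while a static checker such as Pylint can report it (as `undefined-variable`) without running anything.

```python
def describe(order_total):
    """Return a label for an order; the bug hides on a rarely taken branch."""
    if order_total < 1000:
        return "standard"
    # Typo: `order_totl` is never defined. Python notices only when this line runs.
    return "large (%d)" % order_totl


if __name__ == "__main__":
    print(describe(10))  # the common path works fine
    try:
        describe(5000)  # the rare path fails with NameError
    except NameError as exc:
        print("only found at runtime:", exc)
```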
What is the biggest challenge with Pylint today?
Today Pylint has three main problems.
The first one is the verbosity of the tool. Some people can be really intimidated by the high number of issues that Pylint finds in a project. It can be difficult to prioritize all those issues.
The second is the false positive issue that every static analysis tool has. Some checks are clearly known to have too many false positives. But I still think those rules are relevant to the Python community. For us, there’s, unfortunately, no other way to check for those errors. We can’t understand the language like the interpreter does… Python is too dynamic.
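A hypothetical illustration of why dynamic code is hard to analyze: the attributes below are attached at runtime, so nothing in the class body declares them. The code is valid, but a static checker that cannot follow the `setattr` loop may report the attribute access as a missing member. Whether a given Pylint version actually does so depends on what its inference can follow, so treat this as a sketch of the problem rather than a reproduction of a specific warning.

```python
class Settings:
    """Attributes are attached at runtime, so they are invisible in the class body."""


# Hypothetical defaults, injected dynamically.
for name, value in {"debug": False, "retries": 3}.items():
    setattr(Settings, name, value)


def retries_left(used):
    # Valid at runtime; a tool that only reads the source cannot easily
    # prove that `Settings.retries` exists.
    return Settings.retries - used


if __name__ == "__main__":
    print(retries_left(1))
```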
The third problem, which is linked to the previous two, is a dearth of contributors to Pylint. Pylint has a long history of not being a very friendly project for contributors. Yes, patches are more than welcome, but we usually fail at the initial steps: we don’t have very good documentation, and it’s usually better to understand the project by reading the code. After Pylint moved to GitHub, under the Python Code Quality Authority group, things have changed for the better, but I still feel we have a lot more to change.
I would like to take this opportunity to thank three people who recently had a lot of impact on Pylint: Łukasz Rogalski, Cara Vinson (whose work on Astroid is helping Pylint a lot), and Ashley Whetter. Also, as always, thanks to Florian Bruhin, who helps us with replies on the mailing list and comments on the issue tracker. He helped with the move to GitHub as well. Even though he is not contributing code to Pylint, I consider him part of the Pylint team.
How are you going to solve those issues?
We are working on an exciting feature for Pylint 2.0 to tackle the verbosity problem. The idea is to put batches of checks together and let users enable them over time, after they have fixed the previous batch of issues. This should make Pylint more user-friendly. You fix a batch of issues, and then you move to the next one.
Another feature that we are working on is support for control flow analysis. We will infer the data flow, which should reduce some false positives. But it will also mean that Pylint could become slower.
At some point, we will also look into implementing PEP 484 (type hints), but it’s a long project, and we are looking for people who could help us with it.
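For readers who have not used it, PEP 484 standardizes type hints: optional annotations that tools can read without running the code. The function below is an invented example, not taken from Pylint, and shows what such annotations look like.

```python
from typing import List, Optional, get_type_hints


def longest(words: List[str]) -> Optional[str]:
    """Return the longest word, or None for an empty list."""
    return max(words, key=len) if words else None


if __name__ == "__main__":
    # The annotations are plain metadata that a checker can inspect.
    print(get_type_hints(longest))
    print(longest(["a", "abc", "ab"]))
```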
How do you see the future of Static Analysis in Python?
The great team at Dropbox is working on MyPy, a static type checker for Python. They are doing a really great job, even if it’s still in the early stages. Due to the lack of contributors on Pylint, it’s going to be difficult to compete with that project. But MyPy and Pylint are complementary, and both have a bright future in the Python community.
I’m also really excited about some new things coming out from Astroid. They will clearly help Pylint in the future.
With Python 3.7, which will be released in about a year, Pylint will also stop supporting Python 2!
Do you recommend any security tools for Python?
Pylint has two great rules that check for uses of the eval() and exec() functions. Both can lead to code injection. But Pylint is by design not a security tool.
It is, however, configurable by design and can be extended to check for advanced security issues.
Besides that, I would definitely recommend checking out Bandit, a static analysis tool for security made by OpenStack.
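As a small illustration of why calls to eval() deserve a warning (Pylint reports them as `eval-used`, and exec() as `exec-used`), the sketch below parses untrusted text with the standard library's `ast.literal_eval` instead. The `parse_literal` helper is a hypothetical name chosen for this example. It accepts plain literals and rejects anything that would need to execute code.

```python
import ast


def parse_literal(text):
    """Parse a Python literal from untrusted text without executing any code.

    Returns None when the text is not a plain literal. Note that the literal
    "None" also yields None; real code might use a sentinel to tell them apart.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


if __name__ == "__main__":
    print(parse_literal("[1, 2, 3]"))
    # With eval() this string would run code; literal_eval refuses it.
    print(parse_literal("__import__('os').getcwd()"))
```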
What keeps you busy besides Pylint?
I’m currently really excited to be working remotely for a healthcare startup based in the US.
Besides that, I’m interested in machine learning and mainly using C++ for that. I recently built a music recommendation system for all the music I bought over the years. I wasn’t satisfied with Spotify’s recommendation algorithm. I’m not using metadata, but audio waves to classify the music.
I also enjoy writing a lot. I’m currently working on my first magic-realism novel. The plot follows the usual recipe of magic-realism novels: reality that seems mixed up with a fantastic world, extremely long sentences, complicated timelines, etc. But I can’t share too many details without spoiling it for you.
Do you have a favorite quote?
“AM could not wander, AM could not wonder, AM could not belong”, by Harlan Ellison, from his short story “I Have No Mouth, and I Must Scream”.