PyParis 2018

At Sqreen we try to attend any conference related to our tech stack: to learn, to interact, and sometimes to present. This year at PyParis 2018 we did all three. It was a well-organized event with two main tracks: one for data science and one for regular Python development (called “web/core”). Most people, regardless of their main track, could find something relevant and interesting.

The conference kicked off with a keynote by Nina from Microsoft: a Python developer evangelist for Azure, who had recently flown in from Portland and whose internal clock was still tuned to the middle of the night. But that didn’t stop her from delivering an energetic and fun presentation about technical debt!

After the keynote, the conference split into the tracks.

Sqreen’s own CTO – Jean-Baptiste Aviat – did a presentation at the conference, called Scaling from 0 to 60k RPM (requests-per-minute) (slides, video).

It was a fast-paced retrospective on how Sqreen has dealt with scaling issues from way back when our APIs had 0 RPM, up to now – when we have about 60 000 RPM. It was an accessible presentation even if you don’t have a lot of experience in Python, or scaling for that matter.

The conference’s data science track was pretty good. Our senior data scientist Bartosz comments:

In general terms, I found the conference interesting. In particular, bringing the data and web communities together was a very good idea, but unfortunately the two tracks were 5 floors apart, so switching between them was not easy. I mainly attended the data track: there were many interesting talks, but I wish some of them had explored the technical aspects in more depth.

Overall, the data track was well balanced, so that the talks were accessible to people without a data science background, while still remaining interesting to seasoned data scientists.

Our highlights from a data science perspective were:

  • Array computing in Python by Wolf Vollprecht (slides, video)
    The speaker from QuantStack did a very good job of giving a chronological account of the “vectorised” array computation capabilities in Python (for those who don’t know, arrays let you run a computation on all items of an array at once, instead of looping over them one by one).
    We also learned about a few interesting libraries and proposals to look into (xnd, Pythran, and generic code in NumPy via NEP 18).
  • Interactive widgets in the Jupyter Notebook by Martin Renou (slides, video).
    Another QuantStack employee reviewed the interactive widgets that can be embedded into Jupyter notebooks. There were some impressive examples, like a cartographic map with embedded graphs.
  • Deep learning of hotel images by Christopher Lennan & Tanuj Jain (slides, video)
    Not surprisingly there were quite a few talks about deep learning. We especially enjoyed this one. The speakers (Idealo’s data science team) presented an automatic quality check of hotel images that was used in production at Idealo.
  • Understanding and diagnosing your machine-learning models by Gael Varoquaux (materials)
    This was a gem of a session. It wasn’t part of the main data or core tracks, but rather the smaller hands-on tutorial track. It covered advanced methods for debugging machine-learning models, and was presented by one of the scikit-learn creators.

The atmosphere beyond the data track was great too, and in between the presentations we met a lot of smart people to chat with. Our backend team lead Benoît writes:

I mostly attended the Python track. The atmosphere was very nice. It was packed, because the conference took place in a school. There were plenty of nice people to chat with during the breaks and lunch. Many people seemed to be working on security, or were very interested in it.

Benoît’s highlights for the conference from both tracks:

  • Crossing the native code frontier by Serge sans Paille (slides, video)
    An in-depth look at the internals of CPython, exploring how we can make computations much faster.
  • Vim Your Python, Python Your Vim by Miroslav Šedivý (slides, video)
    Awesome talk about getting the most power out of your keyboard and vim setup. The subject could seem like a solved issue. It’s not, and the delivery of the talk was top-notch!
  • Deep learning of hotel images by Christopher Lennan & Tanuj Jain (slides, video)
    It is awesome that nowadays computers can tell you “this is beautiful”. My philosophy teacher in high school would probably be aghast!
  • Machine Learning with Scikit-Learn: quick clusterization of a very large malware dataset by Robert Erra (slides, video)
    Applying large-graph techniques to find similarities between malware samples and help classify/analyze them quickly. Security + Data Science <3

Sélim – our senior backend developer – enjoyed the overall atmosphere and topics as well:

In general the ambiance was great, and there were a lot of people to talk to. Some of the topics were very interesting, and left you wishing for an even more in-depth discussion.

Sélim’s recommendations from the web/core track are:

  • Crossing the native code frontier by Serge sans Paille (slides, video)
    Really good details about Python internals. You can tell that the speaker knows his subject.
  • Scaling from 0 to 60k RPM by Sqreen’s CTO Jean-Baptiste Aviat (slides, video)
    Obviously this was one of our favorite presentations – though I admit we are all a bit biased.
  • Invitation to a New Kind of Database by Sheer El Showk (slides)
    The main idea of the talk was to introduce Datomic, a proprietary database, and to ask whether people in the room would be interested in reimplementing it as open source. It was an interesting topic, and made us want to look more into how Datomic works.

And finally – my (Janis) own favorites were:

  • (Already mentioned by Bartosz) Array computing in Python by Wolf Vollprecht (slides, video)
    This was a very good talk for someone without a data processing background. It put forth some history, and some concepts at an accessible introductory level. It was also good for learning some keywords to base future research on, if the concepts seem interesting.
  • Serverless Python by Michael Bright (@mjbright) (slides, video)
    This was a well-presented overview of the available options for creating serverless applications using Python (though Swift got a mention too <3). The presentation wasn’t super deep, but it’s a decent high-level look at what’s happening in the serverless world.
  • (Already mentioned by Benoît) Crossing the native code frontier by Serge sans Paille (slides, video)
    The presenter was really good, and the topic was both well presented, and well suited for those interested in native modules for Python.
  • GraphQL in Python and Django by Patrick Arminio (slides, video)
    A decent talk for a high-level intro to GraphQL.
  • And of course Scaling from 0 to 60k RPM by Jean-Baptiste Aviat.

In summary, the conference was well put together, and there were some really great people attending. Each track had something for everyone. Those wishing for more in-depth exploration of a topic received a lot of information to use as the basis for future research.

If you want to check out the entire list of talks, you can do so here: http://pyparis.org/talks.html (a big thanks to the organizers for making the slides and videos available online so quickly).