September 29, 2023
Earlier this week I was reviewing some Python code with a junior colleague, and thought: "Python can be pretty nice." This piece of code was in "production." I mean, while it was up and running on our dev environment, other people were relying on it. It was a single file, with a utility class and a main function. Less than 90 LoC.
Today, I was harshly reminded why I dislike Python: its deployment story is somewhere between poor and dangerous. Somehow, the dependencies of that piece of code got uninstalled from the Spark cluster it was running on.
I had no say about that.
– You should've been using Scala for that particular job – A colleague said. – Just make a fat jar and all your problems will be solved. –
And I knew he was right. My earliest attempt at that code was written in Scala.
But it turns out the Scala request library didn't really work out of the box in our Spark cluster. My guess is it had something to do with the allowed encryption algorithms on the cluster. Rather than figure out the problem, and knowing that Python's requests was working all right, I rewrote it in Python. Took me about 10 minutes. Done. With an uneasy feeling, because we were installing Python libraries through a notebook with %sh.
The solution involves running said notebook prior to every run of my script.
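In hindsight, a guard at the top of the script would at least have turned a silent uninstall into a loud, early failure. What follows is a minimal sketch, not what we actually ran: the REQUIRED_MODULES list and both function names are hypothetical, and requests is the only dependency I can name with certainty.

```python
import importlib.util

# Hypothetical list of top-level modules the script needs.
# Only `requests` is known for sure; extend as appropriate.
REQUIRED_MODULES = ["requests"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ensure_dependencies(names=REQUIRED_MODULES):
    """Raise early, with a readable message, if any required module is missing."""
    missing = missing_modules(names)
    if missing:
        raise RuntimeError("Missing Python dependencies: " + ", ".join(missing))
```

Calling ensure_dependencies() as the first statement of the main function does not fix the deployment story, but it does replace a confusing mid-run import error with a single message listing what went missing.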
Now comes the big question. Is this a Python issue or a "wrong-tool-for-the-job" issue?
It's clearly the latter. I shouldn't be running a tiny script, which just:
on a full-blown Spark cluster. But I had no choice. I asked for serverless functions, but those would not be available this year.
Unfortunately, I hold the opinion that this is also a Python issue, because Python encourages an "it's easy" mindset, which leads to using the wrong tools for the job:
And a flawed logic gets built on top of that mantra inside both managers' and salespeople's heads, particularly the dangerous ones: those with enough knowledge to understand the upsides, and not enough to understand the downsides.
Then the day comes when the stack becomes a limitation to providing value. And it's hard for an individual technician to make the case for something different, because it's alien to the "it's easy" mantra's stack. Repeated ad nauseam.
Yes, Python: "it's easy." And that is good. But it's also limited and limiting. Tech should be approachable to foster diversity. And Python has done a great deal of good in that regard. But understanding limitations of each approach is still required. And that is definitely not easy.