Concurrency is not a language thing anymore
This article makes an analogy between concurrency and memory management. The claim is that since modern engineers almost always program to clusters of computers, what we need are tools targeted at building distributed systems. This is taken to mean that we need language support for distributed system development. The idea is that this will lead to the dominance of languages like Go and Erlang.
Go and Erlang may end up being popular, but I doubt that it will be for that reason. Distributed system development is not going to become a part of every application developer's daily life because it is a massive pain in the ass. To the extent that it is happening today, I think it is because good distributed computing frameworks are lacking and so applications are forced to reinvent some distributed systems primitives. But this gap will not persist. Rather, a handful of frameworks will provide different programming models that handle distribution and concurrency across an elastic pool of machines.
Those who have worked with MapReduce have already dealt with this. MapReduce programming is almost always single-threaded. Concurrency is managed by the framework. The last thing that would help you write good MapReduce programs is more user-level concurrency primitives. And yet MapReduce is highly parallel.
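To make the division of labor concrete, here is a minimal sketch of that split in Python. The user-supplied map and reduce functions are plain single-threaded code; all the parallelism lives in the (hypothetical) framework function `run_mapreduce`, which here stands in for what Hadoop does across a cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# User code: plain single-threaded functions. No threads, no locks.
def word_count_map(line):
    return [(word, 1) for word in line.split()]

def word_count_reduce(word, counts):
    return (word, sum(counts))

# Framework code (hypothetical stand-in for a real cluster framework):
# all concurrency is managed here, invisible to the user functions.
def run_mapreduce(lines, map_fn, reduce_fn, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: input splits are processed in parallel.
        mapped = pool.map(map_fn, lines)
        # Shuffle phase: group intermediate pairs by key.
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)
        # Reduce phase: each key is reduced in parallel.
        results = pool.map(lambda kv: reduce_fn(kv[0], kv[1]), groups.items())
        return dict(results)

counts = run_mapreduce(["a b a", "b c"], word_count_map, word_count_reduce)
# counts == {"a": 2, "b": 2, "c": 1}
```

The point is that swapping the thread pool for a thousand machines changes nothing in the user's code, which is why more user-level concurrency primitives would not help here.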
This is not a new thing. Even Java servlets, for all their faults, largely served to abstract away the threading model, at least for applications that only interacted with a single database using blocking I/O.
I see three basic domains for processing: online, near-real-time, and offline.
In the online world people build request/response services. Parallelism is found by treating each request as a unit of work and dividing requests over threads and machines. I have seen a number of variants on this model, from “service query languages” to DSLs for stitching together REST calls. What all these share is that they abstract away the concurrency of the processing without needing direct access to single-server concurrency mechanisms like threads.
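As a sketch of that model (with hypothetical names, not any particular server's API): the handler below is the unit of work and is written as ordinary single-threaded code, while a framework-level `serve` function fans requests out over a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# User code: a single-threaded request handler. One request in, one response out.
def handle_request(request):
    return {"status": 200, "body": request["path"].upper()}

# Framework code (hypothetical): divides requests over a pool of threads,
# so each handler invocation sees only its own request.
def serve(requests, handler, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handler, requests))

responses = serve([{"path": "/a"}, {"path": "/b"}], handle_request)
# responses == [{"status": 200, "body": "/A"}, {"status": 200, "body": "/B"}]
```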
In the near-real-time processing domain stream processing frameworks do a good job of providing asynchronous processing without directly thinking about concurrency at all. Again you interact with concurrency and parallelism only at the framework level, the code you write appears completely single-threaded.
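A toy sketch of that framework-level contract, assuming a partitioned stream in the style of the real stream processors (the names here are made up): the user writes a per-event callback against local state, and the framework guarantees that all events for a given key hit the same partition, so the callback never races with itself.

```python
import queue
import threading

# User code: a single-threaded per-event callback. No concurrency visible.
def count_clicks(event, state):
    state[event["user"]] = state.get(event["user"], 0) + 1

# Framework code (hypothetical): one worker thread per partition. Events for
# the same key always route to the same partition, so user state is never
# touched by two threads at once.
def run_stream(events, callback, partitions=2):
    queues = [queue.Queue() for _ in range(partitions)]
    states = [{} for _ in range(partitions)]

    def worker(q, state):
        while True:
            event = q.get()
            if event is None:        # sentinel: this partition is drained
                return
            callback(event, state)

    threads = [threading.Thread(target=worker, args=(q, s))
               for q, s in zip(queues, states)]
    for t in threads:
        t.start()
    for event in events:
        queues[hash(event["user"]) % partitions].put(event)
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    # Merge per-partition state for inspection; keys never span partitions.
    merged = {}
    for s in states:
        merged.update(s)
    return merged

clicks = run_stream([{"user": "alice"}, {"user": "bob"}, {"user": "alice"}],
                    count_clicks)
# clicks == {"alice": 2, "bob": 1}
```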
The offline world seems to be moving in the direction of a handful of YARN frameworks for different specialized purposes. What almost all of these share is that they don’t require users to directly manage concurrency.
I think these high-level domains (online, near-real-time, and offline) will prove pretty durable. I doubt we need more than a dozen basic distributed computing abstractions to cover them.
This leads me to think that putting more time into language support for single-server concurrency (software transactional memory and all that) is of limited utility. It will only help the implementors of these frameworks, not the end user. Framework implementors will have a much different set of needs than application programmers. They care a lot more about performance and fine-grained control. Although this is apparently a controversial statement, I'm not sure that just threads and locks aren't a workable model for framework developers to work with. After all, they work pretty well for a disciplined team and give excellent performance and control.
Notes
boredandroid posted this