08 May 2008

A couple of folks have taken me to task over some of the things I said... or didn't say... in my last blog piece. So, in no particular order, let's discuss.

A few commented on how I left out commentary on language X, Y or Z. That wasn't an accidental slip or a surge of forgetfulness; I simply didn't want to rattle off a laundry list of every language I've run across or am exploring, since that list would be much, much longer and arguably of little to no additional benefit. Having said that, though, a more comprehensive list (with a more comprehensive explanation of the thought process) is probably deserved, so expect to see that from me before long, maybe in the next week or two.

Steve Vinoski wrote:

In a recent post, Ted Neward gives a brief description of a variety of programming languages. It’s a useful post; I’ve known Ted for awhile now, and he’s quite knowledgeable about such things. Still, I have to comment on what he says about Erlang....  I might have said it like this:

Erlang. Joe Armstrong’s baby was built to solve a specific set of problems at Ericsson, and from it we can learn a phenomenal amount about building highly reliable systems that can also support massive concurrency. The fact that it runs on its own interpreter, good; otherwise, the reliability wouldn’t be there and it would be just another curious but useless concurrency-oriented language experiment.

Far too many blog posts and articles that touch on Erlang completely miss the point that reliability is an extremely important aspect of the language.

To achieve reliability, you have to accept the fact that failure will occur. Once you accept that, then other things fall into place: you need to be able to restart things quickly, and to do that, processes need to be cheap. If something fails, you don’t want it taking everything else with it, so you need to at least minimize, if not eliminate, sharing, which leads you to message passing. You also need monitoring capabilities that can detect failed processes and restart them (BTW in the same posting Ted seems to claim that Erlang has no monitoring capabilities, which baffles me).

Massive concurrency capabilities become far easier with an architecture that provides lightweight processes that share nothing, but that doesn’t mean that once you design it, the rest is just a simple matter of programming. Rather, actually implementing all this in a way that delivers what’s needed and performs more than adequately for production-quality systems is an incredibly enormous challenge, one that the Erlang development team has quite admirably met, and that’s an understatement if there ever was one.

They come for the concurrency but they stay for the reliability. Do any other “Erlang-like” languages have real, live, production systems in the field that have been running non-stop for years? (That’s not a rhetorical question; if you know of any such languages, please let me know.) Next time you see yet another posting about Erlang and concurrency, especially those of the form “Erlang-like concurrency in language X!” just ask the author: where’s the reliability?

As he says, Steve and I have known each other for a while now, so I'm fairly comfortable in saying, Mr. Vinoski, you conflate two ideas in your assessment of Erlang, and teasing those two things apart reveals a great deal about Erlang, reliability, and the greater world at large.

Erlang's reliability model--that is, the spawn-a-thousand-processes model--is not unique to Erlang. In fact, it's been the model for Unix programs and servers, most notably the Apache web server, for decades. When building a robust system under Unix, a master-slave model, in which a master process spawns (and monitors) n child processes to do the actual work, offers that same kind of reliability and robustness. If one of these processes fails (due to corrupted memory access, operating system fault, or what-have-you), the process can simply die and be replaced by a new child process. Under the Windows model, which stresses threads rather than processes, a corrupted memory access that tears down the process brings down the entire system; this is partly why .NET chose to create the AppDomain model, which looks and feels remarkably like the lightweight process model. (It still can't stop a random rogue pointer access from tearing down the entire process, but if we assume that future servers will be written all in managed code, it offers the same kind of reliability that the process model does so long as your kernel drivers don't crash.)
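
To make the master-and-children idea concrete, here is a minimal sketch in Python (chosen purely for illustration; this is not how Apache or Erlang implement it). The worker script, the `supervise` function, and the crash-twice-then-succeed behavior are all hypothetical stand-ins for a real worker that dies for a real reason.

```python
import subprocess
import sys

# Hypothetical worker: simulates a crash (exit code 1) on its first two
# attempts, then finishes normally. A real worker would do actual work.
WORKER_SOURCE = """
import sys
attempt = int(sys.argv[1])
if attempt < 2:
    sys.exit(1)
print("work done on attempt", attempt)
"""


def supervise(max_restarts=5):
    """Run the worker as a child process, replacing it whenever it dies.

    Returns the number of restarts that were needed before it succeeded.
    """
    for attempt in range(max_restarts + 1):
        child = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE, str(attempt)]
        )
        if child.returncode == 0:
            return attempt
        # The failure stayed inside the child process; the master is
        # untouched and simply starts a replacement on the next iteration.
    raise RuntimeError("worker kept failing; giving up")


if __name__ == "__main__":
    print("restarts needed:", supervise())
```

The point of the sketch is only that the master never shares memory with the worker, so a dead worker costs one restart and nothing more.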

There is no reason a VM (JVM, CLR, Parrot, etc.) could not do this. In fact, here's the kicker: it would be easier for a VM environment to do this, because VMs, by their nature, seek to abstract away the details of the underlying platform that muddy up the picture. It would be relatively simple to take an Actors-based Java application, such as that currently being built in Scala, and move it away from a threads-based model and over to a process-based model (with the JVM construction/teardown being handled entirely by underlying infrastructure) with little to no impact on the programming model.

As to Steve's comment that the Erlang interpreter isn't monitorable, I never said that--I said that Erlang was not monitorable using current IT operations monitoring tools. The JVM and CLR both have gone to great lengths to build infrastructure hooks that make it easy to keep an eye not only on what's going on at the process level ("Is it up? Is it down?") but also what's going on inside the system ("How many requests have we processed in the last hour? How many of those were successful? How many database connections have been created?" and so on). Nothing says that Erlang--or any other system--can't do that, but it requires that the Erlang developer build that infrastructure himself or herself, which usually means either it's not going to get done, making life harder for the IT support staff, or it gets done to a minimalist level, making life harder for the IT support staff.
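
For a sense of what "build that infrastructure yourself" means, here is a minimal Python sketch of hand-rolled counters that could answer the questions above. The `RequestStats` class and its method names are hypothetical, invented for this illustration; they are not part of any monitoring product or of Erlang.

```python
import threading


class RequestStats:
    """Hand-rolled, thread-safe counters of the kind a developer would
    have to write when the runtime offers no monitoring hooks."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {"processed": 0, "succeeded": 0, "db_connections": 0}

    def record_request(self, succeeded):
        # Count every request, and separately count the successful ones.
        with self._lock:
            self._counts["processed"] += 1
            if succeeded:
                self._counts["succeeded"] += 1

    def record_db_connection(self):
        with self._lock:
            self._counts["db_connections"] += 1

    def snapshot(self):
        # Return a copy so callers cannot mutate the live counters.
        with self._lock:
            return dict(self._counts)


if __name__ == "__main__":
    stats = RequestStats()
    stats.record_request(succeeded=True)
    stats.record_request(succeeded=False)
    stats.record_db_connection()
    print(stats.snapshot())
```

Note what is still missing: nothing here publishes the numbers to an operations console, keeps an hourly window, or raises an alert. Each of those is more code somebody has to write and maintain.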

So given that an execution engine could easily adopt the model that gives Erlang its reliability, and that using Erlang means a lot more work to get the monitorability and manageability (which is a necessary side-effect requirement of accepting that failure happens), hopefully my reasons for saying that Erlang (or Ruby, or any other natively implemented language) is a non-starter for me become more clear.

Meanwhile, Patrick Logan offers up some sharp words about my preference for VMs:

What is this obsession with some virtual machine being the one, true byte code? The Java Virtual Machine, the CLR, Parrot, whatever. Give it up.

I agree with Steve Vinoski...

The fact that it runs on its own interpreter, good; otherwise, the reliability wouldn’t be there.

We need to get over our thinking about "One VM to bring them all and in the darkness bind them". Instead we should be focused on improving interprocess communication among various languages. This can be done with HTTP and XMPP. And we should expecially be focused on reliability, deployment, starting and stopping locally or remotely, etc. XMPP's "presence" provides Erlang-process-like linking of a sort as well.

With Erlang's JInterface for Java then a Java process can look like an Erlang process (distributed or remote). Two or more Java processes can use JInterface to communicate and "link" reliably and Erlang virtual machines and libraries, save this one single .jar, do not have to be anywhere in sight.

To obsess about a single VM is to remain stuck at about 1980 and UCSD Pascal's p-code. It just should not matter today, and certainly not tomorrow. The forest is now much more important than any given tree.

Pay attention to the new JVM from IBM in support of their lightweight, fast-start, single-purpose process philosophy embodied in Project Zero. It's not intended to be a big honkin' run everything forever virtual machine. It will support JVM languages and the more the merrier in the sense that such a JVM will enable lightweight pieces to be stiched together dynamically. However the intention is to perform some interprocess communication and then get out of the way. Exactly the right approach for any virtual machine.

Jini clearly is *the* most important thing about Java, ever. But it's lost. Gone. Buh-bye. Pity.

"We need to get over our thinking about "One VM to bring them all and in the darkness bind them". " Huh? How did we go from "I like virtual machine/execution environments because of the support they give my code for free" to "One VM to bring them all and in the darkness bind them"? I truly fail to see the logical connection there. My love for both the JVM and the CLR has hopefully made itself clear, but maybe Patrick's only subscribed to the Java/J2EE category bits of my RSS feed. Fact is, I'm coming to like any virtual machine/execution environment that offers a layer of abstraction over the details of the underlying platform itself, because developers do not want to deal with those details. They want to be able to get at them when it becomes necessary, granted, but the actual details should remain hidden (as best they can, anyway) until that time.

"Instead we should be focused on improving interprocess communication among various languages. This can be done with HTTP and XMPP."  I'm sorry, but I'm getting very very tired of this "HTTP is the best way to communicate" meme that surrounds the Internet. Yes, HTTP was successful. Nobody is arguing with this. So is FTP. So is SMTP and POP3. So, for that matter, is XMPP. Each serves a useful purpose, solving a particular problem. Let's not try to force everything down a single pipe, shall we? I would hate to be so focused on the tree of HTTP that we lose sight of the forest of communication protocols.

"And we should expecially [sic] be focused on reliability, deployment, starting and stopping locally or remotely, etc. XMPP's "presence" provides Erlang-process-like linking of a sort as well." Yes! XMPP's "presence" aspect is a powerful one, and heavily underutilized. "Presence", however, is really just a specific form of "discovery", and quite frankly our enterprise systems need to explore more "discovery"-based approaches, particularly for resource acquisition and monitoring. I've talked about this for years.

"To obsess about a single VM is to remain stuck at about 1980 and UCSD Pascal's p-code." Great one-liner... with no supporting logic, granted, but I'm sure it drew a cheer from the faithful.

"It just should not matter today, and certainly not tomorrow." For what reason? Based on what concepts? Look, as much as we want to try and abstract ourselves away from everything, at some point rubber must meet road, and the semantic details of the platform you're using--virtual or otherwise--make a huge difference about how you build systems. For example, Erlang's many-child-processes model works well on Unix, but not as well on Windows, owing to the heavier startup costs of creating a process under Windows. For applications that will involve spinning up thousands of processes, Windows is probably not a good platform to use.

Disclaimer: This "it's heavier to spin up processes on Windows than Unix" belief is one I've not verified personally; I'm trusting what I've heard from other sources I know and trust. Under later Windows releases, this may have changed, but my understanding is that it is still much, much faster to spin up a thread on Windows than a separate process, and that it is only marginally faster to spin up a thread on Unix than a process, because many Unixes use the process model to "fake" threads, the so-called lightweight process (LWP) model.
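
Anyone who wants to check the belief on their own machine could start with something like the following Python sketch. The function names and the counts are arbitrary choices for illustration. One loud caveat: each child here is a whole Python interpreter, so the process figure overstates raw process-creation cost; it is more useful to compare the thread-to-process ratio across operating systems than to read the absolute numbers.

```python
import subprocess
import sys
import threading
import time


def time_threads(count):
    """Seconds taken to create, start, and join `count` do-nothing threads."""
    start = time.perf_counter()
    threads = [threading.Thread(target=lambda: None) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


def time_processes(count):
    """Seconds taken to start and wait for `count` do-nothing child processes.

    Each child is a fresh Python interpreter, so interpreter startup is
    included in the measurement.
    """
    start = time.perf_counter()
    children = [
        subprocess.Popen([sys.executable, "-c", "pass"]) for _ in range(count)
    ]
    for child in children:
        child.wait()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 20
    print(f"{n} threads:   {time_threads(n):.4f}s")
    print(f"{n} processes: {time_processes(n):.4f}s")
```

Run it on both a Windows box and a Unix box and compare; a single run on a single machine proves very little.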

"The forest is now much more important than any given tree." Yes! And that means you have to keep an eye on the forest as a whole, which underscores the need for monitoring and managing capabilities in your programs. Do you want to build this by hand?

"Pay attention to the new JVM from IBM in support of their lightweight, fast-start, single-purpose process philosophy embodied in Project Zero. It's not intended to be a big honkin' run everything forever virtual machine. It will support JVM languages and the more the merrier in the sense that such a JVM will enable lightweight pieces to be stiched together dynamically. However the intention is to perform some interprocess communication and then get out of the way. Exactly the right approach for any virtual machine." Yes! You make my point for me--the point of the virtual machine/execution environment is to reduce the noise a developer must face, and if IBM's new VM gains us additional reliability by silently moving work and data between processes, great! But the only way you take advantage of this is by writing to the JVM. (Or CLR, or Parrot, or whatever.) If you don't, and instead choose to write to something that doesn't abstract away from the OS, you have to write all of this supporting infrastructure code yourself. That sounds like fun, right? Not to mention highly business-ROI-focused?

"Jini clearly is *the* most important thing about Java, ever. But it's lost. Gone. Buh-bye. Pity." Jini was cool. I liked Jini. Jini got nowhere because Sun all but abandoned it in its zeal to push the client-server EJB model of life. sigh I wish they had sought to incorporate more of the discovery elements of Jini into the J2EE stack (see the previous paragraph). But they didn't, and as a result, Jini is all but dead.

Disclaimer: I know, I know, Jini isn't really dead. The bits are still there, you can still download them and run them, and there is a rabidly zealous community of supporters out there, but as a tool in widespread use and a good bet for an IT department, it's a non-starter. Oh, and if you're one of those rabidly zealous supporters, don't bother emailing me to tell me how wrong I am; I won't respond. Don't forget that FoxPro and OS/2 still have rabidly zealous communities of supporters out there, too.

Frankly, a comment on Patrick's blog entry really captures my point precisely, so (hopefully with permission) I will repeat it here:

The only argument you made that I can find against sharing VMs is that people should be focusing on other things. But the main reason for sharing VMs is to allow people to focus on other things, instead of focusing on creating yet another VM.

You write as if you think creating an entirely new VM from scratch would be easier than targeting a common VM. Is that really what you think?

Couldn't have said it better... though that never stops me from trying. ;-)



Last modified 08 May 2008