Node.js Virtual Machine (vm) Usage (translation)


Part 1

Original article (in English)

For my project I want to be able to run code that changes at runtime. There may be better options than Node, but the simplest one I found is the core vm module. I experimented with it a little, and here is what I found.

To use the vm module, you need to 'require' it:

var util = require('util');
var vm = require('vm');

I am grabbing util as well here simply because I prefer its logging methods to the standard console.log: the timestamps help me keep straight what I ran when.

For a quick test, I wrote a hello world.

var util = require('util');
var vm = require('vm');
 
vm.runInThisContext('var hello = "world";');

This way, the code in the string argument is compiled by the V8 JavaScript engine and executed. Really cool, but unfortunately nothing is visible from the outside. Let's make it print something.

You might try something like this:

vm.runInThisContext('var hello = "world"; util.log("Hello " + hello);');

However, you will get an error (a ReferenceError) telling you that 'util' is not defined. The reason is fairly subtle. The runInThisContext method of vm does use the current context, but it does not have access to the local scope, and we defined util in the local scope by using the 'var' keyword.

If you drop the 'var' keyword from the util declaration (util = require('util');) and run the code again, you will get output similar to this:

17 Aug 23:41:32 - Hello world
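
The scoping rule described above can be checked directly. This is a minimal sketch that is not part of the original article: the names demoScope, localGreeting and globalGreeting are made up for illustration. It relies on typeof, which does not throw for undeclared identifiers.

```javascript
var vm = require('vm');

// Hypothetical helper: the function body gives us a guaranteed local scope.
function demoScope() {
  var localGreeting = 'declared with var, local to this function';
  global.globalGreeting = 'attached to the global object';

  return {
    // Code run by runInThisContext cannot see the enclosing local scope...
    local: vm.runInThisContext('typeof localGreeting'),
    // ...but it can see anything that lives on the global object.
    global: vm.runInThisContext('typeof globalGreeting')
  };
}

var seen = demoScope();
console.log(seen.local);  // undefined
console.log(seen.global); // string
```

The same reasoning explains the util case: a module-level var is local to the module, while an assignment without var lands on the global object.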

Anything defined as a global variable is available inside runInThisContext. That is a good thing if you want access to globals, and a bad thing if you would rather restrict what the script can reach. For example, with runInThisContext you can do something like this:

vm.runInThisContext('var hello = "world"; console.log("Hello " + hello);');

Assuming this is trusted code, that can be fine. But if it isn't trusted code, or if (as in my case) it is trusted but you want to explicitly encourage it to conform to a set API for interacting with things outside of it, you may wish to keep the dynamic script you are running from having access to the global context. Fortunately, vm has a method that does this, called runInNewContext. For example, the next line will not work, because runInNewContext creates a new, 'empty' context for the script to run in rather than using the existing one. The script then has access to nothing beyond what the V8 JavaScript engine itself provides; it cannot access global node functions.

Fails:

vm.runInNewContext('var hello = "world"; console.log("Hello " + hello);');

It will say that ‘console’ is undefined as it no longer has access to the global scope where console is contained.
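
The isolation can be probed with a short sketch like the one below. It is an illustration rather than code from the original article, and it uses the Node-specific process global as the probe (my choice, since which non-language globals exist in a fresh context can vary between Node versions). Built-ins defined by the language itself remain available.

```javascript
var vm = require('vm');

var errorName = null;
try {
  // No sandbox argument: the new context has no Node globals such as process.
  vm.runInNewContext('process.version;');
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // ReferenceError

// Objects defined by the language itself (Math, JSON, ...) are still there.
var builtinResult = vm.runInNewContext('Math.max(1, 2) + JSON.stringify([1]).length');
console.log(builtinResult); // 5
```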

So that is good: we have a way to limit the access the script has. But we need to be able to provide it with something in order for it to affect anything outside of itself and be useful. We do that by providing the context, or 'sandbox', for it to use via the optional second argument. Here's an example:

var util = require('util');
var vm = require('vm');
 
var myContext = {
   hello: "nobody"
}
 
vm.runInNewContext('hello = "world";', myContext);
 
util.log('Hello ' + myContext.hello);

The second argument takes an object, the properties of which are injected into the global context of the script. It is my understanding that this passing is actually done via some fairly fancy copy operations, so a relevant performance note is that the size of this context is probably a significant factor (I will need to do some testing myself to see). Similarly, you can of course pass in functions with the context. Those functions may make calls outside the sandbox object itself, such as this:

var myContext = {
}
myContext.doLog = function(text) {
	util.log(text);
}
 
vm.runInNewContext('doLog("Hello World");', myContext);

And of course we can define whole object structures as such:

var myContext = {
   utilFacade: {
   }
}
myContext.utilFacade.doLog = function(text) {
	util.log(text);
}

vm.runInNewContext('utilFacade.doLog("Hello World");', myContext); 

Though I have found at this point we begin to get my JavaScript editor of choice confused about what is legal and what is not.

Stepping back for a second, I wanted to note that it is important to think about what is going on here. We are feeding text in, which is compiled at the time runInNewContext is called. Depending on the application, it may not be desirable to compile at the time you run; we might instead want to do this step beforehand. This is accomplished via the Script object, like so:

var myScript = vm.createScript('var hello = "world";');
myScript.runInNewContext(myContext);

And we can still include calls to our context, so this works fine:

var myContext = {
  utilFacade: {
  }
}
myContext.utilFacade.doLog = function(text) {
	util.log(text);
}
 
var myScript = vm.createScript('utilFacade.doLog("Hello World");');
myScript.runInNewContext(myContext);
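
One reason to compile ahead of time is that the same Script object can then be run against several sandboxes. The sketch below is my own illustration, not code from the original article: the orders data and the totalScript name are hypothetical, and it uses the new vm.Script constructor form rather than vm.createScript.

```javascript
var vm = require('vm');

// Compile once, up front.
var totalScript = new vm.Script('total = price * quantity;');

// Hypothetical data: each object acts as the sandbox for one run.
var orders = [
  { price: 2, quantity: 3 },
  { price: 10, quantity: 4 }
];

// Run the already-compiled script against each sandbox in turn.
orders.forEach(function (order) {
  totalScript.runInNewContext(order);
});

var totals = orders.map(function (order) { return order.total; });
console.log(totals); // [ 6, 40 ]
```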

That said, it is important to understand that this is not very safe, as by the very fact that you are ‘updating’ the context you know there can be leakage – for example:

var myScript = vm.createScript('someVariable = "test"; utilFacade.doLog("Hello World");');
myScript.runInNewContext(myContext);
 
var anotherScript = vm.createScript('utilFacade.doLog(someVariable);');
anotherScript.runInNewContext(myContext);

This will print out 'test' to the log. We could just as easily have replaced anything in the context, causing crazy unexpected behavior between executions. Additionally, there are some other fundamentally unsafe things about this: for instance, our script could consist of a never-ending loop that hangs the entire node instance, or contain a syntax error or similar issue that halts it. In general, this simply is not a safe avenue for dealing with untrusted code. I've thought about the problem a bit and read some blogs on it; perhaps I'll post something about what to do in such a situation later.
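
As a partial illustration of guarding against two of the failure modes just mentioned, here is a minimal sketch. It is not from the original article and assumes a Node version in which runInNewContext accepts an options object with a timeout property, which the Node 0.4.x API discussed here did not offer. The runGuarded helper is hypothetical. This only contains runaway loops and thrown errors; it does not make vm a security boundary.

```javascript
var vm = require('vm');

// Hypothetical helper: run code in a fresh context, never let errors escape.
function runGuarded(code, sandbox) {
  try {
    // Assumption: the third argument is an options object supporting timeout (ms).
    var value = vm.runInNewContext(code, sandbox, { timeout: 50 });
    return { ok: true, value: value };
  } catch (err) {
    // Covers syntax errors, runtime errors and the timeout error alike.
    return { ok: false, name: err.name, message: err.message };
  }
}

var looped = runGuarded('while (true) {}', {});
var broken = runGuarded('var = ;', {});
var fine = runGuarded('1 + 1', {});

console.log(looped.ok);   // false
console.log(broken.name); // SyntaxError
console.log(fine.value);  // 2
```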

For now, I would be remiss if I did not mention this "undocumented" method. Note the new method used to create the context, and the associated call differences (passing in the context object instead).

var myContext = vm.createContext(myContext);
 
var myScript = vm.createScript('someVariable = "test"; utilFacade.doLog("Hello World");');
myScript.runInContext(myContext);
 
var anotherScript = vm.createScript('utilFacade.doLog(someVariable);');
anotherScript.runInContext(myContext);

If you are like me, you may be wondering ‘what is the point? it seems to work similar’ and as far as I can tell currently it pretty much operates the same in terms of functionality – I may be wrong on this point though in some specific use case, if so please feel free to drop a comment on it and I’ll update accordingly.

While functionally it seems the same, in reality something very different is occurring under the covers. To get an idea of what, precisely, I think it is worthwhile to consider this gist somebody made, which provides a useful reference:

https://gist.github.com/813257

For the lazy, here’s the code:

var vm = require('vm'),
   code = 'var square = n * n;',
   fn = new Function('n', code),
   script = vm.createScript(code),
   sandbox;
 n = 5;
 sandbox = { n: n };
 benchmark = function(title, funk) {
   var end, i, start;
   start = new Date;
   for (i = 0; i < 5000; i++) {
     funk();
   }
   end = new Date;
   console.log(title + ': ' + (end - start) + 'ms');
 }
 var ctx = vm.createContext(sandbox);
 benchmark('vm.runInThisContext', function() { vm.runInThisContext(code); });
 benchmark('vm.runInNewContext', function() { vm.runInNewContext(code, sandbox); });
 benchmark('script.runInThisContext', function() { script.runInThisContext(); });
 benchmark('script.runInNewContext', function() { script.runInNewContext(sandbox); });
 benchmark('script.runInContext', function() { script.runInContext(ctx); });
 benchmark('fn', function() { fn(n); });

This is a pretty simple benchmark script – there are some fundamental issues with it but it gives enough of a view that we can gauge a general sense of relative performance of various methods of executing the script. The script.* functions will use the pre-compiled script whereas the first two will compile at time of execution. The last item is a reference point. Executed on my machine, this gives me the following result:

vm.runInThisContext: 127ms
vm.runInNewContext: 1288ms
script.runInThisContext: 3ms
script.runInNewContext: 1110ms
script.runInContext: 23ms
fn: 0ms

So you can see that there are significant performance implications. The pre-compiled examples run faster than those that compile on the fly (no real surprise there), and if we were to increase the number of executions we would find this difference exacerbated. Additionally, we see that something significantly different is happening with 'runInContext' and 'runInThisContext' versus 'runInNewContext'. The difference is that runInNewContext does exactly what it says: it creates a new context based on the object being passed in. The other two methods use an already created context, and we can see that there is quite a benefit inherent in this, because creating a context is an expensive task.
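
The reuse pattern that the numbers above favor can be sketched as follows: create the context once, compile the script once, then run it repeatedly. This is my own minimal illustration, with the hypothetical names counterScript and sharedContext, and it uses the new vm.Script constructor form.

```javascript
var vm = require('vm');

// Compile once.
var counterScript = new vm.Script('count = count + 1;');

// Pay the context-creation cost once...
var sharedContext = vm.createContext({ count: 0 });

// ...then reuse both the compiled script and the context on every run.
for (var i = 0; i < 3; i++) {
  counterScript.runInContext(sharedContext);
}

console.log(sharedContext.count); // 3
```

The flip side of the speedup is that state persists between runs, which is the same leakage discussed earlier.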


Part 2

Original article (in English)

One thing I noticed today is that this works:

var util = require('util');
var vm = require('vm');
 
var contextObject = {
}
contextObject.contextMethod = function(text) {
console.log(text);
}
var myContext = vm.createContext(contextObject);
myContext.contextMethod2 = function(text) {
console.log(text);
}
var scriptText = 'contextMethod("Hello World!"); contextMethod2("Hello Universe!");';
var script = vm.createScript(scriptText);
script.runInContext(myContext);

Which in general makes sense, but it is nice to see that you can modify the context.

Part 3: More on Node VM

Original article (in English)

Posted on August 18, 2011 by David Clifton

So I wanted to understand a bit more about what is going on under the covers with Node VM. To do that, I pulled open the node code itself. To start with, when we do a require('vm') we are referencing the built-in vm module, which is contained in Node's lib folder under the name 'vm.js'. The code for it is quite simple, so I'll paste it here:

var binding = process.binding('evals');
 
exports.Script = binding.Script;
exports.createScript = function(code, ctx, name) {
  return new exports.Script(code, ctx, name);
};
 
exports.createContext = binding.Script.createContext;
exports.runInContext = binding.Script.runInContext;
exports.runInThisContext = binding.Script.runInThisContext;
exports.runInNewContext = binding.Script.runInNewContext;

This is from the version I am currently running which is Node 0.4.9.

What we see here is a call to process.binding to access ‘evals’ in the node C++ code. The rest is mostly just mapping logic, giving us the various methods we have already been using by mapping them to the methods in the C++ code. Pretty simple. To understand what is actually happening here though, we have to jump down into the land of C++.

In the src directory for node, in the file node_script.cc, we find the method that does the real work – WrappedScript::EvalMachine. Taking a look at this, we can get a sense of what differs between passing in a context via runInContext vs runInNewContext and runInThisContext.

The first significant time we see a differentiation is here:

 if (context_flag == newContext) {
    // Create the new context
    context = Context::New();
 
  } else if (context_flag == userContext) {
    // Use the passed in context
    Local<Object> contextArg = args[sandbox_index]->ToObject();
    WrappedContext *nContext = ObjectWrap::Unwrap<WrappedContext>(sandbox);
    context = nContext->GetV8Context();
  }

We can see that if we do a runInNewContext, we must create a new context object. On the other hand, if we pass in a context object previously created we instead perform a variety of gyrations to ‘unwrap’ the context and get the V8 context of it.

Later, we also find that disposal is quite different:

 if (context_flag == newContext) {
    // Clean up, clean up, everybody everywhere!
    context->DetachGlobal();
    context->Exit();
    context.Dispose();
  } else if (context_flag == userContext) {
    // Exit the passed in context.
    context->Exit();
  }

It is clear from our performance results that the object generation and subsequent detach/dispose is expensive enough to make a noticeable difference in our run time.

We also find this code which occurs whether or not a user is doing a new context or passing an existing one:

 // New and user context share code. DRY it up.
  if (context_flag == userContext || context_flag == newContext) {
    // Enter the context
    context->Enter();
 
    // Copy everything from the passed in sandbox (either the persistent
    // context for runInContext(), or the sandbox arg to runInNewContext()).
    keys = sandbox->GetPropertyNames();
 
    for (i = 0; i < keys->Length(); i++) {
      Handle<String> key = keys->Get(Integer::New(i))->ToString();
      Handle<Value> value = sandbox->Get(key);
      if (value == sandbox) { value = context->Global(); }
      context->Global()->Set(key, value);
    }
  }

Additionally, there is this set of code which occurs to copy the values back out to the object used from javascript:

 if (context_flag == userContext || context_flag == newContext) {
    // success! copy changes back onto the sandbox object.
    keys = context->Global()->GetPropertyNames();
    for (i = 0; i < keys->Length(); i++) {
      Handle<String> key = keys->Get(Integer::New(i))->ToString();
      Handle<Value> value = context->Global()->Get(key);
      if (value == context->Global()) { value = sandbox; }
      sandbox->Set(key, value);
    }
  }

Looking at all of these, however, it is important to note that they are if and else if statements, so all of this code (along with a few other tidbits) is ONLY executed if the context is new or user provided. There is a third option in the code, namely runInThisContext. None of this code executes in such a case, which seems consistent with the significant performance difference we see between runInThisContext and the other options.

It is also important to note that when supplying a context, the way values are communicated back and forth is actually via a copy operation: the script is not directly editing the object.
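
To make the copy-in / copy-out behavior easier to follow, here is a plain JavaScript model of the two C++ loops quoted above. It is only a sketch under stated assumptions: no real V8 context is involved, fakeGlobal stands in for context->Global(), copyProperties is a hypothetical helper, and Object.keys is used as an approximation of GetPropertyNames.

```javascript
// Plain-object model of the two copy loops; names are illustrative only.
function copyProperties(source, target) {
  Object.keys(source).forEach(function (key) {
    var value = source[key];
    // Mirror the self-reference check: a property pointing at the source
    // object itself is remapped to point at the target instead.
    if (value === source) { value = target; }
    target[key] = value;
  });
}

var sandbox = { n: 5 };
sandbox.self = sandbox;   // self-reference, to exercise the remapping
var fakeGlobal = {};      // stands in for the context's global object

copyProperties(sandbox, fakeGlobal);              // copy in, before the script runs
fakeGlobal.square = fakeGlobal.n * fakeGlobal.n;  // stands in for the script body
copyProperties(fakeGlobal, sandbox);              // copy out, after the script ran

console.log(sandbox.square);                 // 25
console.log(sandbox.self === sandbox);       // true
console.log(fakeGlobal.self === fakeGlobal); // true
```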