JavaScript - Design walk-through

Author: unknown. Date: 2004-12-02.

This section must be brief for now -- it could easily turn into a book.

JS "JavaScript Proper"

JS modules declare and implement the JavaScript compiler, interpreter, decompiler, GC and atom manager, and standard classes.

JavaScript uses untyped bytecode and runtime type tagging of data values. The jsval type is a signed machine word that contains either a signed integer value (if the low bit is set), or a type-tagged pointer or boolean value (if the low bit is clear). Tagged pointers all refer to 8-byte-aligned things in the GC heap.
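
A minimal sketch of this tagging scheme, written in JavaScript for brevity rather than the C that JSRef uses. JavaScript's bitwise operators work on 32-bit signed integers, which stands in for a 32-bit machine word here. The function names, tag names, and tag values are assumptions for illustration; only the low-bit integer test and the 8-byte alignment come from the description above.

```javascript
// Hypothetical tag layout. Only "low bit set means integer" and
// "tagged pointers are 8-byte aligned" come from the text.
const TAG_INT = 0x1;     // low bit set: the rest of the word is a signed integer
const TAG_MASK = 0x7;    // an 8-byte-aligned address leaves three low bits free
const TAG_OBJECT = 0x0;  // assumed tag values, for illustration only
const TAG_STRING = 0x4;

// Pack a small signed integer: shift left one bit and set the low bit.
function intToJsval(i) {
  return (i << 1) | TAG_INT;
}

// Unpack with an arithmetic shift so the sign is preserved.
function jsvalToInt(v) {
  return v >> 1;
}

function isInt(v) {
  return (v & TAG_INT) !== 0;
}

// Tag an 8-byte-aligned address by OR-ing a tag into its low three bits.
function tagPointer(address, tag) {
  if ((address & TAG_MASK) !== 0) {
    throw new Error("address is not 8-byte aligned");
  }
  return address | tag;
}

function pointerTag(v) {
  return v & TAG_MASK;
}

function pointerAddress(v) {
  return v & ~TAG_MASK;
}
```

Because one bit is spent on the tag, an integer stored this way has one bit less range than the machine word; in this sketch that means 31-bit signed integers.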

Objects consist of a possibly shared structural description, called the map or scope; and unshared property values in a vector, called the slots. Object properties are associated with nonnegative integers stored in jsvals, or with atoms (unique string descriptors) if named by an identifier or a non-integral index expression.
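
The split between a shared map and per-object slots might be pictured as follows. This is a toy model under stated assumptions, not the JSRef data structures: the class names PropertyMap and ToyObject are hypothetical, and the sketch lets every sharer see a property added to the map, which a real engine would have to handle more carefully.

```javascript
// A shared "map" (scope) assigns each property id a slot index.
class PropertyMap {
  constructor() {
    this.slotOf = new Map();  // property id -> slot index
  }
  lookup(id) {
    return this.slotOf.get(id);
  }
  add(id) {
    const slot = this.slotOf.size;
    this.slotOf.set(id, slot);
    return slot;
  }
}

// Each object owns its slots vector but may share its map with others.
class ToyObject {
  constructor(map) {
    this.map = map;
    this.slots = [];
  }
  get(id) {
    const slot = this.map.lookup(id);
    return slot === undefined ? undefined : this.slots[slot];
  }
  set(id, value) {
    let slot = this.map.lookup(id);
    if (slot === undefined) slot = this.map.add(id);
    this.slots[slot] = value;
  }
}

const sharedMap = new PropertyMap();
const objA = new ToyObject(sharedMap);
const objB = new ToyObject(sharedMap);
objA.set("x", 1);
objB.set("x", 2);
```

Both objects store "x" in slot 0 of their own slots vector, so the name-to-slot description exists once while the values stay separate.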

Scripts contain bytecode, source annotations, and a pool of string, number, and identifier literals. Functions are objects that extend scripts or native functions with formal parameters, a literal syntax, and a distinct primitive type ("function").

The compiler consists of a recursive-descent parser and a random-logic rather than table-driven lexical scanner. Semantic and lexical feedback are used to disambiguate hard cases such as missing semicolons, assignable expressions ("lvalues" in C parlance), etc. The parser generates bytecode as it parses, using fixup lists for downward branches and code buffering and rewriting for exceptional cases such as for loops. It attempts no error recovery.

The interpreter executes the bytecode of top-level scripts, and calls itself indirectly to interpret function bodies (which are also scripts). All state associated with an interpreter instance is passed through formal parameters to the interpreter entry point; most implicit state is collected in a type named JSContext. Therefore, all API and almost all other functions in JSRef take a JSContext pointer as their first argument.
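
The fixup lists used for downward branches can be sketched as follows: a jump is emitted with a placeholder operand, its operand position is remembered, and the operand is patched once the target offset is known. The opcode numbers, the one-byte relative operand, and the ToyEmitter class are all made up for this illustration and do not reflect the real bytecode format.

```javascript
// Toy emitter: bytecode is an array of numbers; opcode values are made up.
const OP_PUSH = 1, OP_IFEQ = 2, OP_GOTO = 3, OP_POP = 4;

class ToyEmitter {
  constructor() {
    this.code = [];
  }
  // Append bytes and return the offset of the first one.
  emit(...bytes) {
    const offset = this.code.length;
    this.code.push(...bytes);
    return offset;
  }
  // Emit a jump whose target is not known yet and record where its operand lives.
  emitForwardJump(op, fixups) {
    const at = this.emit(op, 0);
    fixups.push(at + 1);
  }
  // The target is the current end of code: patch each pending operand with
  // an offset relative to its jump opcode, then clear the list.
  patch(fixups) {
    const target = this.code.length;
    for (const operandAt of fixups) {
      this.code[operandAt] = target - (operandAt - 1);
    }
    fixups.length = 0;
  }
}

// Compile the shape of: if (cond) { then } else { otherwise }
const emitter = new ToyEmitter();
const toElse = [];
const toEnd = [];
emitter.emit(OP_PUSH, 7);                  // condition
emitter.emitForwardJump(OP_IFEQ, toElse);  // jump to else part if false
emitter.emit(OP_POP);                      // then part
emitter.emitForwardJump(OP_GOTO, toEnd);   // skip over the else part
emitter.patch(toElse);                     // else part starts here
emitter.emit(OP_POP);                      // else part
emitter.patch(toEnd);                      // end of the statement
```

Upward branches need no such list, because their targets have already been emitted when the jump is generated.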

The decompiler translates postfix bytecode into infix source by consulting a separate byte-sized code, called source notes, to disambiguate bytecodes that result from more than one grammatical production.

The GC is a mark-and-sweep, non-conservative (exact) collector. It can allocate only fixed-sized things -- the current size is two machine words. It is used to hold JS object and string descriptors (but not property lists or string bytes), and double-precision floating point numbers. It runs automatically only when maxbytes (as passed to JS_NewRuntime()) bytes of GC things have been allocated and another thing-allocation request is made. JS API users should call JS_GC() or JS_MaybeGC() between script executions or from the branch callback, as often as necessary.

An important point about the GC's "exactness": you must add roots for new objects created by your native methods if you store references to them into a non-JS structure in the malloc heap or in static data. Also, if you make a new object in a native method, but do not store it through the rval result parameter (see math_abs in the "Using the JS API" section above) so that it is in a known root, the object is guaranteed to survive only until another new object is created. Either lock the first new object when making two in a row, or store it in a root you've added, or store it via rval. See the GC tips document for more.
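
To see why exactness puts the burden of rooting on the embedder, consider this toy mark-and-sweep collector. It is only a model of the idea: ToyHeap and its methods are hypothetical names, and it says nothing about how JSRef lays out or allocates GC things. The collector follows references from the explicit root set only, so a thing held just in a local variable is swept.

```javascript
class ToyHeap {
  constructor() {
    this.things = new Set();  // every allocated GC thing
    this.roots = new Set();   // things the embedder has explicitly rooted
  }
  alloc(name) {
    const thing = { name, refs: [], marked: false };
    this.things.add(thing);
    return thing;
  }
  addRoot(thing) {
    this.roots.add(thing);
  }
  removeRoot(thing) {
    this.roots.delete(thing);
  }
  gc() {
    // Mark phase: only things reachable from known roots are marked.
    const stack = [...this.roots];
    while (stack.length > 0) {
      const thing = stack.pop();
      if (thing.marked) continue;
      thing.marked = true;
      stack.push(...thing.refs);
    }
    // Sweep phase: free unmarked things and reset marks on survivors.
    for (const thing of this.things) {
      if (thing.marked) thing.marked = false;
      else this.things.delete(thing);
    }
  }
}

const heap = new ToyHeap();
const rooted = heap.alloc("rooted");
const child = heap.alloc("child");
const stray = heap.alloc("stray");  // referenced only from this local variable
rooted.refs.push(child);
heap.addRoot(rooted);
heap.gc();
```

After the collection, "rooted" and "child" survive because they are reachable from a root, while "stray" is gone even though a local variable still refers to it. A conservative collector would have scanned the stack and kept it; an exact one does not.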

The atom manager consists of a hash table associating strings uniquely with scanner/parser information such as keyword type, index in script or function literal pool, etc. Atoms play three roles in JSRef: as literals referred to by unaligned 16-bit immediate bytecode operands, as unique string descriptors for efficient property name hashing, and as members of the root GC set for exact GC.
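
The "unique string descriptor" role can be illustrated with a small interning table. This is a sketch under assumptions: AtomTable, atomize, and the index field are hypothetical names, and the real atom manager stores more per-atom information than shown here.

```javascript
class AtomTable {
  constructor() {
    this.byString = new Map();  // string contents -> its one atom
  }
  // Return the unique atom for a string, creating it on first use.
  atomize(str) {
    let atom = this.byString.get(str);
    if (atom === undefined) {
      atom = Object.freeze({ string: str, index: this.byString.size });
      this.byString.set(str, atom);
    }
    return atom;
  }
}

const atoms = new AtomTable();
const firstAtom = atoms.atomize("length");
const secondAtom = atoms.atomize("len" + "gth");  // same characters, built separately
const otherAtom = atoms.atomize("prototype");
```

Because equal strings always yield the same atom, property names can be compared by identity (firstAtom === secondAtom) instead of character by character.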

Native objects and methods for arrays, booleans, dates, functions, numbers, and strings are implemented using the JS API and certain internal interfaces used as "fast paths".

In general, errors are signaled by false or unoverloaded-null return values, and are reported using JS_ReportError() or one of its variants by the lowest level in order to provide the most detail. Client code can substitute its own error reporting function and suppress errors, or reflect them into Java or some other runtime system as exceptions, GUI dialogs, etc.

File walk-through (OUT OF DATE!)

jsapi.c, jsapi.h

The public API to be used by almost all client code. If your client code can't make do with jsapi.h, and must reach into a friend or private js* file, please let us know so we can extend jsapi.h to include what you need in a fashion that we can support over the long run.

jspubtd.h, jsprvtd.h

These files exist to group struct and scalar typedefs so they can be used everywhere without dragging in struct definitions from N different files. The jspubtd.h file contains public typedefs, and is included by jsapi.h. The jsprvtd.h file contains private typedefs and is included by various .h files that need type names, but not type sizes or declarations.

jsdbgapi.c, jsdbgapi.h

The Debugging API, still very much under development. Provided so far:
  • Traps, with which breakpoints, single-stepping, step over, step out, and so on can be implemented. The debugger will have to consult jsopcode.tbl on its own to figure out where to plant trap instructions to implement functions like step out, but a future jsdbgapi.h will provide convenience interfaces to do these things. At most one trap per bytecode can be set. When a script (JSScript) is destroyed, all traps set in its bytecode are cleared.
  • Watchpoints, for intercepting set operations on properties and running a debugger-supplied function that receives the old value and a pointer to the new one, which it can use to modify the new value being set.
  • Line number to PC and back mapping functions. The line-to-PC direction "rounds" toward the next bytecode generated from a line greater than or equal to the input line, and may return the PC of a for-loop update part, if given the line number of the loop body's closing brace. Any line after the last one in a script or function maps to a PC one byte beyond the last bytecode in the script. An example, from perfect.js:
    14   function perfect(n)
    15   {
    16       print("The perfect numbers up to " +  n + " are:");
    17
    18       // We build sumOfDivisors[i] to hold a string expression for
    19       // the sum of the divisors of i, excluding i itself.
    20       var sumOfDivisors = new ExprArray(n+1,1);
    21       for (var divisor = 2; divisor <= n; divisor++) {
    22           for (var j = divisor + divisor; j <= n; j += divisor) {
    23               sumOfDivisors[j] += " + " + divisor;
    24           }
    25           // At this point everything up to 'divisor' has its sumOfDivisors
    26           // expression calculated, so we can determine whether it's perfect
    27           // already by evaluating.
    28           if (eval(sumOfDivisors[divisor]) == divisor) {
    29               print("" + divisor + " = " + sumOfDivisors[divisor]);
    30           }
    31       }
    32       delete sumOfDivisors;
    33       print("That's all.");
    34   } 

    The line number to PC and back mappings can be tested using the js program with the following script:
            load("perfect.js")
            print(perfect)
            dis(perfect)
    
            print()
            for (var ln = 0; ln <= 40; ln++) {
                var pc = line2pc(perfect,ln)
                var ln2 = pc2line(perfect,pc)
                print("\tline " + ln + " => pc " + pc + " => line " + ln2)
            }
    The result of the for loop over lines 0 to 40 inclusive is:
            line 0 => pc 0 => line 16
            line 1 => pc 0 => line 16
            line 2 => pc 0 => line 16
            line 3 => pc 0 => line 16
            line 4 => pc 0 => line 16
            line 5 => pc 0 => line 16
            line 6 => pc 0 => line 16
            line 7 => pc 0 => line 16
            line 8 => pc 0 => line 16
            line 9 => pc 0 => line 16
            line 10 => pc 0 => line 16
            line 11 => pc 0 => line 16
            line 12 => pc 0 => line 16
            line 13 => pc 0 => line 16
            line 14 => pc 0 => line 16
            line 15 => pc 0 => line 16
            line 16 => pc 0 => line 16
            line 17 => pc 19 => line 20
            line 18 => pc 19 => line 20
            line 19 => pc 19 => line 20
            line 20 => pc 19 => line 20
            line 21 => pc 36 => line 21
            line 22 => pc 53 => line 22
            line 23 => pc 74 => line 23
            line 24 => pc 92 => line 22
            line 25 => pc 106 => line 28
            line 26 => pc 106 => line 28
            line 27 => pc 106 => line 28
            line 28 => pc 106 => line 28
            line 29 => pc 127 => line 29
            line 30 => pc 154 => line 21
            line 31 => pc 154 => line 21
            line 32 => pc 161 => line 32
            line 33 => pc 172 => line 33
            line 34 => pc 172 => line 33
            line 35 => pc 172 => line 33
            line 36 => pc 172 => line 33
            line 37 => pc 172 => line 33
            line 38 => pc 172 => line 33
            line 39 => pc 172 => line 33
            line 40 => pc 172 => line 33
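
The rounding behaviour can be approximated with a sorted table of (pc, line) pairs. The sketch below is a simplification under stated assumptions: the table borrows the first five pairs from the output above as if the script ended after line 23, SCRIPT_LENGTH is an assumed value, and the function names lineToPc and pcToLine are hypothetical. It does not model source notes or the for-loop update case, where the real output maps line 24 to pc 92, which in turn maps back to line 22.

```javascript
// (pc, line) pairs sorted by pc. Borrowed from the start of the output above;
// the script is treated as if it ended after line 23.
const LINE_TABLE = [
  { pc: 0, line: 16 },
  { pc: 19, line: 20 },
  { pc: 36, line: 21 },
  { pc: 53, line: 22 },
  { pc: 74, line: 23 },
];
const SCRIPT_LENGTH = 92;  // assumed length of the truncated script

// Round toward the first bytecode generated from a line >= the requested line.
function lineToPc(line) {
  for (const entry of LINE_TABLE) {
    if (entry.line >= line) return entry.pc;
  }
  return SCRIPT_LENGTH;  // past the last line: one byte beyond the last bytecode
}

// Find the last table entry that starts at or before the given pc.
function pcToLine(pc) {
  let line = LINE_TABLE[0].line;
  for (const entry of LINE_TABLE) {
    if (entry.pc > pc) break;
    line = entry.line;
  }
  return line;
}
```

With this table, lines 0 through 16 all map to pc 0 and back to line 16, and lines 17 through 20 map to pc 19 and back to line 20, matching the start of the output above.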

jsconfig.h

Various configuration macros defined as 0 or 1 depending on how JS_VERSION is defined (as 10 for JavaScript 1.0, 11 for JavaScript 1.1, etc.). Not all macros are tested around related code yet. In particular, JS 1.0 support is missing from JSRef. JS 1.2 support will appear in a future JSRef release.
 

js.c

The "JS shell", a simple interpreter program that uses the JS API and more than a few internal interfaces (some of these internal interfaces could be replaced by jsapi.h calls). The js program built from this source provides a test vehicle for evaluating scripts and calling functions, trying out new debugger primitives, etc.

jsarray.*, jsbool.*, jsdate.*, jsfun.*, jsmath.*, jsnum.*, jsstr.*

These file pairs implement the standard classes and (where they exist) their underlying primitive types. They have similar structure, generally starting with class definitions and continuing with internal constructors, finalizers, and helper functions.

jsobj.*, jsscope.*

These two pairs declare and implement the JS object system. All of the following happen here:
  • creating objects by class and prototype, and finalizing objects;
  • defining, looking up, getting, setting, and deleting properties;
  • creating and destroying properties and binding names to them.
The details of a native object's map (scope) are mostly hidden in jsscope.[ch].

jsatom.c, jsatom.h

The atom manager. Contains well-known string constants, their atoms, the global atom hash table and related state, the js_Atomize() function that turns a counted string of bytes into an atom, and literal pool (JSAtomMap) methods.

jsgc.c, jsgc.h

[TBD]

jsinterp.*, jscntxt.*

The bytecode interpreter, and related functions such as Call and AllocStack, live in jsinterp.c. The JSContext constructor and destructor are factored out into jscntxt.c for minimal linking when the compiler part of JS is split from the interpreter part into a separate program.

jsemit.*, jsopcode.tbl, jsopcode.*, jsparse.*, jsscan.*, jsscript.*

Compiler and decompiler modules. The jsopcode.tbl file is a C preprocessor source that defines almost everything there is to know about JS bytecodes. See its major comment for how to use it. For now, a debugger will use it and its dependents such as jsopcode.h directly, but over time we intend to extend jsdbgapi.h to hide uninteresting details and provide conveniences. The code generator is split across paragraphs of code in jsparse.c, and the utility methods called on JSCodeGenerator appear in jsemit.c. Source notes generated by jsparse.c and jsemit.c are used in jsscript.c to map line number to program counter and back.

jstypes.h, jslog2.c

Fundamental representation types and utility macros. This file alone among all .h files in JSRef must be included first by .c files. It is not nested in .h files, as other prerequisite .h files generally are, since it is also a direct dependency of most .c files and would be over-included if nested in addition to being directly included. The one "not-quite-a-macro macro" is the JS_CeilingLog2() function in jslog2.c.

jsarena.c, jsarena.h

Last-In-First-Out allocation macros that amortize malloc costs and allow for en-masse freeing. See the paper mentioned in prarena.h's major comment.
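
The idea behind LIFO arena allocation can be sketched with a bump pointer plus mark and release operations. This is a toy model, not the jsarena interface: ToyArena and its method names are hypothetical, and it uses one fixed chunk where a real arena pool would chain further chunks on demand.

```javascript
class ToyArena {
  constructor(size) {
    this.bytes = new Uint8Array(size);  // one chunk obtained up front
    this.top = 0;                       // offset of the next free byte
  }
  // Allocation just bumps the top offset; there is no per-allocation bookkeeping.
  allocate(nbytes) {
    const aligned = (nbytes + 7) & ~7;  // round up to a multiple of 8 bytes
    if (this.top + aligned > this.bytes.length) {
      throw new Error("arena exhausted");
    }
    const offset = this.top;
    this.top += aligned;
    return offset;
  }
  // Remember the current top so later allocations can be undone together.
  mark() {
    return this.top;
  }
  // Free everything allocated since the mark in one step.
  release(mark) {
    if (mark > this.top) throw new Error("mark is above the current top");
    this.top = mark;
  }
}

const arena = new ToyArena(64);
const block1 = arena.allocate(5);
const savedMark = arena.mark();
const block2 = arena.allocate(16);
const block3 = arena.allocate(1);
arena.release(savedMark);           // block2 and block3 are freed together
const block4 = arena.allocate(8);   // reuses the space block2 started at
```

The cost of the underlying allocation is paid once per chunk rather than once per object, and releasing to a mark frees any number of objects in constant time.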

jsutil.c, jsutil.h

The JS_ASSERT macro is used throughout JSRef source as a proof device to make invariants and preconditions clear to the reader, and to hold the line during maintenance and evolution against regressions or violations of assumptions that it would be too expensive to test unconditionally at run-time. Certain assertions are followed by run-time tests that cope with assertion failure, but only where I'm too smart or paranoid to believe the assertion will never fail...

jsclist.h

Doubly-linked circular list struct and macros.
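
A minimal sketch of such a list, using a sentinel node that links to itself when the list is empty. The function names are hypothetical and are not the macros that jsclist.h defines.

```javascript
// An empty list is a sentinel node linked to itself.
function initList() {
  const head = { value: null };
  head.next = head;
  head.prev = head;
  return head;
}

function isEmptyList(head) {
  return head.next === head;
}

// Insert node immediately before `position`; inserting before the head appends.
function insertBefore(node, position) {
  node.next = position;
  node.prev = position.prev;
  position.prev.next = node;
  position.prev = node;
}

// Unlink a node in constant time; no head pointer or traversal is needed.
function removeLink(node) {
  node.prev.next = node.next;
  node.next.prev = node.prev;
  node.next = node;
  node.prev = node;
}

// Walk from the sentinel until the walk comes back around to it.
function listToArray(head) {
  const values = [];
  for (let n = head.next; n !== head; n = n.next) values.push(n.value);
  return values;
}

const list = initList();
const nodeA = { value: "a" };
const nodeB = { value: "b" };
const nodeC = { value: "c" };
insertBefore(nodeA, list);
insertBefore(nodeB, list);
insertBefore(nodeC, list);
removeLink(nodeB);
```

Because the list is circular, insertion and removal never need to special-case the first or last element.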

jscpucfg.c

This standalone program generates jscpucfg.h, a header file containing bytes per word and other constants that depend on CPU architecture and C compiler type model. It tries to discover most of these constants by running its own experiments on the build host, so if you are cross-compiling, beware.

prdtoa.c, prdtoa.h

David Gay's portable double-precision floating point to string conversion code, with Permission To Use notice included.

prhash.c, prhash.h

Portable, extensible hash tables. These use multiplicative hash for strength reduction over division hash, yet with very good key distribution over power of two table sizes. Collisions resolve via chaining, so each entry burns a malloc and can fragment the heap.
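
The combination of multiplicative hashing, power-of-two table sizes, and chaining might look like the following. This is a sketch, not the prhash interface: ToyHashTable is a hypothetical name, the multiplier is an assumed golden-ratio constant, and keys are taken to be 32-bit integer hash codes already.

```javascript
const GOLDEN_RATIO = 0x9e3779b9;  // assumed multiplier, roughly 2^32 divided by the golden ratio

class ToyHashTable {
  // log2Size must be between 1 and 31; the table has 2^log2Size buckets.
  constructor(log2Size) {
    this.shift = 32 - log2Size;  // keep only the top log2Size bits of the product
    this.buckets = new Array(1 << log2Size).fill(null);
  }
  // Multiply instead of dividing: the high bits of the 32-bit product pick the bucket.
  bucketOf(keyHash) {
    return Math.imul(keyHash, GOLDEN_RATIO) >>> this.shift;
  }
  set(keyHash, value) {
    const i = this.bucketOf(keyHash);
    for (let entry = this.buckets[i]; entry !== null; entry = entry.next) {
      if (entry.keyHash === keyHash) {
        entry.value = value;
        return;
      }
    }
    // Each new entry is a separately allocated link at the front of its chain.
    this.buckets[i] = { keyHash, value, next: this.buckets[i] };
  }
  get(keyHash) {
    const i = this.bucketOf(keyHash);
    for (let entry = this.buckets[i]; entry !== null; entry = entry.next) {
      if (entry.keyHash === keyHash) return entry.value;
    }
    return undefined;
  }
}

const table = new ToyHashTable(4);  // 16 buckets
for (let k = 0; k < 100; k++) table.set(k, k * 2);
```

Taking the high bits of the product, rather than the low bits of the key, is what spreads sequential keys across a power-of-two table without a division.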

prlong.c, prlong.h

64-bit integer emulation, and compatible macros that use C's long long type where it exists (my last company mapped long long to a 128-bit type, but no real architecture does 128-bit ints yet).
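
Emulating a 64-bit integer with two 32-bit halves comes down to propagating a carry by hand. The sketch below shows addition and negation only; the representation and the function names are assumptions for illustration and are not the prlong macros.

```javascript
// A 64-bit value as two unsigned 32-bit halves.
function makeInt64(hi, lo) {
  return { hi: hi >>> 0, lo: lo >>> 0 };
}

// Add the low halves first, then carry into the high halves.
function add64(x, y) {
  const lo = (x.lo + y.lo) >>> 0;
  const carry = lo < x.lo ? 1 : 0;  // unsigned wraparound means a carry occurred
  const hi = (x.hi + y.hi + carry) >>> 0;
  return { hi, lo };
}

// Two's-complement negation: invert both halves and add one.
function neg64(x) {
  return add64(makeInt64(~x.hi, ~x.lo), makeInt64(0, 1));
}

// Convert to an unsigned BigInt, used here only to check results.
function toBigInt(x) {
  return (BigInt(x.hi) << 32n) | BigInt(x.lo);
}

const carried = add64(makeInt64(0, 0xffffffff), makeInt64(0, 1));
const minusOne = neg64(makeInt64(0, 1));
```

Adding 1 to 0xffffffff wraps the low half to zero and carries into the high half, which is the case a native 64-bit type handles implicitly.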

jsosdep.h

Annoying OS dependencies rationalized into a few "feature-test" macros such as JS_HAVE_LONG_LONG.

jsprf.*

Portable, buffer-overrun-resistant sprintf and friends. For no good reason save lack of time, the %e, %f, and %g formats cause your system's native sprintf, rather than JS_dtoa(), to be used. This bug doesn't affect JSRef, because it uses its own JS_dtoa() call in jsnum.c to convert from double to string. It is still a bug that we'll fix later, and one you should be aware of if you intend to use a JS_*printf() function with your own floating-point arguments: various vendor sprintfs mishandle NaN, +/-Inf, and some even print normal floating values inaccurately.

prmjtime.c, prmjtime.h

Time functions. These interfaces are named in a way that makes local vs. universal time confusion likely. Caveat emptor, and we're working on it. To make matters worse, Java (and therefore JavaScript) uses "local" time numbers (offsets from the epoch) in its Date class.