October 31, 2023

N4JS Can Generate TypeScript Definitions

N4JS source code is transpiled into EcmaScript, making it an integral part of the JavaScript ecosystem. Within this ecosystem, it is utilized by various libraries. To facilitate this usage, N4JS now offers TypeScript type definition generation.

The growing popularity of TypeScript as a programming language is evident in the increasing number of TypeScript packages and the download rates of pure type definition packages. The latter are primarily provided by the Definitely Typed repository, which currently includes type definitions for 8,724 libraries. By enabling the N4JS transpiler to generate TypeScript definition files, N4JS projects can be seamlessly imported into TypeScript projects while maintaining type safety.

To activate the feature for '.d.ts' file generation, you can set the 'n4js->generator->d.ts' property to 'true' in the 'package.json' file. This setting applies to the entire project. In cases where multiple projects exist within a yarn workspace, the transpiler will respect projects that do not generate '.d.ts' files. In case type information gets exposed through a project that generates 'd.ts' files, those exposed types are cut off and will be treated as any types. When generating type information, import statements in 'd.ts' files may be added or removed, as some become obsolete while others are necessary.

Despite their superficial similarities, N4JS and TypeScript differ in various aspects. For example, N4JS allows interfaces to provide static properties, and N4JS enums offer static properties like the getter literals that returns all literals of an enum. In the corresponding 'd.ts' files, these interfaces and enums are split up into TypeScript interfaces and enums along with a namespace bearing the same name (making use of TypeScript declaration merging). The additional namespace contains all properties that are static in the N4JS source code.

To enable the use of built-in types, such as accessing N4JS reflection information, the transpiler also includes the file 'n4jsglobals.d.ts'. This file contains type information about reflection classes and provides types that mimic N4JS structural typing strategies (e.g. all readable fields/getters of a type C written as ~r~C in N4JS) using utility types of TypeScript. 

While N4JS and TypeScript may appear similar, they differ in several ways. For instance, TypeScript generics have the semantics of an implicit upper bound, written as C<? extends T> in Java, whereas N4JS generics are similar to Java and support both no explicit bounds, as well as upper or lower bounds (written as C<T>, C<? extends T>, and C<? super T>). Additionally, TypeScript does not support certain features, such as optional getters or setters. N4JS, on the other hand, supports this feature, treating a getter/setter pair as override-compatible to fields, which of course can be optional. These differences or unsupported language features in TypeScript may occasionally result in generated type definitions not aligning perfectly with respect to the runtime EcmaScript and/or N4JS source code. However, in the vast majority of cases, TypeScript type information significantly simplifies and accelerates the reuse of N4JS libraries in terms of coding assistance and type validation.

by Marcus Mews

May 15, 2023

OpenAPI Generator supports N4JS

We are happy to announce that the OpenAPI Generator (version >= 6.6.0) supports N4JS client generation. This enables N4JS users to generate n4jsd files from OpenAPI specifications.

Our extension to the OpenAPI Generator supports the generation of client APIs and provides several command line options:

  • For api and model files a target path can be specified. For api files a file prefix can be specified.
  • Api files can be generated including calls to check methods whether mandatory parameters are missing.
  • Additionally, generated api files can also contain checks whether the given objects might contain more fields than necessary. This could prevent sending superfluous data to the server.
  • A default implementation for the actual REST calls is part of the generated file set.

Detailed specifications of the N4JS OpenAPI Generator and its command line options can be found at the official OpenAPI Generator website.  

by Marcus Mews

December 09, 2022

Pitfalls in TypeScript - Broken Liskov Substitution Principle for Fields

While improving support for using TypeScript definitions within N4JS, we discovered some peculiarities that every TypeScript programmer should keep in mind. In this post we present how subtypes are allowed to break their supertypes' api.

 

Subtypes can specialise their parent type or leave it as is. This specialisation can be done via either adding some members like fields or methods, or by overriding members and widen or narrow their types. However, manipulating types is very tricky and can only be done depending on how members are accessed. In general, the Liskov substitution principle applies.



Liskov substitution principle:

Subtype Requirement: Let  be a property provable about objects  of type T. Then  should be true for objects  of type where is a subtype of T.

-- Barbara Liskov and Jeannette Wing, 1994


In other words: Whatever you can do with an object typed T you should also be able to do with an object typed S, where S is a subtype of T. However, let's have a look at what is possible in TypeScript but questionable from a type safety point of view (and hence not allowed in N4JS, Java, and other languages):

 

class C1 {
fieldC1: boolean = true;
}
class C2 extends C1 {
    // Liskov substitution principle violated here since type of 'fieldC1' gets narrowed
override fieldC1: true = true; // note: 'true' is a subtype of 'boolean'
}

const c2 = new C2();
c2.fieldC1 = false; // TypeScript ERROR: Type 'false' is not assignable to type 'true'.

const c1: C1 = c2; // up-cast
c1.fieldC1 = false; // assign 'false' via supertype

console.log(c2.fieldC1) // yields print out: "false"

if (c2.fieldC1 == false) { // TypeScript ERROR: This comparison appears to be unintentional because the types 'true' and 'false' have no overlap.
    // actually this is reachable
}

What we see in the example above is how the Liskov substitution principle was broken for fields in TypeScript: We cannot do with c2 what we are able to do with c1, despite the fact that the type of c2 is a subtype of the type of c1. In order to preserve the Liskov substitution principle we know from languages like Java, N4JS and others, that we are neither allowed to narrow nor to widen the types of fields along the type hierarchy. This is called type invariance. In TypeScript however, covariance is allowed, that means that the type of fields can be narrowed to a subtype, like shown in the example above.

by Marcus Mews

August 20, 2021

Dependency Cycles During Load Time

When programming in large code bases it can happen inadvertently that cycles of imports are created. In JavaScript, import statements trigger loading and initialization of the specified file directly. In case there is a dependency cycle, files might only be initialized partially and hence errors might occur later during runtime. In this post we present how N4JS detects and avoids these cases by showing validation errors in the source code.


Introduction

Let's start with the most simple example in JavaScript to illustrate the essential problem.


console.log(s); // prints 'test'?
export const s = "test";

Executing the two-liner above results in: 

ReferenceError: Cannot access 's' before initialization.

This is quite obvious and wouldn't surprise anyone. It is obvious because the read access is stated right before the definition of the constant s in the same file. However, it wouldn't be very obvious anymore when both the read access and the definition of s happen in separate files. Let's split up the example into the files F1.mjs and F2.mjs.

F1.mjs


import * as F2 from "./F2.mjs";
export const s = "test";
console.log(F2.s); // prints undefined?

F2.mjs


import * as F1 from "./F1.mjs";
export const s = F1.s;

Executing this example results in a similar error:

ReferenceError: s is not defined.

And again the cause for the error is an access to a not yet initialized variable. As a side note: Modifying the variable to be a 'var' instead of a 'const' would fix the error and return the print-out "undefined". This is due to hoisting of var symbols, but is still not the intended result which would be the print-out "test".

So far, both of the examples either give an unintended result or a compile time error is shown. Errors like these can only be identified after they actually happened during tests or production. While the two-liner example seems way too obvious to actually occur often in practice, the second case can easily hide in projects of many files and imports. A further difference is that even if the first example results in a runtime error, it can usually easily be identified and fixed. The second example however can span across many files by increasing the size of the cycle of import statements and therefore is hard to find and fix.

Two important properties of the execution semantics of JavaScript in Node.js can be witnessed here:

(1) In case a file m is started or imported that imports another file m', a subsequent import back to file m will be skipped. As a result, file m' might be only initialized partially when accessing not yet initialized elements from m.

(2) There is an exception to (1) regarding functions. Since functions are hoisted, they do not have to be reached by the control flow to get initialized. Hoisting will initialize them immediately so that they can be called from any location.

Let's look at a third example which reveals a similar case of reference errors. This time the error occurs depending on the entry point of the program. Have a look at the two files below which either result in the print-out "test" or "undefined" depending on which file was the entry point for node.js. Starting with file G1.mjs causes the execution to follow the green indicators and yields "test" whereas starting with file G2.mjs follows the red indicators and yields "undefined".

These kinds of errors might not be of interest when implementing a stand-alone application since these programs usually have a single and well known entry point only. Yet, cycles can occur also in parts of the program and then the entry point is determined by the order of import statements. Moreover, when writing libraries and exposing an API that spans across several files, the entry point can differ a lot and is defined by the library's user. Hence, in case of an unfortunate setup of files and import statements, a library might suffer from unexpected behavior depending on which part of its API was called first.

Also note that all the examples stated their imports at the top and all other statements below. When mixing import statements or dynamic imports with other code, it is even easier to create reference errors.


Validations in N4JS

One of the goals for N4JS is to provide both many handy and powerful language constructs along with type safety and strong validations. The reason behind the latter one is to prevent especially those errors to happen at runtime that are hard to find and hard to reproduce. Migrated to N4JS, the second example would show validation errors at the references to F1.s and F2.s due to the dependency cycle. The approach to detect these cases is explained in the following paragraphs by first laying out the terminology, reasoning about the general problem afterwards, and then defining the error cases in N4JS.


Terminology

Top level elements are those AST elements of a JavaScript file that are direct children of the root element such as import statements, const or class declarations, and others. Some top level elements can contain expressions or statements such as initializers of consts or extends clauses of classes. These initializers are executed when loading a file. A reference located in such initializers to a top level element (imported or not) is called load time reference.

In addition to compile time and runtime, the term load time is used to refer to the first phase of runtime during which all import statements and top level elements of the started JavaScript file are executed. In this regard we assume that initialization is performed during load time. In a separate step later, some specific calls to the API of imported files would perform the actual requested functionality.

A dependency between two files consist of an import statement and may have imported elements that may be used in the same file. Dependencies with unused imported elements are called unused imports, and those without imported elements are called bare imports. The target of a dependency is the imported file and also the imported element (except for bare imports). There exists at least one dependency for each import statement, and for each code reference to an imported (top level) element. Dependencies are differentiated into three kinds:

Compile time dependencies arise from all non-unused import statements. Runtime dependencies are the subset of compile time dependencies that is necessary at runtime only, i.e. it does not include unused imports or imports used for type information. (We assume that unused imports do not have intended side effects like bare imports have.) Load time dependencies are the subset of runtime dependencies with load time references.

A dependency cycle exists when traversing import statements of one file to the imported files will eventually lead to one of the already visited files. Note that the term dependency cycle refers to files and not necessarily to imported elements. Dependency cycles are differentiated as follows: Compile time dependency cycles are those relying on compile time dependencies. Runtime dependency cycles rely on runtime dependencies and are of special interest later. Load time dependency cycles rely on load time dependencies and are evaluated to errors in N4JS.


Reasoning

To get a clearer understanding, it is important to know the impact of dependency cycles in a program. An inherent property of dependency cycles is that at least one of the import statements during load time gets skipped since it would load a file that is already processed. In a cycle free program, all import statements of all files can be understood as a directed graph of files connected by import statements that define a partial load order. It is usually harmless that the total order of loading files depends on the entry point, i.e. which file is imported first or started the program, since it complies to that partial order. However, in case of dependency cycles the graph contains a cycle which will be broken up at load time to re-establish a directed graph and partial order. That means that the loading of at least one file of each cycle will be skipped because it is already being loaded. Other files that depends on that skipped import might be initialized partially only. Consequently, the entry point, e.g. the order of import statements, impacts whether a file is initialized partially or completely after its import statement was executed. Sorting import statements is a very common IDE feature and usually deemed to be innocent of causing runtime errors. Yet this assumption does not necessarily hold if the program contains dependency cycles.

We learned that partial initialization occurs if a load time initializer accesses a reference to a not yet initialized element of a skipped file. Probably that not yet initialized element will be initialized later during load time, but harm was already done since the current file had read the wrong value. Where exactly did the problem occur? References to not yet initialized elements can be located not only directly in load time initializers but can also be at locations reachable transitively e.g. by calling other functions in between starting from the initializer. Determining all reachable references from load time initializers which potentially access not yet initialized values can only be done by an expensive analysis that is usually imprecise due to over-approximation. In many cases it is even impossible due to reflective calls, dynamic loading etc. However, a simpler way to rule out accesses to partial initialized elements is to make a clear cut and forbid any expressions or statements in load time initializers that cannot be evaluated at compile time, e.g. function calls. On the downside, this strictness also reduces some programming freedom and even rules out legal load time references that would not cause runtime errors.

To summarize the approach: Either runtime dependency cycles need to be removed or - if that is not possible - load time initializers need to be restricted to not reference potentially skipped files.

A very interesting situation is when a runtime dependency cycle C contains a file m that has a load time initializer with a dependency d to file m'. This means that the cycle becomes a cycle that has a correct and an incorrect way of loading its files: Due to load time dependency d file m' must be loaded without being skipped. Still, at least one other import must be skipped to break the cycle. To make sure that loading of m' is not skipped, m' must not be the entry point of the cycle C. Choosing another entry point e.g. file m will result in partially loading m first, loading the rest of the cycle C including m' completely until another import to m is skipped. In other words: A load time dependency to a file m' within a cycle C constrains m' to never be the entry point into C. This situation is illustrated in the figure below.

The figure above shows the third example with additional information about its dependencies and cycles. As you can see there exists a runtime dependency cycle (indicated in blue), since the two files reference each other in runtime import statements. Also indicated in orange there exists a load time dependency because the reference to G2.s is located in an expression of a top level element that is evaluated during load time. Hence, this dependency imposes the constraint that the entry point to the third example must be G1.

In contrast note that the second example has a load time dependency cycle due to the two load time dependencies created by the accesses to F1.s and F2.s.

Error cases

Four types of errors are indicated in different situations regarding load time dependencies. Based on a source code analysis, runtime dependency cycles and references located in top level elements are detected first.

(1) Given this information, load time dependency cycles can be identified and be evaluated to errors. These errors are attached to the references of the load time dependencies.

Three other types of errors occur if and only if there exists a runtime dependency cycle C of modules m and m' (and maybe including others).

(2) Any load time reference in C that references a top level element in C (imported ones or in the same file) is marked with an error. This includes all load time dependencies. The reason to forbid any references to e.g. local or imported functions from C is that these may reach and access partially initialized variables. In N4JS there is one exception to that rule: extends clauses of classes. Load time references are still allowed here and not causing problems because extends clauses in N4JS are already restricted to references to other classes only (and not arbitrary expressions like in JavaScript). Note that ordinary dependencies (i.e. that do not have references in load time code) are still allowed, e.g. within the body of methods.

(3) Any dependency d to a module m' is marked with an error if and only if there exists a load time dependency to m' already. In other words: There may be no other dependency in C to m' if d is a load time dependency. In case an importing module m* is not in C a dependency to m' is allowed.

(4) However, when importing m' from m*, it is mandatory to also import another module m prior to import m*. Otherwise, an error is shown. The import of m prior to the import of m' will ensure that loading of m' is not skipped.

When programming with N4JS and errors like that occur, there are two ways to solve them. First and best solution is to remove the dependency cycle, which in many cases is a code smell already. This can be done by breaking the cycle or merging two or more files or file parts that mutually depend on each other. In case that is not possible, removing some load time dependencies is necessary. However, keep in mind that any load time dependency in a dependency cycle will impose a runtime execution order on the importing file to be loaded always prior to the imported file.

H1.n4js


import * as H2 from "H2";
class C extends H2.C {} // no error (3) here

H2.n4js


import "H1";
export public class C {}

The last example shows a case similar to the third example: The are two files that have a runtime dependency cycle. Additionally, there is a load time dependency created by the extends clause that references H2.C. Note that the third example produces the validation error (3) at the load time reference G2.s because we disallow all non-compile time expressions or statements in load time initializers. This shows where simplifications of our approach might be improved in the future. Since we leave an exception to error (3) in case the load time dependency is an extends clause, the last example shows no errors in N4JS.


Conclusion

The core problem are read accesses to variables that are not yet initialized. While these kinds of problems are relatively obvious and easy to find when happening in a single file, it is much harder to detect them when they are caused due to dependency cycles of two or more files. For the single file case, several IDEs and languages already provide validations and put error markers to read accesses of undefined symbols, such as VSCode for TypeScript. By introducing the validations described in this blog post, N4JS also can rule out initialization errors due to dependency cycles from happening. Unfortunately, in some cases this approach is too strict but we hope to relax some of the restrictions to improve the compromise of program safety and programming freedom.


by Marcus Mews

September 08, 2020

N4JS goes LSP

A few weeks ago we started to publish a VSCode extension for N4JS to the VSCode Marketplace. This was one of the last steps on our road to support LSP-based development tools. We chose to make this major change because of several reasons that affected both users and developers of N4JS.

An N4JS project in VSCode with the N4JS language extension


Our language extension for N4JS is hosted at the Microsoft VSCode Marketplace and will be updated regularly by our Jenkins jobs. Versions will be kept in sync with the language version, compiler version and version of the N4JS libraries to avoid incompatible setups. At the moment, the LSP server supports all main features of the language server protocol (LSP) such as validation, content assist, outline view, jump to definition and implementation, open symbol, the rename refactoring and many more. In addition, it will also generate output files whenever a source change is detected. We therefore heavily improved the incremental LSP builder of the Xtext framework and plan to migrate back those changes to the Xtext repository. For the near future we plan to work on stability, performance and also to support some of the less frequently used LSP features.


When looking back, development of N4JS has been based on the Xtext framework from the start and thus it was straightforward to build an Eclipse-based IDE as our main development tool. Later on, we also implemented a headless compiler used for manual and automated testing from the command line. The development of the compiler already indicated some problems stemming from the tight integration of the Eclipse and the Xtext frameworks together with our language specific implementations. To name an example, we had two separate builder implementations: one for the IDE and the other for the headless compiler. Since the Eclipse IDE is using a specific workspace and project model, we also had two implementations for this abstraction. Another important problem we faced with developing an Eclipse-based IDE was that at some points we had to implement UI tests using the SWTBot framework. For us, SWTBot tests turned out to be very hard to develop, to maintain, and to keep from becoming flaky. Shifting to LSP-based development tools, i.e. the headless compiler and an LSP server, allows us to overcome the aforementioned problems.


Users of N4JS now have the option to either use our extension for VSCode or integrate our LSP server into their favorite IDE themselves, even into the Eclipse IDE. They also benefit from more lightweight tools regarding disk size and start-up performance, as well as a better integration into well-known tools from the JavaScript development ecosystem.



September 30, 2019

Instanceof Type Guards in N4JS

Statically typed languages like Java use instanceof checks to determine the type of an object at runtime. After a successful check, a type cast needs to be done explicitly in most of those languages. In this post we present how N4JS introduced type guards to perform these type casts implicitly. 

No error due to implicit cast in successful instanceof type guard

The example above shows that strict type rules on the any instance a causes errors to show up when accessing the unknown property pX. However, after asserting that a is an instance of X, the property pX can be accessed without errors. A separate type cast is unnecessary, since type inference now also considers instanceof type guard information.


Hover information on variable access of a shows the inferred type

The resulting type is the intersection type of the original type (which is here any) and of all type guards that must hold on a specific variable access (which is here only type X). Keeping the original types any or Object is not necessary and could be optimised later. In case the original type is different, it is necessary to include it in the resulting intersection type. The reason is that the type guard could check for an interface only. If so, property accesses to properties of the original types would cause errors.


Re-definition of a type guarded variable

Two distinct differences between type guards and type declarations are (1) their data flow nature and (2) their read-only effects. Firstly, when redefining (in the sense of the data flow) a variable, the type guard information gets lost. Consequently, subsequent accesses to the variable will no longer benefit from the type guard, since the type guard was invalidated by the re-definition. Secondly, only the original type information is considered for a redefinition. That means that the type guard does not change the expected type and, hence, does not limit the set of types that can be assigned to a type guarded variable.


Further examples for instanceof type guards in N4JS

Data flow analysis is essential for type guards and has been presented in a previous post. Based upon this information, type information for each variable access is computed. Since also complicated data flows are handled correctly, such as in for loops or short circuit evaluation, type guard information is already available in composed condition expressions (see function f3 and f5 above). Aside from being able to nest instanceof type guards (see function f4 above), they also can be used as a filter at the beginning of a function (see function f6 above) or inside a loop: Negating a type guard and then exiting the function or block leaves helpful valid type guard information on all the remaining control flow paths.

by Marcus Mews

August 29, 2019

Redux App Development and Testing in N4JS (Chess Game Part 2)

In large applications, Redux - an implementation of Flux architecture created by Facebook - is often used to organise application code by using a strict data flow in one direction only. Redux is UI agnostic, and can be used in conjunction with any UI library. As a continuation of our chess game tutorial with React, we show how to extract the entire program state out of React components, store it with Redux, and test it with N4JS. The full tutorial is available at eclipse.org/n4js and the sources can be found at github.com/Eclipse/n4js-tutorials.


The first part of the chess game tutorial discussed how to develop a chess game app with React and JSX in N4JS. We have stored the program state - which for instance contains information about the locations of all chess pieces - in the state of the React components directly. As applications become larger, however, the mix of program state and UI makes the application hard to comprehend and difficult to test. To address these issues, we extract the program state from the UI components in the second part of the tutorial.

When using React with Redux, we store the application state in Redux store instead of the state of React components. As a result, React components become stateless UI elements and simply render the UI using the data retrieved from the Redux store. In a Redux architecture, data flows strictly in one direction. The diagram below graphically depicts the action/data flow in a React/Redux app.

Strict data flow of flux architecture application


The action/data flow in the diagram can be roughly understood as follows:
  • When a user interaction is triggered on the React component (e.g. button clicked, text field edited etc.), an action is created. The action describes the changes needed to be updated in the application state. For instance, when a text field is edited, the action created may contain the new string of the text field.
  • Then the action is dispatched to the Redux store whereby the Redux store stores the application state, usually as a hierarchical tree of state.
  • The reducers take the action and the current application state and create an updated application state.
  • If the changes in the application state are to a certain React component, they are forwarded into the component in form of props. The change in props causes the component to re-render.

In the second part of the tutorial we further elaborate on the interaction of React and Redux and migrate the original chess non-Redux app. The tutorial explains the role of the reducer, and how the game state is stored and maintained in the Redux store. Based on storing the game using Redux, the tutorial shows how to test the game application with the N4JS test library Mangelhaft, by for instance checking that valid move destination squares are updated after a chess piece was selected.

Note that the way of testing the game logics is completely UI-agnostic and no React components are involved at all. This is thanks to the decoupling of game logics from UI with the help of Redux.


by Minh Quang Tran