Toward Smalltalk and Java Language Integration

Nik Boyd

Copyright 1997 Nikolas S. Boyd. Permission is granted to copy this document provided this copyright statement is retained in all copies.


Introduction

The advent of Java has been perceived by some as a threat to the future of Smalltalk. This paper proposes an alternate view - that Java has created a significant opportunity for Smalltalk. This paper describes the results of a feasibility study for translating Smalltalk source code into Java source code. Such source-to-source translation can serve two interests. For those wishing to migrate entirely from Smalltalk to Java, it suggests that the conversion of Smalltalk code into Java code can be automated. For those who can afford to wait while the technologies mature, it suggests that Smalltalk and Java can eventually be integrated into a seamless development environment.

If we can define a way to translate Smalltalk source code into Java source code, the resulting Java code can be compiled by any standard Java compiler into Java class binaries. These Java classes can then execute on any compliant Java virtual machine. The ubiquity of the Java virtual machine allows us to deliver classes to all supported platforms - without code changes or recompilation. So, source-to-source translation suggests the possibility of integrating Smalltalk and Java seamlessly, providing a way for Smalltalk and Java to execute and interoperate within the same runtime environment, the Java virtual machine.

While Java has quickly achieved substantial popularity and momentum within the computing industry, Smalltalk still has enviable development tools and mature class libraries. Also, the syntax of Smalltalk ultimately provides a better basis for expressing object models and designs. So, while some may be swayed by the hype and promise of Java, some of its advantages are not unique. If, as this paper suggests, we can define a way to translate Smalltalk source code into Java source code, then it may be feasible to incorporate the most significant benefits of Java into Smalltalk, providing Smalltalk with a natural next step in its evolution as a programming language.

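In the sections that follow, we explore ways to name and import packages and classes, support late binding with perform: messages, specify type signatures, map method names between the two languages, implement Smalltalk blocks with Java inner classes, and mix Smalltalk and Java code within a class.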


Some Historical Perspective

Over the past 20 years, Smalltalk pioneered and spawned several advanced computing technologies that have progressively become popular and then prevalent within the computing industry, including graphical user interfaces - the mouse and bit-mapped graphics with overlapping windows - and more recently, object-oriented programming and virtual machines with automatic garbage collection. Smalltalk has long had all of these features and facilities.

In the past, Smalltalk systems implemented these facilities directly. However, over the years as more of these technologies have been adopted, implemented and marketed by major commercial software vendors, Smalltalk systems have evolved to use the implementations that appeared in the commercial market. Thus, Smalltalk has evolved by integrating commercial technologies rather than continuing to implement them. Parts of Smalltalk have become thinner as their implementations have moved out into commercial technologies.

Just as this happened with graphical user interfaces, we can now see this happening with some more fundamental technologies - the virtual machine and automatic garbage collection. Java 1.1 provides a de facto industry-wide standard virtual machine technology. In this paper, we propose some mechanisms that will allow Smalltalk to evolve yet again to embrace and take advantage of this ubiquitously available virtual machine technology.


Package and Class Names

The Java language model supports the namespace concept with packages. Namespaces are extremely important for the development of large object-oriented software systems. They are also important for the proper integration of third party class libraries. While some namespace models have been proposed for Smalltalk [Boyd, 1996][Beaton, 1994], none has yet been widely adopted by the Smalltalk vendors.

In Java, each package name corresponds to a relative path of a package directory in the file system. The package directory contains class files, which hold the classes implemented by the package. The Java compiler uses the CLASSPATH environment variable to locate package directories. Any facility for translating Smalltalk code into Java code should also use the CLASSPATH variable to obtain type information for any Java classes that are referred to by Smalltalk methods.

A Java class name may be fully qualified by the name of the package within which it was defined. We want an analog of this mechanism for Smalltalk. We need a way to concatenate and delimit package and class names. Smalltalk symbols provide a convenient way to do this. Compare the Java class naming syntax with the analogous syntax we will adopt for Smalltalk.

 companyName.productName.packageName.ClassName   // Java
#companyName:productName:packageName:ClassName:  "Smalltalk"
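The correspondence between the two naming forms is mechanical, so a translator could convert between them with simple string operations. The following Java sketch illustrates this under the assumption that a symbol is available as a plain string; the class name QualifiedNames and both method names are hypothetical and are not part of the proposal itself.

```java
// Hypothetical helper (not part of the paper's proposal): converts between
// the colon-delimited Smalltalk symbol form and the dotted Java form.
final class QualifiedNames {
    private QualifiedNames() {}

    /** Converts "#java:util:Vector:" to "java.util.Vector". */
    static String toJavaName(String symbol) {
        String s = symbol.startsWith("#") ? symbol.substring(1) : symbol;
        if (s.endsWith(":")) {
            s = s.substring(0, s.length() - 1); // drop the trailing colon
        }
        return s.replace(':', '.');
    }

    /** Converts "java.util.Vector" to "#java:util:Vector:". */
    static String toSmalltalkSymbol(String javaName) {
        return "#" + javaName.replace('.', ':') + ":";
    }
}
```

The trailing colon is dropped on the way to Java and restored on the way back, so the two conversions are inverses of each other for well-formed names.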

Package and Class Imports

Java classes are compiled in the context of a package. All classes contained in a package are immediately visible to each other. Public classes from other packages can be made visible by importing them. The Java import statement can establish visibility to individual classes or all the public classes contained in a package. The following fragment provides examples of these mechanisms.

package packageName;
import java.awt.*;
import java.util.Vector;

We want a similar mechanism when translating Smalltalk code into Java code. Thus, we want to define a compilation context that establishes visibility between classes.

ClassVisibilityContext new
	packageName: #packageName: ;
	importPackageNamed: #java:awt: ;
	importClassNamed: #java:util:Vector: ;
	yourself
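On the Java side, a translator could use such a context to resolve simple class names found in Smalltalk methods. The following sketch shows one possible lookup order: explicit class imports first, then imported packages. The class VisibilityContextSketch and its methods are assumptions made for illustration; only Class.forName is a standard Java API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of name resolution against a set of imports.
class VisibilityContextSketch {
    private final List<String> classImports = new ArrayList<>();
    private final List<String> packageImports = new ArrayList<>();

    VisibilityContextSketch() {
        packageImports.add("java.lang"); // Java imports java.lang implicitly
    }

    VisibilityContextSketch importClassNamed(String qualifiedName) {
        classImports.add(qualifiedName);
        return this;
    }

    VisibilityContextSketch importPackageNamed(String packageName) {
        packageImports.add(packageName);
        return this;
    }

    /** Returns the visible class with the given simple name, or null if none is found. */
    Class<?> resolve(String simpleName) {
        for (String name : classImports) {
            if (name.endsWith("." + simpleName)) {
                Class<?> found = load(name);
                if (found != null) return found;
            }
        }
        for (String pkg : packageImports) {
            Class<?> found = load(pkg + "." + simpleName);
            if (found != null) return found;
        }
        return null;
    }

    /** Loads a class without running its static initializers. */
    private static Class<?> load(String name) {
        try {
            return Class.forName(name, false,
                VisibilityContextSketch.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }
}
```

A real translator would also need to report ambiguous names, which this sketch does not attempt.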

Late Binding with Smalltalk.Base.Object >> perform:

Java 1.1 has added a facility for reflection in the package java.lang.reflect. The reflection facility supports the identification of object types at runtime. We can use this facility to provide the equivalent behavior of the Smalltalk perform: messages in the base object class Smalltalk.Base.Object. The perform: methods locate and invoke the implementation of a method selector at runtime. The Java method resolution mechanism needs the classes of the method invocation arguments. The following Java method identifies the classes of the supplied array of objects.

public static java.lang.Class[] classifyAll(
	Smalltalk.Base.Object[] instances )
{
	int length = instances.length;
	java.lang.Class[] result = new java.lang.Class[length];

	for( int i = 0; i < length; i++ )
		result[i] = instances[i].getClass();

	return result;
}

Using the foregoing method, the following Java code provides an example of how a method can be resolved at runtime.

public Smalltalk.Base.Object perform_withArguments(
	java.lang.String selector,
	Smalltalk.Base.Object[] arguments )
{
	java.lang.Object result = null;
	java.lang.Class[] signature = classifyAll( arguments );
	try
	{
		result = this.getClass()
			.getMethod( selector, signature )
			.invoke( this, arguments );
	}
	catch( java.lang.NoSuchMethodException e )
	{
		// handle an unimplemented selector
	}
	catch( java.lang.IllegalAccessException e )
	{
		// handle an inaccessible method
	}
	catch( java.lang.reflect.InvocationTargetException e )
	{
		// handle an exception thrown by the invoked method
	}
	// return after the try statement rather than from a finally clause,
	// so that unexpected runtime exceptions are not silently discarded
	return (Smalltalk.Base.Object) result;
}

Given that all Smalltalk objects are derived from Smalltalk.Base.Object, unoptimized Java code can be generated for Smalltalk messages. The following code provides an example of what the generated Java code would look like for a unary message.

receiver.perform( "methodName" )

can be generated for the following Smalltalk expression

receiver methodName
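One practical detail deserves a note. Class.getMethod matches declared parameter types exactly, so a lookup keyed on the runtime classes of the arguments can miss a method whose parameter is declared with a superclass type. The following self-contained sketch shows one possible workaround, matching by name and argument count and then checking that each argument is an instance of the declared parameter type. The classes PerformSketch and Greeter are hypothetical stand-ins for Smalltalk.Base.Object and a derived class, and the lookup strategy is an assumption rather than the mechanism described above.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for Smalltalk.Base.Object.
class PerformSketch {
    /** Locates a public method by name and arity, then invokes it on this object. */
    Object perform(String selector, Object... arguments) {
        for (Method m : getClass().getMethods()) {
            if (!m.getName().equals(selector)) continue;
            Class<?>[] params = m.getParameterTypes();
            if (params.length != arguments.length) continue;
            if (!accepts(params, arguments)) continue;
            try {
                return m.invoke(this, arguments);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            } catch (InvocationTargetException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        // Rough analog of the Smalltalk doesNotUnderstand: protocol.
        throw new UnsupportedOperationException("doesNotUnderstand: " + selector);
    }

    /** True when every non-null argument is an instance of its parameter type. */
    private static boolean accepts(Class<?>[] params, Object[] args) {
        for (int i = 0; i < params.length; i++) {
            if (args[i] != null && !params[i].isInstance(args[i])) return false;
        }
        return true;
    }
}

// Hypothetical derived class used only to exercise the lookup.
class Greeter extends PerformSketch {
    public String greet() { return "hello"; }
    public String greet_with(Object name) { return "hello " + name; }
}
```

Here new Greeter().perform( "greet_with", "world" ) finds the method even though its parameter is declared as Object while the argument is a String. Primitive parameter types are never matched by this sketch, which is acceptable when every argument is an object.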

While runtime object typing is essential for supporting Smalltalk, it clearly introduces a significant performance overhead. It would be better if we can find ways to specify or infer type information at compile time, so that we can generate better code - i.e., with early binding rather than late binding. To this end, we explore a way to specify the type information needed for such optimization.


Type Signatures for Classes, Interfaces, Methods and Variables

Smalltalk objects are typed at runtime rather than compile time. While the reflection facilities in Java 1.1 allow us to implement runtime typing and dynamic binding for objects and methods, we want the option of adding type information in an unobtrusive manner - i.e., without changing the syntax of Smalltalk substantially. We can always defer typing until runtime and generate message sends using perform: messages. However, we want to be able to specify type information so that we can generate better Java code. This will also allow us to model more of the Java language mechanisms in Smalltalk, providing better language interoperability. Specifying the type information separately allows us to keep the original Smalltalk code essentially unmodified. The following code templates suggest how we can define the type information for classes, interfaces, methods and variables.

ClassSignature new
name: #packageName:ClassName: ;
access: #( public abstract ) ;
register!
InterfaceSignature new
name: #packageName:InterfaceName: ;
register!
MethodSignature new
name: #methodName ;
className: #packageName:ClassName: ;
result: #ResultClass: ;
access: #( public ) ;
throws: #packageName:ExceptionClassName: ; "..."
register!
VariableSignature new
name: #variableName ;
type: #VariableClass: ;
className: #packageName:ClassName: ;
methodName: #methodName block: depth @ order ;
access: #( protected ) ;
initially: [ initialObject expression ];
register!

Notes:

ClassSignature    MethodSignature      VariableSignature     VariableSignature
                                       (in a Class scope)    (in a Method scope)

final             final                final                 final
abstract          static               static
                  native               transient
                  abstract             volatile
                  synchronized

public            public               public
protected         protected            protected
                  private protected    private protected
                  private              private
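To suggest how a translator might use a registered method signature, the following Java sketch renders the header of a Java method declaration from the same pieces of information that the MethodSignature template carries. The class MethodSignatureSketch, its constructor, and toJavaHeader are hypothetical names, and the sketch handles only methods without parameters.

```java
import java.util.List;

// Hypothetical sketch: renders a Java method header from signature data.
final class MethodSignatureSketch {
    private final String name;
    private final String result;
    private final List<String> access;
    private final List<String> exceptions;

    MethodSignatureSketch(String name, String result,
                          List<String> access, List<String> exceptions) {
        this.name = name;
        this.result = result;
        this.access = access;
        this.exceptions = exceptions;
    }

    /** Converts "#packageName:ClassName:" to "packageName.ClassName". */
    private static String javaName(String symbol) {
        String s = symbol.startsWith("#") ? symbol.substring(1) : symbol;
        if (s.endsWith(":")) s = s.substring(0, s.length() - 1);
        return s.replace(':', '.');
    }

    /** Builds a header such as "public ResultClass methodName() throws pkg.Ex". */
    String toJavaHeader() {
        StringBuilder sb = new StringBuilder();
        for (String modifier : access) {
            sb.append(modifier).append(' ');
        }
        sb.append(javaName(result)).append(' ').append(javaName(name)).append("()");
        for (int i = 0; i < exceptions.size(); i++) {
            sb.append(i == 0 ? " throws " : ", ").append(javaName(exceptions.get(i)));
        }
        return sb.toString();
    }
}
```

With a declared result type available, the translator can emit a direct method call against that type instead of a perform: send.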

Method Interoperability

We need naming conventions to achieve method interoperability between Java and Smalltalk. We need to define how Smalltalk method names translate into Java method names. We need to pay special attention to Smalltalk keyword messages. We also need special naming conventions for operators because Java does not support the definition of operator methods. The table below suggests syntactic analogs between Smalltalk and Java method names.

Note the special rules for keyword method names. We define a canonical map from Smalltalk keyword method names to Java method names. We concatenate the keywords, remove the trailing colon and replace all remaining colons with an underscore. This convention works fine for mapping keyword method names from Smalltalk to Java, but what about the opposite direction? Some Java method signatures do not lend themselves to such easy conversion.

In particular, when a Java method has more than one argument, there is no clear analog in Smalltalk. If the syntax of Smalltalk supported anonymous keywords - i.e., the separation of message arguments using only a colon - then we would have what we need to support this mapping. However, in the absence of such syntactic support, we must make another choice. For this reason, we use the special keyword fragment o: as an anonymous keyword to separate the arguments in keyword method names that take two or more arguments. This allows us to invoke methods in Java base classes that have these kinds of method signatures.

Smalltalk                                 Java

receiver unary                            receiver.unary( )
receiver binary: arg                      receiver.binary( arg )
receiver ternary: arg1 o: arg2            receiver.ternary( arg1, arg2 )
receiver keyword: arg1 keyword: arg2      receiver.keyword_keyword( arg1, arg2 )

receiver +  arg                           receiver.plus( arg )
receiver -  arg                           receiver.minus( arg )
receiver *  arg                           receiver.times( arg )
receiver /  arg                           receiver.per( arg )
receiver \  arg                           receiver.rem( arg )
receiver @  arg                           receiver.at( arg )

receiver =  arg                           receiver.equal( arg )
receiver ~= arg                           receiver.notEqual( arg )
receiver == arg                           receiver.is( arg )
receiver ~~ arg                           receiver.isNot( arg )
receiver <  arg                           receiver.lessThan( arg )
receiver <= arg                           receiver.lessEqual( arg )
receiver >  arg                           receiver.moreThan( arg )
receiver >= arg                           receiver.moreEqual( arg )
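The naming rules in the table above can be expressed as a small mapping function. The following Java sketch is one reading of those rules: operators are looked up in a table, keywords are joined with underscores, and any o: fragment after the first keyword is treated as anonymous and dropped. The class name SelectorNames and the method toJavaName are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the Smalltalk-to-Java method name mapping.
final class SelectorNames {
    private static final Map<String, String> OPERATORS = new HashMap<>();
    static {
        OPERATORS.put("+", "plus");      OPERATORS.put("-", "minus");
        OPERATORS.put("*", "times");     OPERATORS.put("/", "per");
        OPERATORS.put("\\", "rem");      OPERATORS.put("@", "at");
        OPERATORS.put("=", "equal");     OPERATORS.put("~=", "notEqual");
        OPERATORS.put("==", "is");       OPERATORS.put("~~", "isNot");
        OPERATORS.put("<", "lessThan");  OPERATORS.put("<=", "lessEqual");
        OPERATORS.put(">", "moreThan");  OPERATORS.put(">=", "moreEqual");
    }

    private SelectorNames() {}

    /** Maps a Smalltalk selector such as "keyword:keyword:" to a Java method name. */
    static String toJavaName(String selector) {
        String operator = OPERATORS.get(selector);
        if (operator != null) return operator;
        if (!selector.contains(":")) return selector; // unary selector

        StringBuilder name = new StringBuilder();
        for (String keyword : selector.split(":")) {
            if (keyword.isEmpty()) continue;
            if (name.length() == 0) {
                name.append(keyword);             // first keyword names the method
            } else if (!keyword.equals("o")) {
                name.append('_').append(keyword); // later keywords join with '_'
            }                                     // anonymous o: fragments are dropped
        }
        return name.toString();
    }
}
```

Under this reading, perform:with: becomes perform_with and value:value: becomes value_value. The mapping is not reversible in general, since a Java name containing an underscore could come from either a keyword selector or a Java method that happens to use underscores.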

Implementing Smalltalk Blocks with Java Inner Classes

With the addition of inner classes in Java 1.1, we finally have the language support we need to implement Smalltalk blocks conveniently. By defining a few simple abstract classes in Java, we provide a foundation for deriving blocks.

Some Smalltalk implementations support blocks that take a larger number of arguments. For simplicity's sake, we limit our support to blocks that take 0, 1, or 2 arguments. The following table shows how the Java classes we define correspond to the kinds of Smalltalk blocks they support.

Java Class            Smalltalk Block

ZeroArgumentBlock     [ "..." ] value
OneArgumentBlock      [ :a | "..." ] value: x
TwoArgumentBlock      [ :a :b | "..." ] value: x value: y

If we define a single class for all the block value messages, then each derived block class would have to implement all of the value methods. We do not want to impose this limitation on the implementation, so we define each kind of value message in a separate abstract class.

public abstract class ZeroArgumentBlock
extends Smalltalk.Base.Object
{
	// supports [ "..." ] value
	public abstract Smalltalk.Base.Object value( );
}

public abstract class OneArgumentBlock
extends Smalltalk.Base.Object
{
	// supports [ :a | "..." ] value: x
	public abstract Smalltalk.Base.Object value(
		Smalltalk.Base.Object a );
}

public abstract class TwoArgumentBlock
extends Smalltalk.Base.Object
{
	// supports [ :a :b | "..." ] value: x value: y
	public abstract Smalltalk.Base.Object value_value(
		Smalltalk.Base.Object a,
		Smalltalk.Base.Object b );
}

Now, the derived inner classes that implement Smalltalk blocks need only implement one kind of value message. Thus, when we translate Smalltalk blocks into Java code, we can duplicate their intended semantics. The following Smalltalk and Java sample methods provide a sketch of what this translation looks like for a two argument block.

sortBlock
	"A commonly used two argument block."
	^[ :a :b | a <= b ]

// the Java translation of the method.
public Smalltalk.Base.Object sortBlock( )
{
	return new TwoArgumentBlock( )
	{
		public Smalltalk.Base.Object value_value(
			Smalltalk.Base.Object a,
			Smalltalk.Base.Object b )
		{
			return a.perform_with( "lessEqual", b );
		}
	};
}
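To show how such a block object might be evaluated once it has been created, the following self-contained Java sketch passes a two argument block to a simple insertion sort. It is a simplification made for illustration: it uses java.lang.Object and Integer in place of Smalltalk.Base.Object, compares values directly instead of sending lessEqual through perform:, and the names BlockSketch and sort are hypothetical.

```java
// Hypothetical sketch: evaluating a two argument block from Java code.
class BlockSketch {
    /** Simplified stand-in for TwoArgumentBlock, based on java.lang.Object. */
    abstract static class TwoArgumentBlock {
        abstract Object value_value(Object a, Object b);
    }

    /** Counterpart of [ :a :b | a <= b ], specialized to Integer arguments. */
    static TwoArgumentBlock sortBlock() {
        return new TwoArgumentBlock() {
            Object value_value(Object a, Object b) {
                return ((Integer) a) <= ((Integer) b);
            }
        };
    }

    /** Insertion sort that asks the block whether two elements are in order. */
    static Object[] sort(Object[] items, TwoArgumentBlock inOrder) {
        Object[] result = items.clone();
        for (int i = 1; i < result.length; i++) {
            Object item = result[i];
            int j = i - 1;
            // Shift larger elements right until the block reports correct order.
            while (j >= 0 && !((Boolean) inOrder.value_value(result[j], item))) {
                result[j + 1] = result[j];
                j--;
            }
            result[j + 1] = item;
        }
        return result;
    }
}
```

The anonymous inner class captures nothing here, but the same form can refer to final local variables of the enclosing method, which is what allows translated blocks to see their defining context.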

Mixing Smalltalk and Java Code

Certain parts of Java syntax are not easily supported in Smalltalk. Examples include the Java syntax for synchronized blocks and exception handling. We want to be able to freely mix Smalltalk and Java methods within a class so that these features become available to Smalltalk developers. Given the ability to mix Smalltalk and Java methods in a class, we can define a Java method for supporting synchronized blocks in Smalltalk.Base.Object.

public 
Smalltalk.Base.Object 
synchronizedDo( ZeroArgumentBlock aBlock )
{
	synchronized( this )
	{
		return aBlock.value();
	}
}

This synchronizedDo: method allows us to use this Java language feature in classes derived from Smalltalk.Base.Object. For example, we could then use code like the following in a derived Smalltalk class.

receiver synchronizedDo: [ "..." ]
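The effect of such a method can be demonstrated with a small self-contained Java sketch in which two threads update a shared counter only from within a synchronized block evaluation. The classes SynchronizedSketch and Counter are hypothetical stand-ins, and ZeroArgumentBlock is redefined locally on java.lang.Object so that the example runs on its own.

```java
// Hypothetical sketch: two threads sharing a counter through synchronizedDo.
class SynchronizedSketch {
    /** Simplified stand-in for ZeroArgumentBlock. */
    abstract static class ZeroArgumentBlock {
        abstract Object value();
    }

    /** Stand-in for a class derived from Smalltalk.Base.Object. */
    static class Counter {
        int count = 0;

        Object synchronizedDo(ZeroArgumentBlock aBlock) {
            synchronized (this) {
                return aBlock.value();
            }
        }
    }

    /** Runs two threads that each increment the counter the given number of times. */
    static int runTwoThreads(final int incrementsPerThread) {
        final Counter counter = new Counter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.synchronizedDo(new ZeroArgumentBlock() {
                        Object value() {
                            counter.count++; // guarded by the counter's monitor
                            return null;
                        }
                    });
                }
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.count;
    }
}
```

Because every increment runs while holding the counter's monitor, no updates are lost, and the total equals twice the number of increments per thread.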

While we can package some Java language features this way, others are more troublesome. For example, the Java exception handling mechanism is not so easily mapped into Smalltalk. It is easy enough to model the Java throw verb, but the catch verb does not lend itself readily to this kind of emulation.


Conclusions and Questions

This paper sketched a model for how to translate Smalltalk source code into Java source code, ultimately supporting the compilation of Smalltalk into Java class binaries. Overall, the goal of seamless integration between Smalltalk and Java seems like a real possibility and shows promise for the further growth of Smalltalk. However, several questions remain open for consideration.

Finally, how will the Smalltalk vendors respond to the Java challenge? All have invested heavily in proprietary virtual machine technologies. Many of the vendors have put forward a strategy that hedges their investment - integrating support for Java into their Smalltalk virtual machines. While this strategy may appear sound from a business standpoint, one cannot help but question it from a technology standpoint. It may serve to help them migrate to Java, but we wonder: why continue to market proprietary virtual machines when a de facto standard exists? Given the feasibility of integrating Smalltalk and Java, shouldn't Smalltalk embrace Java fully as a peer language?


Bibliography

[Beaton, 1994] Wayne Beaton. Name Space in Smalltalk/V for Win32 in The Smalltalk Report 4(1). SIGS Publications, September 1994.

[Boyd, 1996] Nik Boyd. Class Naming and Privacy in Smalltalk in The Smalltalk Report 6(3). SIGS Publications, November 1996.

Trademarks
Java is a trademark of Sun Microsystems, Inc.