Java and Scala’s Type Systems are Unsound

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
A new research paper points to problems in Java and Scala generics since its introduction in Java 5 i.e. 12 years ago. Not surprising generics are notoriously difficult to get right, even with 12 years of extensive use; inherent design flaws are often overlooked.

PHP:
/**
 * javac, version 1.8.0_25 
 */
class Unsound {
  static class Constrain<A, B extends A> {} 
  
  static class Bind<A> {
    <B extends A> A upcast(Constrain<A, B> constrain, B b) {
      return b; 
    }
  }

  static <T,U> U coerce(T t) {
    Constrain<U, ? super T> constrain = null; 
    Bind<U> bind = new Bind<U>();
    return bind.upcast(constrain, t);
  }

  public static void main(String[] args) {
    String zero = Unsound.<Integer, String>coerce(0); 
  }
}
Which has "sorted of" been patched with recent updates. i.e. "return bind.upcast(constrain, t);" reports an error with Java 1.8.0_112

However this version 9 compatible example still permits "type safe" casting of an Integer to a String.
PHP:
class Unsound9 {
  static class Type<A> {
    class Constraint<B extends A> extends Type<B> {} 
    <B> Constraint<? super B> bad() { return null; } 
    <B> A coerce(B b) {
      return pair(this.<B>bad(), b).value; 
    }
  }
  
  static class Sum<T> {
    Type<T> type;
    T value;
    Sum(Type<T> t, T v) { type = t; value = v; }
  }
  
  static <T> Sum<T> pair(Type<T> type, T value) { 
    return new Sum<T>(type, value);
  }
  
  static <T,U> U coerce(T t) {
    Type<U> type = new Type<U>();
    return type.<T>coerce(t); 
  }
  
  public static void main(String[] args) {
    String zero = Unsound9.<Integer,String>coerce(0);
  }
}

Here's a playground with runnable versions of the various code examples.

The report on this is very detailed and an interesting read, however for the TLDR; here's their conclusion:
We have shown that Java and Scala are unsound and have been for 12 years. Their unsoundness is due to an unforeseen interaction of features that are each sound separately. The unsoundness was missed repeatedly due to our community’s reliance upon minimal calculi; mechanically verified proofs would not have prevented the problem. Any language with types that provide evidence without directly using the corresponding value is susceptible to this issue. Given that this issue was missed even after significant formalization effort, it is likely that many less-thoroughly-examined hypothetical languages and features in our literature share the same problem. This suggests that we as a community need to encourage more research on holistic language design and formalization so that we can prevent such problems in the future.
It's the wildcard's fault: "<? super B>" i.e. the encoding of variances with generics
 

Gnome

Executive Member
Joined
Sep 19, 2005
Messages
7,208
Code:
$ java Unsound
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	at Unsound.main(Unsound.java:17)

Code:
$ java Unsound9
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	at Unsound9.main(Unsound9.java:26)

Am I missing something obvious here?

EDIT: Ah ok it shouldn't allow compilation. Got it.
 
Last edited:

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
Code:
$ java Unsound
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	at Unsound.main(Unsound.java:17)

Code:
$ java Unsound9
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	at Unsound9.main(Unsound9.java:26)

Am I missing something obvious here?

EDIT: Ah ok it shouldn't allow compilation. Got it.
Exactly... statically typed languages (like Java) are meant to prohibit this.
 

Gnome

Executive Member
Joined
Sep 19, 2005
Messages
7,208
[)roi(];19063026 said:
Exactly... statically typed languages (like Java) are meant to prohibit this.
I read the researcher's post.

He attempts to make the security concern sound far more serious than it is by suggest alternative histories that didn't happen.
Like if Java did x or y.

What if the Nazis won WWII?
Alternate histories that do not exist are utterly pointless. I can make up any reality to prove my point. Unless it is real however, it is little more than fiction. In fact I believe it is a logical fallacy.

Point being, this is unlikely to have any practical applications, security or otherwise, because the runtime system prevents this using a ClassCastException as shown above.
 

[)roi(]

Executive Member
Joined
Apr 15, 2005
Messages
6,282
I read the researcher's post.

He attempts to make the security concern sound far more serious than it is by suggest alternative histories that didn't happen.
Like if Java did x or y.

What if the Nazis won WWII?
Alternate histories that do not exist are utterly pointless. I can make up any reality to prove my point. Unless it is real however, it is little more than fiction. In fact I believe it is a logical fallacy.

Point being, this is unlikely to have any practical applications, security or otherwise, because the runtime system prevents this using a ClassCastException as shown above.
It's not IMO being highlighted for security reasons at all, but rather for lessons to be learned for future / current language designers, in particular the reasoning that should be going into the implementation of generics in a language with a specific focus on type variance.

As for "alternative histories"; that again I don't see as something negative, but rather him proposing ways in which this problem could have been avoided in the design phase; again as helpful reasoning for future language designers i.e. reasoning around approach and taking shortcuts -- the problem here is after all related to the wildcard (?) shortcut that Java's designers chose to circumvent implementation complexity with type variance.

He certainly doesn't say the mistake was an easy or obvious one; to the contrary he says that even with multiple languages being designed around the Java's JVMs and bytecode, the design problem went unnoticed for 12 years.

From the compiler community I certainly haven't heard anything negative around this paper; because it's understood that this area of language design is highly complex so mistakes are likely.
 
Top