Protocol and DataType

https://www.ibm.com/developerworks/cn/java/j-clojure-protocols/, Solving the Expression Problem with Clojure 1.2

http://code.google.com/p/clojure-doc-en2ch/wiki/Chapter_13_Datatypes_and_Protocols,  Practical Clojure

 

The Expression Problem

The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).
                        —Philip Wadler, http://www.daimi.au.dk/~madst/tool/papers/expression.txt

The dynamic Expression Problem is a new name for an old problem. The goal is to have interfaces and type where one can create new types implementing existing interfaces and provide implementation of a new interface for an existing type, without recompiling existing code.

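In short, this is the problem of extending code without recompiling what already exists. It has two halves.

Extending existing methods to new types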


This is easy in an object-oriented language such as Java: ordinary class inheritance is enough.

Before 1.2, Clojure had no comparable mechanism.

 

Adding new methods to existing types


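Java cannot do this kind of extension directly.
Clojure has it somewhat easier. Lisp is built on the list, and Clojure has a set of common functions that work across many types, such as first, rest, and conj.
So if a new function is needed and it can be written in terms of those common functions, there is no problem: the new function works on every existing type straight away.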
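As a minimal sketch of that point, the function below is built only from seq, first, and rest, so it works on any type those functions already understand. The name every-other is made up for this illustration.

```clojure
;; A new function written only in terms of the shared sequence functions.
(defn every-other
  "Returns a lazy seq of every other element of coll, starting with the first."
  [coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (first s) (every-other (rest (rest s)))))))

(every-other [1 2 3 4 5])      ; a vector
(every-other '(:a :b :c :d))   ; a list
(every-other "clojure")        ; a string, treated as a seq of characters
```

None of the existing types had to change for this to work, which is why this half of the problem rarely hurts in Clojure as long as the types share an abstraction.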

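Does that mean Clojure is free of the extension problem? No. When several classes with completely different interfaces need a common new method, Clojure before 1.2 had no equally direct answer either.

For example, take two product companies, widgetco and amalgamated, each with its own sales model and its own classes.
At widgetco, each widget carries its own price, and an order is simply a list of widgets: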

package com.widgetco;

import java.util.ArrayList;

public interface Widget {
    public String getName();
    public double getPrice();
}

public class Order extends ArrayList<Widget> {
    public double getTotalCost() { /*...*/ }
}

At amalgamated, prices vary: each contract sets its own price.

package com.amalgamated;

public abstract class Product {
    public String getProductID() { /*...*/ }
}

public class Contract {
    public Product getProduct() { /*...*/ }
    public int getQuantity()    { /*...*/ }
    public double totalPrice()  { /*...*/ }
    public String getCustomer() { /*...*/ }
}

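Now the two companies merge, and new features such as invoices and shipping manifests must be built without changing the existing code (it cannot be recompiled).
The question is how to add new functions to the Order and Contract classes.

1. Direct approaches

1.1 Create a common superinterface and have both classes implement it. This unavoidably means modifying the source.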

public interface Fulfillment {
    public Invoice invoice();
    public Manifest manifest();
}

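1.2 Open classes. Dynamic languages such as Ruby and Python let you change a class at any time, so adding members and functions on the fly is no problem. This is known as "monkey patching".
The drawbacks are that it is too free, with no constraints, and that Java does not support it.


2. Indirect approaches: since the classes cannot be changed directly, wrap them in another layer
2.1 Multiple inheritance (illustrative only; Java does not let a class extend more than one class)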

public class GenericOrder
        extends com.widgetco.Order, com.amalgamated.Contract
        implements Fulfillment {
    public Invoice invoice() { /*...*/ }
    public Manifest manifest() { /*...*/ }
}
2.2 Wrappers
public class OrderFulfillment implements Fulfillment {
    private com.widgetco.Order order;

    /* constructor takes an instance of Order */
    public OrderFulfillment(com.widgetco.Order order) {
        this.order = order;
    }

    /* methods of Order are forwarded to the wrapped instance */
    public double getTotalCost() {
        return order.getTotalCost();
    }

    /* the Fulfillment interface is implemented in the wrapper */
    public Invoice invoice() { /*...*/ }
    public Manifest manifest() { /*...*/ }
}

2.3 Conditionals and Overloading

public class FulfillmentGenerator {
    public static Invoice invoice(Object source) {
        if (source instanceof com.widgetco.Order) {
            /* ... */
        } else if (source instanceof com.amalgamated.Contract) {
            /* ... */
        } else {
            throw new IllegalArgumentException("Invalid source.");
        }
    }
}

public class FulfillmentGenerator {
    public static Invoice invoice(com.widgetco.Order order) {
        /* ... */
    }
    public static Invoice invoice(com.amalgamated.Contract contract) {
        /* ... */
    }
}

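These wrapping approaches differ little and none is perfect. Personally I find 2.3 the simplest and most practical; the overloaded version in particular already resembles the approach discussed next.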

 

Protocols: adding methods to existing types

Clojure 1.2 introduced protocols, which are defined with defprotocol:

(defprotocol MyProtocol
  "This is my new protocol"
  (method-one [x] "This is the first method.")
  (method-two [x] [x y] "The second method."))

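What is a protocol?
a. A named set of method declarations, with no implementations.
b. Each declaration consists of a function name, one or more argument lists, and a docstring.
c. Every method takes at least one argument. The first argument is the datatype being dispatched on, because a protocol function runs different logic for different datatypes.
d. defprotocol adds the functions, here method-one and method-two, directly to the current namespace, so they can be called by name like any other function.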
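The points above can be checked at the REPL. This is a small sketch with a made-up protocol named Describable; it shows that defprotocol creates a real function in the namespace, and that calling it before any type has been extended fails rather than doing nothing.

```clojure
;; Describable and describe are hypothetical names used only for illustration.
(defprotocol Describable
  "Things that can describe themselves."
  (describe [x] [x prefix] "Returns a short description of x."))

;; The method is now an ordinary function in the current namespace ...
(fn? describe)

;; ... but no type implements it yet, so calling it throws.
(try
  (describe "abc")
  (catch IllegalArgumentException e
    (.getMessage e)))
```

The exception message names the protocol and the class that had no implementation, which is usually enough to see which extend call is missing.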

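Still unclear? Under the hood, defprotocol also generates a Java interface. If you AOT-compile the Clojure source file that contains the protocol definition, you can use it as an interface from Java code: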

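package my.code;

public interface MyProtocol {
    public Object method_one();
    public Object method_two();
    public Object method_two(Object y);
}

There are differences. First, protocols do not support inheritance. Second, the Java interface drops the first parameter, because Java passes this implicitly and it need not be written out.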

How is a protocol used?

Use extend to attach a protocol to an existing type:

(extend DatatypeName
  SomeProtocol
    {:method-one (fn [x y] ...)
     :method-two existing-function}
  AnotherProtocol
    {...})

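extend can attach several protocols to one datatype. Each protocol is paired with a map whose keys are the method names (as keywords) and whose values are the implementations, either anonymous functions written with fn or functions that already exist.

Because extend is an ordinary function, all of its arguments are evaluated first, so the implementation maps can be built programmatically. This is where Clojure's code-as-data style pays off: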

(def defaults
     {:method-one (fn [x y] ...)
      :method-two (fn [x] ...)})

(extend DefaultType
  SomeProtocol
    defaults)

(extend AnotherType
  SomeProtocol
    (assoc defaults :method-two (fn ...)))
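To make the schematic above concrete, here is a runnable sketch of the same idea. The protocol Greeter and its methods are invented for the example; String and clojure.lang.Keyword stand in for DefaultType and AnotherType.

```clojure
;; Greeter, greet, and farewell are hypothetical names.
(defprotocol Greeter
  (greet [x] "Returns a greeting for x.")
  (farewell [x] "Returns a farewell for x."))

;; A plain map of default implementations, keyed by method name.
(def greeter-defaults
  {:greet    (fn [x] (str "Hello, " x))
   :farewell (fn [x] (str "Goodbye, " x))})

;; String takes the defaults unchanged.
(extend String
  Greeter
  greeter-defaults)

;; Keyword reuses :farewell but overrides :greet with assoc.
(extend clojure.lang.Keyword
  Greeter
  (assoc greeter-defaults
         :greet (fn [x] (str "Hi, " (name x)))))

(greet "Ada")     ; "Hello, Ada"
(greet :ada)      ; "Hi, ada"
(farewell :ada)   ; "Goodbye, :ada"
```

Sharing a map this way gives method reuse across types without any inheritance between the types themselves.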

Clojure also provides a macro, extend-protocol, for extending one protocol to several datatypes at once:

(extend-protocol SomeProtocol
  SomeDatatype
     (method-one [x] ...)
     (method-two [x y] ...)
  AnotherType
     (method-one [x] ...)
     (method-two [x y] ...))
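A runnable version of that shape might look like the following. Measurable and size are made-up names; the types extended are nil, String, and java.util.Collection, none of which we own.

```clojure
;; Measurable and size are hypothetical names.
(defprotocol Measurable
  (size [x] "Returns the number of items in x."))

(extend-protocol Measurable
  nil                                ; nil can be extended like a type
  (size [_] 0)
  String
  (size [s] (.length s))
  java.util.Collection               ; an interface: covers vectors, lists, sets
  (size [c] (.size c)))

(size nil)         ; 0
(size "abcd")      ; 4
(size [1 2 3])     ; 3, vectors implement java.util.Collection
(size #{:a :b})    ; 2
```

Extending an interface such as java.util.Collection covers every class that implements it, so one clause can serve many concrete types.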

Example

Here is how protocols apply to the example above.
1. Define the protocol:

(ns com.amalgamated)

(defprotocol Fulfillment
  (invoice [this] "Returns an invoice")
  (manifest [this] "Returns a shipping manifest"))

2. Add the methods to the existing types through the protocol:
(extend-protocol Fulfillment
  com.widgetco.Order
    (invoice [this]
      ... return an invoice based on an Order ... )
    (manifest [this]
      ... return a manifest based on an Order ... )
  com.amalgamated.Contract
    (invoice [this]
      ... return an invoice based on a Contract ... )
    (manifest [this]
      ... return a manifest based on a Contract ... ))
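3. Call the functions
(invoice order) or (invoice contract), where order is a com.widgetco.Order and contract is a com.amalgamated.Contract.
This looks a lot like the overloading approach above, except that this is FP, so plain functions are called directly.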
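The three steps can be tried end to end without the real company classes. In this sketch, which rests on stand-ins rather than the article's code, a java.util.ArrayList of prices plays the role of com.widgetco.Order and a java.util.HashMap plays the role of com.amalgamated.Contract; the map keys and the shape of the returned invoices are assumptions.

```clojure
(defprotocol Fulfillment
  (invoice [this] "Returns an invoice")
  (manifest [this] "Returns a shipping manifest"))

;; Stand-ins (assumptions): ArrayList for Order, HashMap for Contract.
;; Both are existing Java classes that we cannot modify or recompile.
(extend-protocol Fulfillment
  java.util.ArrayList
  (invoice [this] {:total (reduce + this)})
  (manifest [this] {:items (count this)})
  java.util.HashMap
  (invoice [this] {:total (.get this "totalPrice")})
  (manifest [this] {:items (.get this "quantity")}))

(def order (java.util.ArrayList. [2.5 4.0]))
(def contract (doto (java.util.HashMap.)
                (.put "totalPrice" 100.0)
                (.put "quantity" 3)))

(invoice order)       ; {:total 6.5}
(manifest order)      ; {:items 2}
(invoice contract)    ; {:total 100.0}
(manifest contract)   ; {:items 3}
```

The caller uses one function name for both types, and neither class was touched.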

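Datatypes: extending existing methods to new types

Clojure is an FP language, but it runs on the JVM and most of its types come from Java, so some OO thinking is unavoidable.
Datatypes resemble classes in OO: they bundle state (fields) with behavior (methods).

Datatypes were added to Clojure for convenience; they are a concession from FP to OO.

Before Clojure 1.2, data structures were stored in maps, for example StructMaps (defstruct, with predefined keys).

Clojure 1.2 added datatypes as a replacement for StructMaps.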

(defrecord name [fields...])
user> (defrecord Employee [name room])
user> (def emp (Employee. "John Smith" 304))
user> (:name emp)
"John Smith"
user> (:room emp)
304

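Records are defined with defrecord. How should a record be understood?
a. A record is a map, so any map function can be used on it directly.
b. A record can have additional functions defined for it, through protocols.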
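Point a. is easy to try out. The sketch below reuses the Employee record from the REPL session and applies ordinary map functions to it; the :phone key and its value are made up.

```clojure
(defrecord Employee [name room])

(def emp (Employee. "John Smith" 304))

(keys emp)                        ; (:name :room)
(:room (assoc emp :room 305))     ; 305, and emp itself is unchanged
(assoc emp :phone "555-0100")     ; extra keys are allowed, still an Employee
(dissoc emp :room)                ; removing a declared field yields a plain map

;; Unlike plain maps, the record type takes part in equality.
(= emp (Employee. "John Smith" 304))       ; true
(= emp {:name "John Smith" :room 304})     ; false
```

So a record behaves like a persistent map with a fixed set of fast fields and a type of its own.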

Under the hood, a datatype is in fact implemented as a class:

A datatype is equivalent to a Java class containing public final instance fields and implementing any number of interfaces. It does not extend any base class except java.lang.Object.

public class Employee
    implements java.io.Serializable,
               java.util.Map,
               java.lang.Iterable,
               clojure.lang.IPersistentMap {

    public final Object name;
    public final Object room;

    public Employee(Object name, Object room) {
        this.name = name;
        this.room = room;
    }
}


How are functions added to a datatype? Besides the approaches above, there are
In-Line Methods

In-line means that protocol implementations can be supplied right in the record definition.

(defrecord name [fields...]
  SomeProtocol
    (method-one [args] ... method body ...)
    (method-two [args] ... method body ...)
  AnotherProtocol
    (method-three [args] ... method body ...))

Method bodies can refer to the record's fields directly by name. (defrecord does not close over its lexical scope like fn, proxy, or reify.)

(defrecord name [x y z] 
  SomeProtocol
  (method-one [args]
    ...do stuff with x, y, and z...))
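A small runnable instance of the in-line form, assuming a made-up protocol Shape and record Rect:

```clojure
;; Shape, area, and Rect are hypothetical names.
(defprotocol Shape
  (area [this] "Returns the area of the shape."))

(defrecord Rect [width height]
  Shape
  ;; width and height are the record's fields, visible by name in the body.
  (area [this] (* width height)))

(area (Rect. 3 4))   ; 12
```

Note that the argument vector still lists the target object (this) first, even though the fields are reachable without it.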

Extending Java Interfaces

Besides protocols, a record can also implement Java interfaces directly:

user> (defrecord Pair [x y]
        java.lang.Comparable
          (compareTo [this other]
             (let [result (compare x (:x other))]
               (if (zero? result)
                 (compare y (:y other))
                 result))))
user.Pair
user> (compare (Pair. 1 2) (Pair. 1 2))
0
user> (compare (Pair. 1 3) (Pair. 1 100))
-1
Reifying Anonymous Datatypes

Sometimes you need an object that implements certain protocols or interfaces, but you do not want to create a named datatype.
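That is, you only want to implement a protocol or interface and have no data worth giving a named type, so you create an anonymous datatype with reify. As with defrecord, each method lists the target object as its first argument:

(reify
  SomeProtocol
    (method-one [this] ...)
    (method-two [this y] ...)
  AnotherProtocol
    (method-three [this] ...))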
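Another property of reify is that, unlike defrecord, it acts as a closure and can close over local variables:
user> (def thing (let [s "Capture me!"]
                    (reify java.lang.Object
                       (toString [this] s))))
#'user/thing
user> (str thing)
"Capture me!"
In the example above, the anonymous object bound to thing closes over the value of the local s.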
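Closing over locals is what makes reify handy for building small objects on demand. As a sketch, the hypothetical helper by-key below returns a java.util.Comparator that captures its argument k:

```clojure
;; by-key is a made-up helper name.
(defn by-key
  "Returns a java.util.Comparator that orders maps by the value under k."
  [k]
  (reify java.util.Comparator
    ;; Inside the body, compare refers to clojure.core/compare,
    ;; and k is the local captured from the enclosing function.
    (compare [this a b]
      (compare (k a) (k b)))))

(sort (by-key :age) [{:name "Bo" :age 30} {:name "Al" :age 20}])
;; sorts Al (20) before Bo (30)
```

Each call to by-key produces a fresh object with its own captured k, which a named record could only imitate by storing k in a field.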

 

Deftype

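defrecord is map-based, as the Java translation above makes clear.
deftype, by contrast, builds on nothing: it is a brand-new data type.

The syntax of deftype is essentially the same as that of defrecord, but deftype creates no default method implementations for you. You must implement everything yourself, including standard object methods such as equals and hashCode.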
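The difference shows up quickly at the REPL. The sketch below defines the same two fields three ways; the names RPoint, TPoint, and VPoint are invented for the comparison.

```clojure
(defrecord RPoint [x y])   ; record: map behaviour and value equality generated
(deftype TPoint [x y])     ; bare type: nothing generated beyond the fields

(= (RPoint. 1 2) (RPoint. 1 2))   ; true
(= (TPoint. 1 2) (TPoint. 1 2))   ; false, falls back to Object identity
(:x (RPoint. 1 2))                ; 1
(:x (TPoint. 1 2))                ; nil, a TPoint is not a map
(.x (TPoint. 1 2))                ; 1, plain field access still works

;; With deftype, value equality has to be written by hand.
(deftype VPoint [x y]
  Object
  (equals [this other]
    (and (instance? VPoint other)
         (= x (.x ^VPoint other))
         (= y (.y ^VPoint other))))
  (hashCode [this] (hash [x y])))

(= (VPoint. 1 2) (VPoint. 1 2))   ; true
```

When equals is overridden like this, hashCode should be overridden with it so the type behaves correctly as a key in hash-based collections.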

 

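The passage below says that fields in a deftype are immutable by default.
To make a field mutable, annotate the field itself with metadata, either ^{:volatile-mutable true} or ^{:unsynchronized-mutable true}.
The difference: a :volatile-mutable field is compiled as a Java volatile field, so a write is visible to other threads; an :unsynchronized-mutable field is a plain non-final field, so other threads are not guaranteed to see a write promptly.
Such fields can only be modified with set!, from inside the type's method bodies.
For a case where this is useful, see how the messaging code in Storm implements IContext.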

http://stackoverflow.com/questions/3132931/mutable-fields-in-clojure-deftype

deftype's default is still to have the fields be immutable; to override this, you need to annotate the names of the fields which are to be mutable with appropriate metadata. Also, the syntax for set! of instance fields is different. An example implementation to make the above work:

(deftype Point [^{:volatile-mutable true} x]
  IPoint
  (getX [_] x)
  (setX [this v] (set! x v)))

There's also :unsynchronized-mutable. The difference is as the names would suggest to an experienced Java developer. ;-) Note that providing either annotation has the additional effect of making the field private, so that direct field access is no longer possible:

(.getX (Point. 10)) ; still works
(.x (Point. 10))    ; with annotations -- IllegalArgumentException, works without

Also, 1.2 will likely support the syntax ^:volatile-mutable x as shorthand for ^{:volatile-mutable true} x (this is already available on some of the new numerics branches).

Both options are mentioned in (doc deftype); the relevant part follows -- mind the admonition!

Fields can be qualified with the metadata :volatile-mutable true or :unsynchronized-mutable true, at which point (set! afield aval) will be supported in method bodies. Note well that mutable fields are extremely difficult to use correctly, and are present only to facilitate the building of higher level constructs, such as Clojure's reference types, in Clojure itself. They are for experts only - if the semantics and implications of :volatile-mutable or :unsynchronized-mutable are not immediately apparent to you, you should not be using them.
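For readers who want to see set! on a mutable field in a self-contained form, here is a minimal single-threaded sketch. The protocol Counter and the type SimpleCounter are made up, and the admonition above applies in full: the code makes no attempt at thread safety.

```clojure
;; Counter, increment!, current, and SimpleCounter are hypothetical names.
(defprotocol Counter
  (increment! [this] "Adds one to the counter and returns the new value.")
  (current [this] "Returns the current value."))

(deftype SimpleCounter [^{:unsynchronized-mutable true} n]
  Counter
  ;; set! is only legal here, inside a method body of the type itself.
  (increment! [this] (set! n (inc n)))
  (current [this] n))

(def c (SimpleCounter. 0))
(increment! c)   ; 1
(increment! c)   ; 2
(current c)      ; 2
```

Because the field becomes private once annotated, the protocol methods are the only way for callers to read or change it. In ordinary application code an atom is usually the safer choice.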

 

Summary

Clojure's protocols and datatypes offer a simple and elegant solution to the Expression Problem.


Usage guidelines

Datatypes and protocols are a significant new feature in Clojure, and they will have a major impact on how most Clojure programs are written.
Standards and best practices are still developing, but a few guidelines have emerged:
• Prefer reify to proxy unless you need to override base class methods.
• Prefer defrecord to gen-class unless you need gen-class features for Java interoperability.
• Prefer defrecord to defstruct in all cases.
• Specify your abstractions as protocols, not interfaces.
• Prefer protocols to multimethods for the case of single-argument type-based dispatch. Keep multimethods for more complex dispatch.
• Add type hints only where necessary for disambiguation or performance (Chapter 14); most types will be inferred automatically.
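The guideline on protocols versus multimethods can be illustrated with a short sketch. All names here (Priced, Gadget, shipping-cost, the destinations, and the cost figures) are invented: price dispatches on the type of a single argument, so a protocol fits, while shipping-cost dispatches on the item's type together with a second value, which is a job for a multimethod.

```clojure
;; Single-argument, type-based dispatch: use a protocol.
(defprotocol Priced
  (price [item] "Returns the price of item."))

(defrecord Gadget [label unit-price]
  Priced
  (price [_] unit-price))

;; Dispatch on more than the first argument's type: use a multimethod.
(defmulti shipping-cost
  (fn [item destination] [(class item) destination]))

(defmethod shipping-cost [Gadget :domestic]
  [item _]
  5)

(defmethod shipping-cost [Gadget :international]
  [item _]
  (+ 5 (quot (price item) 10)))

(def g (Gadget. "bolt" 20))

(price g)                          ; 20
(shipping-cost g :domestic)        ; 5
(shipping-cost g :international)   ; 7
```

The two mechanisms compose: the multimethod body calls the protocol function, so each is used for the kind of dispatch it handles best.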

Datatypes and protocols do not remove any existing features: defstruct, gen-class, proxy, and multimethods are all still there. Only defstruct is likely to be deprecated.
The major difference between Java classes and protocols/datatypes is the lack of inheritance, which avoids the complexity that comes with it.
The protocol extension mechanism is designed to enable method reuse without concrete inheritance and its associated problems.
