Java Q&A


Attack of the clones


Time and space considerations in four different approaches to implementing deep clone() methods


By Vladimir Roubtsov
Translated by K ][ N G of A R K™ [Revision 0.1]

January 24, 2003

Q: From a performance point of view, what are the advantages and disadvantages of implementing deep cloning via Java serialization versus the built-in Object.clone() method?

A: Equipping the classes in your application with a correct clone() implementation is essential to many defensive programming patterns. Common examples include defensive copying of method parameters, cloning internal fields before returning them in getters, implementing immutability patterns, implementing containers with deep cloning semantics, and so on.
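As a small illustration of the first two patterns (defensive copying of parameters and cloning fields in getters), here is a minimal sketch. The TimeWindow class and its members are hypothetical names chosen for this example; they are not part of the article's test code.

```java
import java.util.Date;

// Hypothetical class for illustration; it does not appear in the article.
final class TimeWindow {
    private final Date start;
    private final Date end;

    TimeWindow(Date start, Date end) {
        // Defensive copies of the mutable parameters, taken before validation.
        this.start = (Date) start.clone();
        this.end = (Date) end.clone();
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException("start is after end");
        }
    }

    // Getters clone the internal fields before returning them.
    Date getStart() { return (Date) start.clone(); }
    Date getEnd()   { return (Date) end.clone(); }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        Date s = new Date(1000L);
        TimeWindow w = new TimeWindow(s, new Date(2000L));
        s.setTime(5000L);          // mutate the caller's object
        w.getEnd().setTime(0L);    // mutate a returned copy
        System.out.println(w.getStart().getTime()); // prints 1000
        System.out.println(w.getEnd().getTime());   // prints 2000
    }
}
```

Neither mutation reaches the state held inside TimeWindow, which is the point of the pattern; the price is one clone() call per parameter and per getter, which is why the cost of clone() matters.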

Even though the question mentions just two possibilities, there are at least four distinct approaches to clone() implementation. In this Java Q&A installment, I consider design and performance tradeoffs involved in all of them.

Because cloning is so customizable, this article's examples will not necessarily translate directly to your own code; however, the general conclusions we will reach should provide useful guidelines in any application design.

Note the following disclaimer: What exactly constitutes a deep clone is debatable. Even though two objects can safely share the same String reference viewed as data, they cannot if the same field is used as an instance-scoped object monitor (such as when calling Object.wait()/notify() on it) or if field instance identity (such as when using the == operator) is significant to the design. In the end, whether or not a field is shareable depends on the class design. For simplicity, I assume below that all fields are used as pure data.

Performance measurements setup


Let's jump right into some code. I use the following simple hierarchy of classes as my cloning guinea pig:

import java.io.Serializable;

public class TestBaseClass
             implements Cloneable, Serializable
{
    public TestBaseClass (String dummy)
    {
        m_byte = (byte) 1;
        m_short = (short) 2;
        m_long = 3L;
        m_float = 4.0F;
        m_double = 5.0;
        m_char = '6';
        m_boolean = true;

        m_int = 16;
        m_string = "some string in TestBaseClass";
        
        m_ints = new int [m_int];
        for (int i = 0; i < m_ints.length; ++ i) m_ints [i] = m_int;
        
        m_strings = new String [m_int];
        m_strings [0] = m_string; // invariant: m_strings [0] == m_string
        for (int i = 1; i < m_strings.length; ++ i)
            m_strings [i] = new String (m_string);
    }

    public TestBaseClass (final TestBaseClass obj)
    {
        if (obj == null) throw new IllegalArgumentException ("null input: obj");
        
        // Copy all fields:
        
        m_byte = obj.m_byte;
        m_short = obj.m_short;
        m_long = obj.m_long;
        m_float = obj.m_float;
        m_double = obj.m_double;
        m_char = obj.m_char;
        m_boolean = obj.m_boolean;
        
        m_int = obj.m_int;
        m_string = obj.m_string;
        
        if (obj.m_ints != null) m_ints = (int []) obj.m_ints.clone ();
        if (obj.m_strings != null) m_strings = (String []) obj.m_strings.clone ();
    }
    
    // Cloneable:
    public Object clone ()
    {
        if (Main.OBJECT_CLONE)
        {
            try
            {
                // Chain shallow field work to Object.clone():
                final TestBaseClass clone = (TestBaseClass) super.clone ();
                
                // Set deep fields:
                if (m_ints != null)
                    clone.m_ints = (int []) m_ints.clone ();
                if (m_strings != null)
                    clone.m_strings = (String []) m_strings.clone ();

                return clone;
            }
            catch (CloneNotSupportedException e)
            {
                throw new InternalError (e.toString ());
            }
        }
        else if (Main.COPY_CONSTRUCTOR)
            return new TestBaseClass (this);
        else if (Main.SERIALIZATION)
            return SerializableClone.clone (this);
        else if (Main.REFLECTION)
            return ReflectiveClone.clone (this);
        else
            throw new RuntimeException ("select cloning method");
    }
    
    protected TestBaseClass () {} // accessible to subclasses only
    

    private byte m_byte;
    private short m_short;
    private long m_long;
    private float m_float;
    private double m_double;
    private char m_char;
    private boolean m_boolean;

    private int m_int;
    private int [] m_ints;
    private String m_string;
    private String [] m_strings; // invariant: m_strings [0] == m_string    
} // end of class


import java.io.Serializable;

public final class TestClass extends TestBaseClass
             implements Cloneable, Serializable
{
    public TestClass (String dummy)
    {
        super (dummy);
        
        m_int = 4;
        
        m_object1 = new TestBaseClass (dummy);
        m_object2 = m_object1; // invariant: m_object1 == m_object2
    
        m_objects = new Object [m_int];
        for (int i = 0; i < m_objects.length; ++ i)
            m_objects [i] = new TestBaseClass (dummy);
    }

    public TestClass (final TestClass obj)
    {
        // Chain to super copy constructor:
        super (obj);
        
        // Copy all fields declared by this class:
        
        m_int = obj.m_int;
        
        if (obj.m_object1 != null)
            m_object1 = ((TestBaseClass) obj.m_object1).clone ();
        m_object2 = m_object1; // preserve the invariant
        
        if (obj.m_objects != null)
        {
            m_objects = new Object [obj.m_objects.length];
            for (int i = 0; i < m_objects.length; ++ i)
                m_objects [i] = ((TestBaseClass) obj.m_objects [i]).clone ();
        }
    }
    
    // Cloneable:
    public Object clone ()
    {
        if (Main.OBJECT_CLONE)
        {
            // Chain shallow field work to Object.clone():
            final TestClass clone = (TestClass) super.clone ();
            
            // Set only deep fields declared by this class:
            
            if (m_object1 != null)
                clone.m_object1 = ((TestBaseClass) m_object1).clone ();
            clone.m_object2 = clone.m_object1; // preserve the invariant
            
            if (m_objects != null)
            {
                clone.m_objects = (Object []) m_objects.clone ();
                for (int i = 0; i < m_objects.length; ++ i)
                    clone.m_objects [i] = ((TestBaseClass) m_objects [i]).clone ();
            }

            return clone;
        }
        else if (Main.COPY_CONSTRUCTOR)
            return new TestClass (this);
        else if (Main.SERIALIZATION)
            return SerializableClone.clone (this);
        else if (Main.REFLECTION)
            return ReflectiveClone.clone (this);
        else
            throw new RuntimeException ("select cloning method");
    }
        
    protected TestClass () {} // accessible to subclasses only

    private int m_int;        
    private Object m_object1, m_object2; // invariant: m_object1 == m_object2
    private Object [] m_objects;
} // End of class

TestBaseClass has several fields of primitive types as well as a String and a couple of array fields. TestClass both extends TestBaseClass and aggregates several instances of it. This setup allows us to see how inheritance, member object ownership, and data types can affect cloning design and performance.

In a previous Java Q&A article, I developed a simple timing library that comes in handy now. This code in class Main measures the cost of TestClass.clone():

        // Create an ITimer:
        final ITimer timer = TimerFactory.newTimer ();
        
        // JIT/hotspot warmup:
        // ...

        TestClass obj = new TestClass ();
        
        // Warm up clone():
        // ...

        final int repeats = 1000;
        
        timer.start ();
        // Note: the loop is unrolled 10 times
        for (int i = 0; i < repeats / 10; ++ i)
        {
            obj = (TestClass) obj.clone ();
            // ... repeated 10 times ...
        }
        timer.stop ();
        
        final DecimalFormat format = new DecimalFormat ();
        format.setMinimumFractionDigits (3);
        format.setMaximumFractionDigits (3);
        
        System.out.println ("method duration: " +
            format.format (timer.getDuration () / repeats) + " ms");

I use the high-resolution timer supplied by TimerFactory with a loop that creates a moderate number of cloned objects. The elapsed time reading is reliable, and there is little interference from the garbage collector. Note how the obj variable continuously updates to avoid memory caching effects.
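For readers who do not have the ITimer library at hand, the same measurement idea can be sketched with System.nanoTime(). This is only an approximation under stated assumptions: CloneTimingSketch and averageCloneMillis are hypothetical names, an ArrayList stands in for TestClass, and the warm-up count is arbitrary.

```java
import java.util.ArrayList;

class CloneTimingSketch {
    // Returns the average cost of one clone() call in milliseconds.
    static double averageCloneMillis(ArrayList<Integer> seed, int repeats) {
        Object obj = seed;

        // Warm-up so the JIT has a chance to compile clone() before timing.
        for (int i = 0; i < 10_000; ++i) {
            obj = ((ArrayList<?>) obj).clone();
        }

        long start = System.nanoTime();
        for (int i = 0; i < repeats; ++i) {
            // Reassign obj on every pass, as the article's loop does.
            obj = ((ArrayList<?>) obj).clone();
        }
        long elapsedNanos = System.nanoTime() - start;

        // Use the final clone so the loop cannot be optimized away entirely.
        if (((ArrayList<?>) obj).size() != seed.size()) {
            throw new IllegalStateException("clone lost elements");
        }
        return elapsedNanos / 1e6 / repeats;
    }

    public static void main(String[] args) {
        ArrayList<Integer> seed = new ArrayList<>();
        for (int i = 0; i < 16; ++i) seed.add(i);
        System.out.println("method duration: "
            + averageCloneMillis(seed, 1000) + " ms");
    }
}
```

The absolute numbers from such a loop depend heavily on the JVM and hardware, so they are useful only for comparing approaches on the same machine.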

Also note how clone() is implemented in both classes. Each class in fact contains four implementations, selected one at a time using four conditional compilation constants in Main: OBJECT_CLONE, COPY_CONSTRUCTOR, SERIALIZATION, and REFLECTION. Because compile-time constants are inlined into the classes that use them, recompile all of the classes when changing the cloning approach.

Let's now examine each approach in detail.

Approach 1: Cloning by chaining to Object.clone()


This is perhaps the most classical approach. The steps involved are:

  • Declare your class to implement the Cloneable marker interface.
  • Provide a public clone override that always begins with a call to super.clone() followed by manual copying of all deep fields (i.e., mutable fields that are object references and cannot be shared between several instances of the parent class).
  • Declare your clone override not to throw any exceptions, including CloneNotSupportedException. To this effect, the clone() method in your hierarchy's first class that subclasses a non-Cloneable class will catch CloneNotSupportedException and wrap it into an InternalError.
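The three steps above can be sketched on a class much smaller than TestBaseClass. Histogram and its members are hypothetical names used only for this illustration.

```java
final class Histogram implements Cloneable {      // step 1: marker interface
    private int[] counts = new int[4];            // deep field: mutable array
    private String label = "demo";                // shallow field: immutable String

    void increment(int bucket) { ++counts[bucket]; }
    int count(int bucket) { return counts[bucket]; }
    String label() { return label; }

    @Override
    public Object clone() {                       // step 3: no checked exceptions
        try {
            // Step 2: begin with super.clone(), which copies every field shallowly...
            Histogram copy = (Histogram) super.clone();
            // ...then replace each deep field with its own copy.
            copy.counts = counts.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new InternalError(e.toString());
        }
    }
}

class HistogramCloneDemo {
    public static void main(String[] args) {
        Histogram original = new Histogram();
        original.increment(1);
        Histogram copy = (Histogram) original.clone();
        copy.increment(1);
        System.out.println(original.count(1)); // prints 1
        System.out.println(copy.count(1));     // prints 2
    }
}
```

The label field needs no extra work because String is immutable; only the array has to be copied by hand.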

Correct implementation of Cloneable easily deserves a separate article. Because my focus is on measuring performance, I will repeat only the relevant points here and direct readers to existing references for further details (see Resources).

This traditional approach is particularly well suited to the presence of inheritance because the chain of super.clone() eventually calls the native java.lang.Object.clone() implementation. This is good for two reasons. First, this native method has the magic ability to always create an instance of the most derived class for the current object. That is, the result of super.clone() in TestBaseClass is an instance of TestClass when TestBaseClass.clone() is part of the chain of methods originating from TestClass.clone(). This makes it easy to implement the desirable x.clone().getClass() == x.getClass() invariant even in the presence of inheritance.
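The most-derived-class behavior is easy to observe in isolation. In this sketch, Vehicle and Truck are hypothetical classes; only the base class overrides clone(), yet cloning a Truck still yields a Truck.

```java
class Vehicle implements Cloneable {
    @Override
    public Object clone() {
        try {
            // Object.clone() allocates an instance of the runtime class.
            return super.clone();
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString());
        }
    }
}

class Truck extends Vehicle {
    double payloadTons = 1.5;
    // No clone() override: the inherited method still returns a Truck.
}

class MostDerivedCloneDemo {
    public static void main(String[] args) {
        Vehicle x = new Truck();
        Object copy = x.clone();
        System.out.println(copy.getClass() == x.getClass()); // prints true
        System.out.println(((Truck) copy).payloadTons);      // prints 1.5
    }
}
```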

Second, if you examine the JVM sources, you will see that at the heart of java.lang.Object.clone() is the memcpy C function, usually implemented in very efficient assembly on a given platform; so I expect the method to act as a fast "bit-blasting" shallow clone implementation, replicating all shallow fields in one fell swoop. In many cases, the only remaining manual coding is done to deeply clone object reference fields that point to unshareable mutable objects.



Running the test with the OBJECT_CLONE variable set to true on a Windows 550-MHz machine with Sun Microsystems' JDK 1.4.1 produces:

clone implementation: Object.clone()
method duration: 0.033 ms

This is not bad for a class with multiple primitive and object reference fields. But for better insight, I must compare the result with other approaches below.



Despite its advantages, this approach is plagued with problems due to poor java.lang.Object.clone() design. It cannot be used for cloning final fields unless they can be copied shallowly. Creating smart, deeply cloning container classes is complicated by the fact that Cloneable is just a marker interface, and java.lang.Object.clone() is not public. Finally, cloning inner classes does not work due to problems with outer references. See articles by Mark Davis and Steve Ball in Resources for some of the earliest discussions about this topic.
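The final-field limitation can be made concrete with a small sketch. FixedBuffer is a hypothetical class: because its array field is final, clone() cannot assign a fresh copy to it, so the clone ends up sharing the original's array.

```java
final class FixedBuffer implements Cloneable {
    private final int[] data = {1, 2, 3};   // final and mutable

    int[] data() { return data; }

    @Override
    public Object clone() {
        try {
            FixedBuffer copy = (FixedBuffer) super.clone();
            // copy.data = data.clone();   // would not compile: data is final
            return copy;                    // so the array remains shared
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString());
        }
    }
}

class FinalFieldCloneDemo {
    public static void main(String[] args) {
        FixedBuffer original = new FixedBuffer();
        FixedBuffer copy = (FixedBuffer) original.clone();
        copy.data()[0] = 99;
        System.out.println(original.data()[0]); // prints 99: state leaked
    }
}
```

The usual workaround is to drop the final modifier, which gives up a compile-time guarantee purely to make cloning possible.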

Approach 2: Cloning via copy construction


This approach complements Approach 1. It involves these steps:

  • For every class X, provide a copy constructor with signature X(X x).
  • Chain to the base class's copy constructor in all but the first class in your hierarchy. You can chain to clone() or directly to the base copy constructor. The former choice is more polymorphic and works when the base copy constructor is private, and the latter sometimes avoids the small cost of casting clone()'s return value to a specific type.
  • Following the chaining call, set all class fields by copying them from the input parameter. For every object reference field, you decide individually whether to clone it deeply.
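A compact sketch of these steps, using hypothetical Ledger and TaggedLedger classes, shows the chaining and also how a copy constructor can deep-copy a final field, since constructors are allowed to assign final fields.

```java
class Ledger {
    private final long[] entries;             // final, yet deep-copied below

    Ledger(int size) { entries = new long[size]; }

    Ledger(Ledger other) {                    // copy constructor X(X x)
        entries = other.entries.clone();      // a constructor may assign a final field
    }

    void set(int i, long v) { entries[i] = v; }
    long get(int i) { return entries[i]; }
}

final class TaggedLedger extends Ledger {
    private final String tag;                 // immutable, safe to share

    TaggedLedger(int size, String tag) { super(size); this.tag = tag; }

    TaggedLedger(TaggedLedger other) {
        super(other);                         // chain to the base copy constructor
        tag = other.tag;                      // then copy this class's own fields
    }

    String tag() { return tag; }
}

class CopyConstructorDemo {
    public static void main(String[] args) {
        TaggedLedger a = new TaggedLedger(2, "x");
        a.set(0, 5L);
        TaggedLedger b = new TaggedLedger(a);
        b.set(0, 9L);
        System.out.println(a.get(0)); // prints 5
        System.out.println(b.get(0)); // prints 9
    }
}
```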

Setting COPY_CONSTRUCTOR to true and rerunning the test produces:

clone implementation: copy construction
method duration: 0.024 ms

This beats Approach 1. The result might not be surprising because the overhead of native method calls has increased and the cost of new object creation has decreased with increasing JDK versions. If you rerun the same tests in Sun's JDK 1.2.2, the situation favors Approach 1. Of course, performance depends on the relative mix of shallow and deep fields in the class hierarchy. Classes with many primitive type fields benefit more from Approach 1. Classes with a few mostly immutable fields work very efficiently with Approach 2, with a speed advantage at least 10 times greater than Approach 1.

Approach 2 is more error prone than Approach 1 because it is easy to forget to override clone() and accidentally inherit a superclass's version that will return an object of the wrong type. If you make the same mistake in Approach 1, the result will be less disastrous. Additionally, it is harder to maintain the implementation in Approach 2 when fields are added and removed (compare the OBJECT_CLONE branch in TestBaseClass.clone() with similar code in the copy constructor). Also, Approach 1 requires less class cooperation in some cases: for a base class with only shallow fields, you don't need to implement Cloneable or even provide a clone() override if you do not intend to clone at the base class level.
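The wrong-type pitfall is worth seeing once. In this sketch, Note and StickyNote are hypothetical classes; the subclass forgets to override clone(), so the inherited copy-constructor-based version quietly returns a Note.

```java
class Note implements Cloneable {
    protected String text = "note";

    Note() {}
    Note(Note other) { text = other.text; }

    @Override
    public Object clone() { return new Note(this); }   // Approach 2
}

class StickyNote extends Note {
    String color = "yellow";
    // Mistake: no copy constructor and no clone() override here.
}

class ForgottenOverrideDemo {
    public static void main(String[] args) {
        Object copy = new StickyNote().clone();
        // The clone has lost its subclass identity and its color field.
        System.out.println(copy.getClass().getSimpleName()); // prints Note
    }
}
```

Had Note.clone() chained to super.clone() as in Approach 1, the result would still be a StickyNote with a shallow copy of color, which is why the same omission is less disastrous there.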

However, an undeniable advantage of cloning via copy construction is that it can handle both final fields and inner classes. But due to dangers present when inheritance is involved, I recommend using this sparingly and preferably simultaneously with making the relevant classes final.

Approach 3: Cloning via Java serialization


Java serialization is convenient. Many classes are made serializable by simply declaring them to implement java.io.Serializable. Thus, a whole hierarchy of classes can be made cloneable by deriving them from a base Serializable class whose clone() is implemented as a simple, yet extremely generic method:

    public static Object clone (Object obj)
    {
        try
        {
            ByteArrayOutputStream out = new ByteArrayOutputStream ();
            ObjectOutputStream oout = new ObjectOutputStream (out);
            oout.writeObject (obj);
            
            ObjectInputStream in = new ObjectInputStream (
                new ByteArrayInputStream (out.toByteArray ()));
            return in.readObject ();
        }
        catch (Exception e)
        {
            throw new RuntimeException ("cannot clone class [" +
                obj.getClass ().getName () + "] via serialization: " +
                e.toString ());
        }
    }

This is so generic it can be used for cloning classes that can be written and added to your application by someone else long after you provide the base classes. But this convenience comes at a price. After switching TestBaseClass.clone() and TestClass.clone() to the SERIALIZATION branch I get:

clone implementation: serialization
method duration: 2.724 ms

This is roughly 100 times slower than Approaches 1 and 2. You probably would not want this option for defensive cloning of parameters of otherwise fast intra-JVM methods. Even though this method can be used for generic containers with deep cloning semantics, cloning a few hundred objects would make you see times in the one-second range: a doubtful prospect.

There are several reasons why this approach is so slow. Serialization depends on reflective discovery of class metadata, known to be much slower than normal method calls. Furthermore, because a temporary input/output (I/O) stream is used to flatten the entire object, the process involves UTF-8 (Unicode Transformation Format, 8-bit) encoding and writing out every character of, say, TestBaseClass.m_string. Compared to that, Approaches 1 and 2 only copy String references; each copy step has the same small fixed cost.

What's even worse, ObjectOutputStream and ObjectInputStream perform a lot of unnecessary work. For example, writing out class metadata (class name, field names, metadata checksum, etc.) that may need to be reconciled with a different class version on the receiving end is pure overhead when you serialize a class within the same ClassLoader namespace.
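The metadata overhead can be glimpsed by serializing a very small object and inspecting the bytes. This is a rough sketch: TinyValue and SerializedSizeDemo are hypothetical names, and the exact byte count is not asserted because it depends on the class and package names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class TinyValue implements Serializable {
    private static final long serialVersionUID = 1L;
    int payload = 42;   // 4 bytes of actual data
}

class SerializedSizeDemo {
    static byte[] serialize(Object obj) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ObjectOutputStream oout = new ObjectOutputStream(out)) {
                oout.writeObject(obj);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new TinyValue());
        // ISO-8859-1 maps each byte to one char, so ASCII names stay readable.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(bytes.length + " bytes for a 4-byte payload");
        System.out.println(raw.contains("TinyValue")); // class name is in the stream
        System.out.println(raw.contains("payload"));   // so is the field name
    }
}
```

Everything beyond the four payload bytes is stream header and class descriptor, which a same-JVM clone has no use for.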

On the plus side, serialization imposes fairly light constructor requirements (the first non-Serializable superclass must have an accessible no-arg constructor) and correctly handles final fields and inner classes. This is because native code constructs the clone and populates its fields without using any constructors (something that can't be done in pure Java).

One more interesting advantage of Approach 3 is that it can preserve the structure of the object graph rooted at the source object. Examine the dummy TestBaseClass constructor. It stores the same m_string reference in m_strings[0] and fills the remaining slots with distinct String copies. Without any special effort on our part, the invariant m_strings[0] == m_string is preserved in the cloned object. In Approaches 1 and 2, the same effect is either purely incidental (such as when immutable objects remain shared by reference) or requires explicit coding (as with m_object1 and m_object2 in TestClass). The latter could be hard to get right in general, especially when object identities are established at runtime and not compile time (as is the case with TestClass).
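Graph preservation is easiest to see with a mutable shared object, where sharing cannot happen by accident. In this sketch, SharedPair and SerializationGraphDemo are hypothetical names, and cloneViaSerialization is a self-contained variant of the serialization clone idea with the streams closed explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SharedPair implements Serializable {
    private static final long serialVersionUID = 1L;
    int[] first = {1, 2, 3};
    int[] second = first;     // invariant: first == second
}

class SerializationGraphDemo {
    static Object cloneViaSerialization(Object obj) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ObjectOutputStream oout = new ObjectOutputStream(out)) {
                oout.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(out.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException("cannot clone via serialization: " + e, e);
        }
    }

    public static void main(String[] args) {
        SharedPair original = new SharedPair();
        SharedPair copy = (SharedPair) cloneViaSerialization(original);
        System.out.println(copy.first == copy.second);     // prints true
        System.out.println(copy.first == original.first);  // prints false
    }
}
```

The stream records the second reference as a back-reference to the array already written, so the clone gets one new array referenced twice rather than two separate arrays.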

Approach 4: Cloning via Java reflection


Approach 4 draws inspiration from Approach 3. Anything that uses reflection can work on a variety of classes in a generic way. If I require the class in question to have a (not necessarily public) no-arg constructor, I can easily create an empty instance using reflection. It is especially efficient when the no-arg constructor doesn't do anything. Then it is a straightforward matter to walk the class's inheritance chain all the way to Object.class and set all (not just public) declared instance fields for each superclass in the chain. For each field, I check whether it contains a primitive value, an immutable object reference, or an object reference that needs to be cloned recursively. The idea is straightforward but getting it to work well requires handling a few details. My full demo implementation is in class ReflectiveClone, available as a separate download. Here is the pseudo-code of the full implementation, with some details and all error handling omitted for simplicity:

public abstract class ReflectiveClone
{
    /**
     * Makes a reflection-based deep clone of "obj". This method is mutually
     * recursive with {@link #setFields}.
     *
     * @param obj current source object being cloned
     * @return obj's deep clone [never null; can be == to "obj"]
     */
    public static Object clone (final Object obj)
    {        
        final Class objClass = obj.getClass ();
        final Object result;
                
        if (objClass.isArray ())
        {          
            final int arrayLength = Array.getLength (obj);
            
            if (arrayLength == 0) // empty arrays are immutable
                return obj;
            else
            {                      
                final Class componentType = objClass.getComponentType ();
                
                // Even though arrays implicitly have a public clone(), it
                // cannot be invoked reflectively, so need to do copy construction:
                
                result = Array.newInstance (componentType, arrayLength);
                
                if (componentType.isPrimitive () ||
                    FINAL_IMMUTABLE_CLASSES.contains (componentType))
                {
                    System.arraycopy (obj, 0, result, 0, arrayLength);
                }
                else
                {
                    for (int i = 0; i < arrayLength; ++ i)
                    {
                        // Recursively clone each array slot:
                        final Object slot = Array.get (obj, i);
                        if (slot != null)
                        {
                            final Object slotClone = clone (slot);
                            Array.set (result, i, slotClone);
                        }
                    }
                }
                
                return result;
            }
        }
        else if (FINAL_IMMUTABLE_CLASSES.contains (objClass))
        {
            return obj;
        }
        
        // Fall through to reflectively populating an instance created
        // via a no-arg constructor:

        // clone = objClass.newInstance () can't handle private constructors:
            
        Constructor noarg = objClass.getDeclaredConstructor (EMPTY_CLASS_ARRAY);
        if ((Modifier.PUBLIC & noarg.getModifiers ()) == 0)
        {
            noarg.setAccessible (true);
        }

        result = noarg.newInstance (EMPTY_OBJECT_ARRAY);
        
        for (Class c = objClass; c != Object.class; c = c.getSuperclass ())
        {
            setFields (obj, result, c.getDeclaredFields ());
        }
        
        return result;
    }    

    /**
     * This method copies all declared "fields" from "src" to "dest".
     *
     * @param src source object
     * @param dest src's clone [not fully populated yet]
     * @param fields fields to be populated
     */
    private static void setFields (final Object src, final Object dest,
                                   final Field [] fields)
    {
        for (int f = 0, fieldsLength = fields.length; f < fieldsLength; ++ f)
        {            
            final Field field = fields [f];
            final int modifiers = field.getModifiers ();
            
            if ((Modifier.STATIC & modifiers) != 0) continue;
            
            // Can also skip transient fields here if you want reflective
            // cloning to be more like serialization.

            if ((Modifier.FINAL & modifiers) != 0)
                throw new RuntimeException ("cannot set final field " +
                field.getName () + " of class " + src.getClass ().getName ());
            
            if ((Modifier.PUBLIC & modifiers) == 0) field.setAccessible (true);
            
            Object value = field.get (src);
            
            if (value == null)
                field.set (dest, null);
            else
            {
                final Class valueType = value.getClass ();
                
                if (! valueType.isPrimitive () &&
                    ! FINAL_IMMUTABLE_CLASSES.contains (valueType))
                {
                    // Value is an object reference, and it could be either an
                    // array or of some mutable type: try to clone it deeply
                    // to be on the safe side.
                        
                    value = clone (value);
                }
                
                field.set (dest, value);
            }
        }
    }

    private static final Set FINAL_IMMUTABLE_CLASSES; // Set in <clinit>
    private static final Object [] EMPTY_OBJECT_ARRAY = new Object [0];
    private static final Class [] EMPTY_CLASS_ARRAY = new Class [0];
    
    static
    {
        FINAL_IMMUTABLE_CLASSES = new HashSet (17);
        
        // Add some common final/immutable classes:
        FINAL_IMMUTABLE_CLASSES.add (String.class);
        FINAL_IMMUTABLE_CLASSES.add (Byte.class);
        ...
        FINAL_IMMUTABLE_CLASSES.add (Boolean.class);
    }
} // End of class

Note the use of java.lang.reflect.AccessibleObject.setAccessible() to gain access to nonpublic fields. Of course, this requires sufficient security privileges.
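The role of setAccessible() can be isolated in a few lines. Vault and PrivateFieldCopyDemo are hypothetical names; the sketch copies a private field between two instances the same way setFields does, and it assumes no security manager or module restriction blocks the call.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class Vault {
    private int code;
    Vault(int code) { this.code = code; }
    int code() { return code; }
}

class PrivateFieldCopyDemo {
    // Copies every declared instance field of Vault from src to dest.
    static void copyFields(Vault src, Vault dest) {
        try {
            for (Field field : Vault.class.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) continue;
                // Without this call, field.get() on a private field of another
                // class throws IllegalAccessException.
                field.setAccessible(true);
                field.set(dest, field.get(src));
            }
        } catch (IllegalAccessException e) {
            throw new RuntimeException("cannot copy fields: " + e, e);
        }
    }

    public static void main(String[] args) {
        Vault src = new Vault(7);
        Vault dest = new Vault(0);
        copyFields(src, dest);
        System.out.println(dest.code()); // prints 7
    }
}
```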

Since the introduction of JDK 1.3, setting final fields via reflection is no longer possible (see Note 1 in Resources); so, this approach resembles Approach 1 because it can't handle final fields. Note also that inner classes cannot have no-arg constructors by definition (see Note 2 in Resources), so this approach will not work for them either.
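The inner class restriction can be checked directly with reflection. In this sketch, Engine and its inner class Piston are hypothetical; although Piston declares a constructor with no parameters in source, the compiler adds the enclosing Engine instance as a hidden first parameter.

```java
import java.lang.reflect.Constructor;

class Engine {
    class Piston {          // non-static inner class
        Piston() {}         // looks like a no-arg constructor in source
    }
}

class InnerConstructorDemo {
    // Parameter types of Piston's only declared constructor.
    static Class<?>[] pistonConstructorParams() {
        Constructor<?> ctor = Engine.Piston.class.getDeclaredConstructors()[0];
        return ctor.getParameterTypes();
    }

    // True only if reflection can find a constructor taking no arguments.
    static boolean hasReflectiveNoArgConstructor() {
        try {
            Engine.Piston.class.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(pistonConstructorParams().length);   // prints 1
        System.out.println(hasReflectiveNoArgConstructor());    // prints false
    }
}
```

A reflective cloner that looks up a no-arg constructor therefore fails on such classes with NoSuchMethodException.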

Coupled with the no-arg constructor requirement, this approach restricts the type of classes it can handle. But you would be surprised how far it can go. The full implementation adds a few useful features. While traversing the object graph rooted at th
