Java is a safe programming language that prevents programmers from making a lot of stupid mistakes, most of which are related to memory management. But there is a way to make such mistakes intentionally, using the Unsafe class.
This article is a quick overview of the sun.misc.Unsafe public API and a few interesting cases of its usage.
Before using it, we need to obtain an instance of the Unsafe object. There is no simple way to do it like Unsafe unsafe = new Unsafe(), because the Unsafe class has a private constructor. It also has a static getUnsafe() method, but if you naively try to call Unsafe.getUnsafe() you will probably get a SecurityException. This method is available only to trusted code.
This is how Java validates whether code is trusted: it just checks that the calling code was loaded by the primary (bootstrap) class loader.
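The idea behind that check can be illustrated with a small sketch. It is not the JDK's actual code; the class and method names (TrustCheck, loadedByBootstrapLoader) are made up for illustration. It only relies on the fact that classes loaded by the bootstrap class loader report null from getClassLoader().

```java
class TrustCheck {
    // Simplified stand-in for the trust check: core classes loaded by the
    // bootstrap class loader return null from getClassLoader().
    static boolean loadedByBootstrapLoader(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // A core class such as String is loaded by the bootstrap loader.
        System.out.println(loadedByBootstrapLoader(String.class));
        // A class from the application class path normally is not.
        System.out.println(loadedByBootstrapLoader(TrustCheck.class));
    }
}
```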
We can make our code “trusted”. Use the bootclasspath option (-Xbootclasspath) when running your program and specify the path to the system classes plus your own class that will use Unsafe.
But it’s too hard.
The Unsafe class contains an instance of itself called theUnsafe, which is marked as private. We can steal that variable via Java reflection.
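A minimal sketch of the reflection approach follows. The helper class name UnsafeAccess is an assumption for illustration; the field name theUnsafe comes from the text above. It assumes a JDK where sun.misc.Unsafe is still shipped and reachable by reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class UnsafeAccess {
    // Reads the private static field "theUnsafe" via reflection.
    static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null); // static field, so no receiver object
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = getUnsafe();
        System.out.println("address size: " + unsafe.addressSize());
        System.out.println("page size: " + unsafe.pageSize());
    }
}
```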
Note: Ignore your IDE. For example, Eclipse shows the error “Access restriction…”, but if you run the code, everything works just fine. If the error is annoying, configure the IDE’s compiler settings to ignore errors on Unsafe usage.
The sun.misc.Unsafe class consists of 105 methods. There are, actually, a few groups of important methods for manipulating various entities. Here are some of them:
addressSize
pageSize
allocateInstance
objectFieldOffset
staticFieldOffset
defineClass
defineAnonymousClass
ensureClassInitialized
arrayBaseOffset
arrayIndexScale
monitorEnter
tryMonitorEnter
monitorExit
compareAndSwapInt
putOrderedInt
allocateMemory
copyMemory
freeMemory
getAddress
getInt
putInt
The allocateInstance method can be useful when you need to skip the object initialization phase, bypass security checks in a constructor, or create an instance of a class that has no public constructor. Consider a class whose constructor assigns a non-default value to a field.
Instantiating it using a constructor, reflection and Unsafe gives different results: the first two run the constructor, while allocateInstance leaves the field at its default value.
Just think what happens to all your Singletons.
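The difference between the three ways of instantiation can be sketched as follows. The class names Sample and AllocateInstanceDemo and the value 1 are assumptions made for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical sample class: the constructor overrides the default value 0.
class Sample {
    private long a;
    Sample() { this.a = 1; }
    long a() { return a; }
}

class AllocateInstanceDemo {
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    static long viaConstructor() {
        return new Sample().a();
    }

    static long viaReflection() {
        try {
            return Sample.class.getDeclaredConstructor().newInstance().a();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static long viaUnsafe() {
        try {
            // No constructor runs here, so the field keeps its default value.
            Sample s = (Sample) loadUnsafe().allocateInstance(Sample.class);
            return s.a();
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaConstructor()); // prints 1
        System.out.println(viaReflection());  // prints 1
        System.out.println(viaUnsafe());      // prints 0
    }
}
```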
Memory corruption is usual for every C programmer. By the way, it’s a common technique for bypassing security.
Consider a simple class that checks access rules.
The client code is very secure and calls giveAccess() to check access rules. Unfortunately for clients, it always returns false. Only privileged users can somehow change the value of the ACCESS_ALLOWED constant and get access.
In fact, that’s not true. Anyone can locate the field with objectFieldOffset and overwrite it with putInt.
After that, all clients will get unlimited access.
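A sketch of such a bypass is shown below. The names giveAccess() and ACCESS_ALLOWED come from the text; the concrete values 1 and 42 and the helper class GuardBypass are assumptions for illustration only.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class Guard {
    private int ACCESS_ALLOWED = 1;          // assumed initial value

    public boolean giveAccess() {
        return 42 == ACCESS_ALLOWED;         // assumed "magic" value
    }
}

class GuardBypass {
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Overwrites the private field directly in the object's memory.
    static void grantAccess(Guard guard) {
        try {
            Unsafe unsafe = loadUnsafe();
            Field f = Guard.class.getDeclaredField("ACCESS_ALLOWED");
            unsafe.putInt(guard, unsafe.objectFieldOffset(f), 42);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Guard guard = new Guard();
        System.out.println(guard.giveAccess()); // prints false
        grantAccess(guard);
        System.out.println(guard.giveAccess()); // prints true
    }
}
```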
Actually, the same functionality can be achieved with reflection. What is interesting is that we can modify any object, even ones that we do not have references to.
For example, suppose there is another Guard object in memory located right after the current guard object. We can modify its ACCESS_ALLOWED field by adding the object size, 16, to the field offset that we pass to putInt.
Note that we didn’t use any reference to this object. 16 is the size of a Guard object on a 32-bit architecture. We can calculate it manually or use a sizeOf method, which is defined… right now.
Using the objectFieldOffset method we can implement a C-style sizeof function. This implementation returns the shallow size of an object.
The algorithm is the following: go through all non-static fields, including those of all superclasses, get the offset of each field, find the maximum and add padding. Probably I missed something, but the idea is clear.
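The algorithm described above can be sketched like this. The class name ShallowSize and the 8-byte padding are assumptions; the actual layout depends on the JVM, so treat the result as an estimate. Arrays and objects without fields are not handled properly by this sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

class ShallowSize {
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    static long sizeOf(Object o) {
        Unsafe unsafe = loadUnsafe();
        long maxOffset = 0;
        // Walk the class hierarchy and look at every non-static field.
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    maxOffset = Math.max(maxOffset, unsafe.objectFieldOffset(f));
                }
            }
        }
        // Round up past the last field to the next multiple of 8 (assumed padding).
        return (maxOffset / 8 + 1) * 8;
    }
}

// Hypothetical class used only to try the function out.
class SizedSample {
    int a;
    long b;
}
```

A call such as ShallowSize.sizeOf(new SizedSample()) returns a multiple of 8; the exact number depends on the header size of the running JVM.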
A much simpler sizeOf can be achieved if we just read the size value from the class struct for this object, which is located at offset 12 in a 32-bit JVM 1.7.
normalize is a method for casting a signed int to an unsigned long, for correct address usage.
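A possible normalize helper is sketched below; the class name AddressUtil is an assumption. It reinterprets the 32 bits of the int as an unsigned value, which is the same thing Integer.toUnsignedLong does on Java 8 and later.

```java
class AddressUtil {
    // Keeps the low 32 bits and clears the sign extension.
    static long normalize(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(normalize(5));  // prints 5
        System.out.println(normalize(-1)); // prints 4294967295
    }
}
```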
Awesome, this sizeOf returns the same result as our previous sizeof function.
In fact, for a good, safe and accurate sizeof function it is better to use the java.lang.instrument package, but it requires specifying the agent option (-javaagent) in your JVM.
Having an implementation that calculates the shallow object size, we can simply add a function that copies objects. The standard solution requires modifying your code with Cloneable, or you can implement a custom copy function in your object, but it won’t be a multipurpose function.
The shallow copy relies on two helpers: toAddress and fromAddress convert an object to its address in memory and vice versa.
This copy function can be used to copy an object of any type; its size is calculated dynamically. Note that after copying you need to cast the object to the specific type.
One more interesting usage of direct memory access in Unsafe is removing unwanted objects from memory.
Most of the APIs for retrieving a user’s password have a signature with byte[] or char[]. Why arrays?
It is purely for security reasons, because we can nullify the array elements once we don’t need them. If we retrieve the password as a String, it is saved as an object in memory, and nullifying that string just drops the reference. The object stays in memory until the GC decides to perform cleanup.
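The array-based approach can be sketched as follows. This is plain Java without Unsafe; the class name PasswordWipe and the sample password are made up for illustration.

```java
import java.util.Arrays;

class PasswordWipe {
    // Overwrites every character so the secret no longer sits in memory.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'s', 'e', 'c', 'r', 'e', 't'}; // hypothetical value
        // ... use the password here ...
        wipe(password);
        System.out.println(password[0] == '\0'); // prints true
    }
}
```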
This trick creates a fake String object with the same size and replaces the original one in memory.
Feel safe.
UPDATE: That way is not really safe. For real safety we need to nullify the backing char array via reflection.
Thanks to Peter Verhas for pointing that out.
There is no multiple inheritance in Java.
Correct, except that we can cast any type to any other one, if we want.
With Unsafe we can add the String class to the superclasses of Integer, so the cast works without a runtime exception.
One problem is that we must pre-cast to Object, to cheat the compiler.
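The pre-cast to Object can be illustrated without touching memory at all. In the sketch below (the class name CastDemo and the value 666 are assumptions), the compiler accepts the cast because of the intermediate Object type, and on an unmodified JVM the cast fails at runtime with a ClassCastException. That runtime check is what the Unsafe trick circumvents.

```java
class CastDemo {
    // Returns true if the Integer-to-String cast is rejected at runtime.
    static boolean castFails() {
        Object boxed = Integer.valueOf(666);
        try {
            // (String) Integer.valueOf(666) would not compile;
            // going through Object cheats the compiler.
            String s = (String) boxed;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails()); // prints true on an unmodified JVM
    }
}
```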
We can create classes at runtime, for example from a compiled .class file. To do that, read the class contents into a byte array and pass it properly to the defineClass method.
This can be useful when you must create classes dynamically, such as proxies or aspects for existing code.
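Reading the class file into a byte array can be sketched as below. The class and method names are assumptions, and the sketch deliberately stops at the byte array: whether defineClass is available on sun.misc.Unsafe depends on the JDK release, so the final call is left out. The magic-number check is an extra sanity step, not part of the original description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ClassBytes {
    // Reads the whole compiled class file into memory.
    static byte[] read(Path classFile) {
        try {
            return Files.readAllBytes(classFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Every valid class file starts with the magic number 0xCAFEBABE.
    static boolean looksLikeClassFile(byte[] content) {
        return content.length >= 4
                && (content[0] & 0xFF) == 0xCA
                && (content[1] & 0xFF) == 0xFE
                && (content[2] & 0xFF) == 0xBA
                && (content[3] & 0xFF) == 0xBE;
    }
}
```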
Don’t like checked exceptions? Not a problem.
The throwException method throws a checked exception, but your code is not forced to catch or rethrow it. Just like a runtime exception.
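A small sketch of this behaviour follows; the class name SneakyThrow and the exception message are assumptions. Note that the method has no throws clause, and that the caller has to catch a broader type, because the compiler rejects catch (IOException e) when nothing in the try block declares it.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class SneakyThrow {
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Throws a checked IOException without declaring it.
    static void failWithoutDeclaring() {
        loadUnsafe().throwException(new IOException("sneaky"));
    }

    // Returns true if the undeclared IOException really arrives at the caller.
    static boolean demo() {
        try {
            failWithoutDeclaring();
            return false;
        } catch (Exception e) {
            return e instanceof IOException;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```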
This one is more practical.
Everyone knows that the standard Java Serializable mechanism is very slow. It also places requirements on constructors: the closest non-serializable superclass must have an accessible no-argument constructor.
Externalizable is better, but it requires you to define a schema for the class to be serialized.
Popular high-performance libraries, like kryo, have dependencies, which can be unacceptable under low-memory requirements.
But a full serialization cycle can be easily achieved with the Unsafe class.
Serialization:
Use Unsafe methods getLong, getInt, getObject, etc. to retrieve the actual field values.
Add a class identifier to be able to restore the object.
You can also add compression to save space.
Deserialization:
Create an instance of the serialized class. allocateInstance helps, because it does not require any constructor.
Use Unsafe methods putLong, putInt, putObject, etc. to fill the object.
Actually, there are many more details in a correct implementation, but the intuition is clear.
This serialization will be really fast.
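The steps above can be sketched for a very restricted case. This is an illustration under several assumptions, not a complete serializer: it handles only int and long fields declared directly in the class, orders fields by name, and skips the class identifier (the caller passes the class to deserialize). The names Point and UnsafeSerializer are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import sun.misc.Unsafe;

// Hypothetical sample type; note that it has no no-argument constructor.
class Point {
    int x;
    long y;
    Point(int x, long y) { this.x = x; this.y = y; }
}

class UnsafeSerializer {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Schema: non-static declared fields in a stable (name) order.
    private static List<Field> schema(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) fields.add(f);
        }
        fields.sort(Comparator.comparing(Field::getName));
        return fields;
    }

    static byte[] serialize(Object o) {
        List<Field> fields = schema(o.getClass());
        ByteBuffer out = ByteBuffer.allocate(8 * fields.size());
        for (Field f : fields) {
            long offset = UNSAFE.objectFieldOffset(f);
            if (f.getType() == int.class) out.putInt(UNSAFE.getInt(o, offset));
            else if (f.getType() == long.class) out.putLong(UNSAFE.getLong(o, offset));
            else throw new IllegalArgumentException("unsupported field: " + f);
        }
        return Arrays.copyOf(out.array(), out.position());
    }

    static <T> T deserialize(byte[] data, Class<T> type) {
        try {
            // No constructor is called; fields are filled in afterwards.
            T o = type.cast(UNSAFE.allocateInstance(type));
            ByteBuffer in = ByteBuffer.wrap(data);
            for (Field f : schema(type)) {
                long offset = UNSAFE.objectFieldOffset(f);
                if (f.getType() == int.class) UNSAFE.putInt(o, offset, in.getInt());
                else if (f.getType() == long.class) UNSAFE.putLong(o, offset, in.getLong());
                else throw new IllegalArgumentException("unsupported field: " + f);
            }
            return o;
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, deserializing the bytes produced for new Point(7, 9L) gives back a Point with x equal to 7 and y equal to 9, even though no constructor was run.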
By the way, there are some attempts in kryo to use Unsafe: http://code.google.com/p/kryo/issues/detail?id=75
As you know, the Integer.MAX_VALUE constant is the maximum size of a Java array. Using direct memory allocation we can create arrays whose size is limited only by the available memory.
A SuperArray implementation allocates the memory with allocateMemory and reads and writes elements by address.
In fact, this technique uses off-heap memory and is partially available in the java.nio package.
Memory allocated this way is not located in the heap and is not under GC management, so take care of it using Unsafe.freeMemory(). It also does not perform any boundary checks, so any illegal access may cause a JVM crash.
It can be useful for math computations, where code operates on large arrays of data. It can also be interesting for realtime programmers, where GC delays on large arrays can break the limits.
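A byte-based SuperArray could look like the sketch below. The method names and the explicit bounds check are assumptions added for safety in this illustration; Unsafe itself checks nothing. The example uses a small size so it can be run anywhere, but the index type is long, so sizes beyond Integer.MAX_VALUE are possible if the memory is available.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class SuperArray {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long size;
    private long address;

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    SuperArray(long size) {
        this.size = size;
        // Off-heap allocation; the content is uninitialized.
        this.address = UNSAFE.allocateMemory(size);
    }

    void set(long index, byte value) {
        check(index);
        UNSAFE.putByte(address + index, value);
    }

    byte get(long index) {
        check(index);
        return UNSAFE.getByte(address + index);
    }

    long size() {
        return size;
    }

    // The GC does not manage this memory, so it must be released by hand.
    void free() {
        if (address != 0) {
            UNSAFE.freeMemory(address);
            address = 0;
        }
    }

    // Added in this sketch only; raw Unsafe access performs no such check.
    private void check(long index) {
        if (address == 0) throw new IllegalStateException("already freed");
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }
}
```

Sample usage: create new SuperArray(1024), call set(1000, (byte) 3), read it back with get(1000), and call free() when done.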
And a few words about concurrency with Unsafe. The compareAndSwap methods are atomic and can be used to implement high-performance lock-free data structures.
For example, consider the problem of incrementing a value in a shared object using a lot of threads.
First we define a simple interface, Counter. Then we define a worker thread, CounterClient, that uses Counter. Finally, the testing code runs many clients against one counter and reports the result and the elapsed time.
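The setup could be sketched like this. Counter and CounterClient are named in the text; the method names, the CounterBenchmark harness, the thread and increment counts, and the SyncCounter used as a placeholder implementation are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface Counter {
    void increment();
    long getCounter();
}

// Worker that increments the shared counter a fixed number of times.
class CounterClient implements Runnable {
    private final Counter counter;
    private final int increments;

    CounterClient(Counter counter, int increments) {
        this.counter = counter;
        this.increments = increments;
    }

    @Override
    public void run() {
        for (int i = 0; i < increments; i++) {
            counter.increment();
        }
    }
}

// Placeholder implementation so that the harness can be run as is.
class SyncCounter implements Counter {
    private long counter = 0;

    @Override
    public synchronized void increment() { counter++; }

    @Override
    public synchronized long getCounter() { return counter; }
}

class CounterBenchmark {
    // Runs the given number of clients and returns the final counter value.
    static long run(Counter counter, int threads, int incrementsPerThread) {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        long before = System.currentTimeMillis();
        for (int i = 0; i < threads; i++) {
            service.submit(new CounterClient(counter, incrementsPerThread));
        }
        service.shutdown();
        try {
            service.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long after = System.currentTimeMillis();
        System.out.println("Counter result: " + counter.getCounter());
        System.out.println("Time passed in ms: " + (after - before));
        return counter.getCounter();
    }

    public static void main(String[] args) {
        run(new SyncCounter(), 4, 100_000);
    }
}
```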
The first implementation is an unsynchronized counter. It works fast, but there is no thread management at all, so the result is inaccurate.
The second attempt adds the simplest Java-style synchronization. Radical synchronization always works, but the timings are awful.
Let’s try ReentrantReadWriteLock. The result is still correct, and the timings are better.
What about atomics? AtomicCounter is even better.
Finally, let’s try the Unsafe primitive compareAndSwapLong to see if using it is really a privilege. Hmm, it seems equal to atomics. Maybe atomics use Unsafe? (YES)
In fact this example is easy enough, but it shows some of the power of Unsafe.
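A counter built on compareAndSwapLong might look like the sketch below. It is written as a standalone class so that it compiles on its own; the class name CASCounter, the helper names and the demo method are assumptions. The counter field is volatile, so each retry reads a fresh value.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class CASCounter {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long OFFSET = counterOffset();

    private volatile long counter = 0;

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    private static long counterOffset() {
        try {
            return UNSAFE.objectFieldOffset(CASCounter.class.getDeclaredField("counter"));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public void increment() {
        long before = counter;
        // Retry until no other thread changed the value in between.
        while (!UNSAFE.compareAndSwapLong(this, OFFSET, before, before + 1)) {
            before = counter;
        }
    }

    public long getCounter() {
        return counter;
    }

    // Runs several threads against one counter and returns the final value.
    static long demo(int threads, int incrementsPerThread) {
        CASCounter shared = new CASCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) shared.increment();
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.getCounter();
    }
}
```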
As I said, the CAS primitive can be used to implement lock-free data structures. The intuition behind this is simple: read the current state, prepare a modified copy, try to apply it with CAS, and repeat if the CAS fails.
Actually, in reality it is harder than you can imagine. There are a lot of problems like the ABA problem, instruction reordering, etc.
If you are really interested, you can refer to the awesome presentation about the lock-free HashMap.
UPDATE: Added the volatile keyword to the counter variable to avoid the risk of an infinite loop.
Kudos to Nitsan Wakart.
The documentation for the park method of the Unsafe class contains the longest English sentence I’ve ever seen:
Block current thread, returning when a balancing unpark occurs, or a balancing unpark has already occurred, or the thread is interrupted, or, if not absolute and time is not zero, the given time nanoseconds have elapsed, or if absolute, the given deadline in milliseconds since Epoch has passed, or spuriously (i.e., returning for no “reason”). Note: This operation is in the Unsafe class only because unpark is, so it would be strange to place it elsewhere.
Although Unsafe has a bunch of useful applications, never use it.